Dataset: HuggingFaceFW/fineweb-edu
Zenyx-42M is a 42M-parameter GPT-2 style decoder-only transformer trained from scratch on high-quality educational web text.
The name "Zenyx" fuses Zen (calm, focused intelligence) and Onyx (strength, power), embodying an efficient, capable language model.
| Component | Value |
|---|---|
| Architecture | GPT-2 style decoder-only transformer |
| Parameters | ~42M (41.87M) |
| Layers | 8 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Context Length | 512 tokens |
| Vocabulary Size | 32,000 (BPE) |
| Positional Encoding | Learned embeddings |
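
As a sanity check on the parameter count, the figure in the table can be approximately reconstructed from the listed dimensions. The sketch below is not taken from the model's source: it assumes a standard GPT-2 block layout (fused QKV projection, a 4x MLP expansion, bias terms on linear layers and LayerNorms, and an output head tied to the token embedding). The function name `gpt2_param_count` and its arguments are hypothetical. The number of attention heads does not affect the count, since heads only partition the hidden size.

```python
def gpt2_param_count(n_layer, d_model, vocab_size, block_size,
                     mlp_ratio=4, bias=True, tie_embeddings=True):
    """Estimate the parameter count of a GPT-2 style decoder.

    The layout (4x MLP, biases, tied output head) is an assumption,
    not something confirmed by the model card.
    """
    b = 1 if bias else 0
    tok_emb = vocab_size * d_model   # token embedding table
    pos_emb = block_size * d_model   # learned positional embeddings
    # Attention: fused QKV projection plus the output projection.
    attn = (d_model * 3 * d_model + b * 3 * d_model) + (d_model * d_model + b * d_model)
    # MLP: expand to mlp_ratio * d_model, then project back to d_model.
    hidden = mlp_ratio * d_model
    mlp = (d_model * hidden + b * hidden) + (hidden * d_model + b * d_model)
    # Each LayerNorm has a scale vector and, optionally, a bias vector.
    ln = d_model + b * d_model
    block = attn + mlp + 2 * ln      # two LayerNorms per block
    final_ln = ln
    lm_head = 0 if tie_embeddings else vocab_size * d_model
    return tok_emb + pos_emb + n_layer * block + final_ln + lm_head


total = gpt2_param_count(n_layer=8, d_model=512, vocab_size=32_000, block_size=512)
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

Under these assumptions the estimate comes to 41,866,240, which rounds to the 41.87M listed above. With an untied output head the total would be about 16.4M higher.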
| Setting | Value |
|---|---|
| Training Data | FineWeb-Edu (streamed, deduped, filtered) |
| Training Mode | Scratch (no pretrain) |
| Optimizer | AdamW (lr=3e-4, wd=0.1) |
| LR Schedule | Warmup (2k) + Cosine decay |
| Batch Size | 16 per device, 8 grad. accumulation |
| Effective Batch | 128 |
| Precision | BFloat16 mixed precision |
| Hardware | NVIDIA L4 (24GB VRAM) |
| Training Time | ~6 hours |
| Iterations | 100,000 |
| Tokens Processed | ~6.55B |
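
The batch, token, and learning-rate figures in the table are related by simple arithmetic, sketched below. Several details are assumptions rather than facts from this card: training on a single device, every sequence filling the full 512-token context, a linear warmup, and a cosine decay that ends at a floor of `min_lr=0.0`. The function name `lr_at` is hypothetical.

```python
import math


def lr_at(step, max_lr=3e-4, warmup_steps=2_000, total_steps=100_000, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay.

    The linear warmup shape and min_lr=0.0 are assumptions; the card only
    states "Warmup (2k) + Cosine decay".
    """
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


# Values from the training table; a single device is assumed.
per_device_batch, grad_accum, context_len, iterations = 16, 8, 512, 100_000

effective_batch = per_device_batch * grad_accum            # sequences per optimizer step
tokens_processed = effective_batch * context_len * iterations

print(f"effective batch: {effective_batch}")
print(f"tokens processed: {tokens_processed / 1e9:.2f}B")
for step in (0, 2_000, 51_000, 100_000):
    print(f"step {step:>7,}: lr = {lr_at(step):.2e}")
```

With these inputs the effective batch is 128 sequences and the token total is 128 × 512 × 100,000 ≈ 6.55B, matching the table.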
| Model | Params | Val Loss | Zenyx Advantage |
|---|---|---|---|
| Zenyx-42M | 42M | 3.08 | Baseline |
| GPT-1 | 117M | ~3.3 | +22% better |
| DistilGPT-2 | 82M | ~3.1 | Nearly tied |
| GPT-2 Small | 124M | ~2.8 | 3x more efficient |
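
One way to read the validation losses is to convert them to perplexity. The sketch below assumes each loss is a mean cross-entropy in nats per token, which the card does not state explicitly. The values are copied from the table; the dictionary and function names are hypothetical.

```python
import math

# (parameters in millions, validation loss) copied from the comparison table.
models = {
    "Zenyx-42M": (42, 3.08),
    "GPT-1": (117, 3.3),
    "DistilGPT-2": (82, 3.1),
    "GPT-2 Small": (124, 2.8),
}


def perplexity(loss_nats):
    """Perplexity for a mean cross-entropy loss, assuming the loss is in nats."""
    return math.exp(loss_nats)


base_params, _ = models["Zenyx-42M"]
for name, (params_m, loss) in models.items():
    size_ratio = params_m / base_params
    print(f"{name:12s} loss={loss:.2f}  ppl={perplexity(loss):6.2f}  "
          f"size vs Zenyx={size_ratio:.1f}x")
```

The size ratio printed in the last column is simply the parameter count divided by Zenyx-42M's 42M.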
Install dependencies

```bash
pip install torch transformers tokenizers
```

Quick Start Example

```python
import torch
from tokenizers import Tokenizer

# model.py and config.py must be importable from the working directory
from model import NanoGPT
from config import NanoGPTConfig

# Load model
# weights_only=False unpickles arbitrary objects, so only load checkpoints you trust
checkpoint = torch.load('best_model.pt', map_location='cpu', weights_only=False)
config = checkpoint['config']
model = NanoGPT(config)
model.load_state_dict(checkpoint['model'])
model.eval()

# Load tokenizer
tokenizer = Tokenizer.from_file('tokenizer.json')


def generate(prompt, max_tokens=100, temperature=0.6):
    tokens = tokenizer.encode(prompt).ids
    x = torch.tensor(tokens).unsqueeze(0)  # add a batch dimension: shape (1, T)
    with torch.no_grad():
        output = model.generate(x, max_new_tokens=max_tokens, temperature=temperature, top_k=40)
    # Drop the batch dimension so decode() receives a flat list of token ids
    return tokenizer.decode(output[0].tolist())


prompt = "Artificial intelligence is"
output = generate(prompt)
print(output)
```
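
The `temperature` and `top_k` arguments passed to `model.generate` are not documented in this card. Assuming they follow the usual convention (divide the logits by the temperature, keep only the `top_k` highest-scoring tokens, then sample from the softmax over those), a single sampling step might look like the dependency-free sketch below. The function `sample_top_k` is hypothetical and is not the model's own implementation.

```python
import math
import random


def sample_top_k(logits, temperature=0.6, top_k=40, rng=random):
    """Sample one token id from raw logits using temperature and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    k = min(top_k, len(logits))
    # Indices of the k largest logits; every other token gets zero probability.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Lower temperatures sharpen the distribution, higher ones flatten it.
    scaled = [logits[i] / temperature for i in top]
    # Unnormalized softmax weights (subtract the max for numerical stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


toy_logits = [0.1, 2.0, -1.0, 1.5]
rng = random.Random(0)
samples = [sample_top_k(toy_logits, temperature=0.6, top_k=2, rng=rng) for _ in range(10)]
print(samples)  # only ids 1 and 3 can appear, since they have the two largest logits
```

With `top_k=1` the step reduces to greedy decoding, since only the highest-scoring token survives the filter.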
Not recommended for:
Designed as a base model for further fine-tuning.
Next Steps:
```bibtex
@misc{zenyx_42m_2025,
  author    = {Arko007},
  title     = {Zenyx-42M: Efficient Language Model Trained From Scratch},
  year      = {2025},
  month     = {October},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Arko007/Zenyx-42M}
}
```
Apache License 2.0
See LICENSE file for details.
ZENYX
Symbol: diamond/geometric
Colors: Deep purple (#5B21B6) + Cyan (#06B6D4)
Font: Modern, clean, geometric
Taglines:
Built with PyTorch on an NVIDIA L4 GPU