Download the model weights from the Hugging Face Hub:

hf download Bajju360/nanochat_d20base

Create a .venv and install transformers and torch:

uv venv
source .venv/bin/activate
uv pip install torch transformers

Run inference

python base_nano.py
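The contents of base_nano.py live in the repo and are not reproduced here. Conceptually, base-model inference is an autoregressive loop: feed the token sequence to the model, get next-token logits, pick a token, append, repeat. A minimal sketch of that loop, using a toy stand-in model rather than the actual nanochat checkpoint:

```python
import math
import random

def sample_next(logits, temperature=0.0):
    # temperature 0 -> greedy argmax; otherwise softmax sampling
    if temperature == 0.0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    r, acc = random.random(), 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r <= acc:
            return i
    return len(exps) - 1

def generate(model, prompt_tokens, max_new_tokens=8):
    # model: callable mapping a token list to a list of next-token logits
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        tokens.append(sample_next(model(tokens)))
    return tokens

# toy stand-in "model" over a 3-token vocabulary; always prefers token 0
toy = lambda toks: [1.0, 0.5, 0.1]
print(generate(toy, [2], max_new_tokens=3))  # [2, 0, 0, 0]
```

The real script additionally handles tokenization, batching, and KV caching, but the control flow is the same.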

Base model loss

timestamp: 2025-12-14 23:52:17

  • train bpb: 0.8162
  • val bpb: 0.8135
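Bits per byte (bpb) is a tokenizer-independent version of the cross-entropy loss: the per-token loss in nats is converted to bits and divided by the average number of raw bytes each token covers (the exact bytes-per-token figure depends on the tokenizer and corpus, so it is a parameter here, not a value from this card):

```python
import math

def loss_to_bpb(loss_nats: float, avg_bytes_per_token: float) -> float:
    # nats/token -> bits/token -> bits/byte
    return loss_nats / math.log(2) / avg_bytes_per_token
```

A loss of ln(2) nats on a tokenizer averaging one byte per token would give exactly 1.0 bpb.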

Base model training

timestamp: 2025-12-14 22:45:05

  • run: nanochat_d20
  • device_type:
  • depth: 20
  • max_seq_len: 2048
  • num_iterations: -1
  • target_flops: -1.0000
  • target_param_data_ratio: 20
  • device_batch_size: 64
  • total_batch_size: 1,048,576
  • embedding_lr: 0.4000
  • unembedding_lr: 0.0080
  • weight_decay: 0.0000
  • matrix_lr: 0.0400
  • grad_clip: 1.0000
  • warmup_ratio: 0.0000
  • warmdown_ratio: 0.2000
  • final_lr_frac: 0.0000
  • resume_from_step: -1
  • eval_every: 250
  • eval_tokens: 62,914,560
  • core_metric_every: 2000
  • core_metric_max_per_task: 500
  • sample_every: 2000
  • save_every: 1000
  • model_tag:
  • Number of parameters: 560,988,160
  • Number of FLOPs per token: 3.491758e+09
  • Calculated number of iterations: 10,700
  • Number of training tokens: 11,219,763,200
  • Tokens : Params ratio: 20.0000
  • DDP world size: 1
  • Minimum validation bpb: 0.8169
  • Final validation bpb: 0.8169
  • CORE metric estimate: 0.2100
  • MFU %: 37.51%
  • Total training flops: 3.917670e+19
  • Total training time: 1758.84m
  • Peak memory usage: 145766.77MiB
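The headline numbers above are internally consistent: training tokens = parameters × the target 20:1 tokens-to-params ratio, iterations = tokens / total batch size, and total training FLOPs = FLOPs per token × tokens. A quick sanity check:

```python
params = 560_988_160
tokens = params * 20                 # target_param_data_ratio = 20
iters = tokens // 1_048_576          # total_batch_size in tokens
flops = 3.491758e9 * tokens          # FLOPs/token * training tokens

print(tokens)  # 11219763200
print(iters)   # 10700
```

These reproduce the reported 11,219,763,200 training tokens, 10,700 iterations, and ~3.9177e19 total FLOPs.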

Base model evaluation

timestamp: 2025-12-15 00:17:50

  • Model: base_model (step 10700)
  • CORE metric: 0.2036
  • hellaswag_zeroshot: 0.2555
  • jeopardy: 0.0874
  • bigbench_qa_wikidata: 0.5157
  • arc_easy: 0.5253
  • arc_challenge: 0.1069
  • copa: 0.2200
  • commonsense_qa: 0.1308
  • piqa: 0.3765
  • openbook_qa: 0.0987
  • lambada_openai: 0.3852
  • hellaswag: 0.2591
  • winograd: 0.2821
  • winogrande: 0.0355
  • bigbench_dyck_languages: 0.0890
  • agi_eval_lsat_ar: 0.1141
  • bigbench_cs_algorithms: 0.4030
  • bigbench_operators: 0.1905
  • bigbench_repeat_copy_logic: 0.0000
  • squad: 0.2085
  • coqa: 0.2078
  • boolq: -0.1902
  • bigbench_language_identification: 0.1770
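The headline CORE metric is consistent with a simple unweighted mean of the 22 per-task centered accuracies listed above (centered scores can be negative, as for boolq, when accuracy falls below the random-guessing baseline):

```python
scores = [0.2555, 0.0874, 0.5157, 0.5253, 0.1069, 0.2200, 0.1308,
          0.3765, 0.0987, 0.3852, 0.2591, 0.2821, 0.0355, 0.0890,
          0.1141, 0.4030, 0.1905, 0.0000, 0.2085, 0.2078, -0.1902, 0.1770]
core = sum(scores) / len(scores)
print(round(core, 4))  # 0.2036
```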