Dataset: albertfares/MNLP_M3_dpo_dataset
This model is a fine-tuned version of Qwen/Qwen3-0.6B-Base using filtered Direct Preference Optimization (fDPO) on the MNLP M3 DPO dataset.
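For readers unfamiliar with the objective, here is a minimal sketch of the standard DPO loss for a single preference pair. The card does not describe the filtering step of fDPO, so none is shown. The function names, the `beta=0.1` default, and the log-probability inputs are illustrative assumptions, not the author's training code.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    # Branch on the sign of x so that exp() never receives a large positive argument.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the card does not state the one used
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of summed log-probabilities."""
    # Implicit rewards: how much more likely each response is under the policy
    # than under the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # The loss shrinks as the policy favors the chosen response over the rejected one.
    return -log_sigmoid(margin)
```

When the policy and reference agree on both responses, the margin is zero and the loss equals log 2. It falls below that as the policy shifts probability toward the chosen response.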
# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "albertfares/MNLP_M3_dpo_model_69k"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
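Once the model and tokenizer load, generating text might look like the sketch below. The helper name `generate_text`, the greedy decoding settings, and the example prompt are assumptions for illustration. The card does not specify a prompt format or recommended generation parameters.

```python
def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Return a greedy continuation of `prompt` (downloads the model on first call)."""
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "albertfares/MNLP_M3_dpo_model_69k"  # model ID taken from this card
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize to PyTorch tensors (requires torch to be installed).
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (hypothetical prompt; needs network access):
# print(generate_text("Question: What is the derivative of x^2?\nAnswer:"))
```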
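The weights are stored in the SafeTensors format, which avoids pickle-based deserialization (so loading the file cannot execute arbitrary code) and supports fast, memory-mapped loading.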
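To illustrate why a SafeTensors file can be read without executing any code, the sketch below builds and parses a tiny in-memory example using only the standard library. It follows the published layout: an 8-byte little-endian header length, a JSON header, then raw tensor bytes. The tensor name `w` and its values are hypothetical. Real checkpoints should be read with the `safetensors` library rather than this toy parser.

```python
import json
import struct


def build_safetensors_bytes() -> bytes:
    """Build a minimal blob in the SafeTensors layout holding one float32 tensor."""
    data = struct.pack("<2f", 1.0, 2.0)  # two float32 values, 8 bytes in total
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    # Layout: u64 header length (little-endian) + JSON header + raw tensor data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


def read_header(blob: bytes) -> dict:
    """Parse the JSON header; this is plain data parsing, with no code execution."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len].decode("utf-8"))


def read_f32_tensor(blob: bytes, name: str) -> list:
    """Read one F32 tensor as a flat list (this toy reader handles only float32)."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    start, end = read_header(blob)[name]["data_offsets"]
    data_start = 8 + header_len  # offsets are relative to the end of the header
    raw = blob[data_start + start:data_start + end]
    return list(struct.unpack(f"<{(end - start) // 4}f", raw))
```

Because the header is JSON and the payload is raw bytes, a reader only parses data. A pickle-based checkpoint, by contrast, can run arbitrary code while it is being loaded.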
Base model: Qwen/Qwen3-0.6B-Base