AI & ML interests
multilingual NLP, tokenization
Recent Activity
There is no such thing as a tokenizer-free lunch
catherinearnett
An Analysis of Multilingual Models on Hugging Face
catherinearnett
published an article about 1 year ago: Best Practices for Open Multilingual LLM Evaluation
catherinearnett
published an article over 1 year ago: They Said It Couldn’t Be Done
Pclanglais
published an article over 1 year ago: Releasing the largest multilingual open pretraining dataset
Pclanglais
published an article over 1 year ago: Detoxifying the Commons
catherinearnett
published an article over 1 year ago: wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR??
catherinearnett