How to distill a mini minicpm-o-2.6?
Recently I have been trying to train a 3B distilled MiniCPM-o, but I have run into many problems, such as running out of CUDA memory and compatibility issues with the DataCollector. I would really like to know how to distill this model.
I'm sorry, but we haven't considered this requirement and haven't distilled this model yet. I think it might be difficult.
Can you use a quantized model or a framework like llama.cpp for efficient inference to solve your problem?
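For readers new to the term, the distillation objective asked about above is commonly a temperature-softened KL divergence between teacher and student output distributions. The following is a minimal, framework-free sketch of that generic loss for a single token position. It is not a recipe used or tested by the MiniCPM-o team, and the function names and the default temperature are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T**2 factor is the usual scaling that keeps gradient magnitudes
    comparable across temperatures. The default of 2.0 is an arbitrary
    illustrative value, not a recommendation for this model.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(distillation_loss(student, teacher))
```

In a real training loop this term would be averaged over all token positions and usually mixed with the ordinary cross-entropy loss on the ground-truth labels.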
I have tried using BitsAndBytesConfig to load a quantized model. With 8-bit, it reports "RuntimeError: "normal_kernel_cpu" not implemented for 'Char'", and with 4-bit NF4, the Jupyter kernel crashes.
https://huggingface.co/openbmb/MiniCPM-o-2_6-int4
We provide a repository quantized with AutoGPTQ. Rather than building your own quantization setup, it should be easier to get results by following the method in the official repository.
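To see why the 8-bit and 4-bit options matter for the CUDA memory problems mentioned earlier, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The 8e9 parameter count is an assumption for illustration, and the helper name is hypothetical. Real quantized checkpoints need somewhat more than this, since they store scales and often keep some layers in higher precision, and training or distillation adds activations, gradients, and optimizer state on top.

```python
# Approximate bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, precision):
    """Return the approximate weight footprint in GiB (weights only)."""
    return num_params * BYTES_PER_PARAM[precision] / 1024 ** 3


if __name__ == "__main__":
    assumed_params = 8e9  # assumption: roughly 8B parameters in total
    for precision in BYTES_PER_PARAM:
        gib = weight_memory_gib(assumed_params, precision)
        print(f"{precision}: {gib:.1f} GiB")
```

The same function can be used to compare a hypothetical 3B student against the full model before committing to a training setup.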