How many proprietary use cases truly need pre-training or even fine-tuning as opposed to RAG approach? And at what point does it make sense to pre-train/fine tune? Curious.
I am rooting for Mistral with their different approach: not really competing on the largest and advanced models, instead doing custom engineering for customers and generally serving the needs of EU customers.
Mistral has been releasing some cool stuff. Definitively behind on frontier models but they are working a different angle. Was just talking at work about how hard model training is for a small company so we’d probably never do it. But with tools like this, and the new unsloth release, training feels more in reach.
This is definitely the smart path for making $$ in AI. I noticed MongoDB is also going into this market with https://www.voyageai.com/ targeting business RAG applications and offering consulting for company-specific models.
Huh. I initially thought this is just another finetuning end point. But apparently they are partnering up with customers on the pretraining side as well. But RL as well? Jeez RL env are really hard to get right. Best wishes I guess.
https://docs.mistral.ai/api/endpoint/deprecated/fine-tuning
It's feasible for small models but, I thought small models were not reliable for factual information?
Is it possible to retrain daily or hourly as info changes?