
Comments (35)

  • ogou
    Don't sleep on Mistral. Highly underrated as a general-purpose LLM service, and cheaper, too. Their emphasis on bespoke modelling over generalized megaliths will pay off. There are all kinds of specialized datasets and restricted-access stores that can benefit from their approach, especially in the highly regulated EU. Not everyone is obsessed with code generation. There is a whole world out there.
  • mark_l_watson
    I am rooting for Mistral with their different approach: not really competing on the largest and advanced models, instead doing custom engineering for customers and generally serving the needs of EU customers.
  • upghost
    > Pre-training allows organizations to build domain-aware models by learning from large internal datasets.
    > Post-training methods allow teams to refine model behavior for specific tasks and environments.
    How do you suppose this works? They say "pretraining," but I'm certain the amount of clean data available in proper dataset format is nowhere near enough to make a "foundation model." Do you suppose what they're calling "pretraining" is actually SFT, and then "post-training" is ... more SFT? There's no way they mean "start from scratch." Maybe they do something like generate a heckin bunch of synthetic data seeded from company data using one of their SOTA models, which is basically equivalent to low-resolution distillation, I would imagine. Hmm.
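The synthetic-data guess above can be sketched in a few lines. This is purely illustrative: `teacher_model` is a hypothetical stand-in for a call to a larger hosted model (not any real Mistral API), and the prompt and output format are assumptions.

```python
import json

def teacher_model(prompt: str) -> str:
    # Hypothetical stand-in for an API call to a large "teacher" model.
    # It returns a canned Q/A completion so the sketch is runnable.
    return "Q: What does the policy cover?\nA: It covers data retention."

def make_sft_pairs(company_docs, pairs_per_doc=1):
    """Seed synthetic Q/A pairs from internal documents -- a crude form
    of distilling the teacher's behavior into an SFT-ready dataset."""
    records = []
    for doc in company_docs:
        for _ in range(pairs_per_doc):
            completion = teacher_model(
                f"Write a question and answer grounded in:\n{doc}"
            )
            # Split the canned "Q: ...\nA: ..." completion into the pair.
            q, _, a = completion.partition("\nA: ")
            records.append({
                "messages": [
                    {"role": "user", "content": q.removeprefix("Q: ")},
                    {"role": "assistant", "content": a},
                ]
            })
    return records

docs = ["Internal data-retention policy: logs are kept 90 days."]
dataset = make_sft_pairs(docs)
print(json.dumps(dataset[0], indent=2))
```

In a real pipeline the teacher call would hit a hosted model and the records would be written out as JSONL for a fine-tuning job; the point is just that "pretraining on company data" may in practice look like teacher-seeded SFT.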
  • roxolotl
    Mistral has been releasing some cool stuff. Definitely behind on frontier models, but they're working a different angle. Was just talking at work about how hard model training is for a small company, so we'd probably never do it ourselves. But with tools like this, and the new unsloth release, training feels more in reach.
  • dmix
    This is definitely the smart path for making $$ in AI. I noticed MongoDB is also going into this market with https://www.voyageai.com/ targeting business RAG applications and offering consulting for company-specific models.
  • csunoser
    Huh. I initially thought this was just another fine-tuning endpoint, but apparently they're partnering with customers on the pretraining side as well. And RL too? Jeez, RL environments are really hard to get right. Best wishes, I guess.
  • ryeguy_24
    How many proprietary use cases truly need pre-training, or even fine-tuning, as opposed to a RAG approach? And at what point does it make sense to pre-train or fine-tune? Curious.
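For contrast with training, the retrieval step of the RAG approach mentioned above can be shown as a toy: here bag-of-words cosine similarity stands in for a real embedding model (a deliberate simplification), and the prompt format is an assumption.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words token counts. A real system
    # would use a learned embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    qv = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Our refund policy allows returns within 30 days.",
    "The cafeteria opens at 8am on weekdays.",
]
context = retrieve("what is the refund policy", docs)
prompt = f"Answer using this context:\n{context[0]}\n\nQ: what is the refund policy"
```

Because the knowledge lives in the index rather than the weights, updating it is a re-embed of changed documents, not a training run, which is why RAG is often the first thing to try before fine-tuning.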
  • hermit_dev
    The future of AI is specialization, not just achieving benevolent knowledge as fast as we can at the expense of everything and everyone along the way. I appreciate and applaud this approach. I am looking into a similar product myself. Good stuff.
  • rorylawless
    The fine-tuning endpoint is deprecated according to the API docs. Is this the replacement? https://docs.mistral.ai/api/endpoint/deprecated/fine-tuning
  • andai
    They mention pretraining too, which surprises me. I thought that was prohibitively expensive? It's feasible for small models, but I thought small models were unreliable for factual information?
  • aavci
    How does this compare to fine tuning?
  • bsjshshsb
    Is training or FT > context? Anyone have experience? Is it possible to retrain daily or hourly as info changes?