Comments (33)

  • montroser
    Well, this is certainly not benchmaxxed, I'll give it that. And props for being honest about how far behind Qwen 3.6 MoE this model is. But yeah, it's not the best look to have to stretch and say it's "competitive" with other models in its weight class when it offers not much else that's useful or novel.
  • amunozo
    Are these models trained from scratch, or do they necessarily need distillation from bigger models to be competitive? Usually they're the small model in a family that also has a bigger model. If they're trained from scratch, does anybody know what the economics of training this 30B-A3B model look like vs. training models the size of DeepSeek V4 Pro or Flash (1.6T, 200-something B, fewer activated)?
  • matt_daemon
    > Hardware (minimum): 1× H100 @ FP8
    Cool to see this, but it seems like it would be pretty expensive to run.
  • moojacob
    I was a fan of Cohere's general-purpose LLM. Command A, I think? Before they came out with their reasoning model. More competition is better.
  • AbuAssar
    Strange, I already submitted the same URL 6 days ago: https://news.ycombinator.com/item?id=48475095
  • tonyrice
    I'm excited to see more OSS models
  • zuzululu
    Wasn't aware that Cohere was still around but this release doesn't exactly instill confidence.
  • chattermate
    [flagged]
  • cyanydeez
    Looks like it's just Qwen 3.6 coder.
  • moralestapia
    >Our plan to being profitable is to give mediocre stuff for free