Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model

unrvl22

Comments (195)

rafaquintanilha
I have no affiliation with them, but here's what I think happened:
1. They claim the official model is based on Qwen 397B. It's likely they didn't disclose Nex Pro at all because Nex itself is based on the same base model (not saying they shouldn't).
2. The improvement would come from merging the weights PLUS on-policy distillation. The confusion is that the uploaded model didn't have the distillation at all.
3. It's important to note that they didn't advertise the model beyond posting it on Reddit 2 days ago. It went viral organically, over the weekend, and during Brazil's World Cup debut (Brazilians will understand). Of course the mayor of Rio took the opportunity to capitalize on the free coverage, but that wasn't done in conjunction with the researchers.
4. I don't see why they would disclose Qwen 397B as the base and mention the SwiReasoning paper but not mention Nex if all they did was merge both models.
5. In any case, what they are claiming is easily verifiable once (if) they upload the right model.
hintymad
> Every weight tensor in Rio is, to thousands of standard deviations, the same 0.6/0.4 blend of Nex and Qwen, across all 60 layers and every component of the network. Other finetunes cannot be explained as interpolations.
I find it amazing how robust current deep learning models are. A simple linear combination of every weight did not degrade the model's performance, but enhanced it.
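For anyone wondering how a claim like "this tensor is a 0.6/0.4 blend" can be checked at all, here is a minimal sketch. It is not the method from the linked issue, just one plausible way to do it: if rio = alpha*nex + (1-alpha)*qwen, then rio - qwen = alpha*(nex - qwen), so alpha falls out of a one-dimensional least-squares fit, and the leftover residual says how well the blend explains the tensor. The function name `fit_blend` and the toy data are hypothetical; real checkpoints would be loaded tensor by tensor.

```python
import random

def fit_blend(rio, nex, qwen):
    """Least-squares alpha such that rio is approximately alpha*nex + (1-alpha)*qwen.

    Returns (alpha, relative_residual). A relative residual near 0 means the
    tensor is well explained as a blend; near 1 means it is not.
    """
    d = [n - q for n, q in zip(nex, qwen)]   # direction from qwen to nex
    t = [r - q for r, q in zip(rio, qwen)]   # offset of the candidate from qwen
    alpha = sum(x * y for x, y in zip(d, t)) / sum(x * x for x in d)
    residual = sum((y - alpha * x) ** 2 for x, y in zip(d, t)) ** 0.5
    return alpha, residual / sum(y * y for y in t) ** 0.5

# Toy data (assumption): "nex" and "independent" are two unrelated finetunes of "qwen".
rng = random.Random(0)
qwen = [rng.gauss(0, 1) for _ in range(1000)]
nex = [q + rng.gauss(0, 0.1) for q in qwen]
independent = [q + rng.gauss(0, 0.1) for q in qwen]
merged = [0.6 * n + 0.4 * q for n, q in zip(nex, qwen)]

alpha_merged, rel_merged = fit_blend(merged, nex, qwen)
alpha_indep, rel_indep = fit_blend(independent, nex, qwen)
print(f"merged:      alpha={alpha_merged:.3f}, relative residual={rel_merged:.2e}")
print(f"independent: alpha={alpha_indep:.3f}, relative residual={rel_indep:.2f}")
```

In this toy setup the merged tensor recovers alpha of about 0.6 with an essentially zero residual, while the independent finetune leaves almost all of its offset unexplained. That is the sense in which other finetunes "cannot be explained as interpolations".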
unrvl22
The municipality of Rio de Janeiro (via its IT company IplanRIO) released Rio-3.5-Open-397B, presented as a homegrown Qwen3.5 fine-tune that beats comparable open models on benchmarks. The linked issue argues it's actually a weighted merge of ~60% Nex-N2 Pro + ~40% Qwen3.5-397B-A17B - Nex-N2 having been released about a week earlier.
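For readers unfamiliar with the term, a weighted merge in this sense is plain arithmetic on matching tensors, with no training involved. The sketch below is a toy illustration under assumptions: the checkpoints are represented as dicts of flat float lists, the tensor names are made up, and `merge_state_dicts` is a hypothetical helper, not the tool anyone actually used. It only works because both models share the same tensor names and shapes, which is the case when both descend from the same base architecture.

```python
def merge_state_dicts(a, b, weight_a):
    """Linear interpolation of two checkpoints: weight_a*a + (1 - weight_a)*b.

    Both checkpoints must have identical tensor names and sizes, i.e. the
    same architecture.
    """
    if a.keys() != b.keys():
        raise ValueError("checkpoints do not have the same tensor names")
    merged = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for tensor {name!r}")
        merged[name] = [
            weight_a * x + (1 - weight_a) * y
            for x, y in zip(a[name], b[name])
        ]
    return merged

# Toy checkpoints (assumption): names and values are made up for illustration.
nex = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.5]}
qwen = {"layer0.weight": [0.0, 4.0], "layer0.bias": [1.5]}

# Roughly the 60% / 40% split described above.
rio = merge_state_dicts(nex, qwen, 0.6)
print(rio)
```

Each merged value is just 0.6 times the first model's weight plus 0.4 times the second's, for example 0.6*2.0 + 0.4*4.0 = 2.8 for the second weight entry.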
zinodaur
Oh no, someone is profiting off of their work without proper attribution!?!?
jordz
Can someone please explain or link to some information about how models are merged? Is this genuinely merging weights mathematically or some kind of distillation (presumably not if they’ve done zero training as the post suggests).
aaronbrethorst
They really missed out by not calling it Neuromancer.
fkozlowski
I'm honestly surprised that they even had the inclination to attempt creating a model. I guess it's bullish that a municipal IT department had the guts to try this?
jkwang
This is a concerning pattern. Rebranding merged models as "homegrown" without disclosure undermines trust in open-source AI development. The community needs better provenance tracking and transparency standards for model releases.
RandyOrion
Please do not claim you trained a new model, only to get caught red-handed by others. Several people and groups have already done that, got caught, and vanished in no time.
Check how the "authors" of "this model" reacted to this problem [1]. See how they dealt with it: first changing their affiliation from https://iplanrio.rio.rj.gov.br to https://iplanrio.prefeitura.rio [2], then saying that they are sorry for being caught [3], then just removing all their affiliations once and for all [4].
I think the "authors" of "this model" [5] should be held accountable until they upload new checkpoints and the performance of the new model is verified by third parties.
P.S. To the people who downvoted me: show me why you're doing this.
[1] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
[2] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
[3] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
[4] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
[5] https://huggingface.co/prefeitura-rio
jrm4
“Well, Steve (Jobs), I think it’s more like we both had this rich neighbor named Xerox, and I broke into his house to steal the TV set, but I found out that you had already stolen it.”
-- Bill Gates
AlienRobot
The model's webpage at https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B says it's a merge now. It previously didn't contain this paragraph:
> The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.
Incidentally, are people using GitHub issues as blogs now?
blitzar
It's stupid and hilarious when someone in Rio does it; when a techbro in Silicon Valley does it they get VC funding, a Maserati and an entry on the 30 Under 30 list.
nicman23
is it any good?
thelonelyborg
this is probably occurring all over the world including in startups.
ekjhgkejhgk
One funny thing about incompetence is that they don't have the competence to know that their incompetence is straightforward to verify by a competent person.
AnotherGoodName
It's fascinating that it worked, though. Can we just merge all the open-weight models and get something better?
FooBarWidget
Can anyone explain to me what a merge is and why that works? It seems utterly bizarre to me that you can just merge weights. You can't make a working program by just merging machine instruction pages. Aren't weights tightly coupled to a specific architecture?
delusional
It's absolutely insane to me that we are now at a point where the top of the front page of Hacker News is a random GitHub issue about attribution to some random LLM merge, written in just the most disgusting AI slop style.
I would like to downvote this please.
yieldcrv
Didn’t the last thread about this have someone from the lab or an enthusiast in Rio saying exactly that?
It's a fine-tune of Qwen.
Not a conspiracy.
diego_moita
WHAT!? There are thieves in Rio de Janeiro?
Oh, I am so SHOCKED, so SHOCKED! /s
Explaining the joke: in Brazil, Rio de Janeiro is known as "Terra de bandido" (Gangster's Land).
Kinda like Chicago in the '20s or Naples and Palermo in the '90s.
pelasaco
an eternal 7x1.. and I am not talking about Curaçao..
MadrasTh0rn
Not surprised
alfiedotwtf
Wasn’t it already obvious given the awfully familiar parameter numbers?
Havoc
Nex in turn is also based on Qwen, so I don’t think they’re too far off.