Claude Code Is Steganographically Marking Requests

kirushik

Comments (117)

meowface
Value judgment aside: I am a bit surprised at how sloppily they did this. I think they could've achieved the same effect while decreasing the odds of detection via reverse engineering like this. (This field is known as "underhanded code", coined by the Underhanded C contest: https://www.underhanded-c.org. It's a little-known "art"; little-known for probably self-explanatory reasons. There are much cleverer ways of achieving effects like this. One obviously being you can move more out of the client and into the server, but the other being you can write plausibly deniable client code in a much more benign-seeming way than this. Some of what they added can only be done on the client, but I think more could've been moved.) It's possible they knew the JS bundle gets so heavily scrutinized that it'd eventually get spotted and reported on regardless so they didn't bother doing something more subtle and duplicitous. But still a bit surprising.
VortexLain
Codex CLI is FOSS, unlike Claude Code, so Codex is less likely to do things like that, and it's one more reason to avoid Claude Code and Claude in general. Hopefully, many eyes will be looking into Codex for malicious things like that.
dehrmann
Anthropic must think that their moat isn't very large if they're this worried about distillation.
sebastiennight
Can somebody clarify for me - if ANTHROPIC_BASE_URL is set to a different provider... then isn't this "marked" system prompt being sent to that provider's API rather than Anthropic's? I understand how this can be useful to Anthropic if the 3rd-party is acting as a proxy (because they end up hitting the Claude API with the marked prompt), but it looks like requests where "hostname contains deepseek" would never be sending data to Anthropic. What am I missing?
matheusmoreira
I reported a similar system prompt injection mechanism here: https://news.ycombinator.com/item?id=48259288 https://github.com/anthropics/claude-code/issues/62061 Looks like they found new "creative" uses for it, as expected. I'll keep patching it out.
jacobgold
> "That also means the client itself deserves scrutiny. If a coding agent can read your repo and run commands, the binary that ships it should be boring (for example, pi harness)" You're actually trusting your security to your harness AND model AND inference API provider in this scenario: https://jacob.gold/posts/why-i-wont-run-untrusted-models/
MattDamonSpace
“So the feature mostly punishes the exact people who are easier to fingerprint: normal developers doing weird but legitimate things” What’s the punishment here exactly?
wolttam
I used Claude Code for a month because my boss gifted me a sub and wanted me to try it. I used that month to complete a work project and then beef up my personal harness so I'd never have to deal with Anthropic (and these sorts of shenanigans) again.
LPisGood
This is very interesting. Combating resellers and distillation seems like a very difficult problem indeed. Interesting to me is that the techniques mentioned in the article are just like anti-observation techniques used by some of the more sophisticated malware out there; however, defeating them is pretty trivial.
throwawayffffas
Claude Code does feel very malwarey, to be honest. They have been like that from the start.
port3000
That's a lot of effort when they could just play a short video saying 'You wouldn't steal a car' instead
827a
This seems really, really stupid. Similar to the weird Zig runtime signature thing from a few months ago, it was bound to be discovered, quickly, and all the resellers have to do is find a new domain name that (checks notes) doesn't have the word DEEPSEEK in it. Like, seriously? Your goal was to identify resellers by checking if the proxy has the corporate name of one of your competitors in it? Is this amateur hour? All Anthropic has done is reduce trust, once again, with legitimate customers, while doing nothing to stop illegitimate customers. They need to get adults into key leadership roles, quickly.
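For illustration only, here is a minimal Python sketch of the kind of hostname substring check this comment describes, and why renaming the domain sidesteps it. The marker list, the function name, and the example hostnames are all hypothetical; this is not taken from the Claude Code bundle.

```python
from urllib.parse import urlparse

# Hypothetical marker list, for illustration only.
MARKERS = ("deepseek",)

def hostname_matches(base_url: str) -> bool:
    """Return True if the URL's hostname contains any marker substring."""
    host = (urlparse(base_url).hostname or "").lower()
    return any(marker in host for marker in MARKERS)

# A proxy whose hostname contains the marker is caught...
caught = hostname_matches("https://api.deepseek.example/v1")
# ...while the same proxy behind a neutral hostname is not.
evaded = not hostname_matches("https://relay.example/v1")
```

Under these assumptions the check depends entirely on a name the reseller controls, which is the weakness the comment points at.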
sigmoid10
If they only collect the data for analysis I guess this is fine (they already get way more sensitive data from users anyways, so if privacy is your concern you've made the mistake many steps ago). The much more interesting question is if they directly act on this data in their API. For example by rate-limiting, compute-limiting or rerouting to weaker models. That might even be legally questionable. I would really like to see this as a follow-up analysis, but I guess it is way more difficult and will also cost quite a bit in tokens.
tgtweak
None of this is surprising - they're trying to mask and relay when they detect known patterns of what looks like distillation attacks and client app copying/modification. The list obfuscation here is likely to prevent or make it difficult for those same adversaries to work around this or delete/null it out when making a bootleg copy. Cool reverse engineering/analysis report but if this is the extent of nefarious activity that came of it (trying to catch/mitigate Chinese lab model distillations), that's kind of encouraging.
100ms
What's the point of even trying to obfuscate this with such a simple method? Could at least have hidden the targeted features by storing their hashes or embedding a bloom filter or similar
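A rough sketch of the hashing idea, in Python: compare digests of hostname labels against a stored set, so the plaintext names never appear in the shipped code. The function names and the listed name are hypothetical, and exact matching on dot-separated labels is an assumption of this sketch, not something known about the actual client.

```python
import hashlib

def digest(label: str) -> str:
    """SHA-256 hex digest of a lowercased hostname label."""
    return hashlib.sha256(label.lower().encode("utf-8")).hexdigest()

# In a shipped client only the hex digests would be embedded; they are
# computed here from a hypothetical name to keep the sketch self-contained.
HASHED_LABELS = frozenset({digest("deepseek")})

def host_is_listed(hostname: str) -> bool:
    """True if any dot-separated label of the hostname hashes into the set."""
    return any(digest(label) in HASHED_LABELS for label in hostname.split("."))

exact_label_hit = host_is_listed("api.deepseek.example")
# Hashing only supports exact matches, so a label that merely contains
# the name is missed, unlike with a substring check.
partial_label_hit = host_is_listed("deepseek-proxy.example")
```

The trade-off is that short, guessable names can still be recovered by hashing a dictionary of candidates, so this raises the effort for a reverse engineer rather than preventing recovery.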
iqandjoke
It is about China detection. They seem to put a tracker on the email as well.
fny
This was already discovered during the source map leak. "This is not a malicious feature, but it is a weird choice for a developer tool that asks for trust." They already tell you they scan for malicious prompts, and they have no ZDR guarantees for consumers. Why do signatures like this matter at all?
SaaShack26
I use it too
MangoCoffee
The AI race right now is in a sad state. China's playbook is to release open-weight models and train them on its own chips. Anthropic pushes fear and control. But the only way to win is by innovating. China is flooding the market with cheap, good enough models, while the U.S. is building a Chinese firewall.
ahmedehab_01
Frankly, I don't see this as the concerning behaviour the article describes. It is fine to try to protect against distillation through a technique like this. This will also allow them to, instead of blocking the distillation agents, respond with a poorer result/model, hindering the progress of distillation, momentarily at least. I would guess that's their first line of defense; they should have more techniques to identify distillation because that's a very simple way of detecting the host and can be easily spoofed.
felipelalli
Ridiculous.
Klonoar
If there weren't already enough tells that something is AI-generated, I guess you could add this to the list.
a_c
It piqued my interest. I think I’ve found a weekend project
bitlad
Silicon Valley season 6 was on point.
mosfets
I clicked the link to learn what steganography means...
phendrenad2
Non-hugged: https://archive.is/Wdhp0
ductsurprise
Is it just a minified localization (l10n) function maybe?
hhh
Cool fingerprinting avenue.
ajross
Headline is, frankly, awful. This isn't the AI secretly doing stuff and hiding it. This is the very human Anthropic engineers trying to detect Chinese scraping via some frankly hamfisted and unimaginative URL trickery.
love0972
Is that really how it is? How will this affect our future?
grayhatter
Here's the sha of the prompt I submitted... no I don't know why there are no saved prompts with that sha. What do you mean you don't know where the bug is coming from? No, I absolutely didn't make it up, how could you accuse me of that? Does anyone know why this regex isn't working? I double checked it 27 times, I even asked the LLM. They all say this regex should be finding these dates. Weird, suddenly all the conversations are breaking when I feed them into this other tool? Something about UTF-8 errors, but I'm sure I'm only using ASCII? I do try to take care to make sure the things I build can be used by other people even when they care about different things. I care about understandability, determinism (as it relates to computing), and repeatability (because I want to be able to trust the systems I use). If y'all would be willing to try to account for use cases of others, and try not to break them... that would be nice. Please note that generally when you modify something that belongs to someone else without telling them... things should be expected to break.
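The failure modes in this comment can be reproduced with a small Python sketch. It assumes, purely for illustration, that the marking slips a zero-width space (U+200B) into the text; the actual mechanism may differ, and the sample prompt and date are made up.

```python
import hashlib
import re

original = "Today is 2025-01-15."
# Hypothetical marking: a zero-width space inserted inside the date.
marked = original.replace("2025-01", "2025\u200b-01")

def sha256_hex(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoding of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# The two strings render identically in most terminals, but the digest differs.
hash_changed = sha256_hex(original) != sha256_hex(marked)

# A date regex that matched the original no longer matches the marked text.
date_re = re.compile(r"\d{4}-\d{2}-\d{2}")
regex_matches_original = date_re.search(original) is not None
regex_matches_marked = date_re.search(marked) is not None

# Text the user typed as pure ASCII is no longer ASCII after marking.
still_ascii = marked.isascii()
```

Under that assumption, a saved hash, a date regex, and an ASCII-only downstream tool all break without any visible change to the prompt.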
theplumber
The more I learn about Anthropic the more they disgust me. Fingers crossed for all the companies from their “ban list”.