Comments (66)
- parliament32: Note that watermarking (yes, including text) is a requirement [1] of the EU AI Act, which goes into effect in August 2026, so I suspect we'll see a lot more work in this space in the near future.
  [1] Specifically, "...synthetic audio, image, video or text content, shall ensure that the outputs of the AI system are marked in a machine-readable format and detectable as artificially generated or manipulated"; see https://artificialintelligenceact.eu/article/50/
- gregorkas: I genuinely feel that in this AI world we need the inverse: every analogue or digital photo taken by traditional means of photography should be signed with a certificate, so anyone can verify its authenticity.
- jamiecode: The text watermarking is the more interesting problem here. Image watermarking is fairly tractable: you can embed a robust signal in spatial or frequency domains. Text watermarking works by biasing token selection at generation time, and detection is a statistical test over that distribution.
  Which means short texts are basically useless. A 50-token reply has too little signal for the test to reach any confidence. The original SynthID text paper puts minimum viable detection at a few hundred tokens, so for most real-world cases (emails, short posts, one-liners) it just doesn't work.
  The other thing: paraphrase attacks break it. Ask any other model to rewrite watermarked text and the watermark is gone, because you're now sampling from a different distribution. EU compliance built on top of this feels genuinely fragile for anything other than long-form content from controlled providers.
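The statistical test jamiecode describes can be sketched with a toy "green list" scheme (in the style of Kirchenbauer et al., not SynthID's actual tournament-sampling method; the secret key, vocabulary size, and green fraction below are illustrative assumptions):

```python
import hashlib
import math

VOCAB_SIZE = 50_000
GREEN_FRACTION = 0.5  # expected green-token rate under the null (unwatermarked text)

def is_green(prev_token: int, token: int, key: bytes = b"secret") -> bool:
    """Keyed hash of (previous token, candidate token) splits the vocab
    into 'green' tokens (favored at generation time) and 'red' tokens."""
    h = hashlib.sha256(key + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big"))
    return int.from_bytes(h.digest()[:4], "big") / 2**32 < GREEN_FRACTION

def detection_z_score(tokens: list[int]) -> float:
    """One-proportion z-test: how far above chance is the green-token count?"""
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std
```

Since the z-score grows only like the square root of the token count, a 50-token reply gives a weak signal even under heavy bias, which is why detection needs hundreds of tokens.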
- Aldipower: As a synthesizer collector with serious GAS I find this particular name very offensive.
- throwaway13337: These sorts of tools will only be able to positively identify a subset of genAI content. But I suspect that people will use it to 'prove' something is not genAI.
  In a sense, the identifier company can be an arbiter of the truth. Powerful.
  Training people on a half-solution like this might do more harm than good.
- manbash: It's nice that they explain the "what" (...it is doing) but not the "why". Who is going to use it, and for what reasons?
  Also, if it's essentially a sort of metadata, can't the generated image be replicated (e.g. screenshotted) and thus stripped of any such data?
- kingstnap: It's security through obscurity. I'm sure that with the technical details, or even just sufficient access to a predictive oracle, you could break this.
  But I suppose it adds friction, so better than nothing.
  Watermarking text without affecting it is an interesting, seemingly weird idea. Does it work any better than (with knowledge of the model used to produce said text) just observing that the perplexity is low because it's "on policy" generated text?
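The perplexity check kingstnap mentions amounts to a likelihood test under the generating model. A minimal sketch, assuming you already have per-token log-probabilities from that model (the numbers below are made up for illustration):

```python
import math

def perplexity(logprobs: list[float]) -> float:
    """Exponential of the mean negative log-likelihood per token."""
    return math.exp(-sum(logprobs) / len(logprobs))

# On-policy text: the model assigns its own output high probability.
on_policy = [-0.5, -0.3, -0.8, -0.2]
# Off-policy (e.g. human-written) text tends to look far less likely to it.
off_policy = [-4.1, -3.7, -5.2, -4.6]
```

Unlike a keyed watermark, this requires access to the exact model's logits, and it will also flag any highly predictable human text, so it is a weaker signal in practice.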
- ks2048: How about a database of verified non-AI images? I'm thinking of historical images, where there aren't a huge number of existing images and no more will ever be created.
  If I see something labeled "Street scene in Paris, 1905", I want to know if it is legit.
- galleywest200: This is great, but there is no way for me to verify whether groups or nation states can pay for a special contract that exempts their outputs from watermarking.
- u1hcw9nx: This technology could be used for copyright enforcement as well.
  > The watermark doesn’t change the image or video quality. It’s added the moment content is created, and designed to stand up to modifications like cropping, adding filters, changing frame rates, or lossy compression.
  But does it survive if you use another generative image model to replicate the image?
- PaulHoule: ...But it can be hard to tell the difference between content that’s been AI-generated and content created without AI.
  Pro tip: something like that sherbet-colored dog is always AI-generated.
- zelias: Seems like this really just validates whether a piece of AI content was generated by Google, not whether it is AI-generated in general. What incentive do open models have to adopt this?
- geor9e: This is from 2025. Did something new happen? What am I missing here?
- squigz: Looks like there's a lot more info here, at least about the text version: https://ai.google.dev/responsible/docs/safeguards/synthid
- ekjhgkejhgk: Is there a paper for this?
- gigel82: Reposting a comment I made on an earlier thread on this.
  We need to be super careful with how legislation around this is passed and implemented. As it currently stands, I can totally see this as a backdoor to surveillance and government overreach.
  If social media platforms are required by law to categorize content as AI-generated, they need to check with the public "AI generation" providers. And since there is no agreed-upon (public) standard for imperceptible watermark hashing, the content (image, video, audio) in its entirety needs to be uploaded to the various providers to check whether it's AI-generated.
  Yes, it sounds crazy, but that's the plan: imagine every image you post on Facebook/X/Reddit/WhatsApp/whatever getting uploaded to Google / Microsoft / OpenAI / UnnamedGovernmentEntity / etc. to "check if it's AI". That's what the current law in Korea and the upcoming laws in California and the EU (for August 2026) require :(
- ChrisArchitect: Something new here, OP?
  Some previous discussion: https://news.ycombinator.com/item?id=45071677
- andrewmcwatters: I wonder how it stands up to feature analysis. "Generate a pure white image." "Generate a pure black image." Channel diff, extract steganographic signature for analysis.
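andrewmcwatters' probe can be simulated end to end. A hedged sketch: average many nominally uniform generations and inspect the per-pixel residual, which is where a naive additive spatial watermark would show up. The fixed pattern below is a stand-in; a real scheme may embed in a frequency domain or vary the pattern per image, in which case this reveals nothing. Mid-gray is used instead of pure white so the simulated watermark isn't clipped at 255:

```python
import numpy as np

def residual_map(images: np.ndarray, target: float) -> np.ndarray:
    """Per-pixel deviation of the mean image from the requested uniform color.
    images: (N, H, W, C) uint8 generations of a nominally uniform image."""
    return images.astype(np.float64).mean(axis=0) - target

# Simulate a generator that adds a fixed low-amplitude pattern plus noise.
rng = np.random.default_rng(0)
pattern = rng.integers(-2, 3, size=(8, 8, 3)).astype(np.float64)   # hidden signature
raw = 128.0 + pattern + rng.normal(0.0, 1.0, size=(64, 8, 8, 3))   # 64 "generations"
imgs = np.clip(np.rint(raw), 0, 255).astype(np.uint8)

res = residual_map(imgs, 128.0)  # averaging suppresses noise, leaving the pattern
```

Averaging N images shrinks the per-pixel noise by a factor of sqrt(N), so even a sub-pixel-amplitude fixed pattern becomes visible in the residual.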