
Comments (66)

  • parliament32
    Note that watermarking (yes, including text) is a requirement[1] of the EU AI Act, which goes into effect in August 2026, so I suspect we'll see a lot more work in this space in the near future.

    [1] Specifically: "...synthetic audio, image, video or text content, shall ensure that the outputs of the AI system are marked in a machine-readable format and detectable as artificially generated or manipulated", see https://artificialintelligenceact.eu/article/50/
  • gregorkas
    I genuinely feel that in this AI world we need the inverse: every analogue or digital photo taken by traditional photographic means should be signed with a certificate, so anyone can verify its authenticity.
  • jamiecode
    The text watermarking is the more interesting problem here. Image watermarking is fairly tractable - you can embed a robust signal in the spatial or frequency domain. Text watermarking works by biasing token selection at generation time, and detection is a statistical test over that distribution.

    Which means short texts are basically useless. A 50-token reply has too little signal for the test to reach any confidence. The original SynthID text paper puts minimum viable detection at a few hundred tokens - so for most real-world cases (emails, short posts, one-liners) it just doesn't work.

    The other thing: paraphrase attacks break it. Ask any other model to rewrite watermarked text and the watermark is gone, because you're now sampling from a different distribution. EU compliance built on top of this feels genuinely fragile for anything other than long-form content from controlled providers.
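The detection jamiecode describes can be sketched in a few lines. This is a generic "green-list" scheme in the style of academic text-watermarking papers, not SynthID's actual tournament-sampling design (which is not public in this form): each token is pseudo-randomly assigned to a green list keyed by its predecessor, generation biases sampling toward green tokens, and detection is a one-proportion z-test on the green count.

```python
import hashlib
import math

GAMMA = 0.5  # assumed fraction of the vocabulary marked "green" at each step


def is_green(prev_token: int, token: int) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the
    previous token - a stand-in for the keyed hash a real scheme uses."""
    h = hashlib.sha256(f"{prev_token}:{token}".encode()).digest()
    return h[0] < 256 * GAMMA


def detect_z_score(tokens: list[int]) -> float:
    """One-proportion z-test: how far the observed green-token count
    deviates from what GAMMA predicts for unwatermarked text."""
    n = len(tokens) - 1
    greens = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return (greens - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

This also shows why short texts fail: with a realistic soft bias (say 60% of tokens land green instead of 100%), the expected score is z ≈ 0.2·√n, so a 50-token reply gives z ≈ 1.4 - nowhere near statistical significance - while a few hundred tokens clears a typical threshold.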
  • Aldipower
    As a synthesizer collector with serious GAS I find this particular name very offensive.
  • throwaway13337
    These sorts of tools will only be able to positively identify a subset of genAI content. But I suspect that people will use it to 'prove' something is not genAI.

    In a sense, the identifier company can be an arbiter of the truth. Powerful.

    Training people on a half-solution like this might do more harm than good.
  • manbash
    It's nice that they explain the "what" (...it is doing) but not the "why". Who is going to use it, and for what reasons?

    Also, if it's essentially a sort of metadata, can't the generated image be replicated (e.g. screenshotted) and thus stripped of any such data?
  • kingstnap
    It's security through obscurity. I'm sure with the technical details, or even just sufficient access to a predictive oracle, you could break this. But I suppose it adds friction, so better than nothing.

    Watermarking text without affecting it is an interesting, seemingly weird idea. Does it work any better than (with knowledge of the model used to produce said text) just observing that the perplexity is low, because it's "on-policy" generated text?
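The perplexity baseline kingstnap asks about is simple to state: perplexity is the exponential of the mean negative log-likelihood per token, and text a model generated itself ("on-policy") tends to score unusually low under that same model. A toy sketch, assuming you already have per-token log-probabilities from a scoring model:

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp of the mean negative log-likelihood per token.
    Lower values mean the text is more 'expected' under the scoring model."""
    assert token_logprobs, "need at least one token"
    return math.exp(-sum(token_logprobs) / len(token_logprobs))
```

The practical difference from a keyed watermark: this test needs white-box access to the exact generating model, and low perplexity is also produced by any formulaic human text, so it separates far less cleanly than a keyed statistical test.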
  • ks2048
    How about a database of verified non-AI images? I'm thinking of historical images, where there aren't a huge number of existing images and no more will ever be created. If I see something labeled "Street scene in Paris, 1905", I want to know if it is legit.
  • galleywest200
    This is great, but there is no way for me to verify whether groups or nation states can pay for a special contract under which their outputs are not watermarked.
  • u1hcw9nx
    This technology could be used for copyright enforcement as well.

    > The watermark doesn’t change the image or video quality. It’s added the moment content is created, and designed to stand up to modifications like cropping, adding filters, changing frame rates, or lossy compression.

    But does it survive if you use another generative image model to replicate the image?
  • PaulHoule
    > ...But it can be hard to tell the difference between content that’s been AI-generated, and content created without AI.

    Pro-Tip: Something like that sherbet-colored dog is always AI generated
  • zelias
    Seems like this really just validates whether a piece of AI content was generated by Google, not AI generated in general.

    What incentive do open models have to adopt this?
  • geor9e
    This is from 2025. Did something new happen? What am I missing here?
  • squigz
    Looks like there's a lot more info here, at least about the text version.https://ai.google.dev/responsible/docs/safeguards/synthid
  • ekjhgkejhgk
    Is there a paper for this?
  • gigel82
    Reposting a comment I made on an earlier thread on this.

    We need to be super careful with how legislation around this is passed and implemented. As it currently stands, I can totally see this becoming a backdoor to surveillance and government overreach.

    If social media platforms are required by law to categorize content as AI generated, this means they need to check with the public "AI generation" providers. And since there is no agreed-upon (public) standard for imperceptible watermark hashing, that means the content (image, video, audio) in its entirety needs to be uploaded to the various providers to check whether it's AI generated.

    Yes, it sounds crazy, but that's the plan: imagine every image you post on Facebook/X/Reddit/WhatsApp/whatever getting uploaded to Google / Microsoft / OpenAI / UnnamedGovernmentEntity / etc. to "check if it's AI". That's what the current law in Korea and the upcoming laws in California and the EU (for August 2026) require :(
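The fan-out gigel82 describes can be made concrete with a hypothetical sketch (the detector registry and interface below are invented; no such public API exists). Because the watermark schemes share no common standard, a platform cannot check a local hash - each detector here stands in for an HTTP call that would upload the entire file to that provider, which is exactly the privacy concern raised above.

```python
from typing import Callable, Dict, List

# Hypothetical detector interface: takes raw media bytes, returns True if
# that provider claims the content carries its watermark. In practice each
# call would ship the whole file to the provider's servers.
Detector = Callable[[bytes], bool]


def flag_ai_content(content: bytes, detectors: Dict[str, Detector]) -> List[str]:
    """Fan the content out to every registered detector and return the
    names of the providers that claim it as AI generated."""
    return [name for name, detect in detectors.items() if detect(content)]
```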
  • ChrisArchitect
    Something new here, OP?

    Some previous discussion: https://news.ycombinator.com/item?id=45071677
  • andrewmcwatters
    I wonder how it stands up to feature analysis."Generate a pure white image." "Generate a pure black image." Channel diff, extract steganographic signature for analysis.
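The probe andrewmcwatters suggests can be sketched in pure Python (no claim that this defeats SynthID, whose watermark is designed to survive simple transforms): request a flat-colour image, then measure per-channel deviation from the ideal flat value; any structured nonzero residue is candidate steganographic signal.

```python
def flat_residual(img, expected):
    """Per-pixel, per-channel absolute deviation of `img` (nested lists of
    RGB tuples) from a supposedly flat colour `expected`, e.g.
    (255, 255, 255) for "generate a pure white image"."""
    return [[tuple(abs(c - e) for c, e in zip(px, expected)) for px in row]
            for row in img]


def total_energy(residual):
    """Sum of absolute deviations; a consistently nonzero result on a
    'pure' colour prompt suggests embedded structure worth analysing."""
    return sum(c for row in residual for px in row for c in px)
```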