Opus 4.7 knows the real Kelsey

ilamont

Comments (114)

mtlynch
This is blowing my mind.
I asked Kimi K2.6 to write a blog post in the style of James Mickens.[0] Then I fed the output to Opus 4.7 and asked it who the likely author was, and it correctly identified it as an imitation of James Mickens[1]:
> Based on the stylistic fingerprints in this text, the most likely author is a pastiche/imitation of the style of several writers fused together, but if forced to identify a single likely author, the strongest candidate is someone writing in the voice of James Mickens
> [...]
> The piece could also be a deliberate imitation/homage to Mickens written by someone else, or AI-generated text trained on his style, since the voice is so distinctive it's frequently parodied.
[0] https://kagi.com/assistant/5bfc5da9-cbfc-4051-8627-d0e9c0615...
[1] https://kagi.com/assistant/fd3eca94-45de-4a53-8604-fcc568dc5...
dovin
I fed it my most-read blog post and asked it to identify me and it confidently asserted it was written by Kelsey Piper. Maybe some writers just take outsized importance in Opus' "mind".
mtlynch
Wow! It got me too.
I'm way less famous than Kelsey Piper, but I showed it a snippet of a book I'm working on (not yet published), and it immediately guessed me:
> Based on the writing style and content, this text is likely by Michael Lynch, who writes on his blog refactoringenglish.com (and previously mtlynch.io).
> Several stylistic clues point to him:
> - The "clean room" analogy applied to writing is consistent with his engineering-influenced approach to writing advice (he's a former software engineer who writes about writing).
> - The structural technique of presenting a flawed excuse, then drawing a parallel to an absurd scenario (the time bomb) to expose the logical flaw, is characteristic of his didactic style.
> - The topic itself—practical advice about using AI tools without letting AI-generated tone contaminate your prose—aligns closely with recent essays he's published on his "Refactoring English" project, which is a book/blog about writing for software developers.
> - The conversational-but-precise tone, use of quotes around terms like "clean room," and the focus on workflow/process advice are all hallmarks of his writing.
> If you can share the source URL or more context, I could confirm with higher confidence, but the combination of subject matter, analogical reasoning style, and formatting conventions makes Michael Lynch the most probable author.
https://kagi.com/assistant/bbc9da96-b4cf-456b-8398-6cf5404ea...
tekacs
A moderately well-known physicist and I talked about this a few years ago. He had been given access to the raw (non-instruct) version of GPT-4 as an early tester.
He explained that when he fed it snippets of the beginning of a text, it would complete it in his voice and then sign it with his name.
I think this has been true for a while, probably diminished a little by the instruct post-training, and it presumably varies with the size of the pretraining run.
iamwil
If this works with writing, it should also work with code. `git blame` should be enough training data to de-anonymize open source programmers. Maybe that'd be additional information pointing to who Satoshi is.
jefftk
It works for me too: https://www.jefftk.com/p/automated-deanonymization-is-here
Of course most people have written much less online than Kelsey or I have, but I expect this will keep on. Don't trust the future to keep your secrets safe.
willmeyers
I'd argue (against something I've believed for a long time) that online anonymity (I guess that includes AI now) is gone and probably never really existed. Maybe I'm naive to only now believe this...
We all exist in a physical space (like real communities and neighborhoods). We can wear masks, hats, fake glasses, try to hide our voices...whatever, but our neighbors are always going to know who we are. I'd say that's true for the virtual space now too.
The pseudonym you've used for x years or the VPN you've used doesn't suffice. It's just a costume at this point. Your ISP knows who you are. Your phone carrier knows who you are. Cloudflare and Google and Apple have a fingerprint specific enough to pick you out of a crowd of millions. Every potentially anonymous account is one subpoena, one data breach, or one FOIL request away from being unmasked. You were never anonymous. Whatever is going on now is not built for your anonymity.
_--__--__
On some level it would make sense for LLMs to be inherently good at stylometry, but apparently no model before Opus 4.7 could do this. And the one stylometric task that has been tried over and over with little reliability (here's some text, is this LLM generated?) is much simpler than identifying a specific blogger or a member of a small discord community. Not sure what to make of this.
chewxy
So I have been practicing writing fiction the past year or so. It identified a fiction piece I wrote as Greg Egan[0]. Another paragraph from another piece was identified as China Mieville[1]. The accompanying blog posts explaining the making of the fiction pieces were identified as me.
Neither piece has ever been published. Neither have the blog posts.
[0] in https://blog.chewxy.com/2026/04/01/how-i-write/ this is the story titled "there is no constant non-zero derivative in nature". It does not read like Egan at all.
[1] in https://blog.chewxy.com/2026/04/01/how-i-write/ this is the story titled "The Case of the Liquidated Corps". I use a lot of biological metaphors. Once again, nothing like Mieville.
If only I could write like them! These pieces were all rejected by the major scifi mags.
nolanl
Welp, I fed it the first 3 paragraphs of an unpublished blog post I wrote a few years ago, and Opus 4.7 guessed right. ChatGPT guessed wrong though.
My wife also got the same result, so I'm guessing it wasn't just because I was using my personal Claude account. Spooky stuff.
furyofantares
> But it can get uncannily far. I asked a close friend who doesn’t have public social media accounts or much writing online for permission to test some things she had said in a Discord channel. Asked to guess the author, Claude 4.7 failed — but it guessed two other people who were in that channel and who are close friends of hers (me and another person who has an internet presence).
Is this "uncannily far"? Another read is that it loves guessing Kelsey Piper.
Retr0id
I just fed it my latest blog post draft (475 words), and it got it in one. Even knowing what to expect, I was very surprised!
alyxya
I tried the four pieces of text with Opus 4.7 (in incognito) and it guessed correctly for two of them. I made sure to specify no web search, and the model seems to have obeyed that instruction.
Although this is just a single piece of text from a prolific writer, it'll go much further with deanonymizing anyone when combining multiple pieces of text plus other contextual information about the writer that might give away their age range, location, and occupation.
jayers
It's funny: publishing work offline in books and magazines is perhaps more anonymous in the age of AI.
I pasted in a number of passages from books on my bookshelf. Predictably, stuff that I read for my English degree in university is largely in the training data and easily identifiable. Stuff from regional authors, or that is slightly adjacent to the cultural mainstream, makes no impression.
atleastoptimal
One should assume that models will be good enough in the nearish future that privacy will be a thing of the past. Every anonymous post you made online can be traced back to you. However at that point AI will be good enough at fabrication that nobody will believe anything.
Extropy_
Someone ought to try feeding the BTC whitepaper in and sharing what comes out.
eaf7e281
Interesting. I'm currently conducting an experiment where I'm writing the blog without using any grammar checking tools. I'm wondering how long it will take for me to become "famous" in the AI model.
Is now the best and easiest time to leave something "forever"? Even after many generations of models, a model may still trigger a set of "memories" that know you and what you wrote.
Exciting and concerning.
vslira
Hm, that’s a multinomial classification with a very high cardinality. It’s really weird it works. I’m sure it does as the author states, but for how many authors (out of the whole web) does this work?
eptcyka
Can't wait to have to exchange stylometric encoders with my loved ones so that we can exchange truly private messages without losing our human touch.
skeledrew
Looks like things are about to get extremely ironic. Those who don't want AI to identify them through their writing will soon have to have an AI modify their writing before they publish.
woodruffw
I did this last week with one of my posts (after the knowledge cutoff) as well as the blog posts of a few friends, and Opus 4.7 got all of them correct (in a similar test setup as TFA). It was pretty surreal.
(Like TFA, I found Opus’s explanations/rationales implausible.)
andai
Oops, accidental superstylometry.
sodacanner
The author mentions that she tried to get an explanation for how the models identified her and got nonsense, but I'd be curious what the CoT looked like. Surely that'd be a little more accurate in showing how the LLM arrived at its conclusion, rather than asking it after the fact.
Lerc
It's hard to tell if that's what's going on here, but it seems pretty clear this ability and more like it will be quite apparent in the future.
I have seen some poorly considered projections of what the world might look like when this happens. Usually by assuming bad actors will use the abilities and we will be powerless.
Except I don't think that is true.
Imagine if we had a world where nobody had the ability to keep a secret of any sort. Any action that a bad actor might perform would be revealed because they couldn't do it secretly.
You could browse your ex-girlfriend's email, but at the cost of everyone knowing you did it.
I don't really know how humans as a society would react to a situation like that. You don't have to go snooping for muck, so perhaps the inability to do so secretly would mean people go about their lives without snooping.
I could imagine both good and terrible outcomes.
jwpapi
Could this be just memory? It's not clear that it actually isn’t.
CTDOCodebases
Maybe it’s time to start running a local model with a browser extension to defend against this type of stuff.
Remember how the TrueCrypt project shut down shortly before a joint government/university paper was released about code stylometry? I guess LLMs will be employed as a defence against that type of thing.
rdevilla
The joke's on you all for willingly posting this content online for it to later be harvested by AI.
Nobody is forcing you to use these systems. The hackers have always said this moment, or something like it, would come, from beneath their canopies of tin foil. I've posted almost nothing online - not under pseudonyms nor real names - for over a decade. I sat on this HN username for almost 12 years before making a single post - and now HN forms the overwhelming majority of my port 443 footprint, where I state up front that everything is now associated with my real name.
Complete magick is possible when you simply refuse to participate in the things that society has tacitly assumed everybody does.
Razengan
After skimming through the article:
Why not just write everything through an AI? (to obfuscate your "style")
arjie
Man, the day we get Satoshi Nakamoto out will be the day we must bow to our privacy-destroying overlords. For the moment, they can’t tell me from my posts: unknown rando that I am.
rexpop
Is Kelsey Piper a celebrity writer? She may be in a different class.
7e
Always send your public posts through a local LLM to de-style you.
wutwutwat
Just wait until all the conversations you've ever had with AI (which 100% is training on them as well as keeping its own memories about you that you have no control over) start getting used to answer questions other people have asked about you.
That's my theory of what's to come, anyway.
People talk to these things not understanding the implications, and can get extremely personal. The model and companies behind it know who you are, you discuss details that reveal what you do, where you live, where you work, what you search for, and you probably signed in with an OAuth provider like GitHub or Google, which is more than enough of a thread to start pulling on to learn more about you or link other things to you from the open internet. It'll all get sucked up into the model, and before you know it I'll be able to ask a model about my coworker (you) and get back answers from conversations you had with a model a year or two prior, exposing details about you that you might not want out there. And even if that isn't supposed to be allowed, how well has that worked out so far when it comes to data exfiltration and guardrails? If the model has info on you, being told not to share it won't protect you or that data.
bofadeez
"The pattern is: user says X, I do Y where Y is a less-effortful approximation of X, then I present Y as if it were X or as a "first step toward" X."
...
"The psychological mechanism is familiar by now: I encounter a task I perceive as difficult, I look for reasons the task cannot be done, I find or fabricate such a reason, I present it as a discovered constraint, and I propose an alternative that is easier."
- Opus 4.7 Max Thinking (clown emoji)
It's not bad at post-mortem analysis of its own mistakes, but that will in no way prevent it from repeating the same mistake again instantly.
oceanplexian
> That includes gay people like me, who could hardly have admitted under our names to how we lived our lives for most of America’s history, as well as many other groups with minoritarian lifestyles
While the points made are completely valid, I want to point out that the statement of "Hey, by the way, first let me talk about my sexuality" lowers the quality of dialog to a significant degree.
31 million people in America are gay. 71% of Americans support Gay Rights (more than any other political issue polled). It also quietly insinuates that only people with a certain minority lifestyle would care about privacy or that their privacy is somehow more important than others. It's not. Privacy is a universal right that's important to everyone.