Comments (40)

  • ciroduran
    I stopped being concerned about email harvesting years ago; I simply leave the email on my website. Spam handling is okay enough, I guess. But I like this review of techniques. Even the simplest ones are very effective, which surprised me.
  • Croak
    One trick is having a tarpit email address on your website. It is hidden using CSS so no real visitor sees it, but it is visible in the source. If your mail server receives mail for that address, you can just block the sending IP for 24h.
  • badsectoracula
    Some time ago i was wondering if the common "me at foobar dot com" that you still see a lot of people use actually helps at all, especially now with LLMs, so i searched for some common "obfuscation" techniques and found this site (not the 2026 update, but the previous version; it was a few months ago). Then i wrote a simple LLM query with a bunch of examples from the site[0] (the tool is just a frontend for a commandline program that uses llama.cpp and Mistral Small 3.1 in Q4_K_M quantization, since it loads relatively fast and is fine for simple prompts). AFAICT it could reveal anything that wasn't relying on CSS tricks or JavaScript.
    Like others mentioned, though, personally i haven't been bothered by email harvesting for years now, since spam filters seem to do a decent job. I have my email posted in plaintext here (which i bet is harvested very often) and in various other places, and the occasional spam i get is eclipsed by "spam" from services i've actually signed up for (cough, LinkedIn, cough).
    [0] https://i.imgur.com/ytYkyQW.png
  • binaryturtle
    When I wrote my own brainf*ck interpreter (in C) at the start of the year, I was really struggling to find a use for the language. Eventually I had the idea to obfuscate emails on my websites with the language.
    Basically each email gets written as a brainf*ck program and stored in a "data-" attribute. By default the HTML only includes a more primitively obfuscated statement, "Must enable Javascript to see e-mail.", which then gets replaced by another brainf*ck interpreter (in JS) with the output of the brainf*ck code. Since we only output ASCII, we can reduce the size of the brainf*ck code by always adding 32 to each value it outputs. The Javascript is loaded from what looks like a 3rd party domain. There we filter based on heuristics and check if the "referer" matches before sending out the actual interpreter code.
    Of course all this would not help if a scraper properly runs things through Javascript too.
    Recently I read that you will soon be able to run DOOM via CSS, so certainly it should be possible to have a brainf*ck interpreter in CSS? That would be the next step… just to get rid of the Javascript, but then I'm okay with all the downsides of using Javascript just for the e-mail obfuscation.
    Anyway… I also regularly (at least once a year) rotate those public contact addresses.
  • xiconfjs
    WTH, a 302 into a "mailto:" (search for "HTTP redirect" in the featured article) opens up my e-mail client without clicking a mailto link!? This seems wrong.
  • bit1993
    Good stuff, but I think the title should be Email address obfuscation. Thank you for sharing I guess, but spammers can now learn from this too (:
  • sureglymop
    What I often see is js that fetches the email from the server separately and inserts it.
  • newscracker
    > HTML entities are often decoded automatically by server-side libraries, which means that even the most basic harvesters can get your email addresses without any special effort. This technique should be worthless, and yet it still stops most harvesters.
    Anecdotal, but I’ve used HTML entities on a public static website for a long time, in an href attribute with mailto, and yet I’ve not seen any spam.
    I guess any spammer who uses some level of GenAI to process and extract email addresses would have a lot more success against all the methods listed in this article.
  • dandersch
    Very interesting. It seems for his own email the author has opted for a combination of the CSS display none technique and a XOR cipher: <span class="hidden email"><b>999a8f84898f98</b>aa<b>878b8386c4</b>999a8f84898f988785989e8f84998f84c4898587</span>
  • siruwastaken
    I'm surprised that HTML entity substitution performs so well. I would have assumed that scrapers could at least speak proper HTML.
  • fmajid
    I use SVG where I created a text object in Affinity Designer and converted it to curves so the SVG doesn't have text any more, just vectors for the glyphs of it. Seems to work pretty well at keeping spammers at bay.
  • gfody
    I filter out everything that does NOT include “+asdf” in the To: address.
  • _ache_
    I'm sorry, but that is not how email addresses are spammed in bulk.
    The data sources are the enormous data breaches that are more and more frequent. There is more incentive to collect more information on someone you already know something about than to spam an email address you don't even know is valid.
    The spam can also be much more effective, as it presents itself with personal information about the target.
  • jwr
    This is such a waste of effort. Your E-mail address is not and can't be a secret. It will get into spammer databases eventually, no matter what you do. You will spend a lot of effort doing all these fancy tricks, and eventually you will get spam anyway.
    Also, a note to those who make fancy "me+someservice@somedomain.com" addresses: make really sure you are in control and these work. Some services (including mine) will need to E-mail you one day, for example to tell you that your account will be deleted because of inactivity. If you don't receive that E-mail because of your fancy spam defenses, your account will be deleted. I've seen people hurt themselves like this and it makes me sad.
    On a constructive note: what works very well is spam filtering using LLMs. We have AI to help us with this problem today. I wrote an LLM despammer tool which processes my inbox via IMAP using a local LLM (for privacy reasons). I see >97% accuracy in my benchmarks on my (very difficult) testing corpus. It's nearly perfect in real life usage. I've tested many local models in the 4-32B range and the top practical choice is gpt-oss:20b (GGUF, I run it from LM Studio, MLX quantizations are worse); not only does it perform very well, but it's also really fast.
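
Croak's tarpit idea (block any IP that mails a hidden address for 24h) could be sketched roughly as below. This is only an illustration under stated assumptions: the `TarpitBlocklist` class, its method names, and the address `tarpit@example.com` are all hypothetical, and a real mail server would apply the check at the SMTP stage rather than in a standalone helper.

```python
import time

BLOCK_SECONDS = 24 * 60 * 60  # the 24h window mentioned in the comment


class TarpitBlocklist:
    """Hypothetical helper: blocks any IP that sends mail to the tarpit address."""

    def __init__(self, tarpit_address, clock=time.time):
        self.tarpit_address = tarpit_address.lower()
        self.clock = clock          # injectable so the expiry logic can be tested
        self.blocked_until = {}     # sender IP -> timestamp when the block ends

    def is_blocked(self, ip):
        expiry = self.blocked_until.get(ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # The block has run out; forget the IP again.
            del self.blocked_until[ip]
            return False
        return True

    def accept(self, sender_ip, recipient):
        """Return True if the message should be accepted."""
        if self.is_blocked(sender_ip):
            return False
        if recipient.lower() == self.tarpit_address:
            # Only a harvester could have found this address: start the block.
            self.blocked_until[sender_ip] = self.clock() + BLOCK_SECONDS
            return False
        return True
```

A caller would ask `accept(sender_ip, recipient)` for each incoming message; one hit on the tarpit address makes every later message from that IP fail until the window expires.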
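
The scheme binaryturtle describes (each address stored as a brainf*ck program, with the interpreter adding 32 to every value it outputs) might look something like the sketch below. It is written in Python purely for illustration; the comment's actual interpreters are in C and JS and are not shown in the thread. The function names `email_to_bf` and `run_bf` are made up here, and the generator is deliberately naive (it only uses `+`, `-` and `.`).

```python
def email_to_bf(address: str) -> str:
    """Naive generator: walk one cell up or down to each character, minus 32."""
    program = []
    current = 0
    for ch in address:
        if not 32 <= ord(ch) <= 126:
            raise ValueError("only printable ASCII is supported in this sketch")
        target = ord(ch) - 32           # the interpreter adds the 32 back
        delta = target - current
        program.append(("+" if delta > 0 else "-") * abs(delta) + ".")
        current = target
    return "".join(program)


def run_bf(program: str, offset: int = 32) -> str:
    """Minimal interpreter (no input command) that adds `offset` on output."""
    # Pre-compute matching bracket positions.
    jumps, stack = {}, []
    for i, op in enumerate(program):
        if op == "[":
            stack.append(i)
        elif op == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i

    tape, ptr, pc, out = [0] * 30000, 0, 0, []
    while pc < len(program):
        op = program[pc]
        if op == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif op == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif op == ">":
            ptr += 1
        elif op == "<":
            ptr -= 1
        elif op == ".":
            out.append(chr(tape[ptr] + offset))
        elif op == "[" and tape[ptr] == 0:
            pc = jumps[pc]              # skip the loop body
        elif op == "]" and tape[ptr] != 0:
            pc = jumps[pc]              # jump back to the loop start
        pc += 1
    return "".join(out)
```

Under these assumptions, `run_bf(email_to_bf("me@example.com"))` gives back the original string. As the comment itself notes, none of this helps once a scraper actually executes the page's scripts.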
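
On the HTML entity discussion (newscracker, siruwastaken): the reason the article calls the technique something that "should be worthless" is easy to see in code. The sketch below encodes an address as numeric character references and then decodes it again with Python's standard `html.unescape`. The helper names `to_entities` and `mailto_link` and the address `me@example.com` are illustrative assumptions, not anything taken from the article.

```python
import html


def to_entities(address: str) -> str:
    """Encode every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def mailto_link(address: str) -> str:
    """Build a mailto link whose href and text are both entity-encoded."""
    encoded = to_entities(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'


encoded = to_entities("a@b")
print(encoded)                 # &#97;&#64;&#98;
print(html.unescape(encoded))  # a@b
```

A single standard-library call undoes the encoding, so any harvester that parses HTML properly gets the address for free. That it reportedly still stops most harvesters suggests many of them just pattern-match on the raw source.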
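
For readers unfamiliar with the XOR-plus-hex idea dandersch points out, here is a minimal sketch of how a single-byte XOR encoding of that general kind can work. The key `0x5A` and the address are hypothetical and chosen only for the example; this is not the author's key, and the span quoted in the comment is deliberately not decoded here.

```python
def xor_encode(address: str, key: int) -> str:
    """XOR each ASCII byte with a one-byte key and return the result as hex."""
    if not 0 <= key <= 255:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in address.encode("ascii")).hex()


def xor_decode(hex_text: str, key: int) -> str:
    """Reverse xor_encode: XOR is its own inverse when the same key is used."""
    if not 0 <= key <= 255:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in bytes.fromhex(hex_text)).decode("ascii")


HYPOTHETICAL_KEY = 0x5A  # assumption for illustration only
hidden = xor_encode("me@example.com", HYPOTHETICAL_KEY)
assert xor_decode(hidden, HYPOTHETICAL_KEY) == "me@example.com"
```

On a real page the decoding would run in client-side script, with the hex string kept in hidden markup. A single-byte key has only 256 possible values, so this is obfuscation against generic harvesters rather than any kind of secrecy.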