Writing "/etc/hosts" breaks the Substack editor

scalewithlee

Comments (355)

matt_heimer
The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content. It's not just Cloudflare, Akamai has the same problem. If your site discusses databases then turning on the default SQL injection attack prevention rules will break your site. And there is another ruleset for file inclusion where things like /etc/hosts and /etc/passwd get blocked. I disagree with other posts here, it is partially a balance between security and usability. You never know what service was implemented with possible security exploits and being able to throw every WAF rule on top of your service does keep it more secure. It's just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts. Fine tuning the rules is time consuming. You often have to just completely turn off the ruleset because when you try to keep the ruleset on and allow the use-case there are a ton of changes you need to get implemented (if it's even possible). Page won't load because /etc/hosts was in a query param? Okay, now that you've fixed that, all the XHR included resources won't load because /etc/hosts is included in the referrer. Now that that's fixed things still won't work because some random JS analytics lib put the URL visited in a cookie, etc, etc... There is a temptation to just turn the rules off.
netsharc
Reminds me of an anecdote about an e-commerce platform: someone coded a leaky webshop, so their workaround was to watch if the string "OutOfMemoryException" shows up in the logs, and then restart the app. Another developer in the team decided they wanted to log what customers searched for, so if someone typed in "OutOfMemoryException" in the search bar...
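A minimal sketch of the failure mode in that anecdote, for illustration only. The function names and log format are hypothetical; the anecdote does not say how the watchdog or the search logging was actually built.

```python
def should_restart(log_line: str) -> bool:
    """Naive watchdog: restart whenever the marker string appears anywhere."""
    return "OutOfMemoryException" in log_line


def log_search(query: str) -> str:
    """Hypothetical search logging that writes user input straight into the log."""
    return f"INFO search query={query!r}"


# A genuine crash line and a harmless search both contain the marker,
# so the watchdog cannot tell them apart.
genuine = "ERROR System.OutOfMemoryException: heap exhausted"
from_user = log_search("OutOfMemoryException")

print(should_restart(genuine))    # True
print(should_restart(from_user))  # True: a customer can now restart the app
```

The underlying problem is the same as with the WAF: a substring match has no idea whether the text is a signal from the system or content typed by a user.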
Y_Y
Does it block `/etc//hosts` or `/etc/./hosts`? This is a ridiculous kind of whack-a-mole that's doomed to failure. The people who wrote these should realize that hackers are smarter and more determined than they are and you should only rely on proven security, like not executing untrusted input.
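To make the whack-a-mole point concrete, here is a small sketch. The blocklist is a stand-in for a substring rule and is purely an assumption; whether the real managed rule catches these variants is not known from this thread.

```python
import posixpath

# Hypothetical substring blocklist, standing in for a WAF rule.
BLOCKLIST = ("/etc/hosts", "/etc/passwd")


def naive_filter_blocks(text: str) -> bool:
    """Return True if any blocklisted substring appears verbatim."""
    return any(pattern in text for pattern in BLOCKLIST)


for variant in ("/etc/hosts", "/etc//hosts", "/etc/./hosts"):
    # All three spellings normalize to the same path on a POSIX system,
    # but only the first one matches the literal substring.
    print(variant, naive_filter_blocks(variant), posixpath.normpath(variant))
```

Only the first spelling is blocked, while all three normalize to `/etc/hosts`. A filter can add normalization, but the next encoding trick is always one step ahead, which is the commenter's point.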
simonw
"How could Substack improve this situation for technical writers?"How about this: don't run a dumb as rocks Web Application Firewall on an endpoint where people are editing articles that could be about any topic, including discussing the kind of strings that might trigger a dumb as rocks WAF.This is like when forums about web development implement XSS filters that prevent their members from talking about XSS!Learn to escape content properly instead.
blenderob
> This case highlights an interesting tension in web security: the balance between protection and usability.But it doesn't. This case highlights a bug, a stupid bug. This case highlights that people who should know better, don't!The tension between security and usability is real but this is not it. Tension between security and usability is usually a tradeoff. When you implement good security that inconveniences the user. From simple things like 2FA to locking out the user after 3 failed attempts. Rate limiting to prevent DoS. It's a tradeoff. You increase security to degrade user experience. Or you decrease security to increase user experience.This is neither. This is both bad security and bad user experience. What's the tension?
SonOfLilit
After having been bitten once (was teaching a competitive programming team, half the class got a blank page when submitting solutions, after an hour of debugging I narrowed it down to a few C++ types and keywords that cause 403 if they appear in the code, all of which happen to have meaning in JavaScript), and again (working for a bank, we had an API that you're supposed to submit a Python file to, and most Python files would result in 403 but short ones wouldn't... a few hours of debugging and I narrowed it down to a keyword that sometimes appears in the code) and then again a few months later (same thing, new cloud environment, few hours burned on debugging[1]), I had the solution to his problem in mind _immediately_ when I saw the words "network error". [1] the second time it happened, a colleague added `if we got 403, print "HAHAHA YOU'VE BEEN WAFFED"` to our deployment script, and for that I am forever thankful because I saw that error more times than I expected
pimanrules
We faced a similar issue in our application. Our internal Red Team was publishing data with XSS and other injection attack attempts. The attacks themselves didn't work, but the presence of these entries caused our internal admin page to stop loading because our corporate firewall was blocking the network requests with those payloads in them. So an unsuccessful XSS attack became an effective DoS attack instead.
mrgoldenbrown
Everything old is new again :) We used to call this the Scunthorpe problem. https://en.m.wikipedia.org/wiki/Scunthorpe_problem
petercooper
I ran into a similar issue with OpenRouter last night. OpenRouter is a “switchboard” style service that provides a single endpoint from which you can use many different LLMs. It’s great, but last night I started to try using it to see what models are good at processing raw HTML in various ways. It turns out OpenRouter’s API is protected by Cloudflare and something about specific raw chunks of HTML and JavaScript in the POST request body causes it to block many, though not all, requests. Going direct to OpenAI or Anthropic with the same prompts is fine. I wouldn’t mind but these are billable requests to commercial models and not OpenRouter’s free models (which I expect to be heavily protected from abuse).
jmmv
I encountered this a while ago and it was incredibly frustrating. The "Network error" prevented me from updating a post I had written for months because I couldn't figure out why my edits (which extended the length and which I assumed was the problem) couldn't get through. Trying to contact support was difficult too due to AI chatbots, but when I finally did reach a human, their "tech support" obviously didn't bother to look at this in any reasonable timeframe. It wasn't until some random person on Twitter suggested the possibility of some magic string tripping over some stupid security logic that I found the problem and could finally edit my post.
robertlagrant
> This case highlights an interesting tension in web security: the balance between protection and usability.This isn't a tension. This rule should not be applied at the WAF level. It doesn't know that this field is safe from $whatever injection attacks. But the substack backend does. Remove the rule from the WAF (and add it to the backend, where it belongs) and you are just as secure and much more usable. No tension.
josephcsible
WAFs were created by people who read https://thedailywtf.com/articles/Injection_Rejection and didn't realize that TDWTF isn't a collection of best practices.
johnklos
Content filtering should be highly context dependent. If the WAF is detached from what it's supposed to filter, this happens. If the WAF doesn't have the ability to discern between command and content contexts, then the filtering shouldn't be done via WAF. This is like spam filtering. I'm an anti-spam advocate, so the idea that most people can't discuss spam because even the discussion will set off filters is quite old to me. People who apologize for email content filtering usually say that spam would be out of control if they didn't have that in place, in spite of no personal experience on their end testing different kinds of filtering. My email servers filter based on the sending server's configuration: does the EHLO / HELO string resolve in DNS? Does it resolve back to the connecting IP? Does the reverse DNS name resolve to the same IP? Does the delivery have proper SPF / DKIM? Et cetera. My delivery-based filtering works worlds better than content-based filtering, plus I don't have to constantly update it. Each kind has advantages, but I'd rather occasional spam with no false positives than the chance I'm blocking email because someone used the wrong words. With web sites and WAF, I think the same applies, and I can understand when people have a small site and don't know or don't have the resources to fix things at the actual content level, but the people running a site like Substack really should know better.
arp242
Few years ago I had an application that allowed me to set any password, but then gave mysterious errors when I tried to use that password to log in. Took me a bit to figure out what was going on, but their WAF blocked my "hacking attempt" of using a ' in the password. The same application also stored my full password in localStorage and a cookie (without httponly or secure). Because reasons. Sigh. I'm going to do a hot take and say that WAFs are bollocks mainly used by garbage software. I'm not saying a good developer can't make a mistake and write a path traversal, but if you're really worried about that then there are better ways to prevent that than this approach which obviously is going to negatively impact users in weird and mysterious ways. It's like the naïve /(fuck|shit|...)/g-type "bad word filter". It shows a fundamental lack of care and/or competency. Aside: is anyone still storing passwords in /etc/passwd? Storing the password in a different root-only file (/etc/shadow, /etc/master.passwd, etc.) has been a thing on every major system since the 90s AFAIK?
Osiris
I understand applying path filters in URLs and search strings, but I find it odd that they would apply the same rules to request body content, especially content encoded as valid JSON, and especially for a BLOG platform where the content would be anything.
dvorack101
Indeed a severe case of paranoia? 1. Create a new post. 2. Include an image, set filter to All File types and select "/etc/hosts". 3. You get served a weird error message box displaying a weird error message. 4. After this the Substack posts editor is broken. Heck, every time I access the Dashboard, it waits forever to build the page. Did find this text while browsing the source for an error (see original ascii art: https://pastebin.com/iBDsuer7): SUBSTACK WANTS YOU TO BUILD A BETTER BUSINESS MODEL FOR WRITING https://substack.com/jobs
Null-Set
This looks like it was caused by this update https://developers.cloudflare.com/waf/change-log/2025-04-22/ rule 100741. It references this CVE https://github.com/tuo4n8/CVE-2023-22047 which allows the reading of system files. The example given shows them reading /etc/passwd
eniac111
https://en.wikipedia.org/wiki/Bush_hid_the_facts
wglb
The problem with WAF is discussed in https://users.ece.cmu.edu/~adrian/731-sp04/readings/Ptacek-N.... One of the authors of the paper has said "WAFs are just speed bump to a determined attacker."
Habgdnv
I have a lifetime Pastebin account that I hadn't used for some years. Last year I enrolled in a "linux administration" class and tried to use that pastebin (famous for sharing code) to share some code/configurations with other students. When I tried to paste my homework I kept getting a Cloudflare error page. I don't even remember what I was pasting, but it was normal linux stuff. I contacted pastebin support - of course I got ghosted. I am sharing this in relation to the WAF comments and how much the companies implementing WAF care about your case.
nickagliano
As a card carrying Substack hater, I’m not surprised. > "How could Substack improve this situation for technical writers?" They don’t care about (technical) writers. All they care about is building a TikTok clone to “drive discoverability” and make the attention-metrics go up. Chris Best is memeing about it on his own platform. Very gross.
donatj
We briefly had a WAF forced upon us and it caused so many problems like this we were able to turn it off, for now. I'm sure it'll be back.
jkrems
Could this be trivially solved client-side by the editor if it just encoded the slashes, assuming it's HTML or markdown that's stored? Replacing `/etc/hosts` with `&#x2F;etc&#x2F;hosts` for storage seems like an okay workaround. Potentially even doing so for anything that's added to the WAF rules automatically by syncing the rules to the editor code.
aidog
It's something I ran into quite a few times in my career. It's a weird call to get if the client can't save their cms site, due to typing something harmless. I think worst was when there was a dropdown that I defined which had a value in the mod rules that was not allowed.
vintermann
That reminds me of issues I once had with Microsoft's boneheaded WAF. We had base64 encoded data in a cookie, and whenever certain particular characters were produced next to each other in the data - I think the most common was "--" - the WAF would tilt and stop the "attempted SQL injection attack". So every so often someone would get an illegal login cookie and just get locked out of the system until they deleted it or it expired. Took a while to find out what went wrong, and even longer to figure out how to remove the more boneheaded rules from the WAF.
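One detail worth spelling out: the standard base64 alphabet contains no hyphen, so a "--" run can only appear if the cookie used the URL-safe variant (or some other encoding). That the cookie was URL-safe base64 is an assumption here; the comment does not say. A short sketch of how ordinary bytes can yield hyphen runs:

```python
import base64

# Three bytes chosen so that every 6-bit group has the value 62,
# which is '+' in standard base64 and '-' in the URL-safe alphabet.
payload = bytes([0xFB, 0xEF, 0xBE])

standard = base64.b64encode(payload).decode("ascii")
urlsafe = base64.urlsafe_b64encode(payload).decode("ascii")

print(standard)  # ++++
print(urlsafe)   # ----
```

With effectively random session data, short runs like "--" will turn up in some fraction of cookies, which matches the "every so often someone would get locked out" behaviour described above.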
rpigab
Between around 2005 and 2011 in France, if a child was born and parents Mr Bar and Mrs Baz wanted to transmit both of their last names, he or she had to be named "Foo Bar--Baz". No, that's not a typo, that's two hyphens. Check out "Circulaire du 6 décembre 2004 relative au nom de famille" if you don't believe me. Yes, the people in charge probably didn't think or know of SQL comments. However, it worked well as long as input is sanitized and not concatenated, which is often the case using modern frameworks or common sense. However, nowadays, we just put a WAF in front of everything, it's cheaper that way because common sense is hard to come by. People like Foo Bar--Baz still exist, and unless they've had their name changed, they're sometimes running into extremely weird issues in the web software they're using.
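A minimal sketch of the "not concatenated" approach, using Python's built-in `sqlite3` with an in-memory database. The table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")

# Names that a pattern-matching filter might flag as SQL injection.
names = ["Foo Bar--Baz", "Robert'); DROP TABLE people;--"]

for name in names:
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes and "--" inside it are never parsed as SQL.
    conn.execute("INSERT INTO people (name) VALUES (?)", (name,))

stored = [row[0] for row in conn.execute("SELECT name FROM people ORDER BY rowid")]
print(stored)
```

Both names are stored and read back unchanged, and the table still exists afterwards. With placeholders there is nothing left for a "--" filter to protect.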
halffullbrain
At least, in this case, the WAF in question had the decency to return 403. I've worked with a WAF installation (totally different product), where the "WAF fail" tell was HTTP status 200 (!) and "location: /" (and some garbage cookies), possibly to get browsers to redirect using said cookies. This was part of the CSRF protection. Other problems were with "command injection"-patterns (like in the article, except with specific Windows commands, too - they clash with everyday words which the users submit), and obviously SQL injections which cover some relevant words, too. The bottom line is that WAFs in their "hardened/insurance friendly" standard configs are set up to protect the company from amateurs exposing buggy, unsupported software or architectures. WAFs are useful for that, but you still have all the other issues with buggy, unsupported software. As others have written, WAFs can be useful to protect against emerging threats, like we saw with the log4j exploit which CloudFlare rolled out protection for quite fast. Unless you want compliance more than customers, you MUST at least have a process to add exceptions to "all the rules"-circus they put in front of the buggy apps. Whack-a-mole security filtering is bad, but whack-a-mole relaxation rule creation against an unknown filter is really tiring.
nicoledevillers
it was a cf managed waf rule for a vulnerability that doesn't apply to us. we've disabled it.
badgersnake
Seems like a case of somebody installing something they couldn’t be bothered to understand to tick a box marked security. The outcome is the usual one, stuff breaks and there is no additional security.
thayne
As soon as I saw the headline, I knew this was due to a WAF. I worked on a project where we had to use a WAF for compliance reasons. It was a game of whack-a-mole to fix all the places where standard rules broke the application or blocked legitimate requests. One notable, and related example is any request with the string "../" was blocked, because it might be a path traversal attack. Of course, it is more common that someone just put a relative path in their document.
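For contrast with blocking "../" everywhere, here is a sketch of a check made at the point where a file is actually opened. The function name and the `/srv/docs` base directory are hypothetical, and the sketch assumes a POSIX filesystem.

```python
import os


def is_inside(base_dir: str, user_path: str) -> bool:
    """Return True if user_path, resolved relative to base_dir, stays within it."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves any symlinks that exist.
    target = os.path.realpath(os.path.join(base, user_path))
    return os.path.commonpath([base, target]) == base


print(is_inside("/srv/docs", "notes/../readme.txt"))  # True: stays in /srv/docs
print(is_inside("/srv/docs", "../../etc/passwd"))     # False: escapes the base
```

A relative path inside a document body never reaches a check like this at all, because the text is stored as data and never used as a filename.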
0xDEAFBEAD
Weird idea: What if user content was stored and transmitted encrypted by default? Then an attacker would have to either (a) identify a plaintext which encrypts to an attack ciphertext (annoying, and also you could keep your WAF rules operational for the ciphertext, with minimal inconvenience to users) or (b) attack the system when plaintext is being handled (could still dramatically reduce attack surface).
teddyh
> For now, I'll continue using workarounds like "/etc/h*sts" (with quotes) or alternative spellings when discussing system paths in my Substack posts.Ahh, the modern trend of ”unalived”¹ etc. comes to every corner of society eventually.1. <https://knowyourmeme.com/memes/unalive>
mifydev
It's /con/con all over again
godelski
I don't get it. Why aren't those files just protected so they have no read or write permissions? Isn't this like the standard way to do things? Put the blog in a private user space with minimal permissions. Why would random text be parsed? I read the article but this doesn't make sense to me. They suggested directory traversal but your text shouldn't have anything to do with that and traversal is solved by permission settings
sudb
I had a problem recently trying to send LLM-generated text between two web servers under my control, from AWS to Render - I was getting 403s for command injection from Render's Cloudflare protection which is opaque and unconfigurable to users. The hacky workaround which has been stably working for a while now was to encode the offending request body and decode it on the destination server.
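The comment does not say which encoding was used. As one possible shape of that workaround, here is a sketch that wraps the body in URL-safe base64 inside a small JSON envelope; the field name `payload_b64` and both function names are invented for the example.

```python
import base64
import json


def wrap(body: dict) -> bytes:
    """Sender side: serialize the body, then hide it inside URL-safe base64."""
    raw = json.dumps(body).encode("utf-8")
    encoded = base64.urlsafe_b64encode(raw).decode("ascii")
    return json.dumps({"payload_b64": encoded}).encode("utf-8")


def unwrap(wire: bytes) -> dict:
    """Receiver side: reverse the envelope and recover the original body."""
    envelope = json.loads(wire)
    return json.loads(base64.urlsafe_b64decode(envelope["payload_b64"]))


body = {"text": "Run `cat /etc/hosts; rm -rf /tmp/x` to reproduce."}
wire = wrap(body)

print(b"/etc/hosts" in wire)  # False: the URL-safe alphabet has no "/"
print(unwrap(wire) == body)   # True
```

This also shows why the filtering adds little: anyone who controls both ends can route arbitrary content past a pattern-matching filter with a few lines of code.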
mattrighetti
This reminds me of that time I was discussing with friends about something we did in our computer science class that day and I realised writing toString in the Whatsapp client for macOS would crash the application. At the time I didn’t have the skills to understand why so I recorded the bug on my phone to share with friends :)
mike-cardwell
Just rot13 any request data using javascript before posting, and rot13 it again on the server side. Problem solved. (jk)
driverdan
This is a common problem with WAFs and, more specifically, Cloudflare's default rulesets. If your platform has content that is remotely technical you'll end up triggering some rules. You end up needing a test suite to confirm your real content doesn't trigger the rules and if it does you need to disable them.
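A rough sketch of what such a test suite could look like. The two regexes are stand-ins invented for illustration: real managed rulesets are not published as simple local patterns, so an actual suite would more likely replay sample requests against a staging deployment behind the WAF.

```python
import re

# Hypothetical local approximations of two WAF rules.
RULES = {
    "file-inclusion": re.compile(r"/etc/(hosts|passwd)"),
    "sqli": re.compile(r"(?i)\bunion\s+select\b"),
}


def triggered_rules(content: str) -> list[str]:
    """Return the names of all rules that would fire on this content, sorted."""
    return sorted(name for name, rx in RULES.items() if rx.search(content))


samples = {
    "gardening post": "Our newsletter covers gardening.",
    "sysadmin post": "Edit /etc/hosts to add an entry.",
}

for title, content in samples.items():
    # A non-empty list means legitimate content would be rejected.
    print(title, triggered_rules(content))
```

Run over a corpus of real posts, a check like this flags which rules need an exception or need disabling before users hit them.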
swyx
substack also does wonderful things like preserve weird bullet points, lack code block displays, and make it impossible to customize the landing page of your site beyond the 2 formats they give you. generally think that Substack has done a good thing for its core audience of longform newsletter writer creators who want to be Ben Thompson. however its experience for technical people, for podcasters, for people who want to start multi-channel media brands, and for people who write for reach over revenue (but with optional revenue) has been really poor. (all 4 of these are us with Latent.Space). I've aired all these complaints with them and they've done nothing, which is their prerogative. i'd love for "new Substack" to emerge. or "Substack for developers".
paxys
This isn't a "security vs usability" trade-off as the author implies. This has nothing to do with security at all.
/etc/hosts
See, HN didn't complain. Does this mean I have hacked into the site? No, Substack (or Cloudflare, wherever the problem is) is run by people who have no idea how text input works.
chrisjj
> Substack's filter is well-intentioned - protecting their platform from potential attacks.There is sadly no evidence in this article that the supposed filter does protect the platform from potential attacks.
gitroom
The amount of headaches I've had from WAFs blocking legit stuff is unreal. I just wish the folks turning those rules on had to use them for a week themselves.
lofaszvanitt
Using a WAF is the strongest indicator that someone doesn't know what's happening and where or something underneath is smelly and leaking profusely.
toogan
The title would be improved with "Writing the string ...". I first read it as "Writing the file" which was pretty weird.
bastawhiz
I once helped maintain some PHP software that was effectively a CMS. You'd drop a little PHP snippet into any page (e.g., that you make with Dreamweaver) and it would automatically integrate it with the CMS functionality. We had unending trouble with mod_security. The worst issue I can remember was that any POST request whose body contained the word "delete" was automatically rejected. That was the full rule. To this day I still can't imagine what the developers were thinking.
ChrisArchitect
Just tried to post a tweet with this article title and link and got a similar error (on desktop twitter.com). Lovely.
skybrian
Did anyone try reporting this to Substack?
righthand
Similar: Writing `find` as the first word in your search will prevent Firefox from accepting the “return” key press. Pretty annoying.
iefbr14
So "/etc/h*sts" is not stopped by the filters? Nice to know for the hackers :)
nottorp
So everyone should start looking for vulnerabilities in the substack site? If that's their idea of security...
HenryBemis
Aaaahh they are trying to prevent a Little Bobby Tables story..
stefs
this feels like blocking terms like "null" or "select" just because you failed to properly parameterize your SQL queries.
t1234s
writing "bcc: someone@email.com" sometimes triggers WAF rules
julik
Ok so: there is a blogging/content publishing engine, which is somewhat of a darling of the startup scene. There is a cloud hosting company with a variety of products, which is an even dearer darling of the startup scene. Something is posted on the blogging/content publishing engine that clearly reveals that
* The product provided for blogging/content publishing did a shitty job of configuring WAF rules for its use cases (the utility of a "magic WAF that will just solve all your problems" being out of the picture for now)
* The WAF product provided by the cloud platform clearly has shitty, overreaching rules doing arbitrary filtering on arbitrary strings. That filtering absolutely can (and will) break unrelated content if the application behind the WAF is developed with a modicum of security-mindedness. You don't `fopen()` a string input (no, I will not be surprised - yes, sometimes you do `fopen()` a string input - when you are using software that is badly written).
So I am wondering:
1. Was this sent to Substack as a bug - they charge money for their platform, and the inability to store $arbitrary_string on a page you pay for, as a user, is actually a malfunction and dysfunction? It might not be the case "it got once enshittified by a CIO who mandated a WAF of some description to tick a box", it might be the case "we grabbed a WAF from our cloud vendor and haven't reviewed the rules because we had no time". I don't think it would be very difficult for me, as an owner/manager at the blogging platform, to realise that enabling a rule filtering "anything that resembles a Unix system file path or a SQL query" is absolutely stupid for a blogging platform - and go and turn it the hell off at the first user complaint.
2. Similarly - does the cloud vendor know that their WAF refuses requests with such strings in them, and do they have a checkbox for "Kill requests which have any character an Average Joe does not type more frequently than once a week"? There should be a setting for that, and - thinking about the cloud vendor in question - I can't imagine the skill level there would be so low as to not have a config option to turn it off.
So - yes, that's a case of "we enabled a WAF for some compliance/external reasons/big customer who wants a 'my vendor uses a WAF' on their checklist", but also the case of "we enabled a WAF but it's either buggy or we haven't bothered to configure it properly".
To me it feels like this would be 2 emails first ("look, your thing <X> that I pay you money for clearly and blatantly does <shitty thing>, either let me turn it off or turn it off yourself or review it please") - and a blog post about it second.
0xbadcafebee
Worth noting that people here are assuming that the author's assumption is correct, that his writing /etc/hosts is causing the 403, and that this is either a consequence of security filtering, or that it is this combination of characters at all that's causing the failure. The only evidence he has is that he gets back a 403 Forbidden to an API request when he writes certain content. There's a thousand different things that could be triggering that 403. It's not likely to be a WAF or content scanner, because the HTTP request is using PUT (which browser forms don't use) and it's uploading the content as a JSON content-type in a JSON document. The WAF would have to specifically look for PUTs, open up the JSON document, parse it, find the sub-string in a valid string, and reject it. OR it would have to filter raw characters regardless of the HTTP operation. Neither of those seem likely. WAFs are designed to filter on specific kinds of requests, content, and methods. A valid string in a valid JSON document uploaded by JavaScript using a JSON content-type is not an attack vector. And this problem is definitely not path traversal protection, because that is only triggered when the string is in the URL, not some random part of the content body.