Comments (156)

  • jpollock
    The design of the system is very interesting, particularly how it expects to handle errors.
    In '90s telco, you used to have a pair of systems, and if they disagreed, they would decide which side was bad and disable it.
    In modern cloud, you accept there are errors. There's another request in ~10+ms. You only look when the error rate becomes commercially important.
    My understanding of spacecraft is that there would be 3 independent implementations and they would vote.
    The plane has a matrix of sensors and systems, allowing faults to be bubbled up and bad elements disabled independently.
    The ADIRU does compare values to detect failures (median of 3 sensors), but they could only detect errors that last >1s. The flight computer used the raw data, because the sensors aren't interchangeable (they won't have consistent readings in all flight modes)! Very nifty.
    One thing: they say "memorisation period", but I don't think it's a memorisation period? From my reading of the algorithm, it should be more like "last value retention period"? Or "sensor spurious fault reading delay"?
    See Section 2.1, A330/A340 flight control system design, "AOA computation logic": https://www.atsb.gov.au/sites/default/files/media/3532398/ao...
  • addaon
    I’d really, really like to know what microcontroller family this was found on. Assuming that this is a safety processor (lockstep, ECC, etc.), it suggests that ECC was insufficient for the level of bit flips they’re seeing, and if the concern is data corruption, not unintended restart, it means it’s enough flips in one word to be undetectable. The environment they’re operating in isn’t that different from everyone else's, so unless they ate some margin elsewhere (bad voltage corner or something), this can definitely be relevant to others. It would also be interesting to know if it’s NVM or SRAM that’s affected.
  • rossjudson
    My armchair guess is that they had a new control pathway not properly participating in their integrity hand-off protocols, doing some kind of transformation outside of that protection. I once saw some HW engineers go nuts trying to find out why a storage device had an error rate several orders of magnitude higher than the extremely low error rate they expected (and was triggering data corruption errors). It turned out to be one extremely deep VHDL-based control area for an FPGA that didn't properly do integrity checking. You'd have to flip a bit at an incredibly precise point in time for the error to occur, but that's what was happening. When all the math was said and done, that FPGA control path integrity miss exactly accounted for the higher error rate.
  • rene_d
    The Aviation Herald has more technical details: https://avherald.com/h?article=52f1ffc3&opt=0
  • nickdothutton
    I’d just like to point out that if you are in the computing industry long enough, you will get to see a few such incidents under different circumstances, not only in industries like aerospace. Mostly things like ECC save your a*; sometimes your software will be able to recognise a temporary spurious reading and disregard it because you had enough alternative checking logic, or in the case of realtime and safety-critical systems, maybe your systems can even take a vote between them. Got caught out by (CPU cache line) bit flips in the 90s, months of pain trying to track it down. Some of you will know :-)
  • pyb
    The aerospace industry has had countermeasures in place against bit flips for a long time, oftentimes thanks to redundancy. Airbus/Thales's fix in this case appears to add more error checking, and to restart the misbehaving component. https://bea.aero/fileadmin/user_upload/BEA2024-0404-BEA2025-... ("internal supervision of the component that caused the failure; - a mechanism to automatically restart that component as soon as the failure is detected")
  • supernova87a
    I wonder how the incident was diagnosed? Does the FDR record low-level errors that might've contributed to this? I thought that it only recorded certain input parameters and high-level flight metrics, but I'm no expert. If a radiation event caused some bit flip, how would you realize that's what triggered an error? Or maybe the FDR does record when certain things go wrong? I'm thinking like, voting errors of the main flight computers? Anyway, would be very interested to know!
  • qaq
    Has BOFH-esque vibes: "It's friday, so I get into work early, before lunch even. The phone rings. Shit! I turn the page on the excuse sheet. "SOLAR FLARES" stares out at me. I'd better read up on that..."
  • 65a
    There's a great postmortem about what might have been a similar SEU (single event upset, i.e. a bit flip) here: https://www.atsb.gov.au/sites/default/files/media/3532398/ao...
  • skx001
    This video shows the A320 computer and how the computer cooling system works: https://www.youtube.com/watch?v=HQuc_HhW6VA
  • minitoar
    We flew too close to the sun
  • joelthelion
    Do they really need to ground the entire fleet for that? One incident for ten thousand planes in the air for years. I'd think that giving airlines two months to fix it would be sufficient.
  • 1970-01-01
    They said the same thing at Toyota when the unintended accel problem was in the news, but never found a real world example. There are a lot more old Toyotas still on the road than Airbuses in the air, so distance to the sun makes all the difference here? I wonder if they only see issues when flying near the north pole?
  • nubinetwork
    Why would a CME disrupt a single brand and model of aircraft, when the entire planet is covered in computers that almost never have bitflip issues when a CME rolls through every few months?
  • rishabhaiover
    I hope Airbus only uses Honeywell or Collins in their newer planes.
  • owenthejumper
    A friend works at JetBlue. They are scrambling hard to do the updates.
  • jfoster
    I've noticed that some carriers seem to be suggesting that there might be no impact to flights, but isn't this an immediate grounding for each aircraft until the update is made? How is it possible that this wouldn't impact flight schedules?
  • jakub_g
    From newspaper reporting on this, they are rolling back a software update. I wonder what the original reason for the update was? How often is flight computer software updated, and why?
  • oofbey
    This is in response to JetBlue flight 1230 from Cancun to Newark on October 30, 2025, where a cosmic ray of some kind flipped a bit and caused a dangerous situation. At the time there was a minor (G1) geomagnetic storm, meaning more cosmic rays than normal. The Planetary K-index was at 5. These are somewhat elevated numbers: enough to produce a visible aurora in Canada, but probably not even in the northernmost US. But this level of space weather is also very common. We hit G1 or higher about once a week. That's the really damning part. If it had happened in a G4 or G5 storm, then the engineers might have responded "we can't fix everything", but this level of reliability is clearly unacceptable.
  • op00to
    Solar radiation like solar wind, or sunlight? They don’t say.
  • raverbashing
    Apparently the fix is reverting to a previous version of the SW (see https://avherald.com/h?article=52f1ffc3&opt=0 ). Curious what a SW change might have done in terms of resiliency. Maybe an incorrect memory setting, or some code path that is not calculating things redundantly?
  • kappi
    Following the Airbus A320 emergency airworthiness action, everyone will be talking about the ELAC (Elevator Aileron Computer) manufactured by Thales, which caused a sudden pitch-down without pilot input on JetBlue 1230 back in October. So here’s everything you need to know about ELAC. The ELAC System in the Airbus A320: The Brains Behind Pitch and Roll Control https://x.com/Turbinetraveler/status/1994498724513345637
  • jMyles
    This is one of the rare cases where, IMO, it makes sense to use a modified title as you've done here.
  • viiralvx
    I was traveling during this entire ordeal. My flight got delayed by 7 hours. Insane day, just now boarding my flight. American Airlines was in shambles today.
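
Several comments above (jpollock's "median of 3 sensors", and the mentions of systems taking a vote) describe redundancy voting. A minimal sketch of the idea, not the actual ADIRU or ELAC logic: the function name `vote`, the `tolerance` parameter, and the sample readings are all hypothetical, chosen only for illustration.

```python
from statistics import median


def vote(readings, tolerance):
    """Median-of-three vote over redundant readings (illustrative only).

    Returns the voted value plus the indices of any readings that
    deviate from it by more than `tolerance` (a hypothetical threshold).
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    voted = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - voted) > tolerance]
    return voted, suspects


# One wildly wrong reading (e.g. a flipped high-order bit) is outvoted
# by the other two and flagged as suspect.
value, suspects = vote([2.1, 2.3, 50.6], tolerance=1.0)
print(value, suspects)  # 2.3 [2]
```

Note that a vote like this only helps when the redundant sources are comparable. As jpollock points out, it does not cover a value that is consumed raw, or one that is corrupted after the comparison has been made.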