Single Event Upsets (SEUs) in spacecraft computer systems are a well-known source of trouble. High-energy particles from solar flares or cosmic rays enter a computer chip and change data or command bits from 0 to 1 or vice versa. SEUs can, and do, cause a loss of data or commanding errors. In low-Earth orbits, most of these events happen in the so-called South Atlantic Anomaly (SAA) zone located over Brazil. This is where Van Allen radiation belt particles come closest to Earth’s surface.
This picture shows SEU events in the UOSAT satellite computer as it orbits Earth.
The 23rd Cycle – Chapter 6 – “Even more troubling than satellite electronics is that energetic neutrons, produced when solar flare particles strike atoms in the Earth’s atmosphere, can travel all the way to the ground. There they affect aircraft avionics, causing temporary glitches in both civilian and military aircraft. About one in ten avionics errors are ‘unconfirmed’, which means that no obvious hardware or software problem could have caused them.
One important source of information on these particles is cardiac pacemakers. Millions of these are installed in people, many of whom take trips on jet planes. They record any irregularities in the rate at which they trigger their pulses, and this information can be examined when they return to ground. These errors, among airline staff, do correlate with solar activity levels. There is also another ‘down to earth’ problem with these solar storm particles. Whenever computers crash for no apparent reason, some new studies suggest that these energetic particles are to blame. With more components crammed onto smaller chips, the sizes of these components have shrunk to the point that designers are now paying close attention to energetic particles from solar flares. These particles invade the manufacturing plants for these sophisticated computer chips and cause problems. The Advanced Micro Devices K-6 processor, for example, was designed using SEU modeling programs. Because this background even at ground level cannot be eliminated by shielding, and because it is ubiquitous, it may prove the final, ultimate limit to just how small, and how fast, designers can make the next generations of computers…
Commercial computer systems operate with 500 megahertz processors and 10 gigabyte memories. In the 1990s, the Space Shuttle was upgraded to an IBM 80386 system, the difference being that the Shuttles’ ‘386’ can withstand major bursts of radiation and still operate reliably. Intel Corporation and the Department of Defense announced in 1998 that Sandia National Laboratories will receive a license to use the $1 billion Pentium processor design to develop a custom-made radiation-hardened version for US space and defense purposes. The process of developing ‘rad-hard’ versions of current, high-performance microchips is complicated because the tricks used to increase chip speed often make the chip vulnerable to ionizing radiation. Larger-than-commercial etched wiring, and thinner-than-commercial oxide layer deposition, are the keys to making chips hardier, it seems…
The first satellite in the NASA Tracking and Data Relay Satellite System (TDRSS-1) was launched in April 1983, and from that time onwards, the satellite has been continuously affected by SEUs. The satellite anomalies affected the spacecraft’s Attitude Control System, and like mosquitoes on a warm day, they remain a constant problem today. The SEUs have been traced to changes in the computer’s RAM, and the most serious of these SEUs were considered mission-threatening. If left uncorrected, they could lead to the satellite tumbling out of control. Ground controllers have to constantly keep watch on the satellite’s systems to make certain it keeps its antennas pointed in the right direction. This has become such an onerous task that one of the ground controllers, the late Don Vinson, once quipped, “If this [the repeated SEUs] keeps up, TDRS will have to be equipped with a joystick.”
The problems with TDRSS-1 quickly forced NASA to redesign the next satellites in the series, TDRSS-3 and 4 (TDRSS-2 was lost in the Challenger accident), and the solution was fortunately very simple. In engineering-speak, “The Fairchild static, bi-polar 93L422 RAMs were swapped for a radiation-hardened RCA CMM5114 device based on a different semiconductor technology”. Radiation-hardening is a complex process of redesigning microcircuits so that they are more resistant to the high-energy particles that pass through them. The result is that neither of the two new TDRSS satellites has recorded SEUs while, during the same operating period, hundreds still cause TDRS-1 to rock and roll, keeping the satellite’s human handlers steadily employed for the foreseeable future.”
Single Event Upsets in Avionics – Boeing Radiation Effects Lab 1998 report by Dr. Eugene Normand. “Four independent sources confirm that atmospheric neutrons from cosmic rays are the cause of Single Event Upsets in flight computers on board commercial and military aircraft. The evidence shows that the number of these neutrons increases with aircraft altitude, as does the number of SEU events. These kinds of events can also be seen at ground level in tests of computer memory. SEUs occurring in cache memory will cause a ‘reconfiguration’ every 2-3 hours, while SEUs in unprotected devices will cause a reboot every 100 to 200 hours. In commercial avionics, about 20% of all ‘Could Not Duplicate’ events are caused by SEUs.”
Flight-Critical Computer System Recovery from Space Radiation-Induced Error – An article by Chung Yu-Liu in the IEEE AESS Systems magazine, September 22, 2002, pp. 19-25: “It is well known that space radiation, containing energetic particles such as protons and ions, can cause anomalies in digital avionics onboard satellites, spacecraft, and aerial vehicles flying at high altitude. Semiconductor devices embedded in these applications become more sensitive to space radiation as the features shrink in size. One of the adverse effects of space radiation on avionics is a transient error known as single event upset (SEU). Given that it is caused by bit-flips in computer memory, SEU does not result in a damaged device. However, the SEU-induced data error propagates through the run-time operational flight program, causing erroneous outputs from a flight-critical computer system.”
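To make the bit-flip mechanism concrete, here is a minimal Python sketch that inverts one bit in a stored word and then masks the error with a bitwise majority vote over three copies (triple modular redundancy). TMR is a widely used mitigation, but it is not the recovery scheme described in the quoted article. The function names `flip_bit` and `majority_vote` and the sample value are hypothetical, chosen only for illustration.

```python
def flip_bit(word: int, bit: int) -> int:
    """Return `word` with bit position `bit` inverted, modelling one upset."""
    return word ^ (1 << bit)


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit takes the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


stored = 0b10110010                  # hypothetical 8-bit value (illustration only)
copies = [stored, stored, stored]    # three redundant copies of the same word

# Model a single upset striking bit 6 of the second copy.
copies[1] = flip_bit(copies[1], 6)
corrupted = copies[1]

# A vote across the three copies masks the single-copy error.
recovered = majority_vote(*copies)

print(f"original : {stored:08b}")
print(f"corrupted: {corrupted:08b}")
print(f"recovered: {recovered:08b}")
```

The vote only masks an upset that hits one copy; if the same bit flips in two copies before they are rewritten, the wrong value wins.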
System Effects of SEUs – Alan M. Finn at the United Technologies Research Center reports in a paper that “The flip-flops and logic gates of a processor are also susceptible to SEUs. The evidence to date indicates that fault rates for microprocessors are commensurate with the rates of high-density RAM. For example, using the CREME program, the upset rate for several single-chip microprocessors was shown to be between 0.00012/hour and 0.00084/hour (784 km 98° orbit, 1 g/cm² Al shielding, solar minimum weather). For the worst case solar flare, the rates increased to between 3 and 18 upsets per hour. As another example, the on-chip RAM of the INMOS Transputer was found to contribute almost 95% of the observed SEUs. Even if a Transputer had a protected off-chip RAM, the expected fault rate is 1.5/day in the processor during a worst-case solar flare.”
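The rates quoted above are easier to compare as average intervals between upsets. The Python sketch below does that conversion using only the figures in the quotation. Treating each rate as a constant long-run average is an assumption, and the helper name `mean_hours_between_upsets` is hypothetical.

```python
HOURS_PER_DAY = 24.0


def mean_hours_between_upsets(rate_per_hour: float) -> float:
    """Average interval between upsets, assuming a constant average rate."""
    return 1.0 / rate_per_hour


# Rates from the quoted passage, in upsets per hour.
quiet = (0.00012, 0.00084)   # solar minimum, lower and upper bound
flare = (3.0, 18.0)          # worst-case solar flare, lower and upper bound

for rate in quiet:
    days = mean_hours_between_upsets(rate) / HOURS_PER_DAY
    print(f"quiet {rate}/hour -> one upset every {days:.1f} days")

for rate in flare:
    minutes = mean_hours_between_upsets(rate) * 60.0
    print(f"flare {rate}/hour -> one upset every {minutes:.1f} minutes")

# Factor by which the worst-case flare raises each bound.
increase = [f / q for f, q in zip(flare, quiet)]
print(f"flare/quiet ratio: {increase[0]:.0f}x to {increase[1]:.0f}x")

# Transputer processor figure from the passage: 1.5 upsets per day.
transputer_per_hour = 1.5 / HOURS_PER_DAY
print(f"Transputer: one upset every "
      f"{mean_hours_between_upsets(transputer_per_hour):.0f} hours")
```

On these figures, a processor that goes months between upsets in quiet conditions would see one every few minutes during a worst-case flare.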
Susceptibility of Electronic Systems To Atmospheric Neutrons (SEXTANT Working Group 2002 report) – “In the European Union, 400 million citizens use more than 200 million cars, trucks and planes (an average of 100,000 people traveling by plane at any time of day), more and more patients cured by proton therapy, patients with today’s heart implants (such as pacemakers), defibrillators, internal medication distribution and future bioartificial prosthesis. SEU are encountered in many different domains: avionics, railways, automotive, medical therapy and instrumentation, safety application and information technologies (e-society). Concerning avionics, SEU is experienced by sensitive electronics in aircraft systems, because of the increasing radiation flux with altitude. A significant effort has gone into monitoring the environment and analysing operational systems for SEU. Upset rates of about 1 per 200 hours were measured in the Boeing 777 autopilot: this is much higher than the tolerances required by the aircraft manufacturers’ directives (1 upset every 1 million hours). These error rates might have important consequences if the flight lasts several hours. Personal and vital risks require a very high level of confidence in the equipment, not only with respect to electromagnetic compatibility (EMC) but also against natural radiation fluxes whatever the circumstances: aircraft travels, living in altitude, even in time of solar eruption, and also sporadic therapy sequence. All these risks must be asserted, updated along the technology and society, and the emerging ones quantified.”
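To put the two quoted rates side by side (about 1 upset per 200 hours measured, against a tolerance of 1 per 1 million hours), the Python sketch below estimates the chance of at least one upset during a single flight. It assumes upsets arrive as a Poisson process at a constant rate, which the report does not state. The 10-hour flight length and the function name `prob_at_least_one_upset` are hypothetical.

```python
import math


def prob_at_least_one_upset(rate_per_hour: float, hours: float) -> float:
    """P(at least one upset) for a Poisson process with a constant rate."""
    return 1.0 - math.exp(-rate_per_hour * hours)


MEASURED_RATE = 1.0 / 200.0         # upsets per hour, from the quoted report
REQUIRED_RATE = 1.0 / 1_000_000.0   # tolerance cited in the quoted report
FLIGHT_HOURS = 10.0                 # hypothetical long-haul flight (assumption)

p_measured = prob_at_least_one_upset(MEASURED_RATE, FLIGHT_HOURS)
p_required = prob_at_least_one_upset(REQUIRED_RATE, FLIGHT_HOURS)
shortfall = MEASURED_RATE / REQUIRED_RATE

print(f"measured rate : {p_measured:.2%} chance of an upset per flight")
print(f"required rate : {p_required:.4%} chance of an upset per flight")
print(f"measured rate exceeds the tolerance by a factor of {shortfall:.0f}")
```

Under these assumptions the measured rate gives roughly a 5% chance of an upset on each such flight, and the rate itself is about 5,000 times the stated tolerance.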