Satellite Anomalies

Chapter 6: They Call Them ‘Satellite Anomalies’

“Space weather is working its way into the national consciousness as we see an increasing number of problems with parts of our technological infrastructure such as satellite failures and widespread electrical power brownouts and blackouts [NSWP, 1999]”

January 20, 1994, was a moderately active day for the Sun. There were no obvious solar flares in progress, and no evidence for any larger-than-normal amounts of X-rays, but a series of coronal holes had just swept across the Sun between January 13 and 19. According to the NOAA Space Environment Center, the only sign of unrest near the Earth was the high-speed solar wind from these coronal holes, which had produced active-to-minor storm conditions in their wake. NASA’s SAMPEX satellite was beginning to tell another, more ominous, story. There were now signs of energetic electrons near geosynchronous orbit, whose concentrations were rising to a maximum. These particles came from the passage of a disturbance from the magnetotail region into the inner magnetic field regions around the Earth. Within minutes, the GOES-4 and GOES-5 weather satellites began to detect accumulating electrostatic charges on their outer surfaces. Unlike the discharge you feel after shuffling across a floor, there is no way that satellites can unload the excess charges they accumulate, and so they continue to build until the surfaces reach voltages of hundreds, or even thousands, of volts.

The Anik E1 and E2 satellites, owned by Telesat Canada, were a twin pair of GE Astro Space model 5000 satellites, weighing about 7000 pounds, and launched into space in 1991. From their orbital slots on the equator 900 miles southwest of Mexico City, and 1,500 miles apart in space, they soon became the most powerful satellites in commercial use in all of North America. Virtually all of Canada’s television broadcast traffic passed through the E2 transponders at one time or another. The E2 satellite provided the business community with a variety of voice, data, and image services. Despite some technical difficulties with the deployment of the Anik E2 antenna, which dogged engineers for several months, the satellites soon became a reliable cornerstone for North American commerce and entertainment.

Canadians eagerly awaited this satellite service because major cities were few and far between across Canada, a territory bigger than the United States. With hundreds of small towns, and only a few dozen major cities with television stations, the satellites quickly became the information lifeline for many parts of Canada. Some 2,300 cable systems throughout Canada, and nearly 100,000 home satellite dish owners, depended on these satellites to receive their programming. Far-flung newspapers relied on these satellites to beam their pages to distant printing presses to serve local communities. Most people thought the satellites would continue working until at least 2003, but on January 20, 1994 this optimism came to an end.

As the GOES satellites began to accumulate electric charges from the influx of energetic particles, the Intelsat-K satellite began to wobble on January 20, 1994, and experienced a short outage of service. About two hours later, the Anik satellites took their turn in dealing with these changing space conditions, and did not do as well. The satellites experienced almost identical failures having to do with their momentum wheel control systems. The first to go was Anik E1, which at 12:40 PM began to roll end-over-end uncontrollably. The Canadian Press was unable to deliver news to over 100 newspapers and 450 radio stations for the rest of the day, but was able to use the Internet as an emergency back-up. Telephone users in 40 northern Canadian communities were left without service. It took over seven hours for Telesat Canada’s engineers to correct Anik E1’s pointing problems using a back-up momentum wheel system.

About 70 minutes after Anik E1 was restored, at 9:10 PM, the Anik E2 satellite’s momentum wheel system failed. This time the backup system failed as well, so the satellite continued to spin slowly, rendering it useless. Now 3.6 million Canadians were affected as their major TV satellite went out of service. Popular programs such as MuchMusic, TSN and the Weather Channel were knocked off the air for three hours while engineers rerouted the services to Anik E1. For many months, Telesat Canada wrestled with the enormous problem of trying to re-establish control of Anik E2. They were not about to scrap a $300 million satellite without putting up a fight. After five months of hard work, they were at last able to regain control of Anik E2 on June 21, 1994. The bad news was that, instead of relying on the satellite’s now useless pointing system, they would have to send commands up to the satellite to fire its thrusters every minute or so to keep it properly pointed. This ground intervention would have to continue until the thruster fuel ran out, shortening the satellite’s lifespan by several years. The good news was that Telesat Canada became the first satellite company to actively stabilize a satellite without using any on-board attitude system. In the end, it would turn out to be something of a Pyrrhic victory, because on March 26, 1996 at 3:45 PM, a crucial diode on the Anik E1 solar panel shorted out, causing a permanent loss of half the satellite’s power. Investigators later concluded that this, too, was caused by an unlucky solar event.

The connection between the geomagnetic disturbance and the Anik satellite outages seemed to be entirely straightforward to the satellite owners at the time, and Telesat Canada publicly acknowledged the cause-and-effect relationship in press releases and news conferences following the outages. They also admitted that the Anik space weather disturbance, which had ultimately cost their company nearly $5 million to fix, was consistent with past spacecraft-affecting events they had noticed, and that very similar problems had also bedeviled the Anik-B satellite 15 years earlier. What also made this story interesting is that the Intelsat-K and the two Anik satellites are of the same satellite design. The crucial difference, however, is that the Intelsat Corporation specifically modifies its satellites to survive electrostatic disturbances, including those from solar storms and cosmic rays. This allowed the Intelsat-K satellite to recover quickly following the storms that disabled the unmodified Anik satellites. Clearly, it is possible, and desirable, to ‘harden’ satellite systems so that they are more resistant to solar storm damage. This lesson in spacecraft design is not a new one we have just learned, but a very old one that has been applied more or less conscientiously since the dawn of the Space Age itself, when these problems were first uncovered.

Although the USSR managed to surprise the United States by orbiting Sputnik 1, our entry into the Space Age came in 1958 with the launch of the Explorer 1 satellite. The main objective of the satellite was simply to stanch the perception that we had fallen behind the USSR in a critical technological area. So the satellite, no bigger than a large beach ball, was put on the engineering fast track and equipped with a simple experiment devised by James Van Allen at the University of Iowa. Even before the first satellite entered the space environment, scientists had long suspected that there would be some interesting things for instruments to measure when they got there. What they couldn’t imagine was that billions of dollars of satellite real estate would eventually fall victim to these same cosmic bullets.

More than ten years earlier, physicists working with photographic films on mountaintops had detected a rainstorm of ‘cosmic rays’ streaming into the atmosphere, but their origins were unknown. Van Allen wanted to measure how intense this rain was before it was muffled by the Earth’s blanket of atmosphere, and perhaps even sniff out a clue about where the particles were coming from in the first place. His experiment was nothing more than a Geiger counter tucked inside the satellite, but no sooner was the satellite in space than the instrument began to register the clicks of incoming energetic particles. Space was indeed ‘radioactive’. Since then, the impact that these particles have had on delicate satellite electronics has been well documented by civilian and military scientists.

Satellites receive their operating power from large-area solar panels which have surfaces covered by solar cells. When the Sun ejects clouds of high-energy protons, these particles can literally scour the surfaces of these solar cells. Direct collisions between the high-speed protons and the atoms of silicon in the cells cause the silicon atoms to violently shift position. These shifting atoms produce crystal defects that increase the resistance of the solar cells to the currents of electricity they are producing. Solar cell efficiency steadily decreases, and so does the power produced by the solar panels. Engineers have learned to compensate for this erosion of power by making solar panels over-sized. This lets the satellite start out with extra capacity to cover for this steady degradation of electrical output. But this degradation doesn’t happen smoothly over time. Like a sudden summertime hailstorm, the Sun produces unpredictable bursts of particles, which do considerable damage in only a few hours. During October 19-26, 1989, a series of powerful solar flares caused many satellites to experience about five years of solar panel degradation in just seven days. Satellites that were designed to last 10 years were now expected to last only five before their panels could no longer provide full power. The GOES-7 weather satellite lost half of its mission lifetime in just this way, from a single solar flare in March 1989.
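The arithmetic behind over-sized panels and storm-shortened lifetimes can be sketched in a few lines of Python. This is only an illustration: the function names, the 2 percent yearly loss rate, and the 1,000 watt requirement are assumptions made up for the example, not figures from any satellite program. Only the ten-year design life and the five years of storm-equivalent aging come from the account above.

```python
def beginning_of_life_power(required_power, yearly_loss, design_years):
    """Power the panels must deliver at launch so that, after steady
    degradation, they still meet the requirement at end of life."""
    return required_power / (1.0 - yearly_loss) ** design_years


def power_after(begin_power, yearly_loss, years):
    """Panel output after a number of years of steady degradation."""
    return begin_power * (1.0 - yearly_loss) ** years


def remaining_life(design_years, elapsed_years, storm_equivalent_years):
    """Useful life left when a storm is counted as extra years of aging."""
    return max(0.0, design_years - elapsed_years - storm_equivalent_years)


# Hypothetical satellite: needs 1,000 W after 10 years, loses 2% per year.
launch_power = beginning_of_life_power(1000.0, 0.02, 10)

# A storm early in the mission that ages the panels by five years
# leaves only half of the ten-year design life.
years_left = remaining_life(10.0, 0.0, 5.0)
print(f"launch power: {launch_power:.0f} W, years left: {years_left:.0f}")
```

Treating storm damage as a block of 'equivalent years' is a simplification, but it matches the way the 1989 events are described: the panels did not fail, they simply aged by years in a matter of days.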

High-energy particles also do considerable internal damage to spacecraft. At the atomic scale, to an incoming proton, the walls of a satellite look more like a porous spaghetti colander than some solid wall of matter. When high-energy protons do manage to collide with atoms in the walls of the satellite, they produce sprays of secondary, energetic electrons that penetrate even deeper into the interior of the satellite, producing what engineers call ‘Internal Dielectric Charging’. As the charging continues, eventually the electrical insulation of some portion of the satellite breaks down and a discharge is produced. In a word, you end up with a miniature lightning bolt that causes a current to flow in some part of an electrical circuit where it’s not supposed to. As anyone who has inserted new boards into their PC can tell you, just one static discharge can destroy the circuitry on a board. Beyond actual physical damage, these particles can also change information stored in a computer’s memory.

Microscopic current flows can flip a computer memory position from ‘1’ to ‘0’, or cause some component, or an entire spacecraft system, to switch on when it is not supposed to. When this happens, it is called a ‘Single Event Upset’ or SEU, and like water they come in two flavors: hard and soft. A hard SEU actually does irreparable physical damage to a junction or part of a microcircuit. A soft SEU merely changes a binary value stored in a device’s memory, and this can be corrected by simply ‘re-booting’ the device. Engineers on the ground cannot watch the circuitry of a satellite as it undergoes a discharge or SEU event, but they can monitor the functions of the satellite. When these change suddenly, and without any logical or human cause, they are called ‘Satellite Anomalies’. They happen a lot more often than you will ever read about in the news media.
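A soft SEU is easy to mimic in software. The Python sketch below flips one bit in a stored value, notices the change with a simple parity check, and then reloads a known-good copy, which stands in for the ‘re-boot’. The parity check and the stored value are assumptions chosen for illustration; the account above does not say how any particular satellite detects its upsets.

```python
def flip_bit(word, bit):
    """Simulate a soft SEU: invert a single bit of a stored word."""
    return word ^ (1 << bit)


def parity(word):
    """Return 0 if the word holds an even number of 1 bits, else 1."""
    return bin(word).count("1") % 2


# A hypothetical 8-bit value held in memory, with its parity recorded.
good_copy = 0b10110010
recorded_parity = parity(good_copy)

# An energetic particle flips bit 4 from '1' to '0'.
in_memory = flip_bit(good_copy, 4)

# The parity no longer matches, so the upset is detected.
upset_detected = parity(in_memory) != recorded_parity

# 'Re-booting' here simply means reloading the known-good value.
if upset_detected:
    in_memory = good_copy
```

A single parity bit catches any one flipped bit but cannot say which bit changed, and it misses two flips in the same word. A hard SEU is a different matter: no amount of reloading repairs a damaged junction.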

Gordon Wrenn is the Section Leader of the Space and Communications Department of DRA Farnborough in England. Some years ago, he looked into a rash of unexpected changes in the pointing direction of an unnamed, commercial, geosynchronous satellite. The owners of the satellite let him look at their data on condition that he not divulge its name or who owned it. This particular satellite experienced many SEUs in its attitude sensor system. When the SEUs were compared to the radiation sensor data from the GOES-7 and METEOSAT-3 satellites, it was pretty clear that the anomalies followed along with increases in the number of energetic electrons detected by GOES-7. These insights, however, cannot be uncovered without cooperation from the satellite owners. The specific way that energetic particles cause internal dielectric charging can only be ferreted out when satellite owners provide investigators with satellite data, as Wrenn explains,

“Prompt and open reporting offers the opportunity to learn from others’ mistakes. Sometimes the lesson can be fairly inexpensive; Telesat Canada were not so fortunate [with the loss of the Anik satellites].”

More readily available data on this problem can be had from government research and communication satellites because the information is, at least in principle, open to public scrutiny if you happen to know who to talk to, or can extract the information from thousands of technical reports.

The first satellite in NASA’s Tracking and Data Relay Satellite System, TDRSS-1, was launched in April 1983, and from that time onwards, the satellite has been continuously affected by soft SEUs. The satellite anomalies affected the spacecraft’s Attitude Control System, and like mosquitoes on a warm day, they remain a constant problem today. The SEUs have been traced to changes in the computer’s RAM, and the most serious of these SEUs were considered mission-threatening. If left uncorrected, they could lead to the satellite tumbling out of control. Ground controllers have to constantly keep watch on the satellite’s systems to make certain it keeps its antennas pointed in the right direction. This has become such an onerous task that one of the ground controllers, the late Don Vinson, once quipped, “If this [the repeated SEUs] keeps up, TDRS will have to be equipped with a joystick.”

The problems with TDRSS-1 quickly forced NASA to redesign the next satellites in the series, TDRSS-3 and 4 (TDRSS-2 was lost in the Challenger accident), and the solution was fortunately very simple. In engineering-speak, “The Fairchild static, bi-polar 93L422 RAMs were swapped for a radiation-hardened RCA CMM5114 device based on a different semiconductor technology”. Radiation-hardening is a complex process of redesigning microcircuits so that they are more resistant to the high-energy particles that pass through them. The result is that neither of the two new TDRSS satellites has recorded SEUs, while during the same operating period hundreds still cause TDRSS-1 to rock and roll, keeping the satellite’s human handlers steadily employed for the foreseeable future.

Finding additional examples of satellites that have suffered serious damage is complicated by the fact that commercial satellite companies do not want it widely known what the cause of a satellite problem was. The military, on the other hand, considers this kind of satellite vulnerability information a sensitive issue. Although the military satellite impacts are inaccessible, it is possible to ferret out from news reports, and from a variety of published trade journals, many examples of satellite problems caused by, or likely to have been caused by, solar storm events.

Over 80,000 objects are tracked by the powerful radars used by the US Space Command, but during the March 1989 storm, over 1,300 of the objects moved from the ‘identified’ to the ‘unidentified’ category as increased atmospheric drag temporarily altered their orbits. Later on that same year, another powerful flare on August 15-16 caused half of the GOES-6 telemetry circuits to fail immediately. Meanwhile, back on Earth, the Toronto Stock Exchange closed unexpectedly when all three of its ‘fault tolerant’ disk drives crashed at the same time.

It seems that a common way for satellite systems to fail is for their Attitude Control Systems to be damaged or compromised in some way. Why this happens has a lot to do with how a satellite recognizes its orientation in space. These systems contain a set of sensors to determine the direction that a satellite is pointing in space, a set of thrusters or gyros to move the satellite in three directions, and a system for ‘dumping’ angular momentum, usually through a mechanical component called a momentum wheel. The basic operating principle for many of these attitude systems is to use some type of sensor or ‘star tracker’ to take frequent images of the sky and compare the locations of the detected stars with an internal catalog. A computer then works out the position differences and causes the satellite to reorient itself to point in the right direction. Energetic particles can impact sensitive electronic camera elements, specifically the so-called ‘CCD chip’, and produce false stars. On September 29, 1989, a powerful X-ray flare caused power panel and star tracker upsets on NASA’s Magellan spacecraft en route to Venus. The storm was also detected near Earth by the GOES-7 satellite. The flare was the most powerful one recorded since February 1956. Even the Hubble Space Telescope, whose mission is to actually observe stars, sees more of these than it is supposed to, because its attitude system is also under steady attack every day.
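The catalog comparison at the heart of a star tracker can be sketched in Python as follows. This is a toy version under stated assumptions: positions are flat (x, y) coordinates in degrees within one small field of view, the matching tolerance is arbitrary, and the function names and star positions are invented for the example. Real star trackers work with pattern matching on the celestial sphere and are considerably more involved.

```python
import math


def match_stars(detections, catalog, tolerance=0.5):
    """Pair each detected spot with its nearest catalog star.

    Spots with no catalog star within `tolerance` are returned
    separately as candidate false stars."""
    matched, unmatched = [], []
    for spot in detections:
        nearest = min(
            catalog,
            key=lambda star: math.hypot(spot[0] - star[0], spot[1] - star[1]),
        )
        distance = math.hypot(spot[0] - nearest[0], spot[1] - nearest[1])
        if distance <= tolerance:
            matched.append((spot, nearest))
        else:
            unmatched.append(spot)
    return matched, unmatched


def mean_offset(matched):
    """Average (dx, dy) between detected spots and their catalog stars,
    a crude stand-in for the pointing correction. Returns None if
    nothing matched."""
    if not matched:
        return None
    dx = sum(spot[0] - star[0] for spot, star in matched) / len(matched)
    dy = sum(spot[1] - star[1] for spot, star in matched) / len(matched)
    return dx, dy


# Hypothetical catalog and one image: three real stars seen slightly
# off their catalog positions, plus one particle hit on the CCD.
catalog = [(1.0, 1.0), (4.0, 2.0), (2.5, 5.0)]
image = [(1.1, 0.8), (4.1, 1.8), (2.6, 4.8), (7.0, 7.0)]

matched, false_stars = match_stars(image, catalog)
offset = mean_offset(matched)
```

In this tidy example the particle hit lands far from any catalog star and is easy to reject. The trouble starts when a storm produces many hits at once, or when a hit falls close enough to a catalog position to be mistaken for the real thing.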

Earlier generations of communications satellites that didn’t require star trackers for high-precision pointing used an even simpler position system. Because their very large transmission beams covered entire continents, these satellites could rely on sensors which detected the local magnetic field of the Earth. On-board pointing systems compared the detected field orientation against an internal table of what it ought to be if the satellite were pointing correctly. Although using the local magnetic field only gives pointing measurements that are good to a degree or so, this is often good enough for some types of satellites. During the March 13-14, 1989 solar storm which triggered the Quebec Blackout, geostationary satellites which used the Earth’s magnetic field to determine their orientation had to be manually controlled to keep them from literally flipping upside down as the orientation of the magnetic field became disturbed and changed direction. Records show that some low altitude, high-inclination, and polar-orbiting satellites experienced uncontrolled tumbling.
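The comparison between the measured field and the stored table can be reduced to the angle between two vectors. The Python sketch below shows the idea; the function name and the field vectors are hypothetical, and a real system would also have to handle sensor noise and the satellite’s own magnetic fields.

```python
import math


def pointing_error_deg(measured, expected):
    """Angle in degrees between the measured magnetic field direction
    and the direction stored in the on-board table."""
    dot = sum(m * e for m, e in zip(measured, expected))
    norms = math.sqrt(sum(m * m for m in measured)) * math.sqrt(
        sum(e * e for e in expected)
    )
    # Clamp to guard against rounding just outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cosine))


# Hypothetical table entry: field expected along the satellite's +z axis.
expected_field = (0.0, 0.0, 1.0)

# Quiet conditions: the sensor agrees with the table.
quiet_error = pointing_error_deg((0.0, 0.0, 1.0), expected_field)

# Storm conditions: the local field reverses direction, so the same
# properly pointed satellite now appears to be upside down.
storm_error = pointing_error_deg((0.0, 0.0, -1.0), expected_field)
```

The second case is the one that matters. The satellite has not moved, but an autonomous controller that trusts the sensor would try to ‘correct’ a 180 degree error, which is why operators had to take manual control during the March 1989 storm.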

When a satellite changes its pointing direction, it can do so either by using thrusters or by pushing against an internal mass of some kind. Thrusters are quite messy and only used for gross maneuvers. A momentum wheel is a symmetric mass of material oriented so that the spin axis is exactly along the major axis of the satellite. Each time the satellite pointing direction is altered slightly, the laws of physics require that each push be matched by one in the opposite direction. It is this reaction that causes the momentum wheel to spin up as the satellite pushes against it to alter its pointing direction. Eventually the stored angular momentum has to be unloaded or ‘dumped’ so that the momentum wheel system doesn’t, literally, fly apart. During the October 19-26, 1989 solar storm, a 13-satellite geostationary constellation (unnamed by Allen) reported 187 ‘glitches’ with its attitude system.
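The push-and-counter-push bookkeeping is just conservation of angular momentum: whatever rotation rate the satellite body gains, the wheel must change its spin by the opposite amount, scaled by the ratio of their moments of inertia. The Python sketch below works through this with made-up numbers. The inertias, the wheel speed limit, and the size of each correction are all assumptions for illustration, as is the scenario of repeated corrections in the same direction.

```python
def wheel_speed_change(sat_inertia, sat_rate_change, wheel_inertia):
    """Change in wheel spin (rad/s) that balances a change in the
    satellite's rotation rate, so total angular momentum is conserved:
    I_sat * d_omega_sat + I_wheel * d_omega_wheel = 0."""
    return -(sat_inertia / wheel_inertia) * sat_rate_change


def needs_dump(wheel_speed, max_wheel_speed):
    """True once the wheel has reached its safe speed limit."""
    return abs(wheel_speed) >= max_wheel_speed


# Hypothetical values, not taken from any real spacecraft.
SAT_INERTIA = 1000.0     # kg m^2
WHEEL_INERTIA = 0.5      # kg m^2
MAX_WHEEL_SPEED = 600.0  # rad/s

# The satellite keeps nudging itself the same way, for example to
# counter a steady disturbance, and the wheel absorbs every nudge.
wheel_speed = 0.0
corrections = 0
while not needs_dump(wheel_speed, MAX_WHEEL_SPEED):
    wheel_speed += wheel_speed_change(SAT_INERTIA, -0.001, WHEEL_INERTIA)
    corrections += 1
```

Because the satellite is so much more massive than the wheel, each tiny correction to the satellite is a large change in wheel speed. Once the limit is reached, the stored momentum has to be dumped, typically with a thruster firing, before the wheel can absorb any more.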

The introduction of off-the-shelf components into the design of satellites has been one of the major revolutions pointed to by satellite manufacturers which is keeping space access costs plummeting. It is increasingly being touted as good news for consumers, because the cost-per-satellite becomes very low when items can be mass-produced rather than built one at a time. Based on its experience with the 72-satellite Iridium series, Motorola will begin the 14-month, mass production of the 288 satellites for the Teledesic network in the fastest satellite construction project ever attempted. According to Chris Galvin, CEO for Motorola, their perception is that, “Satellites are not rocket science so much any more as much as [simply] assembly”. This attitude has come to revolutionize the way that satellite manufacturers view both their products and the risks.

But there is a downside to this exuberance and economic savings. Most of this revolution in thinking has happened during the 1990s while solar activity has been low between the peaks of Solar Cycles 22 and 23. The fact that energetic particles can invade poorly shielded satellites and disrupt sensitive electronics in a variety of ways is not a recently discovered phenomenon that we have to experimentally re-confirm. It has been a fact of life for satellite engineers for over 40 years. Data from government research satellites and weather satellites convincingly show that the showers of solar wind particles, cosmic rays, and particles from solar flares and CMEs can all affect spacecraft electronics. Some of these effects are inconsequential and merely a nuisance; others can be fatal. They do not constitute a mystery that we have only encountered by actually placing expensive satellites in harm’s way. For this reason, our current situation with respect to solar storms and satellite technology is very different than when previous technologies were developed and deployed for commercial use.

Even more troubling than the effects on satellite electronics is that energetic neutrons, produced when solar flare particles strike atoms in the Earth’s atmosphere, can travel all the way to the ground. They affect aircraft avionics, causing temporary glitches in both civilian and military aircraft. About one in ten avionics errors are ‘unconfirmed’, which means that no obvious hardware or software problem could have caused them. One important source of information on these particles is cardiac pacemakers. Millions of these are installed in people, many of whom take trips on jet planes. The pacemakers record any irregularities in the rate at which they trigger their pulses, and this information can be examined when their wearers return to the ground. Among airline staff, these irregularities do correlate with solar activity levels. There is also another ‘down to earth’ problem with these solar storm particles. When computers crash for no apparent reason, some new studies suggest that these energetic particles may be to blame. With more components crammed onto smaller chips, the sizes of these components have shrunk to the point that designers are now paying close attention to energetic particles from solar flares. The Advanced Micro Devices K6 processor, for example, was designed using SEU modeling programs. Because this background cannot be eliminated by shielding, and because it is ubiquitous, it may prove the final, ultimate limit to just how small, and how fast, designers can make the next generations of computers.

Even though the conventional approach to reducing radiation effects is to increase the amount of shielding in a satellite, this will not work for all types of radiation encountered in space. For example, the APEX satellite investigators concluded,

“…conventional shielding is not an effective means to reduce SEUs in space systems that traverse the inner high energy proton belt”.

The reason for this is that the particles most effective in producing SEUs are the energetic protons with energies above 40 million electron volts. When these enter spacecraft shielding, they collide with atoms in the shielding to spawn showers of still more particles. In fact, the thicker the shielding, the more secondary particles are produced to penetrate still deeper into the satellite. Low energy particles, however, can be stopped by nothing more than a few millimeters of aluminum shielding.

For TDRSS-1, it was too late to do anything to make the satellite less susceptible to SEUs; however, subsequent satellites in the TDRSS series were equipped with radiation-hardened ‘chips’ which virtually eliminated further SEUs in these satellite systems. Commercial computer systems operate with 500 megahertz processors and 10 gigabyte memories. The Space Shuttle was only recently upgraded to an IBM 80386 system, the difference being that the Shuttle’s ‘386’ can withstand major bursts of radiation and still operate reliably. Intel Corporation and the Department of Defense announced in 1998 that Sandia National Laboratories would receive a license to use the $1 billion Pentium processor design to develop a custom-made radiation-hardened version for US space and defense purposes. The process of developing ‘rad-hard’ versions of current, high-performance microchips is complicated because the tricks used to increase chip speed often make the chip vulnerable to ionizing radiation. Larger-than-commercial etched wiring, and thinner-than-commercial oxide layer deposition, are the keys to making chips hardier, it seems. The reason these efforts are expended is pretty simple, though expensive. Peter Winokur, a physicist at Sandia, noted that,

“When a satellite fails in space, it’s hard to send a repair crew to see what broke. You need to put in parts as reliable as possible from the beginning to prevent future problems”.

Telegraph, telephone, and radio communications were invented, and brought into commercial use, before it was fully understood that geomagnetic and solar storms could produce disruptions and interference. With satellite technology, we have understood in considerable detail the kind of environment into which we are inserting these machines, so that the resulting radiation effects, and their implications for the reliability of satellite services, have been fully anticipated. There are no great mysteries here that beg exploration by using multi-million dollar satellites as high-tech ‘test particles’.