This October marks the 145th anniversary of the “Great Chicago Fire” of 1871. The inferno began on the evening of 8 October and was not finally extinguished until rain began to fall some 36 hours later, in the early morning of 10 October. By then, nearly 3.3 square miles of downtown Chicago had been consumed: an estimated 300 people perished, and over 17,000 buildings and 73 miles of streets were destroyed. One-third of Chicago residents ― some 100,000 ― were left homeless, and the city’s business district was in ruins.
Several factors contributed to the destructive power of the fire. First, there were abundant fuel sources to feed it. Because ample, inexpensive supplies of timber were available from nearby Michigan, Chicago was a city built almost literally out of wood, from its high-density buildings with tar-and-felt roofs to its extensive network of wooden sidewalks to its streets paved with Nicolson wood blocks. In addition, copious accumulations of combustibles lay in great heaps about the city, including hay for livestock as well as kerosene and wood for winter heating and cooking.
The second factor was the weather. Chicago was in the midst of a severe drought, having received less than half its normal rainfall from July through September, and no rain at all fell in the week prior to the fire. On the evening of the fire, winds increased to over 20 mph from the southwest and the humidity dropped below 30%, making those combustibles even easier to ignite.
The third, unfortunate, factor was human error. The fire was spotted early by the Chicago fire department’s watchman on duty, but he mistakenly sent the firefighters to the wrong location. By the time the mistake was realized, the fire was already burning out of control. It was so intense that it spawned “fire tornadoes,” which, coupled with the southwest wind, spread burning embers across the urban landscape, including to the city’s water works. When the water works burned down in the early morning of 9 October, the city was left without a supply of water. Firefighters even resorted to explosives, bringing down buildings in an attempt to keep the fire from spreading.
After the fire, the city’s mayor and Common Council rapidly approved changes to the fire and building codes, as well as improvements to the city’s infrastructure that the Board of Police had been recommending for years but that the politicians had strenuously fought as too costly and likely to drive away business. City buildings now had to be constructed of fireproof materials; building density was reduced; combustibles were better controlled; and the city’s fire department, hydrant system, and water supply were greatly improved.
The official inquiry into how the fire started and was fought was never able to determine whether its cause was accidental or deliberate. Although there was talk at the time of arson, the story that took hold blamed Mrs. Catherine O’Leary’s cow for knocking over a lantern and igniting a fire in her barn, and that unproven accusation has remained part of popular lore ever since. [Note: In October 1997, Chicago’s Committee on Police and Fire officially exonerated Mrs. O’Leary (and her cow).]
Digital Déjà Vu?
Chicago in the late 1860s and early 1870s was an overflowing tinderbox just waiting for the right spark, and there were plenty of warnings of the danger. Fires were becoming increasingly common, and the Board of Police repeatedly warned that a large fire might overwhelm the fire department, which was publicly acknowledged to be underequipped and understaffed. Yet, as noted above, the mayor and Common Council accepted the risk of fire as less costly and more publicly palatable than raising the taxes needed to reduce it. The danger of an uncontrolled fire was underscored when a fire broke out every day during October 1871. Indeed, just the night before the Great Fire, over 500 structures in a four-block area burned down before the fire department could extinguish the blaze after 17 hours of fighting it. Yet city leaders showed no real heightened sense of alarm.
I was reminded of the Great Chicago Fire by the many parallels I encountered while writing the “Lessons Learned from a Decade of IT Failures” series of articles for IEEE Spectrum magazine last year. One lesson that stood out from the analysis of IT systems “gone bad” is how the likelihood and consequences of an uncontrolled “digital inferno” seem to increase by the year as computer and communication systems and devices become ever more integrated. Another is how easily a small flaw ― whether accidentally introduced or deliberately exploited ― can quickly turn into a major digital conflagration. A third is how routinely risk warnings are ignored or deliberately downplayed until after an operational system has been “burned to the ground” in some spectacular fashion.
Take, for instance, the infamous meltdown of the Royal Bank of Scotland Group’s computer systems in June 2012. An update to the banking group’s account-processing software went awry, locking 17 million customers of the group’s three banks (NatWest, Northern Ireland’s Ulster Bank, and the Royal Bank of Scotland) out of their accounts for a week. Some 6.6 million RBS Group customers were affected for several weeks, and tens of thousands for more than six. The problem was traced to an update that was incompatible with the previous version of the software; human mistakes made while trying to recover from the error severely exacerbated its effects.
RBS senior executives later admitted that the banking group had underinvested in its IT infrastructure for decades and promised that over $1 billion would be invested to bring it up to an acceptable state. An official government inquiry into the outage, however, pointedly stated that RBS management had failed to identify and manage the risks of a major outage even though its banking systems had suffered several less significant outages before the June 2012 meltdown that should have served as warnings. The inquiry found that RBS’s IT organizational culture was to react to incidents rather than to try to prevent them. In addition, the inquiry said, the IT organization’s lack of knowledge of the potential problems inherent in the banks’ complex IT infrastructure, and, more importantly, of how to recover from an error, had been growing for several years.
The RBS operational IT system failure was merely one of many that appeared in our IT failure data over the past decade. There were several other banking outages caused by updates and upgrades gone bad in Australia, the US, and the UK, as well as similar failures in other industries, including automotive, aviation, healthcare, telecommunications, transportation, and, of course, government and defense. In virtually every case, post-failure inquiries pointed to the same causes found in the RBS situation: short-changed IT investment; vaguely known risks that were underestimated or ignored; and reactive or nonexistent risk and contingency management. An underlying theme often alluded to but rarely stated outright was the glaring mismatch among the complexity of the IT system, the processes used to manage that complexity, and the skills of those responsible for implementing those processes.
Time to Reassess
The Great Chicago Fire of 1871 served as a wake-up call not only to Chicago but to cities throughout the US that they needed to do much more to manage their fire risks. As we move into an era of truly hyperconnected, ubiquitous, cyberphysical systems that can have huge consequences if they fail, whether accidentally or through deliberate action, the need for robust, reliable, and secure approaches across the system lifecycle is obvious. There is reason for concern, however, that despite the obvious need for superior management of IT risk, it will be ignored, just as the serious and growing risk of fire in Chicago was ignored before the Great Fire.
For instance, there is a disturbing lack of knowledge, as well as of concern, among CEOs not only about the need for cybersecurity but also about the need for robust computing systems in general. The recent breach of 500 million accounts at Yahoo, and the reported disdain of senior executives there for cybersecurity measures that interfered with generating or maintaining business revenue, seem more the norm than the exception. The same is true for the development of IT systems, whether commercial or government, as in the botched rollout of the Canadian government’s new payroll system despite repeated warnings that it wasn’t ready to go live.
Digital systems are becoming more complex and, according to some, so overcomplicated that they are becoming unpredictable. Unless attitudes at the top of companies and government organizations change to match the risks that IT systems now pose and will pose in the future, it may be only a matter of time before the equivalent of Mrs. O’Leary’s digital cow creates the spark that sets off a regional, nationwide, multi-country, or even international digital firestorm.