Wednesday, August 31, 2011

Come On, Irene

Because the group is defined only by minimum standards, it places Herman in a group with players who are vastly superior to him, while eliminating those who are only a little bit behind him.

- Bill James, "Whatever Happened To The Hall of Fame?"

What’s easier to evaluate is how much coverage Hurricane Irene received in comparison with other hurricanes. By that standard, the coverage was quite proportionate to the amount of death and destruction that the storm caused.

- Nate Silver, fivethirtyeight.com

Hurricane Irene has come and gone, but the battle to determine if Irene was 'overhyped' is raging just fine, thank you. The battle has produced some unexpected soldiers on either side, perhaps none more surprising than fivethirtyeight.com's Nate Silver, who apparently felt the need to defend himself against the charge of 'overhyping' the storm (in an essay entitled "A New York Hurricane Could Be a Multibillion-Dollar Catastrophe") by writing an essay explaining "How Irene Lived Up to the Hype". Unfortunately, Silver's methodology doesn't live up to his rhetoric -- for a former baseball sabermetrician, Silver's analysis is stunningly shoddy and arguably supports the exact opposite of the argument he's making. Where does Silver go wrong?

1. The main metric for comparing 'hype' is largely irrelevant to the argument.

Silver correctly points out that, when comparing 'hype', one should compare apples to apples by comparing news coverage of Irene to that of other tropical storms. Unfortunately, he commits a rookie sabermetrician's mistake by finding a source of data for his comparison (NewsLibrary.com), and then crafting his model to match his data.

NewsLibrary.com is a searchable database of newspaper articles, with some magazine articles and television broadcast transcripts thrown in for good measure. It seems highly reputable. The problem, though, is that the main argument in favor of Irene being overhyped has nothing to do with how many times it was referenced in The Orlando Sentinel or Newsweek; it's the number of times someone glanced at a computer or smartphone and noticed that a story about Irene was the top news item on Yahoo!, or on Google Reader; and the number of times someone logged into Facebook to see a link in their News Feed to a story about Irene; and the number of times they checked Twitter to see #HurricaneIrene trending, often with links to the same news stories.

By ignoring blogs and other online sources of news, Silver missed a lot of chatter about Irene in his 'News Unit' metric (NewsLibrary.com doesn't reference Silver's own article, which appears only in the NY Times online blogs). By ignoring social media, Silver ignores the 'force multiplier' that Facebook and Twitter provided for the stories that did exist, some in traditional media sources, and others online. And, given the nature of Google rankings and people's interest in passing along links, the stories that were visible were largely stories that were apocalyptic or frightening. (How many Facebook statuses did you see that read, "This guy doesn't think Irene will be all that dangerous, but I'm praying for the folks on the East Coast anyway!"?)

A counterargument that Silver could make would be that incorporating social media into the analysis would unfairly penalize Irene for occurring during a time of relative social media maturity; Hurricanes Gustav and Ike, two very damaging storms from 2008, came along when social media was still in its infancy, so to speak, while Hurricanes Katrina and Rita, two of the most damaging storms in US history, occurred just over a year after Facebook was launched and before Twitter even existed.

The counter to the counterargument, of course, is that this is precisely the point -- social media were a huge part of what made Irene so ubiquitous, and particularly stories about how apocalyptic Irene's approach to New York could or would be. Silver's 'News Units' deliberately underestimate the degree to which Irene was being 'talked about' based on its appearance in news sources.

If that's not enough, Silver's measure also makes Irene look less hyped than other storms in another dimension -- since the stories in NewsLibrary.com's database come primarily from newspapers and magazines, and newspapers and magazines have been in decline throughout the entire period of Silver's analysis, there are fewer newspapers to carry stories about Irene than there were to carry stories about 1992's devastating Hurricane Andrew.

If one takes into account the smaller number of sources in Silver's source data from 2011 versus other periods, then compensates for the recent rise of social media, Irene's placement on Silver's list of 'most covered' hurricanes would clearly be higher than 10, possibly much, much higher. Keep that in mind; it'll become even more significant in the next point.
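Here's a rough sketch of what the first of those adjustments might look like. To be clear, this is my own illustration, not anything Silver did: the function name and the outlet counts are hypothetical, and I don't have actual figures for how many publications NewsLibrary.com indexed in any given year. The only real number is Irene's 2.25 News Units from Silver's list.

```python
def adjusted_news_units(raw_units, sources_then, sources_baseline):
    """Rescale a storm's raw coverage to a common number of outlets.

    Assumption (mine, not Silver's): coverage scales roughly in
    proportion to the number of publications in the database.
    """
    if sources_then <= 0 or sources_baseline <= 0:
        raise ValueError("source counts must be positive")
    return raw_units * (sources_baseline / sources_then)


# 2.25 is Irene's figure from Silver's list.
# The outlet counts below are HYPOTHETICAL placeholders, not real data.
irene_adjusted = adjusted_news_units(2.25, sources_then=900, sources_baseline=1000)
print(irene_adjusted)
```

Any shrinkage in the pool of sources pushes Irene's adjusted figure upward, and that's before accounting for social media at all.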

2. Silver commits a fallacy of comparison by putting Irene 'in a group' with more devastating storms, rather than comparing it to other storms of similar intensity.

This is where the James quote at the top of the essay comes in -- James is writing about a baseball player being considered for the Hall of Fame based on a comparison to players who are much better than he is, while other, similar players are left out of the comparison. Silver does something similar in his essay.

Though Silver provides a list of 20 hurricanes with 'most media coverage', he compares Irene only to the others in the top 10, of which Irene is #10. Silver freely admits that every other storm in the top 10 was more powerful than Irene (Category 3) at their strongest, and that only Hurricane Gilbert was weaker than Irene (Category 1) when it made its US landfall (though Gilbert had already done massive damage as a Category 5 hurricane in the Caribbean). However, this analysis leaves off the storms below Irene on the list of media coverage. The next five:

- #15 Hurricane Fran

Was at its peak strength of Category 3 when it made landfall in North Carolina in 1996. Resulted in 27 deaths (22 direct) and over $3 billion in damage.

- #14 Hurricane Katrina

No introduction needed here - easily the most deadly and damaging tropical storm to hit the US. It was a Category 5 hurricane while just off the coast of Florida, and was Category 3 when it finally made landfall in Louisiana.

- #13 Hurricane Wilma

Another Category 5 hurricane in 2005, Wilma came after Katrina and surpassed it on the scale of hurricane intensity, becoming the most intense tropical storm ever recorded in the Atlantic basin. Wilma remained extremely powerful while tracking through the Atlantic Ocean, holding Category 3 strength as it moved northward, but the only US land Wilma crossed was the Florida peninsula, limiting the damage it did in the US. (The Yucatan peninsula, however, was another matter.)

- #12 Hurricane Isabel

Another storm that reached Category 5 status, Isabel was the costliest and deadliest hurricane of 2003, making landfall in Virginia as a Category 2 hurricane. 16 deaths were directly attributable to the storm (another 35 were indirectly attributed), and it did about $3.6 billion in damage.

- #11 Hurricane Ike

A Category 4 hurricane that made landfall in Galveston, Texas as a Category 2 storm, Ike caused 103 direct deaths and nearly $30 billion of damage just in the United States.

For comparison, Irene is currently slated as having caused or contributed to 40 deaths and about $13 billion in damage. Keep in mind, however, that both of these totals are provisional (Silver's own essay pegs Irene at $14 billion, but also points out that the figure is provisional), and that initial estimates in a case like this are likely to be exaggerated, given the propensity of those in the media to want to justify their reportage. (Compare the 9/11 attacks, which were originally estimated to have caused over 6700 deaths, a toll that has since been revised to fewer than 3000.)

So, given the likelihood that the death and economic damage tolls from the hurricane will be revised downward, Irene (2.25 Silver News Units) belongs 'in a group' with Isabel (1.88 Silver News Units) and Fran (1.47 Silver News Units), not with Rita (3.13 Silver News Units) and Andrew (3.68 Silver News Units). Remembering that both Isabel and Fran pre-date the social media era makes the comparison even more stark -- clearly Irene received far more 'news' coverage than was warranted by its size and damaging capability.

There's one more complaint I have about Silver's essay, but it's a bit nit-picky: Silver produces a list of 'normalized' U.S. economic damage figures for hurricanes by Category, then claims that, because Irene's current estimate exceeds Silver's 'normalized' Category 3 value of $12.7 billion, Irene did damage "tantamount to a Category 3 hurricane". Silver, a practiced statistician, should be aware of the bias in his statement, given that Katrina was a Category 3 hurricane at landfall and its near $100 billion in damage provided a powerful 'long tail' that drives up the average for all Category 3 hurricanes in his study. That Irene did more damage than a typical Category 1 hurricane is due to where it made landfall -- most Category 1 hurricanes land in parts of the US that are accustomed to hurricanes and engineer appropriately. New Jersey, which saw just one Category 1 hurricane from 1900-1996, has much more property likely to be damaged, because construction there doesn't take such engineering specifics into account (and probably shouldn't, given the rarity of hurricane events there).
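To see how much a single long-tail storm can distort an average, here's a toy example. I'm not claiming this is how Silver computed his 'normalized' figure, and apart from Katrina's roughly $100 billion, every damage number below is made up purely for illustration.

```python
from statistics import mean, median

# Damage in billions of dollars for a HYPOTHETICAL set of Category 3 storms.
# Only the 100.0 (roughly Katrina's toll) comes from the discussion above;
# the other four values are invented for illustration.
category_3_damage = [2.0, 3.0, 4.0, 5.0, 100.0]

avg_with_outlier = mean(category_3_damage)
typical_storm = median(category_3_damage)
avg_without_outlier = mean(category_3_damage[:-1])

print("mean with the outlier:   ", avg_with_outlier)
print("median:                  ", typical_storm)
print("mean without the outlier:", avg_without_outlier)
```

In this made-up sample, one storm drags the mean far above what any of the other storms actually did, while the median barely notices it. A storm can beat that kind of average without being anything like a typical Category 3.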

All told, Irene was a dangerous storm that received much more attention than an event of its strength and rarity warranted.