A month ago, I changed web browsers from Firefox to Chrome (which recently became the most popular browser). Firefox crashed too often (about once per day). Chrome crashes much less often (once per week?) presumably because it confines trouble caused by a bad tab to that tab. “Separate processes for each tab is EXACTLY what makes Chrome superior” to Firefox, says a user. This localization was part of Chrome’s original design (2008).
After a few weeks, I saw that crash rate was the only difference between the two browsers that mattered. After a crash, it takes a few minutes to recover. With both browsers, the “waiting time” distribution — the distribution of the time between when I try to reach a page (e.g., click on a link) and when I see it — is very long-tailed (very high kurtosis). Almost all pages load quickly (< 2 seconds). A few load slowly (2-10 seconds). A tiny fraction (0.1%?) cause a crash (minutes). The Firefox and Chrome waiting-time distributions are essentially the same except that the Chrome distribution has a thinner tail. As Nassim Taleb says about situations that produce Black Swans, very rare events (in this case, the very long waiting times caused by crashes) matter more (in this case, contribute more to total annoyance) than all other events combined.
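To make the shape of this distribution concrete, here is a minimal sketch that treats each page load as one of three outcomes (fast, slow, crash). The 0.1% crash probability and the once-per-day versus once-per-week ratio come from the estimates above. Everything else (1 second for a fast page, 6 seconds for a slow one, 5% slow pages, 180 seconds to recover from a crash, and the names `waiting_time_profile`, `firefox_like`, `chrome_like`) is an assumption chosen for illustration, not a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    seconds: float      # assumed typical wait for this kind of page load
    probability: float  # assumed share of page loads


def waiting_time_profile(crash_probability, crash_seconds=180.0,
                         slow_probability=0.05, slow_seconds=6.0,
                         fast_seconds=1.0):
    """Summarize a three-outcome waiting-time distribution (all inputs assumed)."""
    fast_probability = 1.0 - slow_probability - crash_probability
    outcomes = {
        "fast": Outcome(fast_seconds, fast_probability),
        "slow": Outcome(slow_seconds, slow_probability),
        "crash": Outcome(crash_seconds, crash_probability),
    }
    mean = sum(o.seconds * o.probability for o in outcomes.values())
    variance = sum(o.probability * (o.seconds - mean) ** 2
                   for o in outcomes.values())
    fourth_moment = sum(o.probability * (o.seconds - mean) ** 4
                        for o in outcomes.values())
    crash = outcomes["crash"]
    return {
        "mean_seconds": mean,
        # Fraction of all waiting time that is spent recovering from crashes.
        "crash_share_of_total_wait": crash.seconds * crash.probability / mean,
        # 0 for a Gaussian; large values indicate a long tail.
        "excess_kurtosis": fourth_moment / variance ** 2 - 3.0,
    }


# Assumed: the same body of fast and slow pages, with crashes seven times
# rarer in the Chrome-like case (once per week instead of once per day).
firefox_like = waiting_time_profile(crash_probability=0.001)
chrome_like = waiting_time_profile(crash_probability=0.001 / 7)

for name, profile in (("Firefox-like", firefox_like), ("Chrome-like", chrome_like)):
    print(name, {key: round(value, 3) for key, value in profile.items()})
```

The crash share of total waiting time is very sensitive to the assumed recovery time and crash probability, so it is worth trying your own numbers. The sketch counts seconds only; annoyance need not grow linearly with seconds, which the sketch does not attempt to model. Kurtosis on its own does not rank the two browsers, because making a rare event rarer can raise it. The mean and the crash share are the quantities to compare.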
Curious about Chrome/Firefox differences, I read a recent review (“Chrome 24 versus Firefox 18 — head to head”). Both browsers were updated shortly before the review. The comparison began like this:
Which browser got the biggest upgrade? Who’s the fastest? The safest? The easiest to use? We took a look at Chrome 24 and Firefox 18 to try and find out.
Not quite. The review compared the press releases about the upgrades. It said nothing about crash rate.
Was the review superficial because the reviewer wasn’t paid enough? If so, Walt Mossberg, the best-paid tech reviewer in the world, might do a good review. The latest browser review by Mossberg I could find (2011) says this about “speed”:
I found the new Firefox to be snappy. . . . The new browser didn’t noticeably slow down for me, even when many tabs were opened. But, in my comparative speed tests, which involve opening groups of tabs simultaneously, or opening single, popular sites, like Facebook, Firefox was often beaten by Chrome and Safari, and even, in some cases, by the new version 9 of IE . . . These tests, which I conducted on a Hewlett-Packard desktop PC running Windows 7, generally showed very slight differences among the browsers.
No mention of crash rate, the main determinant of how long things take. Mossberg ignores the one difference between Chrome and Firefox that really matters. He’s not the only one. As far as I can tell, all tech reviewers have failed to measure browser crash rate. One example is a review of the latest Firefox, whose author says, “I’m still a big Firefox fan.”
Browser reviews are a small example of a big rule: People with jobs handle long-tailed distributions poorly. In the case of browser reviews, the people with jobs are the reviewers; the long-tailed distribution is the distribution of waiting times/annoyance. Reviewers handle this distribution badly in the sense that they ignore tail differences, which matter enormously.
Another browser-related example of the rule is the failure of the Mozilla Foundation (people with jobs) to solve Firefox’s crashing problem. My version of Firefox (18.0.1) crashed daily. Year after year, upgrade after upgrade, people at Mozilla failed to add localization. Their design is “crashy”. They fail to fix it. Users notice, change browsers. Firefox may become irrelevant for this one reason. This isn’t Clayton Christensen’s “innovator’s dilemma”, where industry-leading companies become complacent and lose their lead. People at Mozilla have had no reason to be complacent.
Examples of the rule are all around us. Some are easy to see:
1. Taleb’s (negative) Black Swans. Tail events in long-tailed distributions often have huge consequences (making them Black Swans) because their possibility has been ignored or their probability underestimated. The system is not designed to handle them. All of Taleb’s Black Swans involve man-made systems: the financial system, hedge funds, New Orleans’s levees, and so on. These systems were built by people with jobs and react poorly to rare events (e.g., Long Term Capital Management). Taleb’s anti-fragility is what others have called hormesis. Hormesis protects against bad rare events. It increases your tolerance, the dose (e.g., the amount of poison) needed to kill you. As Taleb and others have said, many complex systems (e.g., cells) have hormesis. All of these systems were fashioned by nature, none by people with jobs. No word means anti-fragile, as Taleb has said, because there exist no products or services with such a property. (Almost all adjectives and nouns were originally created to describe products and services, I believe. They helped people trade.) No one wanted to say “buy this, it’s anti-fragile.” Designers didn’t (and still don’t) know how to add hormesis. They may even be unaware the possibility exists. Products are designed by people with jobs. Taleb doesn’t have a job. Grasping the possibility of anti-fragility, which includes recognizing that tail events are underestimated, does not threaten his job or make it more difficult. If a designer tells her boss about hormesis, her boss might ask her to include it.
2. The Boeing 787 (Dreamliner) has had battery problems. The danger inherent in use of a lithium battery has a long-tailed distribution: Almost all uses are safe, a very tiny fraction are dangerous. In spite of enormous amounts of money at stake, Boeing engineers (people with jobs) failed to devise adequate battery testing and management. The FAA (people with jobs) also missed the problem.
3. The designers of the Fukushima nuclear power plant (people with jobs) were perfectly aware of the possibility of a tsunami. They responded badly (did little or nothing) when their assumptions about tsunami likelihood were criticized. The power of the rule is suggested by the fact that this happened in Japan, where most things are well-made.
4. Drug companies (people with jobs) routinely hide or ignore rare side effects, judging by the steady stream of examples that come to light. An example is the tendency of SSRIs to produce violence, including suicide. The whole drug regulatory system (people with jobs) seems to do a poor job with rare side effects.
Why is the rule true? Because jobs require steady output. Tech reviewers want to write a steady stream of reviews. The Mozilla Foundation wants a steady stream of updates. Companies that build nuclear power plants want to build them at a steady rate. Boeing wants to introduce new planes at a steady rate. Harvard professors (criticized by Taleb) want to publish regularly. At Berkeley, when professors come up for promotion, they are judged by how many papers they’ve written. Long-tailed distributions interfere with steady output. To seriously deal with them you have to measure the tails. That’s hard. Adding hormesis (Nature’s protection against tail events) to your product is even harder. Testing a new feature to learn its effect on tail events is hard.
This makes it enormously tempting to ignore tail events. Pretend they don’t exist, or that your tests actually deal with them. At Standard & Poor’s, which rated all sorts of financial instruments, people in charge grasped that they were doing a bad job modelling long-tailed distributions and introduced new testing software that did a better job. S & P employees rebelled: We’ll lose business. Too many products failed the new tests. So S & P bosses watered down the test: “If the transaction failed E3.0, then use E3Low [which assumes less variance].” Which test (E3.0 or E3Low) was more realistic? The employees didn’t care. They just wanted more business.
It’s easy to rationalize ignoring tail events: “Everyone ignores them.” “Next tsunami, I’ll be dead.” The real reason they are ignored is that if your audience is other people with jobs (e.g., a regulatory agency, reviewers for a scholarly journal, doctors), it will be easy to get away with ignoring them or making unrealistic assumptions about them. Tail events from long-tailed distributions make a regulator’s job much harder. They make a doctor’s job much harder. If doctors stopped ignoring the long tails, they would have to tell patients, “That drug I just prescribed? I don’t know how safe it is.” The hot potato (unrealistic risk assumptions) is handed from one person to another within a job-to-job system (e.g., drug companies market new drugs to the FDA and to doctors) but eventually the hot potato (or ticking time bomb) must be handed outside the job-to-job system to an ordinary Person X (e.g., a doctor prescribes a drug to a patient). It is just one of many things that Person X buys. He doesn’t have the time or expertise to figure out if what he was told about risk (the probability of very bad very rare events) is accurate. Eventually, however, inaccurate assumptions about tail events may be exposed when people without jobs related to the risk (e.g., parents whose son killed himself after taking Prozac, everyone in Japan, airplane passengers who will die in a plane crash) are harmed. Such people, unlike people with related jobs, are perfectly free to complain, and willful ignorance may come to light. In other words, doctors cannot easily complain about poor treatment of rare side effects (and don’t), but patients and their parents can (and do).
There are positive Black Swans too. In some situations, the distribution of benefit is very long-tailed. Almost all events in Category X produce little or no benefit; a tiny fraction produce great benefit. One example is scientific observations. Almost all of them have little or no benefit, a very tiny fraction are called discoveries (moderate benefit), and a very very tiny fraction are called great discoveries (great benefit). Another example is meeting people. Almost everyone you meet provides little or no benefit. A tiny fraction of the people you meet provide great benefit. A third example is reading something. In my life, almost everything I’ve read has had little or no benefit. A very tiny fraction of what I’ve read has had great benefits.
I came to believe that people with jobs handle long-tailed distributions badly because I noticed that jobs and science are a poor mix. My self-experimentation was science, but it was absurdly successful compared to my professional science (animal learning research). I figured out several reasons for this but in a sense they all came down to one reason: my self-experimentation was a hobby, my professional science was a job. My self-experimentation gave me total freedom, infinite time, and commitment to finding the truth and nothing else. My job, like any job, did not. And, as I said, I saw that scientific progress per observation had a power-law-like distribution: Almost all observations produce almost no progress, a tiny fraction produce great progress.
It is easy enough for scientists to recognize the shape of the distribution of progress per observation but, if you don’t actually study the distribution, you’re not going to have much of an understanding. Professional scientists ignore it. Thinking about it would not help them get grants and churn out papers. (Grants are given by people with jobs, who also ignore the distribution.) Because they don’t think about it, they have no idea how to change the “slope” of the power-law distribution (such distributions are linear on log-log coordinates). In other words, they have no idea how to make rare events more likely. Because it is almost impossible to notice the absence of very rare events (the great discoveries that don’t get made), no one notices. I seem to be the only one who points out that year after year, the Nobel Prize in Physiology/Medicine indicates lack of progress on major diseases. When I was a young scientist, I wanted to learn how to make discoveries. I was surprised to find that everything written on the topic — which seemed pretty important — was awful. Now I know why. Everything on the topic was written by a person with a job.
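The parenthetical above about the “slope” of a power-law distribution can be made concrete with a minimal sketch. It assumes a Pareto-type density p(x) proportional to x^(-alpha) for x at or above some x_min. The exponent values, the sample points, and the function names are illustrative assumptions, not estimates of anything about real science. The sketch shows two things: on log-log coordinates the density is a straight line with slope -alpha, and a shallower slope (smaller alpha) makes very large outcomes far more likely.

```python
import math


def power_law_density(x, alpha, x_min=1.0):
    """Pareto-type density: p(x) = (alpha - 1) / x_min * (x / x_min) ** -alpha."""
    return (alpha - 1.0) / x_min * (x / x_min) ** (-alpha)


def log_log_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    denominator = sum((a - mean_x) ** 2 for a in log_x)
    return numerator / denominator


def tail_probability(x, alpha, x_min=1.0):
    """P(X > x) for the density above; valid for alpha > 1 and x >= x_min."""
    return (x / x_min) ** (-(alpha - 1.0))


xs = [1.0, 2.0, 5.0, 10.0, 100.0, 1000.0]
for alpha in (3.0, 2.0):  # assumed exponents: steeper versus shallower slope
    slope = log_log_slope(xs, [power_law_density(x, alpha) for x in xs])
    print(f"alpha={alpha}: log-log slope={slope:.3f}, "
          f"P(X > 1000)={tail_probability(1000.0, alpha):.2e}")
```

In this toy model, “making rare events more likely” means lowering alpha, which flattens the line on log-log axes. What would lower it in practice is a separate question that the sketch does not address.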
With long-tailed distributions of benefit, there is nothing like hormesis. If any organism has evolved something to improve long-tailed distributions of benefit, I don’t know what it is. Our scientific system handles the long-tailed distribution of progress poorly in two ways:
1. The people inside it, such as professional scientists, do a poor job of increasing the rate of progress, i.e., making the tails thicker. I think you can make the tails thicker via subject-matter knowledge (Pasteur’s “chance favors the prepared mind”), methodological knowledge (better measurements, better experiments, better data analysis), and novelty. Professional scientists understand the value of the first two factors, but they ignore the third. They like to do the same thing over and over because it is safer. Great for their careers, terrible for the rest of us.
2. When an unlikely observation comes along, the system is not set up to develop it. An example is Galvani’s discovery of galvanism, which led to batteries, which led to widespread electricity. This one discovery, from one observation, arguably produced more progress than all scientific observations in the last 100 years. Galvani’s job (surgery research) left him unable to go further with his discovery. (“Galvani had certain commitments. His main one was to present at least one research paper every year at the Academy.”) His research job left him unable to develop one of the greatest discoveries of all time. In contrast, Darwin (no job) was able to develop the observations that led to his theory of evolution. It took him 18 years to write one book, longer than any job would have allowed. He wouldn’t have gotten tenure at Berkeley.
After a discovery has been made, the shape of the benefit distribution changes. It becomes more Gaussian, less long-tailed. As our understanding increases, science becomes engineering, which becomes design, which becomes manufacturing. Engineering and design and making things fit well with having a job. Take my chair. Every time I use it, I get a modest benefit, always about the same size. Every time I use my pencil, I get a modest benefit, always about the same size. No long-tailed distribution.
Modern science works well as a way of developing discoveries, not making them. An older system was better for encouraging discovery. Professors mainly taught. Their output was classes taught. They did a little research on the side. If they found something, fine, they had enough expertise to publish it, but nothing depended on their rate of publication. Mendel was expert enough to write up his discoveries but his job in no way required him to do so. Just as Taleb recommends most of your investments should be low-risk, with a small fraction high-risk, this is a “job portfolio” where most of the job is low benefit with high certainty and a small fraction of the job is high benefit with low certainty.
In the debate over climate change (is the case that humans are dangerously warming the planet as strong as we’re told?) it is striking that everyone with any power on the mainstream side of the debate (scientists, journalists, professional activists) has a job involving the subject. Everyone on the other side with any power (Stephen McIntyre, Bishop Hill, etc.) does not. People without jobs are much more free to speak the truth as they see it.
We need personal science (using science to help yourself) to better handle long-tailed distributions, but not just for that reason. Jobs disable people in other ways, too. Personal science matters, I’ve come to believe, for three reasons.
1. Personal scientists can make discoveries that professional scientists cannot, for all the reasons I’ve given. The Shangri-La Diet is one example. Tara Grant’s discovery of the effect of changing the time of day she took Vitamin D is another.
2. Personal scientists can develop discoveries that professional scientists cannot. Will there be a clinical trial of the Shangri-La Diet (by a professional weight-control researcher) in my lifetime? Who knows. It is so different from what they now believe. (When I applied to the UC Berkeley Animal Care and Use Committee for permission to do animal tests of SLD, I was turned down. It couldn’t possibly be true, said the committee.) Long before that, the rest of us can try it for ourselves and tell others what happened.
3. By collecting data, personal scientists can help tailor any discovery, even a well-developed one, to their own situation. For example, they can make sure a drug or a diet works. (That’s how my personal science started — testing an acne medicine.) They can test home remedies. By tracking their health with sensitive tests, they can make sure a prescribed drug has no bad side effects. Individualizing treatments takes time, which gets in the way of steady output. You have all the time in the world to gather data that will help you be healthy. Your doctor doesn’t. People who have less contact with you than your doctor, such as drug companies, insurance companies, medical school professors and regulatory agencies, are even less interested in your special case.