MLODINOW As I start talking about events in the world around us and looking at the psychological components–and I dealt with that, I greatly expanded that part–they were fascinating studies and I was just so interested I just kept putting more and more into the book.
ROBERTS Yes, that’s when you decided to ask me for help. “Oh, I wasn’t planning on this.” How did you learn about the lottery winner who won twice–the Canadian?
MLODINOW It was in a book somewhere, an academic book. A lot of those interesting stories came from academic papers or books.
ROBERTS That’s interesting.
MLODINOW Sometimes I’ll find something in the newspaper that was really interesting and I would track it down but a lot of it was in academic research. I don’t know why they found it.
ROBERTS Yes, who knows where they got it, but that’s where you got it. How did you learn about the Girl Named Florida stuff? Some professor told you?
MLODINOW My friend Mark Hillery that I mentioned from Berkeley.
ROBERTS A physics professor.
MLODINOW He heard it somewhere… It wasn’t quite this problem but then I kind of tweaked it and made it the Girl Named Florida Problem. That’s a great problem for the book.
ROBERTS Yes, I loved that. So he got it from some physicist . . .
MLODINOW I’m not sure; probably. I took a few days to figure out how to make it into this problem; I don’t remember exactly the problem he told me but I tweaked it into this problem. Just to show you how much work goes into the book, I even spent a whole afternoon deciding on the name Florida. I went back into the records–I needed a rare name–and I looked up different names and tried to find one that would be colorful, interesting, but that was rarely used, and I wanted to know the percentage that it was used; I dug up percentages of names. Everything in the book . . . if you read it, it might just sound like, ’Oh, you know’ . . .
Not a thing is just tossed out there. Or very little; there’s an amazing amount of thought and work that goes behind every little detail.
ROBERTS That’s a very memorable detail I must say. I like it better than the Monty Hall Problem.
MLODINOW I do, too. I think it’s interesting; I found in the reactions to the book that the Monty Hall Problem has gotten more press and in some ways more reactions, which I found interesting given that it has been talked about before and this problem was completely new. I think this problem is in some ways even more striking than the Monty Hall Problem, more counterintuitive and more difficult to believe and certainly closer to something you might actually encounter. And yet I’ve gotten a lot more response based on the Monty Hall Problem and a few places have said that I gave the best explanation they’ve seen. I think the New York Times review said that, too. The New York Times did mention the Girl Named Florida Problem and said that they still find it hard to believe even though they followed the explanation.
ROBERTS I thought your explanation of the Girl Named Florida problem was very clear.
I am very familiar with the Girl Named Florida problem — I have read Mlodinow’s book and other sources that cited it. I agree with the answer, though I don’t really find the explanation all that satisfying either in Mlodinow’s book or on other websites. I came up with my own explanation that arrives at the same result and works better for me. I’m hoping that my alternate explanation will help others to understand the problem and the result.
As Mlodinow explains, the key to solving many difficult problems in probability is ensuring that you arrive at the correct sample space. Usually in confusing problems like this one, there are multiple “filters” on a larger sample space that must be applied to screen out irrelevant outcomes. The order in which these filters are applied doesn’t really matter from a mathematical standpoint, but it can make a huge difference in whether the explanation helps to clear up the confusion. In this problem, I think this is a key factor and for me (and I’m guessing many others) it helps to apply the filters in a different order than that chosen by Mlodinow.
Mlodinow’s explanation starts with an original sample of all families with children. This is reduced to all families with two children, then all families with two children one of whom is a girl, then to all families with two children one of whom is a girl named Florida.
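That chain of filters can be checked by exact enumeration. The sketch below is only an illustration, and it rests on an assumption that is mine rather than the book's wording: each girl is independently named Florida with some small probability `p`, and boys are never named Florida. The function name and `p` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_two_girls_given_florida(p):
    """P(both children are girls | at least one is a girl named Florida).

    Assumption (not from the source): each girl is independently named
    Florida with probability p, and boys are never named Florida.
    """
    half = Fraction(1, 2)
    # Each child is a boy, a girl named Florida, or a girl with another name.
    # Tuple layout: (gender, is_florida, probability).
    child_types = [
        ("B", False, half),
        ("G", True, half * p),
        ("G", False, half * (1 - p)),
    ]
    kept = Fraction(0)       # weight of families that survive the last filter
    two_girls = Fraction(0)  # weight of surviving families with two girls
    for first, second in product(child_types, repeat=2):
        weight = first[2] * second[2]
        if first[1] or second[1]:  # filter: at least one girl named Florida
            kept += weight
            if first[0] == "G" and second[0] == "G":
                two_girls += weight
    return two_girls / kept


print(prob_two_girls_given_florida(Fraction(1, 100)))  # a rare name
print(prob_two_girls_given_florida(Fraction(1)))       # every girl is "Florida"
```

Under this assumption the result works out to (2-p)/(4-p): close to 1/2 when the name is rare, and exactly 1/3 when the name carries no information at all.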
Now here is my approach. Start with all families with two children. In this particular problem it was useful for me to think of the family as being represented by a particular set of parents rather than a particular set of children. The reason is that the child’s name is an attribute that tells you more about the parents than the child since they are the ones who chose it. So let’s then reduce the set of all parents of two children to the set of all parents of two children for whom the name Florida ranks in their top 2 favorite names for girls. Further dissecting this set we observe that there are twice as many families with a boy and a girl as there are families with two girls. (This should be obvious since it just involves counting the four possible outcomes, discarding the two-boy families, and looking at the proportion of the remaining possible outcomes: BG, GB, GG.) Now, all of the GG families will have at least one Florida. I’ll assume as Mlodinow did that the number of parents who chose to name both of their girls Florida is immaterial. For the families with only one girl, it depends on whether Florida was the parents’ first choice or second choice. Let’s assume that this is a 50/50 split. Half of the one-girl families then have a Florida, and since there are twice as many one-girl families as GG families, the two groups are the same size. It now should be clear that a girl named Florida with one sibling is equally likely to have a sister as she is to have a brother.
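The counting in this approach is short enough to write out. Here is a minimal sketch of it, restricted to parents who have Florida in their top two girl names. The function name and the `first_choice_share` parameter (the fraction of those parents who rank Florida first) are hypothetical labels of mine; the 50/50 split is the assumption stated above.

```python
from fractions import Fraction


def sister_probability(first_choice_share=Fraction(1, 2)):
    """P(a girl named Florida with one sibling has a sister).

    Only families whose parents rank Florida among their top two girl
    names are counted. first_choice_share is the assumed fraction of
    those parents for whom Florida is the first choice.
    """
    quarter = Fraction(1, 4)  # BB, BG, GB, GG are equally likely
    # GG families use both favorite names, so one of the girls is Florida.
    florida_with_sister = quarter
    # BG and GB families have a Florida only if it was the first choice.
    florida_with_brother = 2 * quarter * first_choice_share
    return florida_with_sister / (florida_with_sister + florida_with_brother)


print(sister_probability())                # 50/50 split between first and second choice
print(sister_probability(Fraction(3, 4)))  # Florida is usually the first choice
```

With the 50/50 split the answer is exactly 1/2. Moving the split away from 50/50 moves the answer too, which shows how much weight that one assumption carries.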
Not only do I find my own explanation easier to follow, I think it also provides some additional subtle insights that were missing in Mlodinow’s answer: (1) By focusing on parents’ preferences, it is clearer that names are randomly distributed between families, but perhaps not within them. The chance that two girls in the same family will both be named Florida is probably much less than the square of the probability that one girl is named Florida. (2) When you use my approach, another important assumption becomes evident. Are parents more likely to give the first child or the second child an unusual name? A hidden assumption in Mlodinow’s solution is that parents are equally likely to grant an unusual name to the first-born and the second-born. With my approach, this assumption is more exposed. This is probably a very testable assumption, though I have no idea whether it is true or not. I wouldn’t be surprised if the evidence showed a bias one way or the other, greatly altering the result.
I’ve developed an analogy for these kinds of probability problems by generalizing a paradox used by Joseph Bertrand, in 1889, to illustrate how *NOT* to solve probability problems. I believe Professor Mlodinow referred to it in his book; most treatments of problems like this do. And it’s ironic, since most will use that incorrect solution for the Two Child Problem.
Assume there is a set of circumstances that can arise through random means. For lack of a better word, I will call the set of circumstances a box; but game shows, and families of two, work as well. Then, assume that there is a quality of these boxes that can take on exactly two values, representing symmetric but opposite sides of that quality. I will call them Value 1 and Value 2, or V1 and V2 for short. All boxes will have one of these values, but some have both.
Because they are symmetric, the probability that a random box has only V1 is the same as the probability that it has only V2. Call this probability P, and note that it must be less than 1/2. We can deduce that the probability that a random box has just one value is 2P, and the probability that it has both values is (1-2P).
Bertrand used actual boxes. Each held two coins; one held two gold coins, one held two silver, and one held one of each color. The values were “the box holds at least one gold coin” and “the box holds at least one silver coin.” Since one of the three boxes holds only gold coins, P=1/3.
Now suppose I pick a random box, and after examining it without letting you see, I tell you that it has V1. What is the probability that it does not have V2? You might be tempted to say it is P/(P+(1-2P))=P/(1-P), which is the ratio of boxes with just V1 to all that have V1. But if you give that answer, you also say the same probability is P/(1-P) if I were to tell you it had V2, and ask for the probability it doesn’t have V1. And if the probability of one value is P/(1-P) regardless of what value I tell you it has, the probability is P/(1-P) even if I don’t tell you about a value. But we have already deduced that probability is 2P, so we have a paradox; specifically, a more general form of Bertrand’s Box Paradox.
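To put numbers on the two candidate answers, here is a small sketch using exact fractions. It assumes, by the symmetry of the setup, that when a box has both values I announce V1 half of the time; the function name is a hypothetical label.

```python
from fractions import Fraction


def box_answers(p):
    """Return (naive, consistent) values for P(box lacks V2 | told it has V1).

    p is the probability that a random box has only V1 (equally, only V2).
    """
    only_v1 = p
    both = 1 - 2 * p
    # Naive: divide by every box that has V1, as if V1 were always announced.
    naive = only_v1 / (only_v1 + both)
    # Assumption: a box with both values has V1 announced half of the time.
    told_v1 = only_v1 + both * Fraction(1, 2)
    consistent = only_v1 / told_v1
    return naive, consistent


for p in (Fraction(1, 3), Fraction(1, 4)):
    naive, consistent = box_answers(p)
    print(p, naive, consistent)
```

The consistent value always equals 2P, because the number of times V1 is announced, P + (1-2P)/2, is 1/2 for any P.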
In Bertrand’s specific example, one coin was withdrawn and its color observed. Since two of the three boxes are still possible, you might be tempted to say the chances are even that the remaining coin is gold. But since you will always remove a gold coin this way from a box with two, but one half of the time from a box with one of each, the chances are 2P=2/3.
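Bertrand's three boxes are small enough to enumerate completely. The sketch below weights every (box, coin drawn) pair equally and keeps only the draws that produce a gold coin. The names are my own.

```python
from fractions import Fraction

# Bertrand's three boxes, two coins each.
BOXES = [("gold", "gold"), ("silver", "silver"), ("gold", "silver")]


def other_coin_is_gold_given_gold_drawn():
    """P(remaining coin is gold | a randomly drawn coin was gold)."""
    drew_gold = Fraction(0)
    other_is_gold = Fraction(0)
    for box in BOXES:
        for i in (0, 1):  # either coin is equally likely to be drawn
            weight = Fraction(1, len(BOXES)) * Fraction(1, 2)
            if box[i] == "gold":
                drew_gold += weight
                if box[1 - i] == "gold":
                    other_is_gold += weight
    return other_is_gold / drew_gold


print(other_coin_is_gold_given_gold_drawn())
```

Three of the six equally likely draws show gold, and two of those three come from the all-gold box, which gives the 2/3.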
The Monty Hall Problem is an example of a generalized Box problem, identical to Bertrand’s actual example, although the quality that makes it so is a bit unintuitive. The boxes – the sets of circumstances – are the events that lead up to you being offered a chance to switch doors. The quality is which of the two doors you didn’t choose hide goats; each game must have at least one goat behind those doors, and some have two. The host reveals one of these values to you by opening a door without the car. And you want to switch if the game has just one value. Most people think the chance that switching will win is (1/3)/(1-1/3)=1/2. Bertrand’s answer of 2P=2/3 says that you do better by switching.
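The same kind of enumeration works for the game show. This sketch assumes the standard rules: the car is equally likely to be behind any door, the host always opens an unchosen door with a goat, and he chooses at random when both unchosen doors have goats. The function name is hypothetical.

```python
from fractions import Fraction


def switch_win_probability():
    """P(switching wins) under the standard Monty Hall rules."""
    doors = (0, 1, 2)
    pick = 0  # by symmetry, fix the contestant's first pick
    win = Fraction(0)
    for car in doors:
        # Doors the host is allowed to open: unchosen and hiding a goat.
        goat_doors = [d for d in doors if d != pick and d != car]
        for opened in goat_doors:
            weight = Fraction(1, 3) * Fraction(1, len(goat_doors))
            switched_to = next(d for d in doors if d not in (pick, opened))
            if switched_to == car:
                win += weight
    return win


print(switch_win_probability())
```

Switching loses only in the games where both unchosen doors hide goats, and those make up 1/3 of all games.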
The two-child problem is another example. The boxes are two-child families, and the quality is whether the family includes boys and/or girls. Since, of the four possibilities, exactly one has only girls, P=1/4. Your incomplete memory of these genders reveals one value to you. The correct answer to the question “what are the chances both are the gender you recall” is not (1/4)/(1-1/4)=1/3, it is 2*(1/4)=1/2.
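Which number you get depends on how the gender comes to be known, and both cases can be enumerated. In this sketch, `random_recall=True` models remembering the gender of one child chosen at random, and `random_recall=False` models being told "girl" whenever the family has any girl at all. The function name and the flag are hypothetical labels for those two assumptions.

```python
from fractions import Fraction
from itertools import product


def both_girls_given_girl(random_recall):
    """P(two girls | the gender you learn about is "girl")."""
    families = list(product("BG", repeat=2))  # BB, BG, GB, GG
    told_girl = Fraction(0)
    both_girls = Fraction(0)
    for family in families:
        weight = Fraction(1, len(families))
        girls = family.count("G")
        if random_recall:
            # You recall the gender of one child picked at random.
            p_girl = Fraction(girls, 2)
        else:
            # "Girl" is reported whenever the family has at least one girl.
            p_girl = Fraction(1 if girls else 0)
        told_girl += weight * p_girl
        if girls == 2:
            both_girls += weight * p_girl
    return both_girls / told_girl


print(both_girls_given_girl(random_recall=True))
print(both_girls_given_girl(random_recall=False))
```

The first call gives 1/2 and the second gives 1/3, so the disagreement is really about which of the two models describes the situation.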
That’s right, 1/2. Professor Mlodinow’s answer, 1/3, is wrong. It would be correct if, for families of one boy and one girl, you could never remember that one was a boy. Since you can’t assume that, 1/3 can’t be right. And the reason his Florida answer seems unintuitive is that it does not allow you to remember a girl named Mary, or a boy named Indiana, if they have a sister named Florida.