Bayes for days: What to do with signal
Authored By: Kara Lilly | Publish Date: June 22, 2017
On the morning of September 11, 2001, Christian Siva-Jothy and his colleagues at Goldman Sachs heard that a plane had crashed into the World Trade Center. Siva-Jothy, a pilot as well as a trader, turned to his colleagues and told them this was no accident; the sky was clear, so the crash must have been deliberate. He immediately began closing out the positions that would likely fall after a terrorist attack while purchasing options that would give him the right to buy government bonds in the future.1 In the days and weeks that followed, investors flocked to government bonds as Siva-Jothy had expected.
The story above touches on an important consideration for investors: how we incorporate signal. Investors are constantly receiving new information, updating their beliefs based on this information, and then reflecting their views in the positions they take. However, investors often err in how they weigh new information and, consequently, make sub-optimal decisions. This may sound like no big deal until you add up the cumulative consequences of many poor micro decisions over time. We need to ask: what is the best way to deal with new information?
Fortunately, there is a tool that can help us with this problem: Bayes’ Theorem (Bayes). An equation invented in the 18th century, Bayes is a formula that provides a rational way to update beliefs based on new evidence. Unfortunately, this tool is not widely used; even for those who are aware of it, Bayes can seem unwieldy and unintuitive.
This is the equation: P(H|E) = P(H) × P(E|H) / P(E), where H is the hypothesis and E is the evidence. Don’t worry if the math looks a bit daunting; it’s far more straightforward than it seems.
Bayes needn’t be unwieldy. Even when it is not used in a strict mathematical sense, Bayes can be a powerful mental model for checking assumptions. On our team, we use Bayes to help us distinguish signal vs. noise and update our beliefs once new information comes to light. In fact, when you deconstruct the equation, you realize the whole thing is basically asking three simple questions, ones that investors should always be asking themselves because they are excellent gut checks.
While deconstructing a math equation may seem dry at first (“boring,” if you will), it is actually fascinating and—as you will soon see—fun. Moreover, Bayes provides a framework for clear, level-headed thinking (a rarity these days, it seems).
Imagine that your friend has been dating a girl he met online. He describes his new girlfriend as nice, pretty and introverted. Your friend is very excited, as he is quiet and loves to spend Sunday morning reading. Based on this information, do you believe your friend’s new girlfriend is most likely a librarian, a bartender or a salesperson?
If you answered librarian, you would be among the majority. You would also most likely be wrong. When most people read this question and see the girl’s description, “librarian” seems to be the logical choice. However, the actual probability that she is a librarian is quite low as there simply aren’t that many in the economy (it’s a small pool vs. bartenders and sales people).* This example demonstrates how we might benefit from an understanding of Bayes.
Before we examine the equation in depth, we need to take a 10,000-foot view of the theory. Remember: ultimately, Bayes is a tool that helps us update our beliefs (expressed as probabilities) when given new pieces of information. Using Bayes, one might answer the following: What is the chance that Donald Trump will win Michigan and Pennsylvania, given that he just won Virginia?
Here is the beauty of Bayes; the equation just asks three questions:
1) What is the base rate?
2) How rare is the evidence?
3) How relevant is the evidence to the hypothesis?
A base rate is the prior probability of something happening (often referred to simply as “priors”). This is an important number to understand, as it has a significant impact on the end result.
In the librarian example, we would ask: what is the probability that anyone is employed as a librarian? In this case the probability your friend’s girlfriend is a librarian would be 0.11%. This is because there are very few librarian jobs (166,164)2 in the overall job market (146 million)3. Now you have discovered the base rate: 0.11%.
The equation looks like this: P(H) = 166,164 / 146,000,000 ≈ 0.11%.
Another example could be trying to determine the chance your child has done her homework at the end of the night. Absent any other information, you would reference her historical pattern to arrive at your conclusion. So, if she finished her homework 95 times in the last 100 days, the base rate would be 95% that she completed her homework this evening.
It’s critical when using Bayes to start with these historical figures. It is only after establishing the base rate that additional evidence is incorporated.
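As a minimal sketch, both base rates above are simply a count of occurrences divided by a total. The function name `base_rate` is an illustrative choice, not something from the article; the figures are the ones quoted above.

```python
def base_rate(occurrences: int, total: int) -> float:
    """Share of cases in which the outcome occurred, before any other evidence."""
    if total <= 0:
        raise ValueError("total must be positive")
    return occurrences / total


# Librarian jobs vs. all U.S. jobs (figures quoted in the article).
librarian_rate = base_rate(166_164, 146_000_000)

# Homework completed on 95 of the last 100 days.
homework_rate = base_rate(95, 100)

print(f"P(librarian) = {librarian_rate:.2%}")      # P(librarian) = 0.11%
print(f"P(homework done) = {homework_rate:.0%}")   # P(homework done) = 95%
```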
When signals are rare, it is far more likely that the evidence will be strong.
For example, in the question about your friend’s new girlfriend, we offered several pieces of evidence about the girl. We said that she was (1) a girl, (2) nice, (3) someone he met online, (4) pretty, and (5) introverted. In sum, we gave five pieces of evidence.
But was any of this evidence predictive of her being a librarian? Of course not, it’s all way too common. So what if the girl is introverted? Psychologists expect that as many as 50% of the population is introverted.4 Moreover, it is unremarkable for someone to be a girl, described as nice and pretty, and for people to meet online.
We can easily see how this is applicable to investing. Imagine you want to invest in a wealth-creating business and are looking for management teams that will increase the odds of this. What kind of signals could you look for?
According to Bayes, you want to look for signals that are rare. If a CEO is under investigation by the authorities for something as significant as fraud, this could be important—and it’s rare. In comparison, knowing whether or not a CEO drinks water is superfluous.
Returning to the example of your child’s homework, you know that your child’s base rate for completing her homework is 95%. However, what if you learned that she forgot her backpack on the school bus? This would be an abnormal event—rare enough that it probably is worth paying attention to.
When evidence is rare, P(E) ends up being smaller. Because P(E) sits in the denominator of the equation, a smaller value drives up the chance of your hypothesis being true.
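To see the effect of rarity, the sketch below holds the base rate and the relevance fixed and varies only P(E). The prior of 10% and P(E|H) of 50% are hypothetical numbers chosen for illustration; they do not come from the article.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes' Theorem: P(H|E) = P(H) * P(E|H) / P(E)."""
    return prior * likelihood / evidence


prior = 0.10       # hypothetical base rate, P(H)
likelihood = 0.50  # hypothetical relevance, P(E|H)

results = {}
for evidence in (0.50, 0.25, 0.10):  # evidence gets rarer on each pass
    results[evidence] = posterior(prior, likelihood, evidence)
    print(f"P(E) = {evidence:.0%} -> P(H|E) = {results[evidence]:.0%}")
```

With everything else unchanged, the rarer the evidence, the larger the updated probability.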
Clearly, strong signals are going to be relevant to the hypothesis in question.
Returning to the librarian example, it’s obvious that our pieces of evidence are not really relevant to the hypothesis. What does a subjective assessment that his girlfriend is “nice” have to do with the probability that she is also a librarian? Very little.
Likewise, in the example of the search for management teams that would best create wealth, it’s easy to see how some signals could be relevant versus others. A CEO that is being investigated for fraud is going to be an important (relevant) signal, whereas it would be immaterial to learn they are vegan and despise steak. Even if you are passionate about red meat, it’s pretty hard to draw the conclusion that a CEO is going to run a company poorly because they don’t eat meat or dairy (unless, perhaps, they are running McDonald’s).
And if your child forgot her backpack on the school bus? This development seems likely to be relevant, given that the homework is presumably inside said backpack.
How do you actually figure this number out? You need to examine cases where the outcome/hypothesis (H) is true and see how many times the signal pops up.
For example, in the question about your friend’s new girlfriend, we would examine whether introversion is related to being a librarian. Thus, we’d ask: in all the times when the hypothesis was true (in the population of librarians), how many are introverted?
In the example of your child and her homework, we would ask: of all the times the hypothesis was true (homework was done), how often was the backpack missing?
This last example is interesting because it shows just how powerful the signal portion of the equation can be. If your child has never completed her homework on a night when her backpack was missing, then P(E|H) would be zero, the numerator would be zero, and the entire Bayes equation would collapse to zero. Thus, the probability of your child finishing her homework tonight, given that she left her backpack behind, would be zilch.
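The collapse to zero can be checked with a few lines. The 95% base rate is from the homework example; the 2% figure for how often the backpack goes missing is a hypothetical placeholder, since the article gives no number for it.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes' Theorem: P(H|E) = P(H) * P(E|H) / P(E)."""
    return prior * likelihood / evidence


prior = 0.95       # base rate: homework done on 95 of the last 100 days
likelihood = 0.0   # P(E|H): homework was never done with the backpack missing
evidence = 0.02    # P(E): hypothetical share of nights the backpack is missing

p_done = posterior(prior, likelihood, evidence)
print(f"P(homework done | backpack missing) = {p_done:.0%}")
```

However high the base rate, a zero in the numerator leaves nothing to scale.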
It’s time to put it all together.
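And here it is, deconstructed: P(H|E) = P(H) × P(E|H) / P(E), where P(H) is the base rate, P(E) is how rare the evidence is, and P(E|H) is how relevant the evidence is to the hypothesis.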
What is the probability that your friend’s new girlfriend is a librarian? Below are the steps using Bayes.
To arrive at these probabilities, we’d need to do some research.
For example, to find the base rate—the % of librarians in the population—we could go to the American Library Association. As of April 2015, the ALA put the number of librarians in the U.S. at 166,164. This compares to the 146 million jobs in the U.S. This would make the base rate 0.11%.
To find how rare our evidence is, P(E), we’d look at the % of introverts in the population. According to the 1998 MBTI Manual, which surveyed Myers-Briggs personalities in the U.S., nearly half the population is introverted. Thus, we take a value of 50% for P(E).5
Finally, to find the relevancy of the evidence, P(E|H), we’d want to look at the % of librarians who are introverts. To find this number, we might review the 1992 study by Mary Jane Scherdin, which tested 1,600 librarians to determine their Myers-Briggs Type Indicator. This survey found that 63% of librarians are introverts.6
Next, we’d plug these numbers into the equation. We’d get: P(H|E) = 0.11% × 63% / 50% ≈ 0.14%.
And now we have an answer. The probability that your friend’s girlfriend is a librarian given that she is an introvert would be less than 1%!
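The arithmetic can be reproduced in a short sketch using only the figures quoted above. The variable names are illustrative.

```python
# Inputs quoted in the article.
prior = 166_164 / 146_000_000  # P(H): share of U.S. jobs that are librarian jobs
likelihood = 0.63              # P(E|H): share of librarians who are introverts
evidence = 0.50                # P(E): share of the population that is introverted

# Bayes' Theorem: P(H|E) = P(H) * P(E|H) / P(E)
post = prior * likelihood / evidence

# How much the evidence moved the base rate: P(E|H) / P(E)
uplift = post / prior

print(f"P(librarian | introvert) = {post:.2%}")   # P(librarian | introvert) = 0.14%
print(f"Uplift over base rate = {uplift:.2f}x")   # Uplift over base rate = 1.26x
```

The evidence multiplies the base rate by only 1.26, so a near-zero prior stays near zero.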
What useful insights can we make with this example? First and foremost, notice how the base rate simply dominates the answer. The very low base rate—nearly zero—drives the result. This is one reason why it’s so important to pay attention to base rates instead of getting swept away in a narrative about a particular “signal.”
Second, in this instance, we saw that the evidence slightly increased the chances that she was a librarian (although not by much). This actually tells us quite a bit about how the math works for the right hand side of the equation. The more a piece of evidence is related to the hypothesis being true, and the rarer the evidence, the more likely the hypothesis is true.
How applicable is this to investing? Very. Every single day we receive new data and must integrate this information into our beliefs. Bayes is an integral tool in our arsenal for doing so in a rational manner.
For example, let’s say that you are evaluating the probability of a U.S. recession. With no other information, what would be your best estimate of the probability of recession next year? The base rate, of course! In this case, we might say the base rate is 13% (since 1962, the time period we will use in this example, there have been 7 instances of recessions in the U.S. over 55 years).*
But what if I told you the yield curve was inverted and it had been for months? (FYI reader—this is not actually the case!) How would you update your beliefs?
If you were using Bayes to answer this question, you would then evaluate the evidence component and use it to adjust your base rate.
First, you might look at P(E): the probability of the evidence. How often has the yield curve inverted in the U.S. since 1962? In this case, you’d find that the yield curve has inverted meaningfully only 12 times in that time period—22%. Among market signals, this would be uncommon.
Second, you might look at P(E|H): the relationship between the hypothesis and the evidence. In how many U.S. recessions did the yield curve invert beforehand? Here, we would find that a relationship seems to exist: in the 7 recessions since 1962, the yield curve was inverted every time. This would give us a P(E|H) of 100%.
Therefore, the probability of a recession in the next year, given the yield curve is inverted, would be 59%. This would be an important signal.
Solving for the probability of recession, given yield curve inversion: P(H|E) = 13% × 100% / 22% ≈ 59%.
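A short sketch of the same calculation follows. It assumes, as the 22% figure implies, that the 12 inversions are counted against the same 55-year window as the 7 recessions. With the rounded inputs used in the text the result is 59%; with unrounded fractions it is 7/12, or about 58%.

```python
from fractions import Fraction

YEARS = 55  # 1962 onward, the window used in the article

prior = Fraction(7, YEARS)      # P(H): 7 recessions in 55 years (about 13%)
evidence = Fraction(12, YEARS)  # P(E): 12 inversions in 55 years (about 22%)
likelihood = Fraction(1)        # P(E|H): curve inverted before every recession

# Exact result from the raw counts.
exact = prior * likelihood / evidence

# Result from the rounded percentages quoted in the text.
rounded = 0.13 * 1.00 / 0.22

print(f"Exact:   {exact} = {float(exact):.0%}")  # Exact:   7/12 = 58%
print(f"Rounded: {rounded:.0%}")                 # Rounded: 59%
```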
Bayes is an extremely useful mental model for investors. Not only does it provide a theoretically sound process for updating beliefs, it helps us better understand signal vs. noise and what kind of evidence to look for. Moreover, any investor can gain from systematically looking for base rates and questioning the worth of the evidence before them. It is significant that many of the world’s best forecasters and machine learning methods are fundamentally Bayesian.
As with almost everything in life, there are challenges with using Bayes. One of the greatest is that the probabilities we seek are often not easy, or possible, to observe. In these cases, Bayes can quickly become too speculative. The way we counteract this challenge is by going through many rounds of looking for signal and updating our beliefs. (There is a mathematical process for doing this, but it rests beyond the scope of this discussion). Another challenge is that Bayes is based on historical patterns and relationships. These can and do change over time.
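One common way to formalize repeated updating, offered here as an assumption since the article does not spell out its process, is to feed each updated belief back in as the next base rate. The sketch starts from the 13% recession base rate; the three signals and their probabilities are hypothetical, and the signals are assumed to be independent of one another given the outcome.

```python
def update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """One round of Bayes, with P(E) built from the law of total probability."""
    evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / evidence


belief = 0.13  # starting base rate (recession example in the article)

# Hypothetical signals as (P(E|H), P(E|not H)) pairs.
# The first two favor the hypothesis; the third counts against it.
signals = [(0.80, 0.20), (0.60, 0.40), (0.30, 0.70)]

history = [belief]
for p_h, p_not_h in signals:
    belief = update(belief, p_h, p_not_h)  # yesterday's posterior is today's prior
    history.append(belief)

for round_number, value in enumerate(history):
    print(f"After {round_number} signal(s): {value:.1%}")
```

Supportive signals push the belief up and contrary ones pull it back down, so no single data point has to carry the whole conclusion.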
Nonetheless, Bayes is an excellent tool even if it’s imperfect. Clearly, Bayesian analysis (in its mathematical form) has a place in investing, where it can inform the user on the probability of an outcome once a new card turns up—particularly since people are generally pretty bad at coming to statistical answers intuitively.
But even when we aren’t running through the math and just asking ourselves those three questions, Bayes offers a good gut check and defense against sloppy thinking, which is essential in a world of rampant storytelling.
Kara Lilly (CFA) is the investment strategist at Mawer Investment Management Ltd.
1 Drobny, Steven. Inside the House of Money: Top Hedge Fund Traders on Profiting in the Global Markets. John Wiley & Sons, Inc. Hoboken, New Jersey. 2006.
2 American Library Association – http://www.ala.org/tools/libfactsheets/alalibraryfactsheet02
3 Bureau of Labor Statistics – https://data.bls.gov/timeseries/CES0000000001
4 Myers, I.B., McCaulley, M.H., Quenk, N.L., & Hammer, A.L. (1998). MBTI Manual: A guide to the development and use of the Myers-Briggs Type Indicator (3rd ed.) Palo Alto, CA: Consulting Psychologists Press
6 Mary Jane Scherdin and Anne K. Beaubien. “Shattering Our Stereotype: Librarians’ New Image,” Library Journal 120, no. 12 (1995): 35-38.