
What Is Your Risk Of Exposure To Coronavirus In A Public Outing?

If you're going to a social event with lots of people, what is your risk of coming into contact with someone who has been infected by the coronavirus?

Alex Tabarrok posted the math that can provide insight into the answer to that question, which we'll convert into a tool you can use to run your own scenarios. Here's how he describes the math:

The mathematics for calculating the probability of exposure given the number of carriers in a population and group size aren’t difficult but they can be surprising. Even a low number of carriers can generate a relatively high probability for reasonably sized groups. For example, assume you run a firm of 1000 people in the San Francisco Bay Area (population 8 million.) Let’s suppose that there are just 500 carriers in the area. In this case, assuming random draws, the probability that at least one of your employees is a carrier is 6%. You can run your own calculations at Wolfram Alpha following this format:

p=8000000, c=500, g=1000, 1-(1-c/p)^g //N

where p is the population size, c is the number of carriers, g is the group size and the //N at the end isn’t a division but a command to Wolfram Alpha to give you a numerical answer.
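For readers who would rather not use Wolfram Alpha, here is a minimal Python sketch of the same formula. The function name `exposure_probability` and its argument names are our own, chosen for illustration. Like the formula it copies, it treats each group member as an independent random draw from the population.

```python
def exposure_probability(population: int, carriers: int, group_size: int) -> float:
    """Probability that a randomly drawn group contains at least one carrier.

    Implements 1 - (1 - c/p)^g from the quoted passage. The function name and
    the input checks are illustrative additions, not part of the original.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= carriers <= population:
        raise ValueError("carriers must be between 0 and population")
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    # Chance that any single randomly drawn person is NOT a carrier.
    p_not_carrier = 1 - carriers / population
    # Chance that every one of the g independent draws misses all carriers,
    # then take the complement to get "at least one carrier".
    return 1 - p_not_carrier ** group_size


# Bay Area example from the quoted passage: 8 million people, 500 carriers,
# and a firm of 1,000 employees. Prints a value of about 6%.
print(f"{exposure_probability(8_000_000, 500, 1_000):.1%}")
```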

We're huge fans of Wolfram Alpha, but its data entry is a bit cumbersome, so hopefully you'll find the user interface in the following tool to be a bit more friendly. If you're accessing this article on a site that republishes our RSS news feed, please click through to our site to access a working version.

Data for Population, Group, and Number of Infected
Input Data Values
Overall Population
Number of Infected (Carriers) Within the Overall Population
Size of a Group (Subset of the Overall Population)

Probability of Carrier Within The Group
Calculated Results Values
Probability of a Carrier Being Within the Group

One important thing to note is that the probability the tool calculates assumes that carriers are randomly distributed among the whole population. In practice, many will be concentrated within smaller groups of the population, where effective quarantine practices, or what the kids are calling "social distancing" these days, will reduce the probability of exposure.

Alex makes the key point for what we can learn from the math if you're thinking of going to a public event while the risk of viral infection remains high:

Now here is the most important point. It’s the size of the group, not the number of carriers that most drives the result. For example, suppose our estimate of the number of carriers is off by a factor of 10, that is, instead of 20,000 there are just 2000 carriers in the United States. In this case, the probability of at least one carrier at a big event of 100,000 drops not by a factor of ten but just to 45%. In other words, large events are a bad idea even in scenarios with just a small number of carriers.
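The scenario in that passage can be checked with a few lines of Python. This is a sketch under one stated assumption: the quote does not give the U.S. population it used, so we plug in roughly 330 million, which reproduces the quoted 45% figure.

```python
def exposure_probability(population, carriers, group_size):
    """1 - (1 - c/p)^g: chance a random group holds at least one carrier."""
    return 1 - (1 - carriers / population) ** group_size


US_POPULATION = 330_000_000  # ASSUMPTION: not stated in the quoted passage
EVENT_SIZE = 100_000         # the "big event" from the quote

# Compare the original carrier estimate with one that is ten times smaller.
results = {
    carriers: exposure_probability(US_POPULATION, carriers, EVENT_SIZE)
    for carriers in (20_000, 2_000)
}

for carriers, probability in results.items():
    print(f"{carriers:>6,} carriers -> {probability:.1%}")
```

With 20,000 carriers the probability comes out above 99%. Cutting the carrier count by a factor of ten still leaves it at roughly 45%.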

You can confirm that insight using the tool above, or if you prefer, here is Joshua Weitz' COVID-19 Event Risk Assessment Planner that presents the data visually:

Joshua Weitz' COVID-19 Event Risk Assessment Planner Chart

Numbers like these are why the NBA has suspended its season.


3/10/17: Ambiguity Fun: Perceptions of Rationality?



Here is a very insightful set of plots, well worth studying, showing the range of probabilities that people assign to subjective verbal descriptions of likelihood (such as "Highly Unlikely" or "Probably"). Source: https://github.com/zonination/perceptions




The charts above speak volumes about both our (human) behavioural biases in assessing the probabilities of events and the nature of subjective distributions.

First on the former. As our students (in all of my courses, from Introductory Statistics, to Business Economics, to advanced courses of Behavioural Finance and Economics, Investment Analysis and Risk & Resilience) would have learned (to a varying degree of insight and complexity), the world of Rational expectations relies (amongst other assumptions) on the assumption that we, as decision-makers, are capable of perfectly assessing true probabilities of uncertain outcomes. And as we all have learned in these classes, we are not capable of doing this, in part due to informational asymmetries, in part due to behavioural biases and so on. 

The charts above clearly show this. There is a general trend in people assigning increasingly lower probabilities to less likely events, and increasingly larger probabilities to more likely ones. So far, good news for rationality. The range (spread) of assignments also becomes narrower as we move to the tails (lower and higher probabilities assigned), so the degree of confidence in assessment increases. Which is also good news for rationality. 

But that is where the evidence of rationality ends.

Firstly, note the S-shaped nature of the distributions from higher assigned probabilities to lower. Clearly, our perceptions of probability are non-linear, with the decline in likelihood assignments being steeper in the middle of the range of perceptions than at the extremes. This is inconsistent with rationality, which implies a linear trend.

Secondly, there is a notable kick-back in the Assigned Probability distribution for the Highly Unlikely and Chances Are Slight perceptions. This can be due to ambiguity in the wording of these perceptions: the order can be viewed differently, with Highly Unlikely placed before Almost No Chance and Chances Are Slight placed before Highly Unlikely. Still, there are many oscillations in other ordering pairs (e.g. Unlikely -> Probably Not -> Little Chance, and We Believe -> Probably). This is also consistent with ambiguity, which is a violation of rationality.

Thirdly, not a single distribution of assigned probabilities by perception follows a bell-shaped ‘normal’ curve. Not for a single category of perceptions. All distributions are skewed, almost all have extreme value ‘bubbles’, the majority have multiple local modes, etc. This is yet another piece of evidence against rational expectations.

There are severe outliers in all perception categories. Some (e.g. in the case of the ‘Probably Not’ category) appear to be largely due to errors that can be induced by ambiguous ranking of the category or by judgement errors. Others (e.g. in the case of the “We Doubt” category) appear to be systemic and influential. Dispersion of assignments seems to follow the ambiguity pattern, with higher ambiguity (tails) categories inducing greater dispersion. But, interestingly, there also appears to be stronger ambiguity in the lower range of perceptions (from “We Doubt” to “Highly Unlikely”) than in the upper range. This can be ‘natural’ or ‘rational’ if we think that a signifier of a less likely event is more ambiguous. But the same holds for more likely events too (see the range from “We Believe” to “Likely” and “Highly Likely”).
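For anyone wishing to put numbers on the 'dispersion' and 'skew' discussed above rather than reading them off the charts, here is a rough Python sketch. The function name `spread_and_skew` is hypothetical. The sample responses are invented purely to exercise the function and are not taken from the source data set. The interquartile range stands in for dispersion and the third standardised moment for skewness, both of which are our choice of measure, not the chart author's.

```python
import statistics


def spread_and_skew(values):
    """Return (interquartile range, skewness) for a list of assigned probabilities.

    Skewness is the population third standardised moment: 0 for a symmetric
    sample, negative for a long left tail, positive for a long right tail.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return q3 - q1, 0.0  # all answers identical: no spread, no skew
    skew = sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)
    return q3 - q1, skew


# INVENTED responses (percent) for one hypothetical phrase, with one low outlier.
made_up_responses = [50, 85, 90, 90, 95, 95, 95]
iqr, skew = spread_and_skew(made_up_responses)
print(f"IQR = {iqr:.1f}, skewness = {skew:.2f}")
```

A normal ('bell-shaped') sample would give skewness close to zero, so values far from zero across categories are one simple way to check the claim that the distributions are skewed.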

There are many more points worth discussing in the context of this exercise. But on net, the data suggests that the rational expectations view of our ability to assess true probabilities of uncertain outcomes is faulty not only at the level of the tail events that are patently identifiable as ‘unlikely’, but also in the range of tail events that should be ‘nearly certain’. In other words, ambiguity is tangible in our decision making.



Note: the above evidence also suggests that we tend to treat certainty (tails) and uncertainty (centre of perceptions and assignment choices) inversely to what can be expected under rational expectations:
In a rational setting, perceptions that carry indeterminate outcomes should have greater dispersion of values for assigned probabilities: if something is "almost evenly" distributed, it should be harder for us to form a consistent judgement as to how probable such an outcome can be, especially compared to something that is either "highly unlikely" (aka, quite certain not to occur) or "highly likely" (aka, quite certain to occur). The data above suggests the opposite.

Streaks in the S&P 500

Now that we've quantified all of the streaks of two-or-more consecutive days in which the S&P 500 was either up or down for every trading day since 3 January 1950, it's time to do something with the results of our deep data dive.

So we've taken the math we generated and built the following tool, in which you only need to enter the duration of a particular streak. We'll calculate the odds of a streak that long occurring, the odds of it being either a winning streak (multiple consecutive up days) or a losing streak (multiple consecutive down days), and also the odds of the streak lasting just one more day!

It all begins below. If you're accessing this tool on a site that republishes our RSS news feed, just click through to our site to access a working version....

S&P 500 Streak Duration
Input Data Values
Number of Consecutively Up or Down Days

Odds of Streak Occurring
Calculated Results Values
Odds of Streak [1 in ...]
Odds that it's a Winning Streak [1 in ...]
Odds that it's a Losing Streak [1 in ...]
Odds of Streak Lasting One More Day [1 in ...]
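
The tool's underlying math isn't reproduced in this post, so what follows is only a sketch of one way such odds might be estimated empirically from a series of daily closing values. The function names, the treatment of unchanged days as streak-breakers, and the exact definition of each of the four odds are all our assumptions for illustration. The closing values are made up and are not S&P 500 data.

```python
from itertools import groupby


def streak_lengths(closes):
    """Return (direction, length) runs of consecutive up (+1) or down (-1) days."""
    moves = []
    for prev, cur in zip(closes, closes[1:]):
        # +1 for an up day, -1 for a down day, 0 for unchanged.
        moves.append((cur > prev) - (cur < prev))
    # ASSUMPTION: an unchanged day ends a streak and is not itself a streak.
    return [(d, len(list(run))) for d, run in groupby(moves) if d != 0]


def streak_stats(closes, duration):
    """Empirical shares for runs that reach at least `duration` days."""
    runs = streak_lengths(closes)
    reached = [(d, n) for d, n in runs if n >= duration]
    if not reached:
        return None  # no run of that length in the sample
    ups = sum(1 for d, _ in reached if d == 1)
    extended = sum(1 for _, n in reached if n > duration)
    return {
        "streak": len(reached) / len(runs),          # share of all runs this long
        "winning": ups / len(reached),               # share of those that are up runs
        "losing": (len(reached) - ups) / len(reached),
        "one_more_day": extended / len(reached),     # share lasting at least one more day
    }


# MADE-UP closing values, used only to show the mechanics.
closes = [100, 101, 102, 103, 102, 101, 102, 101, 100, 99, 99, 100]
stats = streak_stats(closes, 2)
for name, share in stats.items():
    print(f"{name}: 1 in {1 / share:.1f}" if share else f"{name}: never observed")
```

Expressing each share as "1 in N" simply takes its reciprocal, which matches the way the tool's results are labeled.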

So, if a streak in the S&P 500 is underway and you're a speculator, the question is: do you feel lucky?

Well, do you?