People are natural Bayesians – just bad at math (Or “The value of ‘I Don’t Know’”)

~1000 words, ~5 min reading time

Those who have heard me talk about statistics have probably heard me talk about how great the overall Bayesian approach is when compared to the more commonly used frequentist approach. I’m not going to give a full defense here. Rather, I’m going to focus on an adjacent topic: my impression that people are what I’m going to call “natural” Bayesians when faced with arguments where there is uncertainty. It turns out that this idea has the possibility of explaining a few observations about how people interpret statements about evidence – specifically in ways that are traditionally considered fallacious, but which can easily be explained using Bayesian reasoning. So, let’s get to the examples!

Example 1: There’s no evidence that…

One thing scientists sometimes say is “There’s no evidence that A”. Typically when this is said, what the scientist *means* is that there haven’t been good enough studies yet – or that the studies we have were inconclusive at this point. So, A may or may not be true. “There’s no evidence that A” is just a stand-in for saying “We don’t actually know about A.”

However, that doesn’t seem to be how a lot of people interpret the phrase. Instead, people take “There’s no evidence that A” as evidence AGAINST A. Why do this?

Because people are natural Bayesians. Let me lay out Bayes’s theorem, as it would be applied in this case.

P(A| there is no evidence for A) = P(there is no evidence for A|A)P(A)/ [ P(there is no evidence for A|A)P(A) + P(there is no evidence for A|not A)P(not A)]

[English: the probability of A given that there is no evidence for A is equal to the probability there is no evidence of A given that A is true times the prior probability that A is true divided by that same thing plus the probability there is no evidence that A is true given that it’s not true times the prior probability that A is not true.]

With a little bit of algebra, you’ll end up with this result:

P(A| there is no evidence for A) < P(A) iff P(there is no evidence for A|not-A) > P(there is no evidence for A|A)

Translating into plain English: As long as I think the probability of not finding evidence for A is higher if A is not true than if A is true, then I will interpret the absence of evidence in favor of A as evidence AGAINST A. Basically, if A is true, I expect it to leave evidence. If I can’t find evidence, then that makes me realize it is more likely that A is just not true.
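For anyone who wants to see the “little bit of algebra”, here is one way it can be sketched. The shorthand E for “there is no evidence for A” is introduced here just to keep the lines short, and the steps assume that P(A) is strictly between 0 and 1 (otherwise there is nothing to update).

```latex
\begin{align*}
P(A \mid E) &= \frac{P(E \mid A)\,P(A)}{P(E \mid A)\,P(A) + P(E \mid \neg A)\,P(\neg A)} \\[4pt]
P(A \mid E) < P(A)
  &\iff \frac{P(E \mid A)}{P(E \mid A)\,P(A) + P(E \mid \neg A)\,P(\neg A)} < 1
     && \text{(divide both sides by } P(A) > 0\text{)} \\
  &\iff P(E \mid A) < P(E \mid A)\,P(A) + P(E \mid \neg A)\,\bigl(1 - P(A)\bigr) \\
  &\iff P(E \mid A)\,\bigl(1 - P(A)\bigr) < P(E \mid \neg A)\,\bigl(1 - P(A)\bigr) \\
  &\iff P(E \mid A) < P(E \mid \neg A)
     && \text{(divide both sides by } 1 - P(A) > 0\text{)}
\end{align*}
```

The same steps go through unchanged if E stands for “someone used a bad argument for A”.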

Example 2: Bad arguments for are arguments against

Classical logic tells us that just because someone makes a bad argument for A, that doesn’t mean A isn’t true. However, by Bayesian probabilistic reasoning, it almost certainly increases the *probability* that A isn’t true. Here’s why, modifying the previous formula:

P(A| someone used a bad argument for A) < P(A) iff P(someone used a bad argument for A|not-A) > P(someone used a bad argument for A|A)

Put another way: as long as we think people making bad arguments supporting A is more likely if A is not true, then bad arguments in favor of A actually provide evidence (though inconclusive evidence!) that A is not true. Or, more bluntly: if people can’t make a good argument for A, then that makes it more likely that A isn’t true.

But… people are bad at math

Now, let’s get specific with some math. Let’s say that I’m totally unsure about A being true. I think it’s a 50/50 chance that A is true. However, if A is true, I think there’s only a 20% chance that we wouldn’t find evidence for A. Meanwhile, if A is not true, I think there’s a 95% chance that we won’t find evidence for A. (I mean, sometimes we find evidence that seems to support something even if that thing isn’t true.)

Turns out there’s no evidence for A. So, what is the probability that A is true once I learn that?

My guess is that very few people are going to get this right, even though the math isn’t that hard. A lot of people will say “There’s a 95% chance that we won’t find evidence for A if A isn’t true. Since we didn’t find evidence for A, there’s a 95% chance A isn’t true.”

The math shows this is wrong. The true probability that A isn’t true is about 82-83%, which leaves A itself with a 17-18% chance of being true.
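If you’d like to check the arithmetic, here is a minimal Python sketch that plugs the numbers from this example into Bayes’s theorem. The function name `posterior_true` and its parameter names are made up for this illustration.

```python
def posterior_true(prior, p_no_evidence_if_true, p_no_evidence_if_false):
    """Return P(A | no evidence for A) using Bayes's theorem."""
    # Numerator: P(no evidence | A) * P(A)
    numerator = p_no_evidence_if_true * prior
    # Denominator: total probability of seeing no evidence for A
    denominator = numerator + p_no_evidence_if_false * (1 - prior)
    return numerator / denominator


# Numbers from the example: 50/50 prior, 20% chance of no evidence if A is
# true, 95% chance of no evidence if A is not true.
p_a = posterior_true(prior=0.5,
                     p_no_evidence_if_true=0.20,
                     p_no_evidence_if_false=0.95)
p_not_a = 1 - p_a

print(f"P(A | no evidence)     = {p_a:.3f}")      # 0.174
print(f"P(not A | no evidence) = {p_not_a:.3f}")  # 0.826
```

Changing the prior is a quick way to see how much the answer depends on where you started.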

In brief: people apply Bayesian reasoning in informing their beliefs, but do so imprecisely.

Practical Implications

The reality that people are natural Bayesians (but bad at math) suggests we need to be careful how we communicate. For example, if we don’t know one way or the other about “A”, then “I don’t know” is a *better* statement than “There is no evidence for A” (even if the latter is technically true – perhaps because we’ve not done the study yet). Or, tell people your priors. If you think A is likely based on your intuition, but you’ve not yet gathered data, say so. “I suspect A is true, but we’re still gathering data that might change that.” Save “There is no evidence for A” for cases where you want to suggest that “A” isn’t true.

Similarly, if you think A is true but only have a bad argument supporting it (maybe you’ve just not thought about it much), then, rather than make the bad argument, just say what your impression is and that you’re still thinking about it. You are, in fact, allowed to not know things.

Why do I write this? Because I’ve come to realize that we live in a world that asks us to take stands on things that we can’t possibly know with any kind of certainty. In the absence of certainty, Bayesian reasoning (however bad we may be at the details of it!) comes in. It’s a good idea for us to be aware of this fact, and to communicate in a way consistent with it.

COVID Thoughts: Is 50 cases/100,000 over 28 days reasonable?

~675 words, ~4 min reading time

Disclaimer: I’m not an epidemiologist. However, economists are pretty good with mathematical models, and I’m sticking to some of the simplest epidemiology models here.

As I write this, cases in Ohio are tanking. Our 7 day average for cases is literally 80% below its high. Hospitalizations are also tumbling. Things are getting better.

But the CDC says we should continue to be cautious. Why? Because we’ve not gotten below 50 cases/100,000 population over a 28 day period. This got me thinking: where did this standard come from? Is it actually reasonable?

The reality is this: since COVID is going to be endemic, it’s not going away. Immunity isn’t perfect (whether from previous infection or from vaccination), and it doesn’t last forever (again, whether from previous infection or from vaccination). So, there is always going to be some underlying number of people that will be infected with COVID. So, is that number less than 50 per 100,000?

We can try to answer this question with a simple SIR model with a few tweaks. I set the model up with a population of 100,000. I started with 1 infected person, and R0 of 8 – which is on the low end of what I’ve heard estimated for the Omicron variant. I set a 14 day average recovery time (so, 1/14 of infected people get over their infection each day), and a 180 day average immunity time (so, 1/180 of immune people become susceptible again each day). To keep things simple, I assumed the virus didn’t kill anyone. (Since I was looking for a “per 100K” number, having people die would make the math too hard.) Then, I had the thing simulate 2 years.

At the end of 2 years, the virus had basically stabilized. We’ll call this the “steady-state”.

And 6,314 people were infected. So, each day, 1/14 of those recovered and the same number of people took their place – that’s 451 new infections each day, which adds up to that 6,314/100,000 in a 2 week period – or 12,628/100,000 in a 28 day period.
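For readers who want to reproduce this, here is a rough Python sketch of the kind of model described above. It is not the exact code behind the numbers in this post: the function names, the simple fixed-step (Euler) update, and the 10 steps per day are all assumptions made for this illustration. The second function skips the simulation and solves for the steady state directly, using the fact that recoveries must balance lost immunity once things stop changing.

```python
def simulate_sirs(population=100_000, r0=8.0, recovery_days=14.0,
                  immunity_days=180.0, days=730, steps_per_day=10):
    """Simulate an SIR model with waning immunity; return final (S, I, R)."""
    gamma = 1.0 / recovery_days    # share of infected who recover per day
    omega = 1.0 / immunity_days    # share of immune who lose immunity per day
    beta = r0 * gamma              # transmission rate implied by R0
    dt = 1.0 / steps_per_day       # step size in days (illustrative choice)

    s, i, r = population - 1.0, 1.0, 0.0   # start with 1 infected person
    for _ in range(days * steps_per_day):
        new_infections = beta * s * i / population * dt
        recoveries = gamma * i * dt
        lost_immunity = omega * r * dt
        s += lost_immunity - new_infections
        i += new_infections - recoveries
        r += recoveries - lost_immunity
    return s, i, r


def endemic_infected(population=100_000, r0=8.0, recovery_days=14.0,
                     immunity_days=180.0):
    """Steady-state number infected, solved directly instead of simulated."""
    gamma = 1.0 / recovery_days
    omega = 1.0 / immunity_days
    susceptible = population / r0   # in steady state each case infects 1 other
    # gamma * I = omega * R and S + I + R = population, solved for I
    return omega * (population - susceptible) / (gamma + omega)


s, i, r = simulate_sirs()
steady = endemic_infected()
new_per_day = steady / 14
print(f"Infected after 2 years (simulated): {i:,.0f}")
print(f"Infected in steady state (formula): {steady:,.0f}")
print(f"New infections per day:             {new_per_day:,.0f}")
print(f"New infections per 28 days:         {new_per_day * 28:,.0f}")
```

Both routes should land at roughly 6,314 infected per 100,000, which is a useful sanity check that 2 years is long enough for the simulation to settle.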

So, how are we going to hit 50 cases/100K population?

Let’s think about this a bit more. Naturally, we could just stop testing. Since cases are based on positive tests, not actual infections, if we only catch about 0.4% of infections (50 out of 12,628), we can hit 50/100K cases over a 28 day period.

What about vaccination? Here, I assumed natural infection and natural immunity. Can vaccination get us to 50 infections/100k as a stable endemic equilibrium?

Handling this was a bit tricky. My first inclination was to assume some kind of vaccination rate and level of vaccine efficacy and booster schedule. Then, I realized that was adding a lot of parameters when there was a simpler path forward.

Let’s just look at the ideal case. Suppose that there’s a segment of the population that keeps up on their boosters and that the vaccine is 100% perfect. (Obviously not realistic! But, I’m just setting a benchmark here!) In that case, the R0 of 8 applies if the population is all unvaccinated. However, the presence of people with immunity drives down the actual ability of the virus to spread.

Fairly simple math shows that an immunity rate of 87.5%+ results in the virus being slowly eradicated. So, if we had 87.5% of the population vaccinated with a perfect vaccine, then the virus would go away completely. Anything less, and it will linger, but be rare. So, let’s try some vaccination rates.

In the US, our vaccination rate is about 65% for “fully vaccinated” – that would result in a stable number of cases around 1,623/100k – with a two week recovery period, that means 3,246/100k new cases every 28 days. About 65x the CDC target. What vaccination rate would we need to stabilize with new infections around 50/100k over a 28 day period? A vaccination rate of 87.16% – if the vaccine is perfect. So, LESS THAN 0.5 percentage points away from the rate needed for eradication.
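Here is a small Python sketch of the perfect-vaccine benchmark. It assumes vaccinated people are simply removed from the model for good, and that everyone else follows the same infection, recovery, and waning-immunity cycle as before. The function names are invented for this illustration, and it uses the steady-state formula rather than a full simulation, so the last digit can differ slightly from the figures quoted above.

```python
POPULATION = 100_000
R0 = 8.0
GAMMA = 1.0 / 14.0    # daily recovery rate (14 day average infection)
OMEGA = 1.0 / 180.0   # daily rate of losing natural immunity (180 days)


def endemic_infected(vaccinated_share):
    """Steady-state infected per 100k when a share has a perfect vaccine."""
    susceptible = POPULATION / R0                    # steady-state susceptibles
    unvaccinated = (1.0 - vaccinated_share) * POPULATION
    # Among the unvaccinated, recoveries balance lost immunity:
    # GAMMA * I = OMEGA * R, with S + I + R = unvaccinated
    infected = OMEGA * (unvaccinated - susceptible) / (GAMMA + OMEGA)
    return max(infected, 0.0)                        # at or past herd immunity


def cases_per_28_days(infected):
    """New infections over 28 days implied by a steady infected count."""
    return infected * GAMMA * 28


def vaccinated_share_for_target(target_cases_28d):
    """Vaccinated share at which steady-state cases equal the target."""
    infected_target = target_cases_28d / (GAMMA * 28)
    susceptible = POPULATION / R0
    unvaccinated = susceptible + infected_target * (GAMMA + OMEGA) / OMEGA
    return 1.0 - unvaccinated / POPULATION


herd_immunity_share = 1.0 - 1.0 / R0
at_65 = endemic_infected(0.65)
needed = vaccinated_share_for_target(50)

print(f"Herd immunity threshold:      {herd_immunity_share:.1%}")
print(f"Infected at 65% vaccinated:   {at_65:,.0f} per 100k")
print(f"Cases per 28 days at 65%:     {cases_per_28_days(at_65):,.0f} per 100k")
print(f"Share needed for 50 per 100k: {needed:.2%}")
```

Swapping in a different R0 or immunity duration shows how sensitive the required vaccination rate is to those assumptions.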

Of course, if the vaccine isn’t totally effective then the proportion of the population vaccinated would need to be higher to allow for the reality that the vaccine won’t be effective for everyone – and if the vaccine isn’t at least 87.16% effective, then 50 infections/100k over a 28 day period is literally impossible. (A quick Google search found an NEJM article claiming that one vaccine they tested was ~70% effective against omicron.)

All to say: I’d really like to see how the CDC got this 50/100K case number as a target, as it doesn’t even seem possible to me, unless we just give up on testing a large proportion of symptomatic cases.

Writing Introductions

~350 words, 2 min read time

This semester, I’m trying to do a better job doing research, so I’ve adopted a “1 hour writing per working day” goal, with the guidelines to (1) Finish a rough draft within 1 month of starting it, and (2) Spend no more than 1 month polishing before I send the paper to a journal. This is a MUCH faster pace than I normally work at on research. Hey, I’m a teacher at heart, so it’s easy for me to focus on that.

Anyway, I’m writing this blog entry to remind myself of some important guidance that I’ve gotten about writing introductions. It’s really just two points:

(1) Write the introduction last – Okay, maybe not literally “last” – you might write it before the abstract – but, definitely after the core and conclusion of the paper are written.

(2) The introduction should include 5 elements: (a) it should answer the question “What?” – that is, what is the question you’re answering? (b) it should answer the question “Why?” – why does this research matter? (c) it should show “deficiencies” in the previous literature – that is, why aren’t the previous answers good enough? (d) it should state the exact “gap” that it fills – this should be connected with the deficiencies in c. (e) it should summarize the results. – You’re not writing a mystery novel. Most people will just read the abstract of your paper. Most of those that continue will just read the introduction and, maybe the conclusion. Use that fact. Yes, it can be personally upsetting that people don’t read every word you write. But, playing hard to get in the intro is more likely to lose you citations than to convince people to read the entire paper.

Based on #2, any paper is actually 5 papers in one, because people will read the papers in five different ways. The paper should be written so that all of these make sense and they are all consistent.

Reader #1: Just the Abstract

Reader #2: Abstract, Introduction

Reader #3: Abstract, Introduction, Conclusion.

Reader #4: Abstract, Introduction, Body, Conclusion

Reader #5: Abstract, Introduction, Conclusion, Body, Conclusion

Write the Abstract, Intro, and Conclusion with these five readers in mind.

Thoughts on JS Mill and Social Media Bans

~200 words, ~1 min reading time

I’ve been reading Chapter 2 of John Stuart Mill’s On Liberty – “Of the Liberty of Thought and Discussion”. In part, I wanted to figure out what Mill would think about things like social media outlets banning or restricting the speech of people like President Trump.

A couple observations:

  1. JS Mill does not draw a sharp distinction between legal consequences and social ones. As far as he’s concerned, if there are penalties that arise from simply expressing a thought – even if those penalties are just social stigma – then it is a violation of the liberty of discussion. In this way, Mill would disagree with a position that I’ve seen many Hoppe/Rothbard libertarians take: that “these are private companies, so they can do what they want”.
  2. On the question of instigation to riot, JS Mill’s most informative passage in this chapter is a footnote on tyrannicide. In brief, his view is that the discussion of tyrannicide should be allowed – it is a totally valid moral question to consider – but that instigation to tyrannicide could be punishable IF there is an actual act and “at least a probable connexion can be established between the act and the instigation.”

Given all this, I suspect that Mill would be opposed to a Twitter ban for Pres. Trump, though he seems to be in favor of treating incitement to riot as a crime. But, dishing out punishments for a crime before there’s a trial would probably be an issue.

Did Disney Really Underestimate Day 1 Demand for Disney+?

~400 words, ~2 min reading time

It’s a common refrain. Before release: “New service/game/etc. coming out!” “Lots of pre-orders!” Day 1: “Site is slow and crashing!” “Underestimated demand”, etc.

The question: is it really reasonable for us to conclude that these companies, which are SO GOOD at estimating demand most of the time, suddenly suck at it – ON A REGULAR BASIS, even when they have great data about what demand is going to be?

I submit that the best answer is “no”.

A few assumptions that I’m making here: (1) Day 1 demand is unusually high. (2) Setting up temporary servers that you won’t need in the long term is costly. (3) Companies believe that very few customers are going to punish them for poor performance on Day 1.

Put these all together, and you have an obvious interpretation of the situation. Disney+ estimated how many servers they’d need on an ongoing basis, and has that much capacity ready to go. (This follows from #2 above.) Now, they know this isn’t enough for Day 1 (#1 is something they can predict), so that there will be server problems – poor performance, crashing, etc. – on Day 1. But, that’s okay. The company isn’t really going to be worse off (#3). Yeah, there’s some PR that they’ll have to deal with, but that’s okay. They can always pull an EA, and offer people some free stuff (that’s how I got SimCity 4) – maybe put some movies they “didn’t plan to” on the platform, or offer people a free download of a specific movie through Google Play/Amazon Prime/iTunes, as long as they redeem in the next week. The point is that these are manageable, fairly cheap options compared to preventing the problem in the first place.

Now, there’s another possibility as well: it might be that Disney DOESN’T know how much long-term demand they’ll have. BUT, if it is costly to set servers up and then take them down, it might make sense for them to deliberately work UNDER the needed long-term capacity for the first couple days, so that they can get a better idea of what they should do.

Traditionally, manufacturers overproduce a bit – after all, if you run out of your good, customers will often just buy from a competitor that DIDN’T. (So, assumption #3 wasn’t true.) However, in a world of strong intellectual property protections and strong brand loyalty, that concern fades.

Fall Commencement Address 2018

~2400 words, ~12 min read time

What follows is the text of the Commencement Address that I gave today, December 16, 2018, to the graduates of Kent State University’s Stark Campus.

Congratulations, graduates! This afternoon we are here to celebrate your accomplishments with you. You have worked hard to master the skills and the content that we have thrown at you, and today you receive the evidence of what you have achieved. So, on the behalf of the faculty, let me say

Lessons from Bullet Journaling

~500 words, ~3 min read time

I started using a bullet journal back in early April in an attempt to have some kind of organizational system that would actually match what I wanted. I made some mistakes along the way, but on the whole I like the bullet journal method. So, here are some lessons I’ve learned in the past couple months.

(1) Choose your size wisely – My first attempt at using a BuJo failed because I tried to use a standard sized composition book. This was too big to fit in my pocket, so I could only carry it if I was bringing my briefcase, or if I specifically decided to carry a notebook with me. This was inconvenient, so I stopped using it. This time around, I chose a small notebook – about 3.5″x5.5″, so roughly cell phone sized. This fits in my pocket with my phone, and is FAR more convenient. At the same time, I’m sure many would find the thing cramped.

(2) Dedicate enough space for your index – My notebook has 131 pages in it. But, I only dedicated 34 lines to an index at first. Surprise! I ran out of lines in the index, so now I have a secondary index around page 80. Dedicate a line per page, just to be safe.

(3) It’s okay to be simply functional – I’ll be honest. I am not artistic. So, my BuJo is not pretty. Mostly, it’s a dated task list with a few pages of checklists. That’s it. But, it works for me.

(4) Don’t dedicate a page to each day ahead of time – I should have known better. The BuJo guidelines say this. I had used a Franklin Covey page a day planner for years. I should have known that with a dedicated page per day, many pages would be basically empty while others would be full. Poor planning.

(5) Don’t rewrite your to-do list every day. It’s easy to flip through the Daily Log and find what you need to do. Just migrate at the beginning of the month and you’ll be fine.

(6) A BuJo is not a productivity strategy. It is simply a tracking system that is beautifully flexible, so you still need a separate strategy for how to deal with the information you store in it. In my experience, a BuJo can pair extremely well with Mark Forster’s Do It Tomorrow. I suspect it would also work well with David Allen’s Getting Things Done, but I always found GTD too undirected to motivate me. Now, not everyone needs a system for dealing with tasks. I find it very useful, as it removes the mental load of having to figure out what to do. But, as always, it’s up to you and what works for you.

(7) Use checklists for routine tasks. I have 3 checklists: a yearly, a monthly, and a daily. I prioritize in that order, making exceptions in the evening’s

Welcome to Prof E’s Blog! – Expectation Formation

Welcome to my new blog! I’m going to include a smattering of whatever I happen to be thinking about. So, no promise of any unified theme – but, you’ll probably see a few common topics:

  • Economics (especially in the Austrian tradition)
  • My kids
  • Quakerism (mostly of the evangelical variety)
  • College Teaching
  • Eurovision