[This is post 3 in the "Structure and Cognition" series; links to all the posts can be found here]
It’s pretty well known today that research in psychology has found humans to be irrational and biased. However, views in the field are actually much more subtle and nuanced – just kidding – it’s mostly vociferous debate and partisanship. The controversy concerns the use of heuristics, simple processes that often achieve pretty good results but tend to fail in certain situations. It seems that much of our cognition relies on these "quick and dirty" tricks to solve problems and make decisions. If using heuristics means missing an opportunity to apply a better algorithm, then heuristics are a problem and we shouldn't use them. Much of the debate turns on questions about how good heuristics actually are and how much better the alternatives could be. (If heuristics are actually amazing, or the alternatives terrible, then using heuristics is great and people are rational for doing so. If heuristics are awful or we have processes that could outperform them with a little effort, it doesn't reflect well on us if we use heuristics anyway).
I think a lot of the rancor here stems from differences in opinion over the source of human intelligence and decision-making ability. The argument being developed in this series of posts is that a lot of cognition is simpler than it appears, because the sources of apparent complexity are not the processes themselves. If the complexity comes in elsewhere, maybe simple heuristics aren't so bad.
The popular view, associated with Daniel Kahneman and Amos Tversky, locates cognitive complexity (or lack thereof) in the mind. This leads to the conclusion that people are over-reliant on heuristics when they should be using more complex reasoning processes. This accords with the intuitive view that cognition and people are complicated. It also means that when others argue that people can be easily made rational or that cognitive biases are transient results of certain types of laboratory stimuli, this gets misinterpreted as a statement about human nature or human cognition, when it’s more a claim about the environment. If cognitive processes are simple and the complexity is in the environment, changing the environment or using different stimuli can have large effects on cognition. This isn't an argument that the brain is actually intelligent because it performs well in some environments; it's a claim that much of the intelligence was in the environment all along.
To keep the length manageable, this post will briefly introduce
the major views in the field and leave the details of the interaction between complex environments and simple heuristic
processes for next time.
I.
Kahneman and Tversky’s research on heuristics and biases in
the 1970s is often seen as pessimistic about our abilities but might be better
understood as optimistic about our potential. Its proponents view human decision
making as fundamentally flawed, but our flaws are mostly the result of
unconsciously using inappropriate cognitive mechanisms. Because more rigorous cognitive procedures are available to us, our irrational decisions could be minimized if we applied enough effort and knowledge to the task.
The main paradigm underlying this literature are [sic]* dual-process theories of cognition, popularized by the title of Kahneman’s 2011 book “Thinking, Fast and Slow.” On this view, there are two systems of reasoning that can each be called upon to solve problems. Arguments about the precise nature of these theories abound, but the general distinction is roughly that System 1 is fast, automatically activated, and effortless, while System 2 is slow, needs to be deliberately engaged, and requires cognitive effort. The difference can be illustrated by examples of the two systems at work from Kahneman (2011):
When we judge someone’s mood based on their facial
expressions, the type of inference we conduct is quick and intuitive (System 1)
and feels nothing like the effortful process used to solve a multiplication
problem (System 2).
Dual-process theorists tend to assume that people are lazy
(the term often used is “cognitive misers”) and default to using
System 1 even when conscious, deliberative thought is called for. For example,
take the famous Bat and Ball Problem:
“A bat and a ball cost $1.10 in total. The bat costs $1.00
more than the ball. How much does the ball cost?” (Kahneman & Frederick,
2002; p. 7)
Most people’s intuitive answer to this question is that the
ball costs $0.10. If you’re willing to engage System 2 for a quick check
though, you’ll notice that this can’t be right. If the ball costs $0.10 and the
bat costs $1.00, then the difference is $1.00 - $0.10 = $0.90, not the $1.00 price difference called for by the problem.
If we weren’t so lazy, we could just use deliberate
processing to check the result and get the right answer, but we default to
System 1 and stick with our intuitive wrong answer.
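For the record, a quick bit of System 2 algebra gives the right answer: if the ball costs x, the bat costs x + $1.00, so x + (x + $1.00) = $1.10 and x = $0.05. Here's that check as a minimal Python sketch:

```python
# Worked bat-and-ball check: let x be the ball's price.
# x + (x + 1.00) = 1.10  =>  2x = 0.10  =>  x = 0.05
total, difference = 1.10, 1.00
ball = (total - difference) / 2
bat = ball + difference
print(f"ball = ${ball:.2f}, bat = ${bat:.2f}")              # ball = $0.05, bat = $1.05
print(f"sum = ${ball + bat:.2f}, gap = ${bat - ball:.2f}")  # $1.10 and $1.00
```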
The cognitive miser also prefers to avoid answering
difficult questions. Instead, Kahneman and Frederick propose that when faced
with a hard problem, System 1 swaps in an easier problem and solves that
instead. (The technical term for this is “attribute substitution”).
For example, when asked how dangerous it is to fly in a
plane, instead of attempting some Fermi calculation of how many people in your
life have flown and multiplying by how often they fly and then comparing that
result to the number of people you know who have been in a plane crash, what
people do instead is answer the question “how easily do plane crashes come to
mind?”
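To see why nobody bothers, here's roughly what that Fermi calculation would look like as a sketch. Every number below is invented for illustration:

```python
# A toy Fermi estimate of flight risk (all inputs are made-up placeholders):
people_known = 600     # rough count of people in your life
share_flown = 0.8      # fraction who have ever flown
flights_each = 30      # rough lifetime flights per flyer
crashes_known = 0      # people you know who've been in a crash

flights_observed = people_known * share_flown * flights_each
print(f"~{crashes_known} crashes in ~{flights_observed:,.0f} observed flights")
# ~0 crashes in ~14,400 observed flights
```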
For most people, the answer is quite easy, in large part
because every plane crash is talked about and presented on the news for days,
so it is very available to memory. This “availability heuristic” causes people
to overestimate the risk of various dangers that are easy to think of and
underestimate the risk of those that are harder to recall. We are biased toward
overestimating risks that are highly available.
Another of Kahneman and Tversky’s major heuristics/biases
was "representativeness." Here, we judge how likely something is to occur by how
well it resembles an instance of a category. For example, when asked which of
the following series of coin flips is more likely, THTHTT or HHHTTT, people say
the first one, because it is more “representative” of a stereotypical fair
coin. However, any exact sequence of the same number of flips has the same probability. That is, if six flips must come up in a specific order, it doesn’t matter whether that order is HTHTHT or HHHHHH; each has a probability of 0.5^6.
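If that feels off, a quick enumeration makes it concrete (a minimal Python sketch using the sequences from the example above):

```python
from itertools import product

# All exact sequences of 6 fair-coin flips are equally likely.
sequences = ["".join(s) for s in product("HT", repeat=6)]
print(len(sequences), 0.5 ** 6)   # 64 sequences, each with probability 0.015625
for target in ("THTHTT", "HHHHHH"):
    # Each target is exactly one of the 64 equally likely sequences.
    print(target, sequences.count(target) / len(sequences))
```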
Here’s another example of representativeness that’s a bit
less… representative of the standard examples of this bias. Say you’re told a
student scored in the 90th percentile in their first year of college, what is
your estimate of their GPA?
In this experiment, people tended to respond around 3.5.
Now what if instead, you’re told that they scored in the
90th percentile on a test of mental concentration, what’s your best guess for
their GPA?
Responses to the second question were still about 3.5
(Kahneman & Tversky, 1973).
This likely occurs because the 90th percentile “resembles” a 3.5. However, given that the mental concentration test is almost certainly less predictive of GPA than past GPA is, you should regress your estimate toward the mean to account for the fact that the information is less relevant to predicting GPA.
(To get the intuition for this, take an extreme example: what if you’re told the
person is in the 90th percentile for shoe size?)
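To see what "regressing toward the mean" looks like concretely, here's a small sketch. The GPA mean, standard deviation, and correlations below are invented for illustration; the only point is that a weaker predictor should pull the estimate closer to the mean:

```python
from statistics import NormalDist

# Standard regression prediction: predicted z-score = r * predictor z-score.
mean_gpa, sd_gpa = 3.0, 0.4              # hypothetical GPA distribution
z90 = NormalDist().inv_cdf(0.90)         # z-score of the 90th percentile (~1.28)

for predictor, r in [("past GPA", 0.9), ("concentration test", 0.3), ("shoe size", 0.0)]:
    estimate = mean_gpa + r * z90 * sd_gpa   # shrinks toward the mean as r shrinks
    print(f"{predictor:>18}: predicted GPA ~ {estimate:.2f}")
# past GPA ~ 3.46, concentration test ~ 3.15, shoe size ~ 3.00
```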
There are tons of demonstrations of these and many other
biases, but I assume most readers who are 4 pages into this are familiar with
this material and if you aren’t and are interested in learning more, you can readily find more detail on this
elsewhere.
Biases may seem bad, but it’s important to stress that just
about everyone agrees that heuristics are useful tools that help decision
making more than they hurt. Though they predictably lead to biases, they
probably even more predictably lead to quick, effortless decisions that are
pretty good most of the time. Arguably, they were even better suited to the evolutionary environment and may be particularly ill-suited for the modern world, where using System 2 would be more beneficial than ever. Despite some accusations
to the contrary, Kahneman and Tversky always maintained that heuristics were
basically useful tools. The first paragraph of their landmark 1974 paper says, “In general,
these heuristics are quite useful.”
What Kahneman and Tversky believed was that there was an
“effort-accuracy tradeoff,” where expending more effort would get you better
results but the brain was biased toward preferring low effort even if it
meant suboptimal performance. This view of decision making does tend to see
System 2 as superior to System 1, but its proponents usually acknowledge that System 2 is
not perfect either.
Still, if System 2 is significantly better than System 1,
then we can reliably improve performance by choosing to expend the effort to
use System 2. We can overcome our biases and instruct others to do the same –
this is great news!
Unfortunately, as this chart from Wilson and Brekke (1994)
illustrates, engaging System 2 is not enough to avoid contaminated thought
processes. Even assuming you become aware of the bias (which you might not, since System 1 works subconsciously), you still have to be motivated to correct it, know how it works well enough to target it appropriately, and have the willpower and ability to override the intuitive response.
This multi-step process to avoid bias is part of why
“Nudging” (Thaler & Sunstein, 2009) became so popular. By arranging the
environment to take advantage of cognitive biases, we can use our shortcomings
for our own benefit. For example, if people are lazy and take a default option
rather than weighing the pros and cons of a decision, we can achieve much higher rates of consent to be an organ donor if we make donor status the default and require people to exert effort to opt out. Below, you can see the effect this has in countries with opt-in vs. opt-out defaults. Countries with opt-out policies have consent percentages in the high 80s-90s. On the other hand, in countries where you need to opt in to be a donor, you're lucky to get 25% actively signing up. (Though note that more recent research has found that the results of nudging defaults in organ donation are probably more complicated than this.) Nudges are useful
because they effectively pre-empt the entire process of Wilson and
Brekke’s chart and simply change the “Unwanted Mental Processing is Triggered”
to “Wanted Mental Processing is Triggered.”
In sum, heuristics are supposed to be basically adaptive
procedures for making pretty good decisions for less effort than engaging the
costly System 2. Decision making might not be perfect even if we did activate
System 2, but it would probably improve our decisions and make us more
rational. Even without using System 2 explicitly, we can take advantage of the
predictable irrationality caused by our heuristics to pre-empt them before they
cause problems.
This is the standard view as it’s often presented. Except
basically all of this is wrong and rationality is impossible. Oops.
II.
Herbert Simon’s name gets thrown around a lot by all parties
in what has sometimes been called “The Great Rationality Debate” (Tetlock &
Mellers, 2002). Though I’m really not one to talk. This blog so far is basically a
summary of a few small corners of Simon’s work and I don’t intend to stop here.
Simon’s argument was simple: heuristics are necessary for
decision making. Full stop.
Deliberative processing (Keith Stanovich introduced the terms System 1 and System 2 in 1999, two years before Simon’s death; I don’t know whether Simon ever used them) might seem to be the solution to our pesky heuristics problem, but no amount of effort or processing is going to save us from the inevitability of taking cognitive shortcuts.
The problem is that the world is so complicated that in any decision,
finding all the options available to us, searching for all of the reasons and
evidence supporting or opposing each option, and comparing each possible
solution with all of the others is not feasible if we want to make even one
decision before the sun burns out.
Imagine trying to decide what to eat: first list every
single food in the world, then rate them all on taste, then on nutrition (make
sure you note the concentrations of all the vitamins and minerals), then by
ease of access, then by price, then by ease of preparation, etc. Now compare
each food on all these dimensions (or convert each dimension’s score into
utility and sum each item’s utility for each dimension, e.g., apples have +2
for taste, +5 nutrition, -3 access, +1 price, +6 prep time = +11) and then
cycle through the list (you did memorize the list, right?) and find the option
that wins.
When laid out like this, it’s obvious not only that we don’t
optimize when we make decisions, but that we wouldn’t want to, and we couldn’t
even if we did want to. Even seemingly simple decisions would be
computationally intractable.
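To make the absurdity concrete, here's a toy sketch of what literal optimization would require. The foods, dimensions, and scores are made up, following the apple example above:

```python
# Brute-force multi-attribute "optimization": score every option on every
# dimension, sum the utilities, and take the max. Fine for two foods;
# hopeless for every food in the world rated on every dimension you care about.
foods = {
    "apple": {"taste": 2, "nutrition": 5, "access": -3, "price": 1, "prep": 6},
    "pizza": {"taste": 6, "nutrition": -2, "access": 1, "price": -2, "prep": -1},
}

def utility(scores: dict) -> int:
    return sum(scores.values())

best = max(foods, key=lambda food: utility(foods[food]))
print(best, utility(foods[best]))   # apple 11, matching the example above
```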
And don’t think you can escape by restricting the subset of
foods to consider or setting a cutoff for minimum acceptable taste to reduce
your options – those are decisions too! Which subset should you select? What
cutoff is appropriate? Better get to work assembling all the possible subsets
of food items.
Simon is probably most well-known in decision making for
coining the term “satisficing” – a portmanteau of satisfying and sufficing,**
to characterize how people actually have to make decisions because optimizing
is impossible. He’s also commonly associated with vague redefinitions of the term “bounded rationality” – his idea that human rationality is necessarily bounded by constraints like time and the costs of acquiring information.
The fact that Simon’s pioneering work in decision making,
problem solving, memory, and computational modeling of cognitive processes has
been reduced to verbal labels that have lost most of their referents in the
collective memory of psychology is not one I find satisficing, but what can you
do?
What Simon meant by satisficing was that people set
aspiration levels for what they hope to achieve and consider only a subset of
their possible choices. If I want particularly tasty food, I set a high
aspiration level, search for some options, and see if any of them meet my
criterion. If one does, I can just stop the search. No need to keep looking just
in case there’s something better that I could find with another 20 minutes of
effort. Alternatively, if I quickly find a lot of good candidates, I can raise
my aspiration level and see if I can do better. If I find no options that meet
my expectations, I can broaden the search or lower my aspiration level and make
do with what I can find.
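Here's a minimal sketch of satisficing as I read Simon's description: sequential search against an aspiration level, lowering the bar if the search drags on. The options, qualities, and patience limit are all invented:

```python
def satisfice(options, aspiration, patience=3):
    """Take the first option that meets the aspiration level; if too many
    options fail in a row, lower the bar rather than searching forever."""
    failures = 0
    for option, quality in options:   # options are examined one at a time
        if quality >= aspiration:
            return option             # good enough: stop the search
        failures += 1
        if failures >= patience:
            aspiration -= 1           # lower the aspiration level
            failures = 0
    return None                       # nothing found: broaden the search

pantry = [("stale crackers", 2), ("plain rice", 4), ("leftover curry", 6)]
print(satisfice(pantry, aspiration=5))   # -> 'leftover curry'
```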
As is the case for many of Simon’s descriptions of cognitive
processes, this feels exactly like what I actually do. I can imagine myself
standing in front of the pantry after failing to find something to eat, weighing
the pros and cons of searching for something that might have been shoved to the
back or just deciding to make do with a decent option in front of me.
In sum, Simon’s view is that heuristics are necessary for
restricting the amount of searching we need to do and for allowing us to make
decisions that achieve reasonable goals. They won’t help us achieve optimal
goals, but that’s because those aren’t reasonable, and no other procedure could
get us there anyway. In contrast to the heuristics and biases camp, who want
nothing more than to dispense with heuristics so we can finally achieve true
rationality, Simon’s account would pessimistically note that true rationality
is probably one gloriously triumphant optimal decision to drink exactly 32.78
oz of water immediately followed by death by dehydration.
Except never mind – forget optimizing – who wants to
optimize anyway? It turns out heuristics can do better than optimizing.
-But wait, doesn’t “optimal” mean-
III.
Kahneman and Tversky think heuristics are pretty good, but our
cognitive-miser brains decide to take the easier side of the effort-accuracy
tradeoff, leading to biases. Others, mainly associated with Gerd Gigerenzer, argue
that alleged “biases” can be explained away as a result of the researchers applying
improper normative standards, as a result of problematic experimental stimuli, or
as actually rational behavior that is being mischaracterized as irrational.
Sometimes this goes a bit too far, in my opinion, like when
they argue that the coin flips example of the representativeness heuristic
presented above is not really a bias because, technically, the representative-looking sample is a bit more likely to be encountered in a run of flips. The reason for this can be illustrated with smaller examples of runs: if we flip a coin four times, we are in fact more likely to encounter HHT than HHH. If you write out all possible series of 4 flips, you’ll notice that HHH is a substring of 3 of them while HHT is a substring of 4. In the diagram below, a check highlights the columns with HHH and a plus denotes those with HHT.
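The same count can be reproduced in a few lines of Python:

```python
from itertools import product

# Enumerate all 16 sequences of four flips and count which contain
# HHH vs. HHT as a substring.
seqs = ["".join(s) for s in product("HT", repeat=4)]
print([s for s in seqs if "HHH" in s])   # ['HHHH', 'HHHT', 'THHH'] -> 3
print([s for s in seqs if "HHT" in s])   # ['HHHT', 'HHTH', 'HHTT', 'THHT'] -> 4
```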
This is where the debate gets heated, but I’m going to postpone
this discussion and focus instead on Gigerenzer’s other set of arguments: that we
should forget the effort-accuracy tradeoff because simple heuristics actually
allow us to achieve a “less-is-more” effect. Less information can produce
better decisions than more information. This is a radical-sounding view, and explaining
how it can be possible is the subject of the next post, but here are a few
examples to set the stage.
In one study, researchers pitted 14 investment strategies, including one that earned its creator a Nobel Prize, against a simple 1/N strategy that divides its money evenly between the N options available to it. The other strategies had access to 10 years of data on market performance. 1/N, as its name implies, uses no such knowledge, but that didn’t stop it from beating every other strategy.
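For what it's worth, the entire 1/N "strategy" fits in a couple of lines (a sketch, not the study's actual code; the budget and asset names are invented):

```python
# 1/N: divide the budget evenly across whatever options are available.
def one_over_n(budget: float, assets: list) -> dict:
    return {asset: budget / len(assets) for asset in assets}

print(one_over_n(10_000, ["stocks", "bonds", "reits", "gold"]))
# {'stocks': 2500.0, 'bonds': 2500.0, 'reits': 2500.0, 'gold': 2500.0}
```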
Another win for simple rules came in modeling return
customers for businesses. If you have a store, you might want to send some
flyers to customers to entice them back in, but you don’t want to send the
flyers to people who have no chance of buying anything. How do you estimate
whether a customer is enticeable?
The gold standard model is a Pareto/NBD (negative binomial
distribution) model that estimates the parameters of a Poisson process*** modeling
customer purchasing behavior and an exponential distribution modeling customer
dropout rate. This complex model was compared to a simpler “hiatus” model: if a
customer doesn’t buy anything for, e.g., 9 months, consider them inactive,
otherwise, consider them active.
The parameters of the Pareto/NBD model were estimated from
40 weeks of customer data from the music, airline, and clothing industries. Again, the hiatus model has no knowledge whatsoever of past trends in the data. The models were then used to predict the next 40 weeks of customer activity. The result? The Pareto/NBD model predicted
behavior with 77%, 75%, and 74% accuracy for the 3 industries while the hiatus
model achieved scores of 77%, 77%, and 83% (Brighton & Gigerenzer, 2015).
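The contrast in complexity is stark: the hiatus rule can be written in a couple of lines. Here's a sketch; the 9-month window and the dates are illustrative:

```python
from datetime import date, timedelta

# Hiatus heuristic: a customer is "active" iff they purchased within the window.
def is_active(last_purchase: date, today: date, hiatus_days: int = 270) -> bool:
    return today - last_purchase <= timedelta(days=hiatus_days)

today = date(2024, 6, 1)
print(is_active(date(2024, 2, 1), today))   # True: bought four months ago
print(is_active(date(2023, 5, 1), today))   # False: over a year of silence
```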
The topic of the next post will be how this works.
*"Psychologists
who champion dual-process models are not usually stuck on two. Few would come
undone if their models were recast in terms of three processes, or four, or
even five. Indeed, the only number they would not happily accept is one,
because claims about dual processes in psychology are not so much claims about
how many processes there are, but claims about how many processes there aren’t.
And the claim is this: there aren’t one" – Gilbert (1999) cited in Stanovich (2011; p.33).
**Alternatively,
satisficing might have been borrowed from an old Northumbrian synonym for
suffice. I can’t find a definitive source on where Simon got the term.
*** A Poisson process models a situation where events occur randomly at some rate. For example, if it rains roughly one day out of seven or if emails hit your inbox at a rate of 5 per hour.
An exponential distribution models the wait time before the next occurrence of the Poisson variable. Given the Poisson rate, when is it likely to rain next/when should your inbox expect the next email to arrive?
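For concreteness, here's a tiny simulation of the email example (the rate and sample count are invented):

```python
import random

# If emails arrive as a Poisson process at 5 per hour, the waits between
# them are exponentially distributed with mean 1/5 hour (12 minutes).
rate = 5.0
waits = [random.expovariate(rate) for _ in range(100_000)]
print(sum(waits) / len(waits))   # ~0.2 hours between emails, on average
```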