
The Great Mental Models, Volume 3: Systems and Mathematics – A Masterclass in Thinking Smarter
In The Great Mental Models, Volume 3: Systems and Mathematics, Rhiannon Beaubien and Rosie Leizrowice invite readers on a journey to sharpen their thinking by exploring fundamental concepts from systems theory and mathematics. This book is the third installment in a series dedicated to building a latticework of timeless knowledge, aiming to democratize high-quality, multidisciplinary education. Beaubien and Leizrowice present complex ideas in an accessible, conversational style, demonstrating how these mental models describe the intricate behaviors and interactions that govern our lives. By understanding these frameworks, readers can align their actions with how the world truly works, gaining a powerful tailwind instead of a constant headwind in their decision-making. This summary breaks down the book's important ideas, examples, and insights in clear, accessible language.
Introduction
The introduction sets the stage for understanding the book’s core purpose and the broader Great Mental Models project. It emphasizes that the quality of your thinking depends on the models in your head, which help you see the world as it is, not as you wish it to be. Volume I introduced general thinking concepts, and Volume II delved into physics, chemistry, and biology, but Volume III focuses on the seemingly abstract yet profoundly applicable subjects of systems thinking and mathematics. The authors stress that these models are tools for making better decisions and gaining different perspectives on challenges. They highlight the importance of not just acquiring knowledge but also knowing when to apply which model, which requires constant reflection and learning from mistakes. The goal is to provide a multidisciplinary, interconnected education, making knowledge widely available and actionable. The book promises to start each chapter by explaining the theory and then situating it in real-world examples, helping readers find analogous uses in their own lives.
Systems
This section delves into various mental models that help us understand the interconnectedness and dynamics of systems, offering tools to navigate complexity and anticipate outcomes. As Donella H. Meadows states, “In spite of what you majored in, or what the textbooks say, or what you think you’re an expert at, follow a system wherever it leads. It will be sure to lead across traditional disciplinary lines.”
Feedback Loops
Feedback loops are presented as a ubiquitous and crucial mental model, explaining how information from a system’s outputs can influence its future behaviors. The authors define “feedback” as information communicated in response to an action, whether formal (like performance reviews) or informal (like body language). They highlight the ubiquity of feedback loops in daily life, from incentives driving human behavior to the influence of the five people closest to you. The core challenge is learning to filter useful feedback and communicate it effectively.
The technical definition from systems theory is that a feedback loop is when the outputs (information) of a system affect its own behaviors. A system can have a single feedback source or many interconnected ones. Understanding them helps in being more flexible and providing better feedback. There are two basic types:
- Balancing feedback loops (negative): These tend toward an equilibrium, counteracting change. An example is a thermostat adjusting a furnace to maintain a desired temperature.
- Reinforcing feedback loops (positive): These amplify a particular process, causing change to continue. Examples include fashion trends or poverty cycles. Breaking these often requires external intervention or a change in conditions.
The authors also discuss the issue of delayed or indirect feedback, which can complicate establishing cause and effect. Immediate feedback, such as the pleasure from junk food, might have negative long-term consequences like type 2 diabetes. The quicker one gets accurate feedback, the faster one can iterate and improve, but feedback can also be too fast or too strong, causing the system to overcorrect and surge.
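The two loop types above can be sketched in a few lines of code. This is our illustration, not the book's; the target temperature, gain, and growth rate are arbitrary assumptions chosen to make the contrast visible.

```python
# Minimal sketch: a balancing loop (thermostat) settles toward equilibrium,
# while a reinforcing loop (a compounding trend) amplifies without limit.
# All numbers are illustrative assumptions.

def thermostat_step(temp, target=20.0, gain=0.3):
    """Balancing loop: the correction opposes the gap from the target."""
    return temp + gain * (target - temp)

def trend_step(adopters, rate=0.5):
    """Reinforcing loop: growth is proportional to the current level."""
    return adopters * (1 + rate)

temp = 10.0
for _ in range(20):
    temp = thermostat_step(temp)
# The balancing loop counteracts change and approaches the target (20.0).
print(round(temp, 2))

adopters = 10
for _ in range(10):
    adopters = trend_step(adopters)
# The reinforcing loop keeps amplifying until something external intervenes.
print(round(adopters))
```

Run long enough, the balancing loop always converges on its setpoint, while the reinforcing loop grows roughly 57-fold in ten steps, which is why the authors note that unchecked reinforcing loops are ultimately unsustainable.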
Adam Smith and the Feedback Loop of Reactions
The section uses Adam Smith’s The Theory of Moral Sentiments to illustrate social feedback loops. Smith believed that human behavior, despite inherent selfishness, is guided by the approval and disapproval of others, whether real or imagined. This feedback system, as described by Russ Roberts, creates loops that encourage good behavior and discourage bad behavior, forming the basis of civilization. Formal laws and informal social norms, like greeting a friend, provide this feedback. Smith argued that without social interaction, a person would have no awareness of their character or right/wrong actions. The desire to be loved and accepted prompts moral behavior relative to societal standards, which in turn reinforces that behavior in others. This concept also shows how morality is not fixed and changes over time, as seen with historical examples like infanticide, where changing feedback loops led to shifts in societal acceptance.
Everyday Loops
The authors explain that many daily challenges can be addressed by adjusting feedback loops, from changing personal and others’ behavior to dealing with inaccurate information and building trust. They emphasize that any system with an unchecked reinforcing feedback loop is ultimately unsustainable and destructive. Legal systems, for instance, often serve to stop negative reinforcing loops and promote balancing ones. The text explores four aspects of social systems through this lens:
- Creating the right future incentives: Decisions today can inadvertently create negative reinforcing feedback loops for the future. A classic example is paying off kidnappers, which solves an immediate problem but incentivizes more kidnappings. Courts, as described by Ward Farnsworth in The Legal Analyst, must consider future incentives to prevent recurrence of wrongs, rather than just compensating for past suffering. For instance, attorney-client privilege or copyright laws might seem to have an immediate negative effect (e.g., hiding information, limiting free distribution) but create balancing feedback loops that incentivize long-term societal benefits (e.g., trust in legal counsel, creation of new works). The apparent “unfairness” in specific cases often creates greater justice for the collective by deterring future undesirable actions.
- Influencing behavior at the margins: This means addressing problems incrementally, not in an all-or-nothing way. A good customer retention strategy, for example, tailors actions to different customer segments. Farnsworth explains that individual and group behaviors have “margins,” and legal rules often aim to reduce practices at these margins. For example, taxing sugary drinks or limiting their sale aims to influence consumption incrementally. Negative reinforcing feedback loops often start at the margins (e.g., a new customer leaving after a price hike). Implementing balancing feedback loops (like loyalty programs) at these points can prevent larger sales drops. In criminal law, penalties are scaled to preserve marginal deterrence (e.g., if theft carried the death penalty, a thief would have no incentive to spare their victim). However, adjusting one feedback loop can create unwanted reinforcing loops elsewhere (e.g., limiting public sugary drink consumption might increase private consumption).
- Dealing with information cascades: These are reinforcing feedback loops in which people draw inferences from others’ actions and follow suit. Farnsworth illustrates this with a street performer attracting a crowd: a few unusually curious onlookers stop first, and their presence draws in passersby with “normal” thresholds of curiosity. Restaurateurs exploit this by seating the first patrons near windows or allowing lines to form, signaling popularity. Information cascades can be damaging if unchecked, as seen with illegal activities like speeding or pirating media. Legal systems interrupt these by requiring public disclosure of information (e.g., financial transparency for public companies) and prosecuting high-profile cases (e.g., Al Capone for tax evasion) to send stronger signals than peer behavior. Ignorance and uncertainty are fertile ground for cascades; stronger signals can disrupt them.
- Building trust: Complex societies require trust, facilitated by processes and legal enforcement. Farnsworth notes that “Law often amounts to a substitute for trust in situations too complex or dispersed for trust to arise.” The Prisoner’s Dilemma illustrates trust dynamics: in a single game, defection is the rational self-interested choice. However, in iterated versions, cooperation becomes viable. Feedback loops provide the information for trust-based decisions. The tit-for-tat strategy (cooperate first, then reciprocate) is effective in repeated Prisoner’s Dilemmas, building a feedback loop of trust. Legal systems encourage trust by enforcing contracts, increasing the cost of defection and promoting cooperation. Laws governing common goods (like fishing quotas) also impose rules to prevent negative reinforcing loops (e.g., over-fishing) and encourage cooperation for the common good.
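The tit-for-tat dynamic described above is easy to make concrete. The sketch below uses the standard Prisoner's Dilemma payoffs (3/3 for mutual cooperation, 1/1 for mutual defection, 5 vs. 0 for a lone defector); the book does not specify numbers, so these are conventional assumptions.

```python
# Iterated Prisoner's Dilemma: tit-for-tat cooperates first, then mirrors
# the opponent's previous move. Payoff values are the conventional ones,
# assumed here for illustration.

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate on the first move, then reciprocate the last move seen."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    score_a = score_b = 0
    hist_a, hist_b = [], []  # each strategy sees the *other* side's moves
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# Two tit-for-tat players build a feedback loop of trust: all cooperation.
print(play(tit_for_tat, tit_for_tat))    # (30, 30)
# Against a pure defector, tit-for-tat is exploited only once.
print(play(tit_for_tat, always_defect))  # (9, 14)
```

The repeated game shows why iteration changes the rational choice: mutual cooperation (30 each) beats the defection equilibrium, and tit-for-tat limits its losses to a single round, which is the feedback mechanism the authors say legal enforcement mimics by raising the cost of defection.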
Kandinsky’s Iterations
This sidebar uses artist Wassily Kandinsky’s creation of Painting with White Border to demonstrate the value of feedback loops in learning and improvement. Kandinsky’s masterpiece was not a single stroke of inspiration but a monthslong process of 21 iterations. Each sketch provided feedback, allowing him to make small changes and move closer to his vision, highlighting how inevitable failures and disappointments are crucial to the learning process. The story emphasizes that success often removes the iterative learning process from artistic creation.
Equilibrium
Equilibrium in systems refers to a stable state where forces are in balance. While static equilibrium means things are consistent and unchanging, most real-world systems experience dynamic equilibrium, where variables fluctuate within a particular range, maintained by balancing feedback loops. The authors use the example of a household system adjusting expenses (e.g., cutting meals out for piano lessons) or cleaning routines (e.g., more cleaning with a dog) to maintain balance. The constant tweaking we do to keep things as we like them illustrates this.
Homeostasis is introduced as the biological process of continual adjustments to maintain ideal internal conditions, coined by physician Walter Cannon. The human body, for instance, maintains blood glucose, body temperature, and sodium levels within narrow parameters. The important point is that systems can have multiple different equilibria; a system at equilibrium is stable, but not necessarily optimal. Short-term deviations, like an argument with a sibling, can lead to long-term stability by resolving tension.
A Different Look at Homeostasis
Antonio Damasio’s The Strange Order of Things offers a deeper perspective on homeostasis, defining it as the process that counters disorder to maintain order at a new, more efficient steady state, conducive to “flourishing.” This applies beyond biology to organizations, communities, and countries. Feelings act as a key feedback loop, providing information about how our body system is doing. After a disaster, homeostasis doesn’t necessarily return a system to its previous state but to a place where it “feels good” under new conditions. This highlights the potential of homeostasis: how systems define “feeling good” impacts their adaptation to stress and change, meaning it’s never a static state.
When Information Can Help
Maintaining dynamic equilibrium often requires accurate information. In biological systems, components constantly communicate vast amounts of information (e.g., skin sensations, potassium levels). The modern approach to doctor-patient communication in medical systems is an example of applying this model. Historically unbalanced, the relationship is shifting toward patients becoming more active participants, receiving extensive information about their condition and treatment options, including risks. This is driven by the acknowledgment that diagnostics and treatments are rarely black and white, and patients bear the consequences of decisions. Shared decision-making (SDM) is a mechanism to decrease informational and power asymmetry, aiming to keep the “information variable within a range that allows for both the doctor and patient to navigate the situation in an informed way.” Building trust is crucial for this information flow, as seen in neonatal intensive care units (NICUs) where acknowledging parents’ needs improves communication and decision-making.
Exploiting Assumptions
This sidebar highlights that over-reliance on a particular equilibrium makes one vulnerable to changing circumstances. Versatility and flexibility are gained by functioning in a wider range of conditions and having homeostatic processes to reorient after disruption. In competitive situations, those who flounder when thrown off by the unexpected suffer. The example of magician Ralph Hull and “The Tuned Deck” card trick illustrates this. Hull confounded even expert magicians because he didn’t rely on a single technique, but rather used a mixture of different methods based on audience reactions, shifting away from the equilibrium assumption that a named trick is done the same way every time.
The Complexity of Equilibrium
The project Biosphere 2 is presented as a grand experiment exploring the complexity of achieving and maintaining equilibrium in a closed system, mimicking planet Earth (Biosphere 1). This 180,000-square-meter structure in Arizona contained five distinct ecosystems and areas for agriculture. The goal was to sustain life with no outside inputs. The 1991 sealing of eight “Biospherians” for two years, despite media portrayal as a failure, was a remarkable voyage of discovery. Oxygen levels drastically sank, and caloric requirements were challenging, but the Biospherians remained healthy. The experiment showed that maintaining life in a sealed environment is infinitely complex because ecosystems are complex adaptive systems with countless feedback loops. Humans attempting to control them must preempt needs, learn to sense problems, and create new feedback loops. The project highlighted the impact of human activity on ecosystems and the difficulty of anticipating all variables. The detailed considerations for including animal species, like the bat needing 20 moths nightly, illustrate the immense, interconnected consequences of even small interventions. Biosphere 2, now utilized by the University of Arizona, continues to serve as a unique research site, demonstrating that even partial equilibrium in such a complex system is a monumental achievement.
Bottlenecks
A bottleneck is defined as the slowest part of a system, which limits the overall output, much like the neck of a bottle limits liquid flow. Bottlenecks create waste as resources pile up behind them and are the points most under strain, most likely to break down. The authors emphasize that focusing on anything besides the bottleneck is a waste of time, as it only increases pressure on it. Every system has a bottleneck, and they cannot be completely eliminated, as fixing one will simply reveal another. The goal is to anticipate them and either plan accordingly or leverage them as an impetus for innovation. One must avoid opening one bottleneck only to create worse ones later.
The distinction between a bottleneck and a constraint is highlighted: a bottleneck can be alleviated (e.g., a breaking machine), while a constraint is a fundamental limitation (e.g., twenty-four hours in a day). The concept of false dependencies is also introduced, where one might attribute a lack of progress to an external factor (e.g., needing a dedicated desk to write) when the true bottleneck is internal (e.g., time or ideas). Validating the true limiting factor is crucial to solving the right problem.
The Trans-Siberian Railway
The construction of the Trans-Siberian Railway (TSR) serves as a complex example of dealing with bottlenecks and their far-reaching consequences. The TSR, the longest railway in the world, faced immense challenges, including thousands of miles of supply lines, diverse terrain, extreme climate, and a centralized decision-making process in St. Petersburg leading to communication delays and uncoordinated solutions. A key bottleneck was Lake Baikal, requiring goods and people to be ferried across until a southern track was built decades later.
Most critically, a severe labor shortage created a bottleneck across the five simultaneous construction projects, as they competed for the same limited pool of workers. To ease this, skilled workers were imported from Europe and Asia, and convicts were used. However, enormous time pressures led to excessive payments to unsupervised peasant contractors. This caused a trade-off: money intended for quality materials was instead absorbed by expensive labor (and pocketed by subcontractors), leading to a materials bottleneck. The lack of quality assurance meant embankments were too narrow, ballast insufficient, drainage inadequate, and inclines too steep, making it a dangerous railway. The central committee was unable to react fast enough. The result was that the railway, despite its monumental achievement, had problems from the start and needed continuous improvements. This illustrates that spending money without quality assurance only moves problems into the future, and Russia effectively had to rebuild the same railroad multiple times. The lesson is to carefully consider how addressing one bottleneck can create new, potentially worse problems. A root-cause fix that improves the system fundamentally is preferable to constant patching.
Bottlenecks and Innovation
The authors argue that bottlenecks are not inherently negative; they inspire innovation. Shortages of resources often prompt people to find alternatives. This is common during wars when necessary materials become unobtainable.
- Nylon, the first synthetic fiber, was invented in the early 1930s as an alternative to silk, which was primarily obtained from Japan, addressing a strategic bottleneck. Its versatility and advantages over silk led to widespread use, including in military applications during WWII.
- Ameripol, a synthetic rubber, was invented by Waldo Semon during World War II to address the critical rubber shortage caused by conflict with Japan. Its rapid development and production were crucial for the Allied war effort, highlighting how innovation can overcome strategic limitations.
- Medical science often advances fastest during wars due to new demands and supply shortages. The American Civil War saw inventions in prosthetic limbs and improved infection control, drastically reducing mortality rates. During WWII, penicillin production skyrocketed.
- Sun lamps were invented by Kurt Huldschinsky during World War I to cure rickets, a condition caused by nutrient (vitamin D) deficiency due to food rationing. Simulating sunlight, they provided an alternative way to meet nutritional needs when food was a bottleneck.
These examples demonstrate how bottlenecks, by forcing creativity and problem-solving, can lead to innovations that ultimately benefit everyone, even if they transfer the “pain” to a different part of the system or create new challenges.
Scale
The concept of scale explores how systems change as they grow or shrink, and how they function differently at different sizes. It’s often more effective to stay small if growth introduces undesirable changes. When scaling up, it’s crucial to acknowledge that growth is often nonlinear, meaning a change in one component can fundamentally alter the system, creating new opportunities and dependencies. The example of doubling a baking recipe (e.g., needing less than double the yeast because of how the dough’s geometry changes) illustrates this nonlinearity.
The authors emphasize that systems become more complex as they scale up, leading to more connections and interdependencies. Combining scale with bottlenecks is important, as larger systems will experience different parts struggling to keep up. Imagining business scaling helps anticipate breakages. The key question for growth is: How well will it age? Resilience can be increased by maintaining a measure of independence between parts, as dependencies tend to age poorly.
Economies of Scale
This sidebar explains that in economics, production processes change as they scale, leading to diminishing marginal costs for each additional unit produced. Economies of scale work by enabling cost-cutting measures like bulk purchasing. However, systems do not scale indefinitely; economies of scale eventually break down due to limitations like finite resources (energy, raw materials, computing power) or market saturation.
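The sidebar's cost logic can be sketched numerically. The fixed and variable costs below are invented for illustration; the point is only the shape of the curve.

```python
# Economies of scale in miniature: average cost per unit falls as fixed
# costs are spread over more output, then the advantage flattens. Real
# systems eventually hit limits (resources, saturation) that break it.
# All cost figures are hypothetical.

def average_cost(units, fixed=10_000.0, variable=2.0):
    """Fixed costs spread across output; constant variable cost per unit."""
    return fixed / units + variable

for units in (100, 1_000, 10_000, 100_000):
    print(units, round(average_cost(units), 2))
```

Average cost drops steeply at first (102.0 per unit at 100 units, 12.0 at 1,000) but asymptotically approaches the variable cost, so the gains from further scaling shrink even before hard limits intervene.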
Long-Lived Japanese Family-Run Companies
The business world often equates scaling with success, but the authors challenge this by presenting long-lived Japanese companies (shinise) as a counter-example. While the average S&P 500 company lasts only 24 years, over 50,000 Japanese companies are over a century old, with nearly 4,000 exceeding 200 years. Most of these companies, like the famous Kongo Gumi (a construction company specializing in Buddhist temples, which operated independently from 578 AD to 2006, enduring 40 generations of family ownership), tend to be small, family-run, and operate within a localized area. They prioritize durable, loyal customer relationships and are driven by a strong internal philosophy that allows adaptation without excessive growth. Staying small helps them retain traditional values and weather economic downturns, fostering greater employee investment and pride due to reduced diffusion of responsibility. The flexibility in succession (e.g., adopting a suitable husband for a daughter) highlights their commitment to longevity over strict lineage. Kongo Gumi’s pivot to making coffins during WWII demonstrates its adaptability. This illustrates that for longevity, staying small and simple can be a superpower, as systems change fundamentally with scale, and bigger is not always better.
On Being the Right Size
J. B. S. Haldane’s 1926 essay “On Being the Right Size” explores the biological implications of scale. He argues that animals cannot become much bigger or smaller without changing their appearance (e.g., a larger gazelle needing thicker legs). Changing an animal’s scale also transforms the impact of gravity. Haldane’s example of a mouse surviving a thousand-yard fall due to its high surface-area-to-weight ratio, while a horse “splashes,” highlights how size fundamentally alters physical properties. Geoffrey West’s quote, “Scaling up from the small to the large is often accompanied by an evolution from simplicity to complexity while maintaining basic elements or building blocks of the system unchanged or conserved,” further reinforces this idea.
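Haldane's argument rests on simple geometry: scale a body up by a linear factor and weight (volume) grows with the cube of that factor while surface area grows only with the square. A back-of-the-envelope sketch, with arbitrary scale factors standing in for mouse, dog, and horse:

```python
# Isometric scaling: area ~ L^2, volume (and hence weight) ~ L^3.
# The surface-area-to-weight ratio that lets a mouse survive a fall
# collapses as linear size grows. Scale factors are illustrative.

def scaled(factor, area=1.0, volume=1.0):
    """Return (surface area, volume) after linear scaling by `factor`."""
    return area * factor**2, volume * factor**3

for name, factor in [("mouse", 1), ("dog", 10), ("horse", 30)]:
    area, volume = scaled(factor)
    print(name, round(area / volume, 4))  # surface-area-to-weight ratio
```

A 30-fold increase in linear size cuts the ratio 30-fold, which is why air resistance cushions the mouse but not the horse: size fundamentally alters physical properties, just as Haldane and West describe.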
The Story of Illumination
The history of artificial light serves as an example of nonlinear scaling and its impact on human productivity and infrastructure. Humans have continuously invested ingenuity in making light better, safer, and cheaper, with each improvement requiring a scaling up of surrounding systems.
- Early lamps (40,000 years ago): Simple limestone pieces with burning animal fat, offering tiny range but enabling activities like cave art. Powering them required little work, but the output was limited.
- Oil lamps and candles (Romans onward): Used various oils, labor-intensive to make and maintain. Still limited in brightness, but showed the value of artificial light. No elaborate systems; people made their own fuel.
- Whale oil (18th century): Marked a drastic change as light became an industry. Fuel was purchased from far away, requiring elaborate, dangerous whaling operations. This was a huge increase in the scale of the system, moving from self-sufficiency to external production.
- Gas lighting (18th-19th centuries): A by-product of coke, gas produced a clearer, stronger flame. Factories first embraced it, emancipating the working day from natural daylight. Frederick Albert Winsor pioneered centralized gas supply via underground pipes, decreasing the marginal price for additional users and cementing gas as an essential utility. This represented another leap in scale, with dedicated production and distribution systems. People became dependent on infrastructure they had little say in, and houses became part of larger city systems. Gas also scaled up nightlife in cities, enabling new activities like coffeehouses, shops, and theaters.
- Electricity (late 19th century onward): The advent of electric light, without fire, allowed human activities to scale up by an order of magnitude. It was cheaper, safer, and could evenly light whole spaces. Thomas Edison copied the gas model of a central supply and grid, requiring massive-scale engineering undertakings like utilizing Niagara Falls. The electric grid became one of humanity’s greatest ideas. As light coverage increased, so did concerns about surveillance and the “nightmare of a light from which there is no escape.”
The story of illumination demonstrates that scaling up a system doesn’t just mean more of the same; it introduces new problems, unanticipated possibilities, and requirements, fundamentally altering its impact on other systems. A more interconnected, larger system gains new capabilities but also becomes vulnerable to widespread failures.
Margin of Safety
A margin of safety is presented as a crucial mental model for interacting with complex systems, ensuring they can handle stressors and unpredictable circumstances. It represents a meaningful gap between what a system is capable of handling and what it is required to handle, acting as a buffer against danger and failure. Engineers, for instance, design bridges for extremes, not averages, adding a large buffer beyond the expected load to account for unusual conditions or future population growth. This doesn’t eliminate failure but reduces its probability. For investors, it’s the gap between an investment’s intrinsic value and its price, with a larger margin indicating greater safety and potential profit, accounting for subjectivity and uncertainty. The greater the cost of failure, the bigger the buffer should be.
To create a margin of safety, complex systems use backups (spare components, capacities, subsystems) to maintain function when things go wrong, increasing resilience. The higher the stakes, the greater the need for backups (e.g., an airplane has more backups than a car). However, margins of safety can lead to overconfidence and risk compensation. Seat belts, for example, might not reduce car accident fatalities because drivers feel safer and drive less carefully, shifting risk to pedestrians. If behavior changes to negate the margin, its benefits are lost (like setting a watch a few minutes fast to avoid being late, then erasing the buffer by remembering it runs fast and adjusting accordingly). Too much margin can also be a waste of resources and lead to complacency, making a system uncompetitive, while too little leads to destruction.
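The engineering framing of the model reduces to a single ratio: capacity over expected load. The numbers below are hypothetical, chosen only to show the buffer and how risk compensation erodes it.

```python
# Margin of safety as a ratio: how much more a system can handle than
# it is expected to handle. Values above 1.0 are buffer; values near
# 1.0 mean any unusual stress brings failure. Loads are hypothetical.

def margin_of_safety(capacity, expected_load):
    """Safety factor: capacity divided by the load the system must bear."""
    return capacity / expected_load

# A bridge designed for extremes, not averages.
print(margin_of_safety(capacity=10_000, expected_load=4_000))  # 2.5

# Risk compensation: if behavior rises to consume the buffer,
# the margin quietly disappears.
print(round(margin_of_safety(capacity=10_000, expected_load=9_500), 2))
```

The same arithmetic underlies the investing version: the wider the gap between what a system can bear and what it must bear, the lower the probability (never zero) of failure.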
Minimum Effective Dose
This sidebar introduces the concept of the minimum effective dose, which is the lowest possible amount of a medication (or intervention) to achieve a meaningful benefit without being dangerous. Pharmacologists also calculate the maximum tolerated dose. For example, a vaccine contains the minimal virus dose to trigger an immune response without causing illness. Knowing this window ensures a margin of safety, allowing doctors to start with a low, likely effective dose, representing a cautious yet impactful approach.
Learning as a Margin of Safety
The authors argue that learning is a way of applying the margin of safety on an individual level. More knowledge reduces blind spots, which are the source of all mistakes. While seemingly inefficient to learn more than needed, the reduction in blind spots offers a crucial buffer, allowing adaptation to changing situations. Astronauts are presented as an excellent example; they must possess far more knowledge than they will ever use. Chris Hadfield, in An Astronaut’s Guide to Life on Earth, explains that astronauts are “perpetual students” who train to “look on the dark side and imagine the worst” to prepare for any eventuality. Their culture involves constant debriefs for learning and improvement. Hadfield’s experience shows how training in meta skills—the ability to parse and solve complex problems rapidly with incomplete information in a hostile environment—is critical. Astronauts are “overqualified” as a safety net, as they must fix problems in space without outside help. This redundancy, along with continuous learning (e.g., Russian, space suit mechanics), is their margin of safety. Our ego often prevents us from capitalizing on this, as we learn just enough for today’s problems or coast on natural strengths. Hadfield emphasizes that “early success is a terrible teacher” and that true readiness means understanding potential problems and having a plan. Knowledge, then, is a buffer against inevitable unexpected challenges.
Anticipating the Worst
While not every endeavor requires comprehensive planning, high-stakes situations demand investing in a significant margin of safety and anticipating the worst. Jacques Jaujard, director of the French National Museums during World War II, exemplified this. Despite public disbelief, he anticipated the Nazi invasion and the targeting of cultural treasures. Having witnessed art’s vulnerability in the Spanish Civil War, he erred on the side of caution. Before the war reached France, Jaujard orchestrated the secret evacuation of the Louvre’s entire collection, under the guise of “maintenance.” Thousands of artworks were packed and transported by various vehicles to castles across France, ensuring a margin of safety before the threat was imminent.
His foresight was crucial: the Nazis stole millions of artworks, but not a single piece from the Louvre’s collection was lost or damaged. Jaujard dispersed the artwork across multiple locations to minimize risk if a stash was found. He also built in extra safety mechanisms like temperature control and even labeled cases by importance. Rose Valland, working in the Nazi art theft division, covertly recorded the whereabouts of stolen art, risking execution to collect information invaluable for postwar repatriation. The Louvre’s collection’s survival symbolized French resistance and highlighted the importance of a significant margin of safety when failure risk is high. This story shows that the future is seldom predictable, and extreme prudence can preserve vital assets and cultural heritage.
Churn
Churn refers to the constant wearing out and replacement of components within systems, including both material and information (stocks) and the parts of the system itself. Examples range from skin cells replacing themselves to trees in a forest dying and new ones growing, or car parts deteriorating. Understanding this model helps work with system change. In business, churn is the loss of customers or employees, influencing growth and expertise accumulation. High churn means constantly running to stay in place, suggesting a need for reassessment. The model encourages asking how to use churn to one’s benefit, such as focusing on core customers or allowing employees to move on for professional growth. A certain level of churn, like from retirements, can bring in new perspectives and experiences, but too much prevents expertise from accumulating.
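The "constantly running to stay in place" dynamic can be modeled in a few lines. This is our illustration, not the book's; the acquisition and churn numbers are arbitrary.

```python
# With a constant churn rate, growth stalls at a ceiling where new
# arrivals only replace departures. Steady state = new_per_year / churn_rate.
# All figures are hypothetical.

def next_year(customers, new_per_year, churn_rate):
    """Survivors of this year's churn plus this year's new customers."""
    return customers * (1 - churn_rate) + new_per_year

customers = 0.0
for _ in range(50):
    customers = next_year(customers, new_per_year=200, churn_rate=0.25)

# Growth plateaus at 200 / 0.25 = 800 customers.
print(round(customers))
```

Adding 200 customers a year forever still caps the business at 800: past that point, every new customer merely replaces a departed one, which is the signal the authors say should prompt a reassessment of the system.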
Cults
The authors present cults as an extreme example of an organization where the purpose becomes preventing churn, rather than its initial aim. Charles Dederich’s Synanon, which started as a drug addiction rehabilitation program in 1958, gradually morphed into a full-blown cult by its demise in 1991. Dederich’s initial belief that addiction was curable by changing social context evolved into an expectation that members stay forever, justified by his later decision that addiction was incurable. Synanon used brainwashing techniques, denial of autonomy (e.g., forced divorces, no children), and threats of violence (e.g., “The Game” for emotional breakdown, paramilitary group, rattlesnake in mailbox for a lawyer). When facing legal scrutiny, Dederich declared Synanon a religion to gain more control over members, as a religion has no end point. Its demise came when the IRS revoked its tax-exempt status. Synanon demonstrates that seeking to eliminate churn perverts a system’s goals, inevitably leading to violence and coercion. The freedom to leave (or “vote with their feet”) acts as a check on abuses of power, and a lack of churn can be a powerful indicator that something is wrong.
Using Churn to Innovate
Churn, at the right level, is healthy and can be leveraged for benefit. It can be built into a system deliberately to encourage new ideas and prevent stagnation. The example of the Bourbaki group of French mathematicians illustrates this. Founded in 1935, their ambitious goal was to compile all existing mathematical knowledge into a single overarching theory. They attributed their collective work to a fictional persona, Nicolas Bourbaki. The group met regularly, engaging in rigorous debate and rewriting, which, though chaotic, ensured the quality and unity of their comprehensive textbooks.
The context was important: France lost many young mathematicians in WWI, leading to an aging academic field lacking new ideas. Bourbaki’s revolutionary aspect was its insistence that members retire at age 50, replaced by new, younger mathematicians. This mandated churn ensured a constant inflow of fresh perspectives and knowledge, preventing stagnation and keeping the field up-to-date. Members could also leave freely if they lost interest. This demonstrates that churn, when harnessed appropriately, brings in new ideas, increases adaptability, and allows for evolution by selecting beneficial traits. While Bourbaki no longer exists in its original form, its method of managed churn helped it stay relevant and useful during its active period, illustrating how change is essential for improvement and adaptation.
Algorithms
Algorithms are defined as methodical sets of unambiguous steps that turn inputs into outputs, helping systems adjust, respond, and scale. They are useful due to their inherent predictability and reliability in producing consistent, logical results. Daniel Dennett’s three defining characteristics of algorithms are:
- Substrate neutrality: The logical structure of the procedure, not the material, dictates its power (e.g., a recipe works whether on phone or in a book).
- Underlying mindlessness: Each step is utterly simple, leaving no room for interpretation (e.g., a recipe specifies exact amounts and clear steps).
- Guaranteed results: An algorithm always produces the same outcome if executed correctly (e.g., a good recipe produces the same cake every time).
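Dennett’s three properties can be seen in even a few lines of code. Below is a minimal, illustrative sketch (my example, not the book’s) using Euclid’s greatest-common-divisor procedure: the logic works on any substrate, each step is mechanical, and the result is guaranteed.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Euclid's algorithm: a methodical set of unambiguous steps.

    Substrate neutrality: the same logic works on paper, in silicon,
    or in anyone's head. Underlying mindlessness: each step is a
    mechanical replacement with no room for interpretation.
    """
    while b != 0:
        a, b = b, a % b  # replace the pair (a, b) with (b, a mod b)
    return a

# Guaranteed results: correctly executed, it can never disagree with itself.
print(euclid_gcd(48, 18))  # 6, every single time
```

Complicated algorithms, like the crime-prediction models the authors mention, differ from this only in scale, not in kind.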
Algorithms can be simple or complicated (like computer models for crime prediction) and can even apply to natural processes like DNA execution or human learning. Some evolve and learn, while others remain static. The authors highlight that in complex systems, algorithms are increasingly designed to be directionally correct rather than perfect, evolving to get useful and relevant outputs. Understanding underlying instructions is key to improving a system.
Pirate Constitutions
The authors use pirate articles (constitutions) from the golden age of piracy (early 18th century) as an example of coherent algorithms for groups to achieve shared goals in a repeatable fashion. Far from lawless, successful pirates operated like controlled businesses, with articles designed to maximize profit by ensuring cooperation, full effort, and the prevention of abuses of power among diverse crews of around 80 members. These articles covered everything from maintaining weapons to resolving disputes ashore, the allocation of plunder (equal shares, with bonuses for bravery), and proto-disability benefits.
Pirate society was remarkably democratic for its time; articles required unanimous agreement, and captains could be voted in and out by majority. Leadership was divided between the captain (battles) and the quartermaster (day-to-day matters). Ching Shih, a Chinese pirate who commanded 70,000-80,000 pirates and up to 2,000 ships, implemented a strict set of rules, including penalties for harming those who had surrendered, going ashore without permission, and withholding plunder, as well as mutilation for deserters. Her invariant application of these rules, akin to an algorithm, ensured cohesion and consistent, reliable outputs, allowing her to retire peacefully with great wealth. The story shows how shared goals, consistent rules, and mechanisms for adaptation can ensure collaboration within a system, even in high-stakes, unpredictable environments.
New Numbers
This sidebar explores how seemingly innocuous, standard inputs can create entirely new outputs through algorithms. It suggests that many modern mathematical concepts, like negative numbers and irrational numbers, didn’t exist until the algorithms (subtraction, square root) that produced them were developed and applied. Paul Lockhart’s Arithmetic illustrates how subtracting a larger number from a smaller one necessitated the concept of negative numbers, which aren’t intuitive in physical representations. Similarly, applying the square root algorithm to a number like 2 revealed the existence of irrational numbers, which cannot be expressed as simple fractions. This highlights how algorithmic thinking can lead to the discovery of entirely new concepts, even from basic operations.
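Lockhart’s point can be made concrete with a few lines of Python (my illustration, not from the book): applying familiar algorithms to ordinary inputs forces new kinds of numbers into existence.

```python
from fractions import Fraction
import math

# Subtraction applied to ordinary counting numbers produces a value
# that no pile of physical objects can represent: a negative number.
print(3 - 5)  # -2

# The square-root algorithm applied to 2 produces a value that no
# fraction can hit exactly. Even a very close rational guess misses:
guess = Fraction(math.isqrt(2 * 10**20), 10**10)  # ~1.4142135623...
print(guess**2 == 2)  # False: the output has escaped the rationals
```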
Finding Quality Inputs
Algorithms are developed to get a certain output, but often, the best inputs aren’t obvious. This section introduces algorithmic thinking as a way to determine and refine inputs, organizing a system to leave as little to chance as possible. The Bayer pharmaceutical company’s quest for the first broad-spectrum antibiotic in the late 1920s is a prime example. Heinrich Hörlein, head of research, created an industrial system to identify antibacterial compounds, moving beyond individual scientists’ hunches. He hired Gerhard Domagk to run a “recipe”: systematically testing hundreds of compounds from chemists (like Josef Klarer) against common, deadly bacteria in test tubes and living animals, with meticulous records.
Despite years of negative results, the team persisted with their fixed methodology. In 1932, Klarer attached sulfur to an azo compound (Kl-695), which, through the standardized testing process, yielded the desired result: mice recovering from bacterial infection without toxicity. Notably, the breakthrough came while Domagk was on vacation, proving the efficacy of the entrenched process rather than any individual’s presence. This allowed Bayer to refine inputs, discovering that the sulfa component itself was the key, leading to the development of Prontosil and subsequent sulfa-based antibiotics. Bayer’s algorithmic approach transformed drug research, becoming the industry standard. This story shows that even without knowing the answers, having a good algorithm for testing and refining inputs can lead to significant discoveries, as the process itself reveals the path to desired outputs.
Supporting Idea: Complex Adaptive Systems
This supporting idea distinguishes between complex systems (entities follow fixed rules, like a pocket watch) and complex adaptive systems (CAS), where entities adapt, leading to greater capacity to respond to environmental changes. CAS have properties greater than the sum of their parts; they cannot be understood by studying individual components, which may be simple but interact in unpredictable, nonlinear ways. They exhibit self-organization without centralized control (e.g., city traffic, bird flocks) and have “memories,” meaning they are impacted by past events. Melanie Mitchell defines a CAS as one “in which large networks of components with no central control and simple rules of operation give rise to complex collective behavior, sophisticated information processing, and adaptation via learning or evolution.” In a CAS, we can never do just one thing, as interventions inevitably lead to unintended consequences, often making things worse due to overestimation of control. To work with CAS, one must be comfortable with the nonlinear and unexpected, embracing humility and the scientific method. They learn and change in response to new information (e.g., flu models adapting to vaccination behavior). From the outside, CAS can appear chaotic, but they often work best when slightly disorganized, allowing for mutations and experimentation, with deviations eventually canceling out into coherent patterns.
Critical Mass
Critical mass is the point at which a system is on the verge of changing from one state to another, where a final unit of input has a disproportionate impact, like the “straw to break the camel’s back.” Once this threshold is crossed, only a tiny nudge is needed for change. Also known as the tipping point, it applies to social systems (e.g., enough people adopting a belief for self-sustaining growth), physics (e.g., water boiling), business (e.g., a company becoming self-financing), and epidemiology (e.g., vaccination rates preventing disease spread). The amount of energy needed to reach critical mass varies between systems.
The model helps understand the effort for sustained change. It reminds us not to just focus on the tipping point, but on the buildup of work required to get there. The final unit of input has an outsized impact because of the prior accumulation. It also helps identify target points for change in social systems, such as opinion leaders, whose influence can accelerate reaching critical mass. Systems in a critical state are precarious and easily tipped, making it valuable to recognize when instability is imminent, like a pencil balanced on its end.
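The idea that the final unit of input matters only because of the prior accumulation can be sketched with a simple threshold (“Granovetter-style”) adoption model. This is an illustrative toy, not a model from the book; the function name `cascade` and the numbers are my own.

```python
def cascade(thresholds, seed):
    """Each person adopts once the overall adoption share reaches
    their personal threshold. Returns the final adoption share."""
    n = len(thresholds)
    adopted = set(seed)
    while True:
        share = len(adopted) / n
        newly = {i for i in range(n)
                 if i not in adopted and thresholds[i] <= share}
        if not newly:
            return len(adopted) / n
        adopted |= newly

# A smooth 'staircase' of thresholds: a single seed tips everyone.
staircase = [i / 100 for i in range(100)]
print(cascade(staircase, {0}))  # 1.0: full cascade

# Remove one rung and the identical seed stalls almost immediately.
gapped = staircase.copy()
gapped[1] = 0.02
print(cascade(gapped, {0}))  # 0.01: no critical mass, no change
```

The same final nudge either tips the whole system or does nothing, depending entirely on the buildup that came before it.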
The Overton Window
This sidebar introduces the Overton Window, a concept by Joseph P. Overton, referring to the range of ideas considered acceptable for politicians to propose as policy. Ideas outside this window, no matter how good, cannot gain widespread support. Over time, the window shifts (e.g., women’s suffrage moving from “unthinkable” to “policy”). Some politicians deliberately advance extreme ideas to shift the window and make more moderate ones palatable. The progression is: unthinkable → radical → acceptable → sensible → popular → policy. The Overton Window highlights that attitudes and opinions are not static and that what is considered acceptable changes, encouraging recognition of the effort required to shift public perception.
The Work Required for Change
The authors emphasize that focusing solely on tipping points overlooks the extensive work involved in building critical mass. New Zealand becoming the first self-governing country to grant women the right to vote in 1893 serves as a prime example. This was not a sudden change but the culmination of years of effort. Unusual aspects of New Zealand society (recently settled, desiring fairer society, small population, early support from male politicians) laid the groundwork. Key factors in the buildup included:
- Equal access to education: Campaigning by Learmonth Dalrymple led to girls receiving the same secondary and university education as boys, with women comprising half of university students by 1893.
- Improved employment prospects: Education led to more women entering professions beyond domestic labor, gaining social influence.
- Unionization: Women unionized when facing worse working conditions, gaining confidence in collective action.
- Temperance movement: This movement, seeking to restrict alcohol consumption (which harmed family life), provided a framework for women to politically organize, learning “the arts of organization, administration, and leadership.”
These efforts culminated in Kate Sheppard’s suffrage movement, which organized petitions with steadily growing support, amassing nearly 32,000 signatures by 1893; the suffrage bill finally passed by a “whisker.” New Zealand’s achievement then inspired similar movements globally, demonstrating that once a tipping point is passed, the system’s nature fundamentally changes. This story illustrates that changing a system doesn’t require changing everything, but rather building capabilities and shifting opinions over time until a critical mass is reached, after which the change perpetuates itself.
Minority Opinions
This sidebar notes that significant social shifts often occur when a critical mass of people hold a viewpoint, not necessarily a majority. Research from Rensselaer Polytechnic Institute suggests this tipping point can be as low as 10% of a population, regardless of network type, while other research indicates around 25%. Once opinion leaders adopt a viewpoint, it spreads more easily due to social consequences for non-conformists. Targeting opinion leaders can accelerate reaching the tipping point, as many people are not as committed to their opinions as they imagine.
Madeline Pollard
The legal case of Madeline Pollard suing Congressman William C. P. Breckinridge in 1894 for breach of promise is presented as an unprecedented legal decision signaling significant social change. Pollard, a struggling student, had a lengthy affair with the older politician, bore two of his children, and sued him after he lied about marrying her. Despite women being laughed out of court for similar cases previously, Pollard won, highlighting the double standards of the time. This case, while appearing as a turning point, was a result of a slow buildup of changing opinions that reached critical mass, demonstrating how legal shifts can be a visible sign of underlying societal evolution.
Organic Cities
In nuclear physics, critical mass is the minimum amount of fissile material needed for a self-sustaining reaction. Applied to cities, it’s about the density of interactions needed for a city to function well and adapt. Planners often misidentify the elements required, focusing on infrastructure rather than on how it fosters interactions. Jane Jacobs, in The Death and Life of Great American Cities, argued that a city sidewalk, for instance, means nothing without the surrounding buildings and mixed uses that border it. The interplay of people and “a constant succession of eyes” on sidewalks, especially in public places used in the evening and at night, ensures safety by moderating antisocial behavior and facilitating swift bystander intervention. This “intricacy of sidewalk use” also makes an area lively and interesting, attracting more activity and economic benefits, creating a feedback loop dependent on varied uses and interactions.
Jacobs’ ideas contrast with planned cities that segregate functions, leading to a lack of spontaneous interactions. Brasília, Brazil’s capital, designed with visually ordered, segregated zones (work, homes, shops), lacks a street culture and coherent communities. Its designers missed the need for infrastructure to facilitate a critical mass of interactions. The city’s need for inexpensive labor, unaddressed in the plan, led to the spontaneous growth of unofficial low-income areas, showing that the city could only function by deviating from the initial design.
The example of Strøget, Copenhagen’s pedestrianized street network created in the 1960s, demonstrates successful critical mass through mixed-use areas that promote human interaction. Despite initial skepticism, its combination of shops, cafes, theaters, and street performers attracts some 120,000 visitors even in winter. Architect Jan Gehl applied these principles, showing that a bustling public space results from diverse factors combining at a certain density, making street culture a universal human need if the right spaces are provided. As Gehl’s work suggests, the “actual architecture should be invisible because the focus is on the people.” This highlights that good urban design facilitates a minimum level of self-sustaining interactions, rather than just looking good.
Emergence
Emergence describes how systems, when viewed on a macro scale, can exhibit capabilities that are not present on the micro scale. As Aristotle noted, “The whole is something over and above its parts, and not just the sum of them all.” This model reminds us that new capabilities can arise from seemingly innocuous elements. Emergent properties cannot be understood by reducing a system to its components. Termite mounds are a classic example: individual termites are simple, but millions working together build complex structures with ventilation and cooling systems, storage, and specialized housing, all without central control.
Emergence can be weak (functions based on identifiable rules, like a bird flock simulation) or strong (no identifiable rules, like consciousness). A primary feature of emergence is self-organization, where parts appear chaotic but the whole is orderly, without centralized control (e.g., birds flying in coherent shapes by following simple rules like keeping even distances). Emergence is not synonymous with complexity; some complex systems (like nuclear power plants) don’t display it, while simpler ones (like chess) can.
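Self-organization from a purely local rule can be sketched in a few lines (an illustrative toy of my own, not from the book). Each “bird” only tries to sit midway between its two neighbors, yet the flock as a whole settles into perfectly even spacing with no central controller:

```python
def step(positions):
    """One update: every interior bird moves to the midpoint of its
    two neighbors. A purely local rule; nobody sees the whole flock."""
    new = positions[:]
    for i in range(1, len(positions) - 1):
        new[i] = (positions[i - 1] + positions[i + 1]) / 2
    return new

birds = [0.0, 0.5, 0.9, 2.7, 3.1, 5.0]  # a ragged starting line
for _ in range(200):
    birds = step(birds)

gaps = [round(b - a, 3) for a, b in zip(birds, birds[1:])]
print(gaps)  # even spacing emerges: [1.0, 1.0, 1.0, 1.0, 1.0]
```

The even spacing is a property of the flock, not of any bird; no line of the rule mentions it.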
The Mothers of the Plaza de Mayo
The Mothers of the Plaza de Mayo in Argentina provide a powerful example of emergence in protests. Between 1976 and 1983, during Argentina’s “Dirty War,” thousands of people were “disappeared” by the military government. Despite fierce censorship and punishment of dissent, a group of mothers, seemingly powerless individuals, began meeting weekly in the Plaza de Mayo in 1977, demanding to know what happened to their children. Wearing white headscarves with names and birthdates of their children, their peaceful, repetitive actions had an effect greater than the sum of their individual parts. The government initially dismissed them as harmless, but their visibility became both a refuge and a trap.
Individually, they had no power, but as a group, they gained it. News of their protests spread internationally, raising awareness of the regime’s brutality and attracting support from human rights groups. Despite government targeting (some Mothers became “disappeared” themselves, founders were murdered), they refused to back down, finding safety in public visibility. After the Dirty War ended in 1983, their continued fight led to over 850 people being charged and over 120 stolen children identified. This story illustrates that group actions can yield novel, unexpected outcomes, undermining oppressive regimes and inspiring similar movements globally. The Mothers’ collective power, which arose from their unity and persistent visibility, was an emergent property.
Social Innovation
The concept of emergence also applies to social innovation and knowledge sharing. Humans, as a species, can achieve more than any single brain due to cultural learning. Joseph Henrich, in The Secret of Our Success, explains that technologies (from kayaks to antibiotics) emerge “not from singular geniuses but from the flow and recombination of ideas, practices, lucky errors, and chance insights among interconnected minds and across generations.” Our cultural learning abilities also give rise to “dumb” processes that, over generations, produce practices “smarter than any individual or even group.” We don’t need to reinvent the wheel, and we specialize while still benefiting from the collective. Henrich argues that cultural learning has created a “cultural mind,” an emergent property allowing human knowledge to vastly exceed individual scope.
Cultural learning works through sharing improvements and insights. Robert Boyd and Peter J. Richerson explain that Arctic foragers leveraged a “vast pool of useful information” from others, making small, adaptive improvements that are preserved and nudged further through cultural transmission. This process is often “smarter than we are” and unfolds without central guidance. Henrich suggests that when individuals learn from each other with sufficient accuracy, “social groups of individuals develop what might be called collective brains,” which can generate emergent properties and propel technological sophistication. Language development is a prime example: “no single individual does much at all, and no one is trying to achieve this as a goal. It’s an unconscious emergent product of cultural transmission over generations.” Cultural evolution has also put selection pressures on humans, changing our bodies and instincts, giving us a cumulative cultural heritage. The conclusion is that innovation is an emergent property of “a big network of freely interacting minds,” not just individual genius.
Supporting Idea: Chaos Dynamics
Chaotic systems are highly sensitive to initial conditions, giving rise to the butterfly effect, famously discovered by Edward Lorenz. In the early 1960s, Lorenz, working on weather prediction, found that re-entering a slightly rounded-off value for one input variable led to drastically different predictions. This means that predicting the future behavior of chaotic systems is difficult or impossible without perfect knowledge of starting conditions, as any inaccuracies are magnified exponentially over time. We can only make probability-based predictions.
This contradicts our common assumption that tiny differences shouldn’t matter. While the butterfly effect is often misinterpreted as a wing flap causing a typhoon, it means the difference in starting conditions (with vs. without the flap) is enough for a typhoon in one scenario but not the other. No single moment is more significant; every moment changes everything after. Despite the unpredictability, chaotic systems still follow deterministic rules; their behavior has its own logic, even if we cannot predict it precisely. “Making use of chance can be a deliberate and effective part of approaching the hardest sets of problems.”
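The butterfly effect is easy to reproduce with the logistic map, a textbook chaotic system (my illustration; Lorenz’s actual model was a small set of weather equations). Two starting points differing by one part in ten billion end up completely decorrelated:

```python
def logistic(x, r=4.0):
    """One step of the logistic map; fully deterministic, yet chaotic."""
    return r * x * (1 - x)

a, b = 0.2, 0.2 + 1e-10  # nearly identical initial conditions
max_gap = 0.0
for _ in range(60):
    a, b = logistic(a), logistic(b)
    max_gap = max(max_gap, abs(a - b))

print(max_gap)  # the one-in-ten-billion difference has grown enormously
```

The divergence roughly doubles each step, so after a few dozen iterations the two trajectories share nothing but their governing rule, exactly the deterministic-but-unpredictable behavior described above.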
Irreducibility
Irreducibility is the concept that a theory or system can only be reduced to a certain basic level beyond which it loses its meaning or ceases to function as intended. Albert Einstein’s idea that “the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience” underpins this model. It’s about finding the minimum amount necessary for a thing to still be that thing, recognizing when you’re fundamentally changing the system.
The parable of the goose and the golden eggs illustrates this: cutting open the goose to get all the gold at once destroys its emergent property of laying eggs, leaving the farmer with nothing. Systems with emergent properties lose them when disassembled.
Loose Lips Sink Ships
This section applies irreducibility to communication, specifically wartime propaganda posters. To convey complex information simply and effectively, poster artists sought the minimum number of words and images to depict their message, avoiding ambiguity. The slogan “Loose Lips Sink Ships” combined with a sinking boat image conveyed a complex message about civilian actions impacting the war effort, the presence of spies, and the need for unified vocal support.
Abram Games, a British graphic designer and official WWII war artist, excelled at this, deriving “maximum meaning from minimum means.” His posters, covering topics from patriotism to avoiding gossip, were visually stunning and powerful because they were stripped to their essential forms while retaining clear messages. They leveraged common symbols and symbolic representations (e.g., eagle for US, red for danger) to simplify the message. Joseph Kaminski’s analysis of a WWI American Air Service recruiting poster (“Give ’em the gun,” with “learn” and “earn” highlighted) shows how minimal elements conveyed complex appeals about belonging, fighting, and future career benefits without explicit spelling out. This demonstrates that simplicity, when it retains irreducible elements, can convey powerful meaning, but too much simplicity conveys no meaning at all.
Typography
Typography is another area where irreducibility is crucial. Font designers must identify and retain the irreducible elements of each letter so that, despite variations in size, spacing, or color, a letter remains recognizable. Eric Gill’s An Essay on Typography explains that letters are “signs for sounds… more or less abstract forms” that can be modified without losing their essence. While elements like serifs or thickness can change, certain features of a letter’s shape are irreducible. For example, a “Roman capital A does not cease to be a Roman capital A because it is sloped backward, or forward.” Context matters: the irreducible elements of a letter may differ when it’s part of a word compared to when it’s alone. Gill also notes that a headline’s irreducible element is its immediate noticeability, not necessarily its size, as bolding can achieve the same effect regardless of position. This highlights that irreducible components are not always fixed and depend on the system’s context and goals. Fonts that fail to retain these elements become visual art rather than communication.
Gall’s Law
This sidebar introduces Gall’s Law, from John Gall’s The Systems Bible, which states that a complex system that works is invariably found to have evolved from a simple system that worked. Attempting to build a complex system from scratch is typically ineffective. Examples include bureaucratic processes evolving from simple forms, complex organisms from single-celled bacteria, sprawling cities from small towns, and airplanes from bicycles. Gall’s Law reinforces that understanding a complex system by looking at its parts isn’t always possible and advises against designing complex systems from scratch, suggesting consistent, incremental progress from something basic that works.
The Law of Diminishing Returns
The law of diminishing returns posits that inputs to a system lead to more output up to a point, after which each additional unit of input yields a decreasing amount of output. Eventually, more inputs can even reduce total output. This nonlinear relationship applies universally. In economics, increasing inputs like labor in a factory increases production, but too many workers lead to crowding and reduced per-worker output. In farming, adding fertilizer increases crop yields, but past a certain point, more fertilizer means less increase in yield, and eventually, total yield reduction.
Diminishing returns are everywhere: working too many extra hours leads to more mistakes, endlessly tweaking project details yields minimal improvements for time invested, or too much funding for a new company might shift focus from customers to investors. In skill development, early practice yields huge improvements, but subsequent hours bring diminishing gains. This law teaches that outcomes are not linear, and not all inputs are equal. Understanding this allows calculating the point of diminishing returns to interact with systems optimally. We often focus on trivial details at the expense of meaningful ones.
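“Calculating the point of diminishing returns” can be made concrete with a stylized sketch (an illustration of the principle, not a formula from the book): with a concave production function, each extra hour yields less, and there is a computable point where another hour is no longer worth its cost.

```python
import math

def output(hours: int) -> float:
    """A stylized concave production function: more input always
    helps, but by ever-smaller amounts (diminishing returns)."""
    return 100 * math.sqrt(hours)

# Marginal return of each successive hour: 100, 41.4, 31.8, ...
marginals = [output(h + 1) - output(h) for h in range(10)]
print([round(m, 1) for m in marginals])

# Suppose an hour of effort is only worthwhile if it yields 15 units.
# The rational stopping point is where marginal output dips below that.
stop = next(h for h in range(1, 100) if output(h + 1) - output(h) < 15)
print(stop)  # hour 11: beyond this, effort costs more than it returns
```

The total keeps rising the whole time; it is the marginal return, not the total, that tells you when to stop.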
The Diminishing Returns of Homework
This sidebar uses homework as an example of diminishing returns. Research suggests it has no benefit for young children and diminishing benefits for high schoolers, with additional hours leading to fewer gains, especially if it reiterates material. For students with other responsibilities, it can even have negative returns by reducing leisure and sleep. This illustrates how assuming “more effort always leads to more rewards” despite evidence of diminishing returns leads to ineffective practices.
The Viking Raids of Paris
The Viking raids of Paris between 820 and 926 CE illustrate how a strategy that initially produces dramatic returns can eventually face diminishing ones. After Charlemagne’s death, his successor Louis the Pious inspired less fear, leading to early successful Viking raids, particularly on wealthy churches like the Abbey of Saint-Denis. The notorious 845 raid, led by Reginherus (possibly the legendary Ragnar Lodbrok), with 120 ships, extracted a huge ransom from Charles the Bald.
However, paying this ransom set a dangerous precedent, encouraging more attacks. The Franks paid substantial sums over decades. Eventually, the Franks built walls, bridges across the Seine, and towers with boiling wax and oil. This increased their defense, making raids difficult, costly, and time-consuming for the Vikings, who faced resource strain, morale issues, and disease during lengthy sieges. Diminishing returns set in: ransoms became smaller, costs higher. By 886, a weakened Viking leader requested only 60 pounds of metal. In 911, Viking leader Rollo accepted land, a title, and marriage from Charles the Simple in exchange for protecting the area from future Viking attacks, founding Normandy.
This story shows that repeating the same actions indefinitely does not yield the same results. Initial dramatic returns diminish as systems adapt. The areas also ran out of wealth to extract, and finding new villages required greater travel and risk. When returns were no longer worth the effort, Rollo pivoted to a different way of benefiting from Europe. Noticing diminishing returns means it’s time to change tack. Only after the raids stopped did Paris begin to flourish, illustrating that planning for diminishing returns can help avoid them and lead to new opportunities.
The Diminishing Returns of Mass Incarceration
This sidebar discusses mass incarceration as a system with diminishing returns. While incarcerating the most dangerous individuals significantly increases safety, increasing incarceration for less serious crimes yields diminishing safety gains. At a certain point, the costs (to taxpayers, to society from incarcerated individuals’ inability to contribute) outweigh the benefits. This rests on the assumption that locking people up is always good, but the logical end point is absurd (e.g., death penalty for spitting gum). Emile Durkheim argued that a certain amount of crime is inevitable in any society, based on “collective sentiments.” Preventing the worst crimes doesn’t create a perfect society; it merely shifts the focus and significance to more minor “crimes.”
The Ford Edsel Was Just a Car
The Ford Edsel serves as an example of putting average results into perspective and of the dangers of over-hyping a product. Launched in 1957 after a lavish two-year advertising campaign that shrouded it in mystery and made bold claims, the Edsel was hyped as the “greatest car ever made.” Americans, for whom car ownership symbolized freedom and prosperity, had high expectations. However, upon release, the Edsel was “just a car,” with an “odd and distorted” vertical front grille. The excitement bubble popped. Early technical issues further tarnished its image, and sales fell far short of expectations. Ford lost hundreds of millions of dollars (billions in today’s terms) and stopped production within two years.
The Edsel’s failure wasn’t just due to its design or minor flaws, but the disconnect between the extreme hype and its average reality. Over-advertising created expectations that could only lead to disappointment. This illustrates regression to the mean: Ford had spectacular successes with the Thunderbird before and Mustang after, making the Edsel seem like a massive failure in comparison. However, when viewed against the spectrum of all cars produced, it was simply an average seller. The lesson is that not every effort will yield spectacular, outlier results; there is always an average. Appreciation of the average helps contextualize results and avoid complacency, even after early successes.
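Regression to the mean can be demonstrated with a toy simulation (mine, not the book’s): when results mix underlying quality with luck, the follow-ups to extreme results drift back toward the average.

```python
import random

random.seed(42)

def launch():
    """One 'product launch': a shared baseline quality plus luck."""
    return 50 + random.gauss(0, 20)

first = [launch() for _ in range(10_000)]
second = [launch() for _ in range(10_000)]

# Take the top 5% of first results and look at their follow-ups.
top = sorted(zip(first, second), reverse=True)[:500]
first_avg = sum(a for a, _ in top) / len(top)
second_avg = sum(b for _, b in top) / len(top)
print(round(first_avg, 1), round(second_avg, 1))
```

Because luck doesn’t repeat, spectacular first results are followed by thoroughly average ones, just as the Thunderbird’s success made the Edsel’s ordinary sales look like a catastrophe.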
Mathematics
The section on Mathematics highlights how quantitative reasoning provides foundational tools for understanding patterns, making predictions, and optimizing outcomes in complex systems. As Shakuntala Devi states, “What is mathematics? It is only a systematic effort of solving puzzles posed by nature.”
Supporting Idea: Distributions
Distributions help contextualize data, enabling predictions about the probability, frequency, and possibility of future events. The four key characteristics determining a distribution type are: discrete vs. continuous data, symmetry (symmetric vs. asymmetric), upper/lower limits, and likelihood of extreme values. Distributions are often idealized representations, requiring compromises between fit and ease of estimation, with the ultimate goal of leading to better decisions.
The normal distribution (bell curve) is the most familiar, where most values cluster around a midpoint (the mean, mode, and median coincide), with fewer values farther out. It characterizes quantities shaped by physical constraints (height, blood pressure), measures like IQ and exam results, and the pricing of common goods. More extreme values are less likely but not impossible, as the tails go on forever. Long-tail values, though highly unlikely, can have an outsized impact (e.g., extreme commute delays).
Power law distributions contrast with normal ones, with values clustering at the low or high end and often ranging over many scales (e.g., wealth distribution, town sizes). They are “scale-free” because they lack a “normal” size. “Something normally distributed that’s gone on seemingly too long is bound to end shortly; but the longer something in a power law distribution has gone on, the longer you can expect it to keep going.” Identifying power law situations helps set realistic expectations for change and contend with diverse potential values. Other distributions include the geometric (when a first success might happen), binomial (number of successes), Poisson (rare events in large populations), and memoryless distributions such as the exponential (like waiting for a bus). Data quality is crucial; distributions can change as future data arrives.
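The contrast between the two main families can be made concrete with a quick simulation (a toy illustration, not from the book; the specific parameters are assumptions chosen for clarity): normal samples stay in a narrow band around the mean, while power-law samples span several orders of magnitude.

```python
import random

random.seed(42)
N = 100_000

# Normal: values cluster around the mean; extremes taper off quickly.
normal = [random.gauss(100, 15) for _ in range(N)]

# Power law (Pareto): most values are small, but the tail spans many scales.
pareto = [random.paretovariate(1.5) for _ in range(N)]

def spread(values):
    """Ratio of the largest to the smallest observed value."""
    return max(values) / min(values)

print(f"normal spread: {spread(normal):,.0f}x")  # stays within one order of magnitude
print(f"pareto spread: {spread(pareto):,.0f}x")  # spans several orders of magnitude
```

Even with 100,000 draws, the normal sample never strays far from its midpoint, while the Pareto sample has no “typical” size to cluster around, which is exactly what “scale-free” means.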
The Good Life
The philosophy of Epicurus is interpreted through the lens of a normal distribution. Writing around 300 BCE, Epicurus argued that pleasure is the only realistic measure of evaluating life, and its pursuit should drive choices. This was often misunderstood as promoting selfish indulgence. However, Epicurus clarified that “no pleasure is a bad thing in itself. But the things which produce certain pleasures bring troubles many times greater than the pleasures.” These are the pleasures to avoid.
The ideal state, according to Epicurus and Daniel Klein, is tranquility, a life of “neither pleasure nor pain,” corresponding to the median at the top of a normal distribution curve. Excessive indulgence (extreme right of the curve) often leads to pain (left of the curve). Epicurus believed that “living a simple life was the best way to avoid pain,” which is a pleasure in itself. Catherine Wilson explains that for Epicurus, a good life was free from deprivations and anxieties. The focus on pain reduction means avoiding the painful psychological and physical consequences of immediate gratification. For Epicurus, the “greatest source of pleasure in life” was close human relationships, a philosophy of experience over consumption. This interpretation shows how aiming for the median of life’s “bell curve” avoids the extremes and the negative consequences of excess.
Compounding
Compounding is described as a “magical,” immensely powerful, and often misunderstood force of nonlinear, exponential growth. It refers to the growth of wealth, relationships, and knowledge over time, where most gains come at the end, not the beginning, requiring continuous reinvestment of returns. Compound interest, where interest earns interest, is the most visible form, illustrating how small amounts can become fortunes over long periods. Debt can also compound, becoming impossible to pay off.
Compounding forces long-term thinking, as its effects are only remarkable on a long timeline. It shows that enormous gains can be achieved through incremental efforts. While non-literal (e.g., knowledge compounding isn’t calculable by a formula), the concept serves as a powerful metaphor for how things grow nonlinearly. Early decisions have a greater impact due to their compounding consequences (e.g., a new graduate’s first job choice impacting future career trajectory).
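The back-loading of gains is easy to verify with the compound interest formula itself; a minimal sketch (the 7% rate, $10,000 principal, and 40-year horizon are illustrative assumptions, not figures from the book):

```python
def compound(principal, rate, years):
    """Value after `years` of annual compounding at `rate` (e.g., 0.07 = 7%)."""
    return principal * (1 + rate) ** years

start = 10_000
growth_first_30 = compound(start, 0.07, 30) - start
growth_last_10 = compound(start, 0.07, 40) - compound(start, 0.07, 30)

# Most gains come at the end: the final decade of a 40-year horizon
# adds more than the first three decades combined.
print(round(growth_first_30), round(growth_last_10))
```

This is why the model forces long-term thinking: someone who stops after 30 years forfeits more growth in the final decade than they earned in the entire first thirty.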
You Don’t Always Know the Payoff
The authors explain that when investing in things that compound, the exact payoff and opportunities might not be known at the outset. For example, saving money now might lead to a dream home later, or the security it provides might enable a different career choice. Jewish education norms serve as a long-term example. In the first century CE, Jewish fathers were religiously mandated to send sons to primary school to learn Torah in Hebrew, a costly sacrifice with no immediate economic returns in an agrarian society. However, over centuries, this investment in literacy and human capital became a “lever.” Jewish people were able to leave farming for more lucrative urban professions like craftsmen, merchants, and moneylenders at a significantly higher rate than non-Jewish people. Literacy improved productivity and earnings in these fields. This demonstrates that early investments, even for non-economic reasons, can compound into exceptional, unanticipated opportunities generations later, as knowledge accrues and allows for capitalization on changing economic landscapes.
Reinvesting Experience
Experience also compounds, leading to greater capabilities if we actively build on developed skills in new situations. This isn’t automatic; like financial compounding, it requires reinvestment. Mireya Mayor, a National Geographic scientist and explorer, exemplifies this. Her successful 2008 expedition retracing Henry Morton Stanley’s path in Tanzania was possible due to years of insights from prior experiences. She learned from her first expedition to Guyana (e.g., what to pack for survival in the jungle) and consciously reflected on how to apply past knowledge. Even her experience as a cheerleader for the Miami Dolphins taught her to “perform under pressure.” Her numerous trips to Madagascar for primatology built relationships and expertise. Mayor “put in the legwork” for years, allowing her to take on increasingly complicated and dangerous challenges, from diving with sharks to working with leopards. Her arduous Tanzanian journey, where she leveraged compounded knowledge to overcome brutal conditions and injuries, is a testament to the power of reinvesting insights. When we stop challenging ourselves, we stop compounding our learning, and 20 years can become “the same year repeated 20 times.”
Compounding Relationships
Relationships also compound, growing stronger through consistent win-win dynamics. Preferential attachment is a type of compounding where things (like money or friends) accrue to those who already have plenty of them (e.g., people with many friends tend to make more). This can lead to cumulative advantage. Small differences in starting conditions, like graduating during a recession, can have compounding negative impacts on career earnings. However, individuals can leverage the concept of cumulative advantage by seizing tiny advantages and maximizing benefits from them, for instance, through networking. Each connection can lead to more, creating influential networks.
Sidney James Weinberg, a powerful Wall Street banker and Goldman Sachs CEO for 39 years, exemplifies this. Despite a humble background (son of a Polish liquor dealer, left school at 13), he rose to the top by understanding and leveraging compounding relationships. Starting as a janitor’s assistant at Goldman Sachs, a chance meeting led to a promotion to the mailroom, where he proved himself, leading to mentorship and banking courses from Paul Sachs. After WWI, he quickly became a salesperson and then CEO. His greatest assets were the relationships he meticulously built and continuously expanded, notably by serving on over 30 corporate boards (attending 250+ meetings yearly) and befriending CEOs by being helpful. He leveraged his Wall Street influence to support Franklin D. Roosevelt’s presidential campaign, organizing an advisory board of corporate executives who later became Goldman Sachs clients. Weinberg’s philosophy of “friendship should always come before business” and his conscious investment in reciprocal relationships led to his remarkable, compounding influence.
Basic Income and Death Taxes
This sidebar discusses how societies often implement rules to mitigate the exponential accumulation of advantage due to compounding and preferential attachment. Basic income (unconditional income distribution) aims to level the playing field, disrupting preferential attachment cycles. Unlike conditional welfare, basic income allows recipients to increase connections to beneficial “nodes” in a network (e.g., education, health services), which might otherwise be inaccessible due to lack of funds. Studies show basic income improves education and health outcomes and fosters social solidarity, “abating cumulative disadvantage” and increasing access to “real freedom, possibilities, and opportunities.”
Death taxes (inheritance and estate taxes) are another mechanism to reduce advantage accumulation across generations. By deducting from assets left behind after death, even if already taxed, they aim to prevent wealth from concentrating in a few families, forcing the richest to contribute significantly to society and mitigating the advantages heirs receive without effort.
Supporting Idea: Network Effects
Network effects occur when a product’s or service’s utility increases as more people use it, creating more value for all users. The telephone is a classic example: its value grows with each additional user. These effects can also be indirect (e.g., more ride-sharing drivers benefit riders). Network effects create a reinforcing feedback loop: added value attracts new users, who in turn create more value. Getting network effects started can be difficult, as products might have little use until they reach a critical mass of users. But once established, they create a strong competitive advantage and can lead to a winner-takes-all market, where one product captures most users, who are reluctant to switch due to the accrued advantages. First movers aren’t always the most successful; later movers can learn from early mistakes.
However, network effects cannot continue indefinitely; past a certain point, negative network effects set in, where user growth leads to less value. This can manifest as an overloaded product (e.g., overcrowded public transport) or a change in fundamental nature (e.g., a small online forum losing its quality due to an influx of new users and diluted norms).
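Both sides of this dynamic can be captured in a toy model (entirely illustrative; the per-link value and congestion coefficient are invented assumptions): value grows with the number of possible user-to-user connections, Metcalfe-style, while crowding imposes a cost that eventually dominates.

```python
def network_value(users, per_link=1.0, congestion=0.001):
    """Toy model: value grows with possible connections (n*(n-1)/2),
    while a congestion cost grows faster and eventually dominates --
    the negative network effect."""
    links = users * (users - 1) / 2
    return per_link * links - congestion * users ** 3

# Value rises with adoption, peaks, then declines as congestion takes over.
for n in (100, 200, 300, 400, 500):
    print(n, round(network_value(n)))
```

The exact peak depends on the assumed parameters, but the shape is the point: early growth is self-reinforcing, and past some crowd size each new user subtracts value instead of adding it.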
Sampling
Sampling is a fundamental mental model in mathematics, especially statistics, crucial for understanding the world, assessing risk, and distinguishing luck from skill. When seeking information about a population (a set of alike people, things, or events), we usually examine a sample (a part of the population). A census is the exception, aiming to include everyone.
Sample size greatly influences results. The law of large numbers states that larger samples yield results closer to the true value (e.g., rolling a fair die 60,000 times will show frequencies closer to 1 in 6 than rolling it 6 times). Small samples can produce skewed results, failing to capture rare or outlier values (e.g., only seeing white swans in a pond might mislead about the existence of black swans). In scientific studies, larger samples mean a lower margin of error and higher sampling confidence, generalizing better to the population. However, trade-offs exist (funding, logistics, and ethical considerations such as participant distress).
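The die example can be checked directly with a small simulation (mirroring the 6-roll vs. 60,000-roll comparison above; the intermediate 60-roll step is an added illustration):

```python
import random

random.seed(1)

def fraction_of_sixes(rolls):
    """Roll a fair die `rolls` times; return the observed share of sixes."""
    return sum(random.randint(1, 6) == 6 for _ in range(rolls)) / rolls

# Larger samples land closer to the true probability of 1/6 (about 0.167).
for n in (6, 60, 60_000):
    print(n, round(fraction_of_sixes(n), 3))
```

With only 6 rolls the observed frequency can easily be 0 or 0.333; with 60,000 it reliably sits within a fraction of a percent of 1/6.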
For a sample to be representative, it must be random, meaning every element in the population has an equal chance of inclusion. George Gallup’s soup analogy (stirring before tasting a spoonful) illustrates this. Awareness of sampling biases is crucial (e.g., the healthy worker effect where employed individuals are healthier than the general population, making a sample of current factory workers unrepresentative of health impacts). Recognizing this helps overcome biases, especially our tendency to overemphasize anecdotes (sample size of one). Traveling, living in diverse areas, and interdisciplinary reading can provide larger, more representative “samples” of ideas and experiences, fostering tolerance and risk awareness.
Defining a Language
The creation of the first Oxford English Dictionary (OED) is a story demonstrating the value of increasing sample size for accuracy. The OED’s ambitious goal was to record all words with a “recognized lifespan” in English, including their first instances, multiple uses, and evolutions. This required reading “everything” ever written in English. The initiators realized this “gigantic, monumental, and impossible” task required the combined action of many volunteers, effectively an army reading millions of pages.
The example of the word “take” (with at least four distinct definitions) illustrates why relying on one usage would be a mistake; all possible uses needed to be found. Volunteers submitted over six million slips of paper, each detailing a word’s usage in a specific text. This vast sample ensured comprehensive coverage and accurate history and definitions, producing the most complete chronicle of English. This monumental effort, finished in 1927 with 12 volumes and over 400,000 words, highlights how increasing sample size can dramatically improve accuracy and completeness.
Insurance
This sidebar explains insurance as a concept predicated on reducing uncertainty by spreading the cost of adverse events across groups and time. While individual risks are unpredictable, a large enough sample size allows insurers to predict aggregate risks with reasonable accuracy, calculating premiums accordingly. Insuring a small group is high risk; a large group is low risk, effectively eliminating uncertainty. However, extreme or unforeseen events, like the 1906 San Francisco earthquake (a rare 7.7–8.3 magnitude event causing an estimated $6.3 billion in insured losses in today’s dollars and bankrupting 14 companies), can still blindside insurers, showing that sample size alone doesn’t always help against extreme rarity or first-of-their-kind events (like 9/11).
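Why pool size tames uncertainty can be sketched with a toy simulation (the 1% loss probability, $100,000 claim size, and pool sizes are invented for illustration): per-member costs swing wildly in a small pool but become predictable in a large one.

```python
import random

random.seed(5)

def yearly_costs(pool_size, p_loss=0.01, claim=100_000, years=200):
    """Simulated per-member cost for each of `years` years of a risk pool."""
    costs = []
    for _ in range(years):
        claims = sum(claim for _ in range(pool_size) if random.random() < p_loss)
        costs.append(claims / pool_size)
    return costs

small, large = yearly_costs(10), yearly_costs(10_000)

# A small pool lurches between $0 years and catastrophic ones; a large pool's
# per-member cost hovers near the expected $1,000, so premiums can be set.
print(round(max(small) - min(small)), round(max(large) - min(large)))
```

The same law of large numbers that stabilizes the big pool is exactly what fails against rare, correlated catastrophes like the 1906 earthquake, where thousands of "independent" risks turn out to be one shared risk.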
Not All Sample Sizes Are Created Equal
Beyond mere size, the representative diversity of a sample is crucial for accuracy. The authors highlight how “behavioral scientists routinely publish broad claims about human psychology and behavior… based on samples drawn entirely from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies.” WEIRD individuals are often outliers, making studies based solely on them unrepresentative of the global human population. To uncover human universals, data must represent the diversity of the species.
Caroline Criado Perez’s Invisible Women explores how data sets used to make decisions impacting women often lack information about women. She argues that “Big Data is corrupted by big silences,” leading to “half-truths.” For example, high rates of unreported sexual harassment on public transportation mean official police reports would falsely suggest safety. This leads to issues like phones too big for average female hands, or medical treatments based on studies conducted “exclusively on male mice” or on 200-pound adult men, despite physiological differences across sexes and ethnicities. Assuming a large sample size alone guarantees a good data set can perpetuate existing discriminations. The lesson is that deep data on a homogenous population is only relevant to that population; data purporting to describe “human” nature must be representative of the species’ diversity, as “data determines how resources are allocated.”
Randomness
Randomness is a challenging mental model because humans are wired to see order and patterns, often attributing causality where none exists. Yet, randomness is the rule, not the exception, in much of what we encounter. Accepting it forces a confrontation with our lack of control over outcomes. History, in hindsight, often appears ordered, but the past is as random as the future, with historical documents surviving or being interpreted randomly.
However, randomness is a tool. Immune systems produce lymphocytes randomly to fight diverse pathogens. Ants forage randomly, leaving pheromone trails that others randomly follow, leading to self-organization without central control. Embracing randomness can make us less predictable and more creative.
What Are the Odds?
This sidebar addresses our misunderstanding of randomness in sequences of equally likely events. We often believe past outcomes influence future ones (e.g., that a coin landing on heads multiple times increases the chance of tails next). But for truly random events, each outcome is independent (e.g., a coin flip is always 50/50). Casinos profit from this fallacy, as seen in the 1913 Monte Carlo roulette wheel landing on black 26 times in a row, leading gamblers to lose millions betting on red even though the odds for each spin never changed. This highlights that past random results have no impact on future probabilities.
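The independence claim is easy to test empirically (a toy simulation, not from the book): condition on a streak and check what actually comes next.

```python
import random

random.seed(7)

flips = [random.random() < 0.5 for _ in range(1_000_000)]  # True = heads

# Collect the outcome immediately following every run of five heads.
after_streak = [flips[i + 5] for i in range(len(flips) - 5)
                if all(flips[i:i + 5])]

# The "due for tails" intuition predicts a rate well below 0.5;
# independence says the rate stays right around 0.5.
rate = sum(after_streak) / len(after_streak)
print(round(rate, 3))
```

Across roughly thirty thousand five-head streaks in a million flips, the next flip comes up heads about half the time, which is the gambler’s fallacy refuted in one number.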
Serendipity and Creating
The question “Where do you get your ideas?” is difficult for authors because creativity is messy and inconsistent, not a predictable source. When creativity feels blocked, introducing randomness (unpredictability) can help. Author Jane Smiley in Thirteen Ways of Looking at the Novel describes willingly entering a “zone of randomness” when stuck. Making characters do unexpected things or writing unplanned scenes can provide insights and momentum, even if later cut. Novel writing is nonlinear; unexpected research findings or character developments can steer the story. There are few universals in the writing process, and authors’ varied experiences and temperaments contribute to the unpredictability. When stuck, Smiley advises experiencing your way out of it—reading more, traveling, asking questions, engaging senses—to stumble upon new things. Great novels are not formulas, and their “greatness” is subjective and multifaceted, not perfect for all. Randomness, by allowing new connections, helps overcome creative blocks.
Two Perspectives on Randomness
The authors distinguish between pseudorandomness (appearance of randomness due to our inability to detect a pattern, though underlying causal influences exist) and true randomness (completely detached from any causal factor, with no explanation for predicting outcomes). Humans tend to behave pseudorandomly, which can be exploited. Professional magicians like Chan Canasta (and later mentalists like Banachek) exploited this by making people think their choices were random when they were subtly influenced (e.g., picking a “random” word like “carrot” or a number like “seven”). Such tricks work because we don’t recognize our choices aren’t truly random, and we see only one instance, not repeated trials revealing the pattern.
Generating true randomness is hard for humans, who tend to follow patterns. Genuine disorder for data encryption requires unpredictable physical processes (radioactive decay, atmospheric noise, lava lamps). Historically, divination rituals (like Naskapi foragers heating caribou shoulder bones to create random hunting maps) provided truly random data, far more useful than pseudorandom human choices, despite being attributed to magic. True randomness is detached from any causal factor, making outcomes unpredictable.
Supporting Idea: Pareto Principle
The Pareto Principle (80/20 rule), identified by Vilfredo Pareto, states that in systems, 80% of outputs are typically the result of 20% of inputs, with the remaining 20% of outputs coming from 80% of inputs. Pareto observed this in wealth distribution (20% of population owned 80% of land/wealth) and pea plant yields (20% of plants produced 80% of peas). Joseph M. Juran applied this to manufacturing, noting 80% of defects from 20% of issues. This is an approximate rule of thumb, not a law, but often surprisingly close.
It highlights that inputs and outputs are not evenly distributed. Not all effort is equally productive, nor all investment equally impactful. Knowing this directs focus: if 80% of software users use only 20% of features, those 20% should be prioritized for effectiveness and user-friendliness.
Regression to the Mean
Regression to the mean is a mental model that helps distinguish between luck and skill, and how individual experiences fit into the spectrum of possibility. It posits that outlier results (far above or below the mean) with a luck component are likely to be followed by more moderate ones. Francis Galton first identified this phenomenon by observing that unusually tall or short parents tended to have children of more average height, as if nature “regressed” to the mean.
This applies to various phenomena: extreme successes are often followed by average results, as performance returns to true ability. From a single result, luck and skill are indistinguishable. An athlete who bombs in one competition will likely deliver a regular performance in the next. A warm winter day might be followed by an average cold one. Repeated iterations allow results to converge toward one’s true ability. Beginner’s luck is real because beginners who fail spectacularly often quit. The model also suggests that consistent effort, even with average results, increases the chance of an occasional exceptional outlier, as skill improves and provides more opportunities for luck.
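The luck-plus-skill logic can be made concrete with a simulation (all numbers are illustrative assumptions): model each result as stable skill plus a large dose of luck, select the top performers of round one, and watch round two.

```python
import random

random.seed(11)

def perform(skill):
    """One observed result: stable skill plus a large dose of luck."""
    return skill + random.gauss(0, 10)

skills = [random.gauss(50, 5) for _ in range(10_000)]
round_one = sorted(((perform(s), s) for s in skills), reverse=True)

# The top 1% of round one owe much of their score to good luck...
top = round_one[:100]
avg_first = sum(result for result, _ in top) / len(top)

# ...so on a second, independent attempt they fall back toward the mean of 50.
avg_second = sum(perform(skill) for _, skill in top) / len(top)
print(round(avg_first, 1), round(avg_second, 1))
```

The top performers are genuinely above average in skill (round two still beats 50), but nowhere near as far above as their outlier first result suggested — luck doesn’t repeat on demand.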
The Sports Illustrated Curse
This sidebar explains the “Sports Illustrated curse,” where athletes appearing on the cover experience a sudden decline in performance. Regression to the mean offers the explanation: athletes featured are often at the outlier peak of their game, a combination of skill and luck. From there, they are most likely to regress to their average performance. Similarly, injuries ending athletic careers can be seen as bad luck compounding over time. The lesson is to assess abilities by track record, not just greatest achievements, and to recognize that extreme results are not necessarily the start of new trends. Whether things are unusually good or bad, they often return to the mean.
Multiplying by Zero
The mental model of multiplying by zero highlights that any multiplicative system is only as strong as its weakest link. If any component is a “zero” (producing nothing, or effectively negating all other efforts), the entire output is zero, regardless of how strong other components are. This basic math concept is powerful outside of numbers. For example, an unmotivated team member can bring everyone down, or a CEO’s racist comments can cancel out strong branding. In a restaurant, bad food is the “zero” that no amount of good decor or service can compensate for. The value of this model lies in identifying, avoiding, or transforming these zeros. Zeros are usually obvious structural flaws that get deliberately ignored, prompting a search for easy solutions (snake oil) rather than a confrontation with the real problem.
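The model is literally multiplicative; a minimal sketch (the restaurant scores are invented for illustration) shows why no amount of strength elsewhere compensates for a zero:

```python
from math import prod

def overall(scores):
    """In a multiplicative system the outcome is the product of its parts:
    a single zero wipes out every other strength."""
    return prod(scores)

restaurant = {"decor": 0.9, "service": 0.95, "food": 0.0}
improved = {"decor": 0.9, "service": 0.95, "food": 0.7}

print(overall(restaurant.values()))        # one zero negates everything
print(round(overall(improved.values()), 3))  # lifting the zero activates the rest
```

Note the asymmetry with additive systems: in a sum, a weak component merely lowers the total; in a product, it annihilates it, which is why the authors treat transforming a zero into even a modest “one” as the highest-leverage move.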
East German Technology Theft
The East German quest to build a self-sufficient computer industry at the end of the Cold War is a prime example of ignoring a “zero.” Despite understanding computers were vital, their system punished creativity, innovation, and collaboration. Due to embargoes, they couldn’t partner with Western companies, so the Ministry for State Security (Stasi) spent billions on stealing blueprints, hardware, and reverse-engineering technology from the West. They even tried to “import” entire factories through illegal routes.
However, the entire endeavor failed. The Stasi’s zero was the lack of organic knowledge development. They didn’t allow their scientists to travel or engage in open research, cutting off the hard-earned knowledge, born of development and failure, that enables troubleshooting, adaptation, and innovation. Machinery often didn’t work upon arrival, and its illegal acquisition meant no service repairmen. The Stasi’s “cult of secrecy clashed with the scientific ethos of openness.” Throwing money at espionage and smuggling didn’t compensate for the fundamental lack of in-house expertise. To change this “zero” into a “one” would have required a fundamental cultural shift to support innovation, which would have amounted to an acknowledgment of socialism’s failures. This illustrates that success is complex, with many contributing factors, but failure can be determined by just one neglected essential component.
Crop Diversity
This sidebar uses crop diversity in agriculture to illustrate the importance of avoiding “multiplying by zero.” Relying on a single crop (homogeneity) is risky because one failure (plant disease, parasite, weather) can wipe out the entire harvest. Crop diversity is like having multiple equations; if one is multiplied by zero, it doesn’t negate the others. The Irish Potato Famine (1845) is a classic example: a fungus affecting potato plants, combined with a lack of genetic variation (plants were clones), led to over a third of the population lacking their main food source for years. This highlights the importance of not creating excessive dependency on one thing that could fail.
Transforming Zeros
The authors discuss how a “zero” in one’s personal equation—a characteristic or condition perceived as undermining efforts—can be transformed. Stuttering is used as an example: despite intellect, the difficulty verbalizing words can negate the perceived value of knowledge and experience. While there’s no “cure,” stutterers manage it through speech therapy and cognitive behavioral therapy. The feeling that stuttering negates all other efforts is common, but many famous stutterers (James Earl Jones, King George VI, Marilyn Monroe, Emily Blunt, B.B. King, Rubin “Hurricane” Carter) have found ways to succeed in public roles. Actors find that taking on a role helps manage their impediment. Many stutterers also find they don’t stutter while singing, like B.B. King and Rubin Carter, who then applied cadences and relaxation to everyday speech. This demonstrates that transformation isn’t about “getting rid of the zero,” but shifting it just enough to turn it into a “one,” thereby activating the power of the rest of one’s equation. Zeros can challenge us to develop new skills and qualities.
Equivalence
Equivalence is the model that shows things don’t have to be the same to be equal, and there are usually many paths to success. In math, “if A = B and B = C, then C = A” implies different symbols can yield equal answers. This model is useful when traditional solutions are no longer viable but an equivalent result is desired. It also encourages looking for underlying equality in experiences rather than apparent differences to connect with others.
The world is full of things that seem different but are equivalent (e.g., human universals like taboo language or inheritance rules manifest differently across cultures but serve equivalent purposes). Historical recurrence is another form, where seemingly equivalent events happen multiple times (e.g., Lincoln and Kennedy assassinations, Napoleon and Hitler invading Russia), suggesting people in similar situations with similar incentives behave similarly.
Multiple Discoveries
The myth of the solitary genius inventor is challenged by the phenomenon of multiple discoveries, where equivalent results are reached independently by multiple people or teams around the same time. This happens because discoveries often build on cumulative work and recombining existing ideas within a shared scientific and cultural landscape. Examples include:
- Charles Darwin and Alfred Russel Wallace independently conceiving natural selection.
- Carl Wilhelm Scheele, Joseph Priestley, and Antoine Lavoisier discovering oxygen around the same time.
- Louis Ducos du Hauron and Charles Cros presenting similar methods for color photography.
- Nettie Stevens and Edmund Beecher Wilson independently identifying X and Y chromosomes for sex determination.
- Takaaki Kajita and Arthur B. McDonald sharing the Nobel Prize for showing neutrinos have mass.
Patent law, which assumes a sole inventor, often misses that the “inventor” is one of many who happened to file first or gain recognition. This also highlights how female and minority scientists and inventors are often overlooked, with credit going to better-known figures later. Multiple discoveries show that while details may differ, underlying principles and problem solutions are equivalent, and they demonstrate the rich understanding gained when the full process of innovation is appreciated.
Madeleine Vionnet and the Bias Cut
This sidebar presents Madeleine Vionnet, a 20th-century designer, as an example of achieving equivalent results in a novel way. Up until her time, corsets shaped the female body. Vionnet challenged this by inventing the bias cut in 1922. She intuitively realized that cutting fabric at a 45-degree angle to the grain (rather than straight) allows it to stretch significantly more and cling to the body, creating “soft flattery” and “clothes that moved like water,” freeing fabric from traditional constraints. Her “free-form geometric” approach, inspired by classical Greek statues, became a staple of fashion. This demonstrates that there is more than one way to achieve a desired aesthetic or functional outcome.
How We Deal with the Universal of Death
Death is a universal human experience, and all cultures have a need to process the death of a loved one. Funeral rituals, though widely varied, serve equivalent purposes: helping the living make sense of death, providing a script for grief, and offering reassurance. William G. Hoy notes the universal appeal of “making sense of death through ceremonies.” Regardless of specific religious beliefs or customs (e.g., Jewish custom of sitting with the body, Hindu cremation, Tibetan sky burial, South Korean bead ashes), all traditions aim to console the living and acknowledge the deceased’s transition. The variety of funeral types (somber vs. lavish, with singing or dancing) all serve the same core needs. This demonstrates that many different ways can meet universal human needs, and all are equally valid in their function.
Supporting Idea: Order of Magnitude
Orders of magnitude are a form of notation used to represent very large or very small numbers compactly, as our brains struggle to conceptualize them. A number an order of magnitude larger is ten times larger (one more power of ten); an order of magnitude smaller is a tenth the size. This notation is crucial in science, mathematics, and engineering (e.g., comparing the weight of Earth to that of a car). It enables comparisons and estimations, sacrificing perfect accuracy for context (e.g., spending $1 a second, it takes ~11 days to spend $1 million, but ~32 years to spend $1 billion—a difference of three orders of magnitude).
The Richter scale for earthquakes, created by Charles F. Richter and Beno Gutenberg, uses orders of magnitude. Each step up means the earthquake has ten times the ground motion effect and releases 32 times as much energy. It provides a shortcut for showing size differences between seismic events. Most earthquakes are at the bottom end, too small to notice, while events at 8 or higher are rare (only one per year). Richter designed it for segregating “large, moderate, and small shocks” based on instrumental indications, freeing it from subjective estimates. This highlights how orders of magnitude simplify the comparison of vast scales.
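Both the spending comparison and the Richter energy step reduce to quick arithmetic (the 10**1.5 factor is the standard seismological relation behind the “32 times as much energy” figure):

```python
SECONDS_PER_DAY = 24 * 60 * 60            # 86,400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

# Spending $1 every second:
days_for_million = 1_000_000 / SECONDS_PER_DAY
years_for_billion = 1_000_000_000 / SECONDS_PER_YEAR
print(round(days_for_million, 1))    # 11.6 days to spend $1 million
print(round(years_for_billion, 1))   # 31.7 years to spend $1 billion

# Each whole Richter step multiplies ground motion by 10
# and energy release by about 10**1.5.
print(round(10 ** 1.5, 1))           # 31.6
```

Three orders of magnitude separate a million from a billion, yet “days versus decades” conveys the gap far more viscerally than the raw zeros do — which is exactly the value of the notation.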
Surface Area
Surface area is defined as the amount of something in contact with or able to react to the outside world. This model helps recognize when increasing exposure is beneficial and when it’s problematic. In chemistry, greater surface area means faster reactions (e.g., powdered sugar dissolves faster than a cube, small sticks start fires faster than logs). In biology, organisms evolve specific surface-area-to-volume ratios for different aims (e.g., lungs for oxygen absorption, cold-region animals having lower ratios to reduce heat loss). Applied more broadly, surface area can refer to the number of dependencies or assumptions something has; less surface area means more robustness (e.g., code with fewer dependencies).
Circus Schools and Increasing Creativity
The history of circus development provides an excellent example of how increasing surface area can spur innovation. Historically, circus acts stagnated because performers were from insular “family systems” (closed systems), isolated from the outside world, mindlessly duplicating past work despite increasing technical ability. This small surface area of interaction limited creative reactions.
However, the Russian Revolution and the subsequent establishment of a Russian circus school in 1916 proved momentous. This school adopted an interdisciplinary approach, teaching not just traditional techniques but also philosophy, physics, math, and chemistry to “develop their intellects” as sources of inspiration. The state also invited artists from other disciplines and created “circus ‘labs’” for developing new methods and equipment. This new, multidisciplinary education (increased surface area) led to an explosion of creativity in the Soviet circus, producing unparalleled artistry and professional polish. Their shows toured globally, influencing other countries to open similar schools, like the French national circus school, which expects students to create new work. This demonstrates that increasing our knowledge surface area through exposure to diverse disciplines is a powerful solution for overcoming creative stagnation and fostering innovation.
Guerilla Warfare
Sometimes, reducing surface area is crucial for security, making one less vulnerable to influence, manipulation, or attack. In internet security, a smaller attack surface area (fewer opportunities for unauthorized access) reduces the risk of breaches. Historically, this is seen in the narrow slit windows of medieval fortifications or walled cities with few guarded entrances, concentrating defense.
Guerilla warfare is an offensive strategy that inherently uses a small surface area. Guerillas operate in small, autonomous, mobile units, and are not attached to occupying or holding territory. This provides minimal surface area for adversaries to attack, making them “intangible, invulnerable, without front or back,” as T. E. Lawrence described. Max Boot explains that having “no cities, crops, or other fixed targets to defend” makes them hard to deter. Fidel Castro’s guerilla group in Cuba in the 1950s exemplifies this. Starting with only about 20 men, their small tactical units chipped away at Batista’s infrastructure by attacking vulnerable, isolated units or unguarded communication/supply chains. Operating from mobile bases in hard-to-access mountains, they evaded capture and minimized their attack surface. Their success highlights how a small surface area can be both a defensive and offensive strategy, particularly for the weak against the strong.
When You Can’t Tell the Whole Truth
This sidebar uses maps to illustrate the dangers and opportunities of reducing surface area. All maps present “a chosen aspect of reality,” necessarily omitting details and nuance to be useful. This reduction of information points is a conceptual “surface area.” Mark Monmonier explains that “A good map tells a multitude of little white lies; it suppresses truth to help the user see what needs to be seen.” The London Underground (Tube) map, designed by Harry Beck, is a prime example. Despite its initial derision for ignoring geographical accuracy, Beck’s radical design (simple colored lines, equal station distances, 45-degree angles for intersections) became iconic. It reduced the “surface area” of London’s complex reality to a few essential points for a single purpose: navigation. This success shows that simplifying content by omitting confusing or distracting details is crucial for effective communication.
Global and Local Maxima
The model of global and local maxima refers to the largest value of a mathematical function over its entire domain (the global maximum) and to peak values within smaller neighborhoods (local maxima). It’s about knowing when you’ve hit a peak in life or a system, and when there’s still potential to go higher. Maxima occur at critical points where change reverses: values increase before the peak and decrease after it. The “hill climbing” algorithm in computer science, used for searching a space of solutions, is a metaphor for this: the goal is to reach the highest peak in a “landscape with hills and valleys.”
The value of this model is that it pushes us to consider if and how we can do better, even when things are going well. Often, we are just at a local maximum, meaning things are as good as they can get with the current structure. Reaching a higher global maximum requires change—acquiring new knowledge, adopting new methods, or being willing to “go down” into a valley to become a neophyte in some areas before climbing a new hill. It also involves stepping back to broaden one’s view and reassess direction. As we learn new skills, partner with new people, or make big jumps in optimization, we begin climbing toward the next maximum.
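The hill-climbing idea can be sketched in a few lines of code. This is our own illustration, not an algorithm from the book (the landscape function is invented for the example): a greedy climber stalls at the first peak it finds, and only a bigger jump, here a set of restarts from new starting points, reaches the higher one.

```python
import random

def hill_climb(f, x, step=0.1, iters=1000):
    """Greedy hill climbing: move to a nearby point whenever it scores higher.
    The climber stalls at the first peak it reaches, a local maximum."""
    for _ in range(iters):
        candidate = x + random.uniform(-step, step)
        if f(candidate) > f(x):
            x = candidate
    return x

def landscape(x):
    # Invented terrain with a low peak near x = -0.94 and a higher one near x = 2.06.
    return -((x + 1) ** 2) * ((x - 2) ** 2) + x

stuck = hill_climb(landscape, x=-1.5)   # climbs the nearby low hill and stops there
restarts = [hill_climb(landscape, x=s) for s in (-2.5, -1.0, 0.0, 1.0, 2.5)]
best = max(restarts, key=landscape)     # a "bigger jump" reaches the global peak
```

Restarting from scattered points is the code's analogue of stepping back into a valley: some restarts land lower than the current peak, but one of them opens the path to the global maximum.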
Optimization
This sidebar uses an analogy to explain optimization through minima. To find the lowest point in a town (the global minimum), a small ball (a basketball) might get stuck in a local minimum (the lowest point on one street). A giant ball (a quarter mile in diameter) might settle somewhere closer to the global minimum, but its large scale prevents it from fitting into the true lowest point. The lesson: make big changes first, before optimizing details. The large ball finds the general area; the smaller ball then fine-tunes. This illustrates that local minima (or maxima) can be traps, and sometimes a “bigger jump” (changing the scale of optimization) is needed to find the true global optimum. The ball’s rolling direction provides feedback on the direction of change.
Navigating the Hills
The development of a new product, from conception to market success, often involves navigating hills and valleys, reflecting local and global maxima. The story of the sports bra, invented by Lisa Lindahl and Polly Smith in 1977, is a great example. Lindahl, a runner, needed support that minimized breast movement, which led Smith to construct a prototype. But turning a one-off prototype into a business (Jogbra) required figuring out production, sales, logistics, and marketing—a continuous learning cycle of “gaining information, then accruing the knowledge to apply it correctly,” full of mistakes.
Lindahl’s personal journey also involved pushing past her own local maxima, including going back to college in her late twenties and managing her epilepsy. She made a strategic decision to sell the Jogbra in sporting goods stores (rather than lingerie sections) as a piece of athletic equipment, recognizing the unprecedented number of women entering sports due to legislative changes. This was a “big ball” decision, despite initial resistance from male sales reps. The company achieved profitability quickly, reaching an early local maximum. However, growth meant constant learning, including making mistakes with product lines (poor naming, bad colors), but also successes (men’s line). This experimentation, with its inevitable failures, is part of achieving success, showing that “going down sometimes is a part of going up.” The looming need for capital eventually led to selling Jogbra to Playtex, which for Lindahl, was another “hill and valley that led to her exploring other life maxima.” She reflects that what seems like a pinnacle is often just the “floor of your next level.” This model encourages stretching oneself, taking risks, and embracing failures to reach full potential.
Using New Partnerships to Optimize
The global and local maxima model helps identify when and how to find a higher peak. Sometimes, it’s about fine-tuning (basketball analogy), but other times, it requires a “giant ball” approach—changing the scale of optimization. For rock bands, this means ensuring the optimal people are in the group before tinkering with image or style. The success of Queen (Freddie Mercury, Brian May, John Deacon, Roger Taylor) was the product of years of experimentation, development, and many failed bands. Each member’s prior band experiences (e.g., John Deacon’s Opposition, Roger Taylor’s Reaction, Brian May’s 1984 and Smile, Freddie Mercury’s Ibex and Sour Milk Sea) were local maxima—they gained experience in stage setup, sound checks, and band dynamics.
When May, Taylor, and Mercury formed a new band in 1970, they systematically sought out the right bassist, finding John Deacon. This was a “big ball” decision. The emergent properties of their combined musical talent were unpredictable until they started playing together. Through continuous learning and a willingness to incorporate feedback, they refined their sound and brand, openly soliciting comments on performances and not fearing criticism. This iterative process of optimizing to reach their global maxima led Queen to become one of the most dynamic and memorable rock bands. The model suggests that major changes (like band members) are crucial before optimizing details (like individual chords), and that success is a path of peaks and valleys, requiring short-term sacrifices for long-term gains.
Now What?
The “Now What?” section serves as a call to action and a guide for applying the 50 mental models discussed across the three volumes. The authors emphasize that merely reading about models is insufficient; wisdom comes from putting them to the test. Readers are encouraged to:
- Pick a model (e.g., one per week) and apply it to their lives, observing what looks different.
- Record observations and reflect on experiences, noting different choices made and improved outcomes, and learning from mistakes.
- Build a latticework by noticing connections between models and how some are best paired together.
- Embark on a lifelong journey of learning and practice, eventually integrating these models into their thinking such that it becomes “impossible to view any situation without the valuable lenses they provide.”
The ultimate goal is to improve lives by seeing the world as it is and working with its fundamental principles, leading to better decisions and a more meaningful life.
Key Takeaways
The Great Mental Models, Volume 3 offers profound insights into how systems and mathematical concepts underpin our reality, providing a toolkit for sharper thinking and better decision-making.
The core lessons readers should remember are:
- Systems are constantly in flux: They are rarely static and require continuous adjustment. Understanding feedback loops helps us see how information influences behavior, while concepts like equilibrium, churn, and the law of diminishing returns show that systems are always changing and adapting.
- Complexity arises from simple rules and interactions: Complex Adaptive Systems demonstrate how simple components can self-organize to produce sophisticated collective behavior and adaptation, making direct control often impossible but indicating the power of understanding underlying rules.
- Scale fundamentally alters dynamics: What works at one size may not work at another. Growth is often nonlinear, introducing new problems and dependencies that require a complete rethinking of the system.
- Leverage is found in identifying key points: Bottlenecks represent limiting factors that, when addressed, can unlock significant improvements. Identifying these “zeros” in a multiplicative system is crucial, as they can negate all other efforts.
- Luck and skill intertwine, and averages prevail: Regression to the mean teaches us not to overemphasize outlier successes or failures, as results tend to revert to an average performance over time.
- Truths are often partial and contextual: Distributions highlight that reality rarely fits neat models, and sampling biases can lead to inaccurate conclusions, underscoring the need for representative data.
- Creativity and progress often involve randomness and iteration: Embracing the unpredictable, as in the “randomness” of creative processes or the “iterations” of Kandinsky’s art, can lead to breakthroughs. Similarly, “multiple discoveries” show that innovation is a collective, often simultaneous, effort.
- There are many paths to an equivalent outcome: Equivalence teaches that being different doesn’t mean being unequal, and problems often have diverse, yet equally effective, solutions.
- Strategic intervention is key: Understanding these models helps us choose where and how to intervene in systems, whether by increasing “surface area” for innovation or reducing it for security, or by making “big ball” changes before optimizing details to reach global maxima.
Next actions readers should take immediately:
- Choose a model and apply it actively: Don’t just read; pick one mental model (perhaps one per week) and consciously look for its presence and application in your daily life, work, and relationships.
- Journal your observations and reflections: Document what you notice when using a model, how it changes your perspective, and what decisions you make differently. Note both successes and failures to deepen your understanding.
- Seek connections between models: As you practice, actively look for how different models interrelate and provide enhanced insights when used in combination (e.g., scale and bottlenecks, emergence and feedback loops).
Reflection prompts:
- What recurring problems in your life or work might be symptoms of an unaddressed “zero” in a multiplicative system, and what’s the smallest change you could make to turn it into a “one”?
- Where are you currently operating at a “local maximum” in your personal or professional life, and what short-term “descent” might be necessary to open the path to a higher “global maximum”?
- In what areas of your life do you consistently expect linear returns, but are actually encountering diminishing returns, and how might acknowledging this shift your strategy?