
Evidence-Guided: Creating High-Impact Products in the Face of Uncertainty
In “Evidence-Guided: Creating High-Impact Products in the Face of Uncertainty,” Itamar Gilad presents a groundbreaking framework known as GIST (Goals, Ideas, Steps, Tasks) to revolutionize product development. Gilad, drawing from his extensive experience at tech giants like Google and Microsoft, challenges the traditional, opinion-based approach to product creation, which often leads to wasted resources and unfulfilled promises. This book offers a practical, actionable roadmap for product managers, UX designers, engineers, and leaders at all levels to build products that truly matter by making decisions based on evidence, not just intuition or rank. Through clear explanations, compelling real-world examples (including his own experiences with Google+ and Gmail’s Tabbed Inbox), and a structured methodology, Gilad promises to equip readers with the tools to navigate uncertainty, foster innovation, and consistently deliver value to both businesses and customers. This summary will comprehensively break down every important idea, example, and insight from the book in clear, accessible language, ensuring nothing significant is left out.
Introduction: Why Evidence-Guided?
The introduction sets the stage by immediately highlighting a critical flaw in common product development practices: the reliance on opinions over evidence. Gilad recounts his experience at Google, where a seemingly “strategic” project like Google+ integrations for Gmail consumed vast resources but ultimately failed to gain user traction, leading to its eventual discontinuation. This failure starkly illustrates the difference between opinion-based development, which places “blind bets on a roulette table,” and evidence-guided development, which “tilts the odds in their favor.”
Gilad argues that while human judgment is powerful, it’s deeply flawed when faced with complexity, uncertainty, and information overload. Our brains fall prey to cognitive biases like confirmation bias and sunk cost fallacy, leading to overconfidence in bad decisions. Group discussions and relying on senior leaders often exacerbate these problems, as seen in the high failure rates of features in most products (research suggests most features get little or no traction) and A/B experiments (where at most one in three ideas shows measurable improvement). Evidence-guided development, in contrast, acknowledges this uncertainty and uses facts and information (evidence) to confirm or refute assumptions, much like in science, medicine, and law. It’s about supercharging judgment rather than ceding decisions to data, leading to faster development, improved resource efficiency, reduced planning time, suppressed politics, increased trust, and empowered teams. This approach is characteristic of truly successful product companies like Amazon, Netflix, and Apple.
To help readers assess their own organizations, Gilad introduces the GIST Scorecard, a self-assessment tool structured around the four pillars of his model: Goals, Ideas, Steps, and Tasks. This scorecard allows readers to gauge their current practices and identify areas for improvement, setting the stage for the detailed exploration of the GIST model throughout the book.
Chapter 1: From Opinions to Evidence
This foundational chapter directly contrasts opinion-based development with evidence-guided development, illustrating the latter through a compelling personal anecdote. Gilad begins by recounting his time working on Google+ integrations within Gmail. Despite Google’s strategic “all-in” push on social, the Gmail team’s efforts to integrate features like sharing photos and filtering messages by “Circles” ultimately failed. Users largely disregarded the features, and Google+ itself, despite seemingly strong user growth on paper, struggled with active engagement and was eventually shut down. This, Gilad explains, was a classic case of opinion-based development, characterized by picking a promising idea, turning it into a fixed plan, and then executing. This often results in a Planning Waterfall, where decisions are made based on intuition, consensus, and rank, leading to a high rate of hidden failure.
Gilad then introduces the alternative: evidence-guided development, which he experienced firsthand at Google, even though the Google+ project didn’t follow this path. He highlights Google’s historical principles: “Start with the user,” “Let a thousand flowers bloom,” “Think big, but start small,” and “Fail fast.” These principles underpin a system that acknowledges uncertainty and seeks to improve the success-to-failure ratio by supercharging human judgment with evidence. He defines evidence as facts and information that confirm or refute assumptions, differentiating it from mere data.
To make evidence-guided development actionable, Gilad developed the GIST model: Goals, Ideas, Steps, and Tasks. This framework breaks down the process into four interconnected layers:
- Goals: Define measurable outcomes for customers and the business.
- Ideas: Hypotheses for achieving those goals, recognizing that most ideas are not good and need testing.
- Steps: Short projects or activities designed to develop and test ideas, often without extensive coding, to gather supporting evidence.
- Tasks: The day-to-day work items, managed by agile methods, that connect to steps, ideas, and ultimately, goals.
He then tells the story of the Tabbed Inbox in Gmail as a concrete example of GIST in action. Initially, Gilad had an opinion-based idea for managing “inbox clutter.” However, his managers pushed for a goal-oriented approach: understanding if clutter was a widespread problem and defining measurable success. Through initial data analysis (revealing many users didn’t manage their inboxes) and user interviews (showing the severity of clutter for casual users), they formulated the objective: “help casual users interact only with the messages they want to interact with,” with clear success metrics (accuracy of prediction, no missed important messages, high engagement).
The team then explored multiple ideas (teaching users to clean, grouping messages, digests, tabs). Instead of debating, they moved to testing the riskiest part: the user experience. This involved a usability lab test with a “sham” HTML facade and manual categorization, revealing overwhelming positive feedback for the tabbed inbox. This strong evidence led them to “double-down” on the tabbed inbox. They didn’t go “all-in” but rather engaged in a series of steps: rudimentary versions, Fishfood (team testing), Dogfooding (testing with thousands of Googlers), and an optional Labs feature for broader user feedback, continuously improving algorithms and UI. The project adapted, even expanding to mobile, driven by the evidence collected. The final launch in 2013, despite lukewarm press reviews, was a success due to prior testing and data confirming strong user value.
The chapter concludes by revisiting the Tabbed Inbox story through the lens of GIST:
- Goals: Shifted from “launch Itamar’s new inbox” (output) to “measurably improving the inbox experience of casual Gmail users” (outcome).
- Ideas: Based on research, multiple ideas were considered, evaluated, and then tested. The Confidence Meter is introduced as a tool to evaluate evidence strength.
- Steps: The idea was built iteratively through “build-measure-learn loops,” generating evidence at each stage to gain confidence and course-correct.
- Tasks: The team was highly involved in discovery as well as delivery, managing work through a GIST board (to be detailed later) that connected tasks to goals.
Gilad emphasizes that GIST is a meta-framework combining existing methodologies and tools, but its power lies in bringing evidence-guided thinking into development and overcoming traditional mindsets. He sets up the rest of the book as a guide to adopting this transformation, acknowledging that it’s challenging but ultimately worthwhile.
Chapter 2: Goals
This chapter delves into the critical first layer of the GIST model: Goals. Gilad argues that many companies suffer from a “cyclical planning ritual” where teams debate “what should we do” (ideas/output) without first agreeing on “what should we achieve” (goals/outcomes). This leads to misalignment, decisions based on negotiation or seniority, and a focus on “executing the plan” rather than delivering actual value. Evidence-guided companies, in contrast, define firm, measurable outcome goals and then adapt plans as new information emerges, empowering teams to discover the best path.
Gilad introduces Objectives and Key Results (OKR) as the primary methodology for expressing goals, tracing its origin to Andy Grove at Intel in the 1970s. An OKR has three parts:
- Objective: A short, aspirational statement describing the desired end state (e.g., “All customers onboard quickly and successfully”). It should be inspiring but not necessarily measurable or fully feasible.
- Key Results (KR): 2-5 measurable targets that define success for the current goal cycle, including current value and target (e.g., “Reduce average onboarding time from 30 days to 4 days”). KRs focus on outcomes (measurable improvements), not outputs (things to do).
- Context (optional but helpful): Explains why the goal is important, relevant evidence, and other useful information.
He emphasizes that OKRs should guide what to achieve, not how to achieve it. Baking specific product ideas into OKRs leads to an output focus, hindering agility and team empowerment.
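The three-part OKR structure above can be sketched as a small data model. This is a minimal illustration, not anything prescribed by the book; the example values echo the chapter's onboarding example, while the class and field names are my assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class KeyResult:
    """A measurable outcome target for the goal cycle (an outcome, not an output)."""
    metric: str
    current: float
    target: float

@dataclass
class OKR:
    objective: str                                               # aspirational; not necessarily measurable
    key_results: list[KeyResult] = field(default_factory=list)   # typically 2-5 per objective
    context: str = ""                                            # optional: why this goal matters

# The chapter's onboarding example, expressed in this sketch:
okr = OKR(
    objective="All customers onboard quickly and successfully",
    key_results=[KeyResult("Average onboarding time (days)", current=30, target=4)],
)
```

Note that the key result carries both the current value and the target, as Gilad recommends, so progress is always measured against a stated baseline.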
To choose the most important outcomes, Gilad proposes the Value Exchange Loop. Every organization must both deliver value to its market (users, customers, partners) and capture value back (revenue, market share, data, attention). Success comes from a virtuous loop where giving more leads to capturing more, which in turn fuels more value delivery.
This dual mission is measured by two top-level metrics:
- The North Star Metric (NSM): Measures how much value we deliver to the market. It sums up the core value (e.g., WhatsApp: Messages sent per month, Airbnb: Nights booked per month). A good NSM is as close as possible to the core value experience, an aggregate number (not a rate or ratio), and simple and memorable. It can change as the company evolves.
- The Top Business Metric: Measures how much value we capture (e.g., revenue, paying customers, monthly active users). Companies should choose one most important business metric for focus.
Gilad illustrates how a CEO can present the company’s mission as a yearly goal, using the NSM and Top Business Metric as key results (e.g., “Help enterprise employees express themselves better with smart docs” with KRs for documents created and yearly revenue). This clarity helps the organization focus and align.
Beyond these top-level metrics, Gilad introduces Metrics Trees (or graphs). The NSM and Top Business Metric are often trailing indicators, so they need to be broken down into submetrics (input metrics) that are easier to influence and reflect changes more quickly. The tree shows how various lower-level metrics contribute to the top ones (e.g., Documents created = Monthly active users x Docs created per user). This helps define team ownership and mission (Local North Star) and reduces dependencies. He stresses the importance of an overlap between the two metrics trees, which fosters business awareness in product teams and customer focus in business teams, aiding collaboration.
Gilad also stresses the importance of supplementary goals that fall outside the main metrics trees. These address crucial aspects like code health, user privacy, employee wellbeing, company culture, environmental impact, internal operations, and compliance. Allowing teams to create such goals and encouraging employees to float missing company goals upward helps maintain overall organizational health.
Creating Alignment is achieved through a three-step OKR process at the end of each goal cycle:
- Top-down goals (yearly): Company leadership publishes draft OKRs (3-6 objectives, 2-5 KRs each) for the year, with context. Some KRs might be “To Be Determined,” inviting bottom-up input.
- Bottom-up goals (quarterly): Product teams (and other departments) propose their own OKRs in response to company goals. These are team-level goals, often set by a “Trio” (product manager, engineering lead, designer). They adapt company objectives or create their own, with KRs aligned to their responsibilities (e.g., customer onboarding team reducing onboarding time and support tickets). Shared goals across teams are encouraged for collaboration.
- Finalizing (quarterly): Team leads review draft goals with their managers and stakeholders, adjusting targets or KRs through discussion and mutual agreement, not dictation. Managers facilitate finding similarities or conflicts between teams. The final review with company leadership ensures alignment and may surface new company-level objectives. Andy Grove’s principle that at least 60% of KRs should be invented bottom-up is highlighted.
The chapter concludes by reiterating that this shift to outcomes-based leadership is transformative, providing clarity, strategic alignment, and tactical independence, ultimately empowering people to discover the best ways to achieve their goals.
Chapter 3: Ideas
Chapter 3 confronts a sobering truth about product development: most ideas are not good. Gilad cites extensive research from companies like Microsoft, Slack, Netflix, VWO, and Booking.com, which consistently show that only a minority of tested product ideas (ranging from 8% to 35%) actually lead to measurable improvement. This high failure rate means that the majority of what’s often in a roadmap or product backlog is not worth doing.
The problem is compounded by humans’ inability to reliably pick the best ideas. We are susceptible to cognitive biases (anchoring effect, sunk cost fallacy, false consensus effect) and rely heavily on intuition and superficial evidence. Traditional prioritization, often a high-stakes and political “battle of opinions” decided by senior managers, doesn’t improve success odds; in fact, seniority can lead to overconfidence.
Gilad advocates adopting the scientific method for ideas, inspired by Linus Pauling’s quote: “If you want to have good ideas you must have many ideas. Most of them will be wrong, and what you have to learn is which ones to throw away.” This involves two key changes:
- Generate more and better ideas through continuous research: This includes user interviews, data analysis, competitor evaluation, market trend tracking, and daily product use. Group ideation sessions and design sprints can help process findings.
- Use evidence to pick ideas: Instead of opinions, ideas are evaluated quickly, picked for testing, validated, re-evaluated based on results, and then either parked or built and launched. This creates a steady flow of ideas through an engine of discovery and delivery.
To keep ideas organized, Gilad introduces the concept of an idea bank. This is a repository (spreadsheet, database) for all ideas; it is not a backlog, meaning not every idea in it is expected to be implemented. Its purpose is to track and compare ideas and to remind the team that there are always options. Each product team manages its own idea bank, visible to all, where new ideas are triaged by the owner (typically the PM) to accept or park. Ideas then move between Candidate (investigate further) and Parked states. A smaller Working-Set of 3-5 ideas per key result is actively pursued for testing.
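As a sketch, the idea-bank lifecycle described above might be modeled like this. The state names follow the text; the `Idea` class, the cap constant, and the helper function are my assumptions:

```python
from dataclasses import dataclass
from enum import Enum

class IdeaState(Enum):
    NEW = "new"              # awaiting triage by the owner (typically the PM)
    CANDIDATE = "candidate"  # accepted: worth investigating further
    PARKED = "parked"        # kept for later; the bank is not a backlog
    WORKING = "working"      # in the active working-set, being tested

MAX_WORKING_SET = 5  # the book suggests 3-5 active ideas per key result

@dataclass
class Idea:
    name: str
    state: IdeaState = IdeaState.NEW

def promote_to_working(working_set: list, idea: Idea) -> bool:
    """Move a candidate into the working-set if there is room."""
    if idea.state is IdeaState.CANDIDATE and len(working_set) < MAX_WORKING_SET:
        idea.state = IdeaState.WORKING
        working_set.append(idea)
        return True
    return False
```

The point of the cap is focus: ideas beyond the working-set stay visible in the bank rather than silently accumulating as commitments.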
For evaluating ideas, Gilad champions ICE scores (Impact, Confidence, Ease), where ICE Score = Impact * Confidence * Ease. This numerical quantification facilitates a structured discussion focused on:
- Impact: An estimate (0-10) of how much the idea will improve the target metric (e.g., company’s North Star Metric, team’s Local North Star, or a specific Key Result). It’s the hardest to estimate and can be done via guesstimates, referencing past ideas, assessment/fact-finding (like the mobile deployment example showing how initial high impact can be downgraded with more data), back-of-the-envelope calculations, simulations, or ultimately, tests.
- Ease: An estimate (0-10) of how hard/easy it is to implement the idea fully, inversely related to effort (person-weeks). Can be guessed, compared to past efforts, or estimated by breaking down the project.
- Confidence: The antidote to the planning fallacy, measuring how sure we are that the idea will have the expected Impact and Ease. This is based on supporting evidence, whose strength is crucial.
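The ICE formula above is simple enough to state directly in code; this is a minimal sketch, and the sample values are illustrative rather than taken from the book:

```python
def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE Score = Impact * Confidence * Ease.

    Impact and Ease are 0-10 estimates; Confidence (also 0-10) comes from
    the strength of the supporting evidence, via the Confidence Meter.
    """
    return impact * confidence * ease

# Illustrative values (my assumptions): a high-impact, hard-to-build idea
# backed only by opinion scores low because confidence dominates the product.
print(ice_score(impact=8, confidence=0.16, ease=2))  # 2.56
```

Because the three factors multiply, a near-zero confidence drags down even the boldest impact estimate, which is exactly the discipline Gilad is after.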
Gilad introduces the Confidence Meter, a thermometer-like tool (0-10 scale) that categorizes and ranks common types of evidence by strength:
- Self-Conviction (0.01): Personal strong feeling.
- Pitch Deck (0.05): Idea documented.
- Thematic Support (0.05): Aligned with trends, vision, strategy.
- Others’ Opinion (0.1): Consensus from colleagues, managers, experts.
- Estimates and Plans (0.3): Based on calculations, detailed cost estimates (can refute initial guesses).
- Anecdotal Evidence (0.5-1.0): Few data points, customer interest, sporadic sales requests, competitor features.
- Market Data (1.0-2.0): Surveys, smoke tests, competitive analyses (larger samples, simple tests).
- User/Customer Evidence (2.0-3.0): Significant product/support data, interviews, usability studies, concierge tests (stronger, but can still produce false signals).
- Test Results (3.0-5.0): Mid-level tests (longitudinal studies, alphas) and late-stage tests (A/B experiments, betas).
- Launch Data (5.0-10.0): Consistent usage, positive feedback, target metric improvement after full launch (strongest evidence).
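A rough way to model the meter is a weight per evidence type, summed as evidence accumulates, which mirrors how the book's worked examples add confidence gains step by step. Where the scale above gives a range, the code picks the low end; the cap at 10 is my assumption:

```python
# Confidence Meter weights from the scale above (low end of each range).
CONFIDENCE_WEIGHTS = {
    "self_conviction": 0.01,
    "pitch_deck": 0.05,
    "thematic_support": 0.05,
    "others_opinion": 0.1,
    "estimates_and_plans": 0.3,
    "anecdotal_evidence": 0.5,      # up to 1.0
    "market_data": 1.0,             # up to 2.0
    "user_customer_evidence": 2.0,  # up to 3.0
    "test_results": 3.0,            # up to 5.0
    "launch_data": 5.0,             # up to 10.0
}

def confidence(evidence: list[str]) -> float:
    """Sum the weights of the evidence collected so far, capped at 10."""
    return min(10.0, sum(CONFIDENCE_WEIGHTS[e] for e in evidence))

# Opinion-only backing barely registers on the 0-10 scale:
print(round(confidence(["self_conviction", "thematic_support", "others_opinion"]), 2))  # 0.16
```

The shape of the scale is the message: all the opinion-based evidence in the world sums to less than a single usability study.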
He emphasizes that you don’t need every type of evidence for every idea; the level of validation needed depends on cost and risk. Using the Confidence Meter to calculate a confidence score for the mobile deployment example, he shows how an initial high impact estimate (8) based on opinion could drop to a lower impact (4), with ease unchanged (2), but with a somewhat higher confidence (1.96) after initial analysis and anecdotal evidence.
Gilad highlights that ICE helps banish the battle of opinions by shifting discussions from “is it good?” to “what’s its impact, ease, and how confident are we?” This makes discussions more productive, transparent, and grounded in reality.
When choosing among multiple ideas, merely sorting by the total ICE score can be misleading, especially with low confidence. Instead, he suggests splitting candidate ideas into three groups by confidence level (low, medium, high) and picking ideas within each group, using judgment and ICE values as hints. The real magic, he concludes, is not just ICE, but continuous research, evaluating many ideas, and putting promising ideas to the test, which is the subject of the next chapter.
Chapter 4: Steps
Chapter 4 introduces the crucial “Steps” layer of the GIST model, which brings the Build-Measure-Learn loop into practical application. Gilad opens with the anecdote of Greg Linden’s secret A/B test at Amazon in 1998 for checkout recommendations. Despite strong objections from a senior VP who feared distractions, Linden ran the test and proved that the feature generated significant extra spend. This story powerfully illustrates how learning and evidence-guided decisions counteract reliance on opinions, preventing false positives and negatives.
Gilad criticizes the common “launch-and-iterate” approach, where the full cost of building an idea is paid upfront before any user feedback, leading to limited ability to correct or abandon flawed ideas post-launch. The alternative is to test early and often, a core principle of Design Thinking, Product Discovery, Growth Marketing, and Lean Startup.
In GIST, Steps are defined as activities or mini-projects designed to develop an idea somewhat (sometimes just in concept) and test it. They are “mini-projects” that combine elements of development and validation. The Tabbed Inbox story from Chapter 1 is revisited as an example of effective step progression: data analysis, user interviews, usability tests, dogfooding, and partial launches. Each step yields supporting evidence, directions for improvement, and a somewhat more complete version of the feature, building confidence and justifying further investment (Evidence → Confidence → Investment). Steps are designed to be short and minimal, starting with quick analyses and progressing to longer, more developed tests. The final step is the Launch (Delivery) of the full, production-ready feature. This iterative process, though sometimes requiring “throwing away some code and designs,” is superior because it allows for discovering mistakes early when they are cheap to fix. The idea bank serves as a knowledge base, linking to past steps, results, and insights.
To illustrate step progression in detail, Gilad presents the “Chatbot and the Dashboard” example, where a product team considers two major ideas for a small business support product aimed at increasing “interactions carried out between customers and end users.”
Step 1: Triage
- Initial guesstimates for Impact, Confidence, and Ease (ICE scores).
- Chatbot: High Impact (8), Low Ease (2), but very low Confidence (0.16) based on self-conviction, thematic support, and others’ opinion.
- Dashboard: Medium Impact (4), Medium Ease (4), but slightly higher Confidence (0.61) due to anecdotal customer requests.
- Outcome: Dashboard initially looks better, but confidence is too low for commitment.
Step 2: Estimates
- Engineering and Design provide rough effort estimates, and a simple model estimates impact.
- Chatbot: Ease drops to 3, Impact stays 8. Confidence gains +0.3 for “estimates and plans,” total confidence 0.46.
- Dashboard: Ease stays 4, Impact drops to 3. Confidence gains +0.3, total confidence 0.91.
- Outcome: Chatbot closes the gap in ICE score, but both still have low confidence.
Step 3: Fake Door Test (Smoke Test)
- Demand for both features is measured by prompting users to “opt-in” to non-existent features.
- Chatbot: 77% clicked, 85% notified (strong demand). Confidence gains +1.0 for “market data,” total confidence 1.46.
- Dashboard: Only 16% clicked, 64% notified (mixed demand). Confidence gains +0.5 for “market data,” total confidence 1.41.
- Outcome: Chatbot takes a strong lead, but “fake door” results are not conclusive proof.
Step 4: Usability Test (Wizard of Oz Test for Chatbot)
- Interactive prototypes are shown to existing customers, revealing nuanced insights.
- Dashboard: 8/10 found it useful (higher value than expected). Impact bumped to 6. Confidence gains +2.0 for “user/customer evidence,” total confidence 3.41.
- Chatbot: 9/10 would use it, but significant usability issues and customer concerns emerged. Impact bumped to 9 but Ease reduced to 2 (more work needed). Confidence gains only +1.0 (half the full boost) for “user/customer evidence” due to mixed results, total confidence 2.46.
- Outcome: Dashboard takes a clear lead, with medium confidence. Team decides to test both again.
Step 5: Longitudinal Study
- Early, unpolished versions are built and trialed with users over two weeks.
- Chatbot: High drop-off, major disappointment. Concluded to be high risk, very high effort (Ease 1), low impact (0.5). Confidence gains +3.0 for “test results,” total confidence 5.46.
- Dashboard: Very good results, high daily usage, overwhelming positive feedback, grave disappointment when disabled. Impact bumped to 8. Ease is 5 (10 more weeks to launch). Confidence gains +3.0 for “test results,” total confidence 6.41.
- Outcome: Clear decision: Park the chatbot, launch the dashboard.
This story demonstrates Validated Learning: how systematic analysis of test results and recalculating ICE scores forces honesty and helps avert costly failures. Gilad emphasizes that Impact, Confidence, and Ease are more useful than the total ICE score (which can fluctuate). The goal is to reach at least medium-high confidence (3.0+ on Confidence Meter) before committing to full delivery, staging investment based on evidence. Most ideas will fail early, saving resources and allowing more ideas to be tested.
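Re-deriving the chatbot's numbers from the steps above shows why Gilad warns against steering by the total ICE score: it swings wildly as estimates are revised, while confidence climbs steadily toward a decision. The per-step values are taken from the step descriptions in the text; the replay itself is just an illustration:

```python
# (step name, impact, confidence, ease) for the chatbot, per the steps above.
chatbot_steps = [
    ("Triage",             8,   0.16, 2),
    ("Estimates",          8,   0.46, 3),
    ("Fake door test",     8,   1.46, 3),
    ("Usability test",     9,   2.46, 2),
    ("Longitudinal study", 0.5, 5.46, 1),
]

# ICE peaks at the usability test, then collapses once the longitudinal
# study exposes the idea's true impact and effort.
for name, impact, conf, ease in chatbot_steps:
    print(f"{name:18} ICE = {impact * conf * ease:.2f}")
```

Confidence (0.16 → 5.46) only ever grows as evidence accumulates, which is why it, not the fluctuating total, signals when the team knows enough to commit or park.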
Gilad then introduces the AFTER Model for choosing steps, categorizing validation methods from cheapest/least accurate to most costly/rigorous:
- Assessment: Internal evaluations (ICE, assumption mapping, PR/FAQ reviews).
- Fact-Finding: Data analysis, user interviews, surveys, competitive analysis.
- Tests: Putting ideas in front of users (Fake Door, Usability Study, Concierge, Longitudinal Study, Alpha, Early Adopter, Dogfood, Labs, Beta, Preview).
- Experiments: Controlled tests like A/B tests.
- Release Results: Gradual rollouts (Percent launches), holdback experiments, post-launch metric tracking.
He addresses Common Challenges with Evidence-Guided Development:
- “Too Slow”: Dispels this myth by arguing GIST reduces wasted work, eliminates procrastination and scope-creep, and focuses on “time-to-value” rather than “time-to-market.”
- Learning to Learn: Requires a scientific mindset, objective data analysis, and resisting biased interpretation of results.
- “What about the Roadmap?”: GIST shifts focus from fixed feature roadmaps to outcome roadmaps.
- Accumulating Technical and Design Debt: Acknowledges that interim steps cut corners, but insists on raising the quality bar for full launches, paying down debt.
- Keeping the Team Happy: Argues that projects built around steps are more fun and satisfying, fostering achievement and morale.
The chapter concludes by highlighting the profound shift from Launch-and-Iterate to Build-Measure-Learn, transforming organizations from output-focused to outcome-focused, which will be further explored in how it connects with tasks.
Chapter 5: Tasks
Chapter 5 addresses the growing disconnect between developers and the business, positioning Tasks (the day-to-day work items) as the critical link in the GIST model that bridges this divide. Gilad shares his experience with Mailtrack, a startup struggling with slow development, missed deadlines, and disengaged developers, much to its managers’ frustration. He identifies the root cause: this very divide between developers and the business.
Historically, software engineers handled requirements, design, and implementation, but modern development has led to specialized roles (PMs, UX, researchers, engineers) and a split into “Waterfall World” (executives, stakeholders focused on multi-quarter roadmaps and business goals) and “Agile World” (engineers, designers focused on 1-2 week sprints, story points, and code delivery). This often leads to a disengaged development organization acting like a hired contractor, lacking context, inflating estimates, and delivering unsatisfactory software. Product managers often find themselves stuck in the middle, administrating roadmaps and backlogs rather than truly leading.
Gilad argues that the solution is to build a shared view of the world where engineers and designers are connected to the business and user goals. In GIST, Tasks are the daily activities (Scrum sprint items, Kanban cards) that are managed by whatever agile method the team prefers, but they must be explicitly connected, through the GIST stack, to the Goals, Ideas, and Steps of the team and company. This provides full context and deep focus.
A crucial first change implemented at Mailtrack was the adoption of a Step Backlog. Instead of work items like “Complete API X,” the backlog consisted of idea validation steps (e.g., “run a usability test of features X and Y”) and also technical/design enhancement steps (e.g., reducing technical debt). This shifted the mindset from “eating the backlog” to achieving “mini-project launches” with defined users, enlisting the team in product discovery as well as delivery. At Mailtrack, this transparency led the engineering team to realize they were over-investing in quality work and to deprioritize some of their own ideas, demonstrating empowered decision-making. Managers, in turn, gained more rigor and accountability in their own idea generation.
Building on the Step Backlog, Gilad introduces the GIST Board, a physical or digital work management tool that visually links everything together:
- Goals on the left (quarterly outcomes, up to 4).
- Ideas in the middle (actively pursued ideas, 3-5 per KR).
- Steps on the right (next 2-4 steps per idea).
Tasks themselves are managed separately in existing tools like Jira, but are linked back to steps. The GIST board should be physically located near the team or displayed on a large screen and reviewed in regular (weekly/bi-weekly), mandatory 30-minute meetings with the entire team. The agenda includes: reviewing goal progress, discussing ideas, checking step status (ensuring results are collected and analyzed), and planning changes. This helps the team keep the bigger context in mind, track true progress, and serves as a transparent view for managers and stakeholders.
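A minimal sketch of the GIST board's linkage, with tasks held in an external tool and referenced by key. The class shapes, the KR wording, the ICE value, and the "PROJ-…" issue keys are all illustrative assumptions, not the book's:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    description: str
    task_refs: list[str] = field(default_factory=list)  # keys of tasks in e.g. Jira
    result: str = ""  # evidence collected once the step completes

@dataclass
class BoardIdea:
    name: str
    ice: float
    steps: list[Step] = field(default_factory=list)  # next 2-4 steps per idea

@dataclass
class Goal:
    key_result: str
    ideas: list[BoardIdea] = field(default_factory=list)  # 3-5 ideas per KR

# A one-goal board in the spirit of the chapter (contents are illustrative):
board = [
    Goal(
        key_result="Reduce average onboarding time from 30 to 4 days",
        ideas=[
            BoardIdea(
                "Guided setup wizard",
                ice=9.6,
                steps=[Step("Usability test of wizard mockups",
                            task_refs=["PROJ-101", "PROJ-102"])],
            )
        ],
    )
]
```

The key design point is the direction of the links: tasks stay in the team's agile tool, but every task traces back through a step and an idea to a goal, which is the shared context the chapter argues developers are missing.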
Managing the GIST Board should be a collaborative effort, not just the product manager’s responsibility. The Team Leads (Trio or Triad) – typically a Product Manager, Engineering Lead, and UX Designer – jointly manage the board and make key decisions. The rest of the team reviews and approves, with room for pushback.
- Choosing Goals: At the start of each quarter, the team copies its quarterly outcomes (KRs) into the board. The Trio partners to balance product, engineering, and design goals, guided by metrics.
- Choosing Ideas: Ideas from the idea bank are selected for the board based on how they best achieve quarterly goals, using ICE scores for product ideas. For technical/design ideas, leads prioritize based on clear, measurable benefits. The team (not managers) decides which ideas to pursue, focusing on testing first, acknowledging that most ideas fail, and using evidence transparently.
- Choosing Steps: For each idea, a sequence of steps is defined to both develop and validate assumptions. Examples show steps ranging from data analysis to A/B tests to final launch, acknowledging that plans often change. Steps can be parallel or sequential.
A key benefit of GIST is enabling Context, Not Requirements. Gilad shares an anecdote from Gmail where engineers and designers, empowered by deep understanding of the context, innovated a UI improvement without being asked for detailed requirements. The GIST board and regular reviews foster this shared understanding of the “why” (goals, hypotheses, evidence) and the “what” (through collaborative step development, discussions, and demos).
Planning and Executing Steps involves:
- Forming a small Step Force (relevant engineers, designer, PM, researcher) for each step.
- Kicking off with a whiteboard meeting to define: what to test, with whom, how to test (including mockups, sketches, user stories), what to measure, and what constitutes success (setting targets to avoid bias). Hypotheses statements (e.g., from Lean UX) are useful.
- Executing the step, with the Step Force collaboratively developing requirements and responsibilities, and ensuring results are analyzed to increase confidence.
Connecting Steps with Agile Development: The step backlog serves as the input for Scrum or Kanban. Teams break steps into smaller Tasks. While Scrum’s fixed sprints can be challenging with dynamic step backlogs, Gilad suggests one-week sprints or relaxing the “immutable sprint scope” rule. The need for fine-grained user stories diminishes as developers are actively involved in designing steps and understanding context.
The chapter concludes by highlighting how GIST gets everyone on the same page. Developers are no longer just focused on tasks, but understand their connection to steps, ideas, and goals, fostering engagement and shared ownership. Managers and stakeholders gain transparency and influence outcomes without micromanaging. Product managers are freed from administrative burden to lead product discovery and support delivery. This leads to better context, increased trust, and improved collaboration, returning to the “old way of working” seen in highly innovative companies like early Apple.
Chapter 6: The Evidence-Guided Company
Chapter 6 expands the GIST model from the team level to the entire organization, using the fictional but realistic case study of AcmeInvoice.ai, a midsize company specializing in machine-learning-powered invoice processing. AcmeInvoice experienced rapid growth but then suffered from process, politics, and mistrust, prompting a shift to an evidence-guided approach.
Product Strategy
AcmeInvoice’s past strategy involved “big bets” (long, expensive projects like a failed SaaS product) decided top-down, leading to wasted resources due to lack of clear target markets or product definitions. Now, leaders focus on strategic opportunities—market segments with strong, underserved needs. These opportunities are identified through research (customer, competitive, market) and vetted by the CPO. Promising opportunities are assigned to strategy squads (experienced cross-functional teams) who quickly validate them through research, business modeling, and early tests (customer interviews, surveys, fake door tests). Most opportunities prove less meaningful, but those with clear supporting evidence lead to the creation of new product teams (Strategic Tracks) focused on discovering product ideas with strong product/market fit potential. These teams use the GIST model (goals, ideas, steps like usability studies, concierge tests, early adopter programs).
A key takeaway is that AcmeInvoice doesn’t shy away from big ideas, but it adheres to “think big, but start small,” gradually increasing investment based on evidence and growing confidence. This is exemplified by the successful Travel Expenses strategic track, which identified a massive opportunity in enterprise travel expense reporting. The team quickly prototyped a solution based on existing technology, signed early adopter clients, and built a compelling business case. This evidence led to funding a larger Travel Expenses group (13 engineers, PMs, designers) with a specific year-end goal: turn 6 of 8 early adopter clients into paying reference customers uploading expenses regularly. This demonstrates how a large project can stem from validated opportunities and ideas.
Top Metrics and Metrics Trees
AcmeInvoice moved from a proliferation of KPIs to focusing on two main top-level metrics:
- North Star Metric (NSM): Number of documents processed per month (DPM), as it directly measures value delivered to customers (less human effort).
- Top Business Metric: Revenue, to fuel expansion and attract investors.
These metrics are broken down into Metrics Trees (or graphs), showing how various submetrics contribute. This year, DPM was split into Accounting DPM and Travel DPM to track the new strategic direction; because the Travel track didn’t yet have a revenue target, its primary business metric was active customers. AcmeInvoice also tracks health metrics (customer satisfaction, employee satisfaction, carbon footprint) beyond the main trees. The process of creating these trees, though tough, helped model growth and identify underperforming areas.
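The DPM breakdown above can be sketched as a tiny metrics tree, where a parent metric aggregates its submetrics. The class design and the numbers below are illustrative assumptions, not from the book:

```python
# Minimal sketch of a metrics tree; structure and figures are hypothetical.

class Metric:
    """A node in a metrics tree: a leaf holds a value, a parent sums its children."""
    def __init__(self, name, value=None, children=None):
        self.name = name
        self._value = value
        self.children = children or []

    @property
    def value(self):
        # A parent metric aggregates its submetrics; here by simple addition.
        if self.children:
            return sum(child.value for child in self.children)
        return self._value

# Hypothetical breakdown of the North Star Metric (documents processed per month).
dpm = Metric("DPM", children=[
    Metric("Accounting DPM", value=118_000),
    Metric("Travel DPM", value=7_500),  # new strategic track, no revenue target yet
])

print(dpm.value)  # total documents processed per month
```

Real metrics trees also hang conversion-rate and efficiency submetrics off each branch; the point is that every team-level metric rolls up into the NSM.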
Company Goals
Company-level goals are set yearly using OKRs, reviewed quarterly. AcmeInvoice’s executives focus on a sparse set of objectives (3 this year) with clear, outcome-based key results, avoiding the dozen-plus OKRs of the past.
- Objective 1 (Core Product): “Intelligent Invoice Processing That Accounting Firms Love” with KRs for DPM growth, revenue, service availability, security, and new market expansion. These are specific outcomes, not projects.
- Objective 2 (Strategic Initiative): “Find Product/Market Fit in Enterprise Travel Expenses Processing” with KRs for paying enterprise customers and processed expenses. It includes extensive context explaining why this opportunity is important and outlining the evidence gathered so far.
- Objective 3 (Self-Improvement): “Continuously Improve AcmeInvoice to be Modern, Efficient, and a Great Place to Work” with KRs for meeting time reduction, employee satisfaction, and launch frequency. This objective demonstrates the company’s commitment to internal transformation.
The company explicitly links to external documents for more context, reflecting its belief in transparency and scrutiny of decisions.
The Product Organization: Structure and Goals
AcmeInvoice’s product organization is structured into three layers of leadership: product teams, product groups, and overall product org, all striving for team empowerment and distributed decision-making.
- Product Teams: Composed of up to 10 engineers, a PM, and a UX designer. Led by a Trio (PM, UX designer, Engineering Tech Lead) responsible for steering the team. Teams develop expertise in their area and have their own goals and idea banks.
  - Functional area teams: Develop external-facing product parts (e.g., Invoice Ingestion, Reporting).
  - Platform/Service/Technology teams: Develop internal systems/tech (e.g., Machine Learning, Dev Tools).
  - Ad hoc teams: Temporary, cross-org teams for specific goals (e.g., strategy squads, strategic track teams).
- Product Groups: Aggregations of up to 5 product teams, grouped by customer type (e.g., accounting firms, clients of accounting firms, internal tools). No group-level OKRs, but directors connect dots and ensure alignment.
- Product Org Management: Co-led by a Trio (CPO, CTO, Head of Design). Their mission is to ensure smooth, efficient operation towards company goals. They focus on providing funding, headcount, context, and tools. They discourage separate disciplinary goals (e.g., “engineering goals”) to foster cross-functional collaboration.
Managing Ideas
AcmeInvoice shifted from time-consuming, political idea prioritization to pushing idea collection and prioritization to the teams. Each product team manages its idea bank, open to all for proposals, but triaged by the PM. Ideas are ranked by ICE scores, with Impact tied to the company’s North Star Metric (DPM) or Top Business Metric (Revenue). The introduction of Confidence (and the Confidence Meter) as a key element of ICE has defused hype, salesmanship, and pressure tactics, leading to shorter, more concrete discussions grounded in evidence.
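A minimal sketch of an ICE-ranked idea bank, assuming the common multiplicative form of ICE (Impact × Confidence × Ease); the idea names and scores below are invented for illustration:

```python
# Hypothetical idea bank ranked by ICE score (Impact x Confidence x Ease).

def ice_score(impact, confidence, ease):
    """Impact and Ease on 0-10 scales; Confidence is evidence-based, also 0-10."""
    return impact * confidence * ease

ideas = [
    ("Bulk invoice upload", {"impact": 8, "confidence": 3.0, "ease": 5}),
    ("Dark mode",           {"impact": 2, "confidence": 6.0, "ease": 8}),
    ("OCR accuracy boost",  {"impact": 9, "confidence": 1.0, "ease": 3}),
]

# Sort highest ICE first; a low Confidence score drags down even high-Impact ideas,
# which is exactly how the Confidence Meter defuses hype.
ranked = sorted(ideas, key=lambda item: ice_score(**item[1]), reverse=True)
for name, scores in ranked:
    print(f"{name}: {ice_score(**scores):.0f}")
```

Note how the high-Impact “OCR accuracy boost” ranks last here: until evidence raises its Confidence, the model keeps it from jumping the queue.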
Big ideas and inter-team dependencies are managed by starting small. Promising ideas are validated cheaply, and investment scales up only as evidence grows. This organic growth applies from strategic opportunities (like Travel Expense) down to team-level initiatives. When big ideas become large projects involving multiple teams, coordination shifts to delivery of a well-understood, validated product idea. The company also mitigates dependencies through strategic alignment (fewer, focused OKRs, NSM/TBM) and tactical independence (optimizing team topology and ownership).
Managing the Work: Steps, Tasks, and the GIST Board
AcmeInvoice’s engineering teams use Scrum, but with significant changes. The focus shifted from rigid sprint planning and detailed requirements to empowering teams. Each team maintains its own GIST board (Goals, Ideas, Steps) for current work, reviewing it bi-weekly. This provides context, reduces the need for detailed requirements, and encourages engineers to define and iterate on behavior and interfaces collaboratively. Strict demands for immutable sprint scope are dropped to allow reaction to new information. Team success is measured by achieving OKR outcomes, not story points or tickets. This approach leads to true agility.
GIST boards are transparent (Google Sheets, visible company-wide) and regularly shared. This contrasts with traditional project plans and Gantt charts. The shift incurs some PM overhead but is balanced by reduced specification work and increased impact and collaboration.
The Roadmap
AcmeInvoice replaced its problematic yearly output roadmaps (which rarely materialized as planned and were prone to constant reshuffling) with Outcome Roadmaps. These visual roadmaps reflect yearly objectives and key results, showing expected growth in NSM and Top Business Metric. Full diamonds represent KRs with target completion dates, while orange bars indicate periods of activity. Hollow diamonds represent specific, high-confidence feature ideas that have been sufficiently validated and are now in delivery mode.
For ideas still in the pipeline, the product org shares a list of candidate ideas with their confidence levels, acknowledging volatility but providing transparency to customer-facing teams. This new approach, though initially feared by Sales and Marketing, proved beneficial. Sales and Marketing teams adapted, withholding time-consuming work on product ideas until confidence grew, and collaborating with product teams on market research and early adopter programs. The outcome roadmap became an important leadership tool, affirming the prioritization of outcomes over output and aligning departments. This led to no measurable drop in feature delivery but a massive boost in outcomes.
Chapter 7: Scaling GIST
Chapter 7 explores how the GIST model adapts to companies of different sizes and life stages: startups, scale-ups/midsize companies, and enterprises. Gilad emphasizes that GIST’s core principles remain constant, but their application and intensity vary.
GIST in Startups and Small Businesses
Startups face extreme uncertainty, limited budgets, and often inexperienced leadership, requiring rapid execution and learning. A lightweight GIST implementation provides structure.
Goals: Startups typically have two main goals, often sequential:
- Find product/market fit (PMF): This is the pivotal state where a large/growing market with strong underserved needs meets a “good-enough product,” leading to strong demand.
  - Objective: To achieve PMF.
  - Key Results (KRs): Often include a high NSM target (e.g., “6,000 lessons completed/week” for a cooking app), weekly active users, retention rates (e.g., “30d retention > 20%”), organic signups, and estimated addressable market size.
  - Cycles: Shorter than quarterly (3-8 weeks) for rapid progress on KRs.
  - North Star Metric: May not be immediately clear. Prior to PMF, focus on customer traction (waitlist signups, pilot agreements). After problem/solution fit, active users/customers can be a fallback.
  - Top Business Metric: Less relevant initially. The key business metric is runway (cash remaining).
- Find a repeatable, scalable, and profitable business model: This comes after PMF and ensures the company can scale sustainably.
  - Objective: To achieve this.
  - Key Results (KRs): Focus on unit economics for SaaS (e.g., “CAC < $20,” “LTV > $75,” “CAC payback time – 6 months or less,” high retention/low churn), or sales rep performance for direct sales.
  - North Star Metric: Should be clearly defined and measure total value delivered.
  - Top Business Metric: Number of paying customers, total revenue, MRR/ARR, or remaining runway.
- Alignment: Avoid separate departmental goals; unite the entire company under as few goals as possible.
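The SaaS unit-economics KRs above lend themselves to a quick sanity check; the cohort numbers below are made up for illustration:

```python
# Hypothetical cohort figures, checked against the unit-economics KRs
# mentioned in the text (CAC < $20, LTV > $75, payback <= 6 months).

cac = 18.0                # customer acquisition cost, in dollars
monthly_margin = 3.5      # gross margin per customer per month, in dollars
avg_lifetime_months = 24  # average customer lifetime

ltv = monthly_margin * avg_lifetime_months  # lifetime value: $84
payback_months = cac / monthly_margin       # months to recoup CAC: ~5.1

print(cac < 20, ltv > 75, payback_months <= 6)  # all KRs met -> True True True
```

The same three lines of arithmetic flip quickly: at a $25 CAC the payback stretches past seven months and two of the three KRs fail, which is why these ratios make good outcome-based key results.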
Ideas: Startups often begin with one big idea, but it’s unlikely to be the exact one that succeeds. GIST helps by:
- Idea Bank: Managed by the product founder, holding potential products/features. It helps balance vision with customer/market pull.
- ICE Scoring: Useful for unbiased evaluation, balancing the tension between vision and market demand.
- Business Ideas: Gradually added to the same idea bank initially, then split as the company grows.
Steps: Crucial for rapid learning at low cost, resisting the urge for premature v1.0 launches.
- Prior to problem/solution fit: Focus on research (customer observations, interviews, market/competitive analysis, technology evaluation) and early testing (usability, fake door, concierge tests).
- After problem/solution fit: Move to mid-stage testing, producing “rough versions” or MVPs.
- AFTER Framework: Highly valuable for fast idea iteration. Mastering assessment and fact-finding is key to eliminating weak ideas early.
- Learning Rate: Just as important as development speed.
Tasks: In a startup, everyone contributes to PMF. A single GIST board for the entire company keeps everyone synced, updated frequently. As the startup grows, teams will eventually get their own boards.
GIST in Scale-ups/Midsize Companies
This is a crucial, risky transition phase with rapid growth, new hires, increasing complexity, and a tendency toward siloing.
Goals: Alignment is paramount.
- Clear NSM and Top Business Metric: Communicate and align everything around these.
- Two levels of OKRs: Company-level (yearly, reviewed quarterly) and team-level (quarterly, reviewed bi-weekly).
- Mid-level managers: Facilitate alignment top-down, bottom-up, and across.
- Supplementary goals: Include these for product, company, and employee health.
- Discourage pure departmental goals: They hinder cross-functional collaboration. Use ad hoc virtual teams and shared OKRs.
- Outcome Roadmaps: Essential to manage expectations and provide alternatives to traditional release roadmaps.
Ideas: Idea contention increases significantly.
- Prioritization: Goals provide focus. Clear ownership with meaningful product teams helps.
- Team Idea Banks: Each team manages its own, ranked by ICE (often against NSM or TBM).
- Product Leader Trios: PM/UX/Eng trios at every level for triage and decision-making.
- Big Ideas: Start small, validate with small sub-teams, scale investment based on evidence.
- Strategic Opportunities: Start collecting and validating new strategic opportunities in separate idea banks, often managed by senior PMs.
Steps: The full AFTER model becomes vital as products grow complex.
- Early testing: Prioritize Assessment, Fact-Finding, and early testing techniques before committing engineering resources.
- Qualitative & Quantitative: Bring in user researchers and data analysts. Establish regular practices (interviews, usability, deep dives, A/B experiments).
- Direct Customer Access: Crucial for product teams.
- Iterating Fast: Continuously measure and accelerate the rate of testing and learning.
Tasks: Counteract disconnection between developers and the business.
- GIST Board: Essential for keeping teams connected to goals, ideas, and steps.
- Step Backlogs: Implemented by cross-functional step-forces.
- Developer Involvement: Encourage developers to participate in user/market research.
- Communication: GIST boards serve as a transparent communication tool for stakeholders and managers.
GIST in Enterprises
Enterprises are characterized by immense size, structural complexity (divisions, business units, central functions, regional offices), many layers of management, and numerous stakeholders. Trust is challenging to maintain, leading to top-down plans, restrictive processes, and slow innovation.
Principles for Innovation:
- Highly aligned, but loosely coupled (Netflix): Leaders set concrete strategy and broad context, delegating tactical decisions down, accepting that mistakes will happen and learning from them.
- Evidence-based decisions: Managers are expected to decide based on evidence and analysis, not just opinion (Apple, Amazon).
- Business Units/Divisions: Splitting into product-focused units allows them to operate as standalone medium-sized companies.
Goals:
- North Star Metric: Can be company-wide, or by business unit/product if value delivery varies greatly. Delegation of NSM choice is key.
- OKRs: Company-level, business-unit level, and team-level. Keep KRs minimal at each level. Company-level OKRs may include goals from successful business units or company-wide supplementary goals.
Ideas: Overcoming challenges of “big top-down projects” and many dependencies.
- Think big, but start small: Big projects must stem from well-researched opportunities and thoroughly validated ideas, with investment proportional to confidence. Leaders delegate validation and decide based on evidence.
- Innovation Labs/Internal Startups: Dedicated units for opportunity detection and idea incubation (using startup playbook).
- Team Time for New Ideas: Encourage allocation of time for radical new ideas, with company-level goals for innovation.
- Idea Prioritization: Critical due to high costs and dependencies. Clear goals and transparent ICE-based systems are vital. Teams manage their idea banks; cross-team ideas use shared OKRs.
- Addressing Legacy/Complexity: Focus on incremental improvements to architecture and processes (e.g., Strangler Fig pattern, continuous integration) rather than “big bang” projects.
Steps and Tasks:
- Semi-autonomous product teams: Manage their GIST boards.
- Cross-team collaboration: Required for dependencies, but early-stage steps are minimal and don’t need extensive project management. Later steps for mature ideas may become managed projects for coordinated delivery.
Gilad concludes by emphasizing that GIST’s main value varies by company size: intense focus and rapid experimentation for startups; aligning and empowering teams for scale-ups; and sustaining product exploration and innovation in large, complex enterprises. It’s about maintaining a customer-focused, agile, evidence-guided, and team-empowering organization.
Chapter 8: GIST Patterns
Chapter 8 dives into how GIST’s universal principles are applied with specific adaptations based on the type of product, market, and business model. While the core GIST framework remains consistent, its implementation nuances are crucial for success.
GIST in Enterprise-Grade Products
Developing for the enterprise market (B2B) often brings skepticism towards agile, evidence-guided methodologies, particularly concerning the Ideas and Steps layers.
Ideas: B2B companies are often driven by direct customer requests and sales department input, fearing customer loss. Gilad addresses two key underlying principles:
- Product Company vs. Professional Services Company: Product companies build standardized, scalable products; professional services customize for individual clients. Constantly prioritizing one-off customer requests can slowly transform a product company into a less profitable, harder-to-maintain professional services one.
- Product Team vs. Feature Team vs. Delivery Team: In B2B, stakeholders/executives often believe they know best what features are needed due to direct customer contact. Gilad argues for empowered product teams responsible for ensuring value and viability, rather than just delivering requested features. He cites the BlackBerry/Microsoft vs. iPhone example, where “must-have” features (long battery, physical keyboard) were disregarded by the market in favor of new value.
To apply GIST in B2B, assuming a desire to be a true product company empowering teams, Gilad uses the example of a “must-have” feature request from a top customer.
- ICE Evaluation:
  - Impact: Calculate potential NSM growth if implemented (e.g., 5% growth for that customer * 20% of NSM = 1% total growth) PLUS potential NSM decline if not implemented (e.g., losing 20% of NSM). The total impact (21%) can be significant (e.g., 9/10).
  - Ease: A typical estimate (e.g., 5/10).
  - Confidence: Initially low (0.5 for “anecdotal evidence”) because customers aren’t perfect predictors of their own future behavior. However, B2B teams might adjust the weight for requests from important customers (e.g., boosting to 2/10 confidence).
- Deeper Investigation: ICE analysis often reveals that impact is driven by threat of loss, prompting deeper questions: Will the customer really leave? What’s the underlying need? Are there other solutions? This leads to PMs joining customer calls, CEO involvement, and exploring pricing options, leading to a more nuanced picture and preventing costly mistakes.
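The impact arithmetic above can be reproduced in a few lines; the figures mirror the chapter’s hypothetical example (5% growth for a customer who contributes 20% of NSM, total loss of that customer otherwise):

```python
# Reproducing the "must-have" B2B request impact estimate from the text.

customer_share_of_nsm = 0.20  # this customer contributes 20% of the NSM
growth_if_built = 0.05        # expected NSM growth for that one customer
loss_if_not_built = customer_share_of_nsm  # worst case: the customer leaves

upside = growth_if_built * customer_share_of_nsm  # 5% of 20% = 1% of total NSM
total_impact = upside + loss_if_not_built         # 1% + 20% = 21% of NSM at stake

print(f"{total_impact:.0%}")  # -> 21%
```

Laying the numbers out like this makes the key insight visible: almost all of the impact (20 of the 21 points) comes from the threat of loss, which is precisely what should trigger the deeper investigation into whether the customer would really leave.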
Steps: Enterprise teams often feel constrained from using “build-measure-learn” due to customer conservatism, complex deployments, lack of direct customer access, and company aversion to testing. However, validation is possible:
- Interviews: Crucial due to scarce quantitative data. Aim for 15+ interviews for new features, more for new products. Leverage pre-sale/post-sale meetings, conferences, LinkedIn.
- Early Adopter Programs: Find 6-8 “reference customers” (from a single segment) willing to partner on shaping a less-than-perfect product. Treat them as development partners, not sales targets. This is key to finding PMF in a segment.
- Data Analysis: Leverage existing customer data and CRM systems (e.g., for qualifying questions, value-proposition tests).
- Concierge Tests: Manual, low-commitment execution of future software functionality to learn.
- Pilots: Confine risks to a defined area, requiring lower customer commitment. The product team may fill gaps for the client manually during the pilot.
- Don’t Rush to Sell: Avoid distracting product discovery with sales targets for unvalidated products. Instead, involve a “Renaissance Rep” (business development) in the cross-functional team to help find early adopters and manage early sales, without sales quotas.
GIST in Business-to-Consumer and Business-to-SMB
These sectors often have easier deployment and measurement, but common pitfalls exist:
- Skipping Idea Testing: The ease of deployment can lead to “just launch and see.” This is an anti-pattern as it leads to bloated, unused features and complex codebases. Every idea beyond a low-risk tweak should be tested for value, usability, and viability, ideally with an A/B test.
- Not Doing Qualitative Research: Assuming understanding of users because “we are all consumers.” Gilad stresses the importance of regular interviews, usability tests, and field research with segmented user groups.
- Not Iterating Fast Enough: Even agile teams can be slow. This is often tied to old planning practices. The rate of testing and learning should be a measured metric and a topic for retrospectives and management reviews.
GIST in Internal Platform/Technology/Services Teams
These teams build products/tech for other internal product teams or employees. They often have strong engineering but sparse product/design/research resources. However, they can be good candidates for GIST adoption as they face less external pressure and have “customers in the building.”
- Challenges: Defining mission, value delivery, and success metrics (company NSM vs. own).
- Collaboration Models:
  - Service Provider Model: Treats internal employees/teams as users/customers. NSM and submetrics are geared towards value delivered to internal customers. Team manages its GIST stack independently.
  - Co-development Model: Team members directly help other product teams. A virtual team shares goals (e.g., related to external shoppers) and co-owns the GIST stack. This allows platform teams to measure success by contribution to the company’s external NSM.
GIST in Multi-sided Marketplaces/Services
These involve two or more user types (buyers/sellers, drivers/passengers). Success hinges on connecting supply and demand.
- Goals, Metrics, Ideas:
  - North Star Metric: Measures total value exchanged (e.g., eBay’s GMV, Airbnb’s Nights Booked, Monthly Trips Completed for ride-hailing). Aims to measure as close as possible to the value experience.
  - Submetrics: User-specific metrics, liquidity metrics (supply/demand matching), efficiency metrics, and successful exchange rates.
  - OKRs: Teams work independently on their side’s goals (e.g., driver income, passenger safety). Shared OKRs and virtual teams are crucial for goals spanning multiple sides (e.g., improving search).
- Ideas and Steps: Teams pursue ideas without tight central coordination, following “strategically aligned, but loosely coupled.” GIST boards aid communication. Many steps require testing on multiple sides, but can use different validation techniques (e.g., beta for buyers, concierge for sellers) and develop one side first.
GIST in Physical Products
Gilad challenges the intuition that physical products are exempt from evidence-guided development due to longer, costlier cycles.
- Software Component: The distinction between hardware and software is blurring. Many hardware products have software that can be developed and updated independently with faster cycles (e.g., smartphones, Tesla firmware updates), enabling continuous improvements and A/B experiments via internet connectivity.
- Learning Loops for Purely Physical Products:
  - Iterative Design/Testing: Example of Team New Zealand’s dual-boat testing for yacht design, and Zara’s rapid fashion iteration based on RFID data and small batch testing.
  - Prototyping:
    - Enclosures/Interfaces: Paper, cardboard, wood, clay, Lego, 3D printing, CNC.
    - Electronics: Breadboards, off-the-shelf components, FPGAs for early functionality.
    - Computerized Prototypes: Raspberry Pi, Arduino, mobile phones for embedded systems.
- Rebranding Existing Products: GE’s “616” diesel engine MVP, a modified existing engine tested for a new purpose, provided early learning and concrete evidence for management.
Gilad concludes that GIST is highly adaptable and can work across a broad set of circumstances, emphasizing that its successful adoption often depends on a company’s willingness to embrace change, which is the focus of the next chapter.
Chapter 9: Adopting GIST
Chapter 9 addresses the critical challenge of adopting GIST within an organization. Gilad acknowledges that switching to an evidence-guided mode of work is rarely easy, as it requires people to change their ingrained ways of thinking and working. Resistance, even from supportive colleagues and managers, is expected due to their responsibilities, reputations, and career implications.
Common Adoption Challenges
Gilad outlines 11 common challenges and strategies to overcome them:
- Lack of Trust: Managers may distrust teams as inexperienced; teams may distrust management as sales-driven or waterfall-ish. This stems from self-reinforcing beliefs and cycles of failure.
  - Solution: Carve out a safe area (pilot project/team) for at least a full quarter. Enforce extreme transparency (managers share rationale, teams share ideas/steps via GIST board) and evidence rather than opinions (“If we have data, let’s look at data”). An objective arbiter calls out infringements. This replaces mistrust with a virtuous cycle of transparency and built-up trust.
- Missing Resources: Waiting for analytics systems, test infrastructure, or dedicated user researchers.
  - Solution: You don’t need Google-level infrastructure. Most Assessment and Fact-Finding techniques can be done with existing resources (e.g., PMs doing interviews, engineers querying databases, Excel for charting). Early tests (fake door, concierge, Wizard of Oz, dogfood) require minimal special infrastructure. Build experimentation capabilities gradually.
- No Time: “We’re too busy with planning/important projects.”
  - Solution: This is the “blunt axe” problem. There’s never a perfect time. GIST saves time and effort by reducing waste on bad ideas. Start immediately, even with gradual adoption.
- Attachment to Roadmaps: Sales/Marketing want to plan work, promote features, or see roadmaps as contracts.
  - Solution: Show that old roadmaps rarely materialized and didn’t create expected business results. Suggest Outcome Roadmaps (Chapter 6) which show commitment to goals and can include high-confidence ideas. Frame it as addressing organizational needs, not just adopting a new philosophy.
- Fear of Losing Control of the Product: Managers/stakeholders are reluctant to give up decision-making power over ideas.
  - Solution: Explain and demonstrate that it’s a shift from controlling deliverables to controlling outcomes (leadership vs. management). Show how GIST provides transparency and influence through goal-setting and evidence review.
- Fear of Slowing Down: Concern about reduced “implementation velocity” or under-utilized engineers.
  - Solution: Clarify the goal: outcomes over output. GIST reduces wasted effort on bad ideas, making projects faster and more resource-efficient overall (improving “time-to-value”). Weed out output-based goals/incentives. Utilize “downtime” for tech/design debt or research.
- Fear of Over-Analysis: Concern about “analysis paralysis” from excessive testing.
  - Solution: Acknowledge it’s possible but rare. GIST cuts through lengthy debates and drives action. Time is spent on developing and testing, not endless analysis.
- Not Right for Our Type of Product or Business: Belief that GIST is inapplicable (e.g., high-touch enterprise, regulated, physical products).
  - Solution: Challenge these assumptions by asking: Can we define goals by outcomes? Are there multiple ways to achieve them? Can ideas fail? Can we evaluate ideas better before/during/after building? Should teams be involved? (Chapters 7 & 8 address these specific contexts).
- Perfectionism: Worry about launching incomplete/unpolished products, damaging brand/trust.
  - Solution: Clarify that final product quality is not compromised. Interim versions are exposed to only a small subset of users. Emphasize that users often prefer limited, valuable products over feature-rich, low-value ones (e.g., early Google Docs vs. MS Office).
- We’re Already Doing It: Middle managers claim to be already evidence-guided.
  - Solution: Use the GIST Scorecard (re-presented in this chapter) to objectively assess current practices and identify nuances or gaps.
- We’re Already Using Another Methodology: Incompatibility with existing frameworks.
  - Solution: GIST is compatible with true agile, design thinking, and growth hacking. If philosophies clash, it’s better to choose one clear approach rather than a confusing compromise.
Driving the Change
Gilad offers practices for successful GIST adoption:
- Make the Case for the Change: Back your argument with facts and evidence to top executives. Highlight current process shortfalls (costs), GIST’s benefits (agility, customer focus, empowerment, shorter cycles), and how ICE can help evaluate funding requests objectively. Emphasize waste reduction and competitive risk.
- Find an Executive Sponsor: A C-level champion responsible for GIST’s success. They explain, convince, ensure training, attend sessions, and ensure consistent messaging and incentives.
- Teach the Methodology Broadly: Everyone should understand GIST’s principles, terms, methodology, and ownership. This includes PMs, designers, engineering leads, managers, key stakeholders, and broader cross-functional teams. Shared understanding is key to avoiding misalignment and pushback.
- Gradual Rollout—Start Where the Pain Is: Avoid a “big-bang” approach. GIST is modular. Identify the biggest pain point (Goals, Ideas, Steps, or Tasks) and implement that layer first.
  - Goals first: If direction is unclear, alignment is low, or customer focus is weak (NSM, outcome roadmaps, OKRs).
  - Ideas first: If debates are common, ideas are opinion-based, or bad ideas are invested in (idea banks, ICE scores, more research).
  - Steps first: If projects are long/expensive, little testing occurs, or results are disappointing (learning milestones, AFTER model, objective analysis).
  - Tasks first: If dev teams are disconnected, projects take too long, or excessive specs are needed (GIST board, active team contribution).
  - The North Star Metric is the fundamental “plumbing” needed to start, enabling ranking ideas and evaluating steps.
- Set Ambitious-yet-Realistic Goals: Track progress through two types of outcomes: business/user outcomes and self-improvement outcomes (describing the change in GIST adoption).
  - Key metrics for progress: Total ideas evaluated/tested/released per quarter; tests/experiments conducted per month; % of steps generating learning; % of ideas launched with medium confidence; % of ideas generating measurable outcomes.
  - Metrics to reduce: Time spent planning, sales escalations, output incentives, % ideas slowed by dependencies, average time to launch.
  - Establish a baseline before implementing GIST.
- Making Incremental Improvements (Kaizen): Instead of waiting for quarter-end, set quicker self-improvement cycles (e.g., monthly) to boost specific areas like idea evaluation or team autonomy. This makes change concrete, surfaces issues, and builds habits.
- Build an Operative Team: A small GIST task force (senior PMs, designers, engineers, researchers, business stakeholders) to steer day-to-day adoption, set goals, assess progress, and assist teams.
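Among the tools above, ICE scoring is simple enough to sketch. In Gilad's framework an idea's ICE score combines Impact, Confidence, and Ease, each rated on a 0–10 scale; the sketch below multiplies the three, a common formulation, to rank an idea bank. The idea names and scores are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Idea:
    name: str
    impact: float      # 0-10: expected effect on the North Star Metric
    confidence: float  # 0-10: strength of the supporting evidence
    ease: float        # 0-10: how cheap and quick it is to build

    @property
    def ice(self) -> float:
        # One common formulation multiplies the three scores (max 1000).
        return self.impact * self.confidence * self.ease

# A hypothetical idea bank.
ideas = [
    Idea("Tabbed inbox", impact=8, confidence=3, ease=4),
    Idea("Dark mode", impact=4, confidence=7, ease=8),
    Idea("New onboarding flow", impact=7, confidence=5, ease=5),
]

# Rank the idea bank by ICE score, highest first.
for idea in sorted(ideas, key=lambda i: i.ice, reverse=True):
    print(f"{idea.name}: {idea.ice:.0f}")
```

Note how a high-impact idea with weak evidence (low confidence) ranks below a modest idea that is well supported and cheap to build: as an idea is tested, its confidence score, and therefore its rank, changes.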
The chapter concludes by empowering the reader to be the “protagonist of this story,” emphasizing that individual effort, supported by like-minded colleagues, patient persuasion, and executive sponsorship, can drive meaningful, worthwhile change within an organization.
Key Takeaways
The core lessons of “Evidence-Guided” are that successful product development hinges on shifting from opinion-based, output-focused planning to evidence-guided, outcome-focused discovery and delivery. Most ideas are bad, and human intuition is flawed, so rigorous testing and learning are non-negotiable to avoid wasted resources and missed opportunities. The GIST framework (Goals, Ideas, Steps, Tasks) provides a practical, adaptable structure for this transformation, ensuring transparency, fostering trust, and empowering teams at every level of the organization.
To implement these ideas, readers should immediately self-assess their current practices using the GIST Scorecard to identify areas of greatest pain. Then, make a compelling case for change to leadership, emphasizing the costs of current inefficiencies versus the tangible benefits of an evidence-guided approach. Find an executive sponsor and teach the GIST methodology broadly to create shared understanding. Begin with a gradual rollout, starting with the GIST layer that addresses the most pressing organizational pain (Goals, Ideas, Steps, or Tasks), and set ambitious-yet-realistic self-improvement goals to measure the transition. Continuously iterate and celebrate small wins, building momentum and trust. This commitment to continuous learning and adaptation will enable organizations to innovate faster, deliver higher value, and build products that truly resonate with customers in the face of uncertainty.
Consider: How might embracing the transparency and iterative testing embedded in GIST fundamentally reshape your team’s approach to risk, foster greater internal collaboration, and ultimately lead to products that customers genuinely love and use?