DS Bootstrap
Prepare for data scientist interviews with mock questions drawn from real interview experiences
107 questions
Learning plans
Stats & probability
38 questions · Distributions → inference → experiments → regression → causal
Data intuition & product sense
37 questions · Framework → metrics → experiments → modeling cases → advanced
Coding — SQL, pandas, simulation
22 questions · Basic aggregates → window functions → multi-CTE → Python / simulation
Leadership & behavioral
10 questions · Conflict → failure → ownership → ambiguity → pressure
Explain CI, P-value, and Alpha — How Would You Explain Them to a PM?
Driving Data Science Direction Independently — Impact & Ownership
Google Meet Enterprise Clients Complaining About Frequent Disconnections
Tricky Customer Retention Rate with LAG and Multiple CTEs
A/B Test Metric Selection — Change a Button's Color
Global Sales Went Up — But Could Regional Analysis Show It Went Down?
Car X vs. Car Y Fuel Efficiency — Which Technology Saves More Gas?
Mean vs Median — Which Is Better? Force a Choice.
Implement Binary Search on a Sorted Array
How Would You Prove a Sequence Follows a Uniform Distribution?
How Does Bootstrap Work? Can It Be Used for Variance Reduction?
Why Can't We Reverse the Null Hypothesis to "Prove" Something?
Model the Goal-Scoring Probability at Every Location on a Football Pitch
Designing an Experiment for an Android App Update (Self-Selection Bias)
Estimate the Nth Percentile from Bucketed/Histogram Data
How Do You Choose Between 99% and 95% Confidence Level?
How Would You Predict the Number of Views for a Video?
How to Allocate K Human Reviews to Estimate ML Model Accuracy
Gmail SQL — Top Countries by Email Volume & Month-over-Month Change
Your Analysis Contradicts What the PM Wants to Hear
Handling Disagreement in a Cross-Functional Team
A Time an Experiment or Model Failed — What Did You Learn?
A Decision Mistake — What Would You Do Differently?
Google Maps Jogging Route Recommendation — Experiment Design
Predict Remaining Phone Battery Life — Case Study
What Happens If You Duplicate Data in Linear Regression?
Spot the Modeling Mistakes
Python Simulation — Average Price Under a Randomized Price Hike
Working with Multi-Functional Teams
Compare the Profit of Two Stores Near a School Gate
Explain Linear Regression to a Non-Technical Person
How Would You Handle a Highly Imbalanced Dataset?
Design a Database Schema for a Video Company
Find the Most Frequently Co-Purchased Products
Add a Conditional Column in Pandas Based on Multiple Other Columns
Relationship Between Product Sales and Review Data
Top 100 Week-over-Week User Count Changes
Earliest Day Each Video Hit 100 Views
Find Users Active in One Period But Not Another (Anti-Join Patterns)
In-App Purchase — Multi-Part SQL Drill (Cumulative, Growth Rate, Rolling Average)
Time Between the Last Two Ad Views Per User-Ad Pair
Rolling Window Active Users — "Users Active in the Last 7 Days"
A Metric Suddenly Changed Overnight — What Would You Do?
A Campaign Launched in One Country — Did It Actually Cause the Metric Lift?
30 A/B Tests, One Significant at p = 0.04 — Should You Ship?
Estimate the Lifetime Value (LTV) of a User or an Ad Click
Bad Search Algorithm A/B Test — Three-Part Analysis
If You Could Only Choose One Metric to Evaluate a New Product, Which Would It Be?
Favorite Google Product — How Would You Improve It and Measure Success?
A Stakeholder Asks You to Run a Project — What Questions Do You Ask Them First?
Evaluating an Ads CTR Prediction Model — Metric Selection, Traps, and Fixes
Survey with 4 Options — Which Is the Most Preferred, and Is It Significantly So?
How Does a Data Scientist Decide the P-Value Cutoff?
Generate a Scatter Dataset with Slope ≈ 2 and R² ≈ 0.8
Should YouTube Add a Bonus to Creator Ad Revenue?
Minimum Number of Days to Reach Over 1 Billion Unique Users
Testing a Population Mean — What Does p-value = x% Actually Mean?
Your SE is 0.1, You Want SE = 0.01 — What Do You Do? What If Sample Size Can't Change?
Given a Sample X₁, …, Xₙ ~ X, Estimate P(X > 10) and Construct a 95% CI
Bias-Variance Tradeoff — What Do You Do When Your Model Is Much Better on Training Than Test?
You Have 1000 Features and Want to Estimate Conversion Rate — What Do You Do?
Truncated Normal — Estimate μ and σ² When You Only Observe X > 3
Derive the MLE for Normal(μ, σ²) from Scratch
Derive the Conditional Distribution Y | X for a Bivariate Normal
Why Divide by n − 1 in Sample Variance? (Bessel's Correction)
Is the Sample Standard Deviation an Unbiased Estimator of σ?
From a 1% Query Sample, Estimate the Number of Singleton Queries
Design a Metric for Bird Species Segregation in a Forest
Does YouTube Ad Playback Drive Sales? Spot the Problems in This Simple Regression
You Voted on an Offsite — Hiking Won, But One Teammate Objects. What Do You Do?
Compute a P-Value from the Definition via Monte Carlo Simulation — No Formulas
Linear Regression When You Have More Features Than Observations (m > n)
Why Is the Test on a Regression Coefficient a t-Distribution, Not Normal?
Does OLS Give the Same Predictions If You Rotate the Features?
Does Listening to the YouTube Music "Commute" Playlist Make People Drive Faster?
Implement Evaluation Metrics and Bootstrap CI (PR Curve + Percentage RMSE)
CRM Algorithm A/B Test — Identifying Wrong CI Calculation & Proper Experiment Design
Drug A (Placebo) vs B — Between-Subjects vs Within-Subjects Design
Censored Normal — Estimate μ, σ² When X > 10 Is Right-Censored
Estimate the Number of Distinct Queries from a 1% Sample
Find the Most Expensive Order for Each User
Week-over-Week Search Growth Rate per Country × Language Segment
Evaluate a New YouTube Auto-Playlist Ranking Algorithm
How Do You Navigate Ambiguity?
Multiple DS Teams Produce Very Similar (But Inconsistent) Metrics — What Do You Do?
It's 6 PM Monday, Deadline Tomorrow Morning, 8+ Hours of Work Left. What Do You Do?
Experiment Where Users Must Sign an Agreement — Selection Bias and How to Fix It
Hypothesis Testing with Only 20 Samples — What Do You Do?
When Does K-Means Fail? What Are the Estimation Problems and How Do You Fix Them?
Drug Trial — Parallel vs Crossover Design (Detailed Follow-ups)
Binomial Distribution — Probability of Fewer Than 2 Girls Out of N Children
Why Do Customers Prefer the 4.9-Rated Product Over the 5.0-Rated Product?
Pandas GroupBy + Visualization
Sample Ratio Mismatch (SRM) — How to Detect, Diagnose, and Handle
The Peeking Problem — Why Early Stopping Inflates False Positives, and How to Fix It
Novelty and Primacy Effects — How to Detect Them and What to Do
Network Effects and Spillover — Why User-Level A/B Breaks and How to Fix It
Incrementality Testing — Measuring What Ads Actually Cause, Not Just What They're Credited For
Ad Auction Mechanics — Second-Price, First-Price, and the Role of Quality Score
The Four Assumptions of Linear Regression — What Breaks When Each Is Violated, and How to Diagnose
Lasso vs. Ridge — When to Use Which, and Why Lasso Creates Sparsity
Poisson vs. Binomial — Assumptions, When to Use Each, and the Poisson Approximation
If You Had No Metrics for Google Docs, Which Five Would You Define First?
Sample From a Weighted Discrete Distribution in Python
Find the K-th Smallest Element With Quickselect
Generate Samples and Query Distributions in Python (NumPy + SciPy)
Fibonacci in Python — Iterative, Memoized, Matrix-Power