DS Bootstrap

Prepare for data scientist interviews with mock questions drawn from real interview experiences

107 questions
Q003
Explain CI, P-value, and Alpha — How Would You Explain Them to a PM?
Statistics Medium High freq
Q016
Driving Data Science Direction Independently — Impact & Ownership
Other Medium High freq
Q017
Google Meet Enterprise Clients Complaining About Frequent Disconnections
Product Hard High freq
Q025
Tricky Customer Retention Rate with LAG and Multiple CTEs
SQL Hard High freq
Q026
A/B Test Metric Selection — Change a Button's Color
Experimentation Medium High freq
Q043
Global Sales Went Up — But Could Regional Analysis Show It Went Down?
Causal inference Medium High freq
Q048
Car X vs. Car Y Fuel Efficiency — Which Technology Saves More Gas?
Product Medium High freq
Q076
Mean vs Median — Which Is Better? Force a Choice.
Probability Medium High freq
Q104
Implement Binary Search on a Sorted Array
Other Easy Medium freq
Q001
How Would You Prove a Sequence Follows a Uniform Distribution?
Probability Medium High freq
Q002
How Does Bootstrap Work? Can It Be Used for Variance Reduction?
Statistics Medium High freq
Q004
Why Can't We Reverse the Null Hypothesis to "Prove" Something?
Statistics Hard High freq
Q005
Model the Goal-Scoring Probability at Every Location on a Football Pitch
ML Hard Medium freq
Q006
Designing an Experiment for an Android App Update (Self-Selection Bias)
Experimentation Hard High freq
Q007
Estimate the Nth Percentile from Bucketed/Histogram Data
SQL Medium High freq
Q008
How Do You Choose Between 99% and 95% Confidence Level?
Product Medium High freq
Q009
How Would You Predict the Number of Views for a Video?
ML Medium High freq
Q010
How to Allocate K Human Reviews to Estimate ML Model Accuracy
Statistics Hard High freq
Q011
Gmail SQL — Top Countries by Email Volume & Month-over-Month Change
SQL Medium High freq
Q012
Your Analysis Contradicts What the PM Wants to Hear
Other Medium High freq
Q013
Handling Disagreement in a Cross-Functional Team
Other Medium High freq
Q014
A Time an Experiment or Model Failed — What Did You Learn?
Other Medium High freq
Q015
A Decision Mistake — What Would You Do Differently?
Other Medium High freq
Q018
Google Maps Jogging Route Recommendation — Experiment Design
Experimentation Hard High freq
Q019
Predict Remaining Phone Battery Life — Case Study
ML Medium High freq
Q020
What Happens If You Duplicate Data in Linear Regression?
ML Medium High freq
Q021
Spot the Modeling Mistakes
ML Medium High freq
Q022
Python Simulation — Average Price Under a Randomized Price Hike
ML Medium High freq
Q023
Working with Multi-Functional Teams
Other Medium High freq
Q024
Compare the Profit of Two Stores Near a School Gate
Causal inference Medium High freq
Q027
Explain Linear Regression to a Non-Technical Person
ML Hard High freq
Q028
How Would You Handle a Highly Imbalanced Dataset?
ML Hard High freq
Q029
Design a Database Schema for a Video Company
ML Medium High freq
Q030
Find the Most Frequently Co-Purchased Products
SQL Medium High freq
Q031
Add a Conditional Column in Pandas Based on Multiple Other Columns
ML Medium High freq
Q032
Relationship Between Product Sales and Review Data
Causal inference Medium High freq
Q033
Top 100 Week-over-Week User Count Changes
SQL Medium High freq
Q034
Earliest Day Each Video Hit 100 Views
SQL Medium High freq
Q035
Find Users Active in One Period But Not Another (Anti-Join Patterns)
SQL Medium High freq
Q036
In-App Purchase — Multi-Part SQL Drill (Cumulative, Growth Rate, Rolling Average)
SQL Medium High freq
Q037
Time Between the Last Two Ad Views Per User-Ad Pair
SQL Medium High freq
Q038
Rolling Window Active Users — "Users Active in the Last 7 Days"
SQL Medium High freq
Q039
A Metric Suddenly Changed Overnight — What Would You Do?
Product Medium High freq
Q040
A Campaign Launched in One Country — Did It Actually Cause the Metric Lift?
Causal inference Hard High freq
Q041
30 A/B Tests, One Significant at p = 0.04 — Should You Ship?
Statistics Medium High freq
Q042
Estimate the Lifetime Value (LTV) of a User or an Ad Click
Product Medium High freq
Q044
Bad Search Algorithm A/B Test — Three-Part Analysis
Experimentation Hard High freq
Q045
If You Could Only Choose One Metric to Evaluate a New Product, Which Would It Be?
Product Medium High freq
Q046
Favorite Google Product — How Would You Improve It and Measure Success?
Product Medium High freq
Q047
A Stakeholder Asks You to Run a Project — What Questions Do You Ask Them First?
Product Medium High freq
Q049
Evaluating an Ads CTR Prediction Model — Metric Selection, Traps, and Fixes
ML Hard High freq
Q050
Survey with 4 Options — Which Is the Most Preferred, and Is It Significantly So?
Statistics Medium High freq
Q051
How Does a Data Scientist Decide the P-Value Cutoff?
Statistics Medium High freq
Q052
Generate a Scatter Dataset with Slope ≈ 2 and R² ≈ 0.8
ML Medium High freq
Q053
Should YouTube Add a Bonus to Creator Ad Revenue?
Product Medium High freq
Q054
Minimum Number of Days to Reach Over 1 Billion Unique Users
SQL Hard High freq
Q055
Testing a Population Mean — What Does p-value = x% Actually Mean?
Statistics Medium High freq
Q056
Your SE is 0.1, You Want SE = 0.01 — What Do You Do? What If Sample Size Can't Change?
Statistics Hard High freq
Q057
Given a Sample X₁, …, Xₙ ~ X, Estimate P(X > 10) and Construct a 95% CI
Statistics Hard High freq
Q058
Bias-Variance Tradeoff — What Do You Do When Your Model Is Much Better on Training Than Test?
Product Medium High freq
Q059
You Have 1000 Features and Want to Estimate Conversion Rate — What Do You Do?
ML Medium High freq
Q060
Truncated Normal — Estimate μ and σ² When You Only Observe X > 3
Probability Hard Medium freq
Q061
Derive the MLE for Normal(μ, σ²) from Scratch
Probability Hard High freq
Q062
Derive the Conditional Distribution Y | X for a Bivariate Normal
Probability Hard High freq
Q063
Why Divide by n − 1 in Sample Variance? (Bessel's Correction)
Statistics Medium High freq
Q064
Is the Sample Standard Deviation an Unbiased Estimator of σ?
Statistics Hard High freq
Q065
From a 1% Query Sample, Estimate the Number of Singleton Queries
Statistics Hard High freq
Q066
Design a Metric for Bird Species Segregation in a Forest
Product Hard Medium freq
Q067
Does YouTube Ad Playback Drive Sales? Spot the Problems in This Simple Regression
Causal inference Medium High freq
Q068
You Voted on an Offsite — Hiking Won, But One Teammate Objects. What Do You Do?
Other Medium High freq
Q069
Compute a P-Value from the Definition via Monte Carlo Simulation — No Formulas
SQL Medium High freq
Q070
Linear Regression When You Have More Features Than Observations (m > n)
ML Hard High freq
Q071
Why Is the Test on a Regression Coefficient a t-Distribution, Not Normal?
ML Hard High freq
Q072
Does OLS Give the Same Predictions If You Rotate the Features?
ML Hard High freq
Q073
Does Listening to the YouTube Music "Commute" Playlist Make People Drive Faster?
Causal inference Hard High freq
Q074
Implement Evaluation Metrics and Bootstrap CI (PR Curve + Percentage RMSE)
SQL Medium High freq
Q075
CRM Algorithm A/B Test — Identifying Wrong CI Calculation & Proper Experiment Design
Experimentation Hard High freq
Q077
Drug A (Placebo) vs B — Between-Subjects vs Within-Subjects Design
Experimentation Medium High freq
Q078
Censored Normal — Estimate μ, σ² When X > 10 Is Right-Censored
Probability Hard High freq
Q079
Estimate the Number of Distinct Queries from a 1% Sample
Statistics Hard High freq
Q080
Find the Most Expensive Order for Each User
SQL Medium High freq
Q081
Week-over-Week Search Growth Rate per Country × Language Segment
SQL Medium High freq
Q082
Evaluate a New YouTube Auto-Playlist Ranking Algorithm
ML Hard High freq
Q083
How Do You Navigate Ambiguity?
Other Medium High freq
Q084
Multiple DS Teams Produce Very Similar (But Inconsistent) Metrics — What Do You Do?
Other Medium High freq
Q085
It's 6 PM Monday, Deadline Tomorrow Morning, 8+ Hours of Work Left. What Do You Do?
Other Medium High freq
Q086
Experiment Where Users Must Sign an Agreement — Selection Bias and How to Fix It
Experimentation Hard High freq
Q087
Hypothesis Testing with Only 20 Samples — What Do You Do?
Statistics Medium High freq
Q088
When Does K-Means Fail? What Are the Estimation Problems and How Do You Fix Them?
ML Medium High freq
Q089
Drug Trial — Parallel vs Crossover Design (Detailed Follow-ups)
Experimentation Hard High freq
Q090
Binomial Distribution — Probability of Fewer Than 2 Girls Out of N Children
Probability Medium High freq
Q091
Why Do Customers Prefer the 4.9-Rated Product Over the 5.0-Rated Product?
Product Medium High freq
Q092
Pandas GroupBy + Visualization
ML Medium High freq
Q093
Sample Ratio Mismatch (SRM) — How to Detect, Diagnose, and Handle
Experimentation Medium High freq
Q094
The Peeking Problem — Why Early Stopping Inflates False Positives, and How to Fix It
Experimentation Hard High freq
Q095
Novelty and Primacy Effects — How to Detect Them and What to Do
Experimentation Medium High freq
Q096
Network Effects and Spillover — Why User-Level A/B Breaks and How to Fix It
Experimentation Hard High freq
Q097
Incrementality Testing — Measuring What Ads Actually Cause, Not Just What They're Credited For
Causal inference Hard High freq
Q098
Ad Auction Mechanics — Second-Price, First-Price, and the Role of Quality Score
Product Medium High freq
Q099
The Four Assumptions of Linear Regression — What Breaks When Each Is Violated, and How to Diagnose
ML Medium High freq
Q100
Lasso vs. Ridge — When to Use Which, and Why Lasso Creates Sparsity
ML Medium High freq
Q101
Poisson vs. Binomial — Assumptions, When to Use Each, and the Poisson Approximation
Probability Medium High freq
Q102
If You Had No Metrics for Google Docs, Which Five Would You Define First?
Product Medium High freq
Q103
Sample From a Weighted Discrete Distribution in Python
Probability Medium Medium freq
Q105
Find the K-th Smallest Element With Quickselect
Other Medium Medium freq
Q106
Generate Samples and Query Distributions in Python (NumPy + SciPy)
Statistics Easy Medium freq
Q107
Fibonacci in Python — Iterative, Memoized, Matrix-Power
Other Easy Medium freq