Implement Binary Search on a Sorted Array
You are given a sorted array arr (ascending, no duplicates unless specified) and a target value target. Return the index of target in arr, or -1 if it's not present.
Write it iteratively. Be explicit about off-by-one risk (closed vs half-open interval). Then discuss: how would you find the first or last occurrence if duplicates exist?
▶Canonical iterative binary search (closed interval)
The invariant: the target, if present, is in arr[low..high] (both inclusive).
def binary_search(arr, target):
low, high = 0, len(arr) - 1
while low <= high:
mid = (low + high) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
low = mid + 1
else:
high = mid - 1
return -1- Why
low <= high? Because the interval is closed. If it were[low, high), you'd uselow < highandhigh = mid. - Why
(low + high) // 2? For large low + high you'd wantlow + (high - low) // 2to avoid integer overflow in languages with fixed-width ints. Python's int is unbounded, so either is fine, but the second is the portable idiom.
▶Variants worth knowing
First occurrence of target (when duplicates allowed)
bisect_left(arr, target) returns the leftmost insertion point. Check arr[idx] == target afterward.
def first_occurrence(arr, target):
lo, hi = 0, len(arr)
while lo < hi:
mid = (lo + hi) // 2
if arr[mid] < target:
lo = mid + 1
else:
hi = mid
return lo if lo < len(arr) and arr[lo] == target else -1Note the half-open interval [lo, hi) and lo < hi — a different template.
Last occurrence
bisect_right(arr, target) - 1. Or flip the comparator: arr[mid] <= target moves lo = mid + 1, which pushes past all equals.
Rotated sorted array
Classic follow-up — the array is sorted then rotated around some pivot. At each step, one half is still sorted, so you check whether the target falls in that sorted half; if yes, recurse there, else recurse the other half. Still O(log n).
▶Complexity
- Time: O(log n) comparisons
- Space: O(1) (iterative)
Recursive version is O(log n) space from the call stack — rarely what you want for an interview unless asked.
▶Pitfalls
mid = (low + high) // 2can overflow in fixed-width languages. Uselow + (high - low) // 2.- Don't return
midif you want first/last occurrence — you might stop too early. - Unsorted input → garbage output. Always confirm the sort precondition up front.
- Searching for floats with
==is sketchy — use a tolerance or rethink.
Write binary search on a sorted array.
Sure. I'll use the closed-interval template — low and high are both inclusive. Loop while low <= high, compute mid, compare, and shrink. Let me write it:
def binary_search(arr, target):
low, high = 0, len(arr) - 1
while low <= high:
mid = (low + high) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
low = mid + 1
else:
high = mid - 1
return -1Walk me through why low <= high and not low < high.
Because the interval is closed on both sides. When low == high, there's still one candidate to check — that's arr[low]. If I exit early on strict less-than, I'd skip the last element when low meets high. The half-open variant uses low < high, but then the body needs high = mid instead of high = mid - 1.
Picking between the two conventions — any preference?
Honestly, whichever you're most confident with, stick to one. I use closed intervals for the "find exactly this value" case because the termination is natural. For bisect_left or bisect_right patterns (first/last occurrence), half-open is cleaner because the answer is the boundary itself.
Speaking of which — how would you find the first occurrence if duplicates exist?
I'd switch to bisect_left: half-open interval [lo, hi), loop while lo < hi, and when arr[mid] < target move lo = mid + 1, else hi = mid. No early termination on equality. At the end, lo is the leftmost index where target would be inserted. If arr[lo] == target, that's the first occurrence; else it's not in the array.
def first_occurrence(arr, target):
lo, hi = 0, len(arr)
while lo < hi:
mid = (lo + hi) // 2
if arr[mid] < target:
lo = mid + 1
else:
hi = mid
return lo if lo < len(arr) and arr[lo] == target else -1Complexity?
O(log n) time — the interval halves each step. O(1) space for the iterative version. The recursive version is also O(log n) time but O(log n) stack space.
Let me twist it: the array was sorted, then rotated at some pivot. Find a target.
The key observation: after rotation, at least one half — either arr[low..mid] or arr[mid..high] — is still sorted. At each step, check which half is sorted (compare arr[low] to arr[mid]). If the left half is sorted and the target lies within [arr[low], arr[mid]), recurse there; else recurse the right half. Same O(log n). The trap is getting the inclusive/exclusive comparisons right at the boundaries, especially when duplicates are allowed.
Last thing — what's a pitfall you'd call out to a junior engineer reviewing this code?
Three. First, (low + high) // 2 can overflow in fixed-width integer languages; low + (high - low) // 2 is safer. Python doesn't care but the idiom is portable. Second, always verify the sort precondition — binary search on unsorted data returns garbage, no error. Third, == on floats is fragile; for float keys use a tolerance or switch to a range query.
Write binary search on a sorted array.
Sure. I'll use the closed-interval template — low and high are both inclusive. Loop while low <= high, compute mid, compare, and shrink. Let me write it:
def binary_search(arr, target):
low, high = 0, len(arr) - 1
while low <= high:
mid = (low + high) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
low = mid + 1
else:
high = mid - 1
return -1Walk me through why low <= high and not low < high.
Because the interval is closed on both sides. When low == high, there's still one candidate to check — that's arr[low]. If I exit early on strict less-than, I'd skip the last element when low meets high. The half-open variant uses low < high, but then the body needs high = mid instead of high = mid - 1.
Picking between the two conventions — any preference?
Honestly, whichever you're most confident with, stick to one. I use closed intervals for the "find exactly this value" case because the termination is natural. For bisect_left or bisect_right patterns (first/last occurrence), half-open is cleaner because the answer is the boundary itself.
Speaking of which — how would you find the first occurrence if duplicates exist?
I'd switch to bisect_left: half-open interval [lo, hi), loop while lo < hi, and when arr[mid] < target move lo = mid + 1, else hi = mid. No early termination on equality. At the end, lo is the leftmost index where target would be inserted. If arr[lo] == target, that's the first occurrence; else it's not in the array.
def first_occurrence(arr, target):
lo, hi = 0, len(arr)
while lo < hi:
mid = (lo + hi) // 2
if arr[mid] < target:
lo = mid + 1
else:
hi = mid
return lo if lo < len(arr) and arr[lo] == target else -1Complexity?
O(log n) time — the interval halves each step. O(1) space for the iterative version. The recursive version is also O(log n) time but O(log n) stack space.
Let me twist it: the array was sorted, then rotated at some pivot. Find a target.
The key observation: after rotation, at least one half — either arr[low..mid] or arr[mid..high] — is still sorted. At each step, check which half is sorted (compare arr[low] to arr[mid]). If the left half is sorted and the target lies within [arr[low], arr[mid]), recurse there; else recurse the right half. Same O(log n). The trap is getting the inclusive/exclusive comparisons right at the boundaries, especially when duplicates are allowed.
Last thing — what's a pitfall you'd call out to a junior engineer reviewing this code?
Three. First, (low + high) // 2 can overflow in fixed-width integer languages; low + (high - low) // 2 is safer. Python doesn't care but the idiom is portable. Second, always verify the sort precondition — binary search on unsorted data returns garbage, no error. Third, == on floats is fragile; for float keys use a tolerance or switch to a range query.
This question has a debrief tool attached. Practice it aloud with a voice-mode AI interviewer, paste the transcript, and get a graded debrief against the reference answer.
How to do a mock interview
- 1
- 2
Copy this question and paste it as your first message:
You are given a sorted array `arr` (ascending, no duplicates unless specified) and a target value `target`. Return the index of `target` in `arr`, or `-1` if it's not present. Write it iteratively. Be explicit about off-by-one risk (closed vs half-open interval). Then discuss: how would you find the *first* or *last* occurrence if duplicates exist? - 3
Switch to voice mode (mic icon in the chat input). Speak through each follow-up — aim for 4–6 turns.
- 4
When the interviewer says "thank you, that's all I had", type or speak this:
Print the full transcript of our conversation as alternating "Interviewer:" and "Candidate:" lines. Include every exchange verbatim. Do not paraphrase, summarize, or skip turns. Do not add commentary. - 5
Copy ChatGPT's response, paste it below, and run the debrief.