TC3006 Analysis and Design of Algorithms
Salvador Hinojosa
salvador.hinojosa@tec.mx
Session 06 | March 24, 2025
Office hours: salvadorhinojosa.youcanbook.me
Based on course materials by Dr. Hugo Terashima Marín
By the end of this session, you will be able to:
Fundamentals and Binary Search
Problem: Given a sorted array of n elements, find element x.
Since the array is sorted, we can eliminate half the elements with each comparison.
Algorithm:
Searching for 47 in a sorted array
Question: What is the worst-case time complexity of binary search on a sorted array of n elements?
Think about:
Answer:
Meta-comment: This logarithmic complexity makes binary search extremely efficient—searching 1 billion elements requires only ~30 comparisons!
Question: What is the worst case of binary search?
Recurrence Relation:
Solution:
Each recursive call halves the problem size, giving us logarithmic complexity.
Question: What is the average case of binary search?
Assumptions:
Binary Search Tree Representation:
\(n = 2^k - 1\)
Total comparisons: \(\sum_{i=1}^{k} i \cdot 2^{i-1}\)
Average complexity formula:
if x is not in the array
Step 1: Simplify the summation
Step 2: Substitute \(n = 2^k - 1\)
Step 3: Final simplification
⭐ Key Result:
Binary search has \(O(\log n)\) complexity in both worst and average cases.
This makes it one of the most efficient search algorithms for sorted data!
Definition: Given n elements, find the kth order statistic.
The kth order statistic is the element that would be at position k if the array were sorted.
Examples:
We can do better!
Problem: Find the 5th smallest element
Array: [23, 17, 31, 12, 8, 19, 25, 14, 29, 11]
Approach 1 - Sorting:
Sort → [8, 11, 12, 14, 17, 19, 23, 25, 29, 31]
Answer: 17 (at index 4)
Cost: ~34 comparisons (n log n ≈ 10 × 3.32)
Approach 2 - Selection:
Use selection algorithm → Answer: 17
Cost: ~17 comparisons (linear in n)
50% fewer comparisons!
function findMax(A[1..n]):
max = A[1]
for i = 2 to n:
if A[i] > max:
max = A[i]
return max
How many comparisons?
Claim: Any comparison-based algorithm must make at least n-1 comparisons.
Proof:
Our algorithm uses exactly n-1 comparisons → OPTIMAL!
Array: [12, 7, 23, 5, 18]
| Step | Compare | Result | Current Max | Comparisons |
|---|---|---|---|---|
| 1 | 7 vs 12 | 7 < 12 | 12 | 1 |
| 2 | 23 vs 12 | 23 > 12 | 23 | 2 |
| 3 | 5 vs 23 | 5 < 23 | 23 | 3 |
| 4 | 18 vs 23 | 18 < 23 | 23 | 4 |
Result: Maximum = 23, Comparisons = 4 = n-1 ✓
1 Find maximum using n-1 comparisons
2 Find minimum from remaining n-1 elements
This takes n-2 comparisons
Key Insight: The maximum can never lose a comparison, and the minimum can never win.
So we can separate elements into two groups!
1 Compare elements in pairs → separate into winners and losers
2 Find max among winners
3 Find min among losers
Array: [23, 17, 31, 12, 8, 19, 25, 14] (n = 8)
| Pair | Comparison | Winner | Loser |
|---|---|---|---|
| 1 | 23 vs 17 | 23 | 17 |
| 2 | 31 vs 12 | 31 | 12 |
| 3 | 8 vs 19 | 19 | 8 |
| 4 | 25 vs 14 | 25 | 14 |
Comparisons so far: 4 = n/2
Winners: [23, 31, 19, 25]
Comparisons: 23 vs 31 → 31, then 19 vs 25 → 25, then 31 vs 25 → 31
Max = 31, Comparisons: 3 = n/2 - 1
Losers: [17, 12, 8, 14]
Comparisons: 17 vs 12 → 12, then 8 vs 14 → 8, then 12 vs 8 → 8
Min = 8, Comparisons: 3 = n/2 - 1
Pair comparisons = n/2
Max among winners = (n/2) - 1
Min among losers = (n/2) - 1
Total:
Question: Using the pairing method we just derived, how many comparisons are needed to find both max and min for n = 10 elements?
Think about:
✓ Answer:
n = 10 is even, so T(10) = 3(10)/2 - 2 = 15 - 2 = 13 comparisons
This formative assessment helps verify your understanding before moving forward
Handle first element separately (acts as both initial max and min)
Remaining n-1 elements (even) use pairing method
Pair comparisons = (n-1)/2
Max among winners = (n-1)/2 (including initial element)
Min among losers = (n-1)/2 (including initial element)
Total:
| n | Naive (2n-3) | Optimized (\(\lceil 3n/2 \rceil\)-2) | Savings | % Improvement |
|---|---|---|---|---|
| 10 | 17 | 13 | 4 | 23.5% |
| 20 | 37 | 28 | 9 | 24.3% |
| 50 | 97 | 73 | 24 | 24.7% |
| 100 | 197 | 148 | 49 | 24.9% |
| 1000 | 1997 | 1498 | 499 | 25.0% |
As n grows, we save approximately 25% comparisons!
Theorem: Any comparison-based algorithm must make at least \(\lceil 3n/2 \rceil\) - 2 comparisons.
Proof (via adversary argument):
Key Theorem: The second largest element MUST have lost directly to the largest element.
Definition: A tournament tree is a complete binary tree where:
1 Build tournament tree
2 Count elements that lost to max
3 Find max among those
T(n) = Tbuild tournament + Tfind among losers
| n | \(\lceil \log_2 n \rceil\) | n + \(\lceil \log_2 n \rceil\) - 2 | Naive (2n-3) | Improvement |
|---|---|---|---|---|
| 8 | 3 | 9 | 13 | 31% |
| 16 | 4 | 18 | 29 | 38% |
| 64 | 6 | 68 | 125 | 46% |
| 1024 | 10 | 1032 | 2045 | 50% |
Array: [12, 7, 23, 5, 18, 9, 15, 11]
Round 1 (4 comparisons):
12 vs 7 → 12 | 23 vs 5 → 23 | 18 vs 9 → 18 | 15 vs 11 → 15
Round 2 (2 comparisons):
12 vs 23 → 23 | 18 vs 15 → 18
Round 3 (1 comparison):
23 vs 18 → 23 (MAX)
Total comparisons: 4 + 2 + 1 = 7 = n-1 ✓
Path from 23 to root: 3 levels → \(\lceil \log_2 8 \rceil\) = 3 comparisons → 3 losers
Losers to max: {5, 12, 18} (3 elements = \(\lceil \log_2 8 \rceil\))
12 vs 18 → 18 (1 comparison)
5 vs 18 → 18 (1 comparison)
2nd Largest = 18
Total: 7 + 2 = 9 = 8 + 3 - 2 = n + \(\lceil \log_2 n \rceil\) - 2 ✓
Proof: Why does the maximum compare against exactly \(\lceil \log_2 n \rceil\) elements?
Question: For a tournament with n = 16 elements, how many elements lose directly to the maximum?
Hint:
✓ Answer:
\(\lceil \log_2 16 \rceil\) = 4 elements lose directly to the maximum
Understanding this relationship is key to proving optimality
Information-Theoretic Approach
Definition: A proof technique that shows a lower bound by constructing an opponent (adversary) that forces any algorithm to do at least k operations.
Definition: An information unit is a fundamental piece of knowledge gained from a comparison that helps determine the solution.
For finding max and min, each comparison yields information about which elements can/cannot be max or min:
Compare A[i] vs A[j], assume A[i] < A[j]:
We need 2n - 2 information units to determine both max and min (eliminate n-1 candidates for max, n-1 candidates for min). The adversary strategy ensures we gain the minimum information per comparison.
Lemma: In the adversary game for max/min, each comparison yields at most 2 state transitions (moves).
| Comparison Type | State Changes | Information Units |
|---|---|---|
| (N, N) | N→W, N→L | 2 moves (maximum) |
| (W, W) | Loser: W→WL | 1 move |
| (L, L) | Winner: L→WL | 1 move |
| (W, N) | N→L (adversary makes W win) | 1 move |
| (L, N) | N→W (adversary makes L lose) | 1 move |
| (W, L) | None (outcome already determined) | 0 moves |
| (WL, *) | None (WL is terminal state) | 0 moves |
Proof by case analysis on state transitions:
Maximum: 2 state transitions per comparison, achieved only when comparing two N elements (one becomes W, one becomes L). ∎
Known: Total moves needed = 2n - 2
Proven: Moves per comparison ≤ 2
Therefore:
✓ Lower Bound Proven: Any algorithm finding both max and min requires at least n - 1 comparisons.
Our pairing method achieves \(\lceil 3n/2 \rceil\) - 2 ≈ 1.5n - 2 comparisons.
Gap factor: ~1.5× (optimal within a constant factor!)
| State | Meaning | Could be Max? | Could be Min? |
|---|---|---|---|
| N | Never compared | Yes | Yes |
| W | Won ≥1, Lost 0 | Yes | No |
| L | Won 0, Lost ≥1 | No | Yes |
| WL | Won ≥1, Lost ≥1 | No | No |
Algorithm can stop when:
Question: Which comparison gives the adversary MOST control (LEAST information)?
Options:
Think about:
✓ Answer: C (W vs L)
Key: W vs L reveals no new information since the outcome is predetermined!
This understanding is crucial for proving the 2n-2 lower bound
| Comparison | Adversary Response | New States | Info Units | Explanation |
|---|---|---|---|---|
| N, N | x > y | W, L | 2 | Most efficient: 2 state changes |
| W, W | x > y | W, WL | 1 | One loses "max candidacy" |
| L, L | x > y | WL, L | 1 | One loses "min candidacy" |
| W, N | x > y | W, L | 1 | N element gets classified |
| L, N | x < y | L, W | 1 | N element gets classified |
| W, L | x > y (forced) | W, L | 0 | Outcome already known! |
| WL, W | Consistent | WL, W | 0 | No new information |
Info Unit: A state transition (N→W, N→L, W→WL, or L→WL)
Initial state: All n elements in state N
Final state: 1 in W, 1 in L, n-2 in WL
Moves needed:
Best case: N vs N comparisons give 2 moves each
Other comparisons give ≤ 1 move each
Minimizing comparisons:
Case 1: n even
Case 2: n odd
Watch states change as comparisons happen (n=6)
Tracking n=6 elements through adversary strategy
Elements: [e₀, e₁, e₂, e₃, e₄, e₅]
Initial: All in state N
| Step | Compare | Result | States After | W|L|WL|N | Moves |
|---|---|---|---|---|---|
| 1 | e₀ vs e₁ | e₀ > e₁ | W,L,N,N,N,N | 1|1|0|4 | 2 |
| 2 | e₂ vs e₃ | e₂ > e₃ | W,L,W,L,N,N | 2|2|0|2 | 2 |
| 3 | e₄ vs e₅ | e₄ > e₅ | W,L,W,L,W,L | 3|3|0|0 | 2 |
| 4 | e₀ vs e₂ | e₀ > e₂ | W,L,WL,L,W,L | 2|3|1|0 | 1 |
| 5 | e₀ vs e₄ | e₀ > e₄ | W,L,WL,L,WL,L | 1|3|2|0 | 1 |
| 6 | e₃ vs e₅ | e₅ > e₃ | W,L,WL,WL,WL,L | 1|2|3|0 | 1 |
| 7 | e₁ vs e₅ | e₁ > e₅ | W,WL,WL,WL,WL,L | 1|1|4|0 | 1 |
Total: 7 comparisons, 10 moves = 2(6)-2 ✓
Formula: \(\lceil 3(6)/2 \rceil\) - 2 = 9 - 2 = 7 ✓
Optimal Pivot Selection and Recurrence Analysis
Goal: Find the kth smallest element in O(n) worst-case time
1 Divide n elements into groups of 5
2 Find median of each group (6 comparisons per group)
3 Recursively find median of the medians → pivot X
4 Partition array around X into L and R
5 Recursively search in L or R based on X's position
Step 1: Count group medians ≤ X
Step 2: Count elements in those groups
For each group whose median ≤ X:
Step 3: Total elements ≤ X
Result: The median-of-medians X satisfies:
By symmetry (same argument for elements ≥ X):
Key Insight: Both partitions have size ≤ 7n/10 + 6
This guarantees the recursion decreases by at least 30% each time!
Since n/5 + 7n/10 = 9n/10 < n, this recurrence solves to T(n) = O(n) ✓
Result: At least 30% of elements are on each side → Guaranteed partition quality!
Elements guaranteed ≤ X: ≥ 3n/10 - 6
Elements guaranteed ≥ X: ≥ 3n/10 - 6 (by symmetry)
Worst case partition size:
Similarly for elements < X
This guarantees at least 30% reduction each recursion!
m = \(\lceil n/5 \rceil\) groups
Finding medians: 6m comparisons (6 per group of 5)
Partitioning: ≤ 2(m-1) comparisons (optimized)
Finding median of medians: T(m)
Recursive call on partition: T\((7n/10 + 6)\)
Recurrence relation:
Key observation:
Guess T(n) ≤ cn for some constant c:
We need: 1.6n + 0.9cn ≤ cn
Result:
This means approximately 16-17n comparisons in worst case!
| Group Size | Guarantee | Recursive Sizes | Sum | Result |
|---|---|---|---|---|
| 3 | ≥ 25% | n/3 + 3n/4 | 13n/12 > n | O(n log n) ❌ |
| 5 | ≥ 30% | n/5 + 7n/10 | 9n/10 < n | O(n) ✓ |
| 7 | ≥ 37% | n/7 + 5n/7 | 6n/7 < n | O(n) ✓ (slower) |
| 9 | ≥ 44% | n/9 + 5n/9 | 6n/9 < n | O(n) ✓ (much slower) |
Why 5 is optimal: Smallest group size that guarantees linear time with reasonable constant!
Array (n=15): [3, 7, 2, 9, 1, 8, 5, 6, 4, 15, 11, 13, 10, 14, 12]
Find: 8th smallest
G1: [3,7,2,9,1] → sorted: [1,2,3,7,9] → median: 3
G2: [8,5,6,4,15] → sorted: [4,5,6,8,15] → median: 6
G3: [11,13,10,14,12] → sorted: [10,11,12,13,14] → median: 12
Medians: [3, 6, 12]
Pivot X = 6 (median of [3, 6, 12])
L (< 6): [3, 2, 1, 5, 4] → 5 elements
X: [6] → position 6
R (> 6): [7, 9, 8, 15, 11, 13, 10, 14, 12] → 9 elements
Looking for 8th smallest, X is at position 6
8 > 6, so search for (8-6)=2nd smallest in R
Find 2nd smallest in [7, 9, 8, 15, 11, 13, 10, 14, 12]
Answer: 8 ✓
| Problem | Comparisons | Type | Proven Optimal? |
|---|---|---|---|
| Find maximum | n - 1 | Exact | ✅ Yes |
| Find minimum | n - 1 | Exact | ✅ Yes |
| Find both max & min | \(\lceil 3n/2 \rceil\) - 2 | Exact | ✅ Yes |
| Find 2nd largest | n + \(\lceil \log_2 n \rceil\) - 2 | Exact | ✅ Yes |
| Find kth element | O(n) | Asymptotic | ✅ Yes (time) |
| Find median | ≥ 3(n-1)/2 | Lower bound | ❓ Gap exists |
Open Problem: For median, best known algorithm uses ~17n comparisons, but lower bound is only 1.5n. Can we close this gap?
Precise analysis reveals optimal algorithms
Prove fundamental limits
Can achieve optimal bounds
Assignment: HH-HWork-06
Due Date: March 31, 2025
Topics Covered: Selection algorithms, adversary arguments, lower bound proofs, median-of-medians analysis
⚠️ Important Note:
Median-of-medians algorithm was not covered in lecture due to time constraints.
Required Reading: Sara Baase Chapter 5, Section 5.5 before starting the assignment.
Office hours available for questions: salvadorhinojosa.youcanbook.me
Office Hours: Available for questions on all topics including median-of-medians
Contact: salvador.hinojosa@tec.mx
Salvador Hinojosa
salvador.hinojosa@tec.mx
CS-4012: Advanced Algorithm Analysis
Session 06 | March 24, 2025
Course materials based on the work of Dr. Hugo Terashima Marín