Post Details
image segmentation
Image Segmentation: Complete Guide with Examples
1. What is Image Segmentation?
Segmentation is the process of dividing an image into meaningful regions or objects. It is commonly used for object recognition, tracking, and analysis.
Segmentation approaches:
- Edge-based segmentation: Detect boundaries between regions.
- Thresholding: Separate regions based on intensity values.
- Region-based segmentation: Group pixels with similar properties.
2. Edge Detection
Edges are points in an image where intensity changes sharply. They define object boundaries.
2.1 Prewitt Operator
Kernels:
Horizontal (Gx):
-1 0 1 -1 0 1 -1 0 1
Vertical (Gy):
1 1 1 0 0 0 -1 -1 -1
Example 3×3 patch:
| 52 | 55 | 61 |
| 62 | 59 | 55 |
| 63 | 65 | 66 |
Compute Horizontal Gradient (Gx):
Gx = (-1*52 + 0*55 + 1*61) + (-1*62 + 0*59 + 1*55) + (-1*63 + 0*65 + 1*66) = 5
Compute Vertical Gradient (Gy):
Gy = (1*52 + 1*55 + 1*61) + (0*62 + 0*59 + 0*55) + (-1*63 + -1*65 + -1*66) = -26
Edge Magnitude and Direction:
Magnitude = √(Gx² + Gy²) ≈ 26.49
Direction θ = arctan(Gy / Gx) ≈ -79.1°
Edge Highlight (Red = Edge Pixels):
| 26.49 | … | … |
| … | … | … |
| … | … | … |
2.2 Sobel Operator
Kernels:
Horizontal (Gx):
-1 0 1 -2 0 2 -1 0 1
Vertical (Gy):
1 2 1 0 0 0 -1 -2 -1
Compute Horizontal Gradient (Gx):
Gx = (-1*52 + 0*55 + 1*61) + (-2*62 + 0*59 + 2*55) + (-1*63 + 0*65 + 1*66) = -2
Compute Vertical Gradient (Gy):
Gy = (1*52 + 2*55 + 1*61) + (0*62 + 0*59 + 0*55) + (-1*63 + -2*65 + -1*66) = -36
Edge Magnitude and Direction:
Magnitude ≈ 36.06
Direction θ ≈ 86.8°
Edge Highlight (Red = Edge Pixels):
| 36.06 | … | … |
| … | … | … |
| … | … | … |
Canny
1. Original 3×3 Patch
| 52 | 55 | 61 |
| 62 | 59 | 55 |
| 63 | 65 | 66 |
2. Canny Edge Detection for All Pixels
Step 1: Gaussian Smoothing
Smoothed 3×3 patch:
| 54 | 55 | 60 |
| 60 | 59 | 56 |
| 62 | 64 | 65 |
Step 2: Compute Gradients (Sobel)
Kernels:
Gx = [-1 0 1; -2 0 2; -1 0 1] Gy = [1 2 1; 0 0 0; -1 -2 -1]
Gx (Horizontal) Computations:
- Pixel(1,1): (-1*54+0*55+1*60)+(-2*60+0*59+2*56)+(-1*62+0*64+1*65)=1
- Pixel(1,2): (-1*55+0*61+1*0)+(-2*59+0*55+2*0)+(-1*64+0*65+1*0)= -58 (approx border)
- Pixel(1,3): (-1*61+0*0+1*0)+(-2*55+0*0+2*0)+(-1*65+0*0+1*0)= -181
- Pixel(2,1): (-1*60+0*59+1*56)+(-2*62+0*59+2*55)+(-1*62+0*64+1*65)= -3
- Pixel(2,2): (-1*59+0*56+1*0)+(-2*59+0*0+2*0)+(-1*64+0*0+1*0)= -182
- Pixel(2,3): (-1*56+0*0+1*0)+(-2*55+0*0+2*0)+(-1*65+0*0+1*0)= -176
- Pixel(3,1): border ≈ -3
- Pixel(3,2): border ≈ -182
- Pixel(3,3): border ≈ -176
Gy (Vertical) Computations:
- Pixel(1,1): (1*54+2*55+1*60)+(0*60+0*59+0*56)+(-1*62-2*64-1*65)= -25
- Pixel(1,2): border ≈ -60
- Pixel(1,3): border ≈ -110
- Pixel(2,1): border ≈ -28
- Pixel(2,2): (1*59+2*56+1*0)+(0*59+0*0+0*0)+(-1*64-2*65-1*0)= -83
- Pixel(2,3): border ≈ -120
- Pixel(3,1): border ≈ -31
- Pixel(3,2): border ≈ -86
- Pixel(3,3): border ≈ -125
Step 3: Gradient Magnitude
Magnitude = √(Gx² + Gy²)
| √(1²+(-25)²) ≈ 25.02 | √((-58)²+(-60)²) ≈ 83.54 | √((-181)²+(-110)²) ≈ 209.4 |
| √((-3)²+(-28)²) ≈ 28.16 | √((-182)²+(-83)²) ≈ 200.4 | √((-176)²+(-120)²) ≈ 213.5 |
| √((-3)²+(-31)²) ≈ 31.14 | √((-182)²+(-86)²) ≈ 201.6 | √((-176)²+(-125)²) ≈ 215.4 |
Step 4: Non-Maximum Suppression
Keep only local maxima along gradient direction (simplified, highlight largest magnitude in neighborhood):
| 0 | 0 | 209.4 |
| 0 | 0 | 213.5 |
| 0 | 0 | 215.4 |
Step 5: Double Thresholding
- High = 200 → Strong edges (red)
- Low = 50 → Weak edges (orange)
Strong edges ≥200 → highlight in red. Weak edges (50-200) → highlight orange.
| 25.02 | 83.54 | 209.4 |
| 28.16 | 200.4 | 213.5 |
| 31.14 | 201.6 | 215.4 |
Step 6: Edge Tracking by Hysteresis
Strong edges (red) kept, weak edges (orange) connected to strong edges are retained. All unconnected weak edges removed.
| 0 | 83.54 (kept) | 209.4 |
| 28.16 (kept) | 200.4 | 213.5 |
| 31.14 (kept) | 201.6 | 215.4 |
3. Thresholding Techniques
3.1 Simple Thresholding
Threshold T = 60 → Pixels > 60 = Foreground (1), ≤ 60 = Background (0)
| 52 → 0 | 55 → 0 | 61 → 1 |
| 62 → 1 | 59 → 0 | 55 → 0 |
| 63 → 1 | 65 → 1 | 66 → 1 |
3.2 Adaptive Thresholding
Threshold = Local mean of 3×3 neighborhood. Example: center 59 → Mean = 59.56 → Background (0)
3.3 Otsu’s Method
Compute global optimum T ≈ 60 → Pixels >60 = Foreground
| 52 → 0 | 55 → 0 | 61 → 1 |
| 62 → 1 | 59 → 0 | 55 → 0 |
| 63 → 1 | 65 → 1 | 66 → 1 |
4. Summary
| Method | Step | Example / Computation | Visual |
|---|---|---|---|
| Canny | Full 3×3 gradient & threshold | Gx, Gy, Mag computed for each pixel, double threshold 50-200 | Red = Strong edge, Orange = Weak edge |
| Simple Threshold | Binary | T=60 → Pixels >60 = 1 | Green = Foreground |
| Adaptive Threshold | Local mean | Center 59 ≤ 59.56 → 0 | 0 |
| Otsu | Global optimum | T≈60 → Pixels >60 = 1 | Green = Foreground |
Legend: Red = Strong Edge, Orange = Weak Edge, Green = Threshold Foreground, 0 = Background