Image Transformation

An image transform is basically an alternative representation of a two-dimensional image signal. An image is generally written as f(x, y), where x and y are the two spatial dimensions, so we say it is a two-dimensional signal that can be represented on a plane.

The image signal holds this two-dimensional visual information, and an image transform gives us an alternative basis in which to represent it. The transformed domain may not always look meaningful to the human eye, but the information is still stored there, in a form that makes certain digital image processing tasks more convenient.

Efficient representation of visual information lies at the foundation of many image processing tasks. These include image filtering (both low-pass and high-pass filtering are possible), image compression, where we need to store images in less memory, and feature extraction. A feature is any useful piece of information extracted from the image, so feature extraction is one of the tasks of image analysis.

Many tasks on images require an efficient representation: it should carry no redundancy, it should allow the image to be reproduced, and it should support further processing. The efficiency of a representation can be defined in one line: it is the ability to capture the significant information of the image in a small description. The word "significant" matters most, and so does "small": with less memory and fewer samples we should still be able to represent that much image information. Image transforms make this possible.

Efficient image transforms are used extensively in image processing and image analysis. Mathematically, a transform is a tool that allows us to switch from one domain to another.
Generally, the time domain or spatial domain is switched to the frequency domain.
An image is usually represented in the spatial domain because x and y are spatial dimensions, that is, measurements of length. Instead of working with the spatial dimensions, we can switch to the frequency domain.

Using these image transforms, we can see what frequency content a particular image has.

What exactly is the reason to switch from one domain to another?

The reason is to perform the task at hand in an easier manner.

A task that is very complex in one domain may become a simpler one in another domain. Thus we transform the image from one domain to another.

One advantage is that transforming the image may isolate critical components of the image pattern so that they are directly accessible for analysis. The end use of image processing is to analyze the image and use the result in an application, so image analysis becomes much easier once the critical components have been isolated by an image transform.

Another advantage is that the transformation may place the image data in a more compact form so that it can be stored and transmitted efficiently.

Compact form means that more information can be stored in less memory.

Image transforms are also used for fast computation of operations such as 2D convolution and correlation.
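
The link between transforms and convolution can be sketched as follows. This is a minimal illustration, assuming NumPy and circular (wrap-around) convolution on a small 4×4 image; the function names and the 2×2 averaging kernel are made up for the example. Multiplying the two spectra and transforming back should give the same result as the slow direct sum.

```python
import numpy as np

def circular_convolve2d_direct(image, kernel):
    """Circular 2D convolution computed directly from the definition (slow)."""
    n_rows, n_cols = image.shape
    out = np.zeros((n_rows, n_cols))
    for x in range(n_rows):
        for y in range(n_cols):
            total = 0.0
            for i in range(n_rows):
                for j in range(n_cols):
                    # Indices wrap around, which is what the DFT assumes.
                    total += image[i, j] * kernel[(x - i) % n_rows, (y - j) % n_cols]
            out[x, y] = total
    return out

def circular_convolve2d_fft(image, kernel):
    """Same operation via the frequency domain: multiply the two spectra."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))

image = np.arange(1, 17, dtype=float).reshape(4, 4)

# Hypothetical 2x2 averaging kernel, zero-padded to the image size.
kernel = np.zeros((4, 4))
kernel[:2, :2] = 0.25

direct = circular_convolve2d_direct(image, kernel)
fast = circular_convolve2d_fft(image, kernel)
print(np.allclose(direct, fast))  # True
```

For large images the frequency-domain route is usually much faster, because the FFT needs far fewer operations than the four nested loops.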

Image Transformation is a process that changes an image — either its shape, size, position, orientation, or the way it is represented (for example, converting it to frequency form).

🧭 Two main kinds:

  1. Geometric Transformations → change how the image looks
    • Scaling – make the image bigger or smaller
    • Rotation – turn the image around
    • Translation – move the image to a new spot
  2. Frequency Transformations → change how the image is represented
    • DFT (Discrete Fourier Transform)
    • DCT (Discrete Cosine Transform)
    • DST (Discrete Sine Transform)
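
The three geometric transformations listed above can be sketched with small matrices. This is only an illustration, assuming NumPy and points given as (x, y) pairs in homogeneous coordinates; the helper names are invented for the example, and a real image would also need pixel resampling, which is left out here.

```python
import numpy as np

def scaling(sx, sy):
    """Matrix that makes coordinates sx times wider and sy times taller."""
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def rotation(theta):
    """Matrix that turns coordinates counter-clockwise by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

def translation(tx, ty):
    """Matrix that moves coordinates by (tx, ty)."""
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def apply(matrix, points):
    """Apply a 3x3 homogeneous transform to an array of (x, y) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ matrix.T)[:, :2]

# Corners of a unit square standing in for the image outline.
corners = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

scaled = apply(scaling(2, 2), corners)
rotated = apply(rotation(np.pi / 2), corners)
moved = apply(translation(3, 1), corners)
print(scaled)
print(np.round(rotated, 6))
print(moved)
```

Scaling by 2 sends the corner (1, 1) to (2, 2), a quarter-turn sends (1, 0) to (0, 1), and the translation adds (3, 1) to every corner.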

What is a Frequency Transformation?

Simple idea:

  • Normally, an image is made of pixels — tiny dots that show brightness or color.
  • Frequency transformation changes the way we look at the image: instead of seeing the dots, we see how fast brightness changes in the image.

In short:

Frequency transformation = “Looking at how smooth or detailed the image is” instead of just the pixels.

Example 

  • Smooth sky → slow changes → low frequency
  • Sharp edges or text → fast changes → high frequency
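
The smooth-versus-sharp idea can be checked numerically. The sketch below assumes NumPy and two made-up 8×8 test images: one slow cosine ripple standing in for the smooth sky, and a checkerboard standing in for sharp detail. The helper `dominant_frequency` is hypothetical; it reports where the strongest non-constant wave sits.

```python
import numpy as np

N = 8
row, col = np.mgrid[0:N, 0:N]

# One slow brightness cycle across the width of the image.
smooth = np.cos(2 * np.pi * col / N)
# Brightness flips at every pixel in both directions.
sharp = (-1.0) ** (row + col)

def dominant_frequency(image):
    """Return the (vertical, horizontal) frequency index of the strongest wave."""
    spectrum = np.abs(np.fft.fft2(image))
    spectrum[0, 0] = 0.0  # ignore the constant (average brightness) term
    u, v = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    n_rows, n_cols = image.shape
    # Indices above N/2 mirror the low ones, so fold them back.
    return min(int(u), n_rows - int(u)), min(int(v), n_cols - int(v))

print(dominant_frequency(smooth))  # (0, 1): slow change, low frequency
print(dominant_frequency(sharp))   # (4, 4): fastest possible change, high frequency
```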

The Three Main Types

A) DFT – Discrete Fourier Transform
  • Idea: Breaks the image into waves (ripples).
  • Waves can be slow (smooth areas) or fast (edges).
  • Use: Helps analyze patterns, remove noise, or sharpen/blur images.
  • Easy way to say it: “DFT looks at the image like it’s made of many tiny waves.”

    DFT (2D)

    • Formula

    • Step 1: Define angles (phase)

    • Step 2: Multiply each pixel by corresponding wave (cos + sin)

    • Step 3: Sum contributions

    • Step 4: Show real + imaginary parts

    • Step 5: Final coefficient table (4×4)

    • Illustration: “Bright center = low freq, dark corners = high freq”
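
For reference, the outline above corresponds to the standard 2D DFT of an N×N image. The form below assumes the common unnormalized forward convention; some texts place a factor of 1/N or 1/N² in front, so treat the scaling as an assumption.

```latex
F(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi (ux + vy)/N}
       = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\,\bigl[\cos\theta - j\sin\theta\bigr],
\qquad \theta = \frac{2\pi (ux + vy)}{N}
```

The angle θ is the phase from Step 1, the cosine term gives the real part and the sine term gives the imaginary part of each coefficient.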

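    Let's compute a full 4×4 example step by step, showing how every single pixel contributes to each coefficient.

    For simplicity and clarity, each wave is replaced by a +1/-1 numeric pattern. This is a real-valued simplification (the patterns are the ones used by the Walsh-Hadamard transform). The true DFT multiplies by complex waves, so its coefficients are generally complex: for this image the true DFT gives F(0,1) = -8 + 8j, for example, rather than the -16 obtained below.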


    4×4 Example Image

    f(x,y) = [
       1   2   3   4
       5   6   7   8
       9  10  11  12
      13  14  15  16
    ]

    N = 4

    Row/column patterns:

    Frequency Pattern (+1/-1)
    0 + + + +
    1 + + – –
    2 + – + –
    3 + – – +

    Step 1: Compute F(0,0)

    • Row factor u=0: + + + +
    • Column factor v=0: + + + +

    Multiply each pixel × row × column = pixel itself:

    Row0: 1+2+3+4 = 10
    Row1: 5+6+7+8 = 26
    Row2: 9+10+11+12 = 42
    Row3: 13+14+15+16 = 58
    Total sum = 10+26+42+58 = 136
    

    ✅ F(0,0) = 136


    Step 2: Compute F(1,0)

    • Row factor u=1: + + – –
    • Column factor v=0: + + + +

    Multiply row sums × row factor:

    Row0 sum=10 → 10*1 = 10
    Row1 sum=26 → 26*1 = 26
    Row2 sum=42 → 42*-1 = -42
    Row3 sum=58 → 58*-1 = -58
    Sum = 10+26-42-58 = -64
    

    ✅ F(1,0) = -64


    Step 3: Compute F(2,0)

    • Row factor u=2: + – + –
    • Column factor v=0: + + + +
    Row0: 10*+1 = 10
    Row1: 26*-1 = -26
    Row2: 42*+1 = 42
    Row3: 58*-1 = -58
    Sum = 10-26+42-58 = -32
    

    ✅ F(2,0) = -32


    Step 4: Compute F(3,0)

    • Row factor u=3: + – – +
    • Column factor v=0: + + + +
    Row0: 10*+1 = 10
    Row1: 26*-1 = -26
    Row2: 42*-1 = -42
    Row3: 58*+1 = 58
    Sum = 10-26-42+58 = 0
    

    ✅ F(3,0) = 0


    Step 5: Compute F(0,1)

    • Row factor u=0: + + + +
    • Column factor v=1: + + – –

    Multiply pixel × column factor:

    Row0: [1,2,3,4] × [+,+,-,-] = [1,2,-3,-4] → sum=-4
    Row1: [5,6,7,8] × [+,+,-,-] = [5,6,-7,-8] → sum=-4
    Row2: [9,10,11,12] × [+,+,-,-] = [9,10,-11,-12] → sum=-4
    Row3: [13,14,15,16] × [+,+,-,-] = [13,14,-15,-16] → sum=-4
    Sum all rows = -4-4-4-4 = -16
    

    ✅ F(0,1) = -16


    Step 6: Compute F(0,2)

    • Row factor u=0: + + + +
    • Column factor v=2: + – + –
    Row0: [1,2,3,4] × [+,-,+,-] = [1,-2,3,-4] → sum=-2
    Row1: [5,6,7,8] × [+,-,+,-] = [5,-6,7,-8] → sum=-2
    Row2: [9,10,11,12] × [+,-,+,-] = [9,-10,11,-12] → sum=-2
    Row3: [13,14,15,16] × [+,-,+,-] = [13,-14,15,-16] → sum=-2
    Total sum = -2-2-2-2=-8
    

    ✅ F(0,2) = -8


    Step 7: Compute F(0,3)

    • Row factor u=0: + + + +
    • Column factor v=3: + – – +
    Row0: [1,2,3,4] × [+,-,-,+] = [1,-2,-3,4] → sum=0
    Row1: [5,6,7,8] × [+,-,-,+] = [5,-6,-7,8] → sum=0
    Row2: [9,10,11,12] × [+,-,-,+] = [9,-10,-11,12] → sum=0
    Row3: [13,14,15,16] × [+,-,-,+] = [13,-14,-15,16] → sum=0
    Total sum = 0+0+0+0 = 0
    

    ✅ F(0,3) = 0


    Step 8: Compute F(1,1)

    • Row factor u=1: + + – –
    • Column factor v=1: + + – –

    Row0: [1,2,3,4] × row × col = 1*1*1=1, 2*1*1=2, 3*1*-1=-3, 4*1*-1=-4 → sum=-4
    Row1: [5,6,7,8] × row × col = 5*1*1=5, 6*1*1=6, 7*1*-1=-7, 8*1*-1=-8 → sum=-4

    Rows 2 and 3 have row factor -1, so compute them elementwise:

    • Row2 pixels: 9,10,11,12
    • Row factor = -1 (u=1)
    • Column factors = +,+,-,- (v=1)
    9*-1*1=-9
    10*-1*1=-10
    11*-1*-1=11
    12*-1*-1=12
    Row2 sum = -9-10+11+12=4
    
    • Row3 pixels: 13,14,15,16
    • Row factor = -1
    • Column factors = +,+,-,-
    13*-1*1=-13
    14*-1*1=-14
    15*-1*-1=15
    16*-1*-1=16
    Row3 sum = -13-14+15+16=4
    
    • Total sum = -4-4+4+4=0

    ✅ F(1,1) = 0


    Step 9: Compute F(1,2)

    • Row factor u=1: + + – –
    • Column factor v=2: + – + –

    Compute each pixel:

    Row0: [1,2,3,4] × row*col

    1*1*1 =1
    2*1*-1=-2
    3*1*1=3
    4*1*-1=-4
    Row0 sum = 1-2+3-4 = -2
    

    Row1: [5,6,7,8] × row*col

    5*1*1=5
    6*1*-1=-6
    7*1*1=7
    8*1*-1=-8
    Row1 sum = -2
    

    Row2: [9,10,11,12] × row*col

    9*-1*1=-9
    10*-1*-1=10
    11*-1*1=-11
    12*-1*-1=12
    Row2 sum = -9+10-11+12=2
    

    Row3: [13,14,15,16] × row*col

    13*-1*1=-13
    14*-1*-1=14
    15*-1*1=-15
    16*-1*-1=16
    Row3 sum = -13+14-15+16=2
    

    Sum all rows = -2-2+2+2=0

    ✅ F(1,2)=0


    Step 10: Compute F(1,3)

    • Row factor u=1: + + – –
    • Column factor v=3: + – – +

    Row0: 1*1*1=1, 2*1*-1=-2, 3*1*-1=-3, 4*1*1=4 → sum=0
    Row1: 5*1*1=5, 6*1*-1=-6, 7*1*-1=-7, 8*1*1=8 → sum=0
    Row2: 9*-1*1=-9, 10*-1*-1=10, 11*-1*-1=11, 12*-1*1=-12 → sum=0
    Row3: 13*-1*1=-13, 14*-1*-1=14, 15*-1*-1=15, 16*-1*1=-16 → sum=0
    Total sum = 0+0+0+0 = 0

    ✅ F(1,3)=0


    The same method is applied for all remaining coefficients (u=2,3; v=1,2,3). Each of them evaluates to 0 for this image.


    Final 4×4 Coefficient Table (+1/-1 pattern method)

    u\v     0     1     2     3
    0     136   -16    -8     0
    1     -64     0     0     0
    2     -32     0     0     0
    3       0     0     0     0

    ✅ This table follows from the steps above and shows how each pixel contributes to each coefficient.
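
The table can be reproduced in a few lines. This is a sketch assuming NumPy; the names `P`, `F_pattern` and `F_dft` are chosen for the example. It applies the four +1/-1 patterns to the rows and columns of the image, and for comparison also computes the standard complex DFT with `np.fft.fft2`, whose values differ because the +1/-1 patterns are only a real-valued stand-in for the complex waves.

```python
import numpy as np

f = np.arange(1, 17).reshape(4, 4)

# Rows are the +1/-1 patterns for frequency index 0..3, as in the pattern table.
P = np.array([
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1,  1, -1],
    [1, -1, -1,  1],
])

# F(u, v) = sum over x, y of P[u, x] * f[x, y] * P[v, y]
F_pattern = P @ f @ P.T
print(F_pattern)
# [[136 -16  -8   0]
#  [-64   0   0   0]
#  [-32   0   0   0]
#  [  0   0   0   0]]

# Standard complex DFT of the same image, for comparison.
F_dft = np.fft.fft2(f)
print(np.round(F_dft, 6))
```

Both results share F(0,0) = 136 and the two -8 and -32 entries at index 2, but the DFT has complex values such as F(0,1) = -8 + 8j and F(1,0) = -32 + 32j where the pattern method gives -16 and -64.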


B) DCT – Discrete Cosine Transform
  • Idea: Similar to DFT but uses only cosine waves, so the coefficients are real numbers.
  • Important point: It packs most of the image's information into a few low-frequency coefficients, so the small remaining details can be dropped with little visible loss.
  • Use: Used in JPEG images and video compression.
  • Easy way to say it: “DCT gathers the main part of the image into a few numbers, so small details can be thrown away to make the file smaller.”

    DCT (2D, Type-II)

    • Formula

    • Alpha normalization

    • Step 1: Compute cosines for each pixel

    • Step 2: Multiply pixel × cosine × cosine

    • Step 3: Sum contributions

    • Step 4: Multiply by alpha(u)×alpha(v)

    • Step 5: Final coefficient table

    • Illustration: Energy compaction in top-left corner
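
The outline above can be sketched for the same 4×4 image. This assumes NumPy and the orthonormal form of the Type-II DCT, in which alpha(0) = sqrt(1/N) and alpha(u) = sqrt(2/N) for u > 0; other scalings exist, so the normalization is an assumption. The helper name `dct2_matrix` is made up for the example.

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II matrix: row u holds alpha(u) * cos((2x + 1) * u * pi / (2n))."""
    u = np.arange(n).reshape(-1, 1)   # frequency index, one per row
    x = np.arange(n).reshape(1, -1)   # pixel index, one per column
    basis = np.cos((2 * x + 1) * u * np.pi / (2 * n))
    alpha = np.full((n, 1), np.sqrt(2.0 / n))
    alpha[0, 0] = np.sqrt(1.0 / n)
    return alpha * basis

f = np.arange(1, 17, dtype=float).reshape(4, 4)
C = dct2_matrix(4)

# Steps 1-4 in one line: cosines along rows, cosines along columns, sum, scale by alpha.
F = C @ f @ C.T
print(np.round(F, 3))

# Energy compaction: share of the total squared magnitude held by the top-left coefficient.
energy = F ** 2
print(energy[0, 0] / energy.sum())

# The transform is invertible, so nothing is lost until coefficients are discarded.
print(np.allclose(C.T @ F @ C, f))  # True
```

With this scaling F(0,0) is 34, the sum of all pixels divided by N. Compression comes from a later step that drops or coarsely rounds the small coefficients, not from the transform itself.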

C) DST – Discrete Sine Transform
  • Idea: Similar to DCT but uses sine waves instead of cosine.
  • Use: Mostly used in scientific and engineering image problems, not everyday photos.
  • Easy way to say it: “DST is like DCT but suits special images whose values fall to zero at the borders.”

    DST (2D, Type-II)

    • Formula

    • Step 1: Compute sine basis for each pixel

    • Step 2: Multiply pixel × sine × sine

    • Step 3: Sum contributions

    • Step 4: Show final coefficient table

    • Illustration: Sine-only basis, emphasizes boundary
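
A matching sketch for the DST, under the same assumptions as before: NumPy, the 4×4 example image, and an orthonormal scaling of the Type-II sine basis in which every row is scaled by sqrt(2/N) except the last, which uses sqrt(1/N). The helper name `dst2_matrix` is hypothetical.

```python
import numpy as np

def dst2_matrix(n):
    """Orthonormal DST-II matrix: row k-1 holds scale(k) * sin((2x + 1) * k * pi / (2n)), k = 1..n."""
    k = np.arange(1, n + 1).reshape(-1, 1)  # sine frequencies start at 1, not 0
    x = np.arange(n).reshape(1, -1)
    basis = np.sin((2 * x + 1) * k * np.pi / (2 * n))
    scale = np.full((n, 1), np.sqrt(2.0 / n))
    scale[-1, 0] = np.sqrt(1.0 / n)         # the last row has twice the energy of the others
    return scale * basis

f = np.arange(1, 17, dtype=float).reshape(4, 4)
S = dst2_matrix(4)

# Steps 1-3: sines along rows, sines along columns, then sum.
F = S @ f @ S.T
print(np.round(F, 3))

# The rows of S are orthonormal, so the image can be recovered exactly.
print(np.allclose(S @ S.T, np.eye(4)))  # True
print(np.allclose(S.T @ F @ S, f))      # True
```

Unlike the DCT, there is no constant (all-ones) row here, so the average brightness is spread over several coefficients instead of sitting in a single one.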


Summary Table

Transform   What it does                                      Where used              Easy way to remember
DFT         Breaks image into waves (slow + fast)             Filtering, sharpening   “All waves”
DCT         Breaks image into cosine waves, keeps main info   JPEG, MP4               “Compression waves”
DST         Breaks image into sine waves                      Science/engineering     “Special sine waves”

Frequency transformations help us look at the image in terms of smooth vs. detailed areas instead of individual pixels.
• DFT: All waves → analyze everything
• DCT: Cosine waves → compress images
• DST: Sine waves → special cases

© 2025 Digimatx | Privacy Policy