1 of 35

The Math Behind the Pixels: Linear Algebra & Statistics in Computer Vision

2 of 35

  • At its core, computer vision is an applied field of AI, and its language is mathematics.
  • To truly understand how algorithms “see” and interpret images, you must grasp the foundational concepts of Linear Algebra and Statistics.
  • These fields provide the tools to represent, manipulate, and make sense of the visual data.

3 of 35

  • Computer vision is the study of extracting knowledge from images, and that extraction relies on core concepts of linear algebra.
  • Vector: A 1-D array of numbers, usually interpreted as a quantity that has magnitude and direction.
  • Matrix: A 2-D array of numbers, for example the pixel representation of an image. Matrices can also encode transformations such as projection, translation, rotation, and scaling.
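
As a small illustration of the two definitions above, here is a minimal sketch in plain Python. The values of `v` and `m` are made up for the example; in practice a numerical library would usually hold these arrays.

```python
import math

# A 2-D vector stored as a plain list (a 1-D array of numbers).
v = [3.0, 4.0]

# Magnitude: the Euclidean length of the vector.
magnitude = math.hypot(v[0], v[1])

# Direction: the angle from the positive x-axis, in degrees.
direction_deg = math.degrees(math.atan2(v[1], v[0]))

# A matrix stored as a list of rows (a 2-D array of numbers).
m = [
    [1, 2, 3],
    [4, 5, 6],
]
rows, cols = len(m), len(m[0])

print(magnitude)    # 5.0
print(rows, cols)   # 2 3
```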

4 of 35

  • Linear algebra is the framework for all image data.
  • In a computer vision context, images are not just pictures; they are highly structured numerical grids.
  • Vectors: In computer vision, a vector can be a single row or column of pixels.

  • Pixel values: A vector can represent a single pixel’s color information, like [255, 0, 0] for pure red in an RGB image.

  • Images as Matrices: A grayscale image is a matrix of pixel intensity values. A 512×512 image is a 512×512 matrix. Operations like blurring, sharpening, or edge detection are computations on this matrix.
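
To make the idea of an image operation as a matrix computation concrete, the sketch below applies a 3×3 mean (box) blur to a tiny grayscale image. The function name `box_blur`, the border handling, and the 3×3 test image are assumptions chosen for illustration, not something the slides specify.

```python
def box_blur(image):
    """Return a 3x3 mean-blurred copy of a grayscale image (list of rows).

    Border pixels average only the neighbors that exist inside the image.
    """
    height, width = len(image), len(image[0])
    blurred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += image[ny][nx]
                        count += 1
            blurred[y][x] = total // count  # integer mean stays in 0..255
    return blurred


# A tiny 3x3 grayscale image: one bright pixel on a black background.
image = [
    [0, 0, 0],
    [0, 255, 0],
    [0, 0, 0],
]
blurred = box_blur(image)
for row in blurred:
    print(row)
```

The single bright pixel is spread over its neighbors, which is the smoothing effect a blur is meant to produce.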

5 of 35

  • Tensor: A generalization of vectors and matrices to any number of dimensions. A vector is a 1-D tensor and a matrix is a 2-D tensor.

6 of 35

  • Grayscale images can be represented as a 2-D matrix of pixels, where each pixel is an intensity level ranging from 0 (black) to 255 (white).
  • Color images are represented by a 3-D array, because each pixel carries more information: it is an array of length 3 holding the intensity values (0–255) of the Red, Green, and Blue channels.
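
The difference between the two layouts can be sketched with nested lists. The pixel values and the helper `shape` are hypothetical, introduced only to show how the nesting depth changes from grayscale to color.

```python
# Grayscale: a 2-D grid, one intensity (0-255) per pixel.
gray = [
    [0, 128],
    [200, 255],
]

# Color: a 3-D grid, one [R, G, B] triple per pixel.
color = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]


def shape(grid):
    """Return the size of each nesting level of a regular nested list."""
    dims = []
    while isinstance(grid, list):
        dims.append(len(grid))
        grid = grid[0]
    return tuple(dims)


gray_shape = shape(gray)     # (2, 2): height x width
color_shape = shape(color)   # (2, 2, 3): height x width x channels

# Pull out the red channel as its own 2-D matrix.
red_channel = [[pixel[0] for pixel in row] for row in color]
```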

7 of 35

  • Consider the following image and its black-and-white variant.
  • The image can be represented as a 16×16 grid of small pieces called pixels. A pixel is the smallest graphical element of an image and can take only one color at a time.

8 of 35

  • If we assign a number to each color, the grid of pixels can be represented as a numerical matrix.
  • If, in the previous image, we assign 1 to white and 0 to black, the image can be represented as a 16×16 matrix whose elements are the numbers 0 and 1.
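
The same mapping can be sketched on a smaller grid. The 4×4 `picture` below is a made-up stand-in for the 16×16 image on the slide; only the convention (white is 1, black is 0) comes from the text.

```python
# Hypothetical 4x4 picture: '.' marks a white pixel, '#' a black one.
picture = [
    ".##.",
    "#..#",
    "#..#",
    ".##.",
]

# Apply the convention from the text: white -> 1, black -> 0.
matrix = [[1 if ch == "." else 0 for ch in row] for row in picture]

for row in matrix:
    print(row)
```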

9 of 35

10 of 35

  • Using the same procedure, we can also represent grayscale images as matrices, but in this case there are more than two values. Most digital files use integers between 0 (black) and 255 (white) to represent the intensity.

11 of 35

  • The matrix representation of a color image depends on the color system used by the program that processes the image.
  • Here we use RGB, the most popular one, where each pixel specifies the amount of Red (R), Green (G), and Blue (B), and each component can vary from 0 to 255.
  • Thus, in RGB, a pixel can be represented as a three-dimensional vector (r, g, b), where r, g, and b are integers from 0 to 255.

12 of 35

13 of 35

How RGB values are packed into a single integer and then unpacked again:

  • r = red component (0–255)
  • g = green component (0–255)
  • b = blue component (0–255)
  • v = the single integer representation

Note: 65536 = 256 × 256, where 256 is the number of distinct values each color channel can take.
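
The packing formula itself did not survive in the extracted text. The sketch below assumes the common form v = 65536·r + 256·g + b, which is consistent with the note above and with the later question about the multipliers. The function names `pack_rgb` and `unpack_rgb` are hypothetical.

```python
def pack_rgb(r, g, b):
    """Combine three 0-255 channel values into one integer."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each channel must be in 0..255")
    return 65536 * r + 256 * g + b


def unpack_rgb(v):
    """Recover (r, g, b) from an integer produced by pack_rgb."""
    r = v // 65536          # how many whole 65536s fit
    g = (v // 256) % 256    # drop blue, then keep the lowest base-256 digit
    b = v % 256             # remainder after removing red and green
    return r, g, b


v = pack_rgb(255, 128, 0)
print(v)                # 16744448
print(unpack_rgb(v))    # (255, 128, 0)
```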

14 of 35

15 of 35

16 of 35

But why is red multiplied by 256 squared, green by 256, and blue by nothing?
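
One way to see the answer is to treat the three channels as digits of a base-256 number: each multiplier is a place value, so every channel lands in its own slot and can be recovered later. The sketch below is an illustration under that reading; the sample colors are arbitrary.

```python
def pack_rgb(r, g, b):
    # Each multiplier is a place value in base 256: 256**2, 256**1, 256**0.
    return r * 256**2 + g * 256**1 + b * 256**0


# Without place values, different colors collapse to the same number.
pure_red = (255, 0, 0)
pure_green = (0, 255, 0)
same_sum = sum(pure_red) == sum(pure_green)                      # ambiguous
distinct_packed = pack_rgb(*pure_red) != pack_rgb(*pure_green)   # unique

# Multiplying by 256 is the same as shifting left by 8 bits,
# so each channel occupies its own byte of the integer.
r, g, b = 18, 52, 86  # 0x12, 0x34, 0x56
packed = pack_rgb(r, g, b)
shifted = (r << 16) | (g << 8) | b
as_hex = format(packed, "06x")

print(same_sum, distinct_packed)   # True True
print(as_hex)                      # 123456
```

Blue is "multiplied by nothing" only in the sense that its place value is 256⁰ = 1, just as the units digit of a decimal number is multiplied by 1.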

17 of 35

18 of 35

19 of 35

20 of 35

Introduction to Mathematical Transformations

21 of 35

22 of 35

Scaling Transformation

  • A scaling transformation is a matrix that scales points up (or down) along each axis.
  • Sx, Sy, and Sz are the scale factors along the x, y, and z axes, respectively. If Sx = Sy = Sz = 1, the matrix is the identity matrix.
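
The matrix on the slide was not preserved in the text, so the following is a minimal sketch that assumes the plain 3×3 diagonal form. A deck that uses homogeneous coordinates would show a 4×4 matrix with one extra row and column. The helper names `scaling_matrix` and `apply` are hypothetical.

```python
def scaling_matrix(sx, sy, sz):
    """Return a 3x3 matrix that scales x, y, z by sx, sy, sz."""
    return [
        [sx, 0, 0],
        [0, sy, 0],
        [0, 0, sz],
    ]


def apply(matrix, point):
    """Multiply a 3x3 matrix by a 3-element column vector."""
    return [sum(m * p for m, p in zip(row, point)) for row in matrix]


point = [1, 2, 3]
scaled = apply(scaling_matrix(2, 3, 0.5), point)
unchanged = apply(scaling_matrix(1, 1, 1), point)  # identity leaves the point as-is

print(scaled)      # [2, 6, 1.5]
print(unchanged)   # [1, 2, 3]
```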

23 of 35

24 of 35

Rotation Transformation

25 of 35

26 of 35

27 of 35

28 of 35

29 of 35

30 of 35

31 of 35

32 of 35

33 of 35

Shear Transformation

Shear is a somewhat less commonly used transformation that moves points parallel to an axis, by an amount proportional to their distance from that axis. Shearing terms appear in the off-diagonal elements of the transformation matrix.
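
As an illustration of an off-diagonal shear term, the sketch below shears a unit square along the x-axis in 2-D. The 2×2 form, the factor k = 2, and the helper names are assumptions chosen for the example, not taken from the slides.

```python
def shear_x_matrix(k):
    """Return a 2x2 matrix that shears along the x-axis by factor k."""
    return [
        [1, k],   # the off-diagonal k adds k * y to x
        [0, 1],   # y is left unchanged
    ]


def apply(matrix, point):
    """Multiply a 2x2 matrix by a 2-element column vector."""
    return [sum(m * p for m, p in zip(row, point)) for row in matrix]


# Corners of a unit square, sheared with k = 2.
square = [[0, 0], [1, 0], [1, 1], [0, 1]]
sheared = [apply(shear_x_matrix(2), corner) for corner in square]

print(sheared)   # [[0, 0], [1, 0], [3, 1], [2, 1]]
```

The bottom edge (y = 0) does not move, while the top edge (y = 1) slides two units to the right, turning the square into a parallelogram.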

34 of 35

35 of 35