1 of 69

CSCI 3280

Introduction to Multimedia Systems

(2026 Term 2)

Computer Science & Engineering

The Chinese University of Hong Kong

2 of 69

Announcement

  • Form your team for the final project (by the end of Feb. 28).

  • The first assignment is released today (due on Feb. 6)

  • The second tutorial will be provided on Feb. 11.

3 of 69

Visual Application (1)

4 of 69

Visual Application (2)

5 of 69

Visual Application (3)

6 of 69

Visual Application (4)

7 of 69

Visual Application (5)

8 of 69

Image Representation (1)

9 of 69

How to Represent Light?

  • RGB? No

  • Light spectrum: an EM wave with wavelengths in [400 nm, 700 nm] (the visible range)

10 of 69

How to Represent Light (2)?

  • How? Taking samples on the spectrum.

  • [Hall89] proposed taking 9 samples on the spectral curves.

11 of 69

How to Represent Color (1)?

  • The diagram below tells us how we can observe a red object.

  • But “How can we display the final light spectrum on the RGB monitor?”

12 of 69

How to Represent Color (2)?

  • Different spectra may produce the same response in our eyes.

  • Hence there is no need to reproduce the exact light spectrum; we can instead reproduce another spectrum that gives us the “same” perceptual color.

13 of 69

XYZ Color Space

  • The CIE 1931 color spaces define the relationship between the visible spectrum and the visual sensation of specific colors in human color vision

  • Convert the light spectrum to XYZ

14 of 69

From XYZ to RGB (1)

  • From XYZ to RGB: the spectral curves of the RGB primaries can also be expressed as,
  • In other words,

15 of 69

From XYZ to RGB (2)

  • Particularly, we have
  • R_linear = 3.2404542*X - 1.5371385*Y - 0.4985314*Z
  • G_linear = -0.9692660*X + 1.8760108*Y + 0.0415560*Z
  • B_linear = 0.0556434*X - 0.2040259*Y + 1.0572252*Z
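As a quick check of the matrix above, here is a minimal NumPy sketch; the D65 white point used at the end is a standard reference value, not from the slide:

```python
import numpy as np

# XYZ -> linear RGB matrix from the slide (sRGB primaries, D65 white).
M_XYZ_TO_RGB = np.array([
    [ 3.2404542, -1.5371385, -0.4985314],
    [-0.9692660,  1.8760108,  0.0415560],
    [ 0.0556434, -0.2040259,  1.0572252],
])

def xyz_to_linear_rgb(xyz):
    """Convert an XYZ triple to linear RGB (gamma is not applied here)."""
    return M_XYZ_TO_RGB @ np.asarray(xyz, dtype=float)

# The D65 white point maps (almost exactly) to linear RGB (1, 1, 1).
white = xyz_to_linear_rgb([0.95047, 1.0, 1.08883])
```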

16 of 69

Color Model

  • The RGB model is only one of many color models. Different color models are used in different domains for different purposes.

  • YIQ and YUV are commonly used in TV broadcasting; the CMY model is used in the printing industry; HSV (Hue, Saturation & Value) and HLS (Hue, Lightness & Saturation) are used by artists to retouch images (e.g. in Photoshop)

17 of 69

YIQ Model (1)

  • YIQ is used in TV broadcasting.

  • Y is the luminance (grayness or lightness) component; I and Q are the chrominance components (color components).

  • It can be easily transformed from RGB by:
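The RGB-to-YIQ transform is a 3x3 matrix; the coefficients below are the commonly quoted NTSC values (an assumption here, since the slide's matrix is not reproduced in the text), with R, G, B in [0, 1]:

```python
import numpy as np

# Commonly quoted NTSC RGB -> YIQ coefficients (assumed, not from the slide).
M_RGB_TO_YIQ = np.array([
    [0.299,  0.587,  0.114],   # Y: luminance
    [0.596, -0.274, -0.322],   # I: chrominance
    [0.211, -0.523,  0.312],   # Q: chrominance
])

def rgb_to_yiq(rgb):
    return M_RGB_TO_YIQ @ np.asarray(rgb, dtype=float)

# A gray pixel carries no chrominance: I and Q come out (near) zero.
y, i, q = rgb_to_yiq([0.5, 0.5, 0.5])
```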

18 of 69

YIQ Model (2)


  • Why YIQ for TV broadcasting?

- Black-and-white receivers remain compatible: they ignore the I and Q channels and handle only Y (the luminance component).

- Our eyes are more sensitive to luminance than to chrominance. By separating luminance from chrominance and reducing the bandwidth of the chrominance channels, we can reduce the total bandwidth without much noticeable artifact.

  • NTSC adopts YIQ. The ratio of bandwidth allocated for Y:I:Q is 4:2:2

19 of 69

YIQ Model (3)

  • Visual comparison: 5 bits per pixel (both images)

  • Note the red and green “clouds” in the left image

20 of 69

YUV Color Model

  • Similar to YIQ.
  • This color model is adopted in CD-I (CD-Interactive) and DVI video.
  • Just like YIQ, the bandwidth of the chrominance components is reduced.

21 of 69

CMY and CMYK Models (1)

  • Cyan, Magenta and Yellow (CMY) are the complementary colors of RGB. They are subtractive primaries, i.e.
    C = 1.0 - R
    M = 1.0 - G
    Y = 1.0 - B
  • Used in the printing industry because color pigments absorb light (subtractive) instead of emitting light
  • Sometimes an extra channel is added: black (K). The CMYK model is then
    K = min(C, M, Y)
    C = C - K
    M = M - K
    Y = Y - K
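The RGB-to-CMY-to-CMYK conversion is a few lines of arithmetic; a minimal sketch, assuming normalized values in [0, 1]:

```python
def rgb_to_cmyk(r, g, b):
    """RGB -> CMY -> CMYK using the subtractive-primary formulas."""
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b   # subtractive primaries
    k = min(c, m, y)                       # extract the common black part
    return c - k, m - k, y - k, k

# Pure red absorbs all green and blue light: full magenta + yellow, no black.
red = rgb_to_cmyk(1.0, 0.0, 0.0)   # -> (0.0, 1.0, 1.0, 0.0)
```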

22 of 69

CMY and CMYK Models (2)


  • Apply (print) order: from light to dark: Y, C, M, K

23 of 69

Digital Image (1)


  • An image is a 2D continuous function of light intensity values.
  • In computer, we can only store the discrete version of this function.
  • The 2D function is sampled at discrete intervals yielding a 2D matrix of discrete values.
  • A sample in the digital image is called a pixel.

24 of 69

Digital Image (2)

  • Just like digital audio, there are 3 steps: sampling, quantization and coding.

  • Image resolution actually specifies the sampling rate, e.g. 320 x 200, 640 x 480, 1024 x 768, etc.

  • Again, we cannot represent arbitrary color intensity values; only discrete color values can be represented.

  • A clever coding scheme (color index and color lookup table) is used due to limited memory.

25 of 69

Gray Image

  • For a B/W image, 256-level quantization is usually enough. Pixel values range from 0 to 255.

  • Therefore 1 byte (8 bits) is needed to code the value of a pixel. (Try to find the quantization artifact in the example on the right.)

  • In fact, the Y component is actually a grayscale image

26 of 69

Color Image

  • As in the previous slide, the values in one channel (a B/W image) can be finely represented by 256-level quantization.

  • Therefore, a color image requires three channels, each with 256-level quantization, altogether 256^3 (about 16.7 million) quantization levels; in other words, 3 bytes are needed for each pixel.

  • Obviously, this was a huge storage requirement, especially for computers before the 1990s.

  • So, people came up with a clever solution: color indexing.

27 of 69

Color Lookup Table

  • Instead of directly storing the pixel color in the frame buffer, the actual colors are stored in a color lookup table.
  • The frame buffer only stores the color index (indexing to the color table).
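Indexed color is easy to illustrate with NumPy fancy indexing; the 4-entry palette below is a hypothetical example (real systems typically used 256 entries):

```python
import numpy as np

# A tiny 4-entry lookup table (hypothetical palette; real systems used 256).
palette = np.array([
    [  0,   0,   0],   # index 0: black
    [255,   0,   0],   # index 1: red
    [  0, 255,   0],   # index 2: green
    [255, 255, 255],   # index 3: white
], dtype=np.uint8)

# The frame buffer stores only a small index per pixel, not the full color.
frame_buffer = np.array([[0, 1],
                         [2, 3]], dtype=np.uint8)

# Display step: look every index up in the table.
image = palette[frame_buffer]   # shape (2, 2, 3), full RGB
```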

28 of 69

Is 8-bit Really Enough? (1)

  • When we say 8 bits per channel is usually sufficient, we actually mean “sufficient for representing perceived light intensity”.
  • Is 8-bit really sufficient for representing the physical light intensity?
  • No. Our vision is only a small moving window within the wide range of physical light intensities in the real world.

29 of 69

Is 8-bit Really Enough? (2)

  • Have you ever experienced the problem of over-exposure and under-exposure in your photographs?
  • Our eyes (automatically) and cameras (manually) adjust exposure during capture
  • We acquire a relative light intensity, and the perceived light intensity is not linear in the physical light intensity

30 of 69

High Dynamic Range Image

  • For serious applications, we need to acquire a quantity closer to the physical light intensity

  • This motivates the development of high dynamic range image. That is, 16-bit per color channel

  • 16-bit per channel digital cameras are now available in the market but usually expensive

  • But our monitors are still 8-bit per channel. Therefore high dynamic range images are still not popular

31 of 69

Image Processing - Filtering

  • Image filtering is a technique used to enhance or modify the visual appearance of an image

32 of 69

Motivation - Noise Reduction

  • We make a basic assumption before image filtering: a pixel's true value is similar to those of its neighbors, while noise varies from pixel to pixel.

33 of 69

Motivation - Noise Reduction

  • Salt and pepper noise: random occurrences of black and white pixels
  • Impulse noise: random occurrences of white pixels
  • Gaussian noise: variations in intensity drawn from a Gaussian (normal) distribution

34 of 69

Image Filtering

  • Idea: Use the information coming from the neighboring pixels for processing

  • Design a transformation function of the local neighborhood at each pixel in the image

– Function specified by a “filter” or mask saying how to combine values from neighbors.

  • Various uses of filtering:

– Enhance an image (denoise, resize, etc)

– Extract information (texture, edges, etc)

– Detect patterns (template matching)

35 of 69

Linear Filtering

  • Filtered value is the linear combination of neighboring pixel values.

  • Key properties: linearity and shift invariance
  • Can be modeled mathematically by convolution

36 of 69

First attempt at a solution

  • Let’s replace each pixel with an average of all the values in its neighborhood

  • Moving average in 1D:
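The 1D moving average can be sketched in a few lines; `width=3` below is the usual small-window choice:

```python
import numpy as np

def moving_average_1d(signal, width=3):
    """Replace each sample by the mean of a `width`-sample neighborhood."""
    kernel = np.ones(width) / width           # equal weights, summing to 1
    return np.convolve(signal, kernel, mode='same')

# A single spike gets spread evenly over its 3-sample neighborhood.
noisy = np.array([0, 0, 0, 10, 0, 0, 0], dtype=float)
smoothed = moving_average_1d(noisy)
```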

37 of 69

Discrete convolution

  • Simple averaging:

- every sample gets the same weight

  • Convolution: same idea but with weighted average

- each sample gets its own weight (normally zero far away)

  • This is all convolution is: it is a moving weighted average

38 of 69

Discrete filtering in 2D

  • Same equation, one more index:

– now the filter is a rectangle you slide around over a grid of numbers

  • Usefulness of associativity: applying two filters in succession is equivalent to a single convolution with their combined kernel

39 of 69

Discrete filtering in 2D

  • What values belong in the kernel H for the moving average example?
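For the 3 x 3 moving average, each entry of H is 1/9 so the weights sum to one. A naive sketch (zero padding at the boundary is an arbitrary choice here):

```python
import numpy as np

# 3x3 moving-average kernel: every neighbor gets weight 1/9.
H = np.ones((3, 3)) / 9.0

def filter2d(image, kernel):
    """Naive 2D filtering with zero padding (fine for symmetric kernels)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# An impulse of 9 is spread into a 3x3 block of ones.
img = np.zeros((5, 5)); img[2, 2] = 9.0
blurred = filter2d(img, H)
```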

40 of 69

Smoothing by averaging

  • What if the filter size was 5 x 5 instead of 3 x 3?

41 of 69

Boundary issues

  • What about near the edge?

– the filter window falls off the edge of the image

– need to extrapolate

– methods:

• clip filter (black)

• wrap around

• copy edge

• reflect across edge
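These four extrapolation strategies map directly onto NumPy's padding modes; a small illustration:

```python
import numpy as np

row = np.array([1, 2, 3, 4])

clip    = np.pad(row, 2, mode='constant')  # clip filter (black)
wrap    = np.pad(row, 2, mode='wrap')      # wrap around
copy    = np.pad(row, 2, mode='edge')      # copy edge
reflect = np.pad(row, 2, mode='reflect')   # reflect across edge
```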

42 of 69

Gaussian filter

  • What if we want the nearest neighboring pixels to have the most influence on the output?
  • Removes high-frequency components from the image (“low-pass filter”).

43 of 69

Smoothing with a Gaussian

44 of 69

Smoothing with a Gaussian

  • Parameter σ is the “scale” / “width” / “spread” of the Gaussian kernel, and controls the amount of smoothing.
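A sampled Gaussian kernel can be built directly from the formula; a minimal sketch (the 3-sigma truncation radius is a common convention, assumed here):

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Sampled 1D Gaussian, normalized so the weights sum to 1."""
    if radius is None:
        radius = int(3 * sigma)          # 3-sigma covers ~99.7% of the mass
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

# Larger sigma -> wider kernel -> more smoothing.
small = gaussian_kernel_1d(sigma=1.0)    # 7 taps
large = gaussian_kernel_1d(sigma=3.0)    # 19 taps
```

Because the 2D Gaussian is separable, smoothing an image amounts to convolving each row with this kernel and then each column.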

45 of 69

More examples

  • 2D Convolution.

46 of 69

Signals and Images

  • A signal is composed of low- and high-frequency components

47 of 69

Edge Detection

  • Goal: Identify sudden changes (discontinuities) in an image

– Intuitively, most semantic and shape information from the image can be encoded in the edges

– More compact than pixels

  • Ideal: artist's line drawing (but the artist is also using object-level knowledge)

48 of 69

Edge Detection: Motivation

  • Extract information, recognize objects

  • Recover geometry and viewpoint

49 of 69

What Causes an Edge?

50 of 69

Characterizing Edges

  • An edge is a place of rapid change in the image intensity function

51 of 69

Derivatives with Convolution

  • For a 2D function f(x, y), the partial derivative in x is the limit of [f(x + ε, y) - f(x, y)] / ε as ε → 0
  • For discrete data, we can approximate it using finite differences: ∂f/∂x ≈ f(x+1, y) - f(x, y)
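The finite-difference approximation translates into simple array subtractions; a sketch:

```python
import numpy as np

def image_gradient(img):
    """Forward differences: df/dx ~ f(x+1, y) - f(x, y), likewise for y."""
    img = img.astype(float)
    gx = np.zeros(img.shape)
    gy = np.zeros(img.shape)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]    # horizontal differences
    gy[:-1, :] = img[1:, :] - img[:-1, :]    # vertical differences
    return gx, gy, np.hypot(gx, gy)          # gradient magnitude

# A vertical step edge: the gradient responds only at the transition column.
step = np.hstack([np.zeros((4, 3)), np.ones((4, 3))])
gx, gy, mag = image_gradient(step)
```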

52 of 69

Partial Derivatives of an Image

53 of 69

Original Image

54 of 69

Gradient Magnitude Image

55 of 69

Thresholding the Gradient Magnitude

56 of 69

Designing an Edge Detector

  • Criteria for a good edge detector:

– Good detection: the optimal detector should find all real edges, ignoring noise or other artifacts

– Good localization

• the edges detected must be as close as possible to the true edges

• the detector must return one point only for each true edge point

  • Cues of edge detection

– Differences in color, intensity, or texture across the boundary

– Continuity and closure

– High-level knowledge

57 of 69

The Canny Edge Detector

58 of 69

The Canny Edge Detector: Recap
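As a rough sketch only: the standard Canny pipeline smooths with a Gaussian, computes gradients (Sobel filters below), then keeps strong edges plus weak edges touching them. Non-maximum suppression is omitted for brevity, and the threshold fractions are illustrative assumptions, not values from the lecture:

```python
import numpy as np

def convolve2d(img, k):
    """Tiny zero-padded 2D filtering helper."""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * k)
    return out

def canny_sketch(img, low=0.1, high=0.3):
    # 1. Smooth with a small Gaussian to suppress noise.
    g1 = np.exp(-np.arange(-2, 3)**2 / 2.0)
    g2 = np.outer(g1, g1)
    smooth = convolve2d(img, g2 / g2.sum())
    # 2. Gradients via Sobel filters; magnitude marks candidate edges.
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    gx, gy = convolve2d(smooth, sx), convolve2d(smooth, sx.T)
    mag = np.hypot(gx, gy)
    # 3. Non-maximum suppression would thin edges here (omitted).
    # 4. Double threshold + simplified hysteresis: keep strong edges,
    #    and weak edges with a strong edge in their 8-neighborhood.
    strong = mag >= high * mag.max()
    weak = mag >= low * mag.max()
    padded = np.pad(strong, 1)
    near_strong = np.zeros(strong.shape, dtype=bool)
    h, w = strong.shape
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            near_strong |= padded[1 + di:1 + di + h, 1 + dj:1 + dj + w]
    return strong | (weak & near_strong)

# A step edge should be detected somewhere near the transition.
edges = canny_sketch(np.hstack([np.zeros((5, 5)), np.ones((5, 5))]))
```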

59 of 69

Various Kinds of Blurs

60 of 69

Image Deblurring: Motivation

  • Strong demand for high quality deblurring

61 of 69

Image Deblurring

  • Remove blur and restore a latent sharp image.

62 of 69

Commonly Used Blur Model

  • The blur model is a convolution: a latent sharp image convolved with a blur kernel (point spread function, PSF), plus noise. Restoring the sharp image is thus a deconvolution operation.

63 of 69

Blind Deconvolution

  • The kernel or PSF is unknown.

64 of 69

Non-blind Deconvolution

  • The kernel or PSF is known.
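One classical non-blind method is the Wiener filter, applied in the frequency domain; this is a sketch under the circular-convolution blur model, with `balance` as an assumed regularization constant (not from the lecture):

```python
import numpy as np

def wiener_deconvolve(blurred, kernel, balance=0.01):
    """Non-blind deconvolution via a Wiener filter in the frequency domain.

    Assumes the circular-convolution blur model; `balance` is a
    regularization constant trading sharpness against noise amplification.
    """
    K = np.fft.fft2(kernel, s=blurred.shape)   # zero-padded kernel spectrum
    B = np.fft.fft2(blurred)
    L = np.conj(K) * B / (np.abs(K)**2 + balance)
    return np.real(np.fft.ifft2(L))

# Synthetic test: blur with a known 1x3 box kernel, then restore.
rng = np.random.default_rng(0)
sharp = rng.random((8, 8))
kernel = np.ones((1, 3)) / 3.0
K = np.fft.fft2(kernel, s=sharp.shape)
blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * K))
restored = wiener_deconvolve(blurred, kernel, balance=1e-6)
```

With a noisy input, `balance` must be raised, which is exactly the trade-off that makes even non-blind deconvolution nontrivial.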

65 of 69

MAP based Approaches (1)

66 of 69

MAP based Approaches (2)

67 of 69

Edge Prediction based Approaches (1)

68 of 69

Edge Prediction based Approaches (2)

69 of 69

Summary

  • Digital image: sampling, quantization and coding;

  • Image filtering, convolution, edge detection;

  • Image deblurring.