Histogram Equalization
What is an Image Histogram?
Introduction
Histogram equalization is a method in image processing of contrast adjustment using the image's histogram. It is a technique for adjusting image intensities to enhance contrast. The method is useful in images with backgrounds and foregrounds that are both bright or both dark. In particular, the method can lead to better views of bone tissue in x-ray images, and to better detail in photographs that are over or under-exposed.
Interactive Visualization: I made this interactive visualization to visually represent the intensity levels of an image in 3D space. Feel free to move the image around, and definitely zoom in. You will be able to see the peaks and valleys of the image signal in 3D space.
Here I am using the image of a basketball player dunking to visualize the "signal" of an image in 3D space. At first I calculated the grayscale intensity for each pixel and create a 3D plane geometry where each vertex's z-coordinate corresponds to the pixel's intensity, effectively visualizing the image signal in 3D. Using this method yielded a poor visualization where I couldn't see much, pretty much trash. So I calculated a color for each vertex based on its z-position (intensity) and used an HSL color model to generate a gradient. This gradient visually represents the intensity levels across the mesh. Now, the hue varies from blue (low intensity) to red (high intensity).
What is an Image Histogram?
An image histogram is a graphical representation of the distribution of intensity values in an image. It plots the number of pixels for each intensity value. The x-axis represents the intensity values, and the y-axis represents the number of pixels with that intensity value.
How to Compute an Image Histogram
To calculate the histogram of a digital image, we must perform a frequency count of the intensity levels present across all pixels. Mathematically, for an 8-bit grayscale image, this involves the following:
-
Initialization: Define a function that maps each possible intensity level to a natural number, initialized to zero. This function represents the histogram, and the domain corresponds to the range of possible pixel intensities for an 8-bit image.
-
Accumulation: Iterate over the image domain , which consists of all pixel coordinates in the image. For each pixel located at with intensity , increment the histogram function at the corresponding intensity level: .
-
Normalization: To facilitate comparisons between images of different sizes or to represent the histogram as a probability distribution, normalize the histogram by dividing each by the total number of pixels , where of the image.
Implementing these steps in Python, we arrive at the following code, which rigorously follows the outlined mathematical procedure:
import numpy as np
def compute_histogram(image):
# Initialize the histogram array with zeros for each intensity level
histogram = np.zeros(256, dtype=int)
# Iterate over each pixel in the image and accumulate histogram values
for intensity in image.ravel():
histogram[intensity] += 1
# Normalize the histogram by dividing by the total number of pixels
normalized_histogram = histogram / image.size
return normalized_histogram
In this code snippet, image.ravel()
flattens the 2D image array into a 1D array for efficient processing. image.size
gives the total number of pixels in the image, simplifying the normalization step. The resulting normalized_histogram
array represents the relative frequency of each intensity level in the image.
How to Implement Histogram Equalization
Linear Time Invariant (LTI) Systems
A system is linear time-invariant (LTI) when it observes the properties of linearity and time-invariance.
Linear vs Non-Linear Systems
A system that accepts an image as an input and produces an image as an output can be classified as linear if it adheres to the principles of additivity and homogeneity. The system, often represented as an operator , maps the set of all possible images to itself:
Here, and signify the dimensions of the image.
The operation performed by the system can be denoted as:
where is the input image, is the output image, and represents pixel coordinates.
Additivity
The system exhibits additivity if the output corresponding to the sum of two input images is the sum of the outputs for each image individually.
To demonstrate the property of additivity, consider two input images and , and let and be the system's outputs for these images, respectively. The system is linear if:
Or in terms of :
This implies that for any two images, the response of the system to their sum is equal to the sum of the responses to each image.
I will demonstrate the property of additivity using the following example:
import numpy as np
# Create two images
f1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
f2 = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])
# Create a system
def T(f):
return f * 2
# Compute the output for each image
g1 = T(f1)
g2 = T(f2)
# Compute the output for the sum of the images
g_sum = T(f1 + f2)
# Verify the property of additivity
assert np.array_equal(g_sum, g1 + g2)
Homogeneity
If the input is an image multiplied by a scalar, the output is the image multiplied by the same scalar.
To demonstrate the property of homogeneity, consider an input image and a scalar . The system is linear if:
This implies that for any image and scalar, the response of the system to the scaled image is equal to the scaled response to the original image.
I will demonstrate the property of homogeneity using the following example:
import numpy as np
# Create an image
f = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Create a system
def T(f):
return f * 2
# Compute the output for the image
g = T(f)
# Compute the output for the scaled image
g_scaled = T(f * 3)
# Verify the property of homogeneity
assert np.array_equal(g_scaled, g * 3)
If you have taken ECE 181, these topics should look very familiar to you.
Time-Invariant vs Time-Variant Systems
A system is time-invariant if the output does not depend on the time at which the input is applied. In the context of images, this means that the output does not depend on the location of the input in the image.