A set of
where,
Arrive at the desired
Since
For a multivariable function, in
where,
and
Hence, the equation becomes:
Notice that the decrease in
Let
then the gradient
and starting from an initial point
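The iteration described above can be sketched in a few lines. This is a minimal illustration that assumes the standard update rule (subtract the gradient scaled by a learning rate); the names `gradient_descent`, `grad_f`, `learning_rate` and `n_steps`, as well as the example function f(x, y) = x**2 + y**2, are hypothetical and chosen only for demonstration.

```python
import numpy as np

def gradient_descent(grad, x0, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from the initial point x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        # Move a small distance in the direction of steepest descent
        x = x - learning_rate * grad(x)
    return x

# Hypothetical example: f(x, y) = x**2 + y**2 has gradient (2x, 2y)
def grad_f(x):
    return 2.0 * x

x_min = gradient_descent(grad_f, x0=[3.0, -4.0])
print(x_min)  # approaches the minimiser at the origin
```

With this learning rate each step shrinks the distance to the minimum by a constant factor; a learning rate that is too large would instead make the iterates diverge.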
For two functions
for discrete samples that we deal with:
if
when zero-padded by 1 pixel gives:
This is achieved as:
A_padded = np.pad(A, pad_width = 1, mode = "constant")
Also, before proceeding with the convolution, the kernel must be flipped left-right and then upside-down.
This is achieved by:
ker_flipped = np.flipud(np.fliplr(ker))
Here, fliplr denotes a left-right flip and flipud denotes an up-down flip.
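As a quick check of what the double flip does, the sketch below applies it to a small kernel whose values are made up for illustration. The two flips together amount to rotating the kernel by 180 degrees, which is what `np.rot90(ker, 2)` computes.

```python
import numpy as np

# Illustrative kernel; the values are arbitrary and only make the flip visible
ker = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

ker_flipped = np.flipud(np.fliplr(ker))  # left-right flip, then up-down flip
rotated = np.rot90(ker, 2)               # two 90-degree rotations

print(ker_flipped)
print(np.array_equal(ker_flipped, rotated))
```

For a kernel that is unchanged by a 180-degree rotation the flip has no effect, so convolution and cross-correlation give the same result in that case.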
Choose a stride of length 1 and perform the convolution as the dot product of kernel sized chunks of
. . .
Notice the dimensions of the final output matrix:
import numpy as np

def convolve2d(image, kernel, padding, stride):
    image_height, image_width = image.shape
    kernel_height, kernel_width = kernel.shape
    output_height = (image_height + 2 * padding - kernel_height) // stride + 1
    output_width = (image_width + 2 * padding - kernel_width) // stride + 1
    output = np.zeros((output_height, output_width))
    padded_image = np.pad(image, padding, mode = "constant")
    # Flip the kernel left-right and up-down so that this is a true convolution
    kernel = np.flipud(np.fliplr(kernel))
    for i in range(output_height):
        for j in range(output_width):
            # Top-left corner of the current window in the padded image
            r, c = i * stride, j * stride
            output[i, j] = np.sum(padded_image[r : r + kernel_height, c : c + kernel_width] * kernel)
    return output
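The output dimensions used in the function follow the same expression along each axis. The helper below is a small sketch that isolates that arithmetic; `conv_output_size` is a hypothetical name introduced here, not part of the code above.

```python
def conv_output_size(input_size, kernel_size, padding, stride):
    """Number of window positions along one axis of the padded input."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# A 3-wide kernel with 1 pixel of padding and stride 1 preserves the size
same = conv_output_size(5, 3, padding=1, stride=1)

# Without padding the output shrinks by kernel_size - 1
valid = conv_output_size(5, 3, padding=0, stride=1)

# A larger stride skips positions and shrinks the output further
strided = conv_output_size(7, 3, padding=1, stride=2)

print(same, valid, strided)
```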
Obtain two images
The final edge-detected image is obtained as:
import numpy as np
import matplotlib.pyplot as plt

def edge_detect(image_org):
    padding, stride = 1, 1
    # Convert RGB to grayscale using luminance weights
    rgb_weights = [0.2989, 0.5870, 0.1140]
    image = np.dot(image_org, rgb_weights)
    # Sobel kernels for the horizontal and vertical gradients
    Gx = np.array([[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]])
    Gy = np.array([[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]])
    image_height, image_width = image.shape
    output_height = (image_height + 2 * padding - 3) // stride + 1
    output_width = (image_width + 2 * padding - 3) // stride + 1
    A_sobel = np.zeros((output_height, output_width))
    padded_image = np.pad(image, padding, mode = "constant")
    Gx = np.flipud(np.fliplr(Gx))
    Gy = np.flipud(np.fliplr(Gy))
    for i in range(output_height):
        for j in range(output_width):
            window = padded_image[i * stride : i * stride + 3, j * stride : j * stride + 3]
            # Gradient magnitude from the two directional responses
            A_sobel[i, j] = (np.sum(window * Gx)**2 + np.sum(window * Gy)**2)**0.5
    plt.imsave("Edge.jpeg", A_sobel, cmap = "gray")
    fig, ax = plt.subplots(nrows = 1, ncols = 2, figsize=(15, 15))
    ax[0].imshow(image_org)
    ax[0].set_title("Original Image")
    ax[1].imshow(A_sobel, cmap = "gray")
    ax[1].set_title("Edge-Detected")
    plt.show()
def max_pool(image, kernel_size, stride):
    image_height, image_width, channels = image.shape
    kernel_height, kernel_width = kernel_size[0], kernel_size[1]
    output_height = (image_height - kernel_height) // stride + 1
    output_width = (image_width - kernel_width) // stride + 1
    # Allocate one output plane per input channel
    output = np.zeros((output_height, output_width, channels))
    for c in range(channels):
        for i in range(0, output_height * stride, stride):
            for j in range(0, output_width * stride, stride):
                output[i // stride, j // stride, c] = np.max(image[i : i + kernel_height, j : j + kernel_width, c])
                # Replace np.max() with np.mean() for Average Pooling
                # output[i // stride, j // stride, c] = np.mean(image[i : i + kernel_height, j : j + kernel_width, c])
    final = output.astype(np.uint8)
    return final
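To see what pooling does to actual numbers, the sketch below pools a single 4x4 channel with a 2x2 window. It uses a different technique from the loops above: a reshape into blocks, which only works when the stride equals the window size (non-overlapping windows). The name `max_pool_blocks` and the sample values are hypothetical.

```python
import numpy as np

def max_pool_blocks(channel, k):
    """Non-overlapping k x k max pooling of a 2-D array via reshaping."""
    h, w = channel.shape
    h_out, w_out = h // k, w // k
    # Drop trailing rows/columns that do not fill a complete window
    trimmed = channel[:h_out * k, :w_out * k]
    # Axes 1 and 3 index positions inside each k x k block
    blocks = trimmed.reshape(h_out, k, w_out, k)
    return blocks.max(axis=(1, 3))

channel = np.array([[1, 3, 2, 0],
                    [4, 2, 1, 5],
                    [7, 0, 3, 3],
                    [1, 6, 2, 8]])

pooled = max_pool_blocks(channel, 2)
print(pooled)
```

Each output entry is the largest value in the corresponding 2x2 block; swapping `max` for `mean` on the last line of the function gives average pooling under the same assumptions.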