Spatial operations
These methods perform linear spatial filtering operations, such as smoothing and convolution, on greyscale and color images.
- class machinevisiontoolbox.ImageSpatial.ImageSpatialMixin[source]
- smooth(sigma, h=None, mode='same', border='reflect', bordervalue=0)[source]
Smooth image
- Parameters:
sigma (float) – standard deviation of the Gaussian kernel
h (int) – half-width of the kernel
mode (str, optional) – option for convolution, see convolve, defaults to ‘same’
border (str, optional) – option for boundary handling, see convolve, defaults to ‘reflect’
bordervalue (scalar, optional) – padding value, see convolve, defaults to 0
- Returns:
smoothed image
- Return type:
Image
Smooth the image by convolving with a Gaussian kernel of standard deviation sigma. If h is not given the kernel half width is set to \(\mbox{ceil}(3 \sigma)\), giving a kernel side length of \(2 \mbox{ceil}(3 \sigma) + 1\).
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> img.smooth(sigma=3).disp()
<matplotlib.image.AxesImage object at 0x7f9054e72310>
- Note:
Smooths all planes of the input image.
The Gaussian kernel has a unit volume.
If input image is integer it is converted to float, convolved, then converted back to integer.
- References:
Robotics, Vision & Control for Python, Section 11.5.1, P. Corke, Springer 2023.
- Seealso:
machinevisiontoolbox.Kernel.Gauss
convolve
- convolve(K, mode='same', border='reflect', bordervalue=0)[source]
Image convolution
- Parameters:
K (ndarray(N,M)) – kernel
mode (str, optional) – option for convolution, defaults to ‘same’
border (str, optional) – option for boundary handling, defaults to ‘reflect’
bordervalue (scalar, optional) – padding value, defaults to 0
- Returns:
convolved image
- Return type:
Image instance
Computes the convolution of image with the kernel K.
There are two options that control what happens at the edge of the image where the convolution window lies outside the image border. mode controls the size of the resulting image, while border controls how pixel values are extrapolated outside the image border.
mode – description
'same' – output image is same size as input image (default)
'full' – output image is larger than the input image, add border to input image
'valid' – output image is smaller than the input image and contains only valid pixels
border – description
'replicate' – replicate border pixels outwards
'pad' – outside pixels are set to bordervalue
'wrap' – borders are joined, left to right, top to bottom
'reflect' – outside pixels reflect inside pixels
'reflect101' – outside pixels reflect inside pixels except for edge
'none' – do not look outside of border
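The border and mode options can be illustrated in one dimension with NumPy alone. This is a sketch, not the toolbox implementation: the mapping from border names to numpy.pad modes is an assumption that follows the usual OpenCV naming, and np.convolve stands in for the 2D convolution.

```python
import numpy as np

row = np.array([1, 2, 3, 4])

# Assumed correspondence between the border options and numpy.pad modes
border_to_pad = {
    "replicate": "edge",
    "pad": "constant",       # pads with 0, the default bordervalue
    "wrap": "wrap",
    "reflect": "symmetric",  # edge pixel is repeated
    "reflect101": "reflect", # edge pixel is not repeated
}
padded = {}
for border, pad_mode in border_to_pad.items():
    padded[border] = np.pad(row, 2, mode=pad_mode)
    print(f"{border:>10}: {padded[border]}")

# Output length for each mode: signal of length 4, kernel of length 3
kernel = np.ones(3) / 3
sizes = {m: len(np.convolve(row, kernel, mode=m)) for m in ("same", "full", "valid")}
print(sizes)
```

For a signal of length N and a kernel of length M, 'full' gives N+M-1 samples and 'valid' gives N-M+1.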
Example:
>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image.Read('monalisa.png')
>>> img.convolve(K=np.ones((11,11))).disp()
<matplotlib.image.AxesImage object at 0x7f5cd15272e0>
- Note:
The kernel is typically square with an odd side length.
The result has the same datatype as the input image. For a kernel where the results could be negative (e.g. an edge detection kernel) an unsigned integer image will suffer from issues such as value wraparound.
If the image is color (has multiple planes) the kernel is applied to each plane, resulting in an output image with the same number of planes.
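The wraparound mentioned in the note can be seen with a plain NumPy difference on unsigned 8-bit data. This is only an illustration of the datatype issue, not the toolbox code; converting to float first is one possible workaround.

```python
import numpy as np

row = np.array([10, 20, 20, 10], dtype=np.uint8)

# Horizontal difference computed in the image's own unsigned type
diff_uint8 = row[1:] - row[:-1]

# The same difference after converting to float first
row_float = row.astype(float)
diff_float = row_float[1:] - row_float[:-1]

print(diff_uint8)   # the negative step wraps around to a large positive value
print(diff_float)   # the negative step is preserved
```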
- References:
Robotics, Vision & Control for Python, Section 11.5.1, P. Corke, Springer 2023.
- gradients(kernel=None, mode='same', border='reflect', bordervalue=0)[source]
Compute horizontal and vertical gradients
- Parameters:
kernel (2D ndarray, optional) – derivative kernel, defaults to Sobel
mode (str, optional) – option for convolution, see convolve, defaults to ‘same’
border (str, optional) – option for boundary handling, see convolve, defaults to ‘reflect’
bordervalue (scalar, optional) – padding value, see convolve, defaults to 0
- Returns:
gradient images
- Return type:
Image, Image
Compute horizontal and vertical gradient images.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png', grey=True)
>>> Iu, Iv = img.gradients()
- References:
Robotics, Vision & Control for Python, Section 11.5.1.3, P. Corke, Springer 2023.
- direction(vertical)[source]
Gradient direction
- Parameters:
vertical (Image) – vertical gradient image
- Returns:
gradient direction in radians
- Return type:
Image
Compute the per-pixel gradient direction from two images comprising the horizontal and vertical gradient components.
\[\theta_{u,v} = \tan^{-1} \frac{\mathbf{I}_{v: u,v}}{\mathbf{I}_{u: u,v}}\]
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png', grey=True)
>>> Iu, Iv = img.gradients()
>>> direction = Iu.direction(Iv)
- References:
Robotics, Vision & Control for Python, Section 11.5.1.3, P. Corke, Springer 2023.
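The direction formula can be sketched with NumPy on two small gradient arrays. Using the four-quadrant arctan2 is an assumption here; the toolbox may handle the quadrant differently.

```python
import numpy as np

Iu = np.array([[1.0, 0.0], [-1.0, 0.0]])   # horizontal gradient component
Iv = np.array([[0.0, 1.0], [0.0, -1.0]])   # vertical gradient component

# Per-pixel direction in radians, in the range [-pi, pi]
theta = np.arctan2(Iv, Iu)
print(theta)
```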
- Harris_corner_strength(k=0.04, h=2)[source]
Harris corner strength image
- Parameters:
k (float, optional) – Harris parameter, defaults to 0.04
h (int, optional) – kernel half width, defaults to 2
- Returns:
Harris corner strength image
- Return type:
Image
Returns an image containing Harris corner strength values. This is positive for high gradient in orthogonal directions, and negative for high gradient in a single direction.
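The sign behaviour described above follows from the standard Harris measure \(\det(\mathbf{A}) - k\, \mbox{tr}(\mathbf{A})^2\), where \(\mathbf{A}\) is the smoothed structure tensor at a pixel. The sketch below evaluates it for a single hand-written tensor; the function name and the scalar interface are hypothetical, and the smoothing controlled by h is not shown.

```python
def harris_strength(a, b, c, k=0.04):
    """Corner strength det(A) - k * trace(A)^2 for A = [[a, c], [c, b]].

    a and b are the smoothed squared gradients in the two directions,
    c is the smoothed product of the two gradients.
    """
    return (a * b - c * c) - k * (a + b) ** 2

corner = harris_strength(1.0, 1.0, 0.0)   # strong gradient in both directions
edge = harris_strength(1.0, 0.0, 0.0)     # strong gradient in one direction only
print(corner, edge)
```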
- References:
Robotics, Vision & Control for Python, Section 12.3.1, P. Corke, Springer 2023.
- Seealso:
gradients
Harris
- window(func, h=None, se=None, border='reflect', bordervalue=0, **kwargs)[source]
Generalized spatial operator
- Parameters:
func (callable) – function applied to window
h (int, optional) – half width of structuring element
se (ndarray(N,M), optional) – structuring element
border (str, optional) – option for boundary handling, see convolve, defaults to ‘reflect’
bordervalue (scalar, optional) – padding value, defaults to 0
- Raises:
ValueError – border is not a valid option
TypeError – func not callable
ValueError – single channel images only
- Returns:
transformed image
- Return type:
Image
Returns an image where each pixel is the result of applying the function func to a neighbourhood centred on the corresponding pixel in image. The return value of func becomes the corresponding pixel value.
The neighbourhood is defined in one of two ways:
If se is given then the neighbourhood is the size of the structuring element se, which should have odd side lengths. The elements in the neighbourhood corresponding to non-zero elements in se are packed into a vector (in column order from top left) and passed to the specified callable function func.
If se is None then h is the half width of a \(w \times w\) square structuring element of ones, where \(w = 2h+1\).
Example:
>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image.Read('monalisa.png', grey=True)
>>> out = img.window(np.median, h=3)
- Note:
The structuring element should have an odd side length.
This method is slow since the function func must be invoked once for every output pixel.
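A minimal NumPy sketch of such a window operator is given below, mainly to show why one call to func per output pixel is costly. It is not the toolbox implementation: the function name is reused for illustration only, and the 'reflect' border is assumed to correspond to numpy.pad's 'symmetric' mode.

```python
import numpy as np

def window(image, func, h=None, se=None):
    """Apply func to the neighbourhood of every pixel of a 2D array."""
    if se is None:
        se = np.ones((2 * h + 1, 2 * h + 1))
    hv, hu = se.shape[0] // 2, se.shape[1] // 2
    mask = se.ravel(order="F") != 0          # column order, as described above
    padded = np.pad(image, ((hv, hv), (hu, hu)), mode="symmetric")
    out = np.empty(image.shape, dtype=float)
    for v in range(image.shape[0]):
        for u in range(image.shape[1]):
            patch = padded[v:v + se.shape[0], u:u + se.shape[1]]
            out[v, u] = func(patch.ravel(order="F")[mask])
    return out

img = np.arange(25).reshape(5, 5)
print(window(img, np.max, h=1))
print(window(img, np.median, h=1))
```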
- References:
Robotics, Vision & Control for Python, Section 11.5.3, P. Corke, Springer 2023.
- zerocross()[source]
Compute zero crossing
- Returns:
boolean image
- Return type:
Image instance
Compute a zero-crossing image, where pixels are true if they are adjacent to a change in sign.
Example:
>>> from machinevisiontoolbox import Image
>>> U, V = Image.meshgrid(None, 6, 6)
>>> img = Image(U - V - 2, dtype='float')
>>> img.print()
 -2.00 -1.00  0.00  1.00  2.00  3.00
 -3.00 -2.00 -1.00  0.00  1.00  2.00
 -4.00 -3.00 -2.00 -1.00  0.00  1.00
 -5.00 -4.00 -3.00 -2.00 -1.00  0.00
 -6.00 -5.00 -4.00 -3.00 -2.00 -1.00
 -7.00 -6.00 -5.00 -4.00 -3.00 -2.00
>>> img.zerocross().print()
 0 0 0 1 0 0
 0 0 1 0 1 0
 0 0 0 1 0 1
 0 0 0 0 1 0
 0 0 0 0 0 0
 0 0 0 0 0 0
- Note:
Uses morphological filtering with a 3x3 structuring element, which can lead to erroneous values in border pixels.
- References:
Robotics, Vision & Control for Python, Section 11.5.1.3, P. Corke, Springer 2023.
- Seealso:
Laplace
LoG
- scalespace(n, sigma=1)[source]
Compute image scalespace sequence
- Parameters:
n (int) – number of steps
sigma (scalar, optional) – Gaussian filter width, defaults to 1
- Returns:
Gaussian and difference of Gaussian sequences, scale factors
- Return type:
list of Image, list of Image, list of float
Compute a scalespace image sequence by consecutively smoothing the input image with a Gaussian of width sigma. The difference between consecutive smoothings is the difference of Gaussian, which is an approximation to the Laplacian of Gaussian.
Example:
>>> from machinevisiontoolbox import Image
>>> mona = Image.Read("monalisa.png", dtype="float")
>>> G, L, scales = mona.scalespace(8, sigma=8)
- Note:
The two image sequences have the same length, the original image is not included in the list of smoothed images.
- References:
Robotics, Vision & Control for Python, Section 12.3.2, P. Corke, Springer 2023.
- pyramid(sigma=1, N=None, border='replicate', bordervalue=0)[source]
Pyramidal image decomposition
- Parameters:
sigma (float) – standard deviation of Gaussian kernel
N (int, optional) – number of pyramid levels to be computed, defaults to all
border (str, optional) – option for boundary handling, see convolve, defaults to ‘replicate’
bordervalue (scalar, optional) – padding value, defaults to 0
- Returns:
list of images at each pyramid level
- Return type:
list of Image
Returns a pyramid decomposition of the input image using Gaussian smoothing with standard deviation of sigma. The return value is a list of images, each one having dimensions half that of the previous image. The pyramid is computed down to a non-halvable image size.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> pyramid = img.pyramid(4)
>>> len(pyramid)
11
>>> pyramid
[Image: 677 x 700 (uint8),
 Image: 339 x 350 (uint8),
 Image: 170 x 175 (uint8),
 Image: 85 x 88 (uint8),
 Image: 43 x 44 (uint8),
 Image: 22 x 22 (uint8),
 Image: 11 x 11 (uint8),
 Image: 6 x 6 (uint8),
 Image: 3 x 3 (uint8),
 Image: 2 x 2 (uint8),
 Image: 1 x 1 (uint8)]
- Note:
Operates on greyscale images only; a color image is first converted to greyscale.
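The sequence of image sizes in the example above can be reproduced by repeatedly halving each dimension and rounding up until a 1 x 1 image is reached. The rounding rule is an assumption inferred from the listed sizes, and the helper name is hypothetical.

```python
import math

def pyramid_sizes(width, height):
    """List the (width, height) of each pyramid level, halving with round-up."""
    sizes = [(width, height)]
    while sizes[-1] != (1, 1):
        w, h = sizes[-1]
        sizes.append((math.ceil(w / 2), math.ceil(h / 2)))
    return sizes

sizes = pyramid_sizes(677, 700)
print(len(sizes))
print(sizes)
```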
- References:
Robotics, Vision & Control for Python, Section 12.3.2, P. Corke, Springer 2023.
- canny(sigma=1, th0=None, th1=None)[source]
Canny edge detection
- Parameters:
sigma (float, optional) – standard deviation for Gaussian kernel smoothing, defaults to 1
th0 (float) – lower threshold
th1 (float) – upper threshold
- Returns:
edge image
- Return type:
Image instance
Computes an edge image obtained using the Canny edge detector algorithm. Hysteresis filtering is applied to the gradient image: edge pixels > th1 are connected to adjacent pixels > th0, those below th0 are set to zero.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> edges = img.canny()
- Note:
Produces a zero image with single pixel wide edges having non-zero values.
Larger values correspond to stronger edges.
If th1 is zero then no hysteresis filtering is performed.
A color image is automatically converted to greyscale first.
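The hysteresis step described above can be sketched in NumPy as a region-growing loop. This is an illustration under stated assumptions (8-way adjacency, strict inequalities), not the toolbox or OpenCV implementation, and it omits the smoothing, gradient and non-maximum suppression stages.

```python
import numpy as np

def hysteresis(grad, th0, th1):
    """Keep pixels > th1, plus pixels > th0 that are 8-connected to a kept pixel."""
    weak = grad > th0
    keep = grad > th1
    while True:
        # Grow the kept set by one pixel in all 8 directions
        padded = np.pad(keep, 1, mode="constant")
        grown = np.zeros_like(keep)
        for dv in range(3):
            for du in range(3):
                grown |= padded[dv:dv + keep.shape[0], du:du + keep.shape[1]]
        new_keep = grown & weak
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return np.where(keep, grad, 0)

grad = np.array([[0, 2, 5, 2, 0, 2, 0]], dtype=float)
edges = hysteresis(grad, th0=1, th1=4)
print(edges)
```

The isolated weak pixel near the right end is discarded because it is not connected to any pixel above th1.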
- References:
“A Computational Approach To Edge Detection”, J. Canny, IEEE Trans. Pattern Analysis and Machine Intelligence, 8(6):679–698, 1986.
Robotics, Vision & Control for Python, Section 11.5.1.3, P. Corke, Springer 2023.
- rank(footprint=None, h=None, rank=-1, border='replicate', bordervalue=0)[source]
Rank filter
- Parameters:
footprint (ndarray(N,M), optional) – filter footprint or structuring element
h (int, optional) – half width of structuring element
rank (int, str) – rank of filter
border (str, optional) – option for boundary handling, defaults to ‘replicate’
bordervalue (scalar, optional) – padding value, defaults to 0
- Returns:
rank filtered image
- Return type:
Image
Return a rank filtered version of image. Only pixels corresponding to non-zero elements of the structuring element are ranked, and the value whose rank is rank becomes the corresponding output pixel value. The highest rank, the maximum, is rank 0. The rank can also be given as a string: ‘min|imum’, ‘max|imum’, ‘med|ian’; long or short versions are supported.
The structuring element is given as:
footprint, a 2D NumPy array containing zero or one values, or
h, the half width of a \(w \times w\) array of ones, where \(w=2h+1\)
Example:
>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image(np.arange(25).reshape((5,5)))
>>> img.print()
  0  1  2  3  4
  5  6  7  8  9
 10 11 12 13 14
 15 16 17 18 19
 20 21 22 23 24
>>> img.rank(h=1, rank=0).print()  # maximum filter
  6  7  8  9  9
 11 12 13 14 14
 16 17 18 19 19
 21 22 23 24 24
 21 22 23 24 24
>>> img.rank(h=1, rank=8).print()  # minimum filter
  0  0  1  2  3
  0  0  1  2  3
  5  5  6  7  8
 10 10 11 12 13
 15 15 16 17 18
>>> img.rank(h=1, rank=4).print()  # median filter
  1  2  3  4  4
  5  6  7  8  9
 10 11 12 13 14
 15 16 17 18 19
 20 20 21 22 23
>>> img.rank(h=1, rank='median').print()  # median filter
  1  2  3  4  4
  5  6  7  8  9
 10 11 12 13 14
 15 16 17 18 19
 20 20 21 22 23
- Note:
The footprint should have an odd side length.
The input can be logical, uint8, uint16, float or double, the output is always double.
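The rank convention, with rank 0 as the maximum, can be illustrated for a single 3x3 window with NumPy. The helper name is hypothetical and the sketch handles one window only, not the sliding filter.

```python
import numpy as np

def rank_value(values, rank):
    """Return the element of the given rank, where rank 0 is the maximum."""
    ordered = np.sort(np.asarray(values).ravel())[::-1]   # descending order
    return ordered[rank]

patch = np.arange(9).reshape(3, 3)
print(rank_value(patch, 0))   # maximum
print(rank_value(patch, 4))   # median of nine values
print(rank_value(patch, 8))   # minimum
```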
- References:
Robotics, Vision & Control for Python, Section 11.5.3, P. Corke, Springer 2023.
- medianfilter(h=1, **kwargs)[source]
Median filter
- Parameters:
h (int, optional) – half width of structuring element, defaults to 1
kwargs – options passed to rank
- Returns:
median filtered image
- Return type:
Image instance
Return the median filtered image. For every \(w \times w, w=2h+1\) window take the median value as the output pixel value.
- Note:
This filter is effective for removing impulse (aka salt and pepper) noise.
- References:
Robotics, Vision & Control for Python, Section 11.5.3, P. Corke, Springer 2023.
- distance_transform(invert=False, norm='L2', h=1)[source]
Distance transform
- Parameters:
invert (bool, optional) – consider inverted image, defaults to False
norm (str, optional) – distance metric: ‘L1’ or ‘L2’ [default]
h (int, optional) – half width of window, defaults to 1
- Returns:
distance transform of image
- Return type:
Image
Compute the distance transform. For each zero input pixel, compute its distance to the nearest non-zero input pixel.
Example:
>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> pixels = np.zeros((5,5))
>>> pixels[2, 1:3] = 1
>>> img = Image(pixels)
>>> img.distance_transform().print(precision=3)
 2.324 1.910 1.910 2.324 2.739
 1.369 0.955 0.955 1.369 2.324
 0.955 0.000 0.000 0.955 1.910
 1.369 0.955 0.955 1.369 2.324
 2.324 1.910 1.910 2.324 2.739
>>> img.distance_transform(norm="L1").print()
 3.00 2.00 2.00 3.00 4.00
 2.00 1.00 1.00 2.00 3.00
 1.00 0.00 0.00 1.00 2.00
 2.00 1.00 1.00 2.00 3.00
 3.00 2.00 2.00 3.00 4.00
- Note:
The output image is the same size as the input image.
Distance is computed using a sliding window and is an approximation of true distance.
For non-zero input pixels the corresponding output pixels are set to zero.
The signed-distance function is image.distance_transform() - image.distance_transform(invert=True)
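The sliding-window approximation can be sketched as a two-pass 3x3 chamfer transform. The step costs 0.955 (horizontal or vertical) and 1.3693 (diagonal) are assumptions inferred from the L2 example output above; the function name is hypothetical and this is not the toolbox implementation.

```python
import numpy as np

def chamfer_distance(pixels, a=0.955, b=1.3693):
    """Two-pass 3x3 chamfer distance to the nearest non-zero pixel."""
    rows, cols = pixels.shape
    d = np.where(pixels != 0, 0.0, np.inf)
    # (dv, du, cost) of neighbours already visited in a forward raster scan
    forward = [(-1, -1, b), (-1, 0, a), (-1, 1, b), (0, -1, a)]
    backward = [(-dv, -du, c) for dv, du, c in forward]
    passes = (
        (forward, range(rows), range(cols)),
        (backward, range(rows - 1, -1, -1), range(cols - 1, -1, -1)),
    )
    for offsets, vs, us in passes:
        for v in vs:
            for u in us:
                for dv, du, cost in offsets:
                    vv, uu = v + dv, u + du
                    if 0 <= vv < rows and 0 <= uu < cols:
                        d[v, u] = min(d[v, u], d[vv, uu] + cost)
    return d

pixels = np.zeros((5, 5))
pixels[2, 1:3] = 1
d = chamfer_distance(pixels)
print(np.round(d, 3))
```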
- References:
Robotics, Vision & Control for Python, Section 11.6.4, P. Corke, Springer 2023.
- labels_binary(connectivity=4, ltype='int32')[source]
Blob labelling
- Parameters:
connectivity (int, optional) – number of neighbours used for connectivity: 4 [default] or 8
ltype (string, optional) – output image type: ‘int32’ [default], ‘uint16’
- Returns:
label image, number of regions
- Return type:
Image, int
Compute labels of connected components in the input greyscale or binary image. Regions are sets of contiguous pixels with the same value.
The method returns the label image and the number of labels N, so labels lie in the range [0, N-1]. The value in the label image is an integer indicating which region the corresponding input pixel belongs to. The background has label 0.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Squares(2, 15)
>>> img.print()
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> labels, N = img.labels_binary()
>>> N
5
>>> labels.print()
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 1 1 1 1 1 0 2 2 2 2 2 0 0
 0 0 1 1 1 1 1 0 2 2 2 2 2 0 0
 0 0 1 1 1 1 1 0 2 2 2 2 2 0 0
 0 0 1 1 1 1 1 0 2 2 2 2 2 0 0
 0 0 1 1 1 1 1 0 2 2 2 2 2 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 3 3 3 3 3 0 4 4 4 4 4 0 0
 0 0 3 3 3 3 3 0 4 4 4 4 4 0 0
 0 0 3 3 3 3 3 0 4 4 4 4 4 0 0
 0 0 3 3 3 3 3 0 4 4 4 4 4 0 0
 0 0 3 3 3 3 3 0 4 4 4 4 4 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
- Note:
This algorithm is variously known as region labelling, connectivity analysis, region coloring, connected component analysis, blob labelling.
The output image is the same size as the input image.
The input image can be binary or greyscale.
Connectivity is performed using 4 nearest neighbours by default.
8-way connectivity introduces ambiguities, a chequerboard is two blobs.
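Region labelling with a choice of connectivity can be sketched in pure Python using a breadth-first flood fill. The function name is hypothetical and labels are assigned in raster-scan order, which is an assumption; the toolbox relies on its own backend.

```python
from collections import deque

def label_regions(image, connectivity=4):
    """Label connected regions of equal value; labels start at 0 in raster order."""
    rows, cols = len(image), len(image[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    labels = [[None] * cols for _ in range(rows)]
    count = 0
    for v in range(rows):
        for u in range(cols):
            if labels[v][u] is not None:
                continue
            # Breadth-first flood fill of the region containing (v, u)
            labels[v][u] = count
            queue = deque([(v, u)])
            while queue:
                cv, cu = queue.popleft()
                for dv, du in steps:
                    nv, nu = cv + dv, cu + du
                    if (0 <= nv < rows and 0 <= nu < cols
                            and labels[nv][nu] is None
                            and image[nv][nu] == image[v][u]):
                        labels[nv][nu] = count
                        queue.append((nv, nu))
            count += 1
    return labels, count

image = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
labels, N = label_regions(image)
print(N)

# A 2x2 chequerboard: four regions with 4-way, two with 8-way connectivity
chequer = [[0, 1], [1, 0]]
print(label_regions(chequer, 4)[1], label_regions(chequer, 8)[1])
```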
- References:
Robotics, Vision & Control for Python, Section 12.1.2.1, P. Corke, Springer 2023.
- labels_MSER(**kwargs)[source]
Blob labelling using MSER
- Parameters:
kwargs – arguments passed to MSER_create
- Returns:
label image, number of regions
- Return type:
Image, int
Compute labels of connected components in the input greyscale image. Regions are sets of contiguous pixels that form stable regions across a range of threshold values.
The method returns the label image and the number of labels N, so labels lie in the range [0, N-1]. The value in the label image is an integer indicating which region the corresponding input pixel belongs to. The background has label 0.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Squares(2, 15)
>>> img.print()
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> labels, N = img.labels_MSER()
>>> N
0
>>> labels.print()
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
- References:
Linear time maximally stable extremal regions, David Nistér and Henrik Stewénius, In Computer Vision–ECCV 2008, pages 183–196. Springer, 2008.
Robotics, Vision & Control for Python, Section 12.1.2.2, P. Corke, Springer 2023.
- labels_graphseg(sigma=0.5, k=2000, minsize=100)[source]
Blob labelling using graph-based segmentation
- Parameters:
kwargs – arguments passed to MSER_create
- Returns:
label image, number of regions
- Return type:
Image, int
Compute labels of connected components in the input color image. Regions are sets of contiguous pixels that are similar with respect to their surrounds.
The method returns the label image and the number of labels N, so labels lie in the range [0, N-1]. The value in the label image is an integer indicating which region the corresponding input pixel belongs to. The background has label 0.
- References:
Efficient graph-based image segmentation, Pedro F Felzenszwalb and Daniel P Huttenlocher, International Journal of Computer Vision, volume 59, pages 167–181. Springer, 2004.
Robotics, Vision & Control for Python, Section 12.1.2.2, P. Corke, Springer 2023.
- Seealso:
labels_binary
labels_MSER
blobs
opencv.createGraphSegmentation
- sad(image2)[source]
Sum of absolute differences
- Parameters:
image2 (Image) – second image
- Raises:
ValueError – image2 shape is not equal to self
- Returns:
sum of absolute differences
- Return type:
scalar
Returns a simple image dissimilarity measure which is the sum of absolute differences between the image and image2. The result is a scalar: a value of 0 indicates identical pixel patterns, and the value is increasingly positive as image dissimilarity increases.
Example:
>>> from machinevisiontoolbox import Image
>>> img1 = Image([[10, 11], [12, 13]])
>>> img2 = Image([[10, 11], [10, 13]])
>>> img1.sad(img2)
2
>>> img1.sad(img2+10)
986
>>> img1.sad(img2*2)
982
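The last two results in the example above (986 and 982) are consistent with the subtraction wrapping around in unsigned 8-bit arithmetic. The NumPy sketch below computes the measure in floating point, which avoids that; it is a stand-alone illustration with a hypothetical function, not the toolbox method.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences, computed in floating point."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return np.abs(a - b).sum()

img1 = np.array([[10, 11], [12, 13]], dtype=np.uint8)
img2 = np.array([[10, 11], [10, 13]], dtype=np.uint8)

print(sad(img1, img2))
print(sad(img1, img2 + 10))
print(sad(img1, img2 * 2))

# Subtracting directly in uint8 wraps negative differences around
wrapped = (img1 - (img2 + 10)).sum()
print(wrapped)
```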
- ssd(image2)[source]
Sum of squared differences
- Parameters:
image2 (Image) – second image
- Raises:
ValueError – image2 shape is not equal to self
- Returns:
sum of squared differences
- Return type:
scalar
Returns a simple image dissimilarity measure which is the sum of the squared differences between the image and image2. The result is a scalar: a value of 0 indicates identical pixel patterns, and the value is increasingly positive as image dissimilarity increases.
Example:
>>> from machinevisiontoolbox import Image
>>> img1 = Image([[10, 11], [12, 13]])
>>> img2 = Image([[10, 11], [10, 13]])
>>> img1.ssd(img2)
4
>>> img1.ssd(img2+10)
364
>>> img1.ssd(img2*2)
454
- ncc(image2)[source]
Normalised cross correlation
- Parameters:
image2 (Image) – second image
- Raises:
ValueError – image2 shape is not equal to self
- Returns:
normalised cross correlation
- Return type:
scalar
Returns an image similarity measure which is the normalized cross-correlation between the image and image2. The result is a scalar in the interval -1 (non match) to 1 (perfect match) that indicates similarity.
Example:
>>> from machinevisiontoolbox import Image
>>> img1 = Image([[10, 11], [12, 13]])
>>> img2 = Image([[10, 11], [10, 13]])
>>> img1.ncc(img2)
0.9970145759536657
>>> img1.ncc(img2+10)
1.395820406335132
>>> img1.ncc(img2*2)
1.2678511640312011
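The last two values in the example above exceed 1 and so fall outside the stated interval, which may point to integer overflow in the pixel products (an assumption, not confirmed here). A floating point sketch of the measure stays within [-1, 1] and is unchanged when image2 is scaled; the function below is a hypothetical stand-alone version.

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation of two equal-sized arrays, in floating point."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())

img1 = np.array([[10, 11], [12, 13]])
img2 = np.array([[10, 11], [10, 13]])
scores = [ncc(img1, img2), ncc(img1, img2 + 10), ncc(img1, img2 * 2)]
print(scores)
```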
- zsad(image2)[source]
Zero-mean sum of absolute differences
- Parameters:
image2 (Image) – second image
- Raises:
ValueError – image2 shape is not equal to self
- Returns:
zero-mean sum of absolute differences
- Return type:
scalar
Returns a simple image dissimilarity measure which is the zero-mean sum of absolute differences between the image and image2. The result is a scalar: a value of 0 indicates identical pixel patterns (relative to their mean values), and the value is increasingly positive as image dissimilarity increases.
Example:
>>> from machinevisiontoolbox import Image
>>> img1 = Image([[10, 11], [12, 13]])
>>> img2 = Image([[10, 11], [10, 13]])
>>> img1.zsad(img2)
3.0
>>> img1.zsad(img2+10)
3.0
>>> img1.zsad(img2*2)
6.0
- zssd(image2)[source]
Zero-mean sum of squared differences
- Parameters:
image2 (Image) – second image
- Raises:
ValueError – image2 shape is not equal to self
- Returns:
zero-mean sum of squared differences
- Return type:
scalar
Returns a simple image dissimilarity measure which is the zero-mean sum of the squared differences between the image and image2. The result is a scalar: a value of 0 indicates identical pixel patterns (relative to their mean values), and the value is increasingly positive as image dissimilarity increases.
Example:
>>> from machinevisiontoolbox import Image
>>> img1 = Image([[10, 11], [12, 13]])
>>> img2 = Image([[10, 11], [10, 13]])
>>> img1.zssd(img2)
3.0
>>> img1.zssd(img2+10)
3.0
>>> img1.zssd(img2*2)
13.0
- zncc(image2)[source]
Zero-mean normalized cross correlation
- Parameters:
image2 (Image) – second image
- Raises:
ValueError – image2 shape is not equal to self
- Returns:
zero-mean normalised cross correlation
- Return type:
scalar
Returns an image similarity measure which is the zero-mean normalized cross-correlation between the image and image2. The result is a scalar in the interval -1 (non match) to 1 (perfect match) that indicates similarity.
Example:
>>> from machinevisiontoolbox import Image
>>> img1 = Image([[10, 11], [12, 13]])
>>> img2 = Image([[10, 11], [10, 13]])
>>> img1.zncc(img2)
0.7302967433402214
>>> img1.zncc(img2+10)
0.7302967433402214
>>> img1.zncc(img2*2)
0.7302967433402214
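The zero-mean measures can be sketched in NumPy by subtracting each image's mean before comparing. The functions below are hypothetical stand-alone versions, not the toolbox methods, but they reproduce the values shown in the zsad, zssd and zncc examples, including the invariance to a constant offset.

```python
import numpy as np

def zero_mean(image):
    """Return the image as floats with its mean value removed."""
    image = np.asarray(image, dtype=float)
    return image - image.mean()

def zsad(a, b):
    return np.abs(zero_mean(a) - zero_mean(b)).sum()

def zssd(a, b):
    return ((zero_mean(a) - zero_mean(b)) ** 2).sum()

def zncc(a, b):
    a0, b0 = zero_mean(a), zero_mean(b)
    return (a0 * b0).sum() / np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())

img1 = np.array([[10, 11], [12, 13]])
img2 = np.array([[10, 11], [10, 13]])

for other in (img2, img2 + 10, img2 * 2):
    print(zsad(img1, other), zssd(img1, other), zncc(img1, other))
```

Only zncc is also invariant to scaling of the second image; zsad and zssd change when img2 is doubled.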
- similarity(T, metric='zncc')[source]
Locate template in image
- Parameters:
T (ndarray(N,M)) – template image
metric (str) – similarity metric, one of: ‘ssd’, ‘zssd’, ‘ncc’, ‘zncc’ [default]
- Raises:
ValueError – template T must have odd dimensions
ValueError – bad metric specified
- Returns:
similarity image
- Return type:
Image instance
Compute a similarity image where each output pixel is the similarity of the template T to the same-sized neighbourhood surrounding the corresponding input pixel in image.
Example:
>>> from machinevisiontoolbox import Image
>>> crowd = Image.Read("wheres-wally.png", mono=True, dtype="float")
>>> T = Image.Read("wally.png", mono=True, dtype="float")
>>> sim = crowd.similarity(T, "zncc")
>>> sim.disp(colormap="signed", colorbar=True)
<matplotlib.image.AxesImage object at 0x7f21f34eb1c0>
- Note:
For NCC and ZNCC the maximum similarity value corresponds to the most likely template location. For SSD and ZSSD the minimum value corresponds to the most likely location.
Similarity is not computed for those pixels where the template crosses the image boundary, and these output pixels are set to NaN.
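A brute-force NumPy sketch of template matching with ZNCC is shown below, including the NaN border described in the note. It is an illustration with hypothetical function names and a random test image, not the toolbox implementation, which is likely to be much faster.

```python
import numpy as np

def zncc(a, b):
    a0, b0 = a - a.mean(), b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
    return (a0 * b0).sum() / denom if denom > 0 else np.nan

def similarity(image, T):
    """ZNCC of template T against every neighbourhood that fits inside image."""
    if T.shape[0] % 2 == 0 or T.shape[1] % 2 == 0:
        raise ValueError("template T must have odd dimensions")
    hv, hu = T.shape[0] // 2, T.shape[1] // 2
    out = np.full(image.shape, np.nan)    # border pixels stay NaN
    for v in range(hv, image.shape[0] - hv):
        for u in range(hu, image.shape[1] - hu):
            patch = image[v - hv:v + hv + 1, u - hu:u + hu + 1]
            out[v, u] = zncc(patch, T)
    return out

rng = np.random.default_rng(0)
image = rng.random((12, 12))
T = image[4:7, 6:9].copy()                # template cut out around pixel (5, 7)
sim = similarity(image, T)
best = np.unravel_index(np.nanargmax(sim), sim.shape)
print(best)
```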
- References:
Robotics, Vision & Control for Python, Section 11.5.2, P. Corke, Springer 2023.
Image kernels
These static methods define standard image kernels.
- class machinevisiontoolbox.ImageSpatial.Kernel[source]
Image processing kernel operations on the Image class
- static Gauss(sigma, h=None)[source]
Gaussian kernel
- Parameters:
sigma (float) – standard deviation of Gaussian kernel
h (integer, optional) – half width of the kernel
- Returns:
Gaussian kernel
- Return type:
ndarray(2h+1, 2h+1)
Return the 2-dimensional Gaussian kernel of standard deviation sigma
\[\mathbf{K} = \frac{1}{2\pi \sigma^2} e^{-(u^2 + v^2) / (2 \sigma^2)}\]
The kernel is centred within a square array with side length given by:
\(2 \mbox{ceil}(3 \sigma) + 1\) if h is not given, or
\(2 \mathtt{h} + 1\)
Example:
>>> from machinevisiontoolbox import Kernel
>>> K = Kernel.Gauss(sigma=1, h=2)
>>> K.shape
(5, 5)
>>> K
array([[0.003 , 0.0133, 0.0219, 0.0133, 0.003 ],
       [0.0133, 0.0596, 0.0983, 0.0596, 0.0133],
       [0.0219, 0.0983, 0.1621, 0.0983, 0.0219],
       [0.0133, 0.0596, 0.0983, 0.0596, 0.0133],
       [0.003 , 0.0133, 0.0219, 0.0133, 0.003 ]])
>>> K = Kernel.Gauss(sigma=2)
>>> K.shape
(13, 13)
- Note:
The volume under the Gaussian kernel is one.
If the kernel is strongly truncated, ie. it is non-zero at the edges of the window then the volume will be less than one.
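The kernel values in the example above can be reproduced with NumPy by sampling the formula on a \((2h+1) \times (2h+1)\) grid and rescaling to unit sum. The rescaling step is an assumption inferred from those values, and the function name is hypothetical.

```python
import math
import numpy as np

def gauss_kernel(sigma, h=None):
    """Sampled 2D Gaussian, rescaled so that its elements sum to one."""
    if h is None:
        h = math.ceil(3 * sigma)          # default half width
    u, v = np.meshgrid(np.arange(-h, h + 1), np.arange(-h, h + 1))
    K = np.exp(-(u ** 2 + v ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return K / K.sum()                    # assumed normalisation to unit volume

K = gauss_kernel(sigma=1, h=2)
print(K.shape)
print(np.round(K, 4))
print(gauss_kernel(sigma=2).shape)
```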
- References:
Robotics, Vision & Control for Python, Section 11.5.1.1, P. Corke, Springer 2023.
- static Laplace()[source]
Laplacian kernel
- Returns:
Laplacian kernel
- Return type:
ndarray(3,3)
Return the Laplacian kernel
\[\begin{split}\mathbf{K} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}\end{split}\]
Example:
>>> from machinevisiontoolbox import Kernel
>>> Kernel.Laplace()
array([[ 0, 1, 0],
       [ 1, -4, 1],
       [ 0, 1, 0]])
- Note:
This kernel has an isotropic response to image gradient.
- References:
Robotics, Vision & Control for Python, Section 11.5.1.3, P. Corke, Springer 2023.
- Seealso:
LoG
zerocross
- static Sobel()[source]
Sobel edge detector
- Returns:
Sobel kernel
- Return type:
ndarray(3,3)
Return the Sobel kernel for horizontal gradient
\[\begin{split}\mathbf{K} = \frac{1}{8} \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}\end{split}\]
Example:
>>> from machinevisiontoolbox import Kernel
>>> Kernel.Sobel()
array([[ 0.125, 0. , -0.125],
       [ 0.25 , 0. , -0.25 ],
       [ 0.125, 0. , -0.125]])
- Note:
This kernel is an effective vertical-edge detector.
The y-derivative (horizontal-edge) kernel is K.T.
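The effect of the 1/8 scale factor can be checked on an intensity ramp: convolving with K returns the slope of the ramp, and K.T returns zero. The sketch uses a hand-written convolution over the valid region only; the helper name is hypothetical.

```python
import numpy as np

K = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]]) / 8

def convolve_valid(image, kernel):
    """Plain 2D convolution over the region where the kernel fits entirely."""
    flipped = kernel[::-1, ::-1]          # convolution flips the kernel
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for v in range(out.shape[0]):
        for u in range(out.shape[1]):
            out[v, u] = (image[v:v + kh, u:u + kw] * flipped).sum()
    return out

# Intensity ramp that increases by one grey level per pixel along u
ramp = np.tile(np.arange(6.0), (5, 1))
Iu = convolve_valid(ramp, K)      # horizontal gradient
Iv = convolve_valid(ramp, K.T)    # vertical gradient
print(Iu)
print(Iv)
```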
- References:
Robotics, Vision & Control for Python, Section 11.5.1.3, P. Corke, Springer 2023.
- static DoG(sigma1, sigma2=None, h=None)[source]
Difference of Gaussians kernel
- Parameters:
sigma1 (float) – standard deviation of first Gaussian kernel
sigma2 (float, optional) – standard deviation of second Gaussian kernel
h (int, optional) – half-width of Gaussian kernel
- Returns:
difference of Gaussian kernel
- Return type:
ndarray(2h+1, 2h+1)
Return the 2-dimensional difference of Gaussian kernel defined by two standard deviation values:
\[\mathbf{K} = G(\sigma_1) - G(\sigma_2)\]
where \(\sigma_1 > \sigma_2\). By default, \(\sigma_2 = 1.6 \sigma_1\).
The kernel is centred within a square array with side length given by:
\(2 \mbox{ceil}(3 \sigma) + 1\) if h is not given, or
\(2\mathtt{h} + 1\)
Example:
>>> from machinevisiontoolbox import Kernel
>>> Kernel.DoG(1)
array([[ 0.0019, 0.0049, 0.0082, 0.0095, 0.0082, 0.0049, 0.0019],
       [ 0.0049, 0.0108, 0.0116, 0.0085, 0.0116, 0.0108, 0.0049],
       [ 0.0082, 0.0116, -0.0142, -0.0427, -0.0142, 0.0116, 0.0082],
       [ 0.0095, 0.0085, -0.0427, -0.0937, -0.0427, 0.0085, 0.0095],
       [ 0.0082, 0.0116, -0.0142, -0.0427, -0.0142, 0.0116, 0.0082],
       [ 0.0049, 0.0108, 0.0116, 0.0085, 0.0116, 0.0108, 0.0049],
       [ 0.0019, 0.0049, 0.0082, 0.0095, 0.0082, 0.0049, 0.0019]])
- static LoG(sigma, h=None)[source]
Laplacian of Gaussian kernel
- Parameters:
sigma (float) – standard deviation of the Gaussian kernel
h (int, optional) – half-width of kernel
- Returns:
kernel
- Return type:
ndarray(2h+1, 2h+1)
Return a 2-dimensional Laplacian of Gaussian kernel with standard deviation sigma
\[\mathbf{K} = \frac{1}{\pi \sigma^4} \left(\frac{u^2 + v^2}{2 \sigma^2} -1\right) e^{-(u^2 + v^2) / (2 \sigma^2)}\]
The kernel is centred within a square array with side length given by:
\(2 \mbox{ceil}(3 \sigma) + 1\) if h is not given, or
\(2\mathtt{h} + 1\)
Example:
>>> from machinevisiontoolbox import Kernel
>>> Kernel.LoG(1)
array([[ 0.0005, 0.0028, 0.0087, 0.0125, 0.0087, 0.0028, 0.0005],
       [ 0.0028, 0.0177, 0.0394, 0.0432, 0.0394, 0.0177, 0.0028],
       [ 0.0087, 0.0394, 0.0002, -0.0964, 0.0002, 0.0394, 0.0087],
       [ 0.0125, 0.0432, -0.0964, -0.3181, -0.0964, 0.0432, 0.0125],
       [ 0.0087, 0.0394, 0.0002, -0.0964, 0.0002, 0.0394, 0.0087],
       [ 0.0028, 0.0177, 0.0394, 0.0432, 0.0394, 0.0177, 0.0028],
       [ 0.0005, 0.0028, 0.0087, 0.0125, 0.0087, 0.0028, 0.0005]])
- static DGauss(sigma, h=None)[source]
Derivative of Gaussian kernel
- Parameters:
sigma (float) – standard deviation of the Gaussian kernel
h (int, optional) – half-width of kernel
- Returns:
kernel
- Return type:
ndarray(2h+1, 2h+1)
Returns a 2-dimensional derivative of Gaussian kernel with standard deviation sigma
\[\mathbf{K} = \frac{-x}{2\pi \sigma^2} e^{-(x^2 + y^2) / (2 \sigma^2)}\]
The kernel is centred within a square array with side length given by:
\(2 \mbox{ceil}(3 \sigma) + 1\) if h is not given, or
\(2\mathtt{h} + 1\)
Example:
>>> from machinevisiontoolbox import Kernel
>>> Kernel.DGauss(1)
array([[ 0.0001, 0.0005, 0.0011, -0. , -0.0011, -0.0005, -0.0001],
       [ 0.0007, 0.0058, 0.0131, -0. , -0.0131, -0.0058, -0.0007],
       [ 0.0032, 0.0261, 0.0585, -0. , -0.0585, -0.0261, -0.0032],
       [ 0.0053, 0.0431, 0.0965, -0. , -0.0965, -0.0431, -0.0053],
       [ 0.0032, 0.0261, 0.0585, -0. , -0.0585, -0.0261, -0.0032],
       [ 0.0007, 0.0058, 0.0131, -0. , -0.0131, -0.0058, -0.0007],
       [ 0.0001, 0.0005, 0.0011, -0. , -0.0011, -0.0005, -0.0001]])
- Note:
This kernel is the horizontal derivative of the Gaussian, \(dG/dx\).
The vertical derivative, \(dG/dy\), is the transpose of this kernel.
This kernel is an effective edge detector.
- References:
Robotics, Vision & Control for Python, Section 11.5.1.3, P. Corke, Springer 2023.
- static Circle(radius, h=None, normalize=False, dtype='uint8')[source]
Circular structuring element
- Parameters:
radius (scalar, array_like(2)) – radius of circular structuring element
h (int) – half-width of kernel
normalize (bool, optional) – normalize volume of kernel to one, defaults to False
dtype (str or NumPy dtype, optional) – data type for image, defaults to uint8
- Returns:
circular kernel
- Return type:
ndarray(2h+1, 2h+1)
Returns a circular kernel of radius radius pixels. Sometimes referred to as a tophat kernel. Values inside the circle are set to one, outside are set to zero.
If radius is a 2-element vector the result is an annulus of ones, and the two numbers are interpreted as inner and outer radii respectively.
The kernel is centred within a square array with side length given by \(2\mathtt{h} + 1\).
Example:
>>> from machinevisiontoolbox import Kernel
>>> Kernel.Circle(2)
array([[0, 0, 1, 0, 0],
       [0, 1, 1, 1, 0],
       [1, 1, 1, 1, 1],
       [0, 1, 1, 1, 0],
       [0, 0, 1, 0, 0]], dtype=uint8)
>>> Kernel.Circle([2, 3])
array([[0, 0, 0, 1, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 1, 0, 0, 0, 1, 0],
       [1, 1, 0, 0, 0, 1, 1],
       [0, 1, 0, 0, 0, 1, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 1, 0, 0, 0]], dtype=uint8)
- References:
Robotics, Vision & Control for Python, Section 11.5.1.1, P. Corke, Springer 2023.
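Both arrays in the example above are reproduced by testing the squared distance from the centre against the squared radii, with both bounds inclusive. This rule, the default half width and the function name are assumptions inferred from the example output, not taken from the toolbox source.

```python
import math
import numpy as np

def circle_kernel(radius, h=None):
    """Disc (scalar radius) or annulus (inner, outer radius) of ones."""
    rmin, rmax = (0, radius) if np.isscalar(radius) else radius
    if h is None:
        h = math.ceil(rmax)               # assumed default half width
    u, v = np.meshgrid(np.arange(-h, h + 1), np.arange(-h, h + 1))
    r2 = u ** 2 + v ** 2
    return ((r2 >= rmin ** 2) & (r2 <= rmax ** 2)).astype(np.uint8)

disc = circle_kernel(2)
annulus = circle_kernel([2, 3])
print(disc)
print(annulus)
```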
- static Box(h, normalize=True)[source]
Square structuring element
- Parameters:
h (int) – half-width of kernel
normalize (bool, optional) – normalize volume of kernel to one, defaults to True
- Returns:
kernel
- Return type:
ndarray(2h+1, 2h+1)
Returns a square kernel with unit volume.
The kernel is centred within a square array with side length given by \(2\mathtt{h} + 1\).
Example:
>>> from machinevisiontoolbox import Kernel
>>> Kernel.Box(2)
array([[0.04, 0.04, 0.04, 0.04, 0.04],
       [0.04, 0.04, 0.04, 0.04, 0.04],
       [0.04, 0.04, 0.04, 0.04, 0.04],
       [0.04, 0.04, 0.04, 0.04, 0.04],
       [0.04, 0.04, 0.04, 0.04, 0.04]])
>>> Kernel.Box(2, normalize=False)
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
- References:
Robotics, Vision & Control for Python, Section 11.5.1.1, P. Corke, Springer 2023.