Image processing

These methods perform image processing operations on grey-scale and color images.

class machinevisiontoolbox.ImageProcessing.ImageProcessingMixin[source]
LUT(lut, colororder=None)[source]

Apply lookup table

Parameters
  • lut (array_like(256), ndarray(256,N)) – lookup table

  • colororder (str or dict) – colororder for output image, optional

Returns

transformed image

Return type

Image

For a greyscale image the LUT can be:

  • (256,)

  • (256,N) in which case the resulting image has N planes created my applying the I’th column of the LUT to the input image

For a color image the LUT can be:

  • (256,) and applied to every plane, or

  • (256,N) where the LUT columns are applied to the N planes of the input image.

Example:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image([[100, 150], [200, 250]])
>>> img.LUT(np.arange(255, -1, -1, dtype='uint8')).A
array([[155, 105],
       [ 55,   5]], dtype=uint8)

Note

Works only for uint8 and int8 image and LUT.

References
  • Robotics, Vision & Control for Python, Section 11.3, P. Corke, Springer 2023.

Seealso

cv2.LUT

apply(func, vectorize=False)[source]

Apply a function to an image

Parameters

func (callable) – function to apply to image or pixel

Returns

transformed image

Return type

Image

If vectorize is False the function is called with a single argument which is the underlying NumPy array, and it must return a NumPy array. The return array can have different dimensions to its argument.

If vectorize is True the function is called for every pixel with a single argument which is a scalar or a 1d-array of length equal to the number of color planes. The return array will have the same dimensions to its argument.

Example:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> import math
>>> img = Image([[1, 2], [3, 4]])
>>> img.apply(np.sqrt).image
array([[1.   , 1.414],
       [1.732, 2.   ]], dtype=float16)
>>> img.apply(lambda x: math.sqrt(x), vectorize=True).image
array([[1.    , 1.4142],
       [1.7321, 2.    ]])

Note

Slow when vectorize=True which involves a large number of calls to func.

References
  • Robotics, Vision & Control for Python, Section 11.3, P. Corke, Springer 2023.

Seealso

apply2

apply2(other, func, vectorize=False)[source]

Apply a function to two images

Parameters

func (callable) – function to apply to image or pixel

Raises

ValueError – images must have same size

Returns

transformed image

Return type

Image

If vectorize is False the function is called with two arguments which are the underlying NumPy arrays, and it must return a NumPy array. The return array can have different dimensions to its arguments.

If vectorize is True the function is called for every pixel in both images with two arguments which are the corresponding pixel values as a scalar or 1d-array of length equal to the number of color planes. The function returns a scalar or a 1d-array. The return array will have the same dimensions to its argument.

Example:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> import math
>>> img1 = Image([[1, 2], [3, 4]])
>>> img2 = Image([[5, 6], [7, 8]])
>>> img1.apply2(img2, np.hypot).image
array([[5.098, 6.324],
       [7.617, 8.945]], dtype=float16)
>>> img1.apply2(img2, lambda x, y: math.hypot(x,y), vectorize=True).image
array([[5.099 , 6.3246],
       [7.6158, 8.9443]])

Note

Slow when vectorize=True which involves a large number of calls to func.

References
  • Robotics, Vision & Control for Python, Section 11.4, P. Corke, Springer 2023.

Seealso

apply

clip(min, max)[source]

Clip pixel values

Parameters
  • min (int or float) – minimum value

  • max (int or float) – maximum value

Returns

transformed image

Return type

Image

Pixels in the returned image will have values in the range [min, max] inclusive.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image([[1, 2], [3, 4]])
>>> img.clip(2, 3).A
array([[2, 2],
       [3, 3]], dtype=uint8)
Seealso

numpy.clip

roll(ru=0, rv=0)[source]

Roll image by row or column

Parameters
  • ru (int, optional) – roll in the column direction, defaults to 0

  • rv (int, optional) – roll in the row direction, defaults to 0

Returns

rolled image

Return type

Image instance

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> img.roll(ru=1).A
array([[3, 1, 2],
       [6, 4, 5],
       [9, 7, 8]], dtype=uint8)
>>> img.roll(ru=-1).A
array([[2, 3, 1],
       [5, 6, 4],
       [8, 9, 7]], dtype=uint8)
>>> img.roll(rv=1).A
array([[7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]], dtype=uint8)
>>> img.roll(ru=1, rv=-1).A
array([[6, 4, 5],
       [9, 7, 8],
       [3, 1, 2]], dtype=uint8)
Seealso

numpy.roll

normhist()[source]

Histogram normalisaton

Returns

normalised image

Return type

Image instance

Return a histogram normalized version of the image which highlights image detail in low-contrast areas of an image.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image([[10, 20, 30], [40, 41, 42], [70, 80, 90]])
>>> img.normhist().A
array([[  0,  32,  64],
       [ 96, 128, 159],
       [191, 223, 255]], dtype=uint8)

Note

  • The histogram of the normalized image is approximately uniform, that is, all grey levels ae equally likely to occur.

  • Color images automatically converted to grayscale

References
  • Robotics, Vision & Control for Python, Section 11.3, P. Corke, Springer 2023.

Seealso

cv2.equalizeHist

stretch(max=1, range=None, clip=True)[source]

Image normalisation

Parameters
  • max (int, float, optional) – maximum value of output pixels, defaults to 1

  • range (array_like(2), optional) – range[0] is mapped to 0, range[1] is mapped to max

  • clip (bool, optional) – clip pixel values to interval [0, max], defaults to True

Returns

Image with pixel values stretched to M across r

Return type

Image

Returns a normalised image in which all pixel values are linearly mapped to the interval of 0.0 to max. That is, the minimum pixel value is mapped to 0 and the maximum pixel value is mapped to max.

If range is specified then range[0] is mapped to 0.0 and range[1] is mapped to max. If clip is False then pixels less than range[0] will be mapped to a negative value and pixels greater than range[1] will be mapped to a value greater than max.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> img.stretch().A
array([[0.   , 0.125, 0.25 ],
       [0.375, 0.5  , 0.625],
       [0.75 , 0.875, 1.   ]])
References
  • Robotics, Vision & Control, Section 12.1, P. Corke, Springer 2011.

thresh(t=None, opt='binary')[source]

Image threshold

Parameters
  • t (scalar, str) – threshold value

  • option (str, optional) – threshold option, defaults to ‘binary’

Returns

thresholded image

Return type

Image

Apply a threshold t to the image. Various thresholding options are supported:

Option

Function

'binary'

\(Y_{u,v} = \left\{ \begin{array}{l} m \mbox{, if } X_{u,v} > t \\ 0 \mbox{, otherwise} \end{array} \right.\)

'binary_inv'

\(Y_{u,v} = \left\{ \begin{array}{l} 0 \mbox{, if } X_{u,v} > t \\ m \mbox{, otherwise} \end{array} \right.\)

'truncate'

\(Y_{u,v} = \left\{ \begin{array}{l} t \mbox{, if } X_{u,v} > t \\ X_{u,v} \mbox{, otherwise} \end{array} \right.\)

'tozero'

\(Y_{u,v} = \left\{ \begin{array}{l} X_{u,v} \mbox{, if } X_{u,v} > t \\ 0 \mbox{, otherwise} \end{array} \right.\)

'tozero_inv'

\(Y_{u,v} = \left\{ \begin{array}{l} 0 \mbox{, if } X_{u,v} > t \\ X_{u,v} \mbox{, otherwise} \end{array} \right.\)

where \(m\) is the maximum value of the image datatype.

If threshold t is a string then the threshold is determined automatically:

threshold

algorithm

'otsu'

Otsu’s method finds the threshold that minimizes the within-class variance. This technique is effective for a bimodal greyscale histogram.

The triangle method constructs a line between the histogram peak and the farthest end of the histogram. The threshold is the point of maximum distance between the line and the histogram. This technique is effective when the object pixels produce a weak peak in the histogram.

'triangle'

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> img.thresh(5).image
array([[  0,   0,   0],
       [  0,   0, 255],
       [255, 255, 255]], dtype=uint8)

Note

  • The threshold is applied to all color planes

  • If threshold is ‘otsu’ or ‘triangle’ the image must be greyscale, and the computed threshold is also returned.

References
  • A Threshold Selection Method from Gray-Level Histograms, N. Otsu. IEEE Trans. Systems, Man and Cybernetics Vol SMC-9(1), Jan 1979, pp 62-66.

  • Automatic measurement of sister chromatid exchange frequency” Zack (Zack GW, Rogers WE, Latt SA (1977), J. Histochem. Cytochem. 25 (7): 741–53.

  • Robotics, Vision & Control for Python, Section 12.1.1, P. Corke, Springer 2023.

Seealso

ithresh adaptive_threshold otsu opencv.threshold

ithresh()[source]

Interactive thresholding

Returns

selected threshold value

Return type

scalar

The image is displayed with a binary threshold displayed in a simple Matplotlib GUI along with the histogram and a slider for threshold value. Adjusting the slider changes the thresholded image view.

The displayed image is

\[\begin{split}Y_{u,v} = \left\{ \begin{array}{l} m \mbox{, if } X_{u,v} > t \\ 0 \mbox{, otherwise} \end{array} \right.\end{split}\]
References
  • Robotics, Vision & Control for Python, Section 12.1.1.1, P. Corke, Springer 2023.

Seealso

thresh adaptive_threshold otsu opencv.threshold

otsu()[source]

Otsu threshold selection

Returns

Otsu’s threshold

Return type

scalar

Compute the optimal threshold for binarizing an image with a bimodal intensity histogram. t is a scalar threshold that maximizes the variance between the classes of pixels below and above the thresold t.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> img.otsu()
116

Note

  • Converts a color image to greyscale.

  • OpenCV implementation gives slightly different result to MATLAB Machine Vision Toolbox.

References
  • A Threshold Selection Method from Gray-Level Histograms, N. Otsu. IEEE Trans. Systems, Man and Cybernetics Vol SMC-9(1), Jan 1979, pp 62-66.

  • An improved method for image thresholding on the valley-emphasis method. H-F Ng, D. Jargalsaikhan etal. Signal and Info Proc. Assocn. Annual Summit and Conf (APSIPA). 2013. pp1-4

  • Robotics, Vision & Control for Python, Section 12.1.1, P. Corke, Springer 2023.

Seealso

thresh ithresh adaptive_threshold opencv.threshold

blend(image2, alpha, beta=None, gamma=0)[source]

Image blending

Parameters
  • image2 (Image) – second image

  • alpha (float) – fraction of image

  • beta (float, optional) – fraction of image2, defaults to 1-alpha

  • gamma (int, optional) – gamma nonlinearity, defaults to 0

Raises
  • ValueError – images are not same size

  • ValueError – images are of different type

Returns

blended image

Return type

Image

The resulting image is

\[\mathbf{Y} = \alpha \mathbf{X}_1 + \beta \mathbf{X}_2 + \gamma\]

Example:

>>> from machinevisiontoolbox import Image
>>> img1 = Image.Constant(3, value=4)
>>> img2 = Image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> img1.blend(img2, 0.5, 2).A
array([[ 4,  6,  8],
       [10, 12, 14],
       [16, 18, 20]], dtype=uint8)

Note

  • For integer images the result is saturated.

  • For a multiplane image each plane is processed independently.

Seealso

choose cv2.addWeighted

choose(image2, mask)[source]

Pixel-wise image merge

Parameters
  • image2 (Image, array_like(3), str) – second image

  • mask (ndarray(H,W)) – image mask

Raises
  • ValueError – image and mask must be same size

  • ValueError – image and image2 must be same size

Returns

merged images

Return type

Image

Return an image where each pixel is selected from the corresponding pixel in self or image2 according to the corresponding pixel values in mask. If the element of mask is zero/false the pixel value from self is selected, otherwise the pixel value from image2 is selected:

\[\begin{split}\mathbf{Y}_{u,v} = \left\{ \begin{array}{ll} \mathbf{X}_{1:u,v} & \mbox{if } \mathbf{M}_{u,v} = 0 \\ \mathbf{X}_{2:u,v} & \mbox{if } \mathbf{M}_{u,v} > 0 \end{array} \right.\end{split}\]

If image2 is a scalar or 1D array it is taken as the pixel value, and must have the same number of elements as the channel depth of self. If image2 is a string it is taken as a colorname which is looked up using name2color.

Example:

>>> from machinevisiontoolbox import Image
>>> img1 = Image.Constant(3, value=10)
>>> img2 = Image.Constant(3, value=80)
>>> img = Image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> img1.choose(img2, img >=5).image
array([[10, 10, 10],
       [10, 80, 80],
       [80, 80, 80]], dtype=uint8)
>>> img1 = Image.Constant(3, value=[0,0,0])
>>> img1.choose('red', img>=5).red().image
array([[  0,   0,   0],
       [  0, 255, 255],
       [255, 255, 255]], dtype=uint8)

Note

  • If image and image2 are both greyscale then the result is greyscale.

  • If either of image and image2 are color then the result is color.

  • If one image is double and the other is integer, then the integer image is first converted to a double image.

  • image2 can contain a color descriptor which is one of: a scalar value corresponding to a greyscale, a 3-vector corresponding to a color value, or a string containing the name of a color which is found using name2color.

Seealso

name2color opencv.bitwise_and

paste(pattern, pt, method='set', position='topleft', copy=False, zero=True)[source]

Paste an image into an image

Parameters
  • pattern (Image, ndarray(H,W)) – image to be pasted

  • pt (array_like(2)) – coordinates (u,v) where pattern is pasted

  • method (str) – options for image merging, one of: 'set' [default], 'mean', 'add'

  • position (str, optional) – pt is one of: 'topleft' [default] or 'centre'

  • copy (bool, optional) – copy image before pasting, defaults to False

  • zero (bool, optional) – zero-based coordinates (True, default) or 1-based coordinates (False)

Raises

ValueError – pattern is positioned outside the bounds of the image

Returns

original image with pasted pattern

Return type

Image

Pastes the pattern into the image which is modified inplace. The pattern can be incorporated into the specified image by:

method

description

'set'

overwrites the pixels in image

'add'

adds to the pixels in image

'mean'

sets pixels to the mean of the pixel values in pattern and image

The position of the pasted pattern in the image can be specified by its top left corner (umin, vmin) or its centre in the image.

Example:

>>> from machinevisiontoolbox import Image
>>> img1 = Image.Constant(5, value=10)
>>> pattern = Image([[11, 12], [13, 14]])
>>> img1.copy().paste(pattern, (1,2)).image
array([[10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10],
       [10, 11, 12, 10, 10],
       [10, 13, 14, 10, 10],
       [10, 10, 10, 10, 10]], dtype=uint8)
>>> img1.copy().paste(pattern, (1,2), method='add').image
array([[10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10],
       [10, 21, 22, 10, 10],
       [10, 23, 24, 10, 10],
       [10, 10, 10, 10, 10]], dtype=uint8)
>>> img1.copy().paste(pattern, (1,2), method='mean').image
array([[10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10],
       [10, 10, 11, 10, 10],
       [10, 11, 12, 10, 10],
       [10, 10, 10, 10, 10]], dtype=uint8)

Note

  • Pixels outside the pasted region are unaffected.

  • If copy is False the image is modified in place

  • For position='centre' an odd sized pattern is assumed. For an even dimension the centre pixel is the one at dimension / 2.

  • Multi-plane images are supported.

  • If the pattern is multiplane and the image is singleplane, the image planes are replicated and colororder is taken from the pattern.

  • If the image is multiplane and the pattern is singleplane, the pattern planes are replicated.

invert()[source]

Invert image

Returns

_description_

Return type

_type_

For an integer image

\[\begin{split}Y_{u,v} = \left\{ \begin{array}{l} p_{\mbox{max}} \mbox{, if } X_{u,v} = 0 \\ p_{\mbox{min}} \mbox{, otherwise} \end{array}\right.\end{split}\]

where \(p_{\mbox{min}}\) and \(p_{\mbox{max}}\) are respectively the minimum and maximum value of the datatype.

For a float image

\[\begin{split}Y_{u,v} = \left\{ \begin{array}{l} 1.0 \mbox{, if } X_{u,v} = 0 \\ 0.0 \mbox{, otherwise} \end{array}\right.\end{split}\]

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image([[0, 1], [2, 3]])
>>> img.invert().image
array([[255,   0],
       [  0,   0]], dtype=uint8)