Image shape/size changing

These methods change the shape or size of an image, or combine multiple images.

class machinevisiontoolbox.ImageReshape.ImageReshapeMixin[source]
trim(left=0, right=0, top=0, bottom=0)[source]

Trim pixels from the edges of the image

Parameters
  • left (int, optional) – number of pixels to trim from left side of image, defaults to 0

  • right (int, optional) – number of pixels to trim from right side of image, defaults to 0

  • top (int, optional) – number of pixels to trim from top side of image, defaults to 0

  • bottom (int, optional) – number of pixels to trim from bottom side of image, defaults to 0

Returns

trimmed image

Return type

Image

Trim pixels from the edges of the image.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png')
>>> img
Image: 640 x 426 (uint8), R:G:B [.../images/flowers1.png]
>>> img.trim(left=100, bottom=100)
Image: 540 x 326 (uint8), R:G:B
pad(left=0, right=0, top=0, bottom=0, value=0)[source]

Pad the edges of the image

Parameters
  • left (int, optional) – number of pixels to pad on left side of image, defaults to 0

  • right (int, optional) – number of pixels to pad on right side of image, defaults to 0

  • top (int, optional) – number of pixels to pad on top side of image, defaults to 0

  • bottom (int, optional) – number of pixels to pad on bottom side of image, defaults to 0

  • value (scalar, str, array_like) – value of pixels to pad with

Returns

padded image

Return type

Image

Pad the edges of the image with pixels equal to value.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png', dtype='float')
>>> img
Image: 640 x 426 (float32), R:G:B [.../images/flowers1.png]
>>> img.pad(left=10, bottom=10, top=10, right=10, value='r')
Image: 660 x 446 (float32), R:G:B
classmethod Hstack(images, sep=1, bgcolor=None, return_offsets=False)[source]

Horizontal concatenation of images

Parameters
  • images (iterable of Image) – images to concatenate horizontally

  • sep (int, optional) – separation between images, defaults to 1

  • bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black

  • return_offsets (bool, optional) – additionally return the horizontal coordinates of each input image within the output image, defaults to False

Raises
  • ValueError – all images must have the same dtype

  • ValueError – all images must have the same color order

Returns

horizontally stacked images

Return type

Image

Create a new image by stacking the input images horizontally, with a vertical separator line of width sep and color bgcolor.

The horizontal coordinate of the first column of each image, in the composite output image, can be optionally returned if return_offsets is True.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> img
Image: 1280 x 851 (uint8) [.../images/street.png]
>>> Image.Hstack((img, img, img))
Image: 3842 x 851 (uint8)
>>> Image.Hstack((img, img, img), return_offsets=True)
(Image: 3842 x 851 (uint8), [0, 1281, 2562])
Seealso

Vstack Tile

classmethod Vstack(images, sep=1, bgcolor=None, return_offsets=False)[source]

Vertical concatenation of images

Parameters
  • images (iterable of Image) – images to concatenate vertically

  • sep (int, optional) – separation between images, defaults to 1

  • bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black

  • return_offsets (bool, optional) – additionally return the vertical coordinates of each input image within the output image, defaults to False

Raises
  • ValueError – all images must have the same dtype

  • ValueError – all images must have the same color order

Returns

vertically stacked images

Return type

Image

Create a new image by stacking the input images vertically, with a horizontal separator line of width sep and color bgcolor.

The vertical coordinate of the first row of each image, in the composite output image, can be optionally returned if return_offsets is True.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> img
Image: 1280 x 851 (uint8) [.../images/street.png]
>>> Image.Vstack((img, img, img))
Image: 1280 x 2555 (uint8)
>>> Image.Vstack((img, img, img), return_offsets=True)
(Image: 1280 x 2555 (uint8), [0, 852, 1704])
Seealso

Hstack Tile

classmethod Tile(tiles, columns=4, sep=2, bgcolor=None)[source]

Tile images into a grid

Parameters
  • tiles (iterable of Image) – images to tile

  • columns (int, optional) – number of columns in the grid, defaults to 4

  • sep (int, optional) – separation between images, defaults to 2

  • bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black

Raises
  • ValueError – all images must have the same size

  • ValueError – all images must have the same dtype

Returns

grid of images

Return type

Image instance

Construct a new image by tiling the input images into a grid.

Example:

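A minimal usage sketch, assuming a set of same-sized greyscale tiles; the campus image set and the mono() conversion are illustrative assumptions:

>>> from machinevisiontoolbox import Image, ImageCollection
>>> images = ImageCollection("campus/*.png")   # assumed sample image set
>>> tiles = [img.mono() for img in images]     # convert each tile to greyscale
>>> grid = Image.Tile(tiles, columns=4)
>>> grid.disp()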
Seealso

Hstack Vstack

decimate(m=2, sigma=None)[source]

Decimate an image

Parameters
  • m (int) – decimation factor

  • sigma (float, optional) – standard deviation for Gaussian kernel smoothing, defaults to None

Raises

ValueError – decimation factor m must be an integer

Returns

decimated image

Return type

Image

Return a decimated version of the image whose size is reduced by subsampling every m-th pixel (m an integer) in both dimensions.

The image is smoothed with a Gaussian kernel with standard deviation sigma. If

  • sigma is None then a value of m/2 is used,

  • sigma is zero then no smoothing is performed.

Note

  • If the image has multiple planes, each plane is decimated.

  • Smoothing is used to eliminate aliasing artifacts and the standard deviation should be chosen as a function of the maximum spatial frequency in the image.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Random(6)
>>> img.print()
 253  90  99 122  78 242
  67 249 187 182 137  67
   8 156 206   4  64 168
 138  48  75 215  52 248
  17 182 107 250  36  69
 221  41 159 105 196 201
>>> img.decimate(2, sigma=0).print()
 253  99  78
   8 206  64
  17 107  36
References
  • Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.

Seealso

replicate scale

replicate(n=1)[source]

Replicate image pixels

Parameters

n (int, optional) – replication factor, defaults to 1

Returns

image with replicated pixels

Return type

Image

Create an image where each input pixel becomes an \(n \times n\) patch of pixel values. This is a simple way of upscaling an image.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Random(5)
>>> img.print()
 197  94  44 154 184
 128 244 112 234  82
  76 161 133  34  60
  92 118 150 107 160
  83 240 190  65  45
>>> bigger = img.replicate(2)
>>> bigger.print()
 197 197  94  94  44  44 154 154 184 184
 197 197  94  94  44  44 154 154 184 184
 128 128 244 244 112 112 234 234  82  82
 128 128 244 244 112 112 234 234  82  82
  76  76 161 161 133 133  34  34  60  60
  76  76 161 161 133 133  34  34  60  60
  92  92 118 118 150 150 107 107 160 160
  92  92 118 118 150 150 107 107 160 160
  83  83 240 240 190 190  65  65  45  45
  83  83 240 240 190 190  65  65  45  45

Note

  • Works only for greyscale images.

  • The resulting image is “blocky”; apply Gaussian smoothing to reduce this.

References
  • Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.

Seealso

decimate

roi(bbox=None)[source]

Extract region of interest

Parameters

bbox (array_like(4)) – region as [umin, umax, vmin, vmax]

Returns

region of interest, optional bounding box

Return type

Image, list

Return the specified region of the image. If bbox is None the image is displayed using Matplotlib and the user can interactively select the region, returning the image region and the bounding box [umin, umax, vmin, vmax].

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> smile = img.roi([265, 342, 264, 286])
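
Called without a bounding box the selection is interactive, as described above; a minimal sketch of that form, which also returns the selected bounds:

>>> region, bbox = img.roi()   # click and drag in the displayed image; bbox is [umin, umax, vmin, vmax]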
samesize(image2, bias=0.5)[source]

Automatic image trimming

Parameters
  • image2 (Image) – image to match size with

  • bias (float, optional) – bias that controls what part of the image is cropped, defaults to 0.5

Returns

resized image

Return type

Image

Return a version of the image that has the same dimensions as image2. This is achieved by cropping (to match the aspect ratio) and scaling (to match the size).

bias controls which part of the image is cropped. bias = 0.5 is symmetric cropping, bias < 0.5 moves the crop window up or to the left, while bias > 0.5 moves the crop window down or to the right.

Example:

>>> from machinevisiontoolbox import Image
>>> foreground = Image.Read("greenscreen.png", dtype="float")
>>> foreground
Image: 1024 x 768 (float32), R:G:B [.../images/greenscreen.png]
>>> background = Image.Read("road.png", dtype="float")
>>> background
Image: 1280 x 851 (float32), R:G:B [.../images/road.png]
>>> background.samesize(foreground)
Image: 1024 x 768 (float32), R:G:B
References
  • Robotics, Vision & Control for Python, Section 11.4.1.1, P. Corke, Springer 2023.

Seealso

trim scale

scale(sfactor, sigma=None, interpolation=None)[source]

Scale an image

Parameters
  • sfactor (scalar) – scale factor

  • sigma (float, optional) – standard deviation of kernel for image smoothing, in pixels

  • interpolation (str, optional) – interpolation mode, see the table below, defaults to None

Raises
  • ValueError – bad interpolation string

  • ValueError – bad interpolation value

Returns

scaled image

Return type

Image instance

Rescale the image. If sfactor > 1 the image is enlarged.

If sfactor < 1 the image is made smaller and smoothing can be applied to reduce sampling artefacts. If sigma is None, a default of sigma = 1/(2 sfactor) is used; if sigma = 0 no smoothing is performed. The interpolation mode is one of:

interpolation   description
'cubic'         bicubic interpolation
'linear'        bilinear interpolation
'area'          resampling using pixel area relation

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> img.scale(2)
Image: 1354 x 1400 (uint8), R:G:B
>>> img.scale(0.5)
Image: 338 x 350 (uint8), R:G:B
References
  • Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.

Seealso

opencv.resize

rotate(angle, centre=None)[source]

Rotate an image

Parameters
  • angle (scalar) – rotation angle [radians]

  • centre (array_like(2)) – centre of rotation, defaults to centre of image

Returns

rotated image

Return type

Image

Rotate the image counter-clockwise by angle, given in radians.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> out = img.rotate(0.5)
>>> out.disp()
<matplotlib.image.AxesImage object at 0x7f2e59036d00>

Note

  • Rotation is defined with respect to a z-axis which is into the image, therefore counter-clockwise is a positive angle.

  • The pixels in the corners of the resulting image will be undefined.

rotate_spherical(R)[source]

Rotate a spherical image

Parameters

R (spatialmath.pose3d.SO3) – an SO(3) rotation matrix

Returns

rotated spherical image

Return type

Image

Rotates pixels in the input spherical image by the SO(3) rotation matrix.

A spherical image is represented by a rectangular array of pixels with a horizontal domain that spans azimuth angle \(\phi \in [0, 2\pi]\) and a vertical domain that spans colatitude angle \(\theta \in [0, \pi]\).
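
A minimal sketch, assuming a spherical (equirectangular) image file named 'spherical.png' and a rotation constructed with spatialmath:

>>> from machinevisiontoolbox import Image
>>> from spatialmath import SO3
>>> sphere = Image.Read('spherical.png')        # assumed equirectangular image
>>> out = sphere.rotate_spherical(SO3.Ry(0.5))  # rotate by 0.5 rad about the y-axis
>>> out.disp()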

Seealso

meshgrid uspan vspan scipy.interpolate.griddata

meshgrid(width=None, height=None, step=1)[source]

Coordinate arrays for image

Parameters
  • width (int, optional) – width of array in pixels, defaults to width of image

  • height (int, optional) – height of array in pixels, defaults to height of image

Returns

domain of image

Return type

ndarray(H,W), ndarray(H,W)

Create a pair of arrays U and V that describe the domain of the image. The element U(u,v) = u and V(u,v) = v. These matrices can be used for the evaluation of functions over the image such as interpolation and warping.

Invoking as a class method with self=None is a convenient way to access base.meshgrid.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Zeros(3)
>>> U, V = img.meshgrid()
>>> U
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
>>> V
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2]])
>>> Image(U**2 + V**2).image
array([[0, 1, 4],
       [1, 2, 5],
       [4, 5, 8]])
>>> U, V = Image.meshgrid(None, 4, 4)
>>> U
array([[0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3]])
Seealso

meshgrid

warp(U, V, interp=None, domain=None)[source]

Image warping

Parameters
  • U (ndarray(Wo,Ho)) – u-coordinate array for output image

  • V (ndarray(Wo,Ho)) – v-coordinate array for output image

  • interp (str, optional) – interpolation mode, defaults to None

  • domain ((ndarray(H,W), ndarray(H,W)), optional) – domain of output image, defaults to None

Returns

warped image

Return type

Image

Compute an image by warping the input image. The output image is \(H_o \times W_o\) and output pixel (u,v) is interpolated from the input image coordinate (U[u,v], V[u,v]):

\[Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime = U_{u,v}, v^\prime = V_{u,v}\]
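
A minimal sketch of a pure translation warp using coordinate arrays from meshgrid; the 100 and 50 pixel offsets are arbitrary:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> U, V = img.meshgrid()            # coordinate arrays spanning the image
>>> out = img.warp(U - 100, V - 50)  # output (u,v) samples input (u-100, v-50)
>>> out.disp()                       # content appears shifted right and down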

Note

Uses OpenCV.

Seealso

interp2d meshgrid opencv.remap

interp2d(U, V, Ud=None, Vd=None, **kwargs)[source]

Image warping

Parameters
  • U (ndarray(Ho,Wo)) – u-coordinate array for output image

  • V (ndarray(Ho,Wo)) – v-coordinate array for output image

  • Ud (ndarray(H,W), optional) – u-coordinate array for domain of input image, defaults to None

  • Vd (ndarray(H,W), optional) – v-coordinate array for domain of input image, defaults to None

Returns

warped image

Return type

Image

Compute an image by warping the input image. The output image is \(H_o \times W_o\) and output pixel (u,v) is interpolated from the input image coordinate (U[v,u], V[v,u]).

\[Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime = U_{u,v}, v^\prime = V_{u,v}\]

The coordinates in U and V are with respect to the domain of the input image but can be overridden by specifying Ud and Vd.
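
A minimal sketch in the same style as warp, magnifying the top-left quadrant of the image by a factor of two; the image and mapping chosen are arbitrary:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> U, V = img.meshgrid()
>>> out = img.interp2d(U / 2, V / 2)   # output (u,v) samples input (u/2, v/2): a 2x zoom
>>> out.disp()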

Note

Uses SciPy

Seealso

warp meshgrid uspan vspan scipy.interpolate.griddata

warp_affine(M, inverse=False, size=None, bgcolor=None)[source]

Affine warp of image

Parameters
  • M (ndarray(2,3), SE2) – affine matrix

  • inverse (bool, optional) – warp with inverse of M, defaults to False

  • size (array_like(2), optional) – size of output image, defaults to size of input image

  • bgcolor (scalar, str, array_like, optional) – background color, defaults to None

Returns

warped image

Return type

Image

Apply an affine warp to the image. Pixels in the output image that correspond to pixels outside the input image are set to bgcolor.

\[\begin{split}Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } \begin{pmatrix} u^\prime \\ v^\prime \end{pmatrix} = \mat{M} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}\end{split}\]

Example:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> from spatialmath import SE2
>>> img = Image.Read('monalisa.png')
>>> M = np.diag([0.25, 0.25, 1]) * SE2(100, 200)  # scale and translate
>>> M
array([[  0.25,   0.  , 100.  ],
       [  0.  ,   0.25, 200.  ],
       [  0.  ,   0.  ,   1.  ]])
>>> out = img.warp_affine(M, bgcolor=np.nan)  # unmapped pixels are NaNs
>>> out.disp(badcolor="r")  # display warped image with NaNs as red
<matplotlib.image.AxesImage object at 0x7fedf4028cd0>

Note

Only the first two rows of M are used.

Seealso

warp opencv.warpAffine

warp_perspective(H, method='linear', inverse=False, tile=False, size=None)[source]

Perspective warp

Parameters
  • H (ndarray(3,3)) – homography

  • method (str, optional) – interpolation mode: 'linear' [default], 'nearest'

  • inverse (bool, optional) – use inverse of H, defaults to False

  • tile (bool, optional) – return minimal enclosing tile, defaults to False

  • size (array_like(2), optional) – size of output image, defaults to size of input image

Raises

TypeError – H must be a 3x3 NumPy array

Returns

warped image

Return type

Image

Applies a perspective warp to the input image.

\[\begin{split}Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime=\frac{\tilde{u}}{\tilde{w}}, v^\prime=\frac{\tilde{v}}{\tilde{w}}, \begin{pmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{pmatrix} = \mat{H} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}\end{split}\]

The resulting image may be smaller or larger than the input image. If tile is True the output image is the smallest rectangle that contains the warped result, and it is returned along with its position with respect to the origin of the input image and the coordinates of the four corners of the input image.
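
A minimal sketch, assuming a simple shear homography; the matrix values are arbitrary:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image.Read('monalisa.png')
>>> H = np.array([[1, 0.2, 0],
...               [0, 1.0, 0],
...               [0, 0.0, 1]])        # shear in the u-direction
>>> out = img.warp_perspective(H)
>>> out.disp()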

References
  • Robotics, Vision & Control for Python, Section 14.8, P. Corke, Springer 2023.

Seealso

warp opencv.warpPerspective

undistort(K, dist)[source]

Undistort image

Parameters
  • K (ndarray(3,3)) – camera intrinsics

  • dist (array_like(5)) – lens distortion parameters

Returns

undistorted image

Return type

Image

Remove lens distortion from image.

The distortion coefficients are \((k_1, k_2, p_1, p_2, k_3)\) where \(k_i\) are radial distortion coefficients and \(p_i\) are tangential distortion coefficients.

Example:

>>> from machinevisiontoolbox import Image, ImageCollection
>>> import numpy as np
>>> images = ImageCollection("calibration/*.jpg")
>>> K = np.array([[ 534.1, 0, 341.5], [ 0, 534.1, 232.9], [ 0, 0, 1]])
>>> distortion = np.array([ -0.293, 0.1077, 0.00131, -3.109e-05, 0.04348])
>>> out = images[12].undistort(K, distortion)
>>> out.disp()
<matplotlib.image.AxesImage object at 0x7f9358254c70>
Seealso

images2C opencv.undistort

view1d()[source]

Convert image to a column view

Returns

column view

Return type

ndarray(N,) or ndarray(N, np)

A greyscale image is converted to a 1D array in row-major (C) order, i.e. row 0, row 1, etc.

A color image is converted to a 2D array in row-major (C) order, with one row per pixel; each row holds the values of that pixel's planes.

Example:

>>> from machinevisiontoolbox import Image
>>> Image.Read('street.png').view1d().shape
(1089280,)
>>> Image.Read('monalisa.png').view1d().shape
(473900, 3)

Note

This creates a view of the original image, so operations on the column will affect the original image.
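
A minimal sketch of the view semantics described in the note; the number of pixels zeroed is arbitrary:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> v = img.view1d()
>>> v[:1000] = 0     # zero the first 1000 pixels, in row-major order
>>> img.disp()       # the change is visible in the original image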