Image shape/size changing

These methods change the shape or size of an image, or combine multiple images.

class machinevisiontoolbox.ImageReshape.ImageReshapeMixin[source]
trim(left=0, right=0, top=0, bottom=0)[source]

Trim pixels from the edges of the image

Parameters
  • left (int, optional) – number of pixels to trim from left side of image, defaults to 0

  • right (int, optional) – number of pixels to trim from right side of image, defaults to 0

  • top (int, optional) – number of pixels to trim from top side of image, defaults to 0

  • bottom (int, optional) – number of pixels to trim from bottom side of image, defaults to 0

Returns

trimmed image

Return type

Image

Trim pixels from the edges of the image.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png')
>>> img
Image: 640 x 426 (uint8), R:G:B [.../images/flowers1.png]
>>> img.trim(left=100, bottom=100)
Image: 540 x 326 (uint8), R:G:B
pad(left=0, right=0, top=0, bottom=0, value=0)[source]

Pad the edges of the image

Parameters
  • left (int, optional) – number of pixels to pad on left side of image, defaults to 0

  • right (int, optional) – number of pixels to pad on right side of image, defaults to 0

  • top (int, optional) – number of pixels to pad on top side of image, defaults to 0

  • bottom (int, optional) – number of pixels to pad on bottom side of image, defaults to 0

  • value (scalar, str, array_like) – value of pixels to pad with

Returns

padded image

Return type

Image

Pad the edges of the image with pixels equal to value.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png', dtype='float')
>>> img
Image: 640 x 426 (float32), R:G:B [.../images/flowers1.png]
>>> img.pad(left=10, bottom=10, top=10, right=10, value='r')
Image: 660 x 446 (float32), R:G:B
classmethod Hstack(images, sep=1, bgcolor=None, return_offsets=False)[source]

Horizontal concatenation of images

Parameters
  • images (iterable of Image) – images to concatenate horizontally

  • sep (int, optional) – separation between images, defaults to 1

  • bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black

  • return_offsets (bool, optional) – additionally return the horizontal coordinates of each input image within the output image, defaults to False

Raises
  • ValueError – all images must have the same dtype

  • ValueError – all images must have the same color order

Returns

horizontally stacked images

Return type

Image

Create a new image by stacking the input images horizontally, with a vertical separator line of width sep and color bgcolor.

The horizontal coordinate of the first column of each image, in the composite output image, can be optionally returned if return_offsets is True.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> img
Image: 1280 x 851 (uint8) [.../images/street.png]
>>> Image.Hstack((img, img, img))
Image: 3842 x 851 (uint8)
>>> Image.Hstack((img, img, img), return_offsets=True)
(Image: 3842 x 851 (uint8), [0, 1281, 2562])
Seealso

Vstack Tile

classmethod Vstack(images, sep=1, bgcolor=None, return_offsets=False)[source]

Vertical concatenation of images

Parameters
  • images (iterable of Image) – images to concatenate vertically

  • sep (int, optional) – separation between images, defaults to 1

  • bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black

  • return_offsets (bool, optional) – additionally return the vertical coordinates of each input image within the output image, defaults to False

Raises
  • ValueError – all images must have the same dtype

  • ValueError – all images must have the same color order

Returns

vertically stacked images

Return type

Image

Create a new image by stacking the input images vertically, with a horizontal separator line of width sep and color bgcolor.

The vertical coordinate of the first row of each image, in the composite output image, can be optionally returned if return_offsets is True.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> img
Image: 1280 x 851 (uint8) [.../images/street.png]
>>> Image.Vstack((img, img, img))
Image: 1280 x 2555 (uint8)
>>> Image.Vstack((img, img, img), return_offsets=True)
(Image: 1280 x 2555 (uint8), [0, 852, 1704])
Seealso

Hstack Tile

classmethod Tile(tiles, columns=4, sep=2, bgcolor=None)[source]

Tile images into a grid

Parameters
  • tiles (iterable of Image) – images to tile

  • columns (int, optional) – number of columns in the grid, defaults to 4

  • sep (int, optional) – separation between images, defaults to 2

  • bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black

Raises
  • ValueError – all images must have the same size

  • ValueError – all images must have the same dtype

Returns

grid of images

Return type

Image instance

Construct a new image by tiling the input images into a grid.

Example:

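A minimal usage sketch, assuming a set of same-sized greyscale tiles; the campus image set and the mono() conversion are illustrative assumptions:

>>> from machinevisiontoolbox import Image, ImageCollection
>>> images = ImageCollection("campus/*.png")   # assumed sample image set
>>> tiles = [img.mono() for img in images]     # convert each tile to greyscale
>>> grid = Image.Tile(tiles, columns=4)
>>> grid.disp()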
Seealso

Hstack Vstack

decimate(m=2, sigma=None)[source]

Decimate an image

Parameters
  • m (int) – decimation factor

  • sigma (float, optional) – standard deviation for Gaussian kernel smoothing, defaults to None

Raises

ValueError – decimation factor m must be an integer

Returns

decimated image

Return type

Image

Return a decimated version of the image whose size is reduced by subsampling every m-th pixel (m an integer) in both dimensions.

The image is smoothed with a Gaussian kernel with standard deviation sigma. If

  • sigma is None then a value of m/2 is used,

  • sigma is zero then no smoothing is performed.

Note

  • If the image has multiple planes, each plane is decimated.

  • Smoothing is used to eliminate aliasing artifacts and the standard deviation should be chosen as a function of the maximum spatial frequency in the image.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Random(6)
>>> img.print()
 253  90  99 122  78 242
  67 249 187 182 137  67
   8 156 206   4  64 168
 138  48  75 215  52 248
  17 182 107 250  36  69
 221  41 159 105 196 201
>>> img.decimate(2, sigma=0).print()
 253  99  78
   8 206  64
  17 107  36
References
  • Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.

Seealso

replicate scale

replicate(n=1)[source]

Replicate image pixels

Parameters

n (int, optional) – replication factor, defaults to 1

Returns

image with replicated pixels

Return type

Image

Create an image where each input pixel becomes an \(n \times n\) patch of pixel values. This is a simple way of upscaling an image.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Random(5)
>>> img.print()
 197  94  44 154 184
 128 244 112 234  82
  76 161 133  34  60
  92 118 150 107 160
  83 240 190  65  45
>>> bigger = img.replicate(2)
>>> bigger.print()
 197 197  94  94  44  44 154 154 184 184
 197 197  94  94  44  44 154 154 184 184
 128 128 244 244 112 112 234 234  82  82
 128 128 244 244 112 112 234 234  82  82
  76  76 161 161 133 133  34  34  60  60
  76  76 161 161 133 133  34  34  60  60
  92  92 118 118 150 150 107 107 160 160
  92  92 118 118 150 150 107 107 160 160
  83  83 240 240 190 190  65  65  45  45
  83  83 240 240 190 190  65  65  45  45

Note

  • Works only for greyscale images.

  • The resulting image is “blocky”; apply Gaussian smoothing to reduce this.

References
  • Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.

Seealso

decimate

roi(bbox=None)[source]

Extract region of interest

Parameters

bbox (array_like(4)) – region as [umin, umax, vmin, vmax]

Returns

region of interest, optional bounding box

Return type

Image, list

Return the specified region of the image. If bbox is None the image is displayed using Matplotlib and the user can interactively select the region, returning the image region and the bounding box [umin, umax, vmin, vmax].

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> smile = img.roi([265, 342, 264, 286])
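
Called without a bounding box the selection is interactive, as described above; a minimal sketch of that form, which also returns the selected bounds:

>>> region, bbox = img.roi()   # click and drag in the displayed image; bbox is [umin, umax, vmin, vmax]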
samesize(image2, bias=0.5)[source]

Automatic image trimming

Parameters
  • image2 (Image) – image to match size with

  • bias (float, optional) – bias that controls what part of the image is cropped, defaults to 0.5

Returns

resized image

Return type

Image

Return a version of the image that has the same dimensions as image2. This is achieved by cropping (to match the aspect ratio) and scaling (to match the size).

bias controls which part of the image is cropped. bias = 0.5 is symmetric cropping, bias < 0.5 moves the crop window up or to the left, while bias > 0.5 moves the crop window down or to the right.

Example:

>>> from machinevisiontoolbox import Image
>>> foreground = Image.Read("greenscreen.png", dtype="float")
>>> foreground
Image: 1024 x 768 (float32), R:G:B [.../images/greenscreen.png]
>>> background = Image.Read("road.png", dtype="float")
>>> background
Image: 1280 x 851 (float32), R:G:B [.../images/road.png]
>>> background.samesize(foreground)
Image: 1024 x 768 (float32), R:G:B
References
  • Robotics, Vision & Control for Python, Section 11.4.1.1, P. Corke, Springer 2023.

Seealso

trim scale

scale(sfactor, sigma=None, interpolation=None)[source]

Scale an image

Parameters
  • sfactor (scalar) – scale factor

  • sigma (float, optional) – standard deviation of kernel for image smoothing, in pixels

  • interpolation (str, optional) – interpolation mode, see the table below, defaults to None

Raises
  • ValueError – bad interpolation string

  • ValueError – bad interpolation value

Returns

scaled image

Return type

Image instance

Rescale the image. If sfactor > 1 the image is enlarged.

If sfactor < 1 the image is made smaller and smoothing can be applied to reduce sampling artefacts. If sigma is None, a default of sigma = 1/(2 sfactor) is used; if sigma = 0 no smoothing is performed. The interpolation mode is one of:

interpolation   description
'cubic'         bicubic interpolation
'linear'        bilinear interpolation
'area'          resampling using pixel area relation

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> img.scale(2)
Image: 1354 x 1400 (uint8), R:G:B
>>> img.scale(0.5)
Image: 338 x 350 (uint8), R:G:B
References
  • Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.

Seealso

opencv.resize

rotate(angle, centre=None)[source]

Rotate an image

Parameters
  • angle (scalar) – rotation angle [radians]

  • centre (array_like(2)) – centre of rotation, defaults to centre of image

Returns

rotated image

Return type

Image

Rotate the image counter-clockwise by angle, given in radians.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> out = img.rotate(0.5)
>>> out.disp()
<matplotlib.image.AxesImage object at 0x7f2e59036d00>

Note

  • Rotation is defined with respect to a z-axis which is into the image, therefore counter-clockwise is a positive angle.

  • The pixels in the corners of the resulting image will be undefined.

rotate_spherical(R)[source]

Rotate a spherical image

Parameters

R (spatialmath.pose3d.SO3) – an SO(3) rotation matrix

Returns

rotated spherical image

Return type

Image

Rotates pixels in the input spherical image by the SO(3) rotation matrix.

A spherical image is represented by a rectangular array of pixels with a horizontal domain that spans azimuth angle \(\phi \in [0, 2\pi]\) and a vertical domain that spans colatitude angle \(\theta \in [0, \pi]\).
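
A minimal sketch, assuming a spherical (equirectangular) image file named 'spherical.png' and a rotation constructed with spatialmath:

>>> from machinevisiontoolbox import Image
>>> from spatialmath import SO3
>>> sphere = Image.Read('spherical.png')        # assumed equirectangular image
>>> out = sphere.rotate_spherical(SO3.Ry(0.5))  # rotate by 0.5 rad about the y-axis
>>> out.disp()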

Seealso

meshgrid uspan vspan scipy.interpolate.griddata

meshgrid(width=None, height=None, step=1)[source]

Coordinate arrays for image

Parameters
  • width (int, optional) – width of array in pixels, defaults to width of image

  • height (int, optional) – height of array in pixels, defaults to height of image

Returns

domain of image

Return type

ndarray(H,W), ndarray(H,W)

Create a pair of arrays U and V that describe the domain of the image. The element U(u,v) = u and V(u,v) = v. These matrices can be used for the evaluation of functions over the image such as interpolation and warping.

Invoking as a class method with self=None is a convenient way to access base.meshgrid.

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Zeros(3)
>>> U, V = img.meshgrid()
>>> U
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
>>> V
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2]])
>>> Image(U**2 + V**2).image
array([[0, 1, 4],
       [1, 2, 5],
       [4, 5, 8]])
>>> U, V = Image.meshgrid(None, 4, 4)
>>> U
array([[0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3]])
Seealso

meshgrid

warp(U, V, interp=None, domain=None)[source]

Image warping

Parameters
  • U (ndarray(Wo,Ho)) – u-coordinate array for output image

  • V (ndarray(Wo,Ho)) – v-coordinate array for output image

  • interp (str, optional) – interpolation mode, defaults to None

  • domain ((ndarray(H,W), ndarray(H,W)), optional) – domain of output image, defaults to None

Returns

warped image

Return type

Image

Compute an image by warping the input image. The output image is \(H_o \times W_o\) and output pixel (u,v) is interpolated from the input image coordinate (U[u,v], V[u,v]):

\[Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime = U_{u,v}, v^\prime = V_{u,v}\]
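
A minimal sketch of a pure translation warp using coordinate arrays from meshgrid; the 100 and 50 pixel offsets are arbitrary:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> U, V = img.meshgrid()            # coordinate arrays spanning the image
>>> out = img.warp(U - 100, V - 50)  # output (u,v) samples input (u-100, v-50)
>>> out.disp()                       # content appears shifted right and down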

Note

Uses OpenCV.

Seealso

interp2d meshgrid opencv.remap

interp2d(U, V, Ud=None, Vd=None, **kwargs)[source]

Image warping

Parameters
  • U (ndarray(Ho,Wo)) – u-coordinate array for output image

  • V (ndarray(Ho,Wo)) – v-coordinate array for output image

  • Ud (ndarray(H,W), optional) – u-coordinate array for domain of input image, defaults to None

  • Vd (ndarray(H,W), optional) – v-coordinate array for domain of input image, defaults to None

Returns

warped image

Return type

Image

Compute an image by warping the input image. The output image is \(H_o \times W_o\) and output pixel (u,v) is interpolated from the input image coordinate (U[v,u], V[v,u]).

\[Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime = U_{u,v}, v^\prime = V_{u,v}\]

The coordinates in U and V are with respect to the domain of the input image but can be overridden by specifying Ud and Vd.
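
A minimal sketch in the same style as warp, magnifying the top-left quadrant of the image by a factor of two; the image and mapping chosen are arbitrary:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> U, V = img.meshgrid()
>>> out = img.interp2d(U / 2, V / 2)   # output (u,v) samples input (u/2, v/2): a 2x zoom
>>> out.disp()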

Note

Uses SciPy

Seealso

warp meshgrid uspan vspan scipy.interpolate.griddata

warp_affine(M, inverse=False, size=None, bgcolor=None)[source]

Affine warp of image

Parameters
  • M (ndarray(2,3), SE2) – affine matrix

  • inverse (bool, optional) – warp with inverse of M, defaults to False

  • size (array_like(2), optional) – size of output image, defaults to size of input image

  • bgcolor (scalar, str, array_like, optional) – background color, defaults to None

Returns

warped image

Return type

Image

Apply an affine warp to the image. Pixels in the output image that correspond to pixels outside the input image are set to bgcolor.

\[\begin{split}Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } \begin{pmatrix} u^\prime \\ v^\prime \end{pmatrix} = \mat{M} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}\end{split}\]

Example:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> from spatialmath import SE2
>>> img = Image.Read('monalisa.png')
>>> M = np.diag([0.25, 0.25, 1]) * SE2(100, 200)  # scale and translate
>>> M
array([[  0.25,   0.  , 100.  ],
       [  0.  ,   0.25, 200.  ],
       [  0.  ,   0.  ,   1.  ]])
>>> out = img.warp_affine(M, bgcolor=np.nan)  # unmapped pixels are NaNs
>>> out.disp(badcolor="r")  # display warped image with NaNs as red
<matplotlib.image.AxesImage object at 0x7fedf4028cd0>

Note

Only the first two rows of M are used.

Seealso

warp opencv.warpAffine

warp_perspective(H, method='linear', inverse=False, tile=False, size=None)[source]

Perspective warp

Parameters
  • H (ndarray(3,3)) – homography

  • method (str, optional) – interpolation mode: 'linear' [default], 'nearest'

  • inverse (bool, optional) – use inverse of H, defaults to False

  • tile (bool, optional) – return minimal enclosing tile, defaults to False

  • size (array_like(2), optional) – size of output image, defaults to size of input image

Raises

TypeError – H must be a 3x3 NumPy array

Returns

warped image

Return type

Image

Applies a perspective warp to the input image.

\[\begin{split}Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime=\frac{\tilde{u}}{\tilde{w}}, v^\prime=\frac{\tilde{v}}{\tilde{w}}, \begin{pmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{pmatrix} = \mat{H} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}\end{split}\]

The resulting image may be smaller or larger than the input image. If tile is True the output image is the smallest rectangle that contains the warped result, and it is returned along with its position with respect to the origin of the input image and the coordinates of the four corners of the input image.
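
A minimal sketch, assuming a simple shear homography; the matrix values are arbitrary:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image.Read('monalisa.png')
>>> H = np.array([[1, 0.2, 0],
...               [0, 1.0, 0],
...               [0, 0.0, 1]])        # shear in the u-direction
>>> out = img.warp_perspective(H)
>>> out.disp()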

References
  • Robotics, Vision & Control for Python, Section 14.8, P. Corke, Springer 2023.

Seealso

warp opencv.warpPerspective

undistort(K, dist)[source]

Undistort image

Parameters
  • K (ndarray(3,3)) – camera intrinsics

  • dist (array_like(5)) – lens distortion parameters

Returns

undistorted image

Return type

Image

Remove lens distortion from image.

The distortion coefficients are \((k_1, k_2, p_1, p_2, k_3)\) where \(k_i\) are radial distortion coefficients and \(p_i\) are tangential distortion coefficients.

Example:

>>> from machinevisiontoolbox import Image, ImageCollection
>>> import numpy as np
>>> images = ImageCollection("calibration/*.jpg")
>>> K = np.array([[ 534.1, 0, 341.5], [ 0, 534.1, 232.9], [ 0, 0, 1]])
>>> distortion = np.array([ -0.293, 0.1077, 0.00131, -3.109e-05, 0.04348])
>>> out = images[12].undistort(K, distortion)
>>> out.disp()
<matplotlib.image.AxesImage object at 0x7f9358254c70>
Seealso

images2C opencv.undistort

view1d()[source]

Convert image to a column view

Returns

column view

Return type

ndarray(N,) or ndarray(N, np)

A greyscale image is converted to a 1D array in row-major (C) order, i.e. row 0, row 1, etc.

A color image is converted to a 2D array in row-major (C) order, with one row per pixel; each row holds the values of that pixel's planes.

Example:

>>> from machinevisiontoolbox import Image
>>> Image.Read('street.png').view1d().shape
(1089280,)
>>> Image.Read('monalisa.png').view1d().shape
(473900, 3)

Note

This creates a view of the original image, so operations on the column will affect the original image.
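
A minimal sketch of the view semantics described in the note; the number of pixels zeroed is arbitrary:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> v = img.view1d()
>>> v[:1000] = 0     # zero the first 1000 pixels, in row-major order
>>> img.disp()       # the change is visible in the original image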