Image shape/size changing
These methods change the shape or size of an image, or combine multiple images.
- class machinevisiontoolbox.ImageReshape.ImageReshapeMixin[source]
- trim(left=0, right=0, top=0, bottom=0)[source]
Trim pixels from the edges of the image
- Parameters
left (int, optional) – number of pixels to trim from left side of image, defaults to 0
right (int, optional) – number of pixels to trim from right side of image, defaults to 0
top (int, optional) – number of pixels to trim from top side of image, defaults to 0
bottom (int, optional) – number of pixels to trim from bottom side of image, defaults to 0
- Returns
trimmed image
- Return type
Image
Trim pixels from the edges of the image.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png')
>>> img
Image: 640 x 426 (uint8), R:G:B [.../images/flowers1.png]
>>> img.trim(left=100, bottom=100)
Image: 540 x 326 (uint8), R:G:B
- pad(left=0, right=0, top=0, bottom=0, value=0)[source]
Pad the edges of the image
- Parameters
left (int, optional) – number of pixels to pad on left side of image, defaults to 0
right (int, optional) – number of pixels to pad on right side of image, defaults to 0
top (int, optional) – number of pixels to pad on top side of image, defaults to 0
bottom (int, optional) – number of pixels to pad on bottom side of image, defaults to 0
value (scalar, str, array_like) – value of pixels to pad with
- Returns
padded image
- Return type
Image
Pad the edges of the image with pixels equal to value.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png', dtype='float')
>>> img
Image: 640 x 426 (float32), R:G:B [.../images/flowers1.png]
>>> img.pad(left=10, bottom=10, top=10, right=10, value='r')
Image: 660 x 446 (float32), R:G:B
- classmethod Hstack(images, sep=1, bgcolor=None, return_offsets=False)[source]
Horizontal concatenation of images
- Parameters
images (iterable of Image) – images to concatenate horizontally
sep (int, optional) – separation between images, defaults to 1
bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black
return_offsets (bool, optional) – additionally return the horizontal coordinates of each input image within the output image, defaults to False
- Raises
ValueError – all images must have the same dtype
ValueError – all images must have the same color order
- Returns
horizontally stacked images
- Return type
Image
Create a new image by stacking the input images horizontally, with a vertical separator line of width sep and color bgcolor.
The horizontal coordinate of the first column of each image, in the composite output image, can be optionally returned if return_offsets is True.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> img
Image: 1280 x 851 (uint8) [.../images/street.png]
>>> Image.Hstack((img, img, img))
Image: 3842 x 851 (uint8)
>>> Image.Hstack((img, img, img), return_offsets=True)
(Image: 3842 x 851 (uint8), [0, 1281, 2562])
- classmethod Vstack(images, sep=1, bgcolor=None, return_offsets=False)[source]
Vertical concatenation of images
- Parameters
images (iterable of Image) – images to concatenate vertically
sep (int, optional) – separation between images, defaults to 1
bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black
return_offsets (bool, optional) – additionally return the vertical coordinates of each input image within the output image, defaults to False
- Raises
ValueError – all images must have the same dtype
ValueError – all images must have the same color order
- Returns
vertically stacked images
- Return type
Image
Create a new image by stacking the input images vertically, with a horizontal separator line of width sep and color bgcolor.
The vertical coordinate of the first row of each image, in the composite output image, can be optionally returned if return_offsets is True.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> img
Image: 1280 x 851 (uint8) [.../images/street.png]
>>> Image.Vstack((img, img, img))
Image: 1280 x 2555 (uint8)
>>> Image.Vstack((img, img, img), return_offsets=True)
(Image: 1280 x 2555 (uint8), [0, 852, 1704])
- classmethod Tile(tiles, columns=4, sep=2, bgcolor=None)[source]
Tile images into a grid
- Parameters
tiles (iterable of Image) – images to tile
columns (int, optional) – number of columns in the grid, defaults to 4
sep (int, optional) – separation between images, defaults to 2
bgcolor (scalar, string, array_like, optional) – color of background, seen in the separation between images, defaults to black
- Raises
ValueError – all images must have the same size
ValueError – all images must have the same dtype
- Returns
grid of images
- Return type
Image instance
Construct a new image by tiling the input images into a grid.
Example:
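A minimal, unverified sketch of typical usage (output not shown): four copies of the same image tiled into a two-column grid. All tiles must have the same size and dtype.
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> grid = Image.Tile((img, img, img, img), columns=2)
>>> grid.disp()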
- decimate(m=2, sigma=None)[source]
Decimate an image
- Parameters
m (int) – decimation factor
sigma (float, optional) – standard deviation for Gaussian kernel smoothing, defaults to None
- Raises
ValueError – decimation factor m must be an integer
- Returns
decimated image
- Return type
Image
Return a decimated version of the image whose size is reduced by subsampling every m (an integer) pixels in both dimensions.
The image is smoothed with a Gaussian kernel with standard deviation sigma. If sigma is None then a value of m/2 is used; if sigma is zero then no smoothing is performed.
Note
If the image has multiple planes, each plane is decimated.
Smoothing is used to eliminate aliasing artifacts and the standard deviation should be chosen as a function of the maximum spatial frequency in the image.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Random(6)
>>> img.print()
253  90  99 122  78 242
 67 249 187 182 137  67
  8 156 206   4  64 168
138  48  75 215  52 248
 17 182 107 250  36  69
221  41 159 105 196 201
>>> img.decimate(2, sigma=0).print()
253  99  78
  8 206  64
 17 107  36
- replicate(n=1)[source]
Replicate image pixels
- Parameters
n (int, optional) – replication factor, defaults to 1
- Returns
image with replicated pixels
- Return type
Image
Create an image where each input pixel becomes an \(n \times n\) patch of pixel values. This is a simple way of upscaling an image.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Random(5)
>>> img.print()
197  94  44 154 184
128 244 112 234  82
 76 161 133  34  60
 92 118 150 107 160
 83 240 190  65  45
>>> bigger = img.replicate(2)
>>> bigger.print()
197 197  94  94  44  44 154 154 184 184
197 197  94  94  44  44 154 154 184 184
128 128 244 244 112 112 234 234  82  82
128 128 244 244 112 112 234 234  82  82
 76  76 161 161 133 133  34  34  60  60
 76  76 161 161 133 133  34  34  60  60
 92  92 118 118 150 150 107 107 160 160
 92  92 118 118 150 150 107 107 160 160
 83  83 240 240 190 190  65  65  45  45
 83  83 240 240 190 190  65  65  45  45
Note
Works only for greyscale images.
The resulting image is “blocky”; apply Gaussian smoothing to reduce this, as sketched below.
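A hedged illustration of that note (it assumes the toolbox's Gaussian smoothing method smooth with a sigma argument, which is not documented in this section):
>>> from machinevisiontoolbox import Image
>>> img = Image.Random(5)
>>> softened = img.replicate(4).smooth(sigma=2)  # smooth() assumed available; reduces blockiness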
- References
Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.
- roi(bbox=None)[source]
Extract region of interest
- Parameters
bbox (array_like(4)) – region as [umin, umax, vmin, vmax]
- Returns
region of interest, optional bounding box
- Return type
Image
, list
Return the specified region of the image. If bbox is None the image is displayed using Matplotlib and the user can interactively select the region, returning the image region and the bounding box [umin, umax, vmin, vmax].
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> smile = img.roi([265, 342, 264, 286])
- samesize(image2, bias=0.5)[source]
Automatic image trimming
- Parameters
image2 (Image) – image to match size with
bias (float, optional) – bias that controls what part of the image is cropped, defaults to 0.5
- Returns
resized image
- Return type
Image
Return a version of the image that has the same dimensions as image2. This is achieved by cropping (to match the aspect ratio) and scaling (to match the size).
bias controls which part of the image is cropped: bias = 0.5 is symmetric cropping, bias < 0.5 moves the crop window up or to the left, while bias > 0.5 moves the crop window down or to the right.
Example:
>>> from machinevisiontoolbox import Image
>>> foreground = Image.Read("greenscreen.png", dtype="float")
>>> foreground
Image: 1024 x 768 (float32), R:G:B [.../images/greenscreen.png]
>>> background = Image.Read("road.png", dtype="float")
>>> background
Image: 1280 x 851 (float32), R:G:B [.../images/road.png]
>>> background.samesize(foreground)
Image: 1024 x 768 (float32), R:G:B
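A further hedged sketch (output not shown) of the bias parameter; the output dimensions are unchanged, only the cropped region moves:
>>> background.samesize(foreground, bias=0.2)  # crop window shifted up/left of centre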
- scale(sfactor, sigma=None, interpolation=None)[source]
Scale an image
- Parameters
sfactor (scalar) – scale factor
sigma (float) – standard deviation of kernel for image smoothing, in pixels
- Raises
ValueError – bad interpolation string
ValueError – bad interpolation value
- Returns
scaled image
- Return type
Image instance
Rescale the image. If sfactor > 1 the image is enlarged. If sfactor < 1 the image is made smaller and smoothing can be applied to reduce sampling artefacts. If sigma is None a default of 1/(2 sfactor) is used; if sigma=0 no smoothing is performed.
interpolation is one of:
'cubic' – bicubic interpolation
'linear' – bilinear interpolation
'area' – resampling using pixel area relation
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> img.scale(2)
Image: 1354 x 1400 (uint8), R:G:B
>>> img.scale(0.5)
Image: 338 x 350 (uint8), R:G:B
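A further hedged sketch (output not shown) selecting an explicit interpolation mode from the list above:
>>> img.scale(3, interpolation='cubic')  # enlarge using bicubic interpolation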
- References
Robotics, Vision & Control for Python, Section 11.7.2, P. Corke, Springer 2023.
- rotate(angle, centre=None)[source]
Rotate an image
- Parameters
angle (scalar) – rotation angle [radians]
centre (array_like(2)) – centre of rotation, defaults to centre of image
- Returns
rotated image
- Return type
Image
Rotate the image counter-clockwise by angle in radians.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> out = img.rotate(0.5)
>>> out.disp()
<matplotlib.image.AxesImage object at 0x7f2e59036d00>
Note
Rotation is defined with respect to a z-axis which is into the image, therefore counter-clockwise is a positive angle.
The pixels in the corners of the resulting image will be undefined.
- rotate_spherical(R)[source]
Rotate a spherical image
- Parameters
R (spatialmath.pose3d.SO3) – an SO(3) rotation matrix
- Returns
rotated spherical image
- Return type
Image
Rotates pixels in the input spherical image by the SO(3) rotation matrix.
A spherical image is represented by a rectangular array of pixels with a horizontal domain that spans azimuth angle \(\phi \in [0, 2\pi]\) and a vertical domain that spans colatitude angle \(\theta \in [0, \pi]\).
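A minimal, unverified sketch of typical usage (output not shown); the spherical (equirectangular) image file name used here is hypothetical:
>>> from machinevisiontoolbox import Image
>>> from spatialmath import SO3
>>> sphere = Image.Read('spherical.png')  # hypothetical equirectangular image
>>> out = sphere.rotate_spherical(SO3.Rx(0.5))  # rotate 0.5 rad about the camera x-axis
>>> out.disp()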
- Seealso
meshgrid
uspan
vspan
scipy.interpolate.griddata
- meshgrid(width=None, height=None, step=1)[source]
Coordinate arrays for image
- Parameters
width (int, optional) – width of array in pixels, defaults to width of image
height (int, optional) – height of array in pixels, defaults to height of image
- Returns
domain of image
- Return type
ndarray(H,W), ndarray(H,W)
Create a pair of arrays U and V that describe the domain of the image. The elements are U(u,v) = u and V(u,v) = v. These matrices can be used for the evaluation of functions over the image such as interpolation and warping.
Invoking this as a class method with self=None is a convenient way to access base.meshgrid.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Zeros(3)
>>> U, V = img.meshgrid()
>>> U
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
>>> V
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2]])
>>> Image(U**2 + V**2).image
array([[0, 1, 4],
       [1, 2, 5],
       [4, 5, 8]])
>>> U, V = Image.meshgrid(None, 4, 4)
>>> U
array([[0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3]])
- warp(U, V, interp=None, domain=None)[source]
Image warping
- Parameters
U (ndarray(Wo,Ho)) – u-coordinate array for output image
V (ndarray(Wo,Ho)) – v-coordinate array for output image
interp (str, optional) – interpolation mode, defaults to None
domain ((ndarray(H,W), ndarray(H,W)), optional) – domain of output image, defaults to None
- Returns
warped image
- Return type
Image
Compute an image by warping the input image. The output image is \(H_o \times W_o\) and output pixel (u,v) is interpolated from the input image coordinate (U[u,v], V[u,v]):
\[Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime = U_{u,v}, v^\prime = V_{u,v}\]
Note
Uses OpenCV.
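A minimal, unverified sketch (output not shown): coordinate arrays from meshgrid are offset to shift the image content; pixels that map outside the input image are undefined.
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> U, V = img.meshgrid()  # coordinate arrays over the image domain
>>> out = img.warp(U + 20, V + 20)  # each output pixel sampled 20 pixels down and right in the input
>>> out.disp()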
- interp2d(U, V, Ud=None, Vd=None, **kwargs)[source]
Image warping
- Parameters
U (ndarray(Ho,Wo)) – u-coordinate array for output image
V (ndarray(Ho,Wo)) – v-coordinate array for output image
Ud (ndarray(H,W), optional) – u-coordinate array for domain of input image, defaults to None
Vd (ndarray(H,W), optional) – v-coordinate array for domain of input image, defaults to None
- Returns
warped image
- Return type
Image
Compute an image by warping the input image. The output image is \(H_o \times W_o\) and output pixel (u,v) is interpolated from the input image coordinate (U[v,u], V[v,u]).
\[Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime = U_{u,v}, v^\prime = V_{u,v}\]
The coordinates in U and V are with respect to the domain of the input image, but can be overridden by specifying Ud and Vd.
Note
Uses SciPy.
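A minimal, unverified sketch (output not shown): the image is resampled onto a half-resolution grid built with NumPy; the width and height attributes used here are assumed Image properties.
>>> import numpy as np
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> U, V = np.meshgrid(np.arange(0, img.width, 2), np.arange(0, img.height, 2))
>>> small = img.interp2d(U, V)  # output pixels sampled at the coordinates in U, V
>>> small.disp()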
- Seealso
warp
meshgrid
uspan
vspan
scipy.interpolate.griddata
- warp_affine(M, inverse=False, size=None, bgcolor=None)[source]
Affine warp of image
- Parameters
M (ndarray(2,3), SE2) – affine matrix
inverse (bool, optional) – warp with inverse of M, defaults to False
size (array_like(2), optional) – size of output image, defaults to size of input image
bgcolor (scalar, str, array_like, optional) – background color, defaults to None
- Returns
warped image
- Return type
Image
Apply an affine warp to the image. Pixels in the output image that correspond to pixels outside the input image are set to bgcolor.
\[\begin{split}Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } \begin{pmatrix} u^\prime \\ v^\prime \end{pmatrix} = \mat{M} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}\end{split}\]
Example:
>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> from spatialmath import SE2
>>> img = Image.Read('monalisa.png')
>>> M = np.diag([0.25, 0.25, 1]) * SE2(100, 200)  # scale and translate
>>> M
array([[  0.25,   0.  , 100.  ],
       [  0.  ,   0.25, 200.  ],
       [  0.  ,   0.  ,   1.  ]])
>>> out = img.warp_affine(M, bgcolor=np.nan)  # unmapped pixels are NaNs
>>> out.disp(badcolor="r")  # display warped image with NaNs as red
<matplotlib.image.AxesImage object at 0x7fedf4028cd0>
Note
Only the first two rows of M are used.
- warp_perspective(H, method='linear', inverse=False, tile=False, size=None)[source]
Perspective warp
- Parameters
H (ndarray(3,3)) – homography
method (str, optional) – interpolation mode: ‘linear’ [default], ‘nearest’
inverse (bool, optional) – use inverse of H, defaults to False
tile (bool, optional) – return minimal enclosing tile, defaults to False
size (array_like(2), optional) – size of output image, defaults to size of input image
- Raises
TypeError – H must be a 3x3 NumPy array
- Returns
warped image
- Return type
Image
Applies a perspective warp to the input image.
\[\begin{split}Y_{u,v} = X_{u^\prime, v^\prime} \mbox{, where } u^\prime=\frac{\tilde{u}}{\tilde{w}}, v^\prime=\frac{\tilde{v}}{\tilde{w}}, \begin{pmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{pmatrix} = \mat{H} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}\end{split}\]
The resulting image may be smaller or larger than the input image. If tile is True the result is the smallest rectangle that contains the warped result, together with its position with respect to the origin of the input image and the coordinates of the four corners of the input image.
- References
Robotics, Vision & Control for Python, Section 14.8, P. Corke, Springer 2023.
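A minimal, unverified sketch (output not shown); the homography values are illustrative only and add a mild projective distortion:
>>> import numpy as np
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('monalisa.png')
>>> H = np.array([[1, 0, 0], [0, 1, 0], [1e-4, 0, 1]])  # small projective term
>>> out = img.warp_perspective(H)
>>> out.disp()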
- undistort(K, dist)[source]
Undistort image
- Parameters
K (ndarray(3,3)) – camera intrinsics
dist (array_like(5)) – lens distortion parameters
- Returns
undistorted image
- Return type
Image
Remove lens distortion from image.
The distortion coefficients are \((k_1, k_2, p_1, p_2, k_3)\) where \(k_i\) are radial distortion coefficients and \(p_i\) are tangential distortion coefficients.
Example:
>>> from machinevisiontoolbox import Image, ImageCollection
>>> import numpy as np
>>> images = ImageCollection("calibration/*.jpg")
>>> K = np.array([[534.1, 0, 341.5], [0, 534.1, 232.9], [0, 0, 1]])
>>> distortion = np.array([-0.293, 0.1077, 0.00131, -3.109e-05, 0.04348])
>>> out = images[12].undistort(K, distortion)
>>> out.disp()
<matplotlib.image.AxesImage object at 0x7f9358254c70>
- Seealso
images2C
opencv.undistort
- view1d()[source]
Convert image to a column view
- Returns
column view
- Return type
ndarray(N,) or ndarray(N, np)
A greyscale image is converted to a 1D array in row-major (C) order, i.e. row 0, then row 1, etc.
A color image is converted to a 2D array in row-major (C) order, with one row per pixel; each row contains the values of that pixel's planes.
Example:
>>> from machinevisiontoolbox import Image
>>> Image.Read('street.png').view1d().shape
(1089280,)
>>> Image.Read('monalisa.png').view1d().shape
(473900, 3)
Note
This creates a view of the original image, so operations on the column will affect the original image.