Color operations

These methods perform image processing operations on grey-scale and color images.

class machinevisiontoolbox.ImageColor.ImageColorMixin[source]

Image processing color operations on the Image class

mono(opt='r601')[source]

Convert color image to monochrome

Parameters: opt (str, optional) – greyscale conversion mode, one of: ‘r601’ [default], ‘r709’, ‘value’ or ‘cv’
Returns: monochrome image
Return type: Image

Return a greyscale image of the same width and height as the color image. Various conversion options are available:

`opt`	definition
`'r601'`	ITU Rec. 601, Y’ = 0.299 R’ + 0.587 G’ + 0.114 B’
`'r709'`	ITU Rec. 709, Y’ = 0.2126 R’ + 0.7152 G’ + 0.0722 B’
`'value'`	V (value) component of HSV space
`'cv'`	OpenCV colorspace() RGB to gray conversion
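As a rough per-pixel illustration of the first three options, the sketch below applies the standard Rec. 601 and Rec. 709 luma weights and takes the HSV value as the largest component. The names `luma`, `hsv_value`, `REC601` and `REC709` are hypothetical and are not part of the toolbox; pixel values are assumed to be gamma-encoded and scaled to the range 0 to 1.

```python
# Per-pixel greyscale conversion, assuming R'G'B' values in the range 0 to 1.
# All names here are illustrative only, not toolbox API.
REC601 = (0.299, 0.587, 0.114)
REC709 = (0.2126, 0.7152, 0.0722)


def luma(rgb, weights=REC601):
    """Weighted sum of the R', G' and B' components."""
    return sum(w * c for w, c in zip(weights, rgb))


def hsv_value(rgb):
    """V component of HSV: the largest of the three components."""
    return max(rgb)


pixel = (1.0, 0.5, 0.0)          # an orange pixel
y601 = luma(pixel, REC601)       # Rec. 601 luma
y709 = luma(pixel, REC709)       # Rec. 709 luma
v = hsv_value(pixel)             # HSV value
```

Both sets of weights sum to one, so a white pixel maps to full-scale grey under either standard.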

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png')
>>> img
Image: 640 x 426 (uint8), R:G:B [.../images/flowers1.png]
>>> img.mono()
Image: 640 x 426 (uint8)

Note

For a monochrome image, a reference to the Image instance is returned.

References

Robotics, Vision & Control for Python, Section 10.2.7, P. Corke, Springer 2023.

Seealso

colorspace colorize

chromaticity(which='RG')[source]

Create chromaticity image

Parameters: which (str, optional) – string comprising single letter color plane names, defaults to ‘RG’
Returns: chromaticity image
Return type: Image instance

Convert a tristimulus image to a chromaticity image. For an RGB image and which='RG', the chromaticity coordinates are:

\[r = \frac{R}{R+G+B}, \, g = \frac{G}{R+G+B}\]
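The formula can be checked on individual pixels with a minimal sketch. The function name `chromaticity_rg` is hypothetical, and mapping black pixels to zero is an assumption made here to avoid division by zero; the toolbox may handle that case differently.

```python
def chromaticity_rg(r, g, b):
    """Return the (r, g) chromaticity of one tristimulus value."""
    total = r + g + b
    if total == 0:
        # Assumption: black pixels map to (0, 0) rather than NaN.
        return 0.0, 0.0
    return r / total, g / total


bright = chromaticity_rg(200, 100, 100)
dim = chromaticity_rg(20, 10, 10)   # same color at one tenth the brightness
```

Both pixels yield the same chromaticity, which shows that the coordinates are independent of overall brightness.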

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png')
>>> img.chromaticity()
Image: 640 x 426 (float32), r:g
>>> img.chromaticity('RB')
Image: 640 x 426 (float32), r:b

Note

The chromaticity color planes are the same as which but lower cased.

References

Robotics, Vision & Control for Python, Section 10.2.5, P. Corke, Springer 2023.

Seealso

tristim2cc

colorize(color=[1, 1, 1], colororder='RGB', alpha=False)[source]

Colorize a greyscale image

Parameters

color (string, array_like(3)) – base color
colororder (str, dict) – order of color channels of resulting image

Returns

color image

Return type

Image instance

The greyscale image is colorized by setting each output pixel to the product of color and the input pixel value.
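A minimal per-pixel sketch of that product follows. The helper `colorize_pixel` is hypothetical, and `color` is assumed to be given in the same channel order as the output.

```python
def colorize_pixel(grey, color=(1, 1, 1)):
    """Scale each component of the base color by the grey pixel value."""
    return tuple(grey * c for c in color)


row = [0, 128, 255]                                    # one row of a grey image
red_row = [colorize_pixel(p, (1, 0, 0)) for p in row]  # same row rendered in red
```

With the default color `(1, 1, 1)` every channel equals the input value, so the result is a grey-looking color image.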

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('shark1.png')
>>> img.colorize([1, 0, 0])  # red shark
Image: 500 x 500 (uint8), R:G:B
>>> img.colorize('blue')  # blue shark
Image: 500 x 500 (uint8), R:G:B

References

Robotics, Vision & Control for Python, Section 11.3, P. Corke, Springer 2023.

Seealso

mono

kmeans_color(k=None, centroids=None, seed=None)[source]

k-means color clustering

Training

Parameters

k (int, optional) – number of clusters, defaults to None
seed (int, optional) – random number seed, defaults to None

Returns

label image, centroids and residual

Return type

Image, ndarray(P,k), float

The pixels are grouped into k clusters based on their Euclidean distance from k cluster centroids. Clustering is iterative and the initial cluster centroids are random.

The method returns a label image, indicating the assigned cluster for each input pixel, the cluster centroids and a residual.
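The training step can be sketched with a plain implementation of Lloyd's algorithm. This is an illustration only: the toolbox refers to OpenCV's k-means, so details such as initialization and stopping criteria will differ. The function name `kmeans`, the fixed iteration count, and the definition of the residual as a sum of squared distances are all assumptions made for this sketch.

```python
import random


def kmeans(points, k, seed=None, iterations=20):
    """Cluster a list of equal-length tuples with Lloyd's algorithm.

    Returns (labels, centroids, residual). The residual is taken here to be
    the sum of squared distances from each point to its assigned centroid.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # random initial centroids

    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    def nearest(p):
        return min(range(k), key=lambda j: dist2(p, centroids[j]))

    for _ in range(iterations):
        labels = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                    # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

    labels = [nearest(p) for p in points]
    residual = sum(dist2(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, residual


# Two well-separated groups of two-component color values
pixels = [(0.10, 0.30), (0.12, 0.29), (0.14, 0.28),
          (0.60, 0.05), (0.62, 0.04), (0.64, 0.03)]
labels, centroids, residual = kmeans(pixels, k=2, seed=0)
```

The label numbers depend on the random initialization, so only the grouping is meaningful, not which cluster is numbered 0.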

Example:

>>> from machinevisiontoolbox import Image
>>> targets = Image.Read("tomato_124.png", dtype="float", gamma="sRGB")
>>> ab = targets.colorspace("L*a*b*").plane("a*:b*")
>>> targets_labels, targets_centroids, resid = ab.kmeans_color(k=3, seed=0)
>>> targets_centroids
array([[0.6767, 0.5036, 0.4309],
       [0.6211, 0.5084, 0.6149]], dtype=float32)

Classification

Parameters

centroids (ndarray(P,k)) – cluster centroids from training phase

Returns

label image

Return type

Image

Pixels in the input image are assigned the label of the closest centroid.
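The nearest-centroid rule can be sketched as follows. The function `classify` is hypothetical, and for readability the centroids are held as a list with one tuple per cluster rather than as the ndarray(P,k) described above, where each column is a centroid.

```python
def classify(pixels, centroids):
    """Label each pixel with the index of its closest centroid."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return [min(range(len(centroids)), key=lambda j: dist2(p, centroids[j]))
            for p in pixels]


# Centroids as they might come from a training phase (illustrative values)
centroids = [(0.12, 0.29), (0.62, 0.04)]
labels = classify([(0.11, 0.30), (0.60, 0.05), (0.20, 0.25)], centroids)
```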

Note

The colorspace of the images could be a chromaticity space, which allows objects to be classified while ignoring brightness variation.

References

Robotics, Vision & Control for Python, Section 12.1.1.2, P. Corke, Springer 2023.

Seealso: opencv.kmeans

colorspace(dst, src=None)[source]

Transform a color image between color representations

Parameters

dst (str) – destination color space (see below)
src (str, optional) – source color space (see below), defaults to colororder of image

Returns

color image in new colorspace

Return type

Image

Color space names (synonyms listed on the same line) are:

Color space name	Option string(s)
grey scale	‘grey’, ‘gray’
RGB (red/green/blue)	‘rgb’
BGR (blue/green/red)	‘bgr’
CIE XYZ	‘xyz’, ‘xyz_709’
YCrCb	‘ycrcb’
HSV (hue/sat/value)	‘hsv’
HLS (hue/lightness/sat)	‘hls’
CIE L*a*b*	‘lab’, ‘l*a*b*’
CIE L*u*v*	‘luv’, ‘l*u*v*’

Example:

>>> from machinevisiontoolbox import Image
>>> im = Image.Read('flowers1.png')
>>> im.colorspace('hsv')
Image: 640 x 426 (uint8), h:s:v
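To get a feel for what an RGB to HSV conversion does to individual pixels, the sketch below uses Python's standard `colorsys` module as a stand-in. This is not how the toolbox performs the conversion, and `colorsys` scales hue, saturation and value to the range 0 to 1, which may differ from the scaling of the image returned above.

```python
import colorsys

# Pure red, pure green and mid grey, with components in the range 0 to 1
rgb_pixels = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.5)]

# Each result is a (hue, saturation, value) tuple
hsv_pixels = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in rgb_pixels]
```

Red and green differ only in hue, while the grey pixel has zero saturation and a value equal to its grey level.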

Note

RGB images are assumed to be linear, or gamma decoded.

References

Robotics, Vision & Control for Python, Section 10.2.7, 10.4.1, P. Corke, Springer 2023.

Seealso

mono colorspace_convert

classmethod Overlay(im1, im2, colors='rc')[source]

Overlay two greyscale images in different colors

Parameters

im1 (Image) – first image
im2 (Image) – second image
colors (2-element string/list/tuple, optional) – colors for each image, defaults to ‘rc’

Raises

ValueError – images must be greyscale

Returns

overlaid images

Return type

Image

Two greyscale images are overlaid in different colors. Useful for visualizing disparity or optical flow.

Example:

>>> from machinevisiontoolbox import Image
>>> img1 = Image.Read('eiffel-1.png', mono=True)
>>> img2 = Image.Read('eiffel-2.png', mono=True)
>>> Image.Overlay(img1, img2)
Image: 1280 x 960 (uint8), R:G:B
>>> Image.Overlay(img1, img2, 'rg')
Image: 1280 x 960 (uint8), R:G:B
>>> Image.Overlay(img1, img2, ((1, 0, 0), (0, 1, 0)))
Image: 1280 x 960 (uint8), R:G:B

Note

The images can be different sizes; the output image size is the maximum of the dimensions of the input images. Smaller dimensions are zero padded. The top-left corners of both images are aligned.
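The padding and alignment rule can be sketched on small images held as lists of rows. The function `overlay` is hypothetical and hard-codes the default red and cyan color pair; it is an illustration of the size handling, not the toolbox implementation.

```python
def overlay(im1, im2):
    """Overlay two greyscale images (lists of rows) as red and cyan."""
    height = max(len(im1), len(im2))
    width = max(len(im1[0]), len(im2[0]))

    def pixel(im, v, u):
        # Zero outside the image; top-left corners are aligned at (0, 0)
        return im[v][u] if v < len(im) and u < len(im[0]) else 0

    # Red channel from im1; green and blue (cyan) from im2
    return [[(pixel(im1, v, u), pixel(im2, v, u), pixel(im2, v, u))
             for u in range(width)]
            for v in range(height)]


a = [[10, 20]]        # 2 wide, 1 high
b = [[1], [2]]        # 1 wide, 2 high
out = overlay(a, b)   # 2 wide, 2 high, with (R, G, B) tuples as pixels
```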

Seealso: anaglyph blend stshow

gamma_encode(gamma)[source]

Gamma encoding

Parameters: gamma (str, float) – gamma value
Returns: gamma encoded version of image
Return type: Image

Gamma encode the image. This takes a linear luminance image and converts it to a form suitable for display on a non-linear monitor. gamma is either the string ‘sRGB’ for IEC 61966-2-1:1999 or a float:

\[\mat{Y}_{u,v} = \mat{X}_{u,v}^\gamma\]

Example:

>>> import numpy as np
>>> from machinevisiontoolbox import Image
>>> img = Image(np.arange(8)[np.newaxis, :])  # create grey step wedge
>>> img.gamma_encode('sRGB').disp()
<matplotlib.image.AxesImage object at 0x7ff4cd1a7a90>

Note

gamma is the reciprocal of the value used for gamma decoding
Gamma encoding is typically performed in a camera with \(\gamma=0.45\).
For images with multiple planes, the gamma encoding is applied to all planes.
For floating point images, the pixels are assumed to be in the range 0 to 1.
For integer images, the pixels are assumed to be in the range 0 to the maximum value of their class. Pixels are converted first to double, processed, then converted back to the integer class.
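A per-pixel sketch of the encoding, including the integer handling described in the notes, might look like the following. The function names are hypothetical; the sRGB branch uses the standard piecewise transfer function from IEC 61966-2-1, and rounding to the nearest integer is an assumption about how values are converted back.

```python
def gamma_encode(x, gamma):
    """Encode a linear value x in the range 0 to 1.

    gamma is either the string 'sRGB' or a float exponent.
    """
    if gamma == 'sRGB':
        # Linear segment near black, power law elsewhere
        if x <= 0.0031308:
            return 12.92 * x
        return 1.055 * x ** (1 / 2.4) - 0.055
    return x ** gamma


def gamma_encode_uint8(p, gamma):
    """Encode one 8-bit pixel: scale to 0..1, encode, scale back and round."""
    return round(255 * gamma_encode(p / 255, gamma))


mid_grey = gamma_encode(0.5, 0.45)        # power-law encoding of a float pixel
encoded = gamma_encode_uint8(128, 0.45)   # same idea for an 8-bit pixel
```

Encoding with an exponent below one brightens mid-range values while leaving 0 and full scale unchanged.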

References

Robotics, Vision & Control for Python, Section 10.2.7, 10.3.6, P. Corke, Springer 2023.

Seealso

gamma_decode colorspace

gamma_decode(gamma)[source]

Gamma decoding

Parameters: gamma (str, float) – gamma value
Returns: gamma decoded version of image
Return type: Image instance

Gamma decode the image. This takes a gamma-encoded image, as typically obtained from a camera or image file, and converts it to a linear luminance image. gamma is either the string ‘sRGB’ for IEC 61966-2-1:1999 or a float:

\[\mat{Y}_{u,v} = \mat{X}_{u,v}^\gamma\]

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> linear = img.gamma_decode('sRGB')

Note

gamma is the reciprocal of the value used for gamma encoding
Gamma decoding should be applied to any color image prior to colorimetric operations.
Gamma decoding is typically performed in the display hardware with \(\gamma=2.2\).
For images with multiple planes, the gamma decoding is applied to all planes.
For floating point images, the pixels are assumed to be in the range 0 to 1.
For integer images, the pixels are assumed to be in the range 0 to the maximum value of their class. Pixels are converted first to double, processed, then converted back to the integer class.
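The decoding of a single pixel value in the range 0 to 1 can be sketched as below. The function name is hypothetical; the sRGB branch uses the standard inverse transfer function from IEC 61966-2-1, and the float branch is the plain power law given above.

```python
def gamma_decode(x, gamma):
    """Decode a gamma-encoded value x in the range 0 to 1.

    gamma is either the string 'sRGB' or a float exponent.
    """
    if gamma == 'sRGB':
        # Linear segment near black, power law elsewhere
        if x <= 0.04045:
            return x / 12.92
        return ((x + 0.055) / 1.055) ** 2.4
    return x ** gamma


linear_power = gamma_decode(0.5, 2.2)     # simple power-law decoding
linear_srgb = gamma_decode(0.5, 'sRGB')   # sRGB decoding of the same value
```

Both calls map an encoded mid-grey to a linear value of roughly 0.21 to 0.22, which is why a plain exponent of 2.2 is often used as an approximation to sRGB.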

References

Robotics, Vision & Control for Python, Section 10.2.7, 10.3.6, P. Corke, Springer 2023.

Seealso

gamma_encode colorspace