Color operations

These methods perform image processing operations on grey-scale and color images.

class machinevisiontoolbox.ImageColor.ImageColorMixin[source]

Image processing color operations on the Image class

mono(opt='r601')[source]

Convert color image to monochrome

Parameters

opt (str, optional) – greyscale conversion mode, one of: ‘r601’ [default], ‘r709’, ‘value’ or ‘cv’

Returns

monochrome image

Return type

Image

Return a greyscale image of the same width and height as the color image. Various conversion options are available:

opt        definition
'r601'     ITU Rec. 601, Y’ = 0.299 R’ + 0.587 G’ + 0.114 B’
'r709'     ITU Rec. 709, Y’ = 0.2126 R’ + 0.7152 G’ + 0.0722 B’
'value'    V (value) component of HSV space
'cv'       OpenCV colorspace() RGB to gray conversion

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png')
>>> img
Image: 640 x 426 (uint8), R:G:B [.../images/flowers1.png]
>>> img.mono()
Image: 640 x 426 (uint8)

Note

If the image is already monochrome, a reference to the same Image instance is returned.
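
The weighted-sum conversions can be sketched in plain NumPy. This is only an illustration using the standard Rec. 601 and Rec. 709 luma weights, assuming an (H, W, 3) array in R, G, B plane order; the names `luma_sketch` and `WEIGHTS` are hypothetical and are not part of the toolbox.

```python
import numpy as np

# Standard luma weights in R, G, B order; each set sums to 1
WEIGHTS = {
    "r601": np.array([0.299, 0.587, 0.114]),
    "r709": np.array([0.2126, 0.7152, 0.0722]),
}

def luma_sketch(rgb, opt="r601"):
    """Hypothetical helper: reduce an (H, W, 3) array to an (H, W) array."""
    rgb = np.asarray(rgb, dtype=float)
    if opt == "value":
        # V of HSV is the per-pixel maximum over the R, G, B planes
        return rgb.max(axis=2)
    # Weighted sum over the last (color plane) axis
    return rgb @ WEIGHTS[opt]

pixel = np.array([[[1.0, 0.5, 0.0]]])  # a single orange pixel
y601 = luma_sketch(pixel, "r601")      # 0.299*1.0 + 0.587*0.5
y709 = luma_sketch(pixel, "r709")      # 0.2126*1.0 + 0.7152*0.5
```

The two standards weight green differently, so the same pixel yields slightly different grey levels.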

References
  • Robotics, Vision & Control for Python, Section 10.2.7, P. Corke, Springer 2023.

Seealso

colorspace colorize

chromaticity(which='RG')[source]

Create chromaticity image

Parameters

which (str, optional) – string comprising single letter color plane names, defaults to ‘RG’

Returns

chromaticity image

Return type

Image instance

Convert a tristimulus image to a chromaticity image. For the case of an RGB image and which='RG'

\[r = \frac{R}{R+G+B}, \, g = \frac{G}{R+G+B}\]

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('flowers1.png')
>>> img.chromaticity()
Image: 640 x 426 (float32), r:g
>>> img.chromaticity('RB')
Image: 640 x 426 (float32), r:b

Note

The chromaticity color planes are the same as which but lower cased.
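
The normalization above can be sketched in NumPy to show why chromaticity is insensitive to brightness. This is a minimal illustration, assuming an (H, W, 3) array in R, G, B order; `chromaticity_sketch` is a hypothetical name, and the handling of black pixels is an assumption of this sketch rather than documented toolbox behaviour.

```python
import numpy as np

def chromaticity_sketch(rgb):
    """Hypothetical helper: return an (H, W, 2) array holding r and g."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=2, keepdims=True)        # R + G + B per pixel
    # Assumption: avoid division by zero for black pixels, which then map to 0
    total = np.where(total == 0, 1.0, total)
    return rgb[..., :2] / total                   # r = R/total, g = G/total

bright = np.array([[[200.0, 100.0, 100.0]]])
dim = bright / 4                                  # same color, a quarter of the brightness
cc_bright = chromaticity_sketch(bright)
cc_dim = chromaticity_sketch(dim)
```

Both pixels give the same (r, g) pair, since scaling R, G and B by a common factor cancels in the ratio.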

References
  • Robotics, Vision & Control for Python, Section 10.2.5, P. Corke, Springer 2023.

Seealso

tristim2cc

colorize(color=[1, 1, 1], colororder='RGB', alpha=False)[source]

Colorize a greyscale image

Parameters
  • color (string, array_like(3)) – base color

  • colororder (str, dict) – order of color channels of resulting image

Returns

color image

Return type

Image instance

The greyscale image is colorized by setting each output pixel to the product of color and the input pixel value.
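
The per-pixel product can be sketched with NumPy broadcasting. This is an illustration only, assuming a floating point greyscale array and a 3-element color; `colorize_sketch` is a hypothetical name, not the toolbox implementation.

```python
import numpy as np

def colorize_sketch(grey, color):
    """Hypothetical helper: scale a base color by each grey pixel value."""
    grey = np.asarray(grey, dtype=float)
    color = np.asarray(color, dtype=float)
    # (H, W, 1) * (3,) broadcasts to an (H, W, 3) color image
    return grey[..., np.newaxis] * color

grey = np.array([[0.0, 0.5], [1.0, 0.25]])
red = colorize_sketch(grey, [1, 0, 0])  # red plane equals grey, other planes are zero
```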

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('shark1.png')
>>> img.colorize([1, 0, 0])  # red shark
Image: 500 x 500 (uint8), R:G:B
>>> img.colorize('blue')  # blue shark
Image: 500 x 500 (uint8), R:G:B

References
  • Robotics, Vision & Control for Python, Section 11.3, P. Corke, Springer 2023.

Seealso

mono

kmeans_color(k=None, centroids=None, seed=None)[source]

k-means color clustering

Training

Parameters
  • k (int, optional) – number of clusters, defaults to None

  • seed (int, optional) – random number seed, defaults to None

Returns

label image, centroids and residual

Return type

Image, ndarray(P,k), float

The pixels are grouped into k clusters based on their Euclidean distance from k cluster centroids. Clustering is iterative and the initial cluster centroids are random.

The method returns a label image, indicating the assigned cluster for each input pixel, the cluster centroids and a residual.
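
The iterative procedure can be sketched as a plain Lloyd's iteration in NumPy. This is not the toolbox implementation (see the Seealso entry, which points at OpenCV); it is a minimal sketch that assumes the pixels are already flattened to an (N, P) array, draws the initial centroids from the samples, and defines the residual as the sum of squared distances. The name `kmeans_sketch` and the `iterations` parameter are hypothetical.

```python
import numpy as np

def kmeans_sketch(data, k, seed=None, iterations=20):
    """Hypothetical helper. data: (N, P). Returns labels (N,), centroids (P, k), residual."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initial centroids are k distinct samples chosen at random
    centroids = data[rng.choice(len(data), size=k, replace=False)]  # (k, P)
    for _ in range(iterations):
        # Assignment step: squared distance of every sample to every centroid, (N, k)
        dist2 = ((data[:, np.newaxis, :] - centroids) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned samples
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = data[labels == j].mean(axis=0)
    # Final assignment and residual (sum of squared distances to the assigned centroid)
    dist2 = ((data[:, np.newaxis, :] - centroids) ** 2).sum(axis=2)
    labels = dist2.argmin(axis=1)
    residual = dist2[np.arange(len(data)), labels].sum()
    return labels, centroids.T, residual

# Two well separated groups of 2-D samples
samples = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centroids, residual = kmeans_sketch(samples, k=2, seed=0)
```

The centroids are returned transposed so that their shape matches the ndarray(P,k) convention used above.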

Example:

>>> from machinevisiontoolbox import Image
>>> targets = Image.Read("tomato_124.png", dtype="float", gamma="sRGB")
>>> ab = targets.colorspace("L*a*b*").plane("a*:b*")
>>> targets_labels, targets_centroids, resid = ab.kmeans_color(k=3, seed=0)
>>> targets_centroids
array([[0.6767, 0.5036, 0.4309],
       [0.6211, 0.5084, 0.6149]], dtype=float32)

Classification

Parameters

centroids (ndarray(P,k)) – cluster centroids from training phase

Returns

label image

Return type

Image

Pixels in the input image are assigned the label of the closest centroid.
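
Nearest-centroid labelling can be sketched in NumPy as follows. This is an illustration, assuming an (H, W, P) pixel array and centroids stored as (P, k) to match the shape convention above; `classify_sketch` is a hypothetical name.

```python
import numpy as np

def classify_sketch(pixels, centroids):
    """Hypothetical helper. pixels: (H, W, P); centroids: (P, k). Returns (H, W) labels."""
    pixels = np.asarray(pixels, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Difference of every pixel to every centroid: (H, W, P, 1) - (P, k) -> (H, W, P, k)
    diff = pixels[..., :, np.newaxis] - centroids
    dist2 = (diff ** 2).sum(axis=2)          # squared Euclidean distance, (H, W, k)
    return dist2.argmin(axis=2)              # index of the closest centroid

# Two centroids in a 2-plane space: column 0 is (0, 0), column 1 is (1, 1)
centroids = np.array([[0.0, 1.0],
                      [0.0, 1.0]])
pixels = np.array([[[0.1, 0.2], [0.9, 0.8]]])   # a 1 x 2 image
labels = classify_sketch(pixels, centroids)
```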

Note

The colorspace of the images could be a chromaticity space, which allows objects to be classified while ignoring brightness variation.

References
  • Robotics, Vision & Control for Python, Section 12.1.1.2, P. Corke, Springer 2023.

Seealso

opencv.kmeans

colorspace(dst, src=None)[source]

Transform a color image between color representations

Parameters
  • dst (str) – destination color space (see below)

  • src (str, optional) – source color space (see below), defaults to colororder of image

Returns

color image in new colorspace

Return type

Image

Color space names (synonyms listed on the same line) are:

Color space name           Option string(s)
grey scale                 ‘grey’, ‘gray’
RGB (red/green/blue)       ‘rgb’
BGR (blue/green/red)       ‘bgr’
CIE XYZ                    ‘xyz’, ‘xyz_709’
YCrCb                      ‘ycrcb’
HSV (hue/sat/value)        ‘hsv’
HLS (hue/lightness/sat)    ‘hls’
CIE L*a*b*                 ‘lab’, ‘l*a*b*’
CIE L*u*v*                 ‘luv’, ‘l*u*v*’

Example:

>>> from machinevisiontoolbox import Image
>>> im = Image.Read('flowers1.png')
>>> im.colorspace('hsv')
Image: 640 x 426 (uint8), h:s:v

Note

RGB images are assumed to be linear, or gamma decoded.
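
What a color space conversion does to a single pixel can be illustrated with Python's standard-library `colorsys` module. This is independent of the toolbox: `colorsys` works on floats in the range 0 to 1, and the scaling it uses for hue may differ from that of the toolbox's output planes.

```python
import colorsys

# Pure red: hue 0, fully saturated, full value
red_hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)

# Pure green sits one third of the way around the hue circle
green_hsv = colorsys.rgb_to_hsv(0.0, 1.0, 0.0)

# Converting to HSV and back recovers the original tristimulus values
rgb = (0.2, 0.4, 0.6)
round_trip = colorsys.hsv_to_rgb(*colorsys.rgb_to_hsv(*rgb))
```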

References
  • Robotics, Vision & Control for Python, Section 10.2.7, 10.4.1, P. Corke, Springer 2023.

Seealso

mono colorspace_convert

classmethod Overlay(im1, im2, colors='rc')[source]

Overlay two greyscale images in different colors

Parameters
  • im1 (Image) – first image

  • im2 (Image) – second image

  • colors (2-element string/list/tuple, optional) – colors for each image, defaults to ‘rc’

Raises

ValueError – images must be greyscale

Returns

overlaid images

Return type

Image

Two greyscale images are overlaid in different colors. Useful for visualizing disparity or optical flow.

Example:

>>> from machinevisiontoolbox import Image
>>> img1 = Image.Read('eiffel-1.png', mono=True)
>>> img2 = Image.Read('eiffel-2.png', mono=True)
>>> Image.Overlay(img1, img2)
Image: 1280 x 960 (uint8), R:G:B
>>> Image.Overlay(img1, img2, 'rg')
Image: 1280 x 960 (uint8), R:G:B
>>> Image.Overlay(img1, img2, ((1, 0, 0), (0, 1, 0)))
Image: 1280 x 960 (uint8), R:G:B

Note

The images can be different sizes; the output image size is the maximum of the dimensions of the input images. The smaller dimensions are zero padded. The top-left corners of both images are aligned.
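
The padding and alignment rule can be sketched in NumPy. This is a rough illustration, assuming floating point greyscale arrays and that the two colorized images are simply summed; the name `overlay_sketch` and its default colors (red and cyan, mirroring 'rc') are hypothetical choices for this sketch.

```python
import numpy as np

def overlay_sketch(im1, im2, color1=(1, 0, 0), color2=(0, 1, 1)):
    """Hypothetical helper: overlay two (H, W) arrays as an RGB array."""
    h = max(im1.shape[0], im2.shape[0])
    w = max(im1.shape[1], im2.shape[1])
    out = np.zeros((h, w, 3))
    for im, color in ((im1, color1), (im2, color2)):
        padded = np.zeros((h, w))
        padded[: im.shape[0], : im.shape[1]] = im   # top-left aligned, zero padded
        out += padded[..., np.newaxis] * np.asarray(color, dtype=float)
    return out

wide = np.ones((2, 3))   # 2 rows, 3 columns
tall = np.ones((3, 2))   # 3 rows, 2 columns
both = overlay_sketch(wide, tall)
```

Where both images are bright the red and cyan contributions add to white, so misaligned structure shows up as colored fringes.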

Seealso

anaglyph blend stshow

gamma_encode(gamma)[source]

Gamma encoding

Parameters

gamma (str, float) – gamma value

Returns

gamma encoded version of image

Return type

Image

Gamma encode the image. This takes a linear luminance image and converts it to a form suitable for display on a non-linear monitor. gamma is either the string ‘sRGB’ for IEC 61966-2-1:1999 or a float:

\[\mat{Y}_{u,v} = \mat{X}_{u,v}^\gamma\]

Example:

>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image(np.arange(8)[np.newaxis, :])  # create grey step wedge
>>> img.gamma_encode('sRGB').disp()
<matplotlib.image.AxesImage object at 0x7ff4cd1a7a90>

Note

  • gamma is the reciprocal of the value used for gamma decoding

  • Gamma encoding is typically performed in a camera with \(\gamma=0.45\).

  • For images with multiple planes, the gamma encoding is applied to all planes.

  • For floating point images, the pixels are assumed to be in the range 0 to 1.

  • For integer images, the pixels are assumed to be in the range 0 to the maximum value of their class. Pixels are converted first to double, processed, then converted back to the integer class.
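
The two encoding modes and the integer handling described in the notes can be sketched in NumPy. This is an illustration, not the toolbox code: the piecewise sRGB curve uses the constants from IEC 61966-2-1, the rounding used when converting back to integers is an assumption, and the names `gamma_encode_sketch` and `encode_integer_sketch` are hypothetical.

```python
import numpy as np

def gamma_encode_sketch(x, gamma):
    """Hypothetical helper: encode linear values in the range 0 to 1."""
    x = np.asarray(x, dtype=float)
    if isinstance(gamma, str) and gamma == "sRGB":
        # Piecewise sRGB curve: linear segment near black, power law elsewhere
        return np.where(x <= 0.0031308,
                        12.92 * x,
                        1.055 * np.power(x, 1 / 2.4) - 0.055)
    return np.power(x, gamma)    # simple power law, Y = X ** gamma

def encode_integer_sketch(img, gamma):
    """Hypothetical helper: scale to 0..1, encode, scale back to the integer class."""
    maxval = np.iinfo(img.dtype).max
    y = gamma_encode_sketch(img / maxval, gamma)
    return np.round(y * maxval).astype(img.dtype)   # rounding is an assumption

quarter = gamma_encode_sketch(0.25, 0.5)            # square root of 0.25
ends = encode_integer_sketch(np.array([0, 255], dtype=np.uint8), 0.45)
```

Black and full-scale white are fixed points of both curves; only the mid-tones are redistributed.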

References
  • Robotics, Vision & Control for Python, Section 10.2.7, 10.3.6, P. Corke, Springer 2023.

Seealso

gamma_decode colorspace

gamma_decode(gamma)[source]

Gamma decoding

Parameters

gamma (str, float) – gamma value

Returns

gamma decoded version of image

Return type

Image instance

Gamma decode the image. This takes a gamma-encoded image, as typically obtained from a camera or image file, and converts it to a linear luminance image. gamma is either the string ‘sRGB’ for IEC 61966-2-1:1999 or a float:

\[\mat{Y}_{u,v} = \mat{X}_{u,v}^\gamma\]

Example:

>>> from machinevisiontoolbox import Image
>>> img = Image.Read('street.png')
>>> linear = img.gamma_decode('sRGB')

Note

  • gamma is the reciprocal of the value used for gamma encoding

  • Gamma decoding should be applied to any color image prior to colorimetric operations.

  • Gamma decoding is typically performed in the display hardware with \(\gamma=2.2\).

  • For images with multiple planes, the gamma decoding is applied to all planes.

  • For floating point images, the pixels are assumed to be in the range 0 to 1.

  • For integer images, the pixels are assumed to be in the range 0 to the maximum value of their class. Pixels are converted first to double, processed, then converted back to the integer class.
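
Decoding for floating point pixels can be sketched in NumPy. This is an illustration rather than the toolbox code: the piecewise sRGB inverse uses the constants from IEC 61966-2-1, and the name `gamma_decode_sketch` is hypothetical.

```python
import numpy as np

def gamma_decode_sketch(x, gamma):
    """Hypothetical helper: decode gamma-encoded values in the range 0 to 1."""
    x = np.asarray(x, dtype=float)
    if isinstance(gamma, str) and gamma == "sRGB":
        # Inverse of the piecewise sRGB curve
        return np.where(x <= 0.04045,
                        x / 12.92,
                        np.power((x + 0.055) / 1.055, 2.4))
    return np.power(x, gamma)    # simple power law, Y = X ** gamma

encoded = np.power(0.3, 0.45)                          # encode with gamma = 0.45
recovered = gamma_decode_sketch(encoded, 1 / 0.45)     # decode with the reciprocal
knee = gamma_decode_sketch(0.04045, "sRGB")            # end of the linear segment
```

Decoding with the reciprocal exponent undoes a power-law encoding, which is why the two gamma values are reciprocals of each other.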

References
  • Robotics, Vision & Control for Python, Section 10.2.7, 10.3.6, P. Corke, Springer 2023.

Seealso

gamma_encode colorspace