Region features
These methods extract features such as homogeneous regions, text and fiducials from the image.
- class machinevisiontoolbox.ImageRegionFeatures.ImageRegionFeaturesMixin
- MSER(**kwargs)
Find MSER features in image
- Parameters
kwargs – arguments passed to
opencv.MSER_create
- Returns
set of MSER features
- Return type
MSERFeature
Find all the maximally stable extremal regions in the image and return an object that represents the MSERs found. The object behaves like a list and can be indexed, sliced and used as an iterator in for loops and comprehensions.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read("castle.png")
>>> mser = img.MSER()
>>> len(mser)  # number of features
899
>>> mser[:5].bbox
array([[   1,    4,  145,   95],
       [   1,  184,  182,  274],
       [1243,  179, 1279,  258],
       [1243,  179, 1279,  258],
       [1242,  178, 1279,  258]], dtype=int32)
- References
Robotics, Vision & Control for Python, Section 12.1.1.2, P. Corke, Springer 2023.
- ocr(minconf=50, plot=False)
Optical character recognition
- Parameters
minconf (int, optional) – minimum confidence value for text to be returned or plotted (percentage), defaults to 50
plot (bool, optional) – overlay detected text on the current plot, assumed to be the image, defaults to False
- Returns
detected strings and metadata
- Return type
list of
OCRWord
Each recognized text string is described by an
OCRWord
instance that contains the string, confidence and bounding box within the image.
Warning
pytesseract must be installed.
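The effect of minconf can be sketched without Tesseract installed. The dictionary below is hand-made, hypothetical data laid out as a dict of parallel lists, the form that OCRWord is documented to accept. The key names are assumed to follow pytesseract's image_to_data output, and filter_words is an illustrative helper, not part of the toolbox. Whether the toolbox compares with >= or > is not stated here; the sketch assumes >=.

```python
# Hypothetical OCR result: one entry per detected word, stored as parallel lists.
ocr = {
    "text":   ["STOP", "", "Main", "St"],
    "conf":   [96, -1, 71, 38],
    "left":   [40, 0, 120, 210],
    "top":    [30, 0, 200, 202],
    "width":  [180, 0, 80, 35],
    "height": [60, 0, 28, 26],
}


def filter_words(ocr, minconf=50):
    """Return (text, confidence, index) for non-blank words with confidence >= minconf."""
    words = []
    for i, (text, conf) in enumerate(zip(ocr["text"], ocr["conf"])):
        if text.strip() and float(conf) >= minconf:
            words.append((text, float(conf), i))
    return words


words = filter_words(ocr, minconf=50)
for text, conf, i in words:
    print(f"{text!r}: confidence {conf:.0f}%, entry {i}")
```

Lowering minconf admits less certain words, at the cost of more false detections.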
- References
Robotics, Vision & Control for Python, Section 12.4.1, P. Corke, Springer 2023.
- fiducial(dict='4x4_1000', K=None, side=None)
Find fiducial markers in image
- Parameters
dict (str, optional) – marker type, defaults to “4x4_1000”
K (ndarray(3,3), optional) – camera intrinsics, defaults to None
side (float, optional) – side length of the marker, defaults to None
- Returns
markers found in image
- Return type
list of
Fiducial
instances
Find ArUco or AprilTag markers in the scene and return a list of
Fiducial
objects, one per marker. If camera intrinsics are provided then also compute the marker pose with respect to the camera.
dict
specifies the marker family or dictionary and describes the number of bits in the tag and the number of usable unique tags.

========  ========  ===========  =====================
dict      tag type  marker size  number of unique tags
========  ========  ===========  =====================
4x4_50    ArUco     4x4          50
4x4_100   ArUco     4x4          100
4x4_250   ArUco     4x4          250
4x4_1000  ArUco     4x4          1000
5x5_50    ArUco     5x5          50
5x5_100   ArUco     5x5          100
5x5_250   ArUco     5x5          250
5x5_1000  ArUco     5x5          1000
6x6_50    ArUco     6x6          50
6x6_100   ArUco     6x6          100
6x6_250   ArUco     6x6          250
6x6_1000  ArUco     6x6          1000
7x7_50    ArUco     7x7          50
7x7_100   ArUco     7x7          100
7x7_250   ArUco     7x7          250
7x7_1000  ArUco     7x7          1000
original  ArUco     ?            ?
16h5      AprilTag  4x4          30
25h9      AprilTag  5x5          35
36h10     AprilTag  6x6          ?
36h11     AprilTag  6x6          587
========  ========  ===========  =====================
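The family names encode most of the information listed for each dictionary, which the following sketch makes explicit. describe_family is a hypothetical helper, not part of the toolbox. It assumes ArUco names have the form ROWSxCOLS_COUNT and AprilTag names the form BITShDISTANCE, where BITS is a perfect square. The number of unique AprilTags cannot be derived from the name, so it is returned as None, and the "original" dictionary is not handled.

```python
import math
import re


def describe_family(name):
    """Split a marker family name into (tag type, marker size, number of unique tags)."""
    m = re.fullmatch(r"(\d+)x(\d+)_(\d+)", name)
    if m:
        rows, cols, count = (int(g) for g in m.groups())
        return "ArUco", (rows, cols), count

    m = re.fullmatch(r"(\d+)h(\d+)", name)
    if m:
        bits = int(m.group(1))
        side = math.isqrt(bits)
        if side * side != bits:
            raise ValueError(f"{name}: bit count is not a perfect square")
        # the tag count is not encoded in an AprilTag family name
        return "AprilTag", (side, side), None

    raise ValueError(f"unrecognized marker family: {name}")


print(describe_family("4x4_1000"))
print(describe_family("36h11"))
```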
Note
side
is the dimension of the square that contains the small white squares inside the black background.
- References
Robotics, Vision & Control for Python, Section 13.6.1, P. Corke, Springer 2023.
Region feature classes
- class machinevisiontoolbox.ImageRegionFeatures.MSERFeature(image=None, **kwargs)
Find MSERs
- Parameters
image (
Image
) – input image
kwargs – parameters passed to
opencv.MSER_create
Find all the maximally stable extremal regions in the image and return an object that represents the MSERs found. This class behaves like a list and each MSER is an element of the list.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read('shark2.png')
>>> msers = img.MSER()
>>> len(msers)
2
>>> msers[0]
MSER features, 2 regions
>>> msers.bbox
array([[299, 300, 445, 408],
       [ 99, 100, 245, 208]], dtype=int32)
- References
J. Matas, O. Chum, M. Urban, and T. Pajdla. “Robust wide baseline stereo from maximally stable extremal regions.” Proc. of British Machine Vision Conference, pages 384-396, 2002.
Robotics, Vision & Control for Python, Section 12.1.2.2, P. Corke, Springer 2023.
- __len__()
Number of MSER features
- Returns
number of features
- Return type
int
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read("castle.png")
>>> mser = img.MSER()
>>> len(mser)  # number of features
899
- __getitem__(i)
Get MSERs from MSER feature object
- Parameters
i (int or slice) – index
- Raises
IndexError – index out of range
- Returns
subset of MSER features
- Return type
MSERFeature
instance
This method allows an
MSERFeature
object to be indexed, sliced or iterated.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read("castle.png")
>>> mser = img.MSER()
>>> len(mser)  # number of features
899
>>> mser[:5]  # first 5 MSER features
MSER features, 5 regions
>>> mser[::50]  # every 50th MSER feature
MSER features, 18 regions
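The list-like behaviour described above can be illustrated with a small stand-alone container. RegionList is a hypothetical class written for this sketch, not the toolbox's implementation; it only shows how __len__ and __getitem__ together give indexing, slicing and iteration, with every access returning a container of the same type.

```python
class RegionList:
    """Hypothetical list-like container; indexing and slicing return a RegionList."""

    def __init__(self, regions):
        self._regions = list(regions)

    def __len__(self):
        return len(self._regions)

    def __getitem__(self, i):
        if isinstance(i, slice):
            return RegionList(self._regions[i])
        # an out-of-range integer raises IndexError, which also ends iteration
        return RegionList([self._regions[i]])

    def __str__(self):
        return f"MSER features, {len(self)} regions"


feats = RegionList(range(899))  # 899 placeholder regions
print(feats[:5])                # first 5
print(feats[::50])              # every 50th: indices 0, 50, ..., 850
print(sum(1 for _ in feats))    # iteration falls back to __getitem__
```

Because iteration relies on IndexError from __getitem__, no separate __iter__ method is needed in this sketch.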
- Seealso
len
- __str__()
String representation of MSER
- Returns
Brief readable description of MSER
- Return type
str
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read("castle.png")
>>> msers = img.MSER()
>>> str(msers)
'MSER features, 899 regions'
>>> str(msers[0])
'MSER features, 2 regions'
- property points
Points belonging to MSERs
- Returns
Coordinates of points in (u,v) format that belong to MSER
- Return type
ndarray(2,N), list of ndarray(2,N)
If the object contains just one region the result is an array, otherwise it is a list of arrays (with different numbers of columns).
Example:
>>> from machinevisiontoolbox import Image
>>> import numpy as np
>>> img = Image.Read("castle.png")
>>> msers = img.MSER()
>>> np.set_printoptions(threshold=10)  # abbreviate long arrays
>>> msers[0].points
array([[ 9, 10, 11, ...,  8,  9, 10],
       [ 5,  5,  5, ...,  5,  4,  4]], dtype=int32)
>>> msers[2:4].points
[array([[1249, 1249, 1249, ..., 1245, 1251, 1246],
       [ 221,  220,  222, ...,  232,  181,  242]], dtype=int32), array([[1249, 1249, 1249, ..., 1250, 1244, 1255],
       [ 221,  220,  222, ...,  181,  203,  257]], dtype=int32)]
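Since the points are stored with u-coordinates in the first row and v-coordinates in the second, an axis-aligned bounding box follows from the row extrema. The sketch below uses a small, made-up point set held in nested lists; bbox_from_points is a hypothetical helper, and whether the toolbox derives bbox this way is not stated here.

```python
def bbox_from_points(points):
    """Return [umin, vmin, umax, vmax] for a 2xN collection of (u,v) points."""
    u, v = points  # first row: u-coordinates, second row: v-coordinates
    return [min(u), min(v), max(u), max(v)]


# hypothetical region with six pixels
points = [[9, 10, 11, 8, 9, 10],
          [5, 5, 5, 5, 4, 4]]
print(bbox_from_points(points))

# a list of regions (with different numbers of points) is handled per region
regions = [points, [[3, 4], [7, 9]]]
print([bbox_from_points(p) for p in regions])
```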
- property bbox
Bounding boxes of MSERs
- Returns
Bounding box of MSER in [umin, vmin, umax, vmax] format
- Return type
ndarray(4) or ndarray(N,4)
If the object contains just one region the result is a 1D array, otherwise it is a 2D array with one row per bounding box.
Example:
>>> from machinevisiontoolbox import Image
>>> img = Image.Read("castle.png")
>>> msers = img.MSER()
>>> msers[0].bbox
array([   1,    4,  145,   95], dtype=int32)
>>> msers[:4].bbox
array([[   1,    4,  145,   95],
       [   1,  184,  182,  274],
       [1243,  179, 1279,  258],
       [1243,  179, 1279,  258]], dtype=int32)
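Taking the documented [umin, vmin, umax, vmax] layout at face value, the extent of each region follows by subtraction. The sketch below reuses three of the bounding boxes printed above as plain lists; bbox_size is a hypothetical helper, and it assumes the extent is umax - umin rather than the inclusive pixel count umax - umin + 1.

```python
def bbox_size(bbox):
    """Return (width, height) of a box given as [umin, vmin, umax, vmax]."""
    umin, vmin, umax, vmax = bbox
    return umax - umin, vmax - vmin


def bbox_area(bbox):
    width, height = bbox_size(bbox)
    return width * height


boxes = [[1, 4, 145, 95],
         [1, 184, 182, 274],
         [1243, 179, 1279, 258]]

sizes = [bbox_size(b) for b in boxes]
largest = max(boxes, key=bbox_area)  # box covering the largest area
print(sizes)
print(largest)
```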
- class machinevisiontoolbox.ImageRegionFeatures.OCRWord(ocr, i)
OCR word and metadata
- Parameters
ocr (dict of lists) – dict from Tesseract
i (int) – index of word
- Returns
OCR data for word
- Return type
OCRWord
instance
Describes a word detected by OCR including its metadata which is available as a number of properties:
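========  =======================================================
Property  Meaning
========  =======================================================
text      recognized text
conf      confidence in text recognition (percentage)
l         left coordinate (umin) of rectangle containing the text
t         top coordinate (vmin) of rectangle containing the text
w         width of rectangle containing the text
h         height of rectangle containing the text
ltrb      bounding box [left, top, right, bottom]
========  =======================================================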
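The l, t, w and h properties and the ltrb bounding box describe the same rectangle in two forms. A minimal sketch of the conversion, assuming right = left + width and bottom = top + height (the toolbox's exact convention is not stated here), with hypothetical helper names and made-up values:

```python
def ltwh_to_ltrb(l, t, w, h):
    """Convert left, top, width, height to [left, top, right, bottom]."""
    return [l, t, l + w, t + h]


def ltrb_to_ltwh(ltrb):
    """Inverse conversion, back to (left, top, width, height)."""
    left, top, right, bottom = ltrb
    return left, top, right - left, bottom - top


box = ltwh_to_ltrb(40, 30, 180, 60)
print(box)
print(ltrb_to_ltwh(box))
```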
- Seealso
ocr
- __str__()
String representation of OCR word
- Returns
Brief readable description of OCR word
- Return type
str
- property l
Left side of word bounding box
- property t
Top side of word bounding box
- property w
Width of word bounding box
- property h
Height of word bounding box
- property ltrb
Word bounding box
- class machinevisiontoolbox.ImageRegionFeatures.Fiducial(id, corners, K=None, rvec=None, tvec=None)
Properties of a visual fiducial marker
- Parameters
id (int) – identity of the marker
corners (ndarray(2, 4)) – image plane marker corners
K (ndarray(3,3), optional) – camera intrinsics
rvec (ndarray(3), optional) – rotation of marker with respect to camera, as an Euler vector
tvec (ndarray(3), optional) – translation of marker with respect to camera
- __str__()
String representation of fiducial
- Returns
Brief readable description of fiducial id and pose
- Return type
str
- property id
Fiducial id
- Returns
fiducial marker identity
- Return type
int
Returns the built-in identity code of the AprilTag or ArUco marker.
- property pose
Fiducial pose
- Returns
marker pose
- Return type
SE3
Returns the pose of the tag with respect to the camera. The x- and y-axes are in the marker plane and the z-axis is out of the marker.
Note
Accurate camera intrinsics and dimension parameters are required for this value to be metric.
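The relationship between the rvec and tvec constructor arguments and a rigid-body pose can be sketched with Rodrigues' rotation formula. This is an illustration only: it assumes rvec is a rotation (Euler) vector whose direction is the rotation axis and whose length is the angle in radians, as in OpenCV, and it builds a plain 4x4 homogeneous matrix from nested lists rather than an SE3 object. The function names and the numeric values are hypothetical.

```python
import math


def rodrigues(rvec):
    """3x3 rotation matrix (nested lists) from a rotation vector."""
    theta = math.sqrt(sum(c * c for c in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in rvec)  # unit rotation axis
    # skew-symmetric matrix of the axis, and its square
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3)) for j in range(3)]
          for i in range(3)]
    sin_t, one_minus_cos = math.sin(theta), 1.0 - math.cos(theta)
    # R = I + sin(theta) K + (1 - cos(theta)) K^2
    return [[(1.0 if i == j else 0.0) + sin_t * K[i][j] + one_minus_cos * K2[i][j]
             for j in range(3)] for i in range(3)]


def pose_matrix(rvec, tvec):
    """4x4 homogeneous transform with rotation from rvec and translation tvec."""
    R = rodrigues(rvec)
    return [R[i] + [float(tvec[i])] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# marker rotated 90 degrees about the camera z-axis, 0.5 units in front of it
T = pose_matrix([0.0, 0.0, math.pi / 2], [0.1, 0.2, 0.5])
for row in T:
    print([round(x, 3) for x in row])
```

The translation is only metric when the marker side length and camera intrinsics used to estimate it are accurate, as the note above states.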
- draw(image, length=100, thick=2)
Draw marker coordinate frame into image
- Parameters
image (
Image
) – image with BGR color order
length (int, optional) – axis length in pixels, defaults to 100
thick (int, optional) – axis thickness in pixels, defaults to 2
- Raises
ValueError – image must have BGR color order
Draws a coordinate frame into the image representing the pose of the marker. The x-, y- and z-axes are drawn as red, green and blue line segments.