Region features
These methods extract features such as homogeneous regions, text and fiducials from the image.
- class machinevisiontoolbox.ImageRegionFeatures.ImageRegionFeaturesMixin
- MSER(**kwargs)
- Find MSER features in image
- Parameters
- kwargs – arguments passed to opencv.MSER_create
- Returns
- set of MSER features
- Return type
- MSERFeature
 - Find all the maximally stable extremal regions in the image and return an object that represents the MSERs found. The object behaves like a list and can be indexed, sliced and used as an iterator in for loops and comprehensions.
- Example:
    >>> from machinevisiontoolbox import Image
    >>> img = Image.Read("castle.png")
    >>> mser = img.MSER()
    >>> len(mser)  # number of features
    899
    >>> mser[:5].bbox
    array([[   1,    4,  145,   95],
           [   1,  184,  182,  274],
           [1243,  179, 1279,  258],
           [1243,  179, 1279,  258],
           [1242,  178, 1279,  258]], dtype=int32)
- References
- Robotics, Vision & Control for Python, Section 12.1.1.2, P. Corke, Springer 2023. 
 
 - ocr(minconf=50, plot=False)
- Optical character recognition
- Parameters
- minconf (int, optional) – minimum confidence value for text to be returned or plotted (percentage), defaults to 50 
- plot (bool, optional) – overlay detected text on the current plot, assumed to be the image, defaults to False 
 
- Returns
- detected strings and metadata 
- Return type
- list of OCRWord
 - Each recognized text string is described by an OCRWord instance that contains the string, confidence and bounding box within the image.
- Warning
- PyTesseract must be installed.
- References
- Robotics, Vision & Control for Python, Section 12.4.1, P. Corke, Springer 2023. 
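The effect of minconf can be illustrated with a small stand-alone sketch. It assumes the recognizer output is a dict of equal-length lists with `text` and `conf` keys, which is the layout pytesseract's `image_to_data` produces with `Output.DICT`. The helper name `filter_words` and the sample data are hypothetical, and this is not necessarily how the toolbox implements the filter.

```python
def filter_words(data, minconf=50):
    """Return (text, conf) pairs whose confidence is at least minconf.

    data is assumed to be a dict of equal-length lists with 'text' and
    'conf' keys (hypothetical layout, see the note above).
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # confidence may arrive as a string
        # skip empty entries and anything below the threshold
        if text.strip() and conf >= minconf:
            words.append((text, conf))
    return words


# made-up recognizer output, for illustration only
sample = {
    "text": ["", "STOP", "5t0p", "AHEAD"],
    "conf": [-1, 96, 31, 72],
}

confident = filter_words(sample, minconf=50)
print(confident)
```

Lowering minconf admits more, and less reliable, strings; raising it keeps only the words the recognizer is most sure of.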
 
 - fiducial(dict='4x4_1000', K=None, side=None)
- Find fiducial markers in image
- Parameters
- dict (str, optional) – marker type, defaults to “4x4_1000” 
- K (ndarray(3,3), optional) – camera intrinsics, defaults to None 
- side (float, optional) – side length of the marker, defaults to None 
 
- Returns
- markers found in image 
- Return type
- list of Fiducial instances
 - Find ArUco or AprilTag markers in the scene and return a list of Fiducial objects, one per marker. If camera intrinsics are provided then also compute the marker pose with respect to the camera.
- dict specifies the marker family or dictionary and describes the number of bits in the tag and the number of usable unique tags.

    dict        tag type    marker size    number of unique tags
    4x4_50      ArUco       4x4            50
    4x4_100     ArUco       4x4            100
    4x4_250     ArUco       4x4            250
    4x4_1000    ArUco       4x4            1000
    5x5_50      ArUco       5x5            50
    5x5_100     ArUco       5x5            100
    5x5_250     ArUco       5x5            250
    5x5_1000    ArUco       5x5            1000
    6x6_50      ArUco       6x6            50
    6x6_100     ArUco       6x6            100
    6x6_250     ArUco       6x6            250
    6x6_1000    ArUco       6x6            1000
    7x7_50      ArUco       7x7            50
    7x7_100     ArUco       7x7            100
    7x7_250     ArUco       7x7            250
    7x7_1000    ArUco       7x7            1000
    original    ArUco       ?              ?
    16h5        AprilTag    4x4            30
    25h9        AprilTag    5x5            35
    36h10       AprilTag    6x6            ?
    36h11       AprilTag    6x6            587

- Note
- side is the dimension of the square that contains the small white squares inside the black background.
- References
- Robotics, Vision & Control for Python, Section 13.6.1, P. Corke, Springer 2023. 
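The dictionary names in the table above encode most of their own metadata, which a short sketch can make explicit. The helper `describe_tag_dict` is hypothetical and not part of the toolbox. It assumes ArUco names follow the pattern `NxN_count` and AprilTag names follow `<bits>h<minimum Hamming distance>`, and it does not handle the `original` dictionary. The number of unique AprilTags cannot be derived from the name, so it is not returned.

```python
import math
import re


def describe_tag_dict(name):
    """Split a marker dictionary name into its parts (illustrative only)."""
    m = re.fullmatch(r"(\d+)x(\d+)_(\d+)", name)
    if m:  # ArUco style, e.g. "4x4_1000"
        rows, cols, count = map(int, m.groups())
        if rows != cols:
            raise ValueError(f"marker grid must be square: {name}")
        return {"family": "ArUco", "bits_per_side": rows, "count": count}

    m = re.fullmatch(r"(\d+)h(\d+)", name)
    if m:  # AprilTag style, e.g. "36h11"
        bits, hamming = map(int, m.groups())
        side = math.isqrt(bits)
        if side * side != bits:
            raise ValueError(f"bit count is not a square number: {name}")
        return {"family": "AprilTag", "bits_per_side": side, "hamming": hamming}

    raise ValueError(f"unrecognized dictionary name: {name}")


print(describe_tag_dict("4x4_1000"))
print(describe_tag_dict("36h11"))
```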
 
 
Region feature classes
- class machinevisiontoolbox.ImageRegionFeatures.MSERFeature(image=None, **kwargs)
- Find MSERs
- Parameters
- image (Image) – input image
- kwargs – parameters passed to opencv.MSER_create
 
 - Find all the maximally stable extremal regions in the image and return an object that represents the MSERs found. This class behaves like a list and each MSER is an element of the list.
- Example:
    >>> from machinevisiontoolbox import Image
    >>> img = Image.Read('shark2.png')
    >>> msers = img.MSER()
    >>> len(msers)
    2
    >>> msers[0]
    MSER features, 2 regions
    >>> msers.bbox
    array([[299, 300, 445, 408],
           [ 99, 100, 245, 208]], dtype=int32)
- References
- J. Matas, O. Chum, M. Urban, and T. Pajdla. “Robust wide baseline stereo from maximally stable extremal regions.” Proc. of British Machine Vision Conference, pages 384-396, 2002. 
- Robotics, Vision & Control for Python, Section 12.1.2.2, P. Corke, Springer 2023. 
 
 - __len__()
- Number of MSER features
- Returns
- number of features 
- Return type
- int 
- Example:
    >>> from machinevisiontoolbox import Image
    >>> img = Image.Read("castle.png")
    >>> mser = img.MSER()
    >>> len(mser)  # number of features
    899
 
 - __getitem__(i)
- Get MSERs from MSER feature object
- Parameters
- i (int or slice) – index 
- Raises
- IndexError – index out of range 
- Returns
- subset of MSER features
- Return type
- MSERFeature instance
 - This method allows an MSERFeature object to be indexed, sliced or iterated.
- Example:
    >>> from machinevisiontoolbox import Image
    >>> img = Image.Read("castle.png")
    >>> mser = img.MSER()
    >>> len(mser)  # number of features
    899
    >>> mser[:5]  # first 5 MSER features
    MSER features, 5 regions
    >>> mser[::50]  # every 50th MSER feature
    MSER features, 18 regions
- Seealso
- len
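The list-like behaviour described here can be sketched with a small stand-alone class. `RegionSet` is a hypothetical stand-in, not the toolbox's MSERFeature; it only illustrates how `__len__` and `__getitem__` together give indexing, slicing and iteration, with every index or slice returning an object of the same type.

```python
class RegionSet:
    """Hypothetical list-like container (not the toolbox class)."""

    def __init__(self, regions):
        self._regions = list(regions)

    def __len__(self):
        return len(self._regions)

    def __getitem__(self, i):
        if isinstance(i, slice):
            # a slice yields a new container holding the selected regions
            return RegionSet(self._regions[i])
        # an int yields a container with one region; the underlying list
        # raises IndexError when i is out of range, which also ends iteration
        return RegionSet([self._regions[i]])

    def __str__(self):
        return f"RegionSet, {len(self)} regions"


regions = RegionSet(range(899))
print(regions[:5])     # first 5 regions
print(regions[::50])   # every 50th region
print(regions[0])      # a single region, still wrapped in a RegionSet
```

Because `__getitem__` raises IndexError past the end, Python's fallback iteration protocol lets the object be used directly in for loops and comprehensions without defining `__iter__`.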
 
 - __str__()
- String representation of MSER
- Returns
- Brief readable description of MSER 
- Return type
- str 
- Example:
    >>> from machinevisiontoolbox import Image
    >>> img = Image.Read("castle.png")
    >>> msers = img.MSER()
    >>> str(msers)
    'MSER features, 899 regions'
    >>> str(msers[0])
    'MSER features, 2 regions'
 - property points
- Points belonging to MSERs
- Returns
- Coordinates of points in (u,v) format that belong to MSER 
- Return type
- ndarray(2,N), list of ndarray(2,N) 
 - If the object contains just one region the result is an array, otherwise it is a list of arrays (with different numbers of columns).
- Example:
    >>> from machinevisiontoolbox import Image
    >>> import numpy as np
    >>> img = Image.Read("castle.png")
    >>> msers = img.MSER()
    >>> np.set_printoptions(threshold=10)  # abbreviate long arrays
    >>> msers[0].points
    array([[ 9, 10, 11, ...,  8,  9, 10],
           [ 5,  5,  5, ...,  5,  4,  4]], dtype=int32)
    >>> msers[2:4].points
    [array([[1249, 1249, 1249, ..., 1245, 1251, 1246],
           [ 221,  220,  222, ...,  232,  181,  242]], dtype=int32),
     array([[1249, 1249, 1249, ..., 1250, 1244, 1255],
           [ 221,  220,  222, ...,  181,  203,  257]], dtype=int32)]
 
 - property bbox
- Bounding boxes of MSERs
- Returns
- Bounding box of MSER in [umin, vmin, umax, vmax] format 
- Return type
- ndarray(4) or ndarray(N,4) 
 - If the object contains just one region the result is a 1D array, otherwise it is a 2D array with one row per bounding box.
- Example:
    >>> from machinevisiontoolbox import Image
    >>> img = Image.Read("castle.png")
    >>> msers = img.MSER()
    >>> msers[0].bbox
    array([  1,   4, 145,  95], dtype=int32)
    >>> msers[:4].bbox
    array([[   1,    4,  145,   95],
           [   1,  184,  182,  274],
           [1243,  179, 1279,  258],
           [1243,  179, 1279,  258]], dtype=int32)
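The relationship between a region's points and its bounding box can be shown with a minimal pure-Python sketch. The function `bounding_box` and the sample coordinates are hypothetical. The sketch assumes points are stored as two rows, u then v, mirroring the (2,N) layout of the points property, and it is not the toolbox's own computation.

```python
def bounding_box(points):
    """Return [umin, vmin, umax, vmax] for points stored as two rows.

    points[0] holds the u (horizontal) coordinates and points[1] the
    v (vertical) coordinates.
    """
    u, v = points
    return [min(u), min(v), max(u), max(v)]


# made-up region with six pixels, for illustration only
region = [
    [9, 10, 11, 8, 9, 10],  # u coordinates
    [5, 5, 5, 5, 4, 4],     # v coordinates
]

bbox = bounding_box(region)
print(bbox)  # [8, 4, 11, 5]
```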
 
 
- class machinevisiontoolbox.ImageRegionFeatures.OCRWord(ocr, i)
- OCR word and metadata
- Parameters
- ocr (dict of lists) – dict from Tesseract 
- i (int) – index of word 
 
- Returns
- OCR data for word 
- Return type
- OCRWord instance
 - Describes a word detected by OCR including its metadata, which is available as a number of properties:

    Property    Meaning
    text        recognized text
    conf        confidence in text recognition (percentage)
    l           left coordinate (umin) of rectangle containing the text
    t           top coordinate (vmin) of rectangle containing the text
    w           width of rectangle containing the text
    h           height of rectangle containing the text
    ltrb        bounding box [left, top, right, bottom]

- Seealso
- ocr
 - __str__()
- String representation of OCR word
- Returns
- Brief readable description of OCR word 
- Return type
- str 
 
 - property l
- Left side of word bounding box 
 - property t
- Top side of word bounding box 
 - property w
- Width of word bounding box 
 - property h
- Height of word bounding box 
 - property ltrb
- Word bounding box 
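The ltrb form of the bounding box follows from the l, t, w and h properties. The sketch below is an assumption about the convention: it takes right = left + width and bottom = top + height, whereas an inclusive-pixel convention would subtract one from each. The function name `ltwh_to_ltrb` is hypothetical.

```python
def ltwh_to_ltrb(l, t, w, h):
    """Convert left, top, width, height to [left, top, right, bottom].

    Assumes right = l + w and bottom = t + h (see the note above).
    """
    return [l, t, l + w, t + h]


box = ltwh_to_ltrb(10, 20, 30, 15)
print(box)  # [10, 20, 40, 35]
```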
 
- class machinevisiontoolbox.ImageRegionFeatures.Fiducial(id, corners, K=None, rvec=None, tvec=None)
- Properties of a visual fiducial marker
- Parameters
- id (int) – identity of the marker 
- corners (ndarray(2, 4)) – image plane marker corners 
- K (ndarray(3,3), optional) – camera intrinsics 
- rvec (ndarray(3), optional) – rotation of marker with respect to camera, as an Euler vector 
- tvec (ndarray(3), optional) – translation of marker with respect to camera 
 
 - __str__()
- String representation of fiducial
- Returns
- Brief readable description of fiducial id and pose 
- Return type
- str 
 
 - property id
- Fiducial id
- Returns
- fiducial marker identity 
- Return type
- int 
 - Returns the built-in identity code of the AprilTag or ArUco marker. 
 - property pose
- Fiducial pose
- Returns
- marker pose 
- Return type
- SE3 
 - Returns the pose of the tag with respect to the camera. The x- and y-axes are in the marker plane and the z-axis is out of the marker.
- Note
- Accurate camera intrinsics and dimension parameters are required for this value to be metric. 
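How a rotation vector and a translation vector combine into a pose can be sketched in pure Python. The sketch assumes rvec follows the axis-angle convention used by OpenCV, where the direction is the rotation axis and the norm is the angle in radians, and it applies Rodrigues' formula. The helper names `rotvec_to_matrix` and `pose_matrix` are hypothetical; the toolbox itself returns an SE3 object, and its internal construction is not shown here.

```python
import math


def rotvec_to_matrix(rvec):
    """Rodrigues' formula: rotation vector (axis * angle) to a 3x3 matrix."""
    theta = math.sqrt(sum(x * x for x in rvec))
    if theta < 1e-12:  # negligible rotation, return identity
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (x / theta for x in rvec)  # unit rotation axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def pose_matrix(rvec, tvec):
    """Build a 4x4 homogeneous transform from rvec and tvec."""
    R = rotvec_to_matrix(rvec)
    rows = [R[i] + [float(tvec[i])] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


# marker rotated 90 degrees about the camera z-axis, 0.5 units in front
T = pose_matrix([0.0, 0.0, math.pi / 2], [0.1, 0.2, 0.5])
for row in T:
    print([round(x, 3) for x in row])
```

The translation column is only in metric units when the marker side length and camera intrinsics used to estimate it were accurate, as the note above points out.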
 - draw(image, length=100, thick=2)
- Draw marker coordinate frame into image
- Parameters
- image (Image) – image with BGR color order
- length (int, optional) – axis length in pixels, defaults to 100 
- thick (int, optional) – axis thickness in pixels, defaults to 2 
 
- Raises
- ValueError – image must have BGR color order 
 - Draws a coordinate frame into the image representing the pose of the marker. The x-, y- and z-axes are drawn as red, green and blue line segments.