Expert Insights: Mastering Optical Character Recognition (OCR) with TwinCAT Vision


Optical Character Recognition (OCR) extracts text from images and reduces the burden of manual data entry. The OCR function introduced in this article recognizes characters within an image and returns the identified character sequence as a string. It covers the numerals 0-9, the special characters /-:=, and the uppercase letters A-Z. The underlying model was built with classical machine learning algorithms and ships pre-trained, so users do not need to perform any additional setup or custom training to use the OCR feature.

Hardware and Software Versions

Control Software

Test Operating System: Windows 11;

Software Version: TwinCAT 3 FULL Version V3.1.4024.50;

TF7xxx Plugin Version: Ver.4.0.4.8.


To test this sample, you can use the attached offline image files to verify the algorithms, so no camera needs to be connected.



Preparation


Install TwinCAT 3 FULL Version V3.1.4024.50 on a Windows operating system, along with the TwinCAT Vision add-on TF7xxx. This test requires Ver.4.0.4.8 of the add-on.

Download link for TF7xxx version:

https://www.beckhoff.com.cn/zh-cn/products/automation/twincat/tfxxxx-twincat-3-functions/tf7xxx-vision/tf7800.html

Code Usage and Explanation

Image Acquisition

This example uses the offline simulation functionality of TwinCAT Vision: the algorithm processes images supplied by a File Source (offline) instead of a camera. Acquiring images from files means loading them from the file system into the TwinCAT real-time system. To set this up, navigate to: VISION node → File Source → File Source Control.



As shown in the figure above, load the images from the folder named "images" into the File Source Control.

Code Explanation

The main program begins with a conditional statement that initializes the OCR model. If initialization succeeds, the bInitialized flag is set to TRUE.



The code below checks the initialization status of the OCR model so that resources can be released or the model re-initialized as needed.



The following code retrieves the current image and performs a series of preprocessing operations, such as converting the color space, setting the Region of Interest (ROI), and applying morphological processing.
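Since the original code listing is not reproduced here, the preprocessing chain described above can be illustrated with a minimal Python sketch. It is not the TwinCAT Vision implementation: the helper names `to_gray`, `crop_roi` and `binarize` are hypothetical, images are plain nested lists, and the morphological step is left out for brevity.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to 8-bit gray values (BT.601 weights)."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]


def crop_roi(image, x, y, width, height):
    """Return the sub-image whose top-left corner is (x, y)."""
    return [row[x:x + width] for row in image[y:y + height]]


def binarize(gray, threshold, invert=False):
    """Map pixels above the threshold to 255 and all others to 0.

    Set invert=True for dark characters on a light background, so that
    the characters end up white on black.
    """
    return [
        [255 if (pixel > threshold) != invert else 0 for pixel in row]
        for row in gray
    ]


# Tiny 2 x 2 illustration image (values chosen for the example only).
image = [
    [(0, 0, 0), (255, 255, 255)],
    [(200, 200, 200), (10, 10, 10)],
]
gray = to_gray(image)
binary = binarize(gray, threshold=150)
```

The `invert` switch reflects that printed text is often dark on a light background, whereas the OCR input is expected to show white characters on black.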



The Region of Interest (ROI) is set by calling the F_GetROI function. Its main role is to select the ROI and related parameters based on the file name (sFileName), so that each image can be processed and recognized with its own settings. For instance, if the image is named "OCR 01.png", the top-left corner of the ROI is set to (86, 34), with a width of 263 pixels and a height of 102 pixels. The binarization threshold is set to 150, and the sPattern parameter specifies the expected format of the characters. Here it is "dd#dd#dd", an alternating pattern of digit pairs and special symbols that corresponds to a date such as "12.11.20" in the image.
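The lookup performed by F_GetROI can be pictured as a table keyed by file name. The Python sketch below is only an illustration of that idea, not the PLC code: `RoiConfig`, `ROI_TABLE`, `get_roi` and `matches_pattern` are hypothetical names, and the pattern semantics are an assumption ('d' is one digit, '#' is one character that is neither a letter, a digit nor a space).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoiConfig:
    x: int          # top-left corner, horizontal position in pixels
    y: int          # top-left corner, vertical position in pixels
    width: int
    height: int
    threshold: int  # binarization threshold
    pattern: str    # expected character format


# Values for "OCR 01.png" are the ones given in the text above.
ROI_TABLE = {
    "OCR 01.png": RoiConfig(86, 34, 263, 102, 150, "dd#dd#dd"),
}


def get_roi(file_name):
    """Return the settings for a file name, or None if it is unknown."""
    return ROI_TABLE.get(file_name)


def matches_pattern(text, pattern):
    """Check a recognized string against a pattern (assumed semantics)."""
    if len(text) != len(pattern):
        return False
    for ch, p in zip(text, pattern):
        if p == "d":
            if ch not in "0123456789":
                return False
        elif p == "#":
            if ch.isalnum() or ch.isspace():
                return False
        elif ch != p:  # any other pattern character must match literally
            return False
    return True


config = get_roi("OCR 01.png")
is_date = matches_pattern("12.11.20", config.pattern)
```

Keeping the per-image settings in one table makes it easy to add further images without touching the processing logic.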



Next comes the OCR call. Depending on whether the advanced features are to be used (bUseExpFunction), either the standard function F_VN_OCR or the expert function F_VN_OCRExp performs the character recognition. Both functions require the image passed to the ipSrcImage parameter to be a single-channel binary image with white characters on a black background. ETcVnOcrModelType is an enumeration type that provides the different OCR model types:

  • TCVN_OMT_NUMBERS: Used for recognizing numbers.
  • TCVN_OMT_NUMBERS_SC: Used for recognizing numbers and special characters.
  • TCVN_OMT_UCLETTERS: Used for recognizing uppercase letters.
  • TCVN_OMT_NUMBERS_SC_UCLETTERS: Used for recognizing numbers, special characters, and uppercase letters.
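As a rough guide to choosing among these model types, the Python sketch below picks the narrowest one that covers a sample of the expected text. The helper `choose_model_type` is hypothetical and returns the enumeration names as plain strings. It assumes that any non-alphanumeric character counts as a special character, and that a mix of letters and digits falls back to the combined model because no letters-plus-numbers-only type is listed.

```python
DIGITS = set("0123456789")
UPPERCASE = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def choose_model_type(expected_text):
    """Return the name of the narrowest model type covering the text."""
    chars = set(expected_text) - {" "}
    if any(ch.islower() for ch in chars):
        # Only uppercase letters are supported according to the article.
        raise ValueError("lowercase letters are not supported")
    has_letters = bool(chars & UPPERCASE)
    has_digits = bool(chars & DIGITS)
    has_special = bool(chars - DIGITS - UPPERCASE)
    if has_letters and not has_digits and not has_special:
        return "TCVN_OMT_UCLETTERS"
    if has_letters:
        return "TCVN_OMT_NUMBERS_SC_UCLETTERS"
    if has_special:
        return "TCVN_OMT_NUMBERS_SC"
    return "TCVN_OMT_NUMBERS"


date_model = choose_model_type("12.11.20")
```

A narrower model has fewer candidate classes to confuse, which is presumably why the enumeration offers more than the all-in-one type.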

Specific Operation Steps and Result Images

  • Add the sample image to the FileSource1 control.
  • Modify the parameters in F_GetROI according to your provided image, such as the file name, ROI area, and binarization threshold.
  • Activate the configuration -> Start the TwinCAT system and PLC runtime -> Observe the results in the ADS Image Watch.
  • Toggle bUseExpFunction to switch between the standard and expert OCR functions.

Taking the image OCR_01.png as an example, after loading the image into FileSource, activate the configuration and directly download the program.



For this image, the parameters in the F_GetROI function have already been set and do not require modification. Simply set bUseExpFunction to TRUE and observe the processing results in the ADS Image Watch.

The following image demonstrates the result after binarizing the ROI area and removing bright objects connected to the image boundaries. This step aims to ensure that only the relevant information in the image is preserved, allowing for more accurate character recognition.
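To make the border-cleaning step concrete, here is a minimal Python sketch that removes every bright region touching the image boundary by flood fill. It is an illustration under stated assumptions rather than the TwinCAT Vision routine: `clear_border` is a hypothetical helper, the image is a nested list of 0/255 values, and pixels are treated as connected only through their four direct neighbors.

```python
from collections import deque


def clear_border(binary):
    """Return a copy with all bright regions touching the border set to 0."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    result = [row[:] for row in binary]
    queue = deque()

    # Seed the fill with every bright pixel that lies on the border.
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and result[y][x]:
                result[y][x] = 0
                queue.append((y, x))

    # Spread from the seeds to all connected bright pixels (4-connectivity).
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width and result[ny][nx]:
                result[ny][nx] = 0
                queue.append((ny, nx))
    return result


sample = [
    [255, 0, 0, 0, 0],
    [255, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0],
]
cleaned = clear_border(sample)
```

In the sample, the two-pixel blob in the left column touches the boundary and is removed, while the blob in the middle column is kept.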



The binarized result with the edge regions highlighted is:



The final recognized OCR result image is:



Requirements for Characters

  • The height of characters must be at least 20 pixels.
  • The stroke width of characters must be at least 3 pixels.
  • The minimum size of dots is 3 x 3 pixels.
  • The minimum size of lines is 3 x 6 pixels.
  • The spacing between characters must be at least 4 pixels.
  • Characters must not overlap.
  • The maximum deviation angle for horizontal alignment of characters is ±6°.
  • Lines in characters must not be broken.
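The numeric limits above lend themselves to an automated pre-check. The Python sketch below assumes the measurements have already been obtained by some other means. `CharMeasurement` and `check_character` are hypothetical names, and only the four limits that apply to a whole character are covered.

```python
from dataclasses import dataclass


@dataclass
class CharMeasurement:
    height_px: float
    stroke_width_px: float
    gap_to_next_px: float  # spacing to the neighboring character
    tilt_deg: float        # deviation from horizontal alignment


def check_character(m):
    """Return a list of violated requirements (empty if all are met)."""
    problems = []
    if m.height_px < 20:
        problems.append("height below 20 px")
    if m.stroke_width_px < 3:
        problems.append("stroke width below 3 px")
    if m.gap_to_next_px < 4:
        problems.append("spacing below 4 px")
    if abs(m.tilt_deg) > 6:
        problems.append("tilt beyond +/-6 degrees")
    return problems


good = check_character(CharMeasurement(24, 3, 5, -2.0))
bad = check_character(CharMeasurement(18, 2, 5, 7.5))
```

Returning the list of violations instead of a single flag shows at a glance which requirement an image fails.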

General Requirements for Images

  • The Region of Interest (ROI) should only contain text and its immediate surroundings without any distracting elements.
  • There should be good contrast between the characters and the background.
  • The background should be uniform, noise-free, undisturbed, and opaque.

Requirements for Fonts

  • Only monospaced fonts are allowed, where character spacing is equal to the character width.
  • Larger spaces will only be recognized as a single space.
  • Only sans-serif fonts are permitted, such as Arial, Tahoma, Courier, Univers, Frutiger, Verdana, OCR-B.
  • Mixing fonts is not allowed.
  • Dot matrix printing (dot fonts) or italic fonts are not permitted.