Machine vision: what it is and how to use it. Processing images from an optical source

Machine vision is a scientific field within artificial intelligence, in particular robotics, together with the related technologies for acquiring images of real-world objects, processing them, and using the resulting data to solve applied problems with no or only partial human involvement.

Historical breakthroughs in machine vision

  • 1955: Oliver Selfridge. Paper "Eyes and Ears for the Computer".
  • 1958: Frank Rosenblatt. Computer implementation of the perceptron.
  • 1960s: the first image processing systems.
  • 1970s: Lawrence Roberts. Concept of machine construction of three-dimensional images of objects.
  • 1979: Hans-Hellmut Nagel. Theory of dynamic scene analysis.
  • 1990s: the first driverless vehicle control systems.
  • 2003: enterprise face recognition systems.

Machine Vision System Components

  • One or more digital or analog cameras (monochrome or color) with suitable optics for acquiring images
  • Software that prepares images for processing. For analog cameras this is an image digitizer (frame grabber)
  • A processor (a modern PC with a multi-core processor, or an embedded processor such as a DSP)
  • Machine vision software that provides tools for developing individual software applications
  • Input/output hardware or communication links for reporting the results
  • A smart camera: a single device that combines all of the items above
  • Highly specialized light sources (LEDs, fluorescent and halogen lamps, etc.)
  • Application-specific software for processing images and detecting the relevant features
  • A sensor for synchronizing part detection (often an optical or magnetic sensor) to trigger image capture and processing
  • Actuators of some form used to sort or reject defective parts

Machine vision focuses mainly on industrial applications, such as autonomous robots and visual inspection and measurement systems. Image sensor technology and control theory are therefore tied to the processing of video data used to control a robot, and the acquired data is processed in real time either in software or in hardware.

Image processing and image analysis deal mainly with 2D images, that is, with how to transform one image into another. Examples are pixel-by-pixel operations that increase contrast, edge extraction, noise removal, and geometric transformations such as image rotation. These operations assume that the processing or analysis does not depend on the content of the images themselves.
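
As a small illustration of a pixel-by-pixel contrast operation, the sketch below linearly stretches gray values to the full 0..255 range. The function name `stretch_contrast` and the list-of-lists image format are assumptions made for this example, not part of any particular library.

```python
def stretch_contrast(image):
    """Linearly rescale gray values so the darkest becomes 0 and the brightest 255."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in image]
    scale = 255 / (hi - lo)
    # Each output pixel depends only on the matching input pixel.
    return [[round((p - lo) * scale) for p in row] for row in image]


# Hypothetical low-contrast image: all values lie between 100 and 140.
image = [
    [100, 110, 120],
    [110, 120, 130],
    [120, 130, 140],
]
stretched = stretch_contrast(image)
```

The operation never looks at what the image shows, which is the content independence mentioned above.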

Computer vision focuses on processing three-dimensional scenes projected onto one or more images, for example recovering the structure of a 3D scene, or other information about it, from one or more images. Computer vision often depends on more or less complex assumptions about what the images represent.

There is also a field called imaging (visualization), which originally dealt with the process of creating images but sometimes also covers their processing and analysis. For example, radiography works with the analysis of image data in medical applications.

Finally, pattern recognition is an area that uses various methods to obtain information from video data, mainly based on a statistical approach. A significant part of this area is devoted to the practical application of these methods.

Thus, the concept of "machine vision" today includes computer vision, visual pattern recognition, image analysis and processing, and related fields.

Machine vision tasks

  • Recognition
  • Identification
  • Detection
  • Text recognition
  • Recovery of 3D shape from 2D images
  • Motion estimation
  • Scene recovery
  • Image recovery
  • Extraction of structures of a particular kind from images, image segmentation
  • Optical flow analysis


Recognition

A classic task in computer vision, image processing, and machine vision is determining whether the video data contains some characteristic object, feature, or activity.

A person solves this problem easily and reliably, but in computer vision it has still not been solved satisfactorily for the general case: arbitrary objects in arbitrary situations.

One or more predefined or learned objects or classes of objects can be recognized (usually together with their two-dimensional position in the image or three-dimensional position in the scene).


Identification

An individual instance of an object belonging to a class is recognized.
Examples: identification of a specific human face, fingerprint, or car.


Detection

Video data is checked for a specific condition.

Detection based on relatively simple and fast calculations is sometimes used to find small regions in the analyzed image, which are then examined with more resource-intensive techniques to obtain a correct interpretation.

Text recognition

Content-based image retrieval: finding all images in a large set that have content specified in various ways.

Pose estimation: determining the position or orientation of a specific object relative to the camera.

Optical character recognition: recognizing characters in images of printed or handwritten text, usually in order to convert them into a text format that is more convenient for editing or indexing (for example, ASCII).

3D shape recovery from 2D images is carried out using stereo reconstruction of the depth map, reconstruction of the normal field and depth map from the shading of a grayscale image, reconstruction of the depth map from texture, and determination of shape from motion.

An example of restoring a 3D shape from a 2D image
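
For the stereo case, the link between an image pair and the depth map can be stated compactly. The relation below assumes an idealized, rectified pair of identical parallel cameras; the symbols (focal length $f$ in pixels, baseline $B$, image columns $x_L$ and $x_R$ of the same scene point) are notation introduced here for illustration.

```latex
d = x_L - x_R, \qquad Z = \frac{f\,B}{d}
```

Here $d$ is the disparity and $Z$ the depth of the point, so nearby points produce large disparities and distant points small ones.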

Motion estimation

Motion estimation covers several tasks in which a sequence of images (video data) is processed to estimate the velocity of each point in the image or in the 3D scene. Examples of such tasks are determining the three-dimensional movement of the camera and tracking, that is, following the movements of an object (for example, cars or people).

Scene recovery

Two or more scene images or video data are given. Scene restoration is designed to recreate a three-dimensional model of the scene. In the simplest case, the model can be a set of points of three-dimensional space. More sophisticated methods reproduce a full three-dimensional model.

Image recovery

The task of image restoration is to remove noise (sensor noise, motion blur of a moving object, etc.).

The simplest approach to this problem is to apply various types of filters, such as low-pass or median filters.

Better noise removal is achieved by first analyzing the video data for structures such as lines or edges and then controlling the filtering process based on that analysis.
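
As an illustration of filter-based noise removal, the sketch below applies a 3x3 median filter, one common simple choice for suppressing isolated noisy pixels. The function name and the handling of image borders (using only the neighbors that exist) are assumptions of this example.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Border pixels use only the neighbors that lie inside the image.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # median_low always returns a value that occurs in the window.
            row.append(median_low(window))
        result.append(row)
    return result


# Hypothetical flat gray patch with a single bright noise pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

The isolated bright pixel is replaced by the surrounding gray level, while a uniform region is left unchanged.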

Optical flow analysis

Optical flow analysis finds the movement of pixels between two images. It belongs to the motion estimation tasks described above: a sequence of images (video data) is processed to estimate the velocity of each point in the image or in the 3D scene, for example to determine the three-dimensional movement of the camera or to track a moving object.
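
A deliberately simplified sketch of motion estimation between two frames is shown below. It estimates a single displacement for the whole frame by exhaustive matching, whereas real optical flow methods estimate a vector per pixel. The function names, the search range `max_shift`, and the test frames are all assumptions made for illustration.

```python
def estimate_shift(prev, curr, max_shift=2):
    """Return (dy, dx) such that curr[y + dy][x + dx] best matches prev[y][x]."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            for y in range(height):
                for x in range(width):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += abs(curr[ny][nx] - prev[y][x])
                        count += 1
            if count == 0:
                continue  # The shifted frames do not overlap at all.
            cost = total / count  # Mean absolute difference over the overlap.
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def make_frame(top, left, size=6):
    """Build a dark size x size frame containing one bright 2x2 blob."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 255
    return frame


prev_frame = make_frame(1, 1)
curr_frame = make_frame(2, 3)  # The blob moved 1 row down and 2 columns right.
shift = estimate_shift(prev_frame, curr_frame)
```

Dividing the displacement by the time between frames gives a velocity estimate in pixels per unit time.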

Image Processing Methods

  • Pixel counter
  • Binarization
  • Segmentation
  • Barcode reading
  • Optical character recognition
  • Measurement
  • Edge detection
  • Template matching

Pixel counter

Counts the number of light or dark pixels.
Using a pixel counter, the user can select a rectangular area on the screen at a place of interest, for example where they expect to see the faces of people passing by. The camera immediately responds with the number of pixels along the sides of the rectangle.

The pixel counter makes it possible to quickly check whether the mounted camera meets the regulatory requirements or customer requirements regarding pixel resolution, for example, for the faces of people entering the doors controlled by the camera, or for the purpose of recognizing license plates.
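
The first sense of the pixel counter, counting light or dark pixels inside a selected rectangle, can be sketched as follows. The function name, the parameter layout, and the threshold of 128 that separates light from dark are assumptions of this example.

```python
def count_pixels(image, top, left, height, width, threshold=128):
    """Count light and dark pixels inside a rectangular region of interest."""
    light = dark = 0
    for row in image[top:top + height]:
        for value in row[left:left + width]:
            if value >= threshold:
                light += 1
            else:
                dark += 1
    return light, dark


# Hypothetical 4x4 grayscale image.
image = [
    [0, 0, 200, 200],
    [0, 0, 200, 200],
    [0, 50, 220, 90],
    [0, 0, 0, 0],
]
# Region: rows 0..2, columns 1..3.
light, dark = count_pixels(image, top=0, left=1, height=3, width=3)
```

The two counts always add up to the area of the selected rectangle, which gives a quick sanity check.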


Binarization

Converts a grayscale image to binary (white and black pixels).
The value of each pixel is conventionally encoded as "0" or "1". The value "0" is called the background, and "1" the foreground.

Digital binary images are often stored as a bitmap, in which one bit of information represents one pixel.

Especially in the early stages of the technology, the two colors were black and white, although any two colors can be used.
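
A minimal sketch of both ideas, thresholding and one-bit-per-pixel storage, is given below. The fixed threshold of 128, the choice of bright pixels as foreground, and the bit order (leftmost pixel in the most significant bit) are assumptions of this example.

```python
def binarize(image, threshold=128):
    """Map each gray value to 1 (foreground) or 0 (background)."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def pack_row(bits):
    """Pack a row of 0/1 pixels into bytes, 8 pixels per byte, leftmost pixel first."""
    # Pad the row with background pixels up to a multiple of 8.
    padded = bits + [0] * (-len(bits) % 8)
    packed = bytearray()
    for i in range(0, len(padded), 8):
        byte = 0
        for bit in padded[i:i + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


# Hypothetical one-row grayscale image with 10 pixels.
gray = [[12, 200, 130, 90, 255, 0, 128, 127, 250, 10]]
binary = binarize(gray)
packed = pack_row(binary[0])  # 10 pixels fit into 2 bytes
```

Stored this way, the binary image takes one eighth of the space of an 8-bit grayscale image of the same size.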


Segmentation

Used to find and/or count parts.

The purpose of segmentation is to simplify and/or change the representation of an image so that it is easier to analyze.

Image segmentation is usually used to highlight objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to each image pixel so that pixels with the same label share common visual characteristics.

The result of image segmentation is a set of segments that together cover the entire image, or a set of contours extracted from the image. All pixels in a segment are similar in some characteristic or computed property, for example color, brightness, or texture. Neighboring segments differ significantly in this characteristic.
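
One simple way to assign labels and count parts in an already binarized image is connected-component labeling. The sketch below is an illustrative variant that uses 4-connectivity and a breadth-first search; the function name and the example image are assumptions, and other segmentation methods work quite differently.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected foreground regions; return (labels, number_of_regions)."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for sy in range(height):
        for sx in range(width):
            if binary[sy][sx] != 1 or labels[sy][sx] != 0:
                continue  # Background, or already part of a labeled region.
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical binary image with three separate parts.
parts = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
labels, number_of_parts = label_components(parts)
```

Pixels that share a label belong to the same part, and the number of labels is the part count.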

Barcode reading

A barcode is graphic information applied to the surface, marking, or packaging of a product so that it can be read by technical means. It consists of a sequence of black and white stripes or other geometric shapes.
In machine vision, barcode reading means decoding 1D and 2D codes designed to be read or scanned by machines.

Optical character recognition

Optical character recognition: automated reading of text, such as serial numbers.

Recognition is used to convert books and documents into electronic form, to automate business accounting systems or to publish text on a web page.

Optical text recognition makes it possible to edit text, search for words or phrases, store the text in a more compact form, display or print the material without loss of quality, analyze the information, and apply machine translation, formatting, or text-to-speech conversion.

My LabVIEW Image Program

Computer vision was used for non-destructive quality control of superconducting materials.

Introduction. Ensuring integrated security (both the anti-terrorist and mechanical security of facilities and the technological safety of engineering systems) currently requires systematically organized monitoring of the current state of facilities. Among the most promising monitoring methods are optical and optoelectronic methods based on technologies for processing video images from an optical source. These include image manipulation programs, the latest image processing methods, and equipment for acquiring, analyzing, and processing images, that is, the set of tools and methods belonging to computer and machine vision. Computer vision is a general set of methods that allow computers to see and recognize three- or two-dimensional objects, whether of an engineering nature or not. Working with computer vision requires digital or analog input-output devices, as well as computer networks and IP location analyzers, in order to control the production process and prepare information for operational decisions in the shortest possible time.

Formulation of the problem. Today, the main task of machine vision systems under design remains the detection, recognition, identification, and qualification of potential risk objects located at an arbitrary place in the area of operational responsibility of the complex. Existing software products aimed at these problems have a number of significant drawbacks: considerable complexity caused by the high level of detail of optical images, high power consumption, and a fairly narrow range of capabilities. Extending the detection of potential risk objects to a search for arbitrary objects in arbitrary situations at an arbitrary place is not possible with existing software products, even with the use of a supercomputer.

Purpose. To develop a universal program for processing images from an optical source that supports streaming data analysis. The program must therefore be light and fast enough to be installed on a small computing device.

  • development of a mathematical model of the program;
  • writing the program;
  • testing the program in a laboratory experiment, including full preparation and execution of the experiment;
  • investigating whether the program can be applied in related fields.

The relevance of the program is determined by:
  • the absence on the software market of image processing programs that output a detailed analysis of the engineering components of objects;
  • steadily growing requirements for the quality and speed of obtaining visual information, which sharply increase the demand for image processing programs;
  • the existing need for programs that are high-performance, reliable, and simple from the user's point of view;
  • the high cost of professional programs for processing visual information.

Analysis of the relevance of program development.
  • there is a need for programs that combine high performance with simple operation, which is extremely hard to achieve today. As an example I took Adobe Photoshop. This graphics editor offers a harmonious combination of functionality and ease of use for the ordinary user, but it cannot work with complex image processing tools (for example, analyzing an image by constructing a mathematical dependence (function), or integral image processing);
  • the high cost of professional programs for processing visual information. If software is of high quality, its price is extremely high, and this extends even to individual functions of a given software suite. The graph below shows the price/quality relationship for simple analogues of the program.

To simplify solving problems of this type, I developed a mathematical model and wrote an image analysis program for a computing device that uses the simplest transformations of the source images.

The program works with transformations such as binarization and brightness and contrast adjustment. The principle of the program is demonstrated using the analysis of superconducting materials as an example.

When creating composite superconductors based on Nb3Sn, the volume ratio of bronze and niobium, the size and number of fibers in it, the uniformity of their distribution over the cross section of the bronze matrix, the presence of diffusion barriers and stabilizing materials are varied. For a given volume fraction of niobium in the conductor, an increase in the number of fibers leads, respectively, to a decrease in their diameter. This leads to a noticeable increase in the Nb / Cu-Sn interaction surface, which greatly accelerates the growth of the superconducting phase. Such an increase in the amount of the superconducting phase with an increase in the number of fibers in the conductor provides an increase in the critical characteristics of the superconductor. In this regard, it is necessary to have a tool to control the volume fraction of the superconducting phase in the final product (composite superconductor).

The program was designed with the importance of examining the materials used in superconducting cables in mind: if the ratio of niobium to bronze is wrong, the wires may explode, leading to human casualties, financial costs, and lost time. The program makes it possible to assess the quality of the wires based on a physicochemical analysis of the object.

Block diagram of the program
Description of the stages of the study.

Stage 1. Sample preparation: cutting a composite superconductor on an EDM machine; pressing the sample into a plastic matrix; polishing the sample to a mirror finish; etching the sample to reveal the niobium fibers against the bronze matrix. Result: pressed samples of the composite superconductor were obtained.

Stage 2. Image acquisition: obtaining metallographic images on a scanning electron microscope.

Stage 3. Image processing: creating a tool for determining the volume fraction of the superconducting phase in a metallographic image, and collecting a set of statistically significant data for a particular type of sample. Results: mathematical models of various image processing tools were created; software was developed to estimate the volume fraction of the superconducting phase; the program was made lighter by combining several mathematical functions into one; the average volume fraction of niobium fibers in the bronze matrix was found to be 24.7 ± 0.1%. The small deviation indicates high repeatability of the structure of the composite wire.
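
The measurement in the third stage can be sketched as follows. The sketch assumes that the metallographic image has already been binarized so that fiber pixels are 1 and matrix pixels are 0, and that the area fraction in the cross-section is taken as the volume fraction. The tiny images are hypothetical data, and the source does not say whether its "±" is a standard deviation or another measure, so the sample standard deviation is used here purely for illustration.

```python
from statistics import mean, stdev


def foreground_fraction(binary):
    """Return the share of foreground (1) pixels in a binary image, in percent."""
    total = sum(len(row) for row in binary)
    foreground = sum(sum(row) for row in binary)
    return 100.0 * foreground / total


def summarize(fractions):
    """Return (mean, sample standard deviation) of the per-image fractions."""
    return mean(fractions), stdev(fractions)


# Hypothetical binarized cross-sections (1 = fiber, 0 = matrix).
image_a = [[1, 0, 0, 0], [0, 1, 0, 0]]
image_b = [[1, 0, 0, 0], [0, 0, 0, 1]]
image_c = [[1, 1, 0, 0], [0, 1, 0, 0]]

fractions = [foreground_fraction(img) for img in (image_a, image_b, image_c)]
average, spread = summarize(fractions)
```

With real micrographs the same two functions would be applied to many images of one sample type to obtain a statistically meaningful mean and spread.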

Electron microscopic images of composite superconductors
Image processing methods in the program
  • Identification: an individual instance of an object belonging to some class is recognized.
  • Binarization: the process of converting a color (or grayscale) image into a two-color black-and-white image.
  • Segmentation: the process of dividing a digital image into several segments (sets of pixels, also called superpixels).
  • Erosion: a complex process in which a structuring element is passed over all pixels of the image. If at some position every 1-valued pixel of the structuring element coincides with a 1-valued pixel of the binary image, the central pixel of the structuring element is logically OR-ed with the corresponding pixel of the output image.
  • Dilation: convolution of the image, or of a selected region of it, with some kernel. The kernel can have arbitrary shape and size. A single anchor position is designated in the kernel, and it is aligned with the current pixel when the convolution is computed.
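
The erosion and dilation steps from the list above can be sketched for binary images as follows. This is an illustrative pure-Python version, not the program's own code; the helper names, the use of the element's central pixel as the anchor, and the treatment of pixels outside the image as background are assumptions.

```python
def _offsets(element):
    """Offsets of the element's 1-pixels relative to its central (anchor) pixel."""
    cy, cx = len(element) // 2, len(element[0]) // 2
    return [(ey - cy, ex - cx)
            for ey, row in enumerate(element)
            for ex, value in enumerate(row) if value == 1]


def _is_foreground(binary, y, x):
    """Pixels outside the image count as background (an assumption of this sketch)."""
    return 0 <= y < len(binary) and 0 <= x < len(binary[0]) and binary[y][x] == 1


def erode(binary, element):
    """Keep a pixel only where the element, anchored there, fits inside the foreground."""
    offsets = _offsets(element)
    return [[1 if all(_is_foreground(binary, y + dy, x + dx) for dy, dx in offsets) else 0
             for x in range(len(binary[0]))]
            for y in range(len(binary))]


def dilate(binary, element):
    """Set a pixel where the reflected element, anchored there, touches the foreground."""
    offsets = _offsets(element)
    return [[1 if any(_is_foreground(binary, y - dy, x - dx) for dy, dx in offsets) else 0
             for x in range(len(binary[0]))]
            for y in range(len(binary))]


# Hypothetical 5x5 image with a 3x3 square, and a cross-shaped structuring element.
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cross = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
eroded = erode(square, cross)
dilated = dilate(square, cross)
```

Erosion shrinks the square to its central pixel, while dilation grows it by one pixel along each side and leaves the four image corners empty.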

Formulas of the program
Binarization formula (Otsu method):
Erosion formula:
Dilation formula:
Dilation and erosion scheme
Segmentation formulas by color thresholds:
Determination of the brightness gradient module for each image pixel:
Threshold calculation:
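
For orientation, the textbook forms of several of the operations named above are given below. The notation is an assumption and may differ from the notation used in the program: $\omega_0(t)$ and $\omega_1(t)$ are the fractions of pixels below and above threshold $t$, $\mu_0(t)$ and $\mu_1(t)$ are the mean gray levels of those two classes, $B_z$ is the structuring element $B$ translated to position $z$, $\hat{B}$ is its reflection, and $I(x, y)$ is the image brightness. The color-threshold segmentation and threshold calculation formulas are specific to the program and are not reconstructed here.

```latex
% Otsu threshold: maximize the between-class variance of the gray-level histogram
t^{*} = \arg\max_{t}\; \omega_0(t)\,\omega_1(t)\,\bigl[\mu_0(t) - \mu_1(t)\bigr]^2

% Erosion and dilation of a binary image A by a structuring element B
A \ominus B = \{\, z \mid B_z \subseteq A \,\}
A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \varnothing \,\}

% Modulus of the brightness gradient at pixel (x, y)
|\nabla I(x, y)| = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}
```
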
Equipment used
  • CHMER GX-320L (CNC): machine for electrical discharge cutting of samples
  • SimpliMet 1000: hot mounting press
  • AutoMet 250 Buehler: grinding and polishing machine
  • Axio Scope A1 Carl Zeiss: optical microscope for checking the quality of polished sections
  • Hitachi TM-1000: scanning electron microscope for obtaining metallographic images

Program interface