Student ID: 2101110234.
With the continuous development of artificial intelligence, the field of intelligent robotics has advanced rapidly. In particular, since deep neural networks were widely applied to vision systems, many notable achievements have been made. The vision system plays a vital role in autonomous mobile robots, and image segmentation is a core technology within that system. Traditional image segmentation techniques can already separate the foreground and background of an image at a basic level, but in recent years, with the development of deep learning, researchers have applied it to image segmentation and proposed many segmentation networks that achieve good results. Building on image segmentation, researchers have further attached semantic categories and labels to the segmented regions, which is what we now call semantic segmentation. After introducing semantic segmentation, this paper introduces newer segmentation tasks: instance segmentation and panoptic segmentation. It also introduces the recently popular topic of semantic segmentation of three-dimensional point clouds and explains why it is necessary.
Keywords: intelligent robot, image segmentation, semantic segmentation, computer vision.
I. Introduction
Computer vision is the discipline of using computers to simulate the working principles of human vision, acquiring images and completing a series of image information processing tasks. It is an application of machine learning in the visual domain and an interdisciplinary research field involving mathematics, physics, biology, computer engineering and other disciplines.
The main applications of computer vision include unmanned driving, face recognition, unmanned security, license plate recognition, intelligent image transmission, 3D reconstruction, VR/AR, intelligent photography, medical image processing, unmanned aerial vehicles, industrial inspection and so on. Unmanned driving, also known as autonomous driving, is currently an important research direction in artificial intelligence; it allows cars to drive independently or assists drivers in order to improve driving safety. Face recognition technology is now mature and has been applied in many settings, with accuracy exceeding that of the human eye. Security has always been an issue that our country attaches great importance to and that people pay special attention to: many important locations have arranged patrols, and residential quarters and companies have also arranged security patrols to ensure safety. License plate recognition is widely used for violation detection, traffic flow analysis, seat belt detection, intelligent traffic lights and vehicle identification in parking lots. 3D reconstruction has long been used in industry, where it can model three-dimensional objects, conveniently measure their parameters, or simply replicate them. Computer vision has many applications, including in industry and robotics, and as the technology develops its application fields will continue to grow.
The traditional image segmentation process can usually be divided into five steps: feature perception, image preprocessing, feature extraction, feature screening, and inference-based prediction and recognition. Research shows that in the early stages of vision research, people did not pay enough attention to the features in images. Moreover, the traditional pipeline separates feature extraction from classification and only merges them when results must be output, so the difficulty of implementing it well is easy to imagine.
After the advent of deep learning, convolutional neural networks have been widely used in computer vision, and many research directions have been derived from them. Deep learning mainly compares features. In face recognition, for example, a convolutional neural network extracts the features of two faces at different positions and compares them with each other to obtain the final result. The main research directions of computer vision at present include image classification, object detection, image segmentation, object tracking, image filtering and denoising, image enhancement, stylization, three-dimensional reconstruction, image retrieval, GANs and so on. This paper briefly surveys the field of image segmentation.
Image segmentation is an important research direction in computer vision and an important part of image semantic understanding. It refers to the process of dividing an image into several regions with similar attributes; from a mathematical point of view, it partitions the image into disjoint regions. In recent years, with the continued progress of deep learning, image segmentation has developed by leaps and bounds, and related technologies such as scene object segmentation, human-background segmentation, face and body parsing, and 3D reconstruction have been widely applied in unmanned driving, augmented reality, security monitoring and other industries.
II. Development Status
In recent years, many scholars have applied image segmentation to the control of mobile robots. While the robot moves, this technology can localize, build maps and separate different foregrounds and backgrounds, so that the images scanned by the vision system carry semantic information. Some scholars pursue still more precise segmentation that not only distinguishes different categories of objects but also separates different instances of the same category, and even segments the background on this basis. Because the world we live in is three-dimensional, some scholars reconstruct the image scene in three dimensions and then segment the entire 3D scene with related methods. As a classic problem in computer vision research, image segmentation has attracted more and more attention.
The first group is the traditional image segmentation methods. In traditional segmentation, people use knowledge from digital image processing, topology and mathematics to segment images. Although computing power keeps increasing and deep learning keeps advancing, and some traditional segmentation methods are no longer as effective as deep learning, many of their ideas are still worth learning from.
The first method is threshold-based image segmentation. Its core idea is to choose one or more gray thresholds according to the gray-level characteristics of the image and compare each pixel in the image against the threshold as a reference value. This comparison naturally yields two kinds of results: the set of pixels whose gray value is greater than the threshold, and the set of pixels whose gray value is less than or equal to it, thereby segmenting the image. It is therefore not difficult to see that the most critical step of this method is obtaining the best gray threshold according to some criterion function, so as to obtain an appropriate classification. It is worth mentioning that if the target to be segmented and the background occupy clearly different gray ranges, this method gives good results. If a single threshold suffices to process an image, the procedure is called single-threshold segmentation; if the image contains more than one target to extract, a single threshold cannot separate them all, and multiple thresholds must be chosen, a process called multi-threshold segmentation. Generally speaking, threshold segmentation is simple to compute and highly efficient. However, it considers only the gray value of individual pixels and completely ignores spatial characteristics, which makes it sensitive to noise and not robust.
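The single-threshold case described above can be sketched in a few lines of numpy; the toy image and the threshold value 128 here are purely illustrative:

```python
import numpy as np

def threshold_segment(image, t):
    """Split a grayscale image into foreground (1) and background (0)
    by comparing every pixel against the gray threshold t."""
    return (image > t).astype(np.uint8)

# Toy 4x4 grayscale image: a bright object on a dark background.
img = np.array([
    [ 10,  12,  11,  10],
    [ 12, 200, 210,  11],
    [ 11, 205, 198,  12],
    [ 10,  11,  12,  10],
], dtype=np.uint8)

mask = threshold_segment(img, 128)
print(mask)
```

Multi-threshold segmentation works the same way with several cut points (e.g. `np.digitize`); the hard part, as noted above, is choosing the thresholds by a criterion function rather than by hand.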
The second method is region-based image segmentation, which has two basic forms. One is region growing: starting from individual pixels, similar neighbouring pixels are gradually merged until the required region is obtained. The other starts from the whole image and cuts it down, bit by bit, to the required regions. Region growing means giving a group of seed pixels, each representing a different growth region, and then gradually merging qualifying pixels in the neighbourhood of each seed; newly added pixels are in turn treated as seed pixels.
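A minimal sketch of region growing from a single seed, assuming 4-connectivity and a simple similarity rule (absolute gray difference from the seed within a tolerance); the image and tolerance are illustrative:

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol):
    """Grow a region from `seed`, merging 4-neighbours whose gray value
    differs from the seed's value by at most `tol`. Newly accepted
    pixels are queued and treated as seeds in turn."""
    h, w = image.shape
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    base = int(image[seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not region[ny, nx]
                    and abs(int(image[ny, nx]) - base) <= tol):
                region[ny, nx] = True   # pixel joins the region
                queue.append((ny, nx))
    return region

img = np.array([
    [200, 205,  10,  12],
    [198, 202,  11,  10],
    [ 12,  10, 199, 201],
], dtype=np.uint8)

mask = region_grow(img, (0, 0), tol=20)
print(mask.astype(int))
```

Note that the bright patch in the lower-right corner is *not* picked up even though its gray values are similar to the seed's, because it is not connected to the seed region; this is the key difference from pure thresholding.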
The split-and-merge segmentation process can be regarded as the inverse of region growing. Starting from the image as a whole, it repeatedly splits the image to obtain sub-regions and then extracts the targets; in the process, foreground regions that belong together need to be merged.
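The splitting half of this idea is often realized as a quadtree: a square block is kept if it is homogeneous, otherwise it is cut into four quadrants and each is examined recursively. A minimal sketch (homogeneity measured by gray-value range; the merge step is omitted; image and tolerance are illustrative):

```python
import numpy as np

def split(image, y, x, size, max_range, regions):
    """Recursively split the square block at (y, x) of side `size` until
    its gray range is <= max_range, collecting homogeneous leaf blocks
    as (y, x, size) tuples in `regions`."""
    block = image[y:y + size, x:x + size]
    if size == 1 or int(block.max()) - int(block.min()) <= max_range:
        regions.append((y, x, size))
        return
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            split(image, y + dy, x + dx, half, max_range, regions)

# 4x4 image: bright top-left quadrant, dark everywhere else.
img = np.array([
    [200, 201,  10,  11],
    [199, 200,  12,  10],
    [ 10,  11,  10,  12],
    [ 12,  10,  11,  10],
], dtype=np.uint8)

regions = []
split(img, 0, 0, 4, max_range=5, regions=regions)
print(regions)
```

The whole image is inhomogeneous (range 191), so it splits once into four 2x2 blocks, each of which is homogeneous; a subsequent merge pass would then join the three dark quadrants into one background region.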
Region-based methods also include the watershed algorithm. Inspired by the formation of watersheds in geography, this method regards the image as a topographic surface, where the elevation at each pixel is represented by the gray value of that point. The formation of watersheds can be simulated by an immersion process: a small hole is pierced at each local minimum of the surface, and the model is slowly immersed in water; as the water rises, watershed lines form where the flooded basins meet.
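A simplified one-dimensional analogue of the catchment-basin idea can be written by steepest descent instead of a full immersion simulation: each position is labeled with the local minimum it would drain to. This is an illustrative sketch, not the full 2-D immersion algorithm, and ridge points here fall to the left basin only because of the tie-breaking order:

```python
import numpy as np

def watershed_1d(elevation):
    """Assign every position to the basin of the local minimum reached
    by steepest descent -- a toy 1-D analogue of the immersion idea."""
    n = len(elevation)
    labels = [None] * n
    for i in range(n):
        j = i
        while True:
            lower = j
            if j > 0 and elevation[j - 1] < elevation[lower]:
                lower = j - 1
            if j < n - 1 and elevation[j + 1] < elevation[lower]:
                lower = j + 1
            if lower == j:       # reached a basin bottom (local minimum)
                break
            j = lower
        labels[i] = j            # label = index of the basin's minimum
    return labels

# Two valleys (bottoms at indices 1 and 5) separated by a ridge at index 3.
profile = np.array([3, 1, 2, 5, 2, 0, 4])
print(watershed_1d(profile))
```

The output splits the profile into two basins at the ridge, which is exactly where a watershed line would form when the two flooded valleys meet.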
The third method is segmentation based on edge detection, one of the earliest approaches studied. The idea is to segment an image by detecting the edges of different objects. If the image is transformed from the spatial domain to the frequency domain, object edges correspond to the high-frequency components, so the edge information is easy to find and the segmentation problem becomes easier. Edge detection can locate boundaries quickly and accurately, but it cannot guarantee the continuity and closure of the edges; when an image contains too much detail, many thin edges appear around boundaries, and forming complete segmentation regions becomes difficult.
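A common concrete instance of edge detection is the Sobel operator: two 3x3 kernels approximate the horizontal and vertical gray-level gradient, and pixels with large gradient magnitude are marked as edges. A minimal, deliberately unoptimized numpy sketch (the image and threshold are illustrative, and border pixels are simply skipped):

```python
import numpy as np

def sobel_edges(image, thresh):
    """Approximate the gradient magnitude with 3x3 Sobel kernels and mark
    pixels whose magnitude exceeds `thresh` as edges (borders ignored)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T                     # vertical-gradient kernel
    h, w = image.shape
    edges = np.zeros((h, w), dtype=bool)
    img = image.astype(float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = img[y - 1:y + 2, x - 1:x + 2]
            gx = (patch * kx).sum()
            gy = (patch * ky).sum()
            edges[y, x] = np.hypot(gx, gy) > thresh
    return edges

# Vertical step edge between columns 1 and 2.
img = np.array([
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
], dtype=np.uint8)

print(sobel_edges(img, thresh=100).astype(int))
```

The detected edge is the band of high-gradient pixels straddling the step, which illustrates the weakness noted above: the operator finds where edges are, but nothing forces them to form a closed region boundary.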
The fourth kind of image segmentation method combines specific tools, namely various image processing tools and algorithms. As image segmentation research deepened, many scholars began applying such tools and algorithms to this task, with good results. The wavelet transform plays an important role in digital image processing, as it can study a signal in the time and frequency domains jointly; in image edge detection in particular, it can detect local singularities of a function of two variables. Another direction is image segmentation based on genetic algorithms, which borrow the random search idea of natural selection and genetic mechanisms in biology, simulating the evolution of a population controlled by gene sequences. Genetic algorithms are good at global search but weak at local search; applying them to image processing is a hot research topic, mainly because they offer fast randomized search whose ability is largely independent of the problem domain.
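One way a genetic algorithm can serve segmentation is to search for a gray threshold that maximizes a fitness criterion; here, as an illustrative choice, Otsu's between-class variance. This is a toy sketch, not a production GA: the population size, mutation step, number of generations and the synthetic bimodal image are all assumptions, and crossover is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def between_class_variance(image, t):
    """Otsu's criterion: a good threshold separates two classes whose
    weighted between-class variance is large."""
    fg = image[image > t]
    bg = image[image <= t]
    if fg.size == 0 or bg.size == 0:
        return 0.0
    w1, w0 = fg.size / image.size, bg.size / image.size
    return w0 * w1 * (fg.mean() - bg.mean()) ** 2

def ga_threshold(image, pop_size=20, generations=30):
    """Toy genetic algorithm over 8-bit thresholds: keep the fitter half
    of the population each generation and mutate copies of the survivors."""
    pop = rng.integers(0, 256, size=pop_size)
    for _ in range(generations):
        fitness = np.array([between_class_variance(image, t) for t in pop])
        survivors = pop[np.argsort(fitness)[pop_size // 2:]]
        children = np.clip(survivors + rng.integers(-10, 11, size=survivors.size), 0, 255)
        pop = np.concatenate([survivors, children])
    fitness = np.array([between_class_variance(image, t) for t in pop])
    return int(pop[np.argmax(fitness)])

# Synthetic bimodal "image": a dark class around 20 and a bright class around 200.
img = np.concatenate([rng.normal(20, 5, 500), rng.normal(200, 5, 500)]).clip(0, 255)
t = ga_threshold(img)
print(t)
```

Because the fitness is computed on the whole population at once, the random search explores the full 0-255 range globally; the small mutation step then supplies the (weak) local refinement mentioned above.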
In addition, there are segmentation methods based on active contour models, which have a unified and open description form and provide an ideal framework for research and innovation in image segmentation. This is also an edge-based approach: it mainly uses curve evolution to detect the target in a given image.