Wuhan Univ. J. Nat. Sci.
Volume 31, Number 1, February 2026
Page(s) 79 - 90
DOI https://doi.org/10.1051/wujns/2026311079
Published online 06 March 2026

© Wuhan University 2026

Licence: Creative Commons. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

0 Introduction

China is the world's largest coal producer, and the green mining and efficient, intelligent utilization of coal resources are crucial to national development[1]. During coal mining, raw coal becomes mixed with gangue accompanying the coal seam, a solid waste with low calorific value, high water content, and high impurity content[2]. Owing to its relatively high content of sulfur and other harmful substances, gangue pollutes rivers, the atmosphere, and soil, so gangue sorting is particularly important in coal mining. At present, the primary coal gangue sorting methods employed in China are manual sorting and mechanical separation. Manual gangue sorting is plagued by low sorting efficiency, poor working conditions, high labor intensity, and high costs, while traditional mechanical separation methods generally suffer from low identification accuracy, large floor space, high investment costs, and severe environmental pollution, failing to meet the requirements of green manufacturing[3].

Coal mine intelligence is the core technical support for adapting to the modern industrial technology revolution, safeguarding national energy security, and promoting the high-quality development of the coal industry[4-6]. In recent years, in line with the trend toward automated and intelligent gangue sorting, the main non-contact gangue identification methods have included infrared thermal imaging identification[7], electromagnetic wave identification[8], reflectance spectral identification[9], ultrasonic identification[10], vibration signal identification[11], and image recognition[12]. Among these, image recognition combined with machine vision, with its advantages of high intelligence and safety reliability, has been widely applied in coal-rock interface identification, dry coal beneficiation, and conveyor belt speed regulation, driving the automation and intelligent advancement of the coal gangue separation process. In practical applications, Shang et al[13] adopted Support Vector Machines (SVM) for coal gangue image recognition and verified the feasibility of a coal gangue sorting platform. Cao's team[14] used transfer learning to extract classification information and spatial coordinates from coal gangue images, achieving an identification accuracy of 90.17%; they also investigated the impact of different belt speeds on recognition performance, providing a case study for intelligent coal mining. Hu et al[15] proposed a two-stage object detection network combining YOLOv2 and AlexNet for coal gangue image identification and realized intelligent sorting with a specially developed device, though its sorting efficiency still needs improvement. Pang et al[16] improved the SVM with the Particle Swarm Optimization (PSO) algorithm, enabling effective distinction of the characteristic information of coal and gangue. Li et al[17] designed the ResNet18-YOLOv3 algorithm and constructed an experimental platform for coal gangue sorting robots, achieving an accuracy of 82% in tests with the sorting system. However, most of these studies focus on algorithm optimization: while recognition accuracy has improved, the resulting systems remain limited by high hardware costs and bulky device form factors, which pose significant challenges to deployment on resource-constrained embedded platforms.

To address these issues, this study designs an intelligent coal-gangue sorting system based on machine vision and a convolutional neural network. Coal and gangue are placed on a conveyor belt, which transports them to the detection area. An OpenMV camera captures images to identify and locate the gangue, and the gangue location information is sent to the STM32 microcontroller via serial communication. On receiving this information, the STM32 activates the relay, the conveyor belt stops, and the robotic arm is commanded to grab the gangue and transfer it, completing the automatic sorting of coal and gangue. This machine vision-based method realizes real-time identification and classification of coal and gangue, improving sorting speed and accuracy; it not only reduces labor costs and energy consumption but also contributes to environmental protection and sustainable development.

1 Overall Design of the System

The system comprises a computer, an OpenMV camera, a robotic arm, an STM32 microcontroller, a conveyor belt, relays, and detection objects (coal and gangue). The system framework diagram is shown in Fig. 1.

Fig. 1 System frame diagram

The primary design ideas of the system are as follows. First, a simulated coal gangue sorting platform is constructed in the laboratory. Second, gangue image data are collected to produce a sample dataset, and a detection model is built with a deep learning network for target recognition and localization. Finally, the STM32 microcontroller and the relay module together control the robotic arm to grip the gangue. The system performs real-time online identification, positioning, and intelligent sorting of coal gangue based on machine vision, establishing a collaborative architecture between lightweight deep learning and embedded hardware. The model is efficiently deployed on OpenMV, achieving a coal gangue identification accuracy of 97.56%; combined with microcontroller unit (MCU) timing control, the robotic arm's grasping response time is below 0.5 s. The block diagram of the overall system implementation is shown in Fig. 2.

Fig. 2 Overall system implementation diagram

1.1 Hardware Design

1.1.1 OpenMV image processing module

This design uses the OpenMV4 H7 R2 removable camera module, a small, low-power, low-cost board that is easily applied to machine vision tasks and programmed in Python[18]. OpenMV cameras are currently used mainly for frame-differencing algorithms, color tracking, face detection, and similar tasks. Using its color tracking capability, the OpenMV camera can be programmed to identify and track gangue location information.

The sensor fitted to the OpenMV is the MT9M114, which can process 640×480 8-bit grayscale images or 640×480 16-bit RGB565 color images. The processor is the STM32H743VI ARM Cortex-M7, and the I/O ports used in this design are the full-speed USB (12 Mb/s) interface and the serial bus (TX/RX). The full-speed USB connection to the computer powers the OpenMV camera; once powered, the computer enumerates a virtual OpenMV COM port and the camera appears as a USB flash drive. The OpenMV UART3 TX line is connected to the STM32's UART2 RX line with a DuPont wire for serial communication, as in the sketch below.
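A minimal MicroPython sketch of the OpenMV side of this link is given below. The baud rate and the packet framing are illustrative assumptions; the paper does not specify them.

```python
# OpenMV side of the serial link to the STM32 (MicroPython).
from pyb import UART

uart = UART(3, 115200)  # UART3 (TX on P4, RX on P5); baud rate assumed

def send_gangue_position(cx, cy):
    # Frame each coordinate pair so the receiver can resynchronize:
    # header byte, x, y, trailer byte. One byte per coordinate is
    # enough for the 120x120 detection window used later.
    uart.write(bytes([0xAA, cx & 0xFF, cy & 0xFF, 0x55]))
```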

1.1.2 Microcontroller control module

The control module is based on the STM32F103C8T6 microcontroller as the main control chip. It integrates six servo ports with overcurrent protection for connecting the servos; a USB serial port for downloading programs, debugging code, and communicating with the host computer; a reset button for resetting the MCU; a switching power supply for powering the microcontroller; four I/O ports for communicating with the OpenMV serial port and external relays; and a voltage regulator circuit for outputting a stable pulse width modulation (PWM) waveform[19-20]. The servos operate at 4.8-7.4 V, but a fully charged 7.4 V lithium battery can exceed 8 V; powering the servos directly could burn them out, so an adjustable voltage regulator module with a maximum output of 5 A is used. The board has a compact structure that balances the competing demands of cost and functionality. The hardware structure is shown in Fig. 3.

Fig. 3 Hardware structure diagram

1.1.3 Conveyor belt module

The conveyor belt is driven by a 24 V DC gear motor rated at 600 r/min; reversing the supply polarity reverses the belt. A knob-adjustable voltage governor gives a belt speed range of 1-5 m/min. Belt start/stop is controlled by a single-channel 5 V relay, wired through a 5.5×2.1 mm DC power adapter. The relay inputs VCC, GND, and IN are connected to the 3.3 V, GND, and PB6 pins of the STM32; the COM and NC output terminals are connected to the motor's positive terminal and the power supply's positive terminal, respectively, while the negative terminals of the motor and the power supply are connected directly. Gangue and coal are conveyed along the belt. When the STM32 microcontroller receives gangue information, the relay is activated, the green indicator lights, and the conveyor belt stops. The robotic arm then grabs gangue until all gangue in the detection area has been grasped and transferred, at which point the relay power indicator lights and the conveyor belt resumes operation.

1.1.4 Robotic arm actuator module

After investigation, the six-degree-of-freedom robotic arm manufactured by Hangzhou Hailing Zhidian Technology Co., Ltd. was selected as the experimental platform. The arm mainly comprises a bracket, servos, a control board, and a power supply module, enabling a variety of operational movements. Servos Nos. 1, 3, 4, 5, and 6 are 180° servos, with servo No. 6 being a TBSN20 and the rest MG996R; servo No. 2 is a 270° TBSK20. The stall torque of the arm servos is 20 kg·cm, allowing stable grasping of objects weighing up to 500 g. Each servo works as follows: it receives the control signal, and its control circuit board drives the motor, which is geared down to rotate the output shaft. A position feedback potentiometer detects the shaft rotation and feeds the result back to the control board.

1.2 Software Design

1.2.1 Programming ideas

The system programming comprises the OpenMV part, written in Python, and the STM32 microcontroller part, written in C. The system workflow diagram is shown in Fig. 4.

Fig. 4 System workflow chart

In this design, the OpenMV4 H7 R2 smart camera is employed to identify gangue and acquire its pixel coordinates, and two approaches are available for this task. One is to use the threshold editor[21], which relies on manual parameter adjustment to distinguish gangue from coal by setting pixel intensity thresholds, offering a simple and direct solution for preliminary recognition. The other is to use the cloud-based EDGE IMPULSE platform[22] for deep learning, generating FOMO (Faster Objects, More Objects)[23] models that are then ported to the OpenMV platform for visual recognition. These lightweight target detection models are compact and fast, making them well suited to coal gangue sorting scenarios.

In the threshold editor method, the OpenMV Integrated Development Environment (IDE) is opened and the camera is initialized: the image sensor is reset and the light source kept stable, the capture color format is set to RGB565 and the resolution to Quarter Video Graphics Array (QVGA), the image is flipped so the picture appears upright, and lens distortion correction is applied. The color of the target object can then be isolated by setting a reasonable threshold in the frame buffer, after which a serial routine is written to obtain the pixel coordinates of the gangue, as in the sketch below.
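A minimal sketch of this threshold-based recognition loop follows. The LAB threshold values are placeholders to be read off the OpenMV IDE threshold editor under the actual lighting, and the blob-size limits are likewise assumptions.

```python
# Threshold-editor approach: find gangue blobs by color threshold.
import sensor

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)     # 320x240
sensor.set_vflip(True)                # flip so the picture appears upright
sensor.set_hmirror(True)
sensor.skip_frames(time=2000)         # let the sensor settle

# (L_min, L_max, A_min, A_max, B_min, B_max) -- placeholder values only;
# real values come from tuning in the IDE threshold editor.
GANGUE_THRESHOLD = (40, 70, -10, 10, -10, 10)

while True:
    img = sensor.snapshot()
    for blob in img.find_blobs([GANGUE_THRESHOLD],
                               pixels_threshold=100, area_threshold=100):
        img.draw_rectangle(blob.rect())
        img.draw_cross(blob.cx(), blob.cy())  # gangue pixel coordinates
```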

The deep learning approach differs from the threshold editor method. First, the OpenMV IDE dataset editor is used to create two classification folders, coal and gangue, and the sensor.snapshot() function is called to acquire the image data (a capture sketch is given below). The original coal and gangue image samples are then uploaded to the EDGE IMPULSE platform, where the data are split into training and test sets at an 80∶20 ratio and all samples are labeled. The dataset was built from coal and gangue samples mined at the Zhuzhuang Coal Mine in Huaibei City, Anhui Province; more than 300 coal-gangue images were collected, covering different placement positions and both natural and artificial light sources. On the experimental platform, the input image size was set to 128×128 pixels and the number of training epochs to 100, with training terminated automatically once the model reached satisfactory performance to avoid overfitting (which would otherwise drive up the loss on unseen data). The learning rate was configured as 0.001 and the batch size as 8; in addition, data augmentation was applied to the entire dataset. Feature extraction was based on the RGB color channels, and the FOMO neural network model was trained and generated from the training dataset.
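The sketch below shows what such a capture loop looks like on the OpenMV. In practice the OpenMV IDE dataset editor drives sensor.snapshot() itself; the shot count, interval, and folder name here are assumptions.

```python
# Hypothetical dataset-capture loop (the IDE normally automates this).
import sensor, time

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)          # let exposure settle

for i in range(30):                    # e.g. 30 shots per session
    img = sensor.snapshot()
    img.save("gangue/%03d.jpg" % i)    # class folder must already exist
    time.sleep_ms(500)                 # reposition samples between shots
```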

Finally, the trained FOMO neural network model was ported to the OpenMV platform, the corresponding recognition program was written, and the software was launched for coal-gangue identification. Because camera-based coal and gangue recognition is highly susceptible to lighting interference, automatic white balance and gain adjustment were disabled in the program so that the camera parameters remain consistent with the lighting conditions of the sorting environment. Horizontal and vertical flipping were enabled, and the detection window was set to 120×120 to keep the installation height, angle, and actual detection area reasonable and to enhance the stability of coal and gangue image acquisition and recognition. Considering the clamping range of the gripper, the center-point coordinates of coal and gangue are extracted as soon as a target is recognized, and the pixel coordinates of the gangue are sent to the MCU in real time so that the robotic arm can carry out coordinate conversion and trajectory planning. The OpenMV workflow diagram is shown in Fig. 5; a code sketch of this recognition loop follows the figure.

Fig. 5 OpenMV workflow chart
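The sketch below mirrors the structure of the OpenMV example script that Edge Impulse exports alongside the model. The model file name, label order, confidence threshold, and packet framing (reused from the earlier serial sketch) are assumptions, and the tf module shown here belongs to older OpenMV firmware; newer releases expose an ml module with a slightly different API.

```python
# Hedged sketch: run the exported FOMO model and stream gangue centers.
import sensor, math, tf
from pyb import UART

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_windowing((120, 120))       # detection window from the text
sensor.set_auto_whitebal(False)        # disabled per the text
sensor.set_auto_gain(False)
sensor.set_vflip(True)                 # horizontal/vertical flip
sensor.set_hmirror(True)
sensor.skip_frames(time=2000)

net = tf.load("trained.tflite", load_to_fb=True)
labels = ["background", "coal", "gangue"]   # assumed label order
uart = UART(3, 115200)
MIN_CONF = 0.5

while True:
    img = sensor.snapshot()
    # net.detect() returns one detection list per class label.
    for i, dets in enumerate(net.detect(
            img, thresholds=[(math.ceil(MIN_CONF * 255), 255)])):
        if labels[i] != "gangue":      # only gangue triggers sorting
            continue
        for d in dets:
            x, y, w, h = d.rect()
            cx, cy = x + w // 2, y + h // 2
            img.draw_cross(cx, cy)
            uart.write(bytes([0xAA, cx & 0xFF, cy & 0xFF, 0x55]))
```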

Programming for the robotic arm was implemented using Keil5 software, with the code primarily incorporating the OpenMV communication module and the servo control module. The overall implementation process is as follows: the STM32 microcontroller first receives the pixel coordinates of gangue transmitted by OpenMV, then calculates the actual spatial coordinates of the target gangue, solves for the required rotation angle of each robotic arm servo, and further conducts path planning for the robotic arm, thus enabling the robotic arm to accurately grasp and transfer the gangue.

1.2.2 Algorithm

The network selected for training the FOMO model is MobileNetV2 0.35, a lightweight convolutional neural network (CNN). MobileNetV2 0.35 is a variant of MobileNetV2 with a width multiplier of 0.35; its compact size and reduced computational demands make it well suited to efficient image recognition and classification in resource-limited environments such as mobile devices. MobileNetV2 0.35 is built primarily from depthwise separable convolutions, inverted residual structures, and linear bottlenecks. The Depthwise Separable Convolution (DSC) operation decomposes into two steps, Depthwise Convolution (DC) and Pointwise Convolution (PC)[24]: the former extracts spatial features, while the latter extracts channel features. The formula is:

$$\mathrm{DSC}=\mathrm{DC}+\mathrm{PC}\tag{1}$$

Assuming a convolution kernel of size $D_K \times D_K$, a feature map of size $D_F \times D_F$, $M$ input channels, and $N$ output channels, the computational cost of DSC is:

$$\mathrm{DSC}=D_K \times D_K \times M \times D_F \times D_F + M \times N \times D_F \times D_F\tag{2}$$
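As a quick check of Eq. (2), the snippet below compares this cost with that of a standard convolution ($D_K \times D_K \times M \times N \times D_F \times D_F$) for illustrative sizes not taken from the paper; the ratio reduces to $1/N + 1/D_K^2$.

```python
# Worked check of Eq. (2) with illustrative sizes.
DK, DF, M, N = 3, 112, 32, 64

standard = DK * DK * M * N * DF * DF              # standard convolution
dsc = DK * DK * M * DF * DF + M * N * DF * DF     # Eq. (2): DC + PC

print(dsc / standard)   # ~= 1/N + 1/DK**2 ~= 0.127, an ~8x cost saving
```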

After DC generates M equal-sized feature maps, the process of producing the output feature maps is illustrated in Fig. 6.

Fig. 6 Convolutional output feature graph implementation process

2 Key Technologies

2.1 Kinematic Analysis

Prior to executing a target action, the robotic arm must perform kinematic calculations based on the pose algorithm to derive its optimal movement path[25]. The inverse kinematic solution[26] solves for the joint rotation angles given the position of the arm's end effector; conversely, the forward kinematic equations[27] calculate the pose and position of the end effector from the known joint angles $j_1$, $j_2$, $j_3$, and $j_4$, where $j_1$ is the horizontal rotation angle of the base, $j_2$ the pitch angle of the upper arm, $j_3$ the pitch angle of the forearm, and $j_4$ the pitch angle of the wrist. Here, the six-degree-of-freedom arm is treated with the rotation angle of the end gripper ignored. The solution procedure is as follows:

$$\mathrm{len}=\sqrt{(y+p)^2+x^2},\quad \mathrm{high}=z,\quad \tan j_1=\frac{y+p}{x}\tag{3}$$

$$j_1=\arctan\left(\frac{y+p}{x}\right)\tag{4}$$

$$\mathrm{len}=A_2\sin j_2+A_3\sin(j_2+j_3)+A_4\sin(j_2+j_3+j_4)\tag{5}$$

$$\mathrm{high}=A_1+A_2\cos j_2+A_3\cos(j_2+j_3)+A_4\cos(j_2+j_3+j_4)\tag{6}$$

Let $\alpha=j_2+j_3+j_4$ and write $L=A_2\sin j_2+A_3\sin(j_2+j_3)$, $H=A_2\cos j_2+A_3\cos(j_2+j_3)$; substituting into (5) and (6) gives $L=\mathrm{len}-A_4\sin\alpha$ and $H=\mathrm{high}-A_1-A_4\cos\alpha$.

$$\cos j_3=\frac{L^2+H^2-A_2^2-A_3^2}{2A_2A_3},\quad \sin j_3=\sqrt{1-\cos^2 j_3}\tag{7}$$

$$j_3=\arctan\left(\frac{\sin j_3}{\cos j_3}\right)\tag{8}$$

Let $k_1=A_2+A_3\cos j_3$ and $k_2=A_3\sin j_3$. Solving the above equations gives:

$$\cos j_2=\frac{k_1H+k_2L}{k_1^2+k_2^2},\quad \sin j_2=\sqrt{1-\cos^2 j_2}\tag{9}$$

$$j_2=\arctan\left(\frac{\sin j_2}{\cos j_2}\right),\quad j_4=\alpha-j_2-j_3\tag{10}$$

Here, len represents the straight-line distance from the center of the arm's base to the projection of the target point on the XY plane, which is also the horizontal projection length of the arm's end effector, and high represents the vertical height of the target point, i.e., the Z-axis coordinate of the gangue center in the world coordinate system. α is the pitch angle of the end effector and is treated as a variable so that the upper and lower limits of feasible combined solutions can be found; this extends the grasping range while guaranteeing a valid solution. Figure 7 shows the model structure; a code sketch of this solution follows the figure.

Fig. 7 Mechanical arm model structure diagram
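A minimal Python sketch of Eqs. (3)-(10) is given below. The link lengths A1-A4, the base offset p, and the pitch angle alpha are placeholders (the paper does not list their numeric values), and atan2 is used in place of arctan for quadrant-safe angles.

```python
# Hedged sketch of the inverse-kinematics solution, Eqs. (3)-(10).
import math

def solve_joints(x, y, z, alpha, A1, A2, A3, A4, p):
    j1 = math.atan2(y + p, x)                  # Eq. (4)
    length = math.hypot(y + p, x)              # "len" in Eq. (3)
    high = z

    # Remove the wrist link to get the 2-link subproblem (L, H).
    L = length - A4 * math.sin(alpha)
    H = high - A1 - A4 * math.cos(alpha)

    c3 = (L * L + H * H - A2 * A2 - A3 * A3) / (2 * A2 * A3)  # Eq. (7)
    s3 = math.sqrt(max(0.0, 1 - c3 * c3))
    j3 = math.atan2(s3, c3)                    # Eq. (8)

    k1 = A2 + A3 * c3                          # Eq. (9) helpers
    k2 = A3 * s3
    c2 = (k1 * H + k2 * L) / (k1 * k1 + k2 * k2)
    s2 = math.sqrt(max(0.0, 1 - c2 * c2))
    j2 = math.atan2(s2, c2)
    j4 = alpha - j2 - j3                       # Eq. (10)
    return j1, j2, j3, j4                      # radians
```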

2.2 Camera Calibration

The gangue position detected by OpenMV is expressed in the camera's pixel coordinate system, whereas the six-degree-of-freedom robotic arm executes the sorting task in a three-dimensional spatial coordinate system. To bridge this discrepancy, the Zhang Zhengyou calibration method[28] was adopted to calibrate the camera, realizing the transformation from pixel coordinates to the robotic arm's spatial coordinate system[29-30]. In this method, a printed checkerboard pattern captured by the camera serves as the calibration plate; the checkerboard corner points are extracted to obtain their pixel coordinates (u, v), and together with the known world coordinates (X, Y, Z) of the corner points on the calibration plate, the camera's intrinsic and extrinsic parameter matrices are computed, yielding a unique camera model. The relationship between the world coordinate system and the pixel coordinate system can be represented by the following matrix equation:

$$Z_c\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=\begin{bmatrix}\dfrac{1}{dx}&0&u_0\\ 0&\dfrac{1}{dy}&v_0\\ 0&0&1\end{bmatrix}\begin{bmatrix}f&0&0&0\\ 0&f&0&0\\ 0&0&1&0\end{bmatrix}\begin{bmatrix}R&t\\ 0&1\end{bmatrix}\begin{bmatrix}X\\ Y\\ Z\\ 1\end{bmatrix}=\begin{bmatrix}f_x&0&u_0&0\\ 0&f_y&v_0&0\\ 0&0&1&0\end{bmatrix}\begin{bmatrix}R&t\\ 0&1\end{bmatrix}\begin{bmatrix}X\\ Y\\ Z\\ 1\end{bmatrix}\tag{11}$$

Here (u0, v0) are the coordinates of the image coordinate system's origin in the pixel coordinate system; dx and dy are the physical dimensions of each pixel in the x and y directions of the image plane; f is the focal length; R is the 3×3 orthogonal rotation matrix; and t is the 3D translation vector. The matrices after the last equals sign of equation (11) are, from left to right, the camera's intrinsic and extrinsic parameter matrices.
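For reference, the snippet below sketches this calibration offline with OpenCV's implementation of Zhang's method; the board dimensions, square size, and image paths are assumptions.

```python
# Offline calibration sketch (OpenCV's implementation of Zhang's method).
import glob
import cv2
import numpy as np

BOARD = (9, 6)          # inner corners per row/column (assumed)
SQUARE = 25.0           # square edge length in mm (assumed)

# 3D corner positions in the board's own frame (Z = 0 plane).
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_pts, img_pts = [], []
for path in glob.glob("calib/*.jpg"):      # at least one image required
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_pts.append(objp)
        img_pts.append(corners)

# K is the intrinsic matrix of Eq. (11); rvecs/tvecs hold R and t per view.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("reprojection RMS:", rms, "\nK =\n", K)
```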

3 Debugging Process and Results

3.1 Model Training and Analysis

Feature Explorer visualization from the training platform is presented in Fig. 8, where blue denotes coal and orange denotes gangue. The feature distributions of coal and gangue exhibit a distinct clustering tendency, with the features of most sample points being effectively distinguishable. Specifically, coal and gangue scatter points in the upper-left and center-left regions are clearly separated, indicating that the model can capture the characteristic differences of these samples. In the lower-right region, the two types of samples also form relatively independent clusters, with only a small number of sample points overlapping. These results demonstrate two key points: first, the model can extract fundamental visual features (e.g., texture and grayscale) from standard RGB images to learn the interclass decision boundary between coal and gangue; second, the feature vectors extracted by the MobileNetV2 0.35 network can effectively map coal and gangue to distinct feature spaces.

Thumbnail: Fig. 8 Refer to the following caption and surrounding text. Fig. 8 Spatial distribution and visualization of coal and gangue sample characteristics

Model evaluation metrics quantify the performance of machine learning or deep learning models. Precision (P) indicates how trustworthy the model's positive predictions are. Recall (R) measures the model's ability to find the positive-class samples. The F1 score is the harmonic mean of precision and recall, and accuracy is the proportion of correctly predicted samples among all samples. The formulas are as follows:

$$\mathrm{Precision}=\frac{TP}{TP+FP}\tag{13}$$

$$\mathrm{Recall}=\frac{TP}{TP+FN}\tag{14}$$

$$F1=\frac{2P\times R}{P+R}\tag{15}$$

$$\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}\tag{16}$$

where TP is the number of coal or gangue targets correctly identified, FN is the number of targets missed, TN is the number of non-target samples correctly rejected, and FP is the number of non-target samples misclassified as coal or gangue.
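A small helper matching Eqs. (13)-(16); the example counts are illustrative, not taken from the paper's confusion matrix.

```python
# Metric helper for Eqs. (13)-(16).
def metrics(tp, fp, fn, tn):
    precision = tp / (tp + fp)                           # Eq. (13)
    recall = tp / (tp + fn)                              # Eq. (14)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. (15)
    accuracy = (tp + tn) / (tp + fp + fn + tn)           # Eq. (16)
    return precision, recall, f1, accuracy

print(metrics(tp=42, fp=1, fn=4, tn=45))   # illustrative counts
```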

The results of training and testing on the cloud-based EDGE IMPULSE platform are presented in Figs. 9-10. The model's F1 score was 95.5%, its recall 91.30%, and its test accuracy 97.56%.

Fig. 9 EDGE IMPULSE platform training result

Fig. 10 EDGE IMPULSE platform test result

Prior to deploying the model to the target hardware, its validity must be verified on the EDGE IMPULSE platform to confirm that the target processor can meet the model's predicted performance. The weights are stored as float32 during training; INT8 quantization can prune redundant information and reduce memory overhead. As shown in Table 1, the quantized model achieves the same accuracy as the unquantized model while using 17% less RAM and 29% less ROM. For the quantized model, the end-to-end image preprocessing and inference latency is 7 ms; the predicted inference latency on the target processor is 1 715 ms, with a RAM footprint of 409 KB and a flash footprint of 78.4 KB. OpenMV carries a Cortex-M7 core with 1 MB of RAM and 2 MB of flash, which indicates that the coal-gangue recognition task can run efficiently and stably on this platform.
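Edge Impulse applies this post-training INT8 quantization automatically; the sketch below only illustrates the equivalent flow with the plain TensorFlow Lite converter, with the model path and calibration data as stand-ins.

```python
# Post-training INT8 quantization sketch (illustrative, not the platform's code).
import tensorflow as tf
import numpy as np

def representative_data():
    # A real calibration set would iterate over training images resized
    # to the 128x128 model input; random data is only a stand-in here.
    for _ in range(100):
        yield [np.random.rand(1, 128, 128, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("fomo_model")  # assumed path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

open("trained.tflite", "wb").write(converter.convert())
```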

The platform's real-time classification function visualizes the coal and gangue classification results. In the single-target case shown in Fig. 11, the F1-scores of coal and gangue both reach 100%, their bounding boxes and center coordinates are identified accurately, and the confidence scores are 0.89 and 0.95, respectively. Figure 12 shows multi-target classification: coal and gangue are correctly identified, the bounding boxes do not overlap, the center coordinates are accurate, and there are no missed or false detections.

Fig. 11 Single-target classification visualization

Fig. 12 Multi-objective classification visualization

To verify OpenMV's suitability for the practical visual inspection requirements of coal-gangue sorting, validation tests covered three aspects: model size, inference speed, and real-time imaging performance. Because the MobileNetV2 0.35 network was adopted, the trained FOMO model was quantized to INT8 precision, allowing it to be deployed on the target device in .tflite format and automatically converted to OpenMV's dedicated format. The final model file is merely 56 KB, far below OpenMV's built-in storage limit, so the model loads and runs quickly without straining the hardware or wasting general-purpose computing resources. Although the platform predicted an inference latency of 1 715 ms on the target processor, actual tests show an overall system frame rate of 38.5 frames/s, i.e., a single coal-gangue image frame is acquired and processed every 26 ms. These results demonstrate that OpenMV fully meets the real-time performance and hardware compatibility requirements for visual inspection of coal and gangue.

Table 1 EON™ compiler performance comparison (same accuracy, 17% less RAM, 29% less ROM)

3.2 System Hardware Connection and Performance Test Verification

In constructing the hardware platform, the OpenMV camera was fixed at a height of 51 cm and a distance of 53 cm from the iron frame table, the robotic arm was positioned 10 cm from the table, and the conveyor belt was placed in a suitable position. First, the hardware connections must be complete so that the STM32 and OpenMV can communicate effectively. Second, with the light source kept stable, the OpenMV IDE runs the recognition program to identify the gangue and obtain its pixel coordinates. The master module then receives the position information over the serial link and converts the pixel coordinates to world coordinates (a simple planar-mapping sketch follows the figures). Finally, the actuator performs the position solution and trajectory planning to grip and transfer the gangue. A physical diagram of the system is shown in Figs. 13-14.

Fig. 13 Overall physical diagram

Fig. 14 OpenMV recognition diagram
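Because the belt surface is a fixed plane, one simple way to realize this pixel-to-world conversion, shown here as an alternative sketch rather than the paper's exact procedure, is a four-point homography between pixel positions and measured belt coordinates; all reference points below are placeholders measured once during setup.

```python
# Hypothetical pixel-to-belt-plane mapping via a 4-point homography.
import numpy as np
import cv2

pix = np.float32([[10, 10], [110, 10], [110, 110], [10, 110]])     # pixels
world = np.float32([[0, 0], [200, 0], [200, 200], [0, 200]])       # mm
H = cv2.getPerspectiveTransform(pix, world)

def pixel_to_world(u, v):
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]   # (X, Y) on the belt plane in mm

print(pixel_to_world(60, 60))         # ~ (100.0, 100.0) for these points
```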

In actual testing, the robotic arm is a laboratory-grade device; to suit its load capacity, each selected coal or gangue block weighs 200-400 g, and setting the conveyor belt speed to 2-3 m/min reduces missed and mistaken picks. After system commissioning, the communication success rate between OpenMV and the MCU was measured at 100%, ensuring reliable and stable transmission of the gangue coordinate information. The positioning error of the robotic arm during gangue grasping lies in the range of 9-12.5 mm, and the response time from receiving gangue position information to initiating the grasp-and-transfer action is at most 0.5 s. These performance metrics fully meet the requirements for dynamic, small-batch coal-gangue sorting in laboratory tests.

The robotic arm's actual gripping success rate was then tested; the results are shown in Table 2. Most gangue is gripped successfully, with an average success rate of 90.13%, meeting the overall design requirements. The results show that the core logic, OpenMV recognition combined with STM32 control of the robotic arm, is feasible in practice, and they provide technical support for subsequent upgrades to industrial-grade robotic arms, conveyor belts, and other hardware for industrial coal-gangue sorting scenarios.

Table 2 Test results

4 Conclusion

Aiming at the requirements for lightweight design and high real-time performance in coal-gangue sorting, this study deeply integrates OpenMV with STM32 and proposes an intelligent coal-gangue sorting system, detailing the functions and parameters of the hardware involved. Based on a self-constructed dataset, the MobileNetV2 0.35 network is trained and analyzed, enabling efficient deployment of the model on OpenMV for accurate coal and gangue recognition. The control board drives the robotic arm's servos via PWM signals, allowing the arm to accurately locate and grasp targets; experimental tests verify the feasibility and effectiveness of the proposed system, which stably implements the entire identification, grasping, and sorting process in real time. This approach overcomes the limitations of traditional manual sorting and realizes an intelligent, high-accuracy, low-cost coal-gangue sorting method, providing an economical technical solution for small and medium-scale industrial applications. However, the current design is susceptible to external interference such as lighting conditions and to inherent limitations of the hardware itself. Future research will therefore focus on two key directions: first, adapting the system to diverse working environments and reducing positioning errors by improving the recognition algorithms' adaptability to complex scenarios; second, further optimizing the robotic arm control system to improve the response speed and execution accuracy of control commands. These improvements are expected to help the system adapt to industrial-grade coal-gangue sorting scenarios.

References

1. Wang G F, Liu F, Pang Y H, et al. Coal mine intellectualization: The core technology of high quality development[J]. Journal of China Coal Society, 2019, 44(2): 349-357(Ch).
2. Wang X W, Liu S G, Wang X S, et al. XR Intelligent operation and maintenance system of coal mine for multi-man-machine complex cooperative task[J]. Journal of China Coal Society, 2019, 49(4): 2124-2140(Ch).
3. Gao X L. Status and development of intelligent drilling tools for coal mine[J]. Coal Geology & Exploration, 2023, 51(10): 156-166(Ch).
4. He S R, Yang J K. Practice the concept of green development and build an ecological green mine[J]. China Cement, 2022, 8: 98-101(Ch).
5. Liang W G, Guo F Q, Yu Y J, et al. Research progress on in-situ intelligent sorting and filling technology of coal gangue underground[J]. Coal Science and Technology, 2024, 52(4): 12-27(Ch).
6. Miao J. Research and application of intelligent sorting robot for coal gangue[J]. Intelligent Mine, 2023, 4(1): 58-62(Ch).
7. Eshaq R M A, Hu E Y, Qaid H A A M, et al. Using deep convolutional neural networks and infrared thermography to identify coal quality and gangue[J]. IEEE Access, 2021, 9: 147315-147327.
8. Liu Y, Si L, Wang Z B, et al. Research progress on coal rock recognition technology based on electromagnetic waves[J]. Journal of Mine Automation, 2024, 50(1): 65(Ch).
9. Li L J, Fan S X, Wang X W, et al. Classification method of coal and gangue based on hyperspectral imaging technology[J]. Spectroscopy and Spectral Analysis, 2022, 42(4): 1250-1256(Ch).
10. Zhang Q, Zhang R X, Liu J M, et al. Review on coal and rock identification technology for intelligent mining in coal mines[J]. Coal Science and Technology, 2022, 50(2): 1-26(Ch).
11. Liu W. Application of Hilbert-Huang transform to vibration signal analysis of coal and gangue[J]. Applied Mechanics and Materials, 2010, 40/41: 995-999.
12. Tripathy D P, Guru R R K. Novel methods for separation of gangue from limestone and coal using multispectral and joint color-texture features[J]. Journal of the Institution of Engineers (India): Series D, 2017, 98(1): 109-117.
13. Shang D Y, Huang Y S, Zhang T Y, et al. Design of experimental platform for coal gangue sorting delta robot[J]. Coal Technology, 2023, 42(7): 136-139(Ch).
14. Cao X G, Liu S Y, Wang P, et al. Research on coal gangue identification and positioning system based on coal-gangue sorting robot[J]. Coal Science and Technology, 2022, 50(1): 237-246(Ch).
15. Hu P, Li X, Lyu C H, et al. Development of coal gangue separation device based on machine vision for mines[J]. Coal Mine Machinery, 2022, 43(12): 211-213(Ch).
16. Pang S Z, Li B, Wang X W, et al. Design and experimental research of coal and gangue recognition system based on machine vision[J]. Coal Engineering, 2021, 53(2): 141-146(Ch).
17. Li S X, Li Y N, Wang Z J, et al. Experimental platform for coal gangue sorting robot based on image detection[J]. Journal of Mine Automation, 2023, 49(7): 107-113(Ch).
18. Yang Y K, Wang J W, Sun Y Z. Experimental research on object recognition and tracking based on OpenMV vision technology[J]. Control and Instruments in Chemical Industry, 2023, 50(4): 569-572(Ch).
19. Li H, Zhang Y C, Miao X T. Development of OpenMV intelligent material handling direction[J]. Electronic Technology & Software Engineering, 2022, 11(3): 107-112(Ch).
20. Xu Y J, Wang Z, Zou J M, et al. Design and implementation of motion target control and automatic tracking system based on OpenMV camera[J]. Modern Electronic Technology, 2024, 47(17): 166-172(Ch).
21. Zhou W, Yang A S. Research on target recognition system for self-picking orchard fruit basket based on OpenMV[J]. Agricultural Technology & Equipment, 2023, 10: 63-65(Ch).
22. Wei M C, Li Y, Mao Y L, et al. Design and implementation of garbage sorting robot based on machine vision[J]. Internal Combustion Engine & Parts, 2023, 23: 42-44(Ch).
23. Boyle L, Moosmann J, Baumann N, et al. DSORT-MCU: Detecting small objects in real-time on microcontroller units[EB/OL]. [2025-03-25]. arXiv:2410.16769.
24. Wang Z P, Wu Z X, You M L, et al. Color grading detection of cotton based on improved MobileNetV2[J]. Cotton Textile Technology, 2019, 52(6): 15-21(Ch).
25. Gao L, Liang Z H, Dong H J, et al. Research progress on motion planning technology of intelligent sorting robots for coal gangue[J]. Science Technology and Engineering, 2024, 24(16): 6567-6575(Ch).
26. Sun L X, Geng Q L, Tang J H, et al. Research on inverse kinematics of redundant manipulator with joint limitation[J]. Modern Manufacturing Engineering, 2022, 8: 46-52(Ch).
27. Zhang Y H, Pang C H, Song Z H, et al. Research on control method of five degree of freedom manipulator based on OpenMV[J]. Machine Tool & Hydraulics, 2024, 52(5): 1-7(Ch).
28. Zhang Z. A flexible new technique for camera calibration[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(11): 1330-1334.
29. Huang G. Binocular vision system realizes the real-time tracking of badminton[J]. Journal of Electronic Measurement and Instrumentation, 2021, 35(6): 117-123(Ch).
30. Yuan B, Wang H, Li R H, et al. Research on part recognition and grasping method of vision robot[J]. Machinery Design & Manufacture, 2024, 2: 309-313, 318(Ch).
