The supporting data for this unstacking tutorial, unstacking_runtime_en.zip, is available for download from the RVS developer community. After decompression, its contents should include:

image-20231117135303531

  • data: Contains the robot model used in this case. If you need other models, download them from the RVS model distribution website.

  • HandEyeCalibration: Used for hand-eye calibration and includes the following files:

    • TCP_test.xml: Used for communication testing.

    • Camera_save.xml: Used for recording calibration board image data. (Note the data saving path of the SaveData group.)

    • HandEyeCalibration.xml: Used for data recording and matrix calculation in hand-eye calibration.

  • handeye_data: A set of standard hand-eye calibration data recordings; you can load this path directly for calculation.

  • MaskRCNN: Used for the AI part of this case.

    • train_data: Contains the images recorded for AI training, the trained .pth weights, and the annotations.json file. Pay attention to the paths when loading these files directly.

    • ty_ai_savedata.xml: Used for AI image recording.

    • ty_ai_train.xml: Used for AI training.

  • RVSCommonGroup: Provides commonly used Groups, including data saving/loading and matrix conversion, which can be loaded as required.

  • unstacking_data: Offline data for running the unstacking project in offline mode. Make sure the path is correct when loading it.

  • ty_color/depth_calib.txt: Calibration parameters loaded by the ProjectMask node in the AI unstacking project. Update the contents of these files when using a different camera.

  • unstacking.xml: The main unstacking project. When loading offline data, update the offline data path in the LoadLocalData Group, click Restart, and then click LocalCapture to iterate over the offline data.

The whole unstacking process consists of two modules: hand-eye calibration and the main unstacking process. All files used in this case are saved under unstacking_runtime and run under the projects directory of the RVS installation directory.

If you do not want to adjust the file paths in the XML files, you can rename unstacking_runtime to runtime.

Note: The RVS software version used in this document is 1.5.174. If you encounter any problems or version incompatibilities, please contact us at rvs-support@percipio.xyz.

Hand-eye calibration

Introduction to hand-eye calibration

Hand-eye calibration calculates the coordinate transformation matrix between the camera and the robot after the camera is installed. Its purpose is to convert point clouds captured in the camera coordinate system into the robot coordinate system using this matrix. Depending on how the camera is mounted, there are two types of hand-eye calibration:

  • HandInEye: The camera is mounted on the robot arm, the calibration board is fixed in place, and the arm-mounted camera captures the board from different poses.

  • HandToEye: The camera is fixed (e.g. ceiling-mounted), the calibration board is mounted on the robot arm, and the arm changes pose so that the camera captures the board in different attitudes.
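
In both cases the calibration result is a 4×4 homogeneous transformation matrix. The sketch below (plain NumPy, not RVS API; the matrix names are illustrative only) shows how a point measured in the camera frame is chained into the robot base frame for the two mounting cases:

    import numpy as np

    def point_to_robot_hand_in_eye(p_cam, T_base_tcp, T_tcp_cam):
        """HandInEye: the camera rides on the arm. T_tcp_cam is the calibration result,
        T_base_tcp is the robot flange pose read at capture time."""
        p = np.append(p_cam, 1.0)                  # homogeneous point in the camera frame
        return (T_base_tcp @ T_tcp_cam @ p)[:3]    # same point in the robot base frame

    def point_to_robot_hand_to_eye(p_cam, T_base_cam):
        """HandToEye: the camera is fixed. T_base_cam itself is the calibration result."""
        p = np.append(p_cam, 1.0)
        return (T_base_cam @ p)[:3]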

This part takes HandInEye as an example to introduce the process of hand-eye calibration.

  • Communication test: Ensures that the RVS software starts properly and establishes communication with the robot, in preparation for data recording.

  • Data recording: Saves multiple sets of RGB images and point clouds of the calibration board captured by the camera, together with the robot's current TCP and JOINTS data at each capture.

  • Calibration calculation: Computes the calibration result from the recorded data.

Communication test

The purpose of the communication test is to ensure that the RVS software starts properly and establishes normal communication with the robot, in preparation for data recording. The communication nodes start a TCP server on the RVS side and wait for the robot to connect and exchange TCP messages. For the specific parameters, please refer to the introduction of the communication nodes.

Operation process:

Step 1: Create TCP_test.xml

Note: You can directly use TCP_test.xml in the unstacking_runtime/HandEyeCalibration/ path to perform communication tests, or you can create TCP_test.xml by following the steps below.

  1. Create the XML. Create a new project under the unstacking_runtime path, save it to unstacking_runtime/HandEyeCalibration/, and name it TCP_test.xml.

  2. Add nodes and form a Group. Drag the Trigger and HandEyeTCPServer nodes into the Node Graph and combine the two nodes into a Group.

  3. Set the node parameters.

    • Set Trigger node parameters: type → InitTrigger

    • Set HandEyeTCPServer node parameters:

      • server_mode → Once

      • delimiter → RT

      • command_delimiter → Space

      Note: Please set according to the ending character sent by the robot.

  4. Connect the nodes.

Connect the ROBOT_TCP, ROBOT_JOINTS, and CAPTURE ports on the right side of the HandEyeTCPServer node to MirrorOutput, then connect from MirrorInput back to the corresponding ports on the left side of the HandEyeTCPServer node. This forms a loop so that the received command can be echoed back to the client. The tcp port is also connected to the MirrorOutput. The node graph is connected as follows.

image-20231116185458846

The full XML is provided at unstacking_runtime/HandEyeCalibration/TCP_test.xml.

Step 2: Conduct a communication test (offline)

After the project is started, you can test it locally. Open a terminal and enter:

  • echo "ROBOT_TCP 0 0 0 0 0 0 " | nc localhost 2013

  • echo "ROBOT_JOINTS 0 0 0 0 0 0 " | nc localhost 2013

  • echo "CAPTURE " | nc localhost 2013

If it runs successfully, the ROBOT_TCP, ROBOT_JOINTS, and CAPTURE strings will be echoed back in the terminal, as shown in the picture:

test1

The following picture is displayed in the RVS log view:

image-20231117140453080

Step 3: Conduct a communication test (online)

Ask the robot engineer to help:

  1. The robot reads its current TCP value and sends the following string (x y z in meters, rx ry rz in radians) to the RVS server over the connection.

    ROBOT_TCP x y z rx ry rz \n
    
  2. The robot reads the joint values at the current position (the first six joints, in radians) and sends the following string to the RVS server.

    ROBOT_JOINTS J1 J2 J3 J4 J5 J6 \n
    
  3. The robot sends the following string to the RVS server.

     CAPTURE \n 
    

After these commands succeed, normal communication between the robot and RVS has been established.
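
For reference, the robot side of this online test can be emulated with a small script. This is a hypothetical stand-in, not part of the tutorial files; the host and port 2013 follow the offline nc examples above, and the trailing " \n" matches the string format of the three commands:

    import socket

    def send_command(cmd, host="localhost", port=2013):
        # Open a TCP connection to the RVS server, send one command, return the echo.
        with socket.create_connection((host, port), timeout=5.0) as s:
            s.sendall((cmd + " \n").encode("ascii"))
            return s.recv(1024).decode("ascii")

    print(send_command("ROBOT_TCP 0.5 0.0 0.4 0.0 3.14 0.0"))   # x y z in m, rx ry rz in rad
    print(send_command("ROBOT_JOINTS 0 0 0 0 0 0"))             # first six joints in rad
    print(send_command("CAPTURE"))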

Data recording

Once the robot and the RVS software communicate normally, data recording can be performed. Each recording saves an RGB image and a point cloud of the calibration board captured by the camera, together with the robot's current TCP and JOINTS data. Each set of data is stored in a folder named with a timestamp under unstacking_runtime/test_data/, for example unstacking_runtime/test_data/20220301152509156.

Next, we will show how to use Percipio's 3D camera in RVS and save RGB images and point clouds with a simple XML project. If you are already familiar with this part, you can skip ahead to the standard hand-eye calibration module in this chapter.

Note: The full XML is provided at unstacking_runtime/HandEyeCalibration/Camera_save.xml. You can also create Camera_save.xml yourself by following these steps.

Operation process:

Step 1: Create Camera_save.xml

  1. Open the RVS software, create a new XML project, name it Camera_save.xml, and save it to unstacking_runtime/HandEyeCalibration.

  2. Add the Trigger and HandEyeTCPServer nodes to the node graph, combine them into a Group, and name it the main Group.

  3. Set the Trigger node parameters as follows; the HandEyeTCPServer node parameters do not need to be modified.

    • node_name → InitTCP

    • type → InitTrigger

  4. Connect the CAPTURE port on the right of the HandEyeTCPServer node to MirrorOutput, then connect from MirrorInput to the CAPTURE port on the left of the HandEyeTCPServer node. The node graph connection is shown in the following picture.

    image-20231117140615229

  5. Click Resource at the top of the toolbar, select TyCameraResource from the drop-down menu, click the resource node and set the node parameters.

    • start → True

    • The remaining parameters are automatically completed when the RVS software first connects to the camera

  6. Add nodes Trigger and TyCameraAccess to the main Group and set parameters.

    • Set Trigger node parameters:

      • type → InitTrigger

      • loop → True

      Description: Indicates that the camera picture is constantly refreshed in 3D and 2D views.

    • Set the TyCameraAccess node parameter:

      • cloud → icon_visOn visible → icon_color -2

      • rgb → icon_visOn visible

  7. Right-click in the blank area of the main Group and, in the pop-up menu, select Import Group XML here, then select unstacking_runtime/RVSCommonGroup/SaveData.group.xml. This group saves RGB images and point clouds. Click on this group, find the savedata_parent_dir entry in the properties panel, and set the directory to test_data/.

    image-20231117140740501

  8. Connect the CAPTURE signal on the right of the HandEyeTCPServer to the start signal of SaveData. Connect the cloud and rgb data in TyCameraAccess to the corresponding pointcloud and image inputs of SaveData. When the TCP server receives the CAPTURE command sent by the robot, it replies with the CAPTURE \n string and triggers SaveData to save the cloud and rgb data collected by TyCameraAccess.

The final project process should be as shown in the picture:

image-20231117140803626

Step 2: Connect the camera

  1. Connect the camera and run the XML; the camera will then be initialized. The following log message is displayed if the camera is successfully connected (the test camera used in this manual is an FS820-E1).

    image-20231117141500572

  2. Send the CAPTURE command to the RVS server.

  • Local test: Open a terminal and enter echo "CAPTURE " | nc localhost 2013

  • Online test: Send the CAPTURE \n string to the RVS server from the robot.

If it works, each trigger creates a new folder under unstacking_runtime/test_data/ containing rgb.png and cloud.pcd. (The test_data folder is automatically created on the first run if it does not already exist.)
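
The behaviour of the SaveData group can be pictured with the following sketch (plain Python using OpenCV and Open3D, not the RVS implementation): every trigger creates a timestamp-named folder under test_data/ and writes the current frame into it.

    import os, datetime
    import cv2                # assumed available; used only to write rgb.png
    import open3d as o3d      # assumed available; used only to write cloud.pcd

    def save_frame(rgb, cloud, parent="unstacking_runtime/test_data"):
        # Folder name mimics the timestamp style shown above, e.g. 20220301152509156
        stamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S%f")[:-3]
        folder = os.path.join(parent, stamp)
        os.makedirs(folder, exist_ok=True)        # test_data is created on first use
        cv2.imwrite(os.path.join(folder, "rgb.png"), rgb)
        o3d.io.write_point_cloud(os.path.join(folder, "cloud.pcd"), cloud)
        return folder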

Once you are familiar with connecting and using the camera in RVS, you can begin the data recording process. RVS provides a standardized hand-eye calibration module.

Please open unstacking_runtime/HandEyeCalibration/HandEyeCalibration.xml.

image-20231117142624004

This XML is divided into a data storage module and a calibration calculation module. The interactive panel is shown below:

image-20231215111300524

Operation process:

  1. Open the RVS software and load the HandEyeCalibration.xml project file from the unstacking_runtime/HandEyeCalibration path.

  2. Fill in the number of rows and columns of the calibration board checkerboard, the unit length of the calibration board checkerboard grid, and the file path for data storage.

  3. After starting the project, first click Open the camera and then click Loop capture to refresh the image.

    Note

    If a red error appears with channel rvs_tycamera and message no TyCamera avaliable, it means the loop display function was not closed before the project was last saved and exited. Click `Loop capture` to close the loop display first, then click `Open the camera` and click `Loop capture` again to resume the loop display.

    If a red error appears with channel rvs_tycamera and message Cant open XXX.txt to export depth/colorcalib!, click the TyCameraResource node and set its depth_calib_file and color_calib_file properties to the corresponding TXT files under the unstacking_runtime/HandEyeCalibration path.

  4. After ensuring that the calibration board appears completely in the 2D view, control the robot to send the ROBOT_TCP, ROBOT_JOINTS, and CAPTURE strings to trigger the data saving function (a robot engineer is required to assist here). The ChessboardLocalization node determines whether the corner points in the field of view meet the requirements; if they do, the data is saved normally, otherwise the invalid data is discarded.

Please also note:

  • Make sure the corner points of the calibration board are clear, with no overlap or missing squares, and that the numbers of rows and columns are one odd and one even.

  • The calibration board must stay fixed throughout the calibration process, with no rotation or movement, and it must always appear completely in the picture.

  • Pay attention to the lighting on the calibration board: the corner points must not be over-exposed, nor should they be particularly dark.

  • Vary both position and attitude: use rich attitudes and multiple heights, so that the calibration board appears at all four corners of the picture and with rotated attitudes.

  • Whenever possible, check that the tcp and joints data in each data folder are consistent with the values on the robot control panel at the time of sending.

  • Record more than 18 groups in general; 20-25 groups is appropriate.

ChessboardLocalization_result


Calculate the calibration result

After data recording is finished, check again that the data in the folder is complete, that the images and corner points are clear, and spot-check that the tcp and joints values match the real situation. Then we can calculate the calibration result.

Operation process:

  1. Stay in the unstacking_runtime/HandEyeCalibration/HandEyeCalibration.xml project.

  2. Check that the reading path is the same as the saving path. After selecting the correct path in the interactive panel, click the corresponding mode to start the calculation. Eye-in-hand is used as the example here.

  3. Click the Eye in hand button on the interactive panel to automatically calculate the result and display the error in the log view window of the program, as shown below.

    image-20231215111756328

    How to judge whether the error is normal: different robots and camera types will show different precision. In general, a positional error below 0.005 m (5 mm) is ideal. If the error is large, check that the preceding steps were performed correctly. (A background sketch of the underlying computation is given after this list.)

  4. The calibration result is displayed in the coordinate output control. When the accuracy is within the allowed error range, click the copy button to copy the calibration result so that the transformation matrix can be applied in the point cloud conversion process.

    image-20231215133826801
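
As background only: RVS performs this computation internally, but the underlying problem is the classic hand-eye equation AX = XB. The sketch below shows an equivalent calculation with OpenCV's solver, fed with the recorded robot flange poses and the detected calibration-board poses; it is not the RVS implementation.

    import cv2
    import numpy as np

    def solve_hand_in_eye(R_tcp2base, t_tcp2base, R_board2cam, t_board2cam):
        # Lists of rotation matrices and translation vectors, one pair per recorded group.
        R, t = cv2.calibrateHandEye(R_tcp2base, t_tcp2base, R_board2cam, t_board2cam)
        X = np.eye(4)
        X[:3, :3], X[:3, 3] = R, t.ravel()
        return X   # camera-to-flange transform, i.e. the "rgb2tcp" value pasted later on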

The main process of unstacking

Unstacking process

The actual goal of automatic unstacking is:

  1. Find the geometric center point of each box.

  2. Grasp the boxes in a certain order. The specific unstacking process can be divided into the following steps:

  • Loading local data: In this case, offline data is used to complete the unstacking project.

  • Coordinate system conversion: The point cloud in the camera coordinate system is transferred to the robot coordinate system.

  • AI Detecting: Use AIDetect to handle boxes that are placed close together.

  • Identification and location: Identify the center point of the target point cloud plane and return it to the robot for motion.

  • Robot grasping: Use the simulated robot to unstack the boxes in sequence.

Load local data

In this case, multiple groups of offline data are stored in the unstacking_data folder, including point clouds, joints, 2D images, and tcp values. This data needs to be loaded and iterated over to complete the unstacking project.

Note: If you need to carry out the unstacking process according to the actual situation on site, you can connect the camera and camera resources to complete online unstacking.

Operation process:

  1. Create a new project, Unstacking.xml.

  2. Drag in the Trigger and TestTCPCommand nodes.

    • Set Trigger node parameters:

      • node_name → InitTrigger

      • type → InitTrigger

      Note:Used to automatically trigger subsequent nodes the first time XML is run.

    • Set the TestTCPCommand node parameter: server_mode → Once

      Note:Communicate with the simulated robot once to obtain the current TCP of the robot.

      The node graph connection is shown in the following picture.

      image-20231117150240459

  3. Drag a Trigger node into the node graph. The simulated robot TCP can also be obtained by manually triggering this Trigger.

    • Set Trigger node parameters:

      • node_name → LocalCapture

      • Trigger → enable exposure → bind it to the "button" input tool in the dashboard. Rename the button → LocalCapture.

  4. This node can also be triggered by sending echo "test1 " | nc localhost 2013 from the terminal.

  5. Drag an Or node into the node graph and rename it Capture_Or, so that subsequent nodes can be triggered either over TCP or locally.

    The node graph connection is shown in the following picture.

    image-20231117151302363

  6. This node can also be triggered by sending echo "test3" | nc localhost 2013 from the terminal. After the entire project has been completed, the input and results are returned as follows:

    Unstacking_terminal1

  7. Import the unstacking_runtime/RVSCommonGroup/LoadLocalData.group.xml node group (for loading local data), and connect the output port of the Or node to the iterate port of the node group.

    Note: After loading the group, adjust the file path in the parent_directory property of the DirectoryOperation node.

  8. Drag in a Trigger node used to reload the local data.

    • Set Trigger node parameters:

      • node_name → Restart

      • Trigger → enable exposure → bind it to the "button" input tool in the dashboard. Rename the button → Restart.

  9. Double-click the LoadLocalData node group to expand it. Find the node named "LoadImage" and enable its image → icon_visOn visual attribute. Enable the cloud → icon_visOn visual attribute of the "LoadCloud" node. The offline point cloud can then be seen in the 3D view. The node connections are shown in the picture below:

image-20231117152139161

Coordinate system transformation

After the local data is loaded, the coordinate system needs to be converted: the point cloud, which is in the camera rgb lens coordinate system, is converted into the robot coordinate system. This conversion involves the camera parameters and the hand-eye calibration result.

Operation process:

  1. Right-click in the node graph, select Import Group XML here, and import HandInEye_Depth2Robot.group.xml from RVSCommonGroup.

    Note:In addition to this file, there is HandToEye_Depth2Robot.group.xml. In a real project, select the corresponding operator group based on your actual camera installation.

    image-20231117152241598

  2. The tcp_pose port of the LoadLocalData group is connected to the tcp2robot port of the HandInEye_depth2Robot group.

  3. Drag in the LoadCalibFile node to load the calibration file. Connect its finished port to the initial_start port of the HandInEye_depth2Robot group, and its extrinsic_pose port to the rgb2depth port. Connect its start port to the finished port of the InitTrigger node. The specific connections are as follows:

    image-20231117152340981

  4. Double-click the Group, find the "rgb2tcp" node, and paste the hand-eye calibration result into the pose attribute in the properties panel.

    image-20231117152626415

    Through the above steps, the matrix rgb2robot (camera rgb lens to robot coordinate system) and the matrix depth2robot (camera depth lens to robot coordinate system) have been obtained, and the point cloud in the camera depth lens coordinate system can be converted into the robot coordinate system (see the sketch after this list).

    Note: For HandToEye mode, select unstacking_runtime/RVSCommonGroup/HandToEye_Depth2Robot.group.xml instead. After importing this group, find the rgb2robot node and paste in the resultpose value copied from the dashboard of the previous calibration project. Then click TyCameraResource in the calibration project, find extrinsic in the property panel, and copy the camera extrinsic parameters into the rgb2depth node.

  5. Add the Transform node to the group.

    • Set the Transform node parameters:

      • node_name → Cloud2Robot

      • type → PointCloud

      • cloud → icon_visOn visible → icon_color -2

  6. Connect the depth2robot port to the pose input port of this node, and connect the pointcloud port of the LoadLocalData node group to the input port of the same name on this node. Click Run to obtain the point cloud in the robot coordinate system at the output port.

    image-20231117152811031
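
The matrix chain built by these steps can be summarized with the following sketch (plain NumPy; the names mirror the ports described above but are illustrative only). Whether the camera extrinsic is stored as rgb→depth or depth→rgb depends on the camera, so invert it if needed.

    import numpy as np

    # rgb2robot   = tcp2robot @ rgb2tcp      (rgb2tcp is the hand-eye calibration result)
    # depth2robot = rgb2robot @ depth2rgb    (depth2rgb from the camera extrinsics)
    def cloud_to_robot(cloud_depth, tcp2robot, rgb2tcp, depth2rgb):
        depth2robot = tcp2robot @ rgb2tcp @ depth2rgb
        pts = np.hstack([cloud_depth, np.ones((len(cloud_depth), 1))])
        return (depth2robot @ pts.T).T[:, :3]   # point cloud in the robot coordinate frame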

AI Detecting

In actual work, the boxes are placed close together, so the point clouds of the top surfaces of several boxes merge into one large point cloud that is hard to segment. To solve this problem, an AI model can be trained; based on the AI detection results, a point cloud list of the individual boxes can be obtained. AI training requires a certain hardware configuration (see the RVS installation tutorial for details); if the configuration requirements are not met, inference can be run directly from the pth file provided in the unstacking_runtime folder. The overall process is divided into recording, annotation, training, and detection. The tutorials on collecting training images, labeling training images, and training the AI model are in the chapter [AI Train](#AI Train).

Operation process:

  1. Load the unstacking_runtime/RVSCommonGroup/GetRobotTCP group.

  2. Add an Emit node and set its parameters as below; it is used by subsequent nodes. Connect the pose port of this node to the down_pose port of the GetRobotTCP group.

    • node_name → TowardsDownPose

    • type → pose

    • pose_roll → 3.141592654

  3. Connect the cloud port of Transform (Cloud2Robot) to the port with the same name in the group, and the image port of the LoadLocalData group to the port with the same name in the group. Connect the rgb2robot port of the HandInEye_depth2Robot group to the port with the same name in this group, and the calib_info port of the LoadCalibFile node to the port with the same name in this group. The specific connections are shown in the following picture.

    image-20231117152859530

  4. Double-click to expand the GetRobotTCP group, set the AIDetectGPU node type → MaskRCNN, and update the remaining configuration file paths.

    Note: MaskRCNN is used in this case. Since the AI inference node needs to be initialized once before it can run normally, an additional Trigger (type → InitTrigger) must be added before the node.

  5. Click Run.

    Note: After running, the AIDetectGPU node obtains the location of each target in the 2D image (the masks, corresponding to the mask_list port). The ProjectMask node then converts mask_list into a point cloud list of all detected targets. To do this it needs the point cloud corresponding to the input 2D image, the transformation between the rgb lens coordinate system used for the 2D image and the point cloud coordinate system (because of the coordinate conversion above, this is the rgb-lens-to-robot transformation), and the intrinsic parameters of the camera rgb lens (a sketch of this idea follows this list).

    image-20231117152944299
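
The idea behind this mask-projection step can be sketched as follows (plain NumPy, not the RVS ProjectMask implementation; all names are illustrative): each robot-frame point is projected back into the rgb image with the intrinsics, and points falling inside a detection mask are collected into that target's point cloud.

    import numpy as np

    def split_cloud_by_masks(cloud_robot, masks, rgb2robot, K):
        """cloud_robot: (N, 3) points already in the robot frame.
        masks:          list of HxW boolean arrays from the AI detection.
        rgb2robot:      4x4 rgb-lens-to-robot transform.  K: 3x3 rgb intrinsics."""
        robot2rgb = np.linalg.inv(rgb2robot)
        pts = np.hstack([cloud_robot, np.ones((len(cloud_robot), 1))])
        cam = (robot2rgb @ pts.T).T[:, :3]               # back into the rgb lens frame
        z = np.maximum(cam[:, 2], 1e-9)                  # guard against division by zero
        u = np.round(K[0, 0] * cam[:, 0] / z + K[0, 2]).astype(int)   # pixel column
        v = np.round(K[1, 1] * cam[:, 1] / z + K[1, 2]).astype(int)   # pixel row
        clouds = []
        for m in masks:
            h, w = m.shape
            inside = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (cam[:, 2] > 0)
            sel = np.zeros(len(cloud_robot), dtype=bool)
            sel[inside] = m[v[inside], u[inside]]
            clouds.append(cloud_robot[sel])              # one point cloud per detected box
        return clouds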

Identification and location

After the point cloud list of all detected targets in the robot coordinate system has been obtained through AI detection, the center coordinates of each point cloud need to be obtained.

Operation process:

Step 1: Filter boxes

  1. Double-click to expand Group GetObj in the GetRobotTCP group.

    image-20231117153031997

  2. The boxes need to be filtered: screen the box list by Z-axis coordinate, select the boxes on the top layer, and sort them, as shown in the following picture.

    image-20231117153133660

  3. Here we use the FilterBoxList node, renamed → SortBoxList, with the node's property values adjusted as follows:

    image-20231117153207484

Step 2: Extract the center point of the point cloud plane of the first box in the top-layer box point cloud list.

  1. Extract the point cloud of the first box in the top-layer box point cloud list, using the AtList node.

    • Set the AtList node parameters:

      • node_name → GetFirstInList

      • type → PointCloud

      • index → 0

  2. To obtain a plane, use the FindElement node with type → Plane to find a plane suitable for grasping in the point cloud. Adjust the node property distance_threshold to tune the selected plane. Enable the cloud visual attribute to view the selected plane.

  3. To obtain the center point of the plane, use the MinimumBoundingBox node, renamed → GetBoxCube, with type → ApproxMVBB, to obtain a center point convenient for robot grasping. The node needs a ref_pose, which is connected to Emit (TowardsDownPose); this represents a rotation of 180° around the X-axis so that the Z-axis points down, making it easy for the robot to grasp. Enable the box and box_pose visual attributes in the MinimumBoundingBox (GetBoxCube) property panel to display the calculated center point of the plane (see the sketch after this list).

    image-20231117153325900
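
A rough stand-in for this identification step, assuming Open3D is available (it is not the RVS implementation), is sketched below: fit the dominant plane of one box's point cloud, as FindElement (type → Plane) does, and take the center of its inliers as an approximation of the MinimumBoundingBox (GetBoxCube) center.

    import numpy as np
    import open3d as o3d   # stand-in for the FindElement / MinimumBoundingBox nodes

    def top_face_center(cloud, distance_threshold=0.005):
        pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(np.asarray(cloud)))
        # RANSAC plane fit of the box's top face; distance_threshold plays the same
        # tuning role as the node property of the same name.
        _, inliers = pcd.segment_plane(distance_threshold=distance_threshold,
                                       ransac_n=3, num_iterations=1000)
        top = np.asarray(pcd.points)[inliers]
        return top.mean(axis=0)   # grasp orientation is handled by TowardsDownPose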

Robot grasping

After completing the above operations, we have obtained the coordinates of the target point, and we need to simulate the grasp by loading the robot model.

  1. First, we need to measure the length of the robot's suction-cup tool. Since the suction cup is centered on the robot flange when installed, we only need to offset the center pose of the target's upper surface along its Z-axis by the tool length; the result can be used as the flange position for the final grasp (see the sketch after this list). Use an Emit node with type → Pose and z → -0.192. The adjusted pose is then sent to the robot model for simulated motion, using a Transform node with type → Pose, renamed → RobotToolTCP. The node connections are as follows:

    image-20231117153706792

  2. Move the robot. Drag SimulatedRobotResource into the Resource node group and set the robot_filename property to load your robot file, such as EC66/EC66.rob under data. Set the tool_filename property to load the robot suction-cup tool, such as Tool/EliteRobotSucker1Tool.xml under data.

  3. Drag the RobotMovement node into the node graph, type → MoveJoint, robot_resource_name → SimulateRobotResource. Set velocity and acceleration to adjust the robot’s moving speed and acceleration. The left Joint port connects to the right Jointarray port in the LoadLocalData group. The start port is connected to the finished port.

  4. Drag the WaitAllTrigger node to connect the finished port of the GetRobotTCP group to its input_1 port. The finished port of the RobotMovement node is connected to its input_2 port.

  5. Drag the RobotMovement node into the node graph, type → MoveTCP, rename →MoveToGrasp, robot_resource_name → SimulateRobotResource. Set velocity and acceleration to adjust the robot’s moving speed and acceleration. The start port is connected to the finished port of the WaitAllTrigger node. goal_pose is connected to the robot_tcp port of the GetRobotTCP group, and ref_pose is connected to the joint port of the RobotMovement.

  6. Finally, connect the finished and robot_tcp ports of GetRobotTCP to the MirrorOutput port of the outermost node group and back to TestTCPCommand. At this point, the editing of the project file is complete.

    image-20231117153734128
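
The tool-length adjustment in step 1 amounts to shifting the grasp pose along its own Z axis, as in the sketch below (plain NumPy; the 0.192 m value is the suction-cup length used in this tutorial):

    import numpy as np

    def flange_pose_for_grasp(grasp_pose, tool_length=0.192):
        # grasp_pose is a 4x4 homogeneous matrix (center of the box top face, Z pointing down).
        offset = np.eye(4)
        offset[2, 3] = -tool_length     # move back along the local Z axis by the tool length
        return grasp_pose @ offset      # pose to send to the robot as the flange target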

The full XML is provided at unstacking_runtime/unstacking.xml.

AI Train

The main objective of this chapter is to explain how to collect training images, label training images, and train AI models. The overall process will be divided into collection, annotation and training.

Note: AI training requires a certain hardware configuration; for details, refer to the RVS download and installation guide.

Collect training images

Open unstacking_runtime/MaskRCNN/ty_ai_savedata.xml. Its content is basically the same as the earlier RGB recording project; here we only need to adjust the Emit string parameter to the desired save path. Click the Capture button to record images. Record more than 10 images; the more, the better.

image-20231117153849492

During image acquisition, pay attention to the following points:

  1. Background illumination

    • Use a single, stable light source with moderate brightness and without too much reflection.

    • Outdoor lighting changes too much and is not recommended.

  2. Consider complex conditions: try to make the samples representative of the actual overall distribution rather than just a special case of it.

    • Diversify the target objects; try not to use a single fixed target object. If the number of available samples is small, you can consider using both sides of each sample (when both sides are identical) to double the sample count.

    • Diversify the working conditions: for the target object, consider as many of its real working situations as possible and shoot in multiple situations:

      • If the project involves many situations, such as different stack sizes, target postures within the stack, or target appearances, then the training data we collect should include these diverse situations.

      • Fully consider the allowed deviation between the training samples and the actual running samples. For example, if the stack normally sits level, we should still add a small angle to the stack when collecting training data, because it cannot be perfectly level in practice. Likewise, if in actual operation the targets are placed by a machine and sit close together, then when recording the training set by placing targets manually, we must also keep them close together rather than leaving loose, random gaps between them.

      • For rarer situations such as damaged or creased targets, if the project actually needs to recognize them, deliberately collect more data of damaged targets. Even though the proportion of damaged targets in the training samples will then be higher than in the actual samples, this is still very worthwhile.

    • The target edges must be clearly visible to the human eye, especially for the layer farthest from the camera; otherwise, consider replacing the camera. For reference images, see the rgb images in unstacking_runtime/MaskRCNN/train_data.

Label the training images

The recorded RGB images are annotated with the labelme software; this document provides a labelme installation method.

RVS version 1.5.639 or later has integrated annotation software, so labelme does not need to be installed. Click the menu bar icon image-20231020182747977 to open the Image Label Tool.

Step 1: Install labelme

  1. Install pip

    sudo apt install python3-pip
    
  2. Install pyqt5

    sudo apt install python3-pyqt5
    
  3. Install labelme (version 4.5.7; the package labelme-4.5.7.tar.gz can also be found online) by running

    python3 -m pip install labelme -i https://pypi.tuna.tsinghua.edu.cn/simple
    
  4. After the installation is complete, run

    python3 -m labelme
    

Step 2: Preparation before labeling

  1. First, determine the task objective: be clear about which objects need to be detected and which do not, so that labeling can be done in a targeted way.

  2. The annotation criteria do not need to be overly strict; rather than labeling purely according to human intuition, set annotation rules that are consistent and practical to implement.

Step 3: Label the images

  1. Open the labelme software, click the OpenDir button, and select the path containing the images to be annotated.

    image-20231117154019078

  2. Click the Create Polygons button to draw a red border for the box, as shown in the picture below.

    Note:Supports Create Polygons, Create Circle, and Create Rectangle.

    image-20231117154114160

  3. After finishing a polygon, a naming dialog pops up. Name the first one box; for subsequent objects of the same type, the name can be selected directly.

    box

  4. When all boxes in the image have been annotated, click Save. The default folder and default file name can be used. Then select Next Image to switch to the next image.

    image-20231117154419820

We should pay attention to the following points in the process of annotation:

  1. Mark the visible area of the object (not its estimated full extent).

  2. After finishing each annotation, do a boundary check, for example make sure the polygon coordinates do not lie on the image boundary.

  3. For poor-quality images (such as those with excessive motion blur) and partially occluded objects, decide whether to annotate or skip them based on the project objectives.

We should try to avoid the following problems:

  • Incorrect boundary labeling

    label1

  • Omission

label2

  • The occluded part should not be marked

label3

Training AI model

Open unstacking_runtime/MaskRCNN/ty_ai_train.xml. Here we only need to adjust the data_directory and classnames_filepath paths, then trigger generate_jsonfile to start training. Training finally produces a train_output folder containing the weights we need, named model_final.pth by default (the name can be changed).

image-20231117154501910