Abstract
Vision-based robots have been utilized for pick-and-place operations owing to their ability to estimate object poses. As they progress toward handling a variety of objects in cluttered states, more flexible and lightweight approaches have been proposed. In this paper, an autonomous robotic bin-picking platform is proposed. It combines human demonstration with a collaborative robot for flexibility across objects, and a YOLOv5 neural network model for fast object localization without prior computer-aided design (CAD) models or datasets for training. After a simple human demonstration of which target object to pick and where to place it, the raw color and depth images were refined, and the image of the object on top of the bin was used to create synthetic images and annotations for the YOLOv5 model. To pick up the target object, a point cloud was lifted from the depth data corresponding to the detection result of the trained YOLOv5 model, and the object pose was estimated by matching it with the Iterative Closest Point (ICP) algorithm. After picking up the target object, the robot placed it at the location the user defined during the human demonstration stage. In experiments with four types of objects and four human demonstrations, recognizing the target object and estimating its pose took a total of 0.5 s. The object detection success rate was 95.6%, and the pick-and-place motion was successful for all detected objects.