Depth Yolact ROS

This is a ROS wrapper for Yolact that extends it by utilising a depth image to generate 3D bounding boxes and pointclouds of the detected objects.

depth_yolact_ros is a pipeline. It:
  • takes the detection boxes and their associated masks.
  • crops the depth image using the masks.
  • takes the masked pixels for each object and converts them to pointclouds using the camera intrinsics from the /camera_info topic.
  • filters the points to remove mislabeled mask pixels, first using k-means clustering and then a Gaussian model that rejects outliers along the depth axis.
  • processes each detected instance in its own thread and publishes all results on a MarkerArray topic and a pointcloud topic.
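
The deprojection and filtering steps above can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the package's actual code: the function names, the choice of two clusters computed on depth only, and the 2-sigma threshold are all hypothetical. The intrinsics fx, fy, cx, cy are assumed to come from the K matrix of the /camera_info message, and the depth image is assumed to be in metres with 0 marking a missing reading.

```python
import numpy as np


def deproject_masked_depth(depth_m, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels to 3D camera-frame points (pinhole model)."""
    v, u = np.nonzero(mask)               # rows (v) and columns (u) inside the mask
    z = depth_m[v, u].astype(np.float64)
    valid = z > 0                         # assumption: 0 means "no depth reading"
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.column_stack((x, y, z))


def kmeans_depth_filter(points, iters=10):
    """Assumed variant: 2-cluster k-means on depth only, keeping the larger cluster."""
    if len(points) < 2:
        return points
    z = points[:, 2]
    centers = np.array([z.min(), z.max()])
    if centers[0] == centers[1]:
        return points                     # all depths identical, nothing to split
    for _ in range(iters):
        # Assign each point to the nearest centre, then recompute the centres.
        labels = np.abs(z[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = z[labels == k].mean()
    keep = np.bincount(labels, minlength=2).argmax()
    return points[labels == keep]


def gaussian_depth_filter(points, n_sigma=2.0):
    """Drop points whose depth lies more than n_sigma standard deviations from the mean."""
    if len(points) == 0:
        return points
    z = points[:, 2]
    mu, sigma = z.mean(), z.std()
    if sigma == 0:
        return points
    return points[np.abs(z - mu) <= n_sigma * sigma]
```

For one detection, the three functions would be chained: deproject the masked pixels, keep the dominant depth cluster, then reject the remaining depth outliers. Clustering on depth alone is only one option; clustering on full 3D coordinates is equally plausible, and the source does not say which the package uses.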

The package runs at around 10 Hz on an NVIDIA GTX 1660 4 GB GPU with a RealSense camera. Many modifications could make the package much faster; I have included a "What's next?" section in the GitHub repo. Please feel free to contact me if you have any questions!

Below are two videos showing the package working with a real D435i camera and in Gazebo with a D435 camera on a TurtleBot robot.