Robin uses its suction gripper to pick packages from a conveyor belt. | Source: Amazon Robotics and AI
Thousands of packages pass through Amazon’s fulfillment centers every day. More and more of those packages are picked up, scanned, and organized by Amazon’s Robin robotic arm.
Robin picks packages from a conveyor belt with its suction gripper, scans them, and then places them on a drive robot that routes them to the correct loading dock. Robin’s job is particularly difficult because of its rapidly changing environment. Unlike other robotic arms, Robin doesn’t just perform a series of pre-set motions; it responds to its environment in real time.
“Robin deals with a world where things are changing all around it. It understands what objects are there — different sized boxes, soft packages, envelopes on top of other envelopes — and decides which one it wants and grabs it,” Charles Swan, a senior manager of software development at Amazon Robotics and AI, said. “It does all these things without a human scripting each move that it makes. What Robin does is not unusual in research. But it is unusual in production.”
Amazon’s team took a distinctive approach to teaching Robin how to recognize packages coming down a conveyor belt. Instead of explicitly teaching computer vision algorithms to segment scenes into individual elements, the team let the model find objects in an image on its own; after the model found an object, the team gave it feedback on how accurate it was.
Beginning with pre-trained models that could already identify simple visual elements like edges and planes, the team gradually taught Robin to identify the wide variety of packages it would handle. To keep improving the system, the team also gathered thousands of images and drew outlines around the packages in each one.
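The article doesn’t say which framework or model architecture Robin uses, but a common way to set up this kind of transfer learning is to start from an instance segmentation model pre-trained on generic imagery and swap in new prediction heads for the task at hand. A minimal sketch in PyTorch/torchvision, assuming a single “package” class plus background:

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Assumed label set: background + one "package" class.
num_classes = 2

# A backbone pre-trained on generic images already captures low-level
# structure such as edges and planes.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Swap the box-classification head for one sized to our label set.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Swap the mask head the same way.
in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
```

From there, the new heads would be fine-tuned on the hand-outlined package images while the pre-trained backbone supplies the generic visual features.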
“Everything comes in a jumble of sizes and shapes, some on top of the other, some in the shadows,” Bhavana Chandrashekhar, a software development manager at Amazon Robotics, said. “During the holidays, you might see pictures of Minions or Billie Eilish mixed in with our usual brown and white packages. The taping might change. Sometimes, the differences between one package and another are hard to see, even for humans. You might have a white envelope on another white envelope, and both are crinkled so you can’t tell where one begins and the other ends.”
These images are used to continually retrain Robin, but they’re not the only way the team pushes for the highest accuracy possible. Robin reports how confident it is about each decision it makes, and images the robot flags as low confidence are automatically sent for annotation and then added to the team’s training deck.
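The article doesn’t detail how low-confidence images are selected, but the triage it describes (keep confident predictions, queue uncertain ones for human labeling) can be sketched roughly as below. The threshold and data shapes here are assumptions, not published details.

```python
CONFIDENCE_THRESHOLD = 0.7  # illustrative cutoff; the real value isn't public

def select_for_annotation(predictions):
    """Return the IDs of images whose weakest detection falls below the cutoff.

    `predictions` maps image_id -> list of (label, score) detections.
    """
    low_confidence = []
    for image_id, detections in predictions.items():
        # An empty or uncertain scene goes to human annotators, who outline
        # the packages; the labeled image then joins the training deck used
        # for the next retraining cycle.
        if not detections or min(s for _, s in detections) < CONFIDENCE_THRESHOLD:
            low_confidence.append(image_id)
    return low_confidence
```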
Robin also knows when it’s made a mistake. If it drops a package, or accidentally puts two packages onto one sortation robot, Robin will try to correct the problem. If it can’t, then a human is called for intervention.
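As a rough illustration only, that detect-retry-escalate behavior resembles a small recovery policy; every name below (the outcome types, the retry budget, the callbacks) is hypothetical rather than anything Amazon has described.

```python
from enum import Enum, auto

class PickOutcome(Enum):
    OK = auto()
    DROPPED_PACKAGE = auto()
    DOUBLE_PICK = auto()  # two packages placed on one sortation robot

def handle_outcome(outcome, retries_left, retry_pick, call_associate):
    """Hypothetical policy: retry recoverable faults, then ask a human."""
    if outcome is PickOutcome.OK:
        return "done"
    if retries_left > 0:
        retry_pick()       # e.g. re-grasp a dropped package
        return "retrying"
    call_associate()       # escalate: a human is called to intervene
    return "escalated"
```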
Robin is currently deployed in small numbers, but the team’s push for accuracy means it is getting closer to being rolled out at scale. The robot still has some room to learn, however. Robin is retrained every few days with new fleet metrics, and the team hopes to roll out updates to the robot multiple times a week.