MIT researchers are using generative AI models to help robots more efficiently solve complex object manipulation problems, such as packing a box with different objects.
The method – a form of generative AI, called Diffusion-CCSP – uses a collection of machine-learning models, each of which is trained to represent one specific type of constraint.
These models are combined to generate global solutions to the packing problem, taking into account all constraints at once.
The method was reportedly able to generate effective solutions faster than other techniques, and it produced a greater number of successful solutions in the same amount of time.
The technique was also able to solve problems with novel combinations of constraints and larger numbers of objects, that the models did not see during training.
According to MIT, due to this generalisability, the technique can be used to teach robots how to understand and meet the overall constraints of packing problems, such as the importance of avoiding collisions or a desire for one object to be next to another object.
Robots trained in this way could be applied to a wide array of complex tasks in diverse environments, from order fulfilment in a warehouse to organising a bookshelf in someone’s home.
“My vision is to push robots to do more complicated tasks that have many geometric constraints and more continuous decisions that need to be made – these are the kinds of problems service robots face in our unstructured and diverse human environments,” said Zhutian Yang, an electrical engineering and computer science graduate student and lead author of a paper on the new machine-learning technique.
“With the powerful tool of compositional diffusion models, we can now solve these more complex problems and get great generalisation results.”
In the future, Yang and her fellow researchers wants to test Diffusion-CCSP in more complicated situations, such as with robots that can move around a room.
They also want to enable Diffusion-CCSP to tackle problems in different domains without the need to be retrained on new data.