Imagine being able to talk directly to robots and have them complete tasks for you. To make this possible, Microsoft has outlined plans to work with OpenAI on extending ChatGPT's capabilities to robot control. The software giant used the chatbot and "controlled multiple platforms such as robot arms, drones, and home assistant robots intuitively with language," the company wrote in a blog post.
Robots still rely heavily on hand-written code to perform their tasks, while humans find spoken language the most intuitive way to communicate. Microsoft has worked to close this gap and "make natural human-robot interactions possible using OpenAI’s new AI language model, ChatGPT."
How can ChatGPT help in this regard?
The team plans to leverage the platform's ability to generate coherent and grammatically correct responses to various prompts and questions, and to see whether ChatGPT can think beyond text and reason about the physical world to help with robotics tasks. "We want to help people interact with robots more easily, without needing to learn complex programming languages or details about robotic systems," the company said.
The key obstacle for an AI language model is solving problems while accounting for the laws of physics, the context of the operating environment, and how the robot’s physical actions can change the state of the world. ChatGPT can do a lot on its own, but it still needs some help. Microsoft has released a series of design principles, including unique prompting structures, high-level APIs, and human feedback via text. These principles can be used to guide language models toward solving robotics tasks.
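To make these principles more concrete, here is a minimal sketch of what a prompt built around a high-level API might look like. The function names (takeoff, fly_to, land), the RobotFunction class, and the prompt wording are all hypothetical illustrations and are not taken from Microsoft's blog post or from PromptCraft. The sketch only assembles the prompt text; it does not call any language model or robot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RobotFunction:
    """One entry in a hypothetical high-level robot API."""
    signature: str
    description: str


# Hypothetical function library; a real robot would expose its own.
FUNCTIONS = [
    RobotFunction("takeoff()", "Take off and hover in place."),
    RobotFunction("fly_to(x, y, z)", "Fly to a position given in metres."),
    RobotFunction("land()", "Land at the current position."),
]


def build_prompt(task: str, functions: List[RobotFunction]) -> str:
    """Describe the available functions, then state the user's task."""
    lines = [
        "You control a drone. Use only the functions listed below.",
        "If the task is ambiguous, ask a clarifying question "
        "instead of writing code.",
        "",
        "Available functions:",
    ]
    for fn in functions:
        lines.append(f"- {fn.signature}: {fn.description}")
    lines += ["", f"Task: {task}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Inspect the top shelf.", FUNCTIONS))
```

Restricting the model to a small, described set of functions is one plausible way to apply the "high-level APIs" principle, and the instruction to ask questions leaves room for the human feedback step.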
The firm is also introducing PromptCraft, an open-source platform where anyone can "share examples of prompting strategies for different robotics categories."
Using these design principles, researchers were able to draw on ChatGPT's knowledge to control different robot form factors across a variety of tasks. The team used the language model to solve "robotics puzzles, along with complex robot deployments in the manipulation, aerial, and navigation domains."
Various instances where the model worked
The team used the system to have ChatGPT control a drone. According to Microsoft, ChatGPT asked follow-up questions when the commands were unclear and "wrote complex code structures for the drone such as a zig-zag pattern to inspect shelves visually. It even figured out how to take a selfie."
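For readers wondering what a zig-zag inspection pattern involves, the following is a rough sketch of how such a path could be generated as a list of waypoints. It is not the code ChatGPT produced in Microsoft's experiment. The function name zigzag_waypoints, its parameters, and the coordinate convention (x along the shelf, y as the distance from it, z as height) are assumptions made for illustration.

```python
from typing import List, Tuple

Waypoint = Tuple[float, float, float]  # (x, y, z) in metres


def zigzag_waypoints(width: float, height: float,
                     row_spacing: float, distance: float) -> List[Waypoint]:
    """Sweep a shelf face row by row, reversing direction on each row.

    x runs along the shelf, y is the stand-off distance from the shelf,
    and z is the height. These conventions are assumptions for this sketch.
    """
    if row_spacing <= 0:
        raise ValueError("row_spacing must be positive")

    # Number of horizontal passes, including the one at z = 0.
    n_rows = int(height / row_spacing + 1e-9) + 1

    waypoints: List[Waypoint] = []
    for row in range(n_rows):
        z = row * row_spacing
        # Even rows go left to right, odd rows right to left.
        xs = (0.0, width) if row % 2 == 0 else (width, 0.0)
        for x in xs:
            waypoints.append((x, distance, z))
    return waypoints


if __name__ == "__main__":
    for point in zigzag_waypoints(4.0, 2.0, 1.0, 1.5):
        print(point)
```

Each row ends where the next one begins, so the drone only moves straight up between passes rather than flying back across the shelf.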
The model also performed a simulated industrial inspection exercise with the Microsoft AirSim simulator. "The model was able to effectively parse the user’s high-level intent and geometrical cues to control the drone accurately."
The model showed the ability to bridge textual and physical domains when tasked with building the Microsoft logo out of wooden blocks.
ChatGPT could also write an algorithm for a drone to reach a goal in space without crashing into obstacles.
Microsoft has, however, sounded a note of caution, saying such approaches need thorough analysis before they are used in everyday settings. "We encourage users to harness the power of simulations to evaluate these algorithms before potential real-life deployments and to always take the necessary safety precautions."
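The article does not describe the algorithm ChatGPT wrote, so the sketch below is only a generic illustration of obstacle-aware path planning, not a reconstruction of it. It swaps in a simple breadth-first search over a 2D grid, where 0 marks a free cell and 1 marks an obstacle. The function name plan_path and the grid representation are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def plan_path(grid: List[List[int]], start: Cell,
              goal: Cell) -> Optional[List[Cell]]:
    """Return a shortest obstacle-free path from start to goal, or None.

    Breadth-first search over 4-connected cells; 1 marks an obstacle.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([start])
    came_from: Dict[Cell, Optional[Cell]] = {start: None}

    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            node: Optional[Cell] = cell
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]

        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and nxt not in came_from:
                came_from[nxt] = cell
                queue.append(nxt)

    return None  # goal is unreachable


if __name__ == "__main__":
    world = [
        [0, 1, 0],
        [0, 1, 0],
        [0, 0, 0],
    ]
    print(plan_path(world, (0, 0), (0, 2)))
```

A real drone moves in continuous 3D space and has to cope with sensor noise and dynamics, which this grid version ignores.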