The field of robotics and artificial intelligence (AI) has made remarkable progress over the past few years, thanks to rapid advancements in generative AI and Large Language Models (LLMs). Google Research, a powerhouse in AI research, has been at the forefront of this revolution, exploring new ways to integrate LLMs and robotics to improve human productivity and quality of life.
In this article, we will discuss Google Research’s latest breakthroughs in robotics and AI, focusing on the implications of these new technologies and on why robotics and automation are likely to be the next fields transformed by the wave of LLMs. We must also note, however, that as the technology advances, some traditional jobs may disappear sooner than expected.
Enhancing Robot Learning with Large Language Models
Within our lifetimes, we will see robotic technologies that can help with everyday activities, enhancing human productivity and quality of life. Before robots can be broadly useful for practical day-to-day tasks in people-centered spaces, they need to be able to safely and competently assist people.
In 2022, Google Research focused on the challenges of making robots more helpful to people. One of the biggest was allowing robots and humans to communicate more efficiently and naturally. Equally essential were enabling robots to understand and apply common-sense knowledge in real-world situations, and scaling the number of low-level skills robots need to perform tasks effectively in unstructured environments.
An undercurrent this past year has been the exploration of how large, generalist models like PaLM can work alongside other approaches to surface capabilities that allow robots to learn from a breadth of human knowledge and let people engage with robots more naturally. In doing so, we’re transforming robot learning into a scalable data problem so that we can scale the learning of generalized low-level skills, such as manipulation.
Large language and multimodal models help robots understand the context in which they’re operating, like what’s happening in a scene and what the robot is expected to do. But robots also need low-level physical skills to complete tasks in the physical world, like picking up and precisely placing objects.
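One common pattern for pairing these two layers is to let a language model judge which skill is relevant to an instruction, while a separate low-level model judges whether that skill can actually succeed in the current scene. The sketch below illustrates this pattern only; it is not a description of a specific Google system, and both scoring functions are hard-coded stubs standing in for a real LLM and a learned feasibility model.

```python
# Hypothetical sketch: combine a language model's task-level relevance score
# with a low-level feasibility estimate to pick the robot's next skill.
# Both scoring functions below are stubs (assumptions, not a real API).

def llm_relevance(instruction: str, skill: str) -> float:
    """Stub: how useful the LLM thinks `skill` is for `instruction`.
    A real system would query a large model such as PaLM."""
    scores = {
        ("bring me a snack", "pick up apple"): 0.8,
        ("bring me a snack", "open drawer"): 0.3,
        ("bring me a snack", "wipe table"): 0.05,
    }
    return scores.get((instruction, skill), 0.01)

def feasibility(skill: str, scene: set) -> float:
    """Stub: learned estimate that the low-level skill can succeed now
    (e.g. the apple must actually be visible to pick it up)."""
    needed = {"pick up apple": "apple", "open drawer": "drawer", "wipe table": "table"}
    return 0.9 if needed.get(skill) in scene else 0.0

def choose_skill(instruction: str, skills: list, scene: set) -> str:
    # Multiply high-level relevance by low-level feasibility; pick the best.
    return max(skills, key=lambda s: llm_relevance(instruction, s) * feasibility(s, scene))

skills = ["pick up apple", "open drawer", "wipe table"]
print(choose_skill("bring me a snack", skills, scene={"apple", "table"}))
# -> pick up apple
```

The multiplication is the key design choice: a skill the LLM loves but the robot cannot execute (the drawer is not in the scene) scores zero, so language-level reasoning never overrides physical reality.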
While we often take these physical skills for granted, executing them hundreds of times every day without even thinking, they present significant challenges to robots. The difficulty of learning these low-level skills is known as Moravec’s paradox: reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources.
Inspired by the recent success of LLMs, which shows that the generalization and performance of large Transformer-based models scale with the amount of data, Google Research is taking a data-driven approach, turning the problem of learning low-level physical skills into a scalable data problem.
Robotics Transformer-1: Turning Robot Learning into a Scalable Data Problem
With Robotics Transformer-1 (RT-1), Google Research trained a robot manipulation policy on a large-scale, real-world robotics dataset of 130k episodes covering 700+ tasks, collected with a fleet of 13 robots from Everyday Robots. The results showed that increasing the scale and diversity of data improves the model’s ability to generalize to new tasks, environments, and objects.
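Part of what makes this framing "language-model-like" is that continuous robot actions are discretized into tokens, so the policy can be trained to predict action tokens much as an LLM predicts word tokens. The sketch below shows uniform per-dimension binning in that spirit; the bin count and action ranges are illustrative assumptions, not the exact RT-1 configuration.

```python
# Minimal sketch of action tokenization: map each continuous action
# dimension to one of N_BINS uniform bins, turning control into a
# token-prediction problem. Bin count and ranges are assumptions.

N_BINS = 256

def discretize(action, low, high):
    """Map each continuous action dimension to an integer token in [0, N_BINS)."""
    tokens = []
    for a, lo, hi in zip(action, low, high):
        t = int((a - lo) / (hi - lo) * N_BINS)   # uniform binning
        tokens.append(min(max(t, 0), N_BINS - 1))
    return tokens

def undiscretize(tokens, low, high):
    """Recover the bin-center action for each token."""
    return [lo + (t + 0.5) / N_BINS * (hi - lo)
            for t, lo, hi in zip(tokens, low, high)]

# Example: a 2-D action (e.g. a gripper x/y delta) in the range [-1, 1]
low, high = [-1.0, -1.0], [1.0, 1.0]
tokens = discretize([0.25, -0.5], low, high)
print(tokens)                          # [160, 64]
print(undiscretize(tokens, low, high))
```

The round trip loses at most half a bin width per dimension, which is why a modest number of bins is enough for precise placement tasks.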
Behind both language models and many of our robotics learning approaches, like RT-1, are Transformers, which allow models to make sense of Internet-scale data. Unlike LLMs, robotics is challenged by multimodal representations of constantly changing environments and limited compute.
In 2020, Google Research introduced Performers as an approach to make Transformers more computationally efficient, with implications for many applications beyond robotics. With Performer-MPC, they applied this work to introduce a new class of implicit control policies that combine the benefits of imitation learning with the robust handling of system constraints offered by model predictive control (MPC).
One of the main obstacles to the development of robotics is the lack of data compared to other AI fields, such as natural language processing and computer vision. However, Google Research is finding ways to overcome this challenge and make data-driven approaches a reality in the world of robotics.
The team has been using large-scale, real-world robotics datasets to teach robots physical skills, such as picking up and precisely placing objects. The RT-1 results described above show what this buys: increasing the scale and diversity of data improves a model’s ability to generalize to new tasks, environments, and objects.
Google Research has also explored the use of simulation as a way to collect data more efficiently and safely. However, it is difficult to replicate the full environment in simulation, especially the physics and human-robot interactions. In their i-Sim2Real project, they addressed this issue by bootstrapping from a simple model of human behavior and alternating between training in simulation and deploying in the real world to teach robots to play table tennis with a human opponent.
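The core of the i-Sim2Real recipe is an alternation: bootstrap from a crude model of human behavior, train the policy against that model in simulation, then deploy in the real world and use the observed human behavior to refine the model. The structural sketch below captures that loop only; every component is a stub, whereas the actual project trains table-tennis policies with reinforcement learning.

```python
# Hypothetical sketch of the i-Sim2Real-style alternation. All functions
# are stubs that track rounds and data, not real training code.

def train_in_sim(policy_version, human_model):
    # Stand-in for optimizing the policy against the simulated human.
    return policy_version + 1

def deploy_in_real(policy_version):
    # Stand-in for real-world rollouts; returns observed human behavior.
    return {"round": policy_version, "observations": "human_swings"}

def refine_human_model(human_model, real_data):
    # Fold newly observed behavior back into the opponent model.
    return human_model + [real_data]

human_model = ["simple_behavior_prior"]   # bootstrap before any real data
policy_version = 0
for _ in range(3):                        # alternate sim training and real play
    policy_version = train_in_sim(policy_version, human_model)
    human_model = refine_human_model(human_model, deploy_in_real(policy_version))

print(policy_version, len(human_model))   # 3 4
```

The point of starting from a simple prior is that the first policy never has to face a human it knows nothing about: each real deployment makes the simulated opponent more faithful, and each sim round makes the real deployment safer.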
In addition, Google Research is exploring ways for robots to learn by watching people. Using their Cross-Embodiment Inverse Reinforcement Learning approach, robots can learn new tasks by watching people perform them, which could allow for more efficient data collection and faster skill acquisition.
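The underlying idea of inverse reinforcement learning is to recover a reward function from demonstrations, then use that reward to train the robot's own (differently embodied) policy. The toy below illustrates only the simplest feature-matching version of that idea on made-up data; it is not the actual cross-embodiment method, and the feature map and trajectories are invented for illustration.

```python
# Toy inverse-RL illustration (assumptions throughout): infer reward weights
# from a human demonstration by feature matching, then use them to rank
# candidate robot trajectories.

def feature_counts(trajectory, featurize):
    """Average feature vector over the states of a trajectory."""
    totals = None
    for state in trajectory:
        f = featurize(state)
        totals = f if totals is None else [a + b for a, b in zip(totals, f)]
    return [t / len(trajectory) for t in totals]

# Invented features: [near_cup, near_obstacle]
def featurize(state):
    return [1.0 if state == "cup" else 0.0,
            1.0 if state == "obstacle" else 0.0]

human_demo = ["start", "cup", "cup", "goal"]        # the human approaches the cup
reward_w = feature_counts(human_demo, featurize)    # inferred reward weights

def score(trajectory):
    """Reward of a candidate robot trajectory under the inferred weights."""
    f = feature_counts(trajectory, featurize)
    return sum(w * x for w, x in zip(reward_w, f))

print(score(["start", "cup", "goal"]) > score(["start", "obstacle", "goal"]))
# -> True: the robot prefers the behavior the human demonstrated
```

Because the reward lives in feature space rather than in the human's joint angles, a robot with a different body can still optimize it, which is the intuition behind learning across embodiments.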
An Optimistic Outlook with a Warning
While these advances in robotics and AI are exciting, it is important to consider the implications for the future of work. Automation and robotics are likely to replace certain jobs, and the rapid development of AI technologies means that this could happen sooner than expected.
However, it’s important to remember that technological advances have historically created new jobs and industries, and this trend is likely to continue. Jobs that are repetitive and predictable may be replaced, but new jobs will be created in areas such as robotics design, maintenance, and repair. Additionally, the increased productivity and efficiency that robots can bring to certain industries may create new job opportunities in those sectors.
As with any new technology, it’s important to consider the potential ethical implications and ensure that it is being used in a responsible and safe manner. Google Research has emphasized the importance of developing safe and robust AI systems that can work alongside humans and enhance their capabilities.
Conclusion
The advancements in AI and robotics being made by Google Research are truly remarkable, with the potential to transform the way we live and work. The use of large language models and data-driven approaches is helping to bring robotics and automation to the next level, making it possible for robots to learn and understand the world around them in ways that were previously impossible.
While these developments are exciting, it’s important to approach them with caution and consider the potential impact on the workforce. By focusing on responsible and safe development, however, we can ensure that these technologies are used to enhance human productivity and quality of life.
In the end, it’s clear that Google Research will continue to be a powerhouse in the AI and robotics field, driving innovation and pushing the boundaries of what is possible. With the world-changing possibilities that these technologies present, the future is looking brighter than ever before.