Greetings,
We are thrilled to present the landmark 40th edition of the State of AI. In this special issue, we explore the forefront of AI innovation, featuring groundbreaking advancements that redefine the boundaries of possibility.
Delve into the world of TinyLlama, an open-source small language model making big waves. Discover Infinite-LLM, an efficient service for managing long contexts with cutting-edge techniques. Learn about DeepSeek LLM, a pioneering venture in scaling open-source language models for long-term impact. Experience the revolution in mobile robotics with Mobile ALOHA, mastering bimanual manipulation with low-cost teleoperation. Finally, witness the synergy in LLM Augmented LLMs, a novel approach to expanding capabilities through composition.
Join us on this journey through the latest and most exciting developments in AI, each offering a window into the future of technology and its potential to transform our world. Here's to celebrating the progress, innovation, and the continuous quest for knowledge. Enjoy the read!
Best regards,
Contents
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
TinyLlama: An Open-Source Small Language Model
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
LLM Augmented LLMs: Expanding Capabilities through Composition
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Authors: Zipeng Fu, Tony Z. Zhao, Chelsea Finn (Stanford University)
Source and references: https://mobile-aloha.github.io, https://arxiv.org/abs/2401.02117
Meet Mobile ALOHA
If you've ever wondered how robots could learn to help out with everyday tasks like cooking, cleaning, and housekeeping, look no further than this impressive research. In their paper, Zipeng Fu, Tony Z. Zhao, and Chelsea Finn at Stanford University introduce Mobile ALOHA, a low-cost, highly dexterous, and versatile mobile manipulator system. It's capable of learning complex, long-horizon tasks using imitation learning from human demonstrations.
Unlike other expensive solutions, Mobile ALOHA delivers general-purpose, affordable, and practical bimanual mobile manipulation. Costing just $32k, it's a great candidate for labs and research groups that would like to explore the fascinating world of bimanual mobile robotics.
Building a Mobile Manipulation System
The Mobile ALOHA system is born from several design considerations to ensure it's a standout option for applications in dynamic environments. Key considerations include:
Mobility: The system can move at a speed similar to human walking, around 1.42 m/s
Stability: It's stable enough to manipulate heavy household objects like pots and cabinets
Whole-body teleoperation: Both arms and the mobile base can be controlled simultaneously
Untethered: Provides onboard power and compute
Mobile ALOHA builds on the original ALOHA design (a low-cost, dexterous bimanual teleoperation setup) and extends it with a wheeled base for mobility. By tethering the operator's waist to the mobile base, the system enables seamless whole-body control: the operator drives the base by walking while simultaneously controlling both arms through the ALOHA leader arms.
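To make the control interface concrete, here is a minimal sketch of the whole-body action representation. It assumes the setup described in the paper: 14 arm joint targets (7 per arm, including grippers) concatenated with the base's linear and angular velocities, giving a 16-dimensional action; the variable names are illustrative, not the authors' code.

```python
import numpy as np

# Illustrative whole-body action vector for Mobile ALOHA (a sketch, not
# the authors' implementation). Assumes 7 joint targets per arm,
# including the gripper, plus 2 base velocity commands: 16 dims total.
arm_joint_targets = np.zeros(14)      # both arms, driven by the leader arms
base_velocity = np.array([0.3, 0.0])  # [linear m/s, angular rad/s]

whole_body_action = np.concatenate([arm_joint_targets, base_velocity])
assert whole_body_action.shape == (16,)
```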
Key Features of Mobile ALOHA
Mobile ALOHA performs a wide range of tasks in various settings thanks to its impressive technical specifications. With a teleoperation system that supports continuous use for several hours, it can take on activities like cooking a three-course meal, cleaning public restrooms, and doing laundry. It also handles household tasks such as watering plants, vacuuming, loading and unloading dishwashers, opening doors, operating washing machines, and much more.
The Mobile ALOHA system comes equipped with a 1.26 kWh battery for onboard power and a consumer-grade laptop for on-site computing. These features, combined with the agile and affordable design, make Mobile ALOHA an attractive solution for research and experimentation in bimanual mobile manipulation.
Co-Training with Static ALOHA Data
Imitation learning has gained popularity because it enables robots to learn directly from human demonstrations, but lengthy data collection and narrowly specialized datasets have hindered large-scale application. Inspired by recent successes in co-training with diverse real-world datasets, the researchers use the existing static ALOHA dataset to make learning bimanual mobile manipulation more data-efficient.
Although the static ALOHA data is collected on a fixed setup (black table-top with the arms facing each other), the researchers observe that there is positive transfer to mobile manipulation tasks. Consequently, the system achieves better performance and data efficiency than policies trained solely on Mobile ALOHA data, confirming that co-training is a promising direction for the future of bimanual mobile manipulation.
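To make the co-training recipe concrete, here is a minimal sketch in Python/PyTorch. It assumes a simple 50/50 batch mix between the static ALOHA demonstrations and the task-specific Mobile ALOHA demonstrations, with static actions padded to the mobile action dimension; the dataset sizes, sampling ratio, and names below are illustrative assumptions rather than the authors' exact pipeline.

```python
import torch

class DemoDataset:
    """Dummy stand-in for a set of teleoperated demonstration steps."""
    def __init__(self, n_steps: int, action_dim: int = 16):
        self.obs = torch.randn(n_steps, 128)         # placeholder observation features
        self.act = torch.randn(n_steps, action_dim)  # action targets, padded to 16 dims

    def __len__(self) -> int:
        return self.obs.shape[0]

static_aloha = DemoDataset(10_000)  # large, diverse table-top dataset
mobile_aloha = DemoDataset(500)     # small task-specific mobile dataset

def sample_cotraining_batch(batch_size: int = 16):
    """Draw half of each batch from each dataset (assumed 50/50 ratio)."""
    halves = []
    for ds in (static_aloha, mobile_aloha):
        idx = torch.randint(len(ds), (batch_size // 2,))
        halves.append((ds.obs[idx], ds.act[idx]))
    obs = torch.cat([h[0] for h in halves])
    act = torch.cat([h[1] for h in halves])
    return obs, act

obs, act = sample_cotraining_batch()
print(obs.shape, act.shape)  # torch.Size([16, 128]) torch.Size([16, 16])
```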
Impressive Results in a Range of Complex Tasks
Mobile ALOHA has proven to be highly effective in a variety of practical scenarios:
It can perform housekeeping tasks like vacuuming, doing laundry, opening doors, and even using a washing machine.
In a cooking setting, it's capable of tasks like cracking eggs, mincing garlic, and even stir-frying.
Mobile ALOHA is adept at human-robot interactions, such as greeting humans with a shake of the "hand."
With co-training, Mobile ALOHA achieves an 80% success rate on these tasks, an average absolute improvement of 34% over policies trained without co-training (implying roughly a 46% average success rate otherwise). This demonstrates that the combination of affordable hardware and efficient imitation learning can drive significant progress in bimanual mobile manipulation.
Wrapping it up
The Mobile ALOHA system by Zipeng Fu, Tony Z. Zhao, and Chelsea Finn truly stands out as an accessible, efficient, and practical platform for bimanual mobile manipulation research. By combining low-cost hardware with innovative whole-body teleoperation and cleverly leveraging the existing static ALOHA dataset, the researchers demonstrate promising results on a range of complex, real-world tasks.
With the continuing development of imitation learning and co-training adaptations, it's easy to imagine a future where robots like Mobile ALOHA become commonplace in both households and businesses, making our lives more convenient and exciting.
TinyLlama: An Open-Source Small Language Model
Authors: Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu
Source and references: https://arxiv.org/abs/2401.02385
Introducing TinyLlama
TinyLlama is an open-source, compact language model with 1.1 billion parameters, pretrained on approximately 1 trillion tokens for about three epochs. Despite its relatively small size, TinyLlama demonstrates remarkable performance in various downstream tasks, outperforming other existing open-source language models of similar sizes. This makes TinyLlama an attractive and accessible platform for researchers and practitioners in language model research.
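If you'd like to try TinyLlama yourself, the weights are available on Hugging Face. Below is a minimal sketch of loading and sampling from a checkpoint with the transformers library; the exact model identifier is our assumption based on the project's public releases, so substitute whichever TinyLlama checkpoint you want to evaluate.

```python
# A minimal sketch of running TinyLlama locally with Hugging Face
# transformers. The checkpoint name below is an assumption based on the
# project's public releases, not taken from the paper itself.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```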