Contents
Corki: Enabling Real-time Embodied AI Robots via Algorithm-Architecture Co-Design
Empowering Robot Path Planning with Large Language Models: osmAG Map Topology & Hierarchy Comprehension with LLMs
MoE-Infinity: Offloading-Efficient MoE Model Serving
Fast Multipole Attention: A Divide-and-Conquer Attention Mechanism for Long Sequences
From Feature Importance to Natural Language Explanations Using LLMs with RAG
Stable Audio Open
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Evolutionary Reinforcement Learning via Cooperative Coevolution
SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models
Unlocking Guidance for Discrete State-Space Diffusion and Flow Models
Prover-Verifier Games improve legibility of LLM outputs
The opportunities and risks of large language models in mental health
Corki: Enabling Real-time Embodied AI Robots via Algorithm-Architecture Co-Design
Authors: Yiyang Huang, Yuhui Hao, Bo Yu, Feng Yan, Yuxin Yang, Feng Min, Yinhe Han, Lin Ma, Shaoshan Liu, Qiang Liu, Yiming Gan
Source and references: https://arxiv.org/abs/2407.04292v2
Introduction
This paper proposes Corki, an algorithm-architecture co-design framework for real-time embodied AI robot control. Existing embodied AI systems struggle to meet real-time constraints due to the sequential execution of LLM inference, robot action execution, and data communication.
Key Points
Corki decouples LLM inference, robotic control, and data communication in the embodied AI robot compute pipeline.
Corki predicts a trajectory for the near future instead of a discrete action for a single frame, reducing the frequency of LLM inference.
Corki includes a hardware accelerator that transforms the predicted trajectory into actual torque signals to control robots in real-time.
Corki's execution pipeline parallelizes data communication with computation, hiding communication latency.
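To make the last point concrete, here is a minimal back-of-the-envelope latency model, not taken from the paper. It assumes that communication for one frame can fully overlap with computation for another. The function names and all millisecond values are hypothetical, chosen only for illustration.

```python
def sequential_latency_ms(infer_ms: float, control_ms: float,
                          comm_ms: float, frames: int) -> float:
    """Every frame pays inference + control + communication back to back."""
    return frames * (infer_ms + control_ms + comm_ms)


def pipelined_latency_ms(infer_ms: float, control_ms: float,
                         comm_ms: float, frames: int) -> float:
    """Communication overlaps with computation (idealized assumption).

    The first transfer cannot be hidden, so it is paid once up front.
    After that, each frame costs the slower of the two overlapping stages.
    """
    per_frame = max(infer_ms + control_ms, comm_ms)
    return comm_ms + frames * per_frame


# Hypothetical numbers, for illustration only.
seq = sequential_latency_ms(infer_ms=200, control_ms=20, comm_ms=30, frames=10)
pipe = pipelined_latency_ms(infer_ms=200, control_ms=20, comm_ms=30, frames=10)
print(seq, pipe)
```

In this toy model, communication latency disappears from the steady-state cost whenever it is shorter than the computation it overlaps with.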
Methodology
Rather than emitting a discrete action for each frame, Corki's algorithm predicts a continuous trajectory covering the robot's near-future movement, so the LLM needs to run less often. A hardware accelerator then converts the predicted trajectory into real-time control signals for the robot.
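The effect of trajectory prediction on inference frequency can be sketched as follows. This is a rough illustration, not Corki's actual algorithm: `llm_calls` and `sample_trajectory` are hypothetical helpers, linear interpolation stands in for whatever trajectory representation the paper uses, and the frame count and horizon are illustrative values.

```python
import math


def llm_calls(total_frames: int, horizon: int) -> int:
    """Number of LLM inferences needed when one call covers `horizon` frames."""
    return math.ceil(total_frames / horizon)


def sample_trajectory(start, goal, horizon):
    """Return `horizon` waypoints from start toward goal.

    Linear interpolation is a stand-in (an assumption) for a predicted
    trajectory; a low-level controller would track these waypoints
    between LLM calls.
    """
    return [
        tuple(s + (g - s) * (k / horizon) for s, g in zip(start, goal))
        for k in range(1, horizon + 1)
    ]


# Per-frame actions: one LLM call per frame.
print(llm_calls(total_frames=240, horizon=1))
# Trajectory prediction: one LLM call per 8-frame trajectory (illustrative).
print(llm_calls(total_frames=240, horizon=8))
# Waypoints the controller would follow without further LLM calls.
print(sample_trajectory((0.0, 0.0), (1.0, 2.0), horizon=4))
```

The longer the horizon a single prediction covers, the fewer LLM calls are needed for the same number of control frames.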
Results and Findings
Corki achieves up to 3.6× speed-up and up to 8.0× reduction in LLM inference frequency compared to the baseline. The maximum success rate improvement is 17.3%.
Implications and Conclusions
Corki's algorithm-architecture co-design approach enables real-time performance for embodied AI robots, a critical requirement for practical deployment in real-world applications.
Empowering Robot Path Planning with Large Language Models: osmAG Map Topology & Hierarchy Comprehension with LLMs
Authors: Fujing Xie, Sören Schwertfeger
Source and references: https://arxiv.org/abs/2403.08228v2
Introduction
This paper explores enabling Large Language Models (LLMs) to comprehend the topology and hierarchy of Area Graph (osmAG), a text-based hierarchical, topometric semantic map representation, for robotic applications like navigation and path planning.