Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP and AI

Aug 02, 2024


Corki: Enabling Real-time Embodied AI Robots via Algorithm-Architecture Co-Design
Empowering Robot Path Planning with Large Language Models: osmAG Map Topology & Hierarchy Comprehension with LLMs
MoE-Infinity: Offloading-Efficient MoE Model Serving
Fast Multipole Attention: A Divide-and-Conquer Attention Mechanism for Long Sequences
From Feature Importance to Natural Language Explanations Using LLMs with RAG
Stable Audio Open
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Evolutionary Reinforcement Learning via Cooperative Coevolution
SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models
Unlocking Guidance for Discrete State-Space Diffusion and Flow Models
Prover-Verifier Games improve legibility of LLM outputs
The opportunities and risks of large language models in mental health

Corki: Enabling Real-time Embodied AI Robots via Algorithm-Architecture Co-Design

Authors: Yiyang Huang, Yuhui Hao, Bo Yu, Feng Yan, Yuxin Yang, Feng Min, Yinhe Han, Lin Ma, Shaoshan Liu, Qiang Liu, Yiming Gan

Source and references: https://arxiv.org/abs/2407.04292v2

Introduction

This paper proposes Corki, an algorithm-architecture co-design framework for real-time embodied AI robot control. Existing embodied AI systems struggle to meet real-time constraints due to the sequential execution of LLM inference, robot action execution, and data communication.

Key Points

Corki decouples LLM inference, robotic control, and data communication in the embodied AI robot compute pipeline.
Corki predicts a trajectory for the near future instead of a discrete action for a single frame, reducing the frequency of LLM inference.
Corki includes a hardware accelerator that transforms the predicted trajectory into actual torque signals to control robots in real-time.
Corki's execution pipeline parallelizes data communication with computation, hiding communication latency.
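To see why decoupling the three stages matters, a toy latency model can help. The sketch below is not Corki's implementation or its measured behavior: the function names, the stage timings, and the simple pipeline-fill assumption are all hypothetical, chosen only to illustrate how overlapping stages hides the shorter ones behind the slowest.

```python
def sequential_latency(infer_ms: int, control_ms: int, comm_ms: int, steps: int) -> int:
    """Total time when every step runs inference, control, and communication back to back."""
    if steps < 1:
        raise ValueError("steps must be >= 1")
    return steps * (infer_ms + control_ms + comm_ms)


def pipelined_latency(infer_ms: int, control_ms: int, comm_ms: int, steps: int) -> int:
    """Total time under an idealized three-stage pipeline.

    Assumption (hypothetical model): the stages overlap perfectly, so after the
    first step fills the pipeline, each further step costs only as much as the
    slowest stage.
    """
    if steps < 1:
        raise ValueError("steps must be >= 1")
    fill = infer_ms + control_ms + comm_ms
    bottleneck = max(infer_ms, control_ms, comm_ms)
    return fill + (steps - 1) * bottleneck


# Made-up stage timings in milliseconds, for illustration only.
INFER_MS, CONTROL_MS, COMM_MS, STEPS = 200, 30, 20, 10

seq = sequential_latency(INFER_MS, CONTROL_MS, COMM_MS, STEPS)
pipe = pipelined_latency(INFER_MS, CONTROL_MS, COMM_MS, STEPS)
print(f"sequential: {seq} ms, pipelined: {pipe} ms")
```

In this model, communication and control time vanish from the steady-state cost as long as inference remains the bottleneck, which is the intuition behind hiding communication latency.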

Methodology

Corki's algorithm has the LLM predict a continuous trajectory covering the robot's near-future movement rather than a discrete action for a single frame, so one inference serves several control frames and the LLM runs less often. A dedicated hardware accelerator then converts the predicted trajectory into real-time control signals for the robot.
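The link between trajectory length and inference frequency can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's method: `llm_calls`, `sample_trajectory`, the 240-frame episode, and the 8-frame horizon are hypothetical, and linear interpolation stands in for whatever trajectory representation Corki actually uses.

```python
import math


def llm_calls(num_frames: int, horizon: int) -> int:
    """LLM inferences needed to cover num_frames control frames when each
    inference yields a trajectory spanning `horizon` frames."""
    if horizon < 1:
        raise ValueError("horizon must be >= 1")
    return math.ceil(num_frames / horizon)


def sample_trajectory(start, end, horizon):
    """Per-frame setpoints between two poses.

    Assumption: straight-line interpolation, used here only as a stand-in
    for a predicted continuous trajectory.
    """
    return [
        tuple(s + (e - s) * (k / horizon) for s, e in zip(start, end))
        for k in range(1, horizon + 1)
    ]


# Hypothetical episode: 240 control frames.
per_frame_calls = llm_calls(240, horizon=1)   # one inference per frame
trajectory_calls = llm_calls(240, horizon=8)  # one inference per 8-frame trajectory
print(per_frame_calls, trajectory_calls)

# The controller can follow the sampled setpoints without calling the LLM again.
setpoints = sample_trajectory((0.0, 0.0), (1.0, 2.0), horizon=4)
print(setpoints)
```

The longer the horizon, the fewer inferences are needed, at the cost of acting on a prediction for longer before the model sees new observations.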

Results and Findings

Compared to the baseline, Corki reduces LLM inference frequency by up to 8.0× and achieves up to a 3.6× speed-up. Task success rate improves by up to 17.3%.

Implications and Conclusions

Corki's algorithm-architecture co-design approach enables real-time performance for embodied AI robots, a critical requirement for practical deployment in real-world applications.

Empowering Robot Path Planning with Large Language Models: osmAG Map Topology & Hierarchy Comprehension with LLMs

Authors: Fujing Xie, Sören Schwertfeger

Source and references: https://arxiv.org/abs/2403.08228v2

Introduction

This paper explores how to enable Large Language Models (LLMs) to comprehend the topology and hierarchy of osmAG (Area Graph in OpenStreetMap textual format), a text-based, hierarchical, topometric semantic map representation, for robotic applications such as navigation and path planning.
