State of AI

Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP, and AI

Oct 11, 2024
Contents

  1. Scalable and Accurate Graph Reasoning with LLM-based Multi-Agents

  2. TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens

  3. Stateful Large Language Model Serving with Pensieve

  4. PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs

  5. Creative Beam Search: LLM-as-a-Judge For Improving Response Generation

  6. ZS4C: Zero-Shot Synthesis of Compilable Code for Incomplete Code Snippets using LLMs

  7. MM-Ego: Towards Building Egocentric Multimodal LLMs

  8. LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management

  9. Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models

  10. Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

  11. Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

  12. DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation

  13. Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction

  14. Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models


Scalable and Accurate Graph Reasoning with LLM-based Multi-Agents

Authors: Yuwei Hu, Runlin Lei, Xinyi Huang, Zhewei Wei, Yongchao Liu

Source and references: https://arxiv.org/abs/2410.05130v1


Introduction

This research paper introduces GraphAgent-Reasoner, a new framework that uses multi-agent collaboration to enable large language models (LLMs) to perform scalable and accurate graph reasoning.

Key Points

  • GraphAgent-Reasoner is the first LLM-based multi-agent framework for graph reasoning that requires no fine-tuning and can utilize any LLM as the underlying reasoning model.

  • The framework achieves near-perfect accuracy on various polynomial-time graph reasoning tasks, significantly outperforming existing methods.

  • GraphAgent-Reasoner can handle graph reasoning tasks on graphs with over 1,000 nodes, demonstrating exceptional scalability compared to previous approaches.

  • The framework also showcases its potential for addressing complex real-world graph reasoning applications, such as webpage importance analysis.

Methodology

The GraphAgent-Reasoner framework follows a node-centric approach: an agent is assigned to each node in the graph, and the agents collaborate to solve the overall reasoning problem, sharply reducing the amount of information and complexity that any single LLM must handle. The approach is inspired by distributed graph computation, in which a graph problem is decomposed into smaller, node-centric tasks that the agents resolve collaboratively.
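To make the decomposition concrete, here is a minimal, runnable sketch using shortest-path distance as the example task. The per-node update rule is hard-coded here; in GraphAgent-Reasoner itself, each node's local reasoning step would be delegated to an LLM agent and a master agent would aggregate the per-node results, so treat the function and variable names below as illustrative assumptions rather than the paper's actual API.

```python
# Minimal sketch of node-centric message passing, in the spirit of
# GraphAgent-Reasoner's distributed decomposition. Each "agent" sees
# only its own neighborhood; in the actual framework an LLM would
# perform the per-node update instead of this hard-coded rule.

from collections import defaultdict

def node_centric_distances(edges, source):
    """Compute hop distances from `source` via synchronous per-node rounds."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Each node agent keeps only local state: its best-known distance so far.
    state = {n: (0 if n == source else float("inf")) for n in adj}

    changed = True
    while changed:
        changed = False
        # One round: every agent announces its current distance to its neighbors...
        offers = defaultdict(list)
        for n in adj:
            for nbr in adj[n]:
                offers[nbr].append(state[n] + 1)
        # ...and each agent keeps the best offer it received.
        for n in adj:
            best = min(offers[n], default=float("inf"))
            if best < state[n]:
                state[n] = best
                changed = True
    return state

# Example: a 5-node path graph; distances from node 0 are 0, 1, 2, 3, 4.
print(node_centric_distances([(0, 1), (1, 2), (2, 3), (3, 4)], source=0))
```

The design point to notice is that no participant ever sees more than one node's local neighborhood, which is what lets the approach scale to graphs with over 1,000 nodes.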

Results and Findings

Evaluated on the GraphInstruct dataset, the GraphAgent-Reasoner framework demonstrates near-perfect accuracy on polynomial-time graph reasoning tasks, significantly outperforming the best available models, both closed-source and fine-tuned open-source variants. As the graph size increases, the framework maintains robust accuracy, unlike other methods, which exhibit significant performance degradation.

Implications and Conclusions

The GraphAgent-Reasoner framework represents a significant advancement in graph reasoning with LLMs. By distributing the problem across collaborating agents, it addresses the limitations of single LLMs in handling complex graph structures and large-scale graphs, paving the way for LLMs to tackle real-world graph reasoning applications with high accuracy and scalability.


TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens

Authors: Ya-Qi Yu, Minghui Liao, Jiwen Zhang, Jihao Wu

Source and references: https://arxiv.org/abs/2410.05261v1


Introduction

The paper presents TextHawk2, a bilingual Large Vision-Language Model (LVLM) that excels in Optical Character Recognition (OCR) and grounding tasks while using 16 times fewer tokens than previous models.
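The excerpt here does not describe TextHawk2's actual compression mechanism, but a generic way an LVLM can achieve a 16x reduction is to merge each non-overlapping 4x4 window of visual patch features into a single token before the language model attends to them. The sketch below illustrates that idea only; the module name, window size, and projection layer are assumptions for illustration, not TextHawk2's architecture.

```python
# Illustrative sketch only: 16x visual-token compression by merging
# each non-overlapping 4x4 window of patch features into one token.
# This is a generic mechanism assumed for illustration, not a
# description of TextHawk2's actual design.

import torch
import torch.nn as nn

class WindowTokenCompressor(nn.Module):
    """Merge every 4x4 window of visual tokens into one (a 16x reduction)."""

    def __init__(self, dim: int, window: int = 4):
        super().__init__()
        self.window = window
        # Project each flattened window back down to the model dimension.
        self.proj = nn.Linear(dim * window * window, dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, H, W, dim) grid of patch features from the vision encoder.
        b, h, w, d = feats.shape
        k = self.window
        assert h % k == 0 and w % k == 0, "grid must tile into k x k windows"
        # Regroup into non-overlapping k x k windows and flatten each window.
        feats = feats.reshape(b, h // k, k, w // k, k, d)
        feats = feats.permute(0, 1, 3, 2, 4, 5)
        feats = feats.reshape(b, (h // k) * (w // k), k * k * d)
        return self.proj(feats)  # (batch, H*W/16, dim) compressed tokens

# Example: a 32x32 grid of 256-d patch features -> 64 tokens instead of 1,024.
tokens = WindowTokenCompressor(dim=256)(torch.randn(1, 32, 32, 256))
print(tokens.shape)  # torch.Size([1, 64, 256])
```

With a 32x32 patch grid, the language model sees 64 visual tokens instead of 1,024, matching the 16x factor claimed in the title.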
