Contents
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models
NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models
Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
Reducing the Barriers to Entry for Foundation Model Training
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
The Future of Large Language Model Pre-training is Federated
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning
AutoTurb: Using Large Language Models for Automatic Algebraic Model Discovery of Turbulence Closure
Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction
Mindalogue: LLM-Powered Nonlinear Interaction for Effective Learning and Task Exploration
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models
Authors: Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang
Source and references: https://arxiv.org/abs/2408.00724v2
Introduction
This paper explores inference scaling laws and compute-optimal inference strategies for large language models (LLMs) on problem-solving tasks, with a focus on mathematical reasoning.
Key Points
The authors study the trade-offs between model size, inference strategies, and computational budget during inference.
They propose a novel tree search algorithm called Reward Balanced Search (REBASE) that is compute-optimal compared to existing methods like sampling and Monte Carlo Tree Search (MCTS).
Their findings indicate that smaller models can outperform larger ones under the same compute budget by increasing the number of samples during inference.
The authors provide a theoretical analysis of the asymptotic behavior and convergence of voting-based inference strategies, highlighting the need for more sophisticated algorithms.
REBASE consistently outperforms other inference strategies across different model sizes and tasks, often achieving a Pareto-optimal trade-off between accuracy and compute.
Methodology
The authors explore various inference strategies, including greedy search, best-of-n, majority voting, weighted voting, and their tree-search variants. They formulate the compute-optimal inference problem and propose the REBASE algorithm, which uses a reward model to guide the tree search process efficiently.
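Below is a minimal, self-contained sketch of the two ingredients described above: reward-balanced expansion of a search tree and weighted voting over the finished solutions. It illustrates the general idea rather than the authors' implementation; `sample_next_step`, `step_reward`, `is_complete`, and `extract_answer` are hypothetical placeholders standing in for an LLM sampler and a process reward model.

```python
import math
import random
from collections import defaultdict

# Placeholders for an LLM sampler and a process reward model (hypothetical).
def sample_next_step(partial_solution):
    # One "LLM call" that appends a single reasoning step.
    return partial_solution + [f"step-{random.random():.3f}"]

def step_reward(partial_solution):
    # Process-reward-model score for a partial solution.
    return random.random()

def is_complete(partial_solution, max_depth=4):
    return len(partial_solution) >= max_depth

def extract_answer(solution):
    # Parse the final answer out of a finished solution.
    return solution[-1]

def rebase_style_search(budget_per_depth=8, max_depth=4, temperature=1.0):
    frontier = [[]]   # partial solutions (lists of steps)
    finished = []     # (solution, reward) pairs
    for _ in range(max_depth):
        if not frontier:
            break
        # Score the frontier and turn rewards into expansion weights.
        rewards = [step_reward(p) for p in frontier]
        weights = [math.exp(r / temperature) for r in rewards]
        total = sum(weights)
        # Allocate the fixed per-depth budget roughly in proportion to the weights.
        counts = [max(0, round(budget_per_depth * w / total)) for w in weights]
        next_frontier = []
        for partial, n_children in zip(frontier, counts):
            for _ in range(n_children):
                child = sample_next_step(partial)
                if is_complete(child, max_depth):
                    finished.append((child, step_reward(child)))
                else:
                    next_frontier.append(child)
        frontier = next_frontier
    return finished

def weighted_vote(finished):
    # Sum reward scores per distinct final answer; return the highest-scoring answer.
    scores = defaultdict(float)
    for solution, reward in finished:
        scores[extract_answer(solution)] += reward
    return max(scores, key=scores.get) if scores else None

if __name__ == "__main__":
    print(weighted_vote(rebase_style_search()))
```

The key design choice is that higher-scoring partial solutions receive more of the fixed per-depth sampling budget, so compute concentrates on promising branches without abandoning the rest of the tree.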
Results and Findings
The authors find that smaller models (e.g., Llemma-7B) can outperform larger models (e.g., Llemma-34B) under the same computational budget by drawing more samples during inference. Their theoretical analysis shows that standard voting-based strategies hit a performance ceiling and exhibit diminishing returns as the number of samples increases.
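One standard way to see why extra samples eventually stop helping (a sketch of the general argument, in notation chosen here rather than taken from the paper):

```latex
% Sketch: why voting-based accuracy saturates as the sample count n grows.
% Let p_y be the probability that one sampled solution ends in final answer y,
% and let y* be the correct answer for a given problem.
\[
  \Pr\!\big[\text{majority vote over } n \text{ samples returns } y^{*}\big]
  \;\longrightarrow\;
  \mathbf{1}\!\left[\, p_{y^{*}} > \max_{y \neq y^{*}} p_y \,\right]
  \quad \text{as } n \to \infty,
\]
% so benchmark accuracy converges to the fraction of problems whose correct
% answer is already the single most likely sampled answer; beyond that point,
% additional samples only add compute.
```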
The proposed REBASE algorithm consistently outperforms sampling-based methods and MCTS across all settings, models, and tasks. REBASE with a smaller language model often achieves a Pareto-optimal trade-off between accuracy and computational cost.
Implications and Conclusions
The results demonstrate the value of using smaller models with advanced inference-time algorithms to achieve better returns on inference-time compute. The authors' findings contribute to a broader understanding of inference scaling laws for LLMs and highlight the importance of developing new, compute-optimal inference algorithms.
NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models
Authors: Yanbiao Ji, Chang Liu, Xin Chen, Yue Ding, Dan Luo, Mei Li, Wenqing Lin, Hongtao Lu
Source and references: https://arxiv.org/abs/2410.10743v1
Introduction
This paper introduces NT-LLM, a novel framework that efficiently encodes graph structures for seamless integration with Large Language Models (LLMs). The core of the method is the strategic selection of key nodes, referred to as anchors, which serve as reference points for encoding the graph topology.
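As a rough illustration of the anchor idea (a hedged sketch, not the paper's actual procedure: anchor selection by node degree and the use of raw BFS distances are simplifying assumptions made here), one can describe each node by its shortest-path distances to a small set of anchor nodes and feed that vector to the LLM as a positional signal.

```python
from collections import deque

def bfs_distances(adj, source):
    # Unweighted shortest-path distances from `source` to every reachable node.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def select_anchors(adj, k=2):
    # Simplifying assumption: take the k highest-degree nodes as anchors.
    return sorted(adj, key=lambda n: len(adj[n]), reverse=True)[:k]

def anchor_position_encoding(adj, k=2, unreachable=-1):
    anchors = select_anchors(adj, k)
    per_anchor = [bfs_distances(adj, a) for a in anchors]
    # Each node gets a k-dimensional vector of distances to the anchors,
    # which a small projection layer could map into the LLM's embedding space.
    return {n: [d.get(n, unreachable) for d in per_anchor] for n in adj}

if __name__ == "__main__":
    graph = {
        "a": ["b", "c"],
        "b": ["a", "c", "d"],
        "c": ["a", "b"],
        "d": ["b"],
    }
    print(anchor_position_encoding(graph))
```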