State of AI

Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI
Oct 15, 2024

Contents

  1. Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

  2. NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models

  3. Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents

  4. MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

  5. TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model

  6. Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

  7. Reducing the Barriers to Entry for Foundation Model Training

  8. SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

  9. The Future of Large Language Model Pre-training is Federated

  10. DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

  11. Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

  12. Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning

  13. AutoTurb: Using Large Language Models for Automatic Algebraic Model Discovery of Turbulence Closure

  14. Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction

  15. Mindalogue: LLM-Powered Nonlinear Interaction for Effective Learning and Task Exploration


Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Authors: Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang

Source and references: https://arxiv.org/abs/2408.00724v2


Introduction

This paper explores inference scaling laws and compute-optimal inference strategies for large language models (LLMs) on problem-solving tasks, with a focus on mathematical reasoning.

Key Points

  • The authors study the trade-offs between model size, inference strategies, and computational budget during inference.

  • They propose a novel tree-search algorithm, Reward Balanced Search (REBASE), which achieves a better accuracy-compute trade-off than existing methods such as sampling and Monte Carlo Tree Search (MCTS).

  • Their findings indicate that smaller models can outperform larger ones under the same compute budget by increasing the number of samples during inference.

  • The authors provide a theoretical analysis of the asymptotic behavior and convergence of voting-based inference strategies, highlighting the need for more sophisticated algorithms.

  • REBASE consistently outperforms other inference strategies across different model sizes and tasks, often achieving a Pareto-optimal trade-off between accuracy and compute.

Methodology

The authors explore various inference strategies, including greedy search, best-of-n, majority voting, weighted voting, and their tree-search variants. They formulate the compute-optimal inference problem and propose the REBASE algorithm, which uses a reward model to guide the tree search process efficiently.
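
To make the strategies concrete, here is a minimal Python sketch (not the authors' code) of weighted voting and of a REBASE-style step that splits an expansion budget across tree nodes in proportion to the softmax of their rewards. The function names, the softmax temperature, and the rounding scheme are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of two ideas discussed above:
# (1) weighted majority voting over sampled solutions, and
# (2) a REBASE-style step that allocates a fixed expansion budget
#     across partial solutions in proportion to softmax(reward).
import math
from collections import defaultdict

def weighted_majority_vote(samples):
    """samples: list of (final_answer, reward) pairs for one question."""
    scores = defaultdict(float)
    for answer, reward in samples:
        scores[answer] += reward          # weight each vote by its reward
    return max(scores, key=scores.get)    # answer with the highest total weight

def rebase_style_allocation(partial_rewards, budget, temperature=1.0):
    """Split an expansion budget across tree nodes by softmax of rewards."""
    exps = [math.exp(r / temperature) for r in partial_rewards]
    total = sum(exps)
    # Round softmax weights to integer child counts (simplified rounding;
    # the counts may not sum exactly to `budget`).
    return [round(budget * e / total) for e in exps]

# Example: three sampled solutions and four partial solutions in the tree.
print(weighted_majority_vote([("42", 0.9), ("41", 0.4), ("42", 0.7)]))  # -> "42"
print(rebase_style_allocation([0.2, 1.5, 0.9, -0.3], budget=16))
```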

Results and Findings

The authors find that smaller models (e.g., Llemma-7B) can outperform larger models (e.g., Llemma-34B) under the same computational budget by using more samples during inference. Their theoretical analysis shows that standard voting-based strategies have performance limits and diminishing returns as the number of samples increases.

The proposed REBASE algorithm consistently outperforms sampling-based methods and MCTS across all settings, models, and tasks. REBASE with a smaller language model often achieves a Pareto-optimal trade-off between accuracy and computational cost.
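
One way to read this saturation effect (a hedged illustration, not the paper's exact statement): if the N samples for each question x are i.i.d. draws from the model's answer distribution p(y | x) with a unique most probable answer, majority voting eventually always selects that mode, so its accuracy approaches a fixed ceiling:

$$\lim_{N \to \infty} \mathrm{Acc}\big(\mathrm{MajVote}_N\big) \;=\; \Pr_{x}\Big[\arg\max_{y} p(y \mid x) = y^{*}(x)\Big],$$

i.e., the fraction of questions whose most probable answer under the model is already correct; additional samples only reduce variance around this limit.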

Implications and Conclusions

The results demonstrate the value of using smaller models with advanced inference-time algorithms to achieve better returns on inference-time compute. The authors' findings contribute to a broader understanding of inference scaling laws for LLMs and highlight the importance of developing new, compute-optimal inference algorithms.


NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models

Authors: Yanbiao Ji, Chang Liu, Xin Chen, Yue Ding, Dan Luo, Mei Li, Wenqing Lin, Hongtao Lu

Source and references: https://arxiv.org/abs/2410.10743v1


Introduction

This paper introduces NT-LLM, a novel framework that efficiently encodes graph structures for seamless integration with Large Language Models (LLMs). The core of the method is the strategic selection of key nodes, referred to as anchors, which serve as reference points for encoding the graph topology.
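
As a rough illustration of the anchor idea (an assumption-laden sketch, not the paper's tokenizer): one could pick a few reference nodes and describe every other node by its shortest-path distances to them, yielding a compact positional vector per node. The degree-based anchor selection, the `networkx` usage, and the fallback distance for unreachable nodes below are all illustrative choices.

```python
# Illustrative sketch: anchor nodes as reference points for a
# positional encoding of graph structure.
import networkx as nx
import numpy as np

def anchor_positional_encoding(graph: nx.Graph, num_anchors: int = 4) -> dict:
    # Pick high-degree nodes as anchors (illustrative heuristic only).
    anchors = sorted(graph.nodes, key=graph.degree, reverse=True)[:num_anchors]
    # Each node is described by its shortest-path distance to every anchor.
    dist = {a: dict(nx.single_source_shortest_path_length(graph, a)) for a in anchors}
    n = graph.number_of_nodes()  # fallback distance for unreachable nodes
    return {
        node: np.array([dist[a].get(node, n) for a in anchors], dtype=float)
        for node in graph.nodes
    }

# Example on a small graph: every node gets a num_anchors-dimensional position.
g = nx.karate_club_graph()
encoding = anchor_positional_encoding(g)
print(encoding[0])
```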
