Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI

Jul 23, 2024

∙ Paid

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning
LLMmap: Fingerprinting For Large Language Models
ACEGEN: Reinforcement learning of generative chemical agents for drug discovery
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy
MarkLLM: An Open-Source Toolkit for LLM Watermarking
UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs
Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy
Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models
Imposter.AI: Adversarial Attacks with Hidden Intentions towards Aligned Large Language Models
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
SynthBA: Reliable Brain Age Estimation Across Multiple MRI Sequences and Resolutions
The Future of Large Language Model Pre-training is Federated
SparQ Attention: Bandwidth-Efficient LLM Inference

**Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning**

Authors: Chaojie Wang, Yanchen Deng, Zhiyi Lyu, Liang Zeng, Jujie He, Shuicheng Yan, Bo An

Source and references: https://arxiv.org/abs/2406.14283v4

Introduction

This paper explores a framework called Q* that aims to improve the multi-step reasoning capabilities of Large Language Models (LLMs) by guiding their decoding process through deliberative planning.

Key Points

Q* casts multi-step reasoning of LLMs as a heuristic search problem, allowing for more deliberative and logical thinking.
Q* learns a plug-and-play Q-value model as a heuristic function to estimate the expected future rewards, guiding LLMs to select the most promising next reasoning step.
Q* does not require fine-tuning the LLMs for the current task, avoiding significant computational overhead and potential performance degeneration on other tasks.
Extensive experiments on GSM8K, MATH, and MBPP datasets demonstrate the superiority of Q* in improving the reasoning performance of existing open-source LLMs.

Methodology

The authors approach the problem of multi-step reasoning in LLMs by casting it as a heuristic search problem. They introduce Q*, a framework that learns a Q-value model as a heuristic function to estimate the expected future rewards, effectively guiding the LLMs' decoding process without the need for fine-tuning the models.

Results and Findings

The results of the extensive experiments conducted on the GSM8K, MATH, and MBPP datasets show that Q* significantly improves the reasoning performance of existing open-source LLMs. The authors present quantitative data and comparisons to demonstrate the superiority of their approach.

Implications and Conclusions

This research presents an important step towards enhancing the multi-step reasoning capabilities of LLMs, which is crucial for their application in complex problem-solving tasks. By introducing a general and versatile framework like Q*, the authors aim to contribute to the ongoing efforts to improve the reasoning abilities of existing LLMs.

LLMmap: Fingerprinting For Large Language Models

Authors: Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese

Source and references: https://arxiv.org/abs/2407.15847v1

Introduction

This paper introduces LLMmap, a first-generation fingerprinting attack targeted at applications integrating Large Language Models (LLMs). LLMmap employs an active fingerprinting approach, sending carefully crafted queries to the application and analyzing the responses to identify the specific LLM model in use.

Get 7 day free trial

Keep reading with a 7-day free trial

Subscribe to State of AI to keep reading this post and get 7 days of free access to the full post archives.

Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP and AI

Contents

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

Introduction

Key Points

Methodology

Results and Findings

Implications and Conclusions

LLMmap: Fingerprinting For Large Language Models

Introduction

Keep reading with a 7-day free trial

**Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning**