State of AI

Automated Research, and Reasoning Breakthroughs

State of AI
Sep 10, 2025

Welcome to today's edition of State of AI 🚀

This issue highlights the latest advancements in unifying different modalities within large language models, automating the creation of research software, and pushing the boundaries of complex reasoning capabilities. We'll explore how these technical innovations are shaping the future of AI-powered scientific discovery and practical applications.

Here's what caught our attention:

  • Unified Multimodal LLMs with Discrete Representations: The AnyGPT model demonstrates how discrete representations can effectively unify the processing of speech, text, images, and music within a single language model.

  • Automating Empirical Software Creation: The AI system described in the first paper below systematically explores a vast solution space and integrates research ideas to generate expert-level empirical software across diverse scientific domains.

  • Evaluating LLM Reasoning with SearchBench: This new benchmark tests the ability of LLMs to reason about complex combinatorial search problems, revealing opportunities to improve their generalized reasoning capabilities.

  • Self-Improving Multimodal Models with Dual Rewards: The SUDER framework leverages the inherent duality between understanding and generation tasks to provide self-supervised optimization signals, enhancing the performance of unified LLMs.

  • Adaptive Multi-Turn RL for LLM Step-Provers: The BFS-Prover-V2 system scales up both training-time RL and inference-time compute to advance the integration of LLMs into automated theorem proving.

Let's get into it 👇

Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP and AI

Contents

  1. An AI system to help scientists write expert-level empirical software

  2. Navigating the Labyrinth: Evaluating LLMs' Ability to Reason About Search Problems

  3. Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers

  4. SUDER: Self-Improving Unified Large Multimodal Models for Understanding and Generation with Dual Self-Rewards

  5. AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

  6. MM-DINOv2: Adapting Foundation Models for Multi-Modal Medical Image Analysis

  7. From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers

  8. Learning words in groups: fusion algebras, tensor ranks and grokking

  9. Data-driven solar forecasting enables near-optimal economic decisions

  10. Transforming Wearable Data into Personal Health Insights using Large Language Model Agents

  11. Beyond Two-Stage Training: Cooperative SFT and RL for LLM Reasoning

  12. Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents

  13. LLaDA-VLA: Vision Language Diffusion Action Models

  14. Oyster-I: Beyond Refusal -- Constructive Safety Alignment for Responsible Language Models

  15. F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

An AI system to help scientists write expert-level empirical software
