State of AI

Scaling Transformers, Video-Language Models, and Collaborative Reasoning

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI
Jan 18, 2026

Welcome to today’s edition of State of AI 🤖 👋

This edition covers topics ranging from scaling transformer architectures and advanced video-language models to novel multi-agent reasoning frameworks and benchmarks for cultural and multilingual video understanding. We also delve into the origins of neural scaling laws and reinforcement learning approaches that promote creative problem-solving.

Here’s what caught our attention:

  • STEM: Scaling Transformers with Embedding Modules - A static, token-indexed approach that decouples parametric capacity from per-token compute, reducing FLOPs and parameter accesses while improving downstream accuracy (see the sketch after this list).

  • Molmo2: Open Weights and Data for Vision-Language Models - A new family of open-source video-language models that demonstrate exceptional grounding capabilities, matching or surpassing prior open models and even proprietary systems.

  • Collaborative Multi-Agent Test-Time Reinforcement Learning - A framework that leverages structured textual experience to enhance the capabilities of collaborative multi-agent systems, leading to improved performance across various domains.
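
As a rough intuition for the STEM idea, here is a minimal PyTorch sketch of a static, token-indexed module: parametric capacity scales with the size of the lookup table, while each token only pays for a lookup and an add. The class name, table placement in the residual stream, and sizes are our illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class TokenIndexedEmbeddingModule(nn.Module):
    """Illustrative token-indexed module: each token id statically indexes
    its own parameter vector, which is added to the hidden state.

    Capacity grows with vocab_size * d_model, but per-token compute is a
    single table lookup plus an add, independent of the table's size.
    """
    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.table = nn.Embedding(vocab_size, d_model)

    def forward(self, hidden: torch.Tensor, token_ids: torch.Tensor):
        # hidden: (batch, seq, d_model); token_ids: (batch, seq)
        return hidden + self.table(token_ids)

# Toy usage: the lookup touches only d_model parameters per token, where a
# dense FFN of comparable capacity would require a full matrix multiply.
mod = TokenIndexedEmbeddingModule(vocab_size=50_000, d_model=512)
h = torch.randn(2, 16, 512)
ids = torch.randint(0, 50_000, (2, 16))
print(mod(h, ids).shape)  # torch.Size([2, 16, 512])
```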

Let’s get into it 👇

Contents

  1. Pareto-Grid-Guided Large Language Models for Fast and High-Quality Heuristics Design in Multi-Objective Combinatorial Optimization

  2. Multi-Property Synthesis

  3. From Single to Multi-Agent Reasoning: Advancing GeneGPT for Genomics QA

  4. STEP3-VL-10B Technical Report

  5. Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

  6. CURVE: A Benchmark for Cultural and Multilingual Long Video Reasoning

  7. STEM: Scaling Transformers with Embedding Modules

  8. On the origin of neural scaling laws: from random graphs to natural language

  9. Riesz Representer Fitting under Bregman Divergence: A Unified Framework for Debiased Machine Learning

  10. Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

  11. Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

  12. PlotCraft: Pushing the Limits of LLMs for Complex and Interactive Data Visualization

  13. See Less, Drive Better: Generalizable End-to-End Autonomous Driving via Foundation Models Stochastic Patch Selection

  14. TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge

  15. Generative AI collective behavior needs an interactionist paradigm

Pareto-Grid-Guided Large Language Models for Fast and High-Quality Heuristics Design in Multi-Objective Combinatorial Optimization

Source and references: https://arxiv.org/abs/2507.20923v3


Introduction

This paper introduces a novel framework called MPaGE for automatically designing heuristics to solve multi-objective combinatorial optimization problems (MOCOP). MPaGE leverages large language models (LLMs) and Pareto Front Grid (PFG) techniques to discover a diverse set of heuristics that jointly optimize solution quality and runtime efficiency.

Key Points

  • MPaGE is the first framework to systematically combine LLMs with the Simple Evolutionary Multiobjective Optimization (SEMO) paradigm and PFG.

  • It uses LLMs to verify the logical structure of heuristics and perform cross-cluster recombination, enhancing diversity and reducing redundancy (sketched after this list).

  • Through extensive experiments on standard MOCOP benchmarks, MPaGE demonstrates consistent improvements in runtime efficiency, solution quality, and semantic diversity over LLM-based baselines and traditional multi-objective evolutionary algorithms (MOEAs).
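
To illustrate the cross-cluster recombination step, here is a small Python sketch: parents are drawn from different semantic clusters and handed to an LLM via a recombination prompt. The function names, prompt wording, and cluster representation are our assumptions for illustration, not MPaGE's actual implementation.

```python
import random

def cross_cluster_parents(clusters, rng=random.Random(0)):
    """Pick two parent heuristics from *different* semantic clusters.

    clusters: dict mapping cluster id -> list of heuristic source strings
              (the grouping is assumed to come from an LLM clustering step).
    """
    ids = [cid for cid, members in clusters.items() if members]
    a, b = rng.sample(ids, 2)                  # two distinct clusters
    return rng.choice(clusters[a]), rng.choice(clusters[b])

def recombination_prompt(parent_a, parent_b):
    """Build an illustrative prompt asking an LLM to recombine two
    logically distinct heuristics into a new one."""
    return (
        "You are given two heuristics with different logical structures.\n"
        f"Heuristic A:\n{parent_a}\n\nHeuristic B:\n{parent_b}\n\n"
        "Combine their ideas into one new heuristic that differs from both."
    )

# Toy usage with placeholder heuristic bodies.
clusters = {0: ["def h1(sol): ..."], 1: ["def h2(sol): ..."]}
pa, pb = cross_cluster_parents(clusters)
print(recombination_prompt(pa, pb))
```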

Methodology

MPaGE partitions the objective space into grid cells using PFG and retains top-performing candidates to guide heuristic generation. It then employs LLMs to assess the semantic structures of the candidate heuristics, clustering them into groups of similar logic. Variation is then performed with respect to these clusters, promoting semantic diversity and mitigating redundancy within the heuristic population.
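
To make the grid step concrete, here is a minimal Python sketch of Pareto-grid selection under two minimized objectives (solution quality and runtime). The function name `pfg_select`, the grid resolution, and the in-cell ranking by objective sum are illustrative assumptions, not MPaGE's actual procedure.

```python
import numpy as np

def pfg_select(objectives, n_cells=8):
    """Pareto Front Grid selection sketch: partition the (minimized)
    objective space into a grid and keep the best candidate per cell.

    objectives: (n, 2) array of [solution quality, runtime] per heuristic,
                both to be minimized.
    Returns indices of the retained candidates, at most one per cell.
    """
    objs = np.asarray(objectives, dtype=float)
    lo, hi = objs.min(axis=0), objs.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)      # avoid divide-by-zero
    # Map each candidate to a grid cell index along each objective.
    cells = np.minimum(((objs - lo) / span * n_cells).astype(int),
                       n_cells - 1)
    kept = {}
    for i, cell in enumerate(map(tuple, cells)):
        # Within a cell, prefer the candidate with the smaller objective
        # sum (a simple stand-in for the paper's in-cell ranking).
        if cell not in kept or objs[i].sum() < objs[kept[cell]].sum():
            kept[cell] = i
    return sorted(kept.values())

# Toy usage: 20 heuristics scored on (quality gap, runtime in seconds).
rng = np.random.default_rng(0)
pop = rng.random((20, 2))
print("retained candidates:", pfg_select(pop, n_cells=4))
```

Keeping only one elite per cell spreads the retained candidates across the whole quality-runtime trade-off surface instead of letting them bunch up in one region, which is what guides the LLM toward generating diverse new heuristics.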

Results and Findings
