State of AI

Taming Long Tails, Securing Smart Contracts, and Probing the Critical Point of AI Reasoning

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI
Nov 24, 2025

Welcome to today’s edition of State of AI! 👋 And a warm welcome to our new subscribers since the last edition!

This issue covers a range of fascinating technical topics, from improving the efficiency of reinforcement learning training to securing smart contract languages and establishing new frontiers for evaluating AI reasoning capabilities. We’ll also explore advancements in 3D reconstruction, multimodal understanding and generation, and speech representation learning.

Here’s what caught our attention:

  • Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter - A system that accelerates reasoning RL training by addressing the efficiency bottleneck caused by the long-tail distribution of response generation.

  • Securing Smart Contract Languages with a Unified Agentic Framework for Vulnerability Repair in Solidity and Move - A novel multi-agent framework that leverages LLMs to automatically detect and repair vulnerabilities in Solidity and Move smart contracts.

  • Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark - A new benchmark designed to evaluate the reasoning abilities of LLMs on unpublished, research-level physics problems.

  • SAM 3D: 3Dfy Anything in Images - A generative neural network for 3D reconstruction from a single image, capable of reconstructing 3D shape, texture, and layout.

  • LightFusion: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation - An efficient unified multimodal framework that strategically fuses pre-trained vision and language models.

Let’s get into it.

Contents

  1. Cognitive Foundations for Reasoning and Their Manifestation in LLMs

  2. Utilizing Large Language Models for Zero-Shot Medical Ontology Extension from Clinical Notes

  3. Securing Smart Contract Languages with a Unified Agentic Framework for Vulnerability Repair in Solidity and Move

  4. SAM 3D: 3Dfy Anything in Images

  5. LightFusion: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation

  6. TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding

  7. Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter

  8. Leveraging Reinforcement Learning, Genetic Algorithms and Transformers for background determination in particle physics

  9. gfnx: Fast and Scalable Library for Generative Flow Networks in JAX

  10. Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs

  11. Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

  12. Codec2Vec: Self-Supervised Speech Representation Learning Using Neural Speech Codecs

  13. MiMo-Embodied: X-Embodied Foundation Model Technical Report

  14. InternData-A1: Pioneering High-Fidelity Synthetic Data for Pre-training Generalist Policy

  15. Dexterity from Smart Lenses: Multi-Fingered Robot Manipulation with In-the-Wild Human Demonstrations

Cognitive Foundations for Reasoning and Their Manifestation in LLMs

Authors: Priyanka Kargupta, Shuyue Stella Li, Haocheng Wang, Jinu Lee, Shan Chen, Orevaoghene Ahia, Dean Light, Thomas L. Griffiths, Max Kleiman-Weiner, Jiawei Han, Asli Celikyilmaz, Yulia Tsvetkov

Source and references: https://arxiv.org/abs/2511.16660v1


Introduction

This paper proposes a unified taxonomy of cognitive foundations for reasoning, synthesizing theories from cognitive science research. The authors conduct the first large-scale empirical comparison of cognitive elements in human versus large language model (LLM) reasoning across diverse problem types.

Key Points

  • Synthesize cognitive science theories into a taxonomy of 28 cognitive elements spanning computational constraints, meta-cognitive controls, knowledge representations, and transformation operations.

  • Analyze 170K reasoning traces from 17 models across text, vision, and audio modalities, alongside 54 human think-aloud traces.

  • Reveal systematic structural differences: humans employ hierarchical nesting and meta-cognitive monitoring while models rely on shallow forward chaining, with divergence most pronounced on ill-structured problems.

  • Observe that the behaviors models employ most consistently are not the ones most conducive to success.

  • Introduce test-time reasoning guidance that automatically scaffolds successful reasoning structures, improving performance by up to 60% on complex problems.
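The test-time guidance idea can be illustrated with a minimal sketch: prepend a scaffold describing a successful reasoning structure to the problem before querying a model. The scaffold wording and the `scaffold_prompt` helper below are illustrative assumptions, not the paper's actual prompts.

```python
# Minimal sketch of test-time reasoning guidance: wrap a problem with a
# scaffold that spells out a reasoning structure before the model answers.
# The scaffold text here is a hypothetical example, not the paper's.
SCAFFOLD = (
    "Before answering: (1) decompose the problem into subgoals, "
    "(2) solve each subgoal explicitly, and (3) verify intermediate "
    "results against the original constraints."
)

def scaffold_prompt(problem: str) -> str:
    """Prepend reasoning-structure guidance to a problem statement."""
    return f"{SCAFFOLD}\n\nProblem: {problem}"

print(scaffold_prompt("Prove that the sum of two even integers is even."))
```

In practice the scaffold would be selected to match the reasoning structures that correlate with success on a given problem type, rather than fixed as above.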

Methodology

The authors synthesize cognitive science theories through Marr’s levels of analysis to propose a taxonomy of 28 cognitive elements. They collect and annotate 170K model reasoning traces and 54 human traces across text, vision, and audio modalities. Using fine-grained span-level annotation, they identify which cognitive elements appear in each trace and how they are sequenced.
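As a rough sketch of what span-level annotation yields, each trace can be represented as labeled character spans, from which the sequence of cognitive elements is recovered by position. The `Span` structure and element names below are illustrative assumptions, not the authors' schema.

```python
from dataclasses import dataclass

@dataclass
class Span:
    start: int    # character offset where the annotated span begins
    end: int      # character offset where the annotated span ends
    element: str  # cognitive element label assigned to this span

def element_sequence(spans):
    """Order annotated spans by position to recover how cognitive
    elements are sequenced within a single reasoning trace."""
    return [s.element for s in sorted(spans, key=lambda s: s.start)]

# Hypothetical annotation of one short trace; the paper's taxonomy
# has 28 elements across four categories.
spans = [
    Span(0, 25, "goal_management"),
    Span(29, 62, "self_verification"),
]
print(element_sequence(spans))  # ['goal_management', 'self_verification']
```

Sequences like these are what make structural comparisons possible, e.g. detecting hierarchical nesting in human traces versus shallow forward chaining in model traces.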

Results and Findings
