Taming Long Tails, Securing Smart Contracts, and Probing the Critical Point of AI Reasoning
Latest research summaries in ML, Robotics, CV, NLP and AI
Welcome to today’s edition of State of AI! 👋 And a warm welcome to our new subscribers since the last edition!
This issue covers a range of fascinating technical topics, from improving the efficiency of reinforcement learning training to securing smart contract languages and establishing new frontiers for evaluating AI reasoning capabilities. We’ll also explore advancements in 3D reconstruction, multimodal understanding and generation, and speech representation learning.
Here’s what caught our attention:
Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter - A system that accelerates reasoning RL training by addressing the efficiency bottleneck caused by the long-tail distribution of response generation.
Securing Smart Contract Languages with a Unified Agentic Framework for Vulnerability Repair in Solidity and Move - A novel multi-agent framework that leverages LLMs to automatically detect and repair vulnerabilities in Solidity and Move smart contracts.
Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark - A new benchmark designed to evaluate the reasoning abilities of LLMs on unpublished, research-level physics problems.
SAM 3D: 3Dfy Anything in Images - A generative neural network for 3D reconstruction from a single image, capable of reconstructing 3D shape, texture, and layout.
LightFusion: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation - An efficient unified multimodal framework that strategically fuses pre-trained vision and language models.
Let’s get into it
Contents
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Utilizing Large Language Models for Zero-Shot Medical Ontology Extension from Clinical Notes
TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding
Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
gfnx: Fast and Scalable Library for Generative Flow Networks in JAX
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark
Codec2Vec: Self-Supervised Speech Representation Learning Using Neural Speech Codecs
InternData-A1: Pioneering High-Fidelity Synthetic Data for Pre-training Generalist Policy
Dexterity from Smart Lenses: Multi-Fingered Robot Manipulation with In-the-Wild Human Demonstrations
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Authors: Priyanka Kargupta, Shuyue Stella Li, Haocheng Wang, Jinu Lee, Shan Chen, Orevaoghene Ahia, Dean Light, Thomas L. Griffiths, Max Kleiman-Weiner, Jiawei Han, Asli Celikyilmaz, Yulia Tsvetkov
Source and references: https://arxiv.org/abs/2511.16660v1
Introduction
This paper proposes a unified taxonomy of cognitive foundations for reasoning, synthesizing theories from cognitive science research. The authors conduct the first large-scale empirical comparison of cognitive elements in human versus large language model (LLM) reasoning across diverse problem types.
Key Points
Synthesize cognitive science theories into a taxonomy of 28 cognitive elements spanning computational constraints, meta-cognitive controls, knowledge representations, and transformation operations.
Analyze 170K reasoning traces from 17 models across text, vision, and audio modalities, alongside 54 human think-aloud traces.
Reveal systematic structural differences: humans employ hierarchical nesting and meta-cognitive monitoring while models rely on shallow forward chaining, with divergence most pronounced on ill-structured problems.
Observe that the behaviors models employ most consistently are not those most conducive to success.
Introduce test-time reasoning guidance that automatically scaffolds successful reasoning structures, improving performance by up to 60% on complex problems.
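The mechanism behind this test-time guidance is not spelled out in the summary above, but one way to picture it is as a prompt-level scaffold that imposes the hierarchical, self-monitoring structure seen in successful human traces. The sketch below is purely illustrative: the scaffold wording and the scaffolded_prompt helper are our own assumptions, not the paper's implementation.

```python
# Hypothetical sketch of test-time reasoning guidance as a prompt scaffold.
# The scaffold text below is an illustrative assumption; the paper's actual
# scaffolds are derived from its taxonomy of cognitive elements.
SCAFFOLD_ILL_STRUCTURED = (
    "Before answering: (1) decompose the problem into sub-goals, "
    "(2) after each step, check whether it advances a sub-goal, "
    "(3) if a check fails, revise the plan before continuing."
)

def scaffolded_prompt(problem: str, scaffold: str = SCAFFOLD_ILL_STRUCTURED) -> str:
    """Prepend a reasoning-structure scaffold to the problem before querying a model."""
    return f"{scaffold}\n\nProblem: {problem}\n\nShow your reasoning, then state the final answer."
```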
Methodology
The authors synthesize cognitive science theories through Marr’s levels of analysis to propose a taxonomy of 28 cognitive elements. They collect and annotate 170K model reasoning traces and 54 human traces across text, vision, and audio modalities. Using fine-grained span-level annotation, they identify which cognitive elements appear in each trace and how they are sequenced.
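To make "fine-grained span-level annotation" concrete, here is a minimal sketch of how a labeled reasoning trace might be represented and how the sequence of cognitive elements can be recovered from it. The AnnotatedSpan class, the toy element labels, and the element_sequence helper are illustrative assumptions; the actual 28-element taxonomy and annotation scheme are defined in the paper.

```python
from dataclasses import dataclass

@dataclass
class AnnotatedSpan:
    """A contiguous span of a reasoning trace labeled with one cognitive element."""
    start: int      # character offset where the span begins (inclusive)
    end: int        # character offset where the span ends (exclusive)
    element: str    # e.g. "decomposition" (illustrative label, not the paper's)
    category: str   # e.g. "transformation operations"

def element_sequence(spans: list[AnnotatedSpan]) -> list[str]:
    """Sort annotated spans by position to recover how elements are sequenced."""
    return [s.element for s in sorted(spans, key=lambda s: s.start)]

# Toy trace annotated with two spans.
trace = "First, break the problem into sub-goals. Then check whether the partial answer is consistent."
spans = [
    AnnotatedSpan(0, 40, "decomposition", "transformation operations"),
    AnnotatedSpan(41, 93, "self-monitoring", "meta-cognitive controls"),
]
print(element_sequence(spans))  # ['decomposition', 'self-monitoring']
```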
Results and Findings