State of AI

Taming Long Tails, Securing Smart Contracts, and Probing the Critical Point of AI Reasoning

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI
Nov 24, 2025

Welcome to today’s edition of State of AI! 👋 And a warm welcome to our new subscribers since the last edition!

This issue covers a range of fascinating technical topics, from improving the efficiency of reinforcement learning training to securing smart contract languages and establishing new frontiers for evaluating AI reasoning capabilities. We’ll also explore advancements in 3D reconstruction, multimodal understanding and generation, and speech representation learning.

Here’s what caught our attention:

  • Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter - A system that accelerates reasoning RL training by addressing the efficiency bottleneck caused by the long-tail distribution of response generation.

  • Securing Smart Contract Languages with a Unified Agentic Framework for Vulnerability Repair in Solidity and Move - A novel multi-agent framework that leverages LLMs to automatically detect and repair vulnerabilities in Solidity and Move smart contracts.

  • Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark - A new benchmark designed to evaluate the reasoning abilities of LLMs on unpublished, research-level physics problems.

  • SAM 3D: 3Dfy Anything in Images - A generative neural network for 3D reconstruction from a single image, capable of reconstructing 3D shape, texture, and layout.

  • LightFusion: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation - An efficient unified multimodal framework that strategically fuses pre-trained vision and language models.

Let’s get into it.

Contents

  1. Cognitive Foundations for Reasoning and Their Manifestation in LLMs

  2. Utilizing Large Language Models for Zero-Shot Medical Ontology Extension from Clinical Notes

  3. Securing Smart Contract Languages with a Unified Agentic Framework for Vulnerability Repair in Solidity and Move

  4. SAM 3D: 3Dfy Anything in Images

  5. LightFusion: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation

  6. TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding

  7. Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter

  8. Leveraging Reinforcement Learning, Genetic Algorithms and Transformers for background determination in particle physics

  9. gfnx: Fast and Scalable Library for Generative Flow Networks in JAX

  10. Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs

  11. Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

  12. Codec2Vec: Self-Supervised Speech Representation Learning Using Neural Speech Codecs

  13. MiMo-Embodied: X-Embodied Foundation Model Technical Report

  14. InternData-A1: Pioneering High-Fidelity Synthetic Data for Pre-training Generalist Policy

  15. Dexterity from Smart Lenses: Multi-Fingered Robot Manipulation with In-the-Wild Human Demonstrations

Cognitive Foundations for Reasoning and Their Manifestation in LLMs

Authors: Priyanka Kargupta, Shuyue Stella Li, Haocheng Wang, Jinu Lee, Shan Chen, Orevaoghene Ahia, Dean Light, Thomas L. Griffiths, Max Kleiman-Weiner, Jiawei Han, Asli Celikyilmaz, Yulia Tsvetkov

Source and references: https://arxiv.org/abs/2511.16660v1


Introduction

This paper proposes a unified taxonomy of cognitive foundations for reasoning, synthesizing theories from cognitive science research. The authors conduct the first large-scale empirical comparison of cognitive elements in human versus large language model (LLM) reasoning across diverse problem types.

Key Points

  • Synthesize cognitive science theories into a taxonomy of 28 cognitive elements spanning computational constraints, meta-cognitive controls, knowledge representations, and transformation operations.

  • Analyze 170K reasoning traces from 17 models across text, vision, and audio modalities, alongside 54 human think-aloud traces.

  • Reveal systematic structural differences: humans employ hierarchical nesting and meta-cognitive monitoring while models rely on shallow forward chaining, with divergence most pronounced on ill-structured problems.

  • Observe that the behaviors models employ most consistently are not the ones most conducive to success.

  • Introduce test-time reasoning guidance that automatically scaffolds successful reasoning structures, improving performance by up to 60% on complex problems.
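The test-time guidance idea can be illustrated with a minimal sketch: prepend a scaffold describing a successful reasoning structure to the problem before querying a model. The scaffold wording and the `scaffold_prompt` helper below are illustrative assumptions, not the paper's actual prompts.

```python
# Minimal sketch of test-time reasoning guidance: wrap a problem with a
# scaffold that spells out a reasoning structure before the model answers.
# The scaffold text here is a hypothetical example, not the paper's.
SCAFFOLD = (
    "Before answering: (1) decompose the problem into subgoals, "
    "(2) solve each subgoal explicitly, and (3) verify intermediate "
    "results against the original constraints."
)

def scaffold_prompt(problem: str) -> str:
    """Prepend reasoning-structure guidance to a problem statement."""
    return f"{SCAFFOLD}\n\nProblem: {problem}"

print(scaffold_prompt("Prove that the sum of two even integers is even."))
```

In practice the scaffold would be selected to match the reasoning structures that correlate with success on a given problem type, rather than fixed as above.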

Methodology

The authors synthesize cognitive science theories through Marr’s levels of analysis to propose a taxonomy of 28 cognitive elements. They collect and annotate 170K model reasoning traces and 54 human traces across text, vision, and audio modalities. Using fine-grained span-level annotation, they identify which cognitive elements appear in each trace and how they are sequenced.
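As a rough sketch of what span-level annotation yields, each trace can be represented as labeled character spans, from which the sequence of cognitive elements is recovered by position. The `Span` structure and element names below are illustrative assumptions, not the authors' schema.

```python
from dataclasses import dataclass

@dataclass
class Span:
    start: int    # character offset where the annotated span begins
    end: int      # character offset where the annotated span ends
    element: str  # cognitive element label assigned to this span

def element_sequence(spans):
    """Order annotated spans by position to recover how cognitive
    elements are sequenced within a single reasoning trace."""
    return [s.element for s in sorted(spans, key=lambda s: s.start)]

# Hypothetical annotation of one short trace; the paper's taxonomy
# has 28 elements across four categories.
spans = [
    Span(0, 25, "goal_management"),
    Span(29, 62, "self_verification"),
]
print(element_sequence(spans))  # ['goal_management', 'self_verification']
```

Sequences like these are what make structural comparisons possible, e.g. detecting hierarchical nesting in human traces versus shallow forward chaining in model traces.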

Results and Findings
