State of AI

Diffusion Models, LLM Benchmarks, and Visuo-Haptic Perception

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI
Sep 13, 2025

Welcome to today's edition of State of AI 👋 And a warm welcome to our new subscribers since last edition!

This edition covers how locality arises in image diffusion models, a new benchmark for evaluating long-context large language models on complex software engineering tasks, and visuo-haptic perception for robust robotic manipulation.

Here's what caught our attention:

  • Locality in Image Diffusion Models Emerges from Data Statistics: An insightful exploration of how the locality patterns in trained diffusion models arise from the statistics of the training data, rather than architectural inductive biases.

  • LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering: A benchmark that provides an evaluation framework for assessing long-context understanding in complex software development scenarios.

  • V-HOP: Visuo-Haptic 6D Object Pose Tracking: A novel approach that fuses egocentric visual and haptic sensing to achieve accurate real-time in-hand object tracking, showcasing the advantages of combining visual and haptic perception for robust robotic manipulation.

Let's get into it 👇

Contents

  1. Towards Adaptive Memory-Based Optimization for Enhanced Retrieval-Augmented Generation

  2. KROMA: Ontology Matching with Knowledge Retrieval and Large Language Models

  3. LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

  4. FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

  5. Locality in Image Diffusion Models Emerges from Data Statistics

  6. MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering

  7. AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs

  8. Functional Groups are All you Need for Chemically Interpretable Molecular Property Prediction

  9. Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings

  10. ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms

  11. CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models

  12. Steering MoE LLMs via Expert (De)Activation

  13. villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models

  14. LLMs for sensory-motor control: Combining in-context and iterative learning

  15. V-HOP: Visuo-Haptic 6D Object Pose Tracking

Towards Adaptive Memory-Based Optimization for Enhanced Retrieval-Augmented Generation
