Memory Mosaics, Diffusion Transformers, and Robotic Agents
Latest research summaries in ML, Robotics, CV, NLP and AI
Welcome to today’s edition of State of AI 👋
This week’s highlights include the scaling of a novel associative-memory network to LLM size, architectural advances in diffusion transformers, and new frameworks for AI agents and data collection. We’ll dive into the technical details of these advances.
Here’s what caught our attention:
Memory Mosaics at scale: Researchers have successfully scaled up a network of associative memories to the size of large language models, demonstrating superior new-task learning capabilities compared to transformers.
Routing Matters in MoE: A new Mixture-of-Experts framework, ProMoE, leverages explicit routing guidance to enable more effective scaling of diffusion transformer models for visual synthesis tasks.
ADMN: A Layer-Wise Adaptive Multimodal Network: This work presents a novel layer-wise adaptive network that can dynamically allocate computational resources across input modalities based on their quality, enabling robust and efficient multimodal deep learning.
Agent Data Protocol: Unifying Datasets for LLM Agents: The authors introduce a standardized dataset representation to enable effective fine-tuning of large language model agents across diverse tasks, from coding to browsing to general agentic workflows.
ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking: This research proposes a two-stage paradigm to enhance the performance and efficiency of deep information-seeking agents by leveraging uncertainty-guided path reuse and compressed reasoning aggregation.
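For readers new to Mixture-of-Experts, the "routing" that ProMoE refines refers to the gating step that decides which experts process each token. The paper's explicit routing guidance is not reproduced here; below is only a minimal, generic top-k softmax router sketch (names like `topk_router` are illustrative, not from the paper) showing the baseline mechanism such work builds on.

```python
import numpy as np

def topk_router(token_embeddings, expert_weights, k=2):
    """Generic top-k MoE router: score each token against every expert,
    then dispatch it to the k experts with the highest gate probabilities."""
    logits = token_embeddings @ expert_weights            # (tokens, experts)
    # Softmax over expert logits (numerically stabilized)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    topk_idx = np.argsort(-probs, axis=-1)[:, :k]         # chosen expert ids
    topk_gate = np.take_along_axis(probs, topk_idx, axis=-1)
    topk_gate /= topk_gate.sum(axis=-1, keepdims=True)    # renormalize gates
    return topk_idx, topk_gate

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))    # 4 tokens, embedding dim 8
experts = rng.normal(size=(8, 6))   # 6 experts
idx, gate = topk_router(tokens, experts, k=2)
```

The interesting design question, and the one ProMoE targets, is how the router's assignments are supervised or guided rather than learned purely from the task loss.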
Let’s get into it 👇
Contents
Bridging Tool Dependencies and Domain Knowledge: A Graph-Based Framework for In-Context Planning
Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance
ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources
Pearl: A Foundation Model for Placing Every Atom in the Right Location
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents
ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking
Hybrid Deep Learning Model to Estimate Cognitive Effort from fNIRS Signals
Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything
TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System
Memory Mosaics at scale