State of AI

Scaling Diverse Generation, Efficient LLM Training, and Simulated Robots for Task Planning

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI
Aug 22, 2025

Welcome to today's edition of State of AI 🚀

👋 And a warm welcome to our 117 new subscribers since the last edition!

This edition covers a range of exciting advancements, from techniques for generating diverse, high-quality samples from diffusion models, to communication-efficient pre-training of large language models, to a comprehensive benchmark for evaluating robotic task planning and control in a simulated kitchen environment.

Here's what caught our attention:

  • Scaling Group Inference for Diverse and High-Quality Generation: A scalable method for jointly optimizing sample quality and diversity in diffusion models, outperforming both independent sampling and recent single-sample inference algorithms (a toy sketch of the group-selection idea follows this list).

  • Communication Efficient LLM Pre-training with SparseLoCo: A communication-efficient training algorithm for large language models that achieves extreme compression ratios while maintaining or improving performance.

  • Mind and Motion Aligned: A Joint Evaluation IsaacSim Benchmark for Task Planning and Low-Level Policies in Mobile Manipulation: A unified benchmark for evaluating high-level task planning and low-level robot control in a realistic simulated kitchen environment.
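
To make the "group inference" idea concrete, here is a minimal, hypothetical sketch of picking a diverse, high-quality subset from a pool of independently generated candidates. The embedding and quality scores, the greedy selection rule, and all names below are illustrative assumptions, not the paper's actual objective or algorithm.

```python
# Toy illustration of group (set-level) selection vs. independent sampling.
# The quality and diversity scores are stand-ins; the paper's actual
# objective, solver, and models are not reproduced here.
import numpy as np

def greedy_group_select(features, quality, k, diversity_weight=1.0):
    """Greedily pick k candidates, trading off per-sample quality against
    pairwise diversity (1 - cosine similarity) within the selected group."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(len(quality)):
            if i in selected:
                continue
            if selected:
                sims = feats[selected] @ feats[i]
                diversity = 1.0 - sims.max()  # distance to closest already-selected sample
            else:
                diversity = 1.0
            score = quality[i] + diversity_weight * diversity
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected

# Example: 64 candidate generations with made-up embeddings and quality scores.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(64, 128))  # e.g. CLIP-style embeddings
quality = rng.uniform(size=64)           # e.g. a reward or aesthetic score
print(greedy_group_select(candidates, quality, k=4))
```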

Let's get into it 👇

Contents

  1. Measuring the environmental impact of delivering AI at Google Scale

  2. Surya: Foundation Model for Heliophysics

  3. VerilogLAVD: LLM-Aided Rule Generation for Vulnerability Detection in Verilog

  4. Intern-S1: A Scientific Multimodal Foundation Model

  5. Scaling Group Inference for Diverse and High-Quality Generation

  6. CineScale: Free Lunch in High-Resolution Cinematic Visual Generation

  7. Communication Efficient LLM Pre-training with SparseLoCo

  8. OPDR: Order-Preserving Dimension Reduction for Semantic Embedding of Multimodal Scientific Data

  9. GRAFT: GRaPH and Table Reasoning for Textual Alignment -- A Benchmark for Structured Instruction Following and Visual Reasoning

  10. RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation

  11. InfAlign: Inference-aware language model alignment

  12. Learning to Generate Unit Tests for Automated Debugging

  13. "Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries

  14. Exploiting Policy Idling for Dexterous Manipulation

  15. Mind and Motion Aligned: A Joint Evaluation IsaacSim Benchmark for Task Planning and Low-Level Policies in Mobile Manipulation

Measuring the environmental impact of delivering AI at Google Scale

Authors: Cooper Elsworth, Keguo Huang, David Patterson, Ian Schneider, Robert Sedivy, Savannah Goodman, Ben Townsend, Parthasarathy Ranganathan, Jeff Dean, Amin Vahdat, Ben Gomes, James Manyika

Source and references: https://arxiv.org/abs/2508.15734v1


Introduction

This paper addresses a critical gap in understanding the environmental impact of delivering AI at scale by proposing and executing a comprehensive methodology for measuring the energy usage, carbon emissions, and water consumption of AI inference workloads in a large-scale AI production environment.

Key Points

  • Propose a full-stack measurement approach that accounts for all material energy sources, including active AI accelerator power, host system energy, idle machine capacity, and data center energy overhead (a back-of-envelope sketch of this accounting follows the list below).

  • Apply this methodology to Google's Gemini Apps product to provide the first analysis of three AI serving environmental metrics: energy/prompt, emissions/prompt, and water consumption/prompt.

  • Demonstrate that existing measurement approaches are missing material energy consumption activities for AI serving.

  • Illustrate the compounding AI serving efficiency gains across the serving stack over a year of development, resulting in a 44x reduction in the total emissions generated for the median Gemini Apps prompt.
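
For intuition, here is a back-of-envelope sketch of the kind of full-stack, per-prompt accounting described in the key points above. The factor names (PUE, grid intensity, WUE) and every number in the example are illustrative assumptions, not the paper's reported values or exact formulas.

```python
# Hypothetical per-prompt footprint accounting, following the components
# listed above (accelerator + host + idle capacity, then facility overhead).
# All constants below are made up for illustration only.

def per_prompt_footprint(accelerator_wh, host_wh, idle_wh,
                         pue=1.1, grid_gco2_per_kwh=300.0, wue_l_per_kwh=1.0):
    """Return (energy Wh, emissions gCO2e, water litres) for one prompt.

    accelerator_wh: active AI accelerator energy
    host_wh:        host CPU/DRAM/network energy
    idle_wh:        amortized share of idle/reserved machine capacity
    pue:            data center power usage effectiveness (facility overhead)
    """
    it_energy_wh = accelerator_wh + host_wh + idle_wh
    total_energy_wh = it_energy_wh * pue
    emissions_g = (total_energy_wh / 1000.0) * grid_gco2_per_kwh
    water_l = (total_energy_wh / 1000.0) * wue_l_per_kwh
    return total_energy_wh, emissions_g, water_l

# Example with invented numbers: 0.20 Wh accelerator, 0.05 Wh host, 0.03 Wh idle.
print(per_prompt_footprint(0.20, 0.05, 0.03))
```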

Methodology

Keep reading with a 7-day free trial

Subscribe to State of AI to keep reading this post and get 7 days of free access to the full post archives.
