Large Language 3D Modeling, Hardware Acceleration, and Code Verification

Latest research summaries in ML, Robotics, CV, NLP and AI

Aug 12, 2025

Welcome to today's edition of State of AI 👋 And a warm welcome to the 86 new subscribers who have joined since the last edition!

This issue covers five papers: 3D asset generation driven by large language models, a hardware architecture for efficient attention computation, LLM-assisted formal verification of Python code, unified multimodal understanding and generation, and synthetic image-caption data for training vision-language models. Together they show how far language models now reach beyond plain text generation.

Here's what caught our attention:

LL3M: Large Language 3D Modelers - A multi-agent system that leverages pre-trained language models to generate 3D assets by writing interpretable Blender scripts, enabling iterative, user-guided refinement of the generated content.
SystolicAttention: Fusing FlashAttention within a Single Systolic Array - A novel hardware architecture that enables the full execution of the FlashAttention algorithm within a single systolic array, significantly improving hardware utilization.
PyVeritas: On Verifying Python via LLM-Based Transpilation and Bounded Model Checking for C - A framework that combines LLM-based code transpilation, bounded model checking, and fault localization to enable formal verification and bug diagnosis for Python programs.
TBAC-UniImage: Unified Understanding and Generation by Ladder-Side Diffusion Tuning - A unified multimodal model that deeply integrates a pre-trained diffusion model with a language model to achieve high-quality and versatile text-to-image generation.
SynthVLM: Towards High-Quality and Efficient Synthesis of Image-Caption Datasets for Vision-Language Models - A data synthesis and curation method that generates high-quality, precisely aligned image-caption pairs to train advanced vision-language models.

Let's get into it 👇

LL3M: Large Language 3D Modelers

Authors: Sining Lu, Guan Chen, Nam Anh Dinh, Itai Lang, Ari Holtzman, Rana Hanocka

Source and references: https://arxiv.org/abs/2508.08228v1

Introduction

This paper presents LL3M, a multi-agent system that leverages pre-trained large language models (LLMs) to generate 3D assets by writing interpretable Python code in Blender.
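To make "interpretable Python code in Blender" concrete, here is a minimal sketch of the general idea. It is not LL3M's pipeline or output. The `TableSpec` class and `blender_script` function are hypothetical names invented for this illustration. The only Blender calls used are the standard `bpy.ops.mesh.primitive_cube_add` and `bpy.ops.mesh.primitive_cylinder_add` operators. The sketch emits script text, so it runs in plain Python. The generated script itself needs Blender's bundled interpreter.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TableSpec:
    """Hypothetical parameters for a simple table asset (not from the paper)."""
    top_size: float = 2.0       # edge length of the square tabletop
    top_thickness: float = 0.1  # height of the tabletop slab
    leg_radius: float = 0.05
    leg_height: float = 1.0


def blender_script(spec: TableSpec) -> str:
    """Return Blender Python source that builds the table from primitives."""
    half = spec.top_size / 2 - 2 * spec.leg_radius  # leg offset from the center
    top_z = spec.leg_height + spec.top_thickness / 2
    lines = [
        "import bpy",
        "",
        "# Tabletop: unit cube scaled into a slab that rests on the legs",
        f"bpy.ops.mesh.primitive_cube_add(size=1.0, location=(0.0, 0.0, {top_z:.3f}))",
        "bpy.context.object.scale = "
        f"({spec.top_size:.3f}, {spec.top_size:.3f}, {spec.top_thickness:.3f})",
        "",
        "# Legs: one cylinder near each corner, standing on the ground plane",
    ]
    for sx in (-1, 1):
        for sy in (-1, 1):
            lines.append(
                "bpy.ops.mesh.primitive_cylinder_add("
                f"radius={spec.leg_radius:.3f}, depth={spec.leg_height:.3f}, "
                f"location=({sx * half:.3f}, {sy * half:.3f}, {spec.leg_height / 2:.3f}))"
            )
    return "\n".join(lines) + "\n"


# A refinement such as "make the legs taller" becomes a one-field change.
base = TableSpec()
taller = replace(base, leg_height=1.5)
print(blender_script(taller))
```

Because every dimension is a named value in readable code, a follow-up request changes one field and regenerates the script. That is what makes code-based assets easier to inspect and refine than an opaque mesh.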
