Scaling Long Videos, Unifying Multi-Modal AI, and Securing Large Language Models

Latest research summaries in ML, Robotics, CV, NLP and AI

Jul 13, 2025

Welcome to today's edition of State of AI 👋 And a warm welcome to our 65 new subscribers since last edition!

This edition covers research ranging from legal dispute analysis and multi-modal generative models to scaling up long video reasoning and improving the security of large language models. It also includes developments in biodiversity analysis and the efficient deployment of neural networks on microcontrollers.

Here's what caught our attention:

An Integrated Framework of Prompt Engineering and Multidimensional Knowledge Graphs for Legal Dispute Analysis: This research proposes an enhanced framework that combines prompt engineering and a multi-layered knowledge graph architecture to boost the performance of large language models in legal reasoning tasks.
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling: The authors introduce a method that aligns the internal representations of video diffusion models with 3D geometric features, leading to more coherent and realistic long-term video generation. The paper also reviews the development of text-to-image and text-to-video diffusion models, including the shift from pixel-based to latent-based approaches and the introduction of Transformer-based diffusion models.
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs): This paper presents a comprehensive taxonomy of potential attacks on large language models and provides a framework for conducting effective red-teaming exercises to improve the security and robustness of LLM-based systems.

Let's get into it 👇

Meek Models Shall Inherit the Earth
