State of AI

Hierarchical RL, LLM-Driven Sentiment, and Protein Generation

Latest research summaries in ML, Robotics, CV, NLP and AI

Jul 26, 2025

Welcome to today's edition of State of AI 🤖

👋 And a warm welcome to our 93 new subscribers since last edition!

This edition explores the latest advancements in applying hierarchical reinforcement learning, integrating large language models for financial portfolio optimization, and developing novel diffusion-based models for efficient protein generation. These cutting-edge techniques demonstrate the continued progress in making AI systems more capable, efficient, and aligned with user needs.

Here's what caught our attention:

  • HARLF: Hierarchical Reinforcement Learning and Lightweight LLM-Driven Sentiment Integration for Financial Portfolio Optimization: A novel framework that combines sentiment analysis from lightweight LLMs with traditional financial data to optimize investment portfolios through a hierarchical RL approach.

  • Machine Learning Solutions Integrated in an IoT Healthcare Platform for Heart Failure Risk Stratification: An ensemble learning model that leverages both clinical and echocardiographic features to achieve high sensitivity and accuracy in predicting heart failure risk, with a modular structure that enhances interpretability.

  • Demystify Protein Generation with Hierarchical Conditional Diffusion Models: A multi-level conditional diffusion model that generates proteins at the amino acid, backbone, and all-atom levels simultaneously, ensuring consistency across the hierarchical representations.

  • Checklists Are Better Than Reward Models For Aligning Language Models: A novel "Reinforcement Learning from Checklist Feedback" approach that outperforms traditional reward models in enabling language models to follow complex, multi-step instructions.

  • TRPrompt: Bootstrapping Query-Aware Prompt Optimization from Textual Rewards: A framework that utilizes textual rewards to optimize prompts for large language models, improving their reasoning abilities on tasks like mathematical and logical reasoning.

Let's get into it 👇

Contents

  1. Compliance Brain Assistant: Conversational Agentic AI for Assisting Compliance Tasks in Enterprise Environments

  2. HARLF: Hierarchical Reinforcement Learning and Lightweight LLM-Driven Sentiment Integration for Financial Portfolio Optimization

  3. Machine Learning Solutions Integrated in an IoT Healthcare Platform for Heart Failure Risk Stratification

  4. Captain Cinema: Towards Short Movie Generation

  5. Identifying Prompted Artist Names from Generated Images

  6. SIDA: Synthetic Image Driven Zero-shot Domain Adaptation

  7. Explainable Mapper: Charting LLM Embedding Spaces Using Perturbation-Based Explanation and Verification Agents

  8. Demystify Protein Generation with Hierarchical Conditional Diffusion Models

  9. The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm

  10. Checklists Are Better Than Reward Models For Aligning Language Models

  11. TRPrompt: Bootstrapping Query-Aware Prompt Optimization from Textual Rewards

  12. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs

  13. Diffusion Beats Autoregressive in Data-Constrained Settings

  14. RUMI: Rummaging Using Mutual Information

  15. PosterMate: Audience-driven Collaborative Persona Agents for Poster Design

Compliance Brain Assistant: Conversational Agentic AI for Assisting Compliance Tasks in Enterprise Environments

Authors: Shitong Zhu, Chenhao Fang, Derek Larson, Neel Reddy Pochareddy, Rajeev Rao, Sophie Zeng, Yanqing Peng, Wendy Summer, Alex Goncalves, Arya Pudota, Hervé Robert

Source and references: https://arxiv.org/abs/2507.17289v2


Introduction

This paper presents Compliance Brain Assistant (CBA), a conversational, agentic AI assistant designed to boost the efficiency of daily compliance tasks for personnel in enterprise environments.

Key Points

  • CBA uses a lightweight classifier (the router) to intelligently choose between two workflows, FastTrack and FullAgentic, balancing response quality against latency.

  • FastTrack handles simple requests that only need additional relevant context retrieved from knowledge corpora, while FullAgentic handles complicated requests that need composite actions and tool invocations.

  • FullAgentic utilizes a catalog of tools to access enterprise-internal artifacts, perform semantic search, retrieve knowledge, and invoke specialized AI models.

  • CBA leverages the ReAct framework to enable the LLM agent to dynamically update its reasoning based on the outcomes of prior actions.

  • CBA substantially outperforms a vanilla LLM on various compliance-related tasks, demonstrating the effectiveness of the compliance-oriented enhancements.
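The routing-plus-agent design described above can be sketched in a few lines. The sketch below is purely illustrative: "FastTrack" and "FullAgentic" follow the paper's terminology, but the keyword router, the toy tool catalog, and the stand-in planner are hypothetical stand-ins for CBA's classifier and LLM agent, not the authors' implementation.

```python
def route(query: str) -> str:
    """Lightweight router: requests that need composite actions or tool
    invocations go to FullAgentic; simple context-retrieval questions
    go to FastTrack. (CBA uses a trained classifier, not keywords.)"""
    action_markers = ("classify", "generate", "update", "run")
    if any(m in query.lower() for m in action_markers):
        return "FullAgentic"
    return "FastTrack"

# Toy tool catalog standing in for enterprise-internal artifact access,
# semantic search, and specialized AI model invocation.
TOOLS = {
    "search": lambda arg: f"found docs about {arg}",
    "classify_doc": lambda arg: f"{arg}: confidential",
}

def react_loop(query: str, max_steps: int = 3) -> str:
    """ReAct-style loop: reason -> act -> observe, feeding each
    observation back into the next planning step."""
    observations = []
    for _ in range(max_steps):
        # Stand-in planner; in CBA this step is an LLM agent call that
        # updates its reasoning based on the outcomes of prior actions.
        if not observations:
            action, arg = "classify_doc", query
        else:
            return f"done: {observations[-1]}"
        observations.append(TOOLS[action](arg))
    return "max steps reached"
```

For example, `route("what is our retention policy?")` would return `"FastTrack"`, while `route("classify this document")` would return `"FullAgentic"` and proceed through the loop. The key design point is that the cheap router runs first, so simple retrieval queries never pay the latency cost of the multi-step agentic path.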
