Hierarchical RL, LLM-Driven Sentiment, and Protein Generation
Latest research summaries in ML, Robotics, CV, NLP and AI
Welcome to today's edition of State of AI 🤖
👋 And a warm welcome to our 93 new subscribers since our last edition!
This edition explores the latest advances in hierarchical reinforcement learning with LLM-driven sentiment for financial portfolio optimization, diffusion-based models for efficient protein generation, and checklist-based feedback for aligning language models. Together, these techniques show continued progress toward AI systems that are more capable, efficient, and aligned with user needs.
Here's what caught our attention:
HARLF: Hierarchical Reinforcement Learning and Lightweight LLM-Driven Sentiment Integration for Financial Portfolio Optimization: A novel framework that combines sentiment analysis from lightweight LLMs with traditional financial data to optimize investment portfolios through a hierarchical RL approach.
Machine Learning Solutions Integrated in an IoT Healthcare Platform for Heart Failure Risk Stratification: An ensemble learning model that leverages both clinical and echocardiographic features to achieve high sensitivity and accuracy in predicting heart failure risk, with a modular structure that enhances interpretability.
Demystify Protein Generation with Hierarchical Conditional Diffusion Models: A multi-level conditional diffusion model that generates proteins at the amino acid, backbone, and all-atom levels simultaneously, ensuring consistency across the hierarchical representations.
Checklists Are Better Than Reward Models For Aligning Language Models: A novel "Reinforcement Learning from Checklist Feedback" approach that outperforms traditional reward models in enabling language models to follow complex, multi-step instructions.
TRPrompt: Bootstrapping Query-Aware Prompt Optimization from Textual Rewards: A framework that utilizes textual rewards to optimize prompts for large language models, improving their reasoning abilities on tasks like mathematical and logical reasoning.
Let's get into it 👇
Contents
Demystify Protein Generation with Hierarchical Conditional Diffusion Models
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Checklists Are Better Than Reward Models For Aligning Language Models
TRPrompt: Bootstrapping Query-Aware Prompt Optimization from Textual Rewards
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
PosterMate: Audience-driven Collaborative Persona Agents for Poster Design
Compliance Brain Assistant: Conversational Agentic AI for Assisting Compliance Tasks in Enterprise Environments
Authors: Shitong Zhu, Chenhao Fang, Derek Larson, Neel Reddy Pochareddy, Rajeev Rao, Sophie Zeng, Yanqing Peng, Wendy Summer, Alex Goncalves, Arya Pudota, Hervé Robert
Source and references: https://arxiv.org/abs/2507.17289v2
Introduction
This paper presents Compliance Brain Assistant (CBA), a conversational, agentic AI assistant designed to help personnel in enterprise environments carry out daily compliance tasks more efficiently.
Key Points
CBA uses a lightweight classifier (the router) to choose between two workflows, FastTrack and FullAgentic, balancing response quality against latency; a minimal sketch of this dispatch appears after these key points.
FastTrack handles simple requests that only need additional relevant context retrieved from knowledge corpora, while FullAgentic handles complicated requests that need composite actions and tool invocations.
FullAgentic utilizes a catalog of tools to access enterprise-internal artifacts, perform semantic search, retrieve knowledge, and invoke specialized AI models.
CBA leverages the ReAct framework to enable the LLM agent to dynamically update its reasoning based on the outcomes of prior actions.
CBA substantially outperforms a vanilla LLM on various compliance-related tasks, demonstrating the effectiveness of the compliance-oriented enhancements.
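To make the routing concrete, here is a minimal, hypothetical sketch of the FastTrack/FullAgentic dispatch described above. The paper does not publish code, so every name here (`classify`, `fast_track`, `full_agentic`, `TOOL_CATALOG`, `call_llm`) and the keyword heuristic standing in for the learned router are illustrative assumptions, not CBA's actual implementation.

```python
# Hypothetical sketch of CBA-style routing between a retrieval-only
# FastTrack workflow and a tool-using, ReAct-style FullAgentic workflow.
# All names and heuristics are illustrative stand-ins, not the paper's API.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Request:
    text: str


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned string here."""
    return f"LLM response to: {prompt[:40]}..."


def classify(request: Request) -> str:
    """Router stand-in: the paper uses a lightweight learned classifier;
    a trivial keyword heuristic is used here purely for illustration."""
    needs_tools = any(w in request.text.lower() for w in ("update", "file", "review"))
    return "full_agentic" if needs_tools else "fast_track"


def retrieve_context(query: str) -> str:
    """Stand-in for semantic search over compliance knowledge corpora."""
    return "relevant policy excerpts for: " + query


def fast_track(request: Request) -> str:
    """Simple requests: retrieve context, then answer in one LLM call."""
    context = retrieve_context(request.text)
    return call_llm(f"Context:\n{context}\n\nQuestion: {request.text}")


# Hypothetical tool catalog: tool name -> callable over a string argument.
TOOL_CATALOG: dict[str, Callable[[str], str]] = {
    "search_artifacts": lambda q: f"artifacts matching '{q}'",
    "retrieve_knowledge": retrieve_context,
}


def full_agentic(request: Request, max_steps: int = 4) -> str:
    """Complicated requests: a minimal ReAct-style loop in which the LLM
    alternates reasoning with tool calls, folding each observation back
    into the running transcript before deciding the next action."""
    transcript = f"Task: {request.text}"
    for _ in range(max_steps):
        thought = call_llm(transcript + "\nThought + next action?")
        tool = "retrieve_knowledge"  # a real agent would parse this from `thought`
        observation = TOOL_CATALOG[tool](request.text)
        transcript += f"\nThought: {thought}\nObservation: {observation}"
    return call_llm(transcript + "\nFinal answer?")


def handle(request: Request) -> str:
    """Top-level dispatch: route, then run the chosen workflow."""
    workflow = fast_track if classify(request) == "fast_track" else full_agentic
    return workflow(request)


print(handle(Request("What does our data-retention policy say about logs?")))
```

In the actual system the router is trained rather than keyword-based, and the agent parses its tool choice from the model's reasoning; the fixed tool choice above is only there to keep the sketch self-contained and runnable.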