Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI

Jan 25, 2025

EICopilot: Search and Explore Enterprise Information over Large-scale Knowledge Graphs with LLM-driven Agents
Text-to-SQL based on Large Language Models and Database Keyword Search
Adaptive Testing for LLM-Based Applications: A Diversity-based Approach
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
Utilizing Evolution Strategies to Train Transformers in Reinforcement Learning
Look Into the LITE in Deep Learning for Time Series Classification
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Integrative Decoding: Improve Factuality via Implicit Self-consistency
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Eye Gaze as a Signal for Conveying User Attention in Contextual AI Systems
FAST-LIVO2 on Resource-Constrained Platforms: LiDAR-Inertial-Visual Odometry with Efficient Memory and Computation

EICopilot: Search and Explore Enterprise Information over Large-scale Knowledge Graphs with LLM-driven Agents

Authors: Yuhui Yun, Huilong Ye, Xinru Li, Ruojia Li, Jingfeng Deng, Li Li, Haoyi Xiong

Source and references: https://arxiv.org/abs/2501.13746v1

Introduction

This paper introduces EICopilot, an agent-based solution for searching and exploring enterprise registration data in large-scale online knowledge graphs. Traditional methods rely on text-based queries and manual subgraph exploration, which is time-consuming. EICopilot addresses this by using Large Language Models (LLMs) to interpret natural language queries and summarize complex enterprise relationships.

Key Points

Presents a novel agent-based solution, EICopilot, for search and exploration of enterprise information within large-scale knowledge graphs
Utilizes LLMs to interpret natural language queries and automatically generate and execute Gremlin scripts for efficient knowledge graph navigation
Includes a data pre-processing pipeline to compile and annotate representative queries into a vector database for In-Context Learning (ICL)
Implements a comprehensive reasoning pipeline combining Chain-of-Thought with ICL to enhance Gremlin script generation
Introduces a novel query masking strategy to improve intent recognition and script accuracy
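To make the query masking idea concrete, here is a minimal sketch. The summary does not say which spans are masked, so this assumes that entity names (companies, people) are replaced with typed placeholders so that two questions with the same intent look identical. The function name `mask_entities` and the entity dictionary are hypothetical; in practice an entity recognition step would likely supply the mentions.

```python
def mask_entities(question, entities):
    """Replace each known entity mention with a typed placeholder.

    `entities` maps a surface form to a type label, e.g. {"Acme Corp": "COMPANY"}.
    This lookup table is an assumption made for illustration only.
    """
    masked = question
    # Mask longer names first so a short name never clips a longer one.
    for name in sorted(entities, key=len, reverse=True):
        masked = masked.replace(name, f"[{entities[name]}]")
    return masked


q1 = mask_entities("Who are the shareholders of Acme Corp?", {"Acme Corp": "COMPANY"})
q2 = mask_entities("Who are the shareholders of Globex Ltd?", {"Globex Ltd": "COMPANY"})
print(q1)
print(q1 == q2)
```

Once the entity names are masked, the two questions compare as equal, so a similarity search would match on intent rather than on the particular company mentioned.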

Methodology

EICopilot's approach consists of an offline phase focused on data preparation and enrichment, as well as an online phase that leverages the LLM-driven capabilities for query processing and response generation. The offline phase includes activities such as schema semantic governance, seed data construction, data augmentation, and masked question similarity selection to build a high-quality vector database of queries and scripts. The online phase then utilizes this database, along with ICL and Chain-of-Thought techniques, to interpret user queries, generate Gremlin scripts, and summarize the retrieved enterprise information.
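The online phase described above can be sketched roughly as follows: retrieve the stored question/script pairs most similar to the user's question, then assemble them into a few-shot prompt with a step-by-step instruction. This is an illustration under stated assumptions, not the authors' implementation. The example store, the graph schema (`holds_shares_in`, `legal_rep_of`, `has_branch`), and the prompt wording are hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding-based vector database.

```python
import math
import re
from collections import Counter

# Hypothetical store of (masked question, Gremlin script) pairs.
# The labels and edge names below are invented for illustration.
EXAMPLES = [
    ("who are the shareholders of [COMPANY]",
     "g.V().has('company', 'name', NAME).in('holds_shares_in').values('name')"),
    ("which companies does [PERSON] serve as legal representative for",
     "g.V().has('person', 'name', NAME).out('legal_rep_of').values('name')"),
    ("how many branches does [COMPANY] have",
     "g.V().has('company', 'name', NAME).out('has_branch').count()"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing tokens
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(question, examples, k=2):
    """Return the k stored examples whose questions are most similar."""
    q = Counter(tokenize(question))
    ranked = sorted(examples, key=lambda ex: cosine(q, Counter(tokenize(ex[0]))), reverse=True)
    return ranked[:k]


def build_prompt(question, shots):
    """Assemble a few-shot prompt; the wording is an assumed template."""
    lines = ["Translate the question into a Gremlin script. Think step by step.", ""]
    for shot_question, script in shots:
        lines += [f"Question: {shot_question}", f"Gremlin: {script}", ""]
    lines += [f"Question: {question}", "Gremlin:"]
    return "\n".join(lines)


shots = top_k("list the shareholders of [COMPANY]", EXAMPLES, k=2)
prompt = build_prompt("list the shareholders of [COMPANY]", shots)
print(prompt)
```

The resulting prompt would then be sent to an LLM, and the returned script executed against the graph database. Both of those steps are left out here because they depend on the deployment.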

Results and Findings

Empirical evaluations show that EICopilot outperforms baseline methods in both speed and accuracy for enterprise information retrieval and interpretation. The Full Mask variant of EICopilot reduces the syntax error rate to as low as 10.00% and reaches an execution correctness of up to 82.14%, which supports the effectiveness of the proposed components.

Implications and Conclusions

The paper shows that combining LLM-driven agents with components tailored to complex graph database queries makes large-scale enterprise knowledge graphs easier to search and explore. By generating Gremlin scripts from natural language queries and summarizing the retrieved results, EICopilot offers an efficient way to extract insights from extensive enterprise data repositories.

Text-to-SQL based on Large Language Models and Database Keyword Search

Authors: Eduardo R. Nascimento, Caio Viktor S. Avila, Yenier T. Izquierdo, Grettel M. García, Lucas Feijó L. Andrade, Michelle S. P. Facina, Melissa Lemos, Marco A. Casanova

Source and references: https://arxiv.org/abs/2501.13594v1
