Contents
LLAssist: Simple Tools for Automating Literature Review Using Large Language Models
N-Version Assessment and Enhancement of Generative AI
Melody Is All You Need For Music Generation
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
DressRecon: Freeform 4D Human Reconstruction from Monocular Video
SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes
Manifold-Constrained Nucleus-Level Denoising Diffusion Model for Structure-Based Drug Design
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey
Frequency Adaptive Normalization For Non-stationary Time Series Forecasting
LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation
The Base-Rate Effect on LLM Benchmark Performance: Disambiguating Test-Taking Strategies from Benchmark Performance
Health-LLM: Personalized Retrieval-Augmented Disease Prediction System
POEM: Interactive Prompt Optimization for Enhancing Multimodal Reasoning of Large Language Models
LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models
Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
LLAssist: Simple Tools for Automating Literature Review Using Large Language Models
Authors: Christoforus Yoga Haryanto
Source and references: https://arxiv.org/abs/2407.13993v2
Introduction
This paper introduces LLAssist, an open-source tool designed to streamline literature reviews in academic research by leveraging Large Language Models (LLMs) and Natural Language Processing (NLP) techniques.
Key Points
Introducing LLAssist, an open-source tool that uses LLMs to automate key aspects of the literature review process.
Demonstrating a novel approach to relevance estimation using LLMs.
Providing insights into different LLM backends' performance for literature review tasks.
Promoting transparency and reproducibility in AI-assisted literature reviews through open-source development.
Methodology
The methodology consists of two main parts: 1) the design and implementation of LLAssist, and 2) the experimental evaluation of its performance. LLAssist takes research article metadata and abstracts, together with user-defined research questions, and for each article performs key-semantics extraction, relevance estimation, and "must-read" determination. To evaluate LLAssist's performance across different academic databases and LLM backends, the authors conducted two experiments: a small dataset test and a large dataset test.
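To make that pipeline concrete, here is a minimal sketch of an LLAssist-style screening loop. It assumes a generic prompt-in/completion-out LLM backend; the function names, prompts, and must-read threshold are illustrative assumptions for exposition, not the tool's actual implementation (the real source is open and linked from the paper).

```python
"""Hypothetical sketch of an LLAssist-style screening pipeline.

All prompts, names, and thresholds below are assumptions, not the
authors' implementation; see the paper's repository for the real tool.
"""
from dataclasses import dataclass
from typing import Callable

# Model any LLM backend as a plain prompt -> completion callable, so a
# local model or a hosted API can be plugged in interchangeably.
LLM = Callable[[str], str]

@dataclass
class Article:
    title: str
    abstract: str

def extract_key_semantics(llm: LLM, article: Article) -> str:
    """Ask the backend to distill topics, entities, and keywords."""
    return llm(
        "List the key topics, entities, and keywords of this abstract:\n"
        f"{article.title}\n{article.abstract}"
    )

def estimate_relevance(llm: LLM, semantics: str, question: str) -> float:
    """Score the article against one research question on a 0-1 scale."""
    reply = llm(
        f"Research question: {question}\n"
        f"Article summary: {semantics}\n"
        "On a scale from 0 to 1, how relevant is this article? "
        "Answer with a single number."
    )
    try:
        return max(0.0, min(1.0, float(reply.strip())))
    except ValueError:
        return 0.0  # treat unparseable replies as not relevant

def screen(llm: LLM, articles: list[Article], questions: list[str],
           must_read_threshold: float = 0.7) -> list[dict]:
    """Run the extract -> score -> flag pipeline over a batch of articles."""
    results = []
    for article in articles:
        semantics = extract_key_semantics(llm, article)
        scores = {q: estimate_relevance(llm, semantics, q) for q in questions}
        results.append({
            "title": article.title,
            "scores": scores,
            "must_read": any(s >= must_read_threshold for s in scores.values()),
        })
    return results
```

Any of the backends evaluated in the paper could be wired in by supplying a matching prompt-to-completion callable.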
Results and Findings
The small dataset test verified that LLAssist functions as intended, with the Gemma 2 and GPT-4 models showing reasonable relevance-assessment performance. The large dataset test, run with the Gemma 2 model on 2,576 articles, revealed several key insights: the number of potentially relevant articles increases over time, and RQ2 (risks and vulnerabilities of LLMs in cybersecurity) consistently showed the highest number of relevant and contributing articles.
Implications and Conclusions
LLAssist effectively identifies relevant papers, works with various LLM backends, and significantly reduces screening time compared to fully manual review. The open-source nature of the tool promotes transparency and reproducibility in AI-assisted literature reviews. While not a replacement for human judgment, LLAssist can improve research efficiency and let researchers concentrate on in-depth analysis of the papers that matter most.
N-Version Assessment and Enhancement of Generative AI
Authors: Marcus Kessel, Colin Atkinson
Source and references: https://arxiv.org/abs/2409.14071v2
Introduction
This paper proposes a new approach, called "Differential Generative AI" (D-GAI), to address the challenges of untrustworthy outputs from Generative AI (GAI) systems, particularly in the context of code synthesis. The core idea is to leverage GAI's ability to generate multiple versions of code and tests to facilitate comparative analysis and enhance the reliability of GAI-generated artifacts.
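As a rough illustration of the comparative-analysis idea, the sketch below assumes N candidate implementations have already been generated and uses majority voting over their outputs on shared test inputs as a pseudo-oracle. The voting scheme and all names here are assumptions for exposition, not the authors' D-GAI implementation.

```python
"""Illustrative sketch of N-version (differential) assessment of
generated code. The majority-vote scheme and names are assumptions
for the sake of example, not the paper's implementation."""
from collections import Counter
from typing import Any, Callable

Candidate = Callable[[Any], Any]

def differential_assess(candidates: list[Candidate],
                        test_inputs: list[Any]) -> list[int]:
    """Return indices of candidates that agree with the majority output
    on every test input; disagreement flags a suspect version."""
    surviving = set(range(len(candidates)))
    for x in test_inputs:
        outputs = {}
        for i in list(surviving):
            try:
                outputs[i] = candidates[i](x)
            except Exception:
                surviving.discard(i)  # crashing versions are rejected outright
        if not outputs:
            break
        # With no ground-truth specification, the majority output over
        # the surviving versions serves as a pseudo-oracle.
        majority, _ = Counter(map(repr, outputs.values())).most_common(1)[0]
        surviving = {i for i, out in outputs.items() if repr(out) == majority}
    return sorted(surviving)

# Example: three "generated" versions of an absolute-value function,
# one of which is buggy for negative inputs; it is voted out.
versions = [abs, lambda x: x if x >= 0 else -x, lambda x: x]
print(differential_assess(versions, [3, -4, 0]))  # -> [0, 1]
```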