Contents
Logic Query of Thoughts: Guiding Large Language Models to Answer Complex Logic Queries with Knowledge Graphs
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Gymnasium: A Standard Interface for Reinforcement Learning Environments
Recycled Attention: Efficient inference for long-context language models
End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering
GenCode: A Generic Data Augmentation Framework for Boosting Deep Learning-Based Code Understanding
CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
Reminding Multimodal Large Language Models of Object-aware Knowledge with Retrieved Tags
Score-based generative diffusion with "active" correlated noise sources
TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling
Stronger Random Baselines for In-Context Learning
Continual Memorization of Factoids in Large Language Models
BehaviorGPT: Smart Agent Simulation for Autonomous Driving with Next-Patch Prediction
Logic Query of Thoughts: Guiding Large Language Models to Answer Complex Logic Queries with Knowledge Graphs
Authors: Lihui Liu, Zihao Wang, Ruizhong Qiu, Yikun Ban, Eunice Chan, Yangqiu Song, Jingrui He, Hanghang Tong
Source and references: https://arxiv.org/abs/2404.04264v4
Introduction
This paper introduces a novel model called 'Logic-Query-of-Thoughts' (LGOT) that combines Large Language Models (LLMs) with knowledge graph reasoning to enhance the accuracy of answering complex logic queries.
Key Points
Demonstrates that LLMs are insufficient for answering complex logic queries that require multiple reasoning steps.
Introduces the LGOT algorithm, which decomposes a complex logic query into a sequence of simpler subquestions and guides the LLM through them step by step.
Evaluates LGOT on various real-world datasets, showing state-of-the-art performance with up to 20% improvement over ChatGPT.
Methodology
LGOT utilizes a combination of knowledge graph reasoning methods and LLMs to identify potentially correct answers for each subquestion. These results are merged to generate a comprehensive answer set, which then serves as input for subsequent subquestions. This iterative process continues until the final answers are obtained.
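To make the loop concrete, here is a minimal, self-contained Python sketch of this iterative answer propagation. The toy graph, the `kg_reason` and `llm_candidates` helpers, and the simple set-union merge are illustrative stand-ins, not the paper's implementation, which uses learned fuzzy-logic KG reasoning and a scored fusion of both branches.

```python
# Toy knowledge graph: (head entity, relation) -> set of tail entities.
TOY_KG = {
    ("Inception", "directed_by"): {"Christopher Nolan"},
    ("Christopher Nolan", "born_in"): {"London"},
}

def kg_reason(kg, relation, entities):
    """Knowledge-graph branch: follow one relation hop from each candidate."""
    return {t for e in entities for t in kg.get((e, relation), set())}

def llm_candidates(relation, entities):
    """LLM branch (stubbed): LGOT would prompt the model with one subquestion."""
    return set()  # hypothetical placeholder for the LLM's candidate answers

def lgot(kg, anchor, relation_chain):
    """Answer a chain query by merging KG and LLM candidates at each hop."""
    candidates = {anchor}
    for rel in relation_chain:
        # Merge both branches; the merged set feeds the next subquestion.
        candidates = kg_reason(kg, rel, candidates) | llm_candidates(rel, candidates)
    return candidates

# "Where was the director of Inception born?" as a two-hop chain query.
print(lgot(TOY_KG, "Inception", ["directed_by", "born_in"]))  # {'London'}
```

Because the two branches are merged at every hop, the LLM can fill in hops missing from an incomplete knowledge graph, while the graph grounds the LLM's answers at each step.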
Results and Findings
The experimental results demonstrate that LGOT significantly outperforms existing methods, including ChatGPT, on various logic query datasets with incomplete knowledge graphs. For example, LGOT achieves up to 20% improvement over ChatGPT on the MetaQA dataset.
Implications and Conclusions
The proposed LGOT model provides an effective way to integrate knowledge graph reasoning with LLMs, mitigating the hallucination problem of LLMs and the incompleteness issue of knowledge graphs. This research has broader impacts in improving the reliability and accuracy of LLMs for complex reasoning tasks.
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Authors: Muyang Li, Yujun Lin, Zhekai Zhang, Tianle Cai, Xiuyu Li, Junxian Guo, Enze Xie, Chenlin Meng, Jun-Yan Zhu, Song Han
Source and references: https://arxiv.org/abs/2411.05007v2
Introduction
This paper proposes SVDQuant, a novel 4-bit post-training quantization paradigm for diffusion models. The goal is to accelerate the inference of large-scale diffusion models, which are computationally intensive and memory-demanding, to enable their deployment on edge devices.
Key Points
Introduces a low-rank branch to absorb outliers in both the weights and activations, easing the quantization process.
Co-designs an inference engine called Nunchaku that fuses the kernels of the low-rank and low-bit branches to reduce memory usage and cut redundant data-movement overhead.
Supports both INT4 and FP4 quantization, and integrates seamlessly with pre-trained low-rank adapters (LoRA) without requiring re-quantization.
Achieves a 3.5× reduction in memory usage and a 3.0× speedup over weight-only quantization on an NVIDIA RTX 4090 laptop GPU.
Methodology
The core idea of SVDQuant is to introduce a low-rank branch that absorbs the outliers in both the weights and activations. First, the outliers are migrated from the activations to the weights via smoothing. Then, Singular Value Decomposition (SVD) is applied to the updated weights, decomposing them into a low-rank branch and a residual. The low-rank branch operates at 16-bit precision, allowing the residual to be quantized to 4 bits.
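The decomposition can be illustrated in a few lines of NumPy. This is a simplified sketch rather than the paper's method: the smoothing step is omitted, and the rank and the naive symmetric per-tensor 4-bit quantizer are illustrative choices (the actual speedups come from Nunchaku's fused kernels).

```python
import numpy as np

def svdquant_decompose(W, rank=32, bits=4):
    """Split W into a 16-bit low-rank branch L1 @ L2 plus a 4-bit residual.

    In the paper, activation outliers are first migrated into W via
    smoothing; that step is omitted in this sketch.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    L1 = U[:, :rank] * S[:rank]      # low-rank factors, kept in high precision
    L2 = Vt[:rank, :]
    residual = W - L1 @ L2           # what remains after absorbing outliers
    # Naive symmetric per-tensor 4-bit quantization of the residual.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(residual).max() / qmax
    R_q = np.clip(np.round(residual / scale), -qmax - 1, qmax).astype(np.int8)
    return L1.astype(np.float16), L2.astype(np.float16), R_q, scale

# Forward pass: X @ W  ≈  X @ (L1 @ L2)  +  scale * (X @ R_q)
W = np.random.randn(256, 256)
X = np.random.randn(8, 256)
L1, L2, R_q, scale = svdquant_decompose(W)
approx = X @ (L1 @ L2).astype(np.float64) + (X @ R_q) * scale
print(np.abs(approx - X @ W).max())  # error left by 4-bit quantization of the residual
```

Because the SVD strips out the largest singular components, the residual has a much smaller dynamic range than the original weights, which is what makes aggressive 4-bit quantization tolerable.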
Results and Findings
Extensive experiments on various text-to-image diffusion models, including FLUX.1, PixArt-Σ, and SDXL, demonstrate that SVDQuant can effectively preserve image quality while achieving significant memory savings and speedup. Compared to the original 16-bit models, the 4-bit INT and FP models produced by SVDQuant show comparable or even better visual quality. On the 12B FLUX.1 model, SVDQuant reduces memory usage by 3.5× and delivers a 3.0× speedup over the weight-only quantized baseline on an NVIDIA RTX 4090 laptop GPU.
Implications and Conclusions
The proposed SVDQuant and the Nunchaku inference engine enable the efficient deployment of large-scale diffusion models on edge devices, unlocking broader potential for interactive AI applications. The method's ability to seamlessly integrate with pre-trained LoRAs further enhances its practical applicability.
Gymnasium: A Standard Interface for Reinforcement Learning Environments
Authors: Mark Towers, Ariel Kwiatkowski, Jordan Terry, John U. Balis, Gianluca De Cola, Tristan Deleu, Manuel Goulão, Andreas Kallinteris, Markus Krimmel, Arjun KG, Rodrigo Perez-Vicente, Andrea Pierré, Sander Schulhoff, Jun Jet Tai, Hannah Tan, Omar G. Younis
Source and references: https://arxiv.org/abs/2407.17032v3
Introduction
This paper introduces Gymnasium, an open-source library that provides a standard API for reinforcement learning (RL) environments. Gymnasium aims to address the lack of standardization in RL environment and algorithm implementations, which has hindered progress in the field.
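The standard API boils down to a reset/step loop; the following is the library's documented usage pattern:

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)  # reset returns (observation, info)

for _ in range(1000):
    action = env.action_space.sample()  # replace with a trained policy
    # step returns a 5-tuple, separating termination from time-limit truncation
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()
```

Any environment implementing this interface can be swapped in without changing the training loop, which is what makes the standardization useful.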