State of AI

Bi-Weekly AI Research Roundup

Latest research summaries in ML, Robotics, CV, NLP and AI

State of AI
Aug 23, 2024

Contents

  1. Non-autoregressive Generative Models for Reranking Recommendation

  2. Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

  3. Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling

  4. MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

  5. Causal Reasoning and Large Language Models: Opening a New Frontier for Causality

  6. Leveraging Chemistry Foundation Models to Facilitate Structure Focused Retrieval Augmented Generation in Multi-Agent Workflows for Catalyst and Materials Design

  7. KOSMOS-2.5: A Multimodal Literate Model

  8. NYU CTF Dataset: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security

  9. FocusLLM: Scaling LLM's Context by Parallel Decoding

  10. Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation

  11. Natural Language Programming in Medicine: Administering Evidence Based Clinical Workflows with Autonomous Agents Powered by Generative Large Language Models

  12. xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

  13. Exploiting Student Parallelism for Low-latency GPU Inference of BERT-like Models in Online Services

  14. Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

  15. RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment


Non-autoregressive Generative Models for Reranking Recommendation

Authors: Yuxin Ren, Qiya Yang, Yichun Wu, Wei Xu, Yalong Wang, Zhiqiang Zhang

Source and references: https://arxiv.org/abs/2402.06871v4


Introduction

This paper proposes a non-autoregressive generative model called NAR4Rec for reranking recommendations in real-time recommendation systems.

Key Points

  • The paper addresses the challenges of using autoregressive models in real-time recommendation systems, including slow inference speed, training-inference discrepancy, and limited information utilization.

  • NAR4Rec, a non-autoregressive generative model, is introduced to generate all items in the recommendation list simultaneously, improving efficiency.

  • The authors address the challenges of sparse training data and dynamic candidates by introducing a matching model.

  • Unlikelihood training is used to distinguish feasible from infeasible sequences, and contrastive decoding is proposed to capture correlations among target items.

  • Extensive offline experiments and online A/B tests demonstrate the superior performance of NAR4Rec compared to state-of-the-art reranking methods.
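The unlikelihood idea above can be illustrated with a toy objective. This is not the paper's actual loss, just a minimal sketch of the general unlikelihood-training pattern: reward probability mass on feasible sequences with a standard likelihood term, and penalize mass on infeasible ones with a `-log(1 - p)` term. The probability values are made up for illustration.

```python
import math

def unlikelihood_loss(p_feasible, p_infeasible):
    """Toy unlikelihood objective: maximize the probability assigned to
    feasible sequences (standard negative log-likelihood) while pushing
    down the probability assigned to infeasible ones via -log(1 - p)."""
    likelihood_term = -sum(math.log(p) for p in p_feasible)
    unlikelihood_term = -sum(math.log(1.0 - p) for p in p_infeasible)
    return likelihood_term + unlikelihood_term

# A model that puts high probability on feasible sequences and low
# probability on infeasible ones incurs a smaller loss.
good = unlikelihood_loss(p_feasible=[0.9, 0.8], p_infeasible=[0.1, 0.2])
bad = unlikelihood_loss(p_feasible=[0.2, 0.1], p_infeasible=[0.9, 0.8])
```

Minimizing this loss simultaneously pulls feasible-sequence probabilities toward 1 and infeasible-sequence probabilities toward 0, which is the behavior the paper relies on to separate the two classes of sequences.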

Methodology

The proposed NAR4Rec model consists of a candidates encoder and a position encoder. The candidates encoder uses a Transformer architecture to encode the candidate items, while the position encoder captures position-specific information. A matching mechanism is employed to match each candidate with each position in the target sequence, generating probabilities for each candidate at every position.
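The candidate-to-position matching step can be sketched in a few lines. This is a simplified stand-in for the paper's Transformer-based encoders: here each candidate and each position is just a fixed embedding vector (toy values, not learned), and the match score is a dot product normalized with a softmax over candidates, yielding a probability distribution over candidates at every position.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def match_candidates_to_positions(cand_embs, pos_embs):
    """For each position embedding, score every candidate embedding by a
    dot product and normalize, producing one probability distribution
    over candidates per position in the target sequence."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return [softmax([dot(c, p) for c in cand_embs]) for p in pos_embs]

# Toy 2-d embeddings: three candidate items, two positions.
cands = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
positions = [[2.0, 0.0], [0.0, 2.0]]
probs = match_candidates_to_positions(cands, positions)
```

Because every position's distribution is computed independently, all items in the list can be generated in a single parallel step, which is what gives the non-autoregressive approach its inference-speed advantage over left-to-right generation.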

Results and Findings

Offline experiments show that NAR4Rec outperforms state-of-the-art reranking methods. Online A/B tests further validate the effectiveness of NAR4Rec, which has been fully deployed in the popular video app Kuaishou with over 300 million daily active users, notably improving the user experience.

Implications and Conclusions

The non-autoregressive approach of NAR4Rec significantly improves the efficiency of real-time recommendation systems by enabling simultaneous generation of all items in the recommendation list, addressing the challenges associated with autoregressive models.


Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Authors: Chunting Zhou, Lili Yu, Arun Babu, Kushal Tirumala, Michihiro Yasunaga, Leonid Shamis, Jacob Kahn, Xuezhe Ma, Luke Zettlemoyer, Omer Levy

Source and references: https://arxiv.org/abs/2408.11039v1


Introduction

This paper introduces Transfusion, a method for training a single unified model to understand and generate both discrete (text) and continuous (image) data.
