Contents
Good things come in small packages: Should we adopt Lite-GPUs in AI infrastructure?
HiMix: Reducing Computational Complexity in Large Vision-Language Models
Accelerating Large Language Models through Partially Linear Feed-Forward Network
Computational Protein Science in the Era of Large Language Models (LLMs)
Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics
Moonshine: Distilling Game Content Generators into Steerable Generative Models
Aligning Instruction Tuning with Pre-training
Good things come in small packages: Should we adopt Lite-GPUs in AI infrastructure?
Authors: Burcu Canakci, Junyi Liu, Xingbo Wu, Nathanaël Cheriere, Paolo Costa, Sergey Legtchenko, Dushyanth Narayanan, Ant Rowstron
Source and references: https://arxiv.org/abs/2501.10187v1
Introduction
The paper proposes an alternative approach to scaling AI infrastructure: instead of complex and expensive large GPUs, build clusters from "Lite-GPUs", GPUs with a single, small die and a fraction of the capabilities of larger GPUs.
Key Points
Lite-GPUs offer lower manufacturing cost, higher hardware yield, better power efficiency, and lower cooling requirements compared to large GPU packages.
Lite-GPUs can enable finer-grained resource management, power optimization, and fault tolerance in AI clusters.
Challenges around distributed workload management, efficient networking, and memory management need to be addressed to realize the benefits of Lite-GPUs.
Methodology
The authors use roofline modeling to evaluate the performance of Lite-GPU clusters running large language model (LLM) inference workloads. They compare Lite-GPU configurations that vary in network bandwidth, memory bandwidth, and FLOPS against a baseline cluster of NVIDIA H100 GPUs.
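To make the comparison concrete, here is a minimal roofline sketch in Python. It is not the authors' model: the device numbers (a roughly H100-class baseline and a hypothetical Lite-GPU with a quarter of its compute but a higher bandwidth-to-compute ratio) are illustrative assumptions, and network costs, which the paper identifies as the main constraint on a basic Lite-GPU cluster, are ignored here.

```python
# Minimal roofline sketch (not the paper's model): attainable throughput is
# bounded by peak compute and by arithmetic intensity times memory bandwidth.
# All hardware numbers below are illustrative assumptions, not measured specs.

def roofline_tflops(arithmetic_intensity, peak_tflops, mem_bw_tbs):
    """Attainable TFLOP/s for a kernel with the given arithmetic intensity
    (FLOPs per byte of memory traffic) on a device with the given peaks."""
    return min(peak_tflops, arithmetic_intensity * mem_bw_tbs)

# Hypothetical baseline: one large GPU (roughly H100-class numbers).
big_gpu = {"peak_tflops": 1000.0, "mem_bw_tbs": 3.35}

# Hypothetical Lite-GPU: ~1/4 of the compute but a better bandwidth-to-compute
# ratio; four of them match the big GPU's FLOPS with more aggregate bandwidth.
lite_gpu = {"peak_tflops": 250.0, "mem_bw_tbs": 1.2}
n_lite = 4

for ai in (1, 10, 100, 1000):  # FLOPs per byte
    big = roofline_tflops(ai, **big_gpu)
    lite = n_lite * roofline_tflops(ai, **lite_gpu)
    print(f"AI={ai:5d} FLOP/B  big GPU: {big:7.1f} TFLOP/s  "
          f"4x Lite-GPU: {lite:7.1f} TFLOP/s")
```

Under these assumed numbers, the Lite-GPU cluster wins at low arithmetic intensity on aggregate bandwidth and matches the baseline at high intensity, where both hit the same compute ceiling.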
Results and Findings
The results show that while a basic Lite-GPU cluster may face performance limitations due to increased network demands, customized Lite-GPU configurations can match or even exceed the performance of the H100 cluster, especially for memory-bound stages of LLM inference. The improved bandwidth-to-compute ratio and cooling efficiency of Lite-GPUs contribute to these performance gains.
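A back-of-the-envelope calculation shows why the memory-bound stage is where the bandwidth-to-compute ratio matters. The numbers below (fp16 weights, a 70B-parameter model, batch size 1, KV-cache traffic ignored) are illustrative assumptions, not figures from the paper.

```python
# Rough arithmetic intensity of the two LLM inference stages,
# using illustrative numbers (fp16 weights, 70B-parameter model).

params = 70e9
bytes_per_param = 2                      # fp16
weight_bytes = params * bytes_per_param

# Decode: each generated token applies every weight once (~2 FLOPs per
# parameter for a multiply-add) while streaming all weights from memory.
flops_per_token = 2 * params
decode_ai = flops_per_token / weight_bytes    # = 1 FLOP/byte
print(f"decode arithmetic intensity ~ {decode_ai:.0f} FLOP/B (memory-bound)")

# Prefill: a prompt of n tokens reuses each weight n times, so intensity
# grows with the number of tokens processed per weight load.
n_prompt_tokens = 2048
prefill_ai = decode_ai * n_prompt_tokens
print(f"prefill arithmetic intensity ~ {prefill_ai:.0f} FLOP/B (compute-bound)")
```

At roughly 1 FLOP per byte, decode sits far below the intensity needed to saturate any modern GPU's compute, so aggregate memory bandwidth, which a Lite-GPU cluster can provide more cheaply, sets the token rate.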
Implications and Conclusions
The paper suggests that Lite-GPUs have the potential to disrupt the design and scaling of AI infrastructure, providing cost-effective and efficient alternatives to large, complex GPU packages. However, key research challenges around distributed systems management need to be addressed to realize the full benefits of Lite-GPU deployments.
Aligning Instruction Tuning with Pre-training
Authors: Yiming Liang, Tianyu Zheng, Xinrun Du, Ge Zhang, Xingwei Qu, Xiang Yue, Chujie Zheng, Jiaheng Liu, Lei Ma, Wenhu Chen, Guoyin Wang, Zhaoxiang Zhang, Wenhao Huang, Jiajun Zhang
Source and references: https://arxiv.org/abs/2501.09368v2
Introduction
This paper proposes Aligning Instruction Tuning with Pre-training (AITP), a method that bridges the gap between instruction-tuning datasets and pre-training corpora to improve the performance of large language models (LLMs) on instruction-following tasks.
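As a rough illustration of what bridging these two distributions can look like, the sketch below interleaves raw pre-training text into each fine-tuning batch. This is a hypothetical illustration, not the AITP algorithm itself; the function name and the 0.2 mixing ratio are assumptions.

```python
# Hypothetical illustration of narrowing the gap between instruction-tuning
# data and the pre-training distribution: interleave raw pre-training text
# into each fine-tuning batch. This is NOT the AITP method from the paper.
import random

MIX_RATIO = 0.2  # assumed fraction of pre-training samples per batch

def mixed_batches(instruction_data, pretraining_corpus, batch_size=8):
    """Yield batches where ~MIX_RATIO of the examples are plain pre-training
    sequences (trained with the ordinary LM loss) and the rest are
    instruction-response pairs (loss on the response tokens only)."""
    n_pretrain = max(1, int(MIX_RATIO * batch_size))
    while True:
        batch = random.sample(instruction_data, batch_size - n_pretrain)
        batch += random.sample(pretraining_corpus, n_pretrain)
        random.shuffle(batch)
        yield batch
```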