Greetings,
Welcome to the monumental 30th edition of the State of AI. This issue promises to be a treasure trove of groundbreaking discoveries and innovations. We explore the pioneering work in language model alignment with Zephyr, and push the boundaries of image-context reasoning via HallusionBench. Discover the intricacies of Matryoshka Diffusion Models and delve into Woodpecker's capabilities in hallucination correction for multimodal large language models. Last but not least, be amazed by HyperFields' revolutionary approach to zero-shot generation of Neural Radiance Fields from text.
Each topic showcases the remarkable strides we are taking in advancing multi-modal AI, alignment techniques, and generative models. This issue provides a comprehensive and stimulating narrative about the ever-expanding frontier of artificial intelligence. Enjoy your read!
Best regards,
Contents
Zephyr: Direct Distillation of LM Alignment
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Matryoshka Diffusion Models
Woodpecker: Hallucination Correction for Multimodal Large Language Models
HyperFields: Towards Zero-Shot Generation of NeRFs from Text
Zephyr: Direct Distillation of LM Alignment
Authors: Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib, Nathan Sarrazin, Omar Sanseviero, Alexander M. Rush, and Thomas Wolf
Source & References: https://arxiv.org/abs/2310.16944
Introduction
In the constantly evolving landscape of large language models (LLMs), a key challenge lies in aligning these models to user intent without access to human annotations or high operational costs. The authors of the paper "Zephyr: Direct Distillation of LM Alignment" propose a three-step approach: distilled supervised fine-tuning (dSFT), AI feedback through preferences (AIF), and distilled direct preference optimization (dDPO). The end product, Zephyr-7B, sets a new state of the art for 7B-parameter models on chat benchmarks, providing improved intent alignment without human annotation.
Smaller Language Models and Distillation
With the proliferation of open-source large language models (LLMs), researchers aim to improve the performance of smaller models by distilling knowledge from larger ones. This process has boosted the capabilities of open models on many tasks; however, the resulting models often lack intent alignment, meaning they do not respond well to natural prompts or adhere to users' preferences.
To align an open-source LLM solely through distillation, the researchers use AI feedback from an ensemble of teacher models as preference data. This method, distilled direct preference optimization (dDPO), requires no human annotation and no sampling from the model during fine-tuning. As a result, the final chat model can be trained in a matter of hours on modern GPUs.
Three-Step Methodology
Distilled Supervised Fine-Tuning (dSFT): The first step uses a large-scale, self-instruct-style dataset (UltraChat) to teach the raw LLM to respond to user prompts. Given access to a teacher model, the teacher generates both instructions and responses, and the student is trained on the resulting dialogues in a supervised manner, yielding a large and diverse fine-tuning set (a minimal sketch of this data-generation step follows).
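As a rough illustration of the dSFT data-collection loop (not the exact UltraChat pipeline, which also refines instructions and builds multi-turn dialogues), the sketch below uses a hypothetical teacher_generate call standing in for the teacher model:

# Minimal sketch of dSFT data generation, assuming a hypothetical teacher API.
from typing import Callable, Dict, List

def build_dsft_dataset(seed_prompts: List[str],
                       teacher_generate: Callable[[str], str]) -> List[Dict[str, str]]:
    dataset = []
    for prompt in seed_prompts:
        response = teacher_generate(prompt)  # teacher answers the prompt
        dataset.append({"prompt": prompt, "completion": response})
    return dataset

# The student is then fine-tuned on these (prompt, completion) pairs with a
# standard next-token cross-entropy loss, as in ordinary supervised fine-tuning.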
AI Feedback through Preferences (AIF): Human preference feedback on LLM responses is a standard way to align LLMs; for distillation, the researchers instead use AI preferences from a teacher model over outputs generated by other models. Concretely, they use the UltraFeedback dataset, in which each prompt is answered by several models and the responses are scored by the teacher; the highest-scoring response becomes the "chosen" completion and one of the lower-scoring responses the "rejected" one, yielding binary preference pairs that can be used entirely offline (a rough sketch of this pairing step is shown below).
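A rough sketch of the pairing step, assuming hypothetical field names for UltraFeedback-style records (each record holding a prompt and several teacher-scored completions):

import random

def build_preference_pairs(records):
    # Each record is assumed to look like:
    # {"prompt": str, "completions": [{"text": str, "score": float}, ...]}
    pairs = []
    for rec in records:
        ranked = sorted(rec["completions"], key=lambda c: c["score"], reverse=True)
        chosen = ranked[0]                    # highest teacher score
        rejected = random.choice(ranked[1:])  # one of the remaining completions
        pairs.append({"prompt": rec["prompt"],
                      "chosen": chosen["text"],
                      "rejected": rejected["text"]})
    return pairs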
Distilled Direct Preference Optimization (dDPO): The final step optimizes the student directly on the static preference data with direct preference optimization (DPO). DPO expresses the reward implicitly as the scaled log-ratio between the student policy and the frozen dSFT reference model; plugging this reward into the preference model yields a simple classification-style loss over (chosen, rejected) pairs that can be minimized by gradient descent, with no reward-model training or sampling required (a minimal sketch of this loss is given below).
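For concreteness, here is a minimal PyTorch sketch of the DPO objective; the inputs are per-sequence log-probabilities summed over completion tokens, and beta is the usual DPO temperature hyperparameter:

import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit rewards are the beta-scaled log-ratios between student and reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()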
Experimental Framework
The experiments use Mistral 7B, a state-of-the-art 7B base LLM, fine-tuned with the Transformer Reinforcement Learning (TRL) library. The models are trained on UltraChat, a dataset of multi-turn dialogues generated with a teacher model, for the dSFT stage, and on UltraFeedback, a dataset of model responses scored by a teacher, for the preference stages.
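In practice, both fine-tuning stages can be driven by TRL's trainer classes. The snippet below is only an illustrative sketch of the dDPO stage: the checkpoint path and the toy dataset are placeholders, and the DPOTrainer arguments assume a TRL release from around the paper's publication (newer versions move beta into a DPOConfig object).

from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

checkpoint = "path/to/dsft-checkpoint"  # hypothetical dSFT checkpoint
model = AutoModelForCausalLM.from_pretrained(checkpoint)       # student to be updated
ref_model = AutoModelForCausalLM.from_pretrained(checkpoint)   # frozen dSFT reference
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Preference pairs with the column names DPOTrainer expects.
train_dataset = Dataset.from_list([
    {"prompt": "What is the capital of France?",
     "chosen": "The capital of France is Paris.",
     "rejected": "France does not have a capital."},
])

trainer = DPOTrainer(
    model=model,
    ref_model=ref_model,
    beta=0.1,  # strength of the KL-style constraint toward the reference model
    args=TrainingArguments(output_dir="zephyr-dpo-sketch", per_device_train_batch_size=1),
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()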
To evaluate the models, the researchers use benchmarks such as MT-Bench, AlpacaEval, and the Open LLM Leaderboard, which measure a model's ability to follow instructions and respond to challenging prompts across a diverse range of domains. Zephyr-7B's performance is compared with various open-source and proprietary models to demonstrate its gains in intent alignment.
Results and Discussion
Zephyr-7B achieves remarkable performance on chat benchmarks compared to other open and proprietary models, showcasing improved intent alignment without requiring human annotation. Its performance on the Open LLM Leaderboard also shows that the fine-tuning does not degrade the base model's reasoning and truthfulness capabilities.
The combination of dSFT, AIF, and dDPO as a three-step methodology offers promising results in the realm of large language models. It is worth noting that the work does not focus on safety considerations of the models, such as preventing harmful outputs or illegal advice. This aspect presents an important subject for future research, as curating synthetic data for such models poses numerous challenges.
Conclusion
In summary, the paper "Zephyr: Direct Distillation of LM Alignment" presents a three-step approach to align open-source large language models to user intent without human annotation. The methodology comprises distilled supervised fine-tuning (dSFT), AI feedback through preferences (AIF), and distilled direct preference optimization (dDPO). Notably, Zephyr-7B achieves outstanding performance on chat benchmarks, demonstrating the effectiveness of this approach for creating aligned models with reduced human intervention.
This research brings us one step closer to a more refined and intent-aligned generation of language models that can be effectively utilized for various applications. However, the safety considerations left unaddressed in the current scope of work pose intriguing challenges for future research on distillation and alignment of language models.
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Authors: Fuxiao Liu, Tianrui Guan, Zongxia Li, Lichang Chen, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou
Source & References: https://arxiv.org/abs/2310.14566
Introduction
The exciting advancements in artificial intelligence have seen large language models (LLMs) integrated with vision systems to give birth to large vision-language models (LVLMs). This powerful combination of language and vision has led to significant progress in image reasoning tasks. However, despite their impressive capabilities, state-of-the-art LVLMs like GPT-4V(ision) and LLaVA-1.5 still have their shortcomings. Most notably, their strong language bias can sometimes overshadow the visual context and cause them to rely too much on their parametric memory, leading to what the researchers call "Language Hallucination" and "Visual Illusion."
To further understand and study these challenges, Fuxiao Liu and his team introduce "HallusionBench," a curated benchmark specifically designed to investigate the visual illusion and knowledge hallucination of LVLMs. By analyzing various examples and failure scenarios in detail, the researchers hope to provide valuable insights that can help improve LVLMs in the future.