AI Research Roundup: Long-Context LLMs, Multimodal Breakthroughs, and Optimizing AI at Scale
Latest research summaries in ML, Robotics, CV, NLP and AI
Contents
Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems
LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
Humanoid-VLA: Towards Universal Humanoid Control with Visual Integration
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
STAR: Scale-wise Text-conditioned AutoRegressive image generation
Scaling Test-Time Compute Without Verification or RL is Suboptimal
Dynamic Low-Rank Sparse Adaptation for Large Language Models
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems
Authors: Ernest Davis, Scott Aaronson
Source and references: https://arxiv.org/abs/2308.05713v4
Introduction
This paper reports a test of the large language model GPT-4, augmented with the Wolfram Alpha and Code Interpreter plug-ins, on 105 original science and math problems at the high school and college levels.
Key Points
GPT-4 with either the Wolfram Alpha or Code Interpreter plug-in is significantly stronger than GPT-4 alone on the tested problems.
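To make the evaluation setup concrete, here is a minimal, hypothetical sketch of the kind of comparison loop involved: run each problem through GPT-4 alone and through GPT-4 with each plug-in, then score the answers. The function names (`ask_model`, `grade`) and the boolean scoring are placeholders of ours, not the paper's interface or grading rubric.

```python
"""Hypothetical sketch of a plug-in comparison loop (names are placeholders,
not from the paper)."""

from dataclasses import dataclass


@dataclass
class Problem:
    prompt: str      # problem statement, e.g. a calculus or physics question
    reference: str   # reference answer used when grading


def ask_model(problem: Problem, plugin: str | None) -> str:
    """Placeholder: send the problem to GPT-4, optionally with a plug-in
    ('wolfram_alpha' or 'code_interpreter'), and return its answer text."""
    raise NotImplementedError("depends on the model/plug-in interface you use")


def grade(answer: str, problem: Problem) -> bool:
    """Placeholder: decide whether the answer matches the reference.
    The paper's grading criteria are richer than a single boolean check."""
    raise NotImplementedError


def evaluate(problems: list[Problem], plugin: str | None) -> float:
    """Fraction of problems answered correctly under one configuration."""
    correct = sum(grade(ask_model(p, plugin), p) for p in problems)
    return correct / len(problems)


# Usage: compare GPT-4 alone against GPT-4 with each plug-in.
# for config in (None, "wolfram_alpha", "code_interpreter"):
#     print(config, evaluate(problems, config))
```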