Greetings,
Welcome to the landmark 35th edition of the State of AI. In this issue, we celebrate the ever-expanding horizons of artificial intelligence. We explore the discovery of millions of new materials using deep learning, mark the one-year anniversary of ChatGPT, and examine the rise of open-source large language models. Our journey continues with SeaLLMs, specialized large language models for Southeast Asia, and Merlin, which equips multimodal large language models with foresight capabilities. Finally, we delve into MEDITRON-70B, a groundbreaking project scaling medical pretraining for large language models.
Each article in this edition highlights the extraordinary breadth and depth of AI's impact across different domains, reaffirming its role as a transformative force in our world. Get ready for an enlightening and inspiring journey through the latest AI breakthroughs.
Best regards,
Contents
Millions of new materials discovered with deep learning
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
SeaLLMs -- Large Language Models for Southeast Asia
Merlin: Empowering Multimodal LLMs with Foresight Minds
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
Scaling Deep Learning for Materials Discovery
Authors: Amil Merchant, Simon Batzner, Samuel S. Schoenholz, Muratahan Aykol, Gowoon Cheon, Ekin Dogus Cubuk
Source & References: Nature
Introduction
Researchers have recently made groundbreaking advancements in the field of materials discovery by leveraging the power of deep learning. A new study led by Google DeepMind titled "Scaling Deep Learning for Materials Discovery" describes an innovative approach using graph networks to improve the efficiency of materials discovery by an order of magnitude. This development has the potential to revolutionize various industries, from clean energy to information processing.
The Problem with Traditional Approaches
The discovery of novel functional materials, such as the inorganic crystals used in batteries and photovoltaics, has long been hindered by expensive trial-and-error approaches. Experimental efforts have cataloged on the order of 20,000 stable structures, but they are impractical for large-scale discovery because of their cost, low throughput, and synthesis difficulties. Computational efforts such as the Materials Project and the Open Quantum Materials Database instead use density functional theory (DFT) to approximate physical energies from first principles. So far, however, machine learning methods have not been accurate enough to reliably predict stability or meaningfully accelerate materials discovery.
The Solution: Graph Networks for Materials Exploration (GNoME)
The researchers behind this study proposed a large-scale active learning approach to overcome these limitations, using graph networks to accurately predict stability and guide materials discovery. Their method rests on two pillars: diverse candidate structure generation via symmetry-aware partial substitutions (SAPS) and random structure search, and state-of-the-art graph neural networks (GNNs) that predict material properties from either structure or composition.
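To make the GNN pillar concrete, here is a minimal message-passing sketch in PyTorch that maps a crystal graph (atoms as nodes, neighbor distances as edges) to a total-energy prediction. It is an illustrative stand-in rather than the actual GNoME architecture; the layer sizes, featurization, and toy inputs are assumptions.

```python
# Minimal message-passing GNN for predicting a per-crystal energy from a graph
# of atoms (nodes) and neighbor bonds (edges). Illustrative only, not GNoME.
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.message = nn.Sequential(nn.Linear(2 * dim + 1, dim), nn.SiLU())
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.SiLU())

    def forward(self, h, edge_index, edge_dist):
        src, dst = edge_index  # source / destination node indices, shape (E,)
        m = self.message(torch.cat([h[src], h[dst], edge_dist.unsqueeze(-1)], dim=-1))
        agg = torch.zeros_like(h).index_add_(0, dst, m)  # sum messages per node
        return self.update(torch.cat([h, agg], dim=-1))

class EnergyGNN(nn.Module):
    def __init__(self, n_species=100, dim=64, n_layers=3):
        super().__init__()
        self.embed = nn.Embedding(n_species, dim)   # element type -> vector
        self.layers = nn.ModuleList(MessagePassingLayer(dim) for _ in range(n_layers))
        self.readout = nn.Linear(dim, 1)            # per-atom energy contribution

    def forward(self, species, edge_index, edge_dist):
        h = self.embed(species)
        for layer in self.layers:
            h = layer(h, edge_index, edge_dist)
        return self.readout(h).sum()                # total energy of the structure

# Toy usage: 4 atoms, 6 directed edges with interatomic distances.
species = torch.tensor([3, 8, 8, 26])               # e.g. Li, O, O, Fe (atomic numbers)
edge_index = torch.tensor([[0, 1, 1, 2, 3, 3], [1, 0, 2, 1, 0, 2]])
edge_dist = torch.rand(6) * 3.0
energy = EnergyGNN()(species, edge_index, edge_dist)
```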
By iterating rounds of candidate generation, GNN-based filtering, DFT verification, and retraining on the newly verified data, the GNoME pipeline discovered more than 2.2 million previously unknown stable crystal structures. The researchers also observed that the models' predictive accuracy improved as the training data grew, following the power-law scaling behavior seen in other deep learning domains.
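The loop below is a hedged sketch of that active-learning cycle. The helper callables (generate_candidates, predict_energy, run_dft, train_ensemble) and the thresholds are placeholders standing in for the components described in the paper, not the GNoME codebase.

```python
# Hedged sketch of an active-learning loop for materials discovery: generate
# candidates, filter with the current model ensemble, verify the promising ones
# with DFT, then retrain on the enlarged dataset. All callables are placeholders.
def active_learning(seed_data, generate_candidates, predict_energy,
                    run_dft, train_ensemble, n_rounds=6, threshold=0.05):
    data = list(seed_data)                  # (structure, dft_energy) pairs
    ensemble = train_ensemble(data)         # fit models on everything seen so far
    stable = []
    for _ in range(n_rounds):
        candidates = generate_candidates()  # e.g. SAPS substitutions + random search
        # Keep candidates the models predict to sit near or below the convex hull.
        promising = [s for s in candidates
                     if predict_energy(ensemble, s) < threshold]
        verified = [(s, run_dft(s)) for s in promising]   # expensive ground truth
        stable += [s for s, e in verified if e < 0.0]     # below the hull
        data += verified
        ensemble = train_ensemble(data)     # models improve as the dataset grows
    return stable
```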
Validating the Approach
The discovered crystal structures were validated in two ways: matching against experiments and comparison with the r2SCAN functional. A total of 736 structures independently produced through the GNoME approach matched experimental structures in the Inorganic Crystal Structure Database. The researchers also recomputed energies with r2SCAN, a more accurate functional than the standard projector augmented wave (PAW) Perdew–Burke–Ernzerhof (PBE) setup used in the main GNoME calculations, and found that 84% of the binary and ternary materials retained negative phase-separation energies, confirming their stability under r2SCAN as well.
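For readers unfamiliar with the stability criterion, the snippet below illustrates an energy-above-hull check using pymatgen's phase-diagram tools. The compositions and energies are invented for illustration; they are not values from the GNoME or r2SCAN calculations.

```python
# Illustrative stability check: a candidate is considered stable when its energy
# places it on the convex hull of competing phases (energy above hull = 0).
# Compositions and energies below are made-up numbers, for illustration only.
from pymatgen.core import Composition
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDEntry

# Reference phases (total energies in eV per formula unit, illustrative only).
entries = [
    PDEntry(Composition("Li"),   -1.9),
    PDEntry(Composition("O2"),   -9.9),
    PDEntry(Composition("Li2O"), -14.3),
]
candidate = PDEntry(Composition("Li2O2"), -19.0)

pd = PhaseDiagram(entries + [candidate])
e_above_hull = pd.get_e_above_hull(candidate)   # 0.0 means on the hull (stable)
print(f"Energy above hull: {e_above_hull:.3f} eV/atom")
```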
Expanding Crystal Discovery to New Material Combinations
The GNoME approach enabled efficient discovery in combinatorial material spaces with more than four unique elements, a regime that has been difficult for human-guided exploration to cover. With roughly 2.2 million newly discovered stable crystal structures, the total count of known stable materials increased by an order of magnitude. This result not only validates GNoME's potential but also opens the door to broad exploration of new material combinations for a variety of applications.
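A rough back-of-the-envelope count shows why spaces with more than four elements are hard to explore by hand: the number of possible element combinations explodes. The choice of roughly 90 candidate elements below is an assumption for illustration, not the exact set used in the study.

```python
# Count how many distinct element combinations exist for k-element systems,
# assuming ~90 usable elements (an illustrative choice, not the paper's set).
from math import comb

n_elements = 90
for k in range(2, 7):   # binary through senary systems
    print(f"{k}-element systems: {comb(n_elements, k):,} element combinations")
```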
Unlocking New Modeling Capabilities for Downstream Applications
Besides discovering stable materials, the large and diverse dataset produced through the GNoME approach offers an opportunity to train learned interatomic potentials for condensed-phase molecular dynamics simulations and high-fidelity zero-shot prediction of ionic conductivity. This development showcases the potential for machine learning approaches to accelerate materials property prediction and discovery of novel superionic conductors, without training on any material-specific data.
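As a flavor of what such downstream use looks like, the sketch below runs a short molecular-dynamics trajectory through ASE's calculator interface. The GNoME-trained potential is not assumed to be available here, so ASE's built-in EMT calculator stands in; a learned interatomic potential exposing the same interface would simply replace it.

```python
# Short NVE molecular-dynamics run via ASE. EMT is a placeholder calculator so
# the script runs; a learned interatomic potential would be swapped in here.
from ase.build import bulk
from ase import units
from ase.calculators.emt import EMT                      # placeholder potential
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
from ase.md.verlet import VelocityVerlet

atoms = bulk("Cu", "fcc", a=3.6).repeat((3, 3, 3))       # small test cell
atoms.calc = EMT()                                       # learned potential would go here

MaxwellBoltzmannDistribution(atoms, temperature_K=600)   # initialize velocities
dyn = VelocityVerlet(atoms, timestep=1.0 * units.fs)     # NVE integration
dyn.run(200)                                             # a short trajectory
print(f"Potential energy after MD: {atoms.get_potential_energy():.3f} eV")
```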
Conclusion
The new study on "Scaling Deep Learning for Materials Discovery" has showcased the immense potential of GNoME, a graph network-based approach for scaling deep learning in materials science. Not only does this breakthrough pave the way for efficient materials discovery, but it also demonstrates the power of scaling deep learning for various downstream applications, opening the door for the development of new specialized materials to revolutionize fields like renewable energy and advanced electronics.
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Authors: Hailin Chen, Fangkai Jiao, Xingxuan Li, Chengwei Qin, Mathieu Ravaut, Ruochen Zhao, Caiming Xiong, Shafiq Joty
Source & References: https://arxiv.org/abs/2311.16989
Introduction
When ChatGPT was released a year ago, it marked a significant shift in the AI landscape, both in research and commercial applications. This large language model (LLM) displayed an impressive ability to answer human questions and follow instructions on a variety of tasks. Consequently, interest in LLMs has skyrocketed, with numerous models emerging from academia and industry, including start-ups focusing on LLMs.
Although closed-source LLMs (e.g., OpenAI's GPT, Anthropic's Claude) generally outperform their open-source counterparts, rapid progress in open-source models has led to claims of parity, or in some cases, better performance on certain tasks. This paper, commemorating ChatGPT's first anniversary, offers an exhaustive overview of instances where open-source LLMs have matched or surpassed ChatGPT's performance.