
Large Language Models: Due to the risks, NASA decides against fine-tuning a generative Earth science LLM.
“Based on our initial assessment, the costs and risks associated with developing an exclusive NASA Science Mission Directorate (SMD) decoder (generative) model currently outweigh the benefits.”
In a paper published yesterday in the American Geophysical Union (AGU) journal Perspectives of Earth and Space Scientists, they instead opt to pair an encoder model with an off-the-shelf LLM such as Meta's Llama or OpenAI's GPT in a Retrieval-Augmented Generation (RAG) arrangement.
In other words, the AI-generated answers are grounded in content retrieved from sources external to the Large Language Model, which minimises hallucinations (false information).
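To make the RAG arrangement concrete, here is a minimal sketch of the flow the paper describes: an encoder ranks authoritative documents against the question, and the generative model is prompted to answer only from what was retrieved. The bag-of-words scorer and the function names (`embed`, `retrieve`, `generate_answer`) are illustrative stand-ins, not NASA's or IBM's actual implementation; a real system would use an encoder model such as INDUS for retrieval and send the prompt to an LLM.

```python
# Illustrative sketch of a Retrieval-Augmented Generation (RAG) pipeline.
# The "encoder" is a toy bag-of-words scorer standing in for a real
# encoder model; generate_answer() stands in for a generative LLM call.

def embed(text):
    """Toy encoder: represent text as a set of lowercase words."""
    return set(text.lower().split())

def retrieve(query, corpus, k=1):
    """Rank documents by word overlap with the query; keep the top k."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: len(q & embed(doc)), reverse=True)
    return ranked[:k]

def generate_answer(query, context):
    """Stand-in for the generative step: the prompt pins the model to
    the retrieved context, so the answer stays grounded in it."""
    prompt = f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"
    return prompt  # a real system would send this prompt to the LLM

corpus = [
    "INDUS is a family of encoder models for NASA science tasks.",
    "The MODIS instrument measures aerosol optical depth.",
]
context = "\n".join(retrieve("What does INDUS do?", corpus))
print(generate_answer("What does INDUS do?", context))
```

Because the generator only sees retrieved text, a wrong or missing document produces a refusal or an off-topic answer rather than a confident fabrication, which is the risk-reduction argument the paper makes.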
Abstract
The rapid adoption of artificial intelligence (AI) in scientific research is accelerating progress but also challenging core scientific norms such as accountability, transparency, and replicability. Large language models (LLMs) like ChatGPT are revolutionizing scientific communication and problem-solving, but they introduce complexities regarding authorship and the integrity of scientific work. LLMs have the potential to transform various research practices, including literature surveys, meta-analyses, and data management tasks like entity resolution and query synthesis. Despite their advantages, LLMs present challenges such as content verification, transparency, and accurate attribution. This study explores the appropriate use of LLMs for NASA’s Science Mission Directorate (SMD), considering whether to develop a custom bespoke model or fine-tune an existing open-source model. This article reviews the outcomes and lessons learned from this effort, providing insights for other research groups navigating similar decisions.
Key Points
Generative AI and Language Models have accelerated scientific discovery but also introduced new challenges
NASA collaborated with IBM Research to develop INDUS models tailored to specific science tasks such as document retrieval and classification
NASA is exploring a Retrieval-Augmented Generation strategy that combines encoder models like INDUS with generative models like GPT to minimize risks by grounding responses in authoritative sources, eliminating the need to develop a dedicated generative language model for science
Congrats to the authors on an excellent paper. This has significant implications for the geological sciences, and it fits with a paper I published in December on Ethical Recommendations for Large Language Models in the Geological Sciences.
https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2024CN000258