Sandeep Madireddy — Computer Scientist

About

📋 Biography

Sandeep Madireddy is a Computer Scientist in the Mathematics and Computer Science Division at Argonne National Laboratory. His research spans probabilistic machine learning, bio-inspired and energy-efficient learning, high-performance computing, and generative AI with an emphasis on safety and robustness. His work integrates algorithmic advances with applied research across fusion energy sciences, cosmology and high-energy physics, weather and climate, and material science.

He previously served as a postdoc and assistant computer scientist in Mathematics and Computer Science Division, advised by Prasanna Balaprakash and Stefan Wild. Before joining Argonne, he earned his Ph.D. in mechanical and materials engineering (probabilistic machine learning) from the University of Cincinnati as part of the UC Simulation Center (a Procter & Gamble collaboration). He previously earned his master's at Utah State University and his bachelor's from BITS Pilani, India.

🔬 Projects & Grants

Sandeep serves as a Co-investigator (and AI lead) for DOE and NSF-funded projects:

ModCon/Genesis Mission, Lead for Multimodal Scientific AI Reasoning Models BASE Capability Thrust
AuroraGPT, Co-lead for Evaluation and AI-Safety Thrust
RAPIDS3: A SciDAC Institute for Computer Science, Data, and Artificial Intelligence, AI thrust co-lead
CETOP: A Center for Edge of Tokamak OPtimization, Co-investigator and AI Lead

📝 Professional Service

Sandeep also provides professional services to machine learning, high-performance computing, and domain-science conferences and journals.

PC member (HPC): Supercomputing 2023 and 2026, CCGRID 2024, INFOCOMP 2019-2023; reviewer for IPDPS and IEEE Cluster
Reviewer (AI): ICML (2021-2024), ICLR (2021-2024), NeurIPS (2021-2025), AISTATS (2023, 2025)
Journals: Nature, Neural Networks, NeuroComputing, JMLMC, SIAM SISC, JPDC, TPDS, Parallel Computing, TCC, JoS, MNRAS, CISE, IEEE TPSC, and JoLT

Research Areas

Probabilistic Machine Learning

Research at the intersection of Bayesian Inference and Information-Theoretic learning.

Scientific Machine Learning and AI for Science

Theoretical and Applied Machine learning for Large-scale Science including Fusion Energy, Astrophysics, HPC, and Climate Science

Multi-modal Foundation Models

Theoretical and Applied research on developing and evaluating (for skill and safety) Multimodal Language Models as scientific research assistants and Spatio-Temporal Foundation models for science.

Energy Efficient AI

Energy-efficient AI with Bio-inspired architectures such as Neuromorphics for large scale AI models and life-long learning in this paradigm.

News & Announcements

Oct 1, 2025

(New Funding) The Transformational AI Models Consortium (ModCon)

Senior Personnel (ANL), Lab 25-3560, (Lead PI: Rick Stevens, Argonne National Laboratory), FY26-FY28

Oct 1, 2025

(New Funding) Multimodal AI Foundation Modeling of Turbulent, Multiphase, and Reacting Flows for Propulsion and Power Applications

Co-PI (ANL), Foundational Models for Energy Applications, LDRD Prime Focus Area, (Lead-PI: Pinaki Pal, Argonne National Laboratory), FY26-FY28

Oct 1, 2025

(New Funding) The RAPIDS3 Institute for Artificial Intelligence, Computer Science, and Data

Co-PI (ANL), LAB 25-3510.000002, (Lead PI: Robert Ross, Argonne National Laboratory), FY26-FY31

Oct 1, 2025

(New Funding) Supply Chain Optimizer for Understanding Trends in Critical Minerals (SCOUT-CM)

Co-PI (ANL), Critical Materials and Supply Chains, LDRD Prime Focus Area, (Lead-PI: Barron YoungSmith, Argonne National Laboratory), FY26-FY28

Dec 1, 2024

(New Funding) BIA A CoDesign Methodology to Transform Materials and Computer Architecture Research for Energy Efficiency

Microelectronics Science Research Center Projects for Energy Efficiency and Extreme Environments (LAB 24-3320), (Lead PI: Valerie Taylor, Argonne National Laboratory), FY25-FY27 https://www.energy.gov/sites/default/files/2024-12/122324-msrc-awards-list.pdf

Publications

2026

KORAL: Knowledge Graph Guided LLM Reasoning for SSD Operational Analysis

Preprint

M Akewar, S Madireddy, D Luo, …

arXiv preprint arXiv:2602.10246, 2026

Scholar

Multi-task Modeling for Engineering Applications with Sparse Data

Preprint

Y Comlek, RM Krishnan, SK Ravi, …

arXiv preprint arXiv:2601.05910, 2026

PDF Scholar

This paper presents a Multi-Task Gaussian Processes (MTGP) framework designed for engineering systems with multi-source, multi-fidelity data, addressing issues of data scarcity and varying task correlations. By leveraging inter-task relationships, the framework improves prediction accuracy and reduces computational costs, as demonstrated through benchmarks including the Forrester function, 3D void modeling, and friction-stir welding scenarios.

2025

Aeris: Argonne earth systems model for reliable and skillful predictions

Conference

V Hatanpää, E Ku, J Stock, …

Proceedings of the International Conference for High Performance Computing …, 2025

4 citationsPDF DOI Scholar

This research introduces AERIS, a large-scale pixel-level Swin diffusion transformer, and SWiPe, a technique for efficient parallelism in window-based transformers, to improve weather forecasting with diffusion models. AERIS achieves state-of-the-art performance and scalability on the ERA5 dataset, outperforming existing methods and maintaining stability for seasonal-scale predictions up to 90 days.

Ailuminate: Introducing v1. 0 of the ai risk and reliability benchmark from mlcommons

Preprint

S Ghosh, H Frase, A Williams, …

arXiv preprint arXiv:2503.05731, 2025

17 citationsDOI Scholar

The paper presents AILuminate v1.0, the first comprehensive industry-standard benchmark designed to evaluate AI systems' resistance to prompts eliciting dangerous, illegal, or undesirable behaviors across 12 hazard categories. This benchmark features a novel evaluation framework, extensive prompt datasets, a five-tier grading scale, and an entropy-based response assessment, supported by infrastructure for ongoing development and cross-disciplinary input.

★AstroMLab 1: Who wins astronomy jeopardy!?

Article

YS Ting, TD Nguyen, T Ghosal, …

Astronomy and Computing 51, 100893, 2025

20 citationsPDF DOI Scholar

AstroMLab 1

Article

YS Ting, TD Nguyen, T Ghosal, …

PDF DOI Scholar

AuroraGPT Data Collection Interface

Article

R Underwood, A Maurya, Z Li, …

Argonne National Laboratory (ANL), Argonne, IL (United States), 2025

PDF DOI Scholar

The software package enables efficient data collection to evaluate the performance of large language models (LLMs) on scientific topics. It is designed to support comprehensive assessment, ensuring accurate measurement of LLM capabilities in understanding and processing scientific information.

Automated MCQA Benchmarking at Scale: Evaluating Reasoning Traces as Retrieval Sources for Domain Adaptation of Small Language Models

Conference

O Gokdemir, N Getty, R Underwood, …

Proceedings of the SC'25 Workshops of the International Conference for High …, 2025

1 citationsPDF DOI Scholar

The authors present a scalable, modular framework that automates the creation of multiple-choice question-answering benchmarks from large scientific corpora, demonstrated by generating over 16,000 questions from 22,000 radiation and cancer biology papers. Evaluating small language models on these questions reveals that retrieval-augmented generation using reasoning traces significantly improves performance, enabling several smaller models to outperform GPT-4 on a specialized 2023 exam.

Can LLMs Model the Environmental Impact on SSD?

Conference

M Akewar, G Quan, S Madireddy, …

Proceedings of the 17th ACM Workshop on Hot Topics in Storage and File …, 2025

1 citationsPDF DOI Scholar

Environmental stressors significantly affect SSD performance and reliability, but studying these effects is difficult due to limited data, complex interrelated factors, and the specialized equipment required. Existing storage management techniques neglect environmental impacts, and accurately modeling these effects remains challenging because of varied NAND flash responses and the cascading influence of historical exposure.

Chance-constrained Flow Matching for High-Fidelity Constraint-aware Generation

Preprint

J Liang, Y Sun, A Samaddar, …

arXiv preprint arXiv:2509.25157, 2025

DOI Scholar

This paper introduces Chance-constrained Flow Matching (CCFM), a training-free method that incorporates stochastic optimization during sampling to enforce hard constraints while preserving high-fidelity generation. CCFM guarantees feasibility comparable to repeated projection but operates on noisy intermediates, reducing distributional distortion and complexity.

Data-Efficient Dimensionality Reduction and Surrogate Modeling of High-Dimensional Stress Fields

Journal

A Samaddar, SK Ravi, N Ramachandra, …

Journal of Mechanical Design 147 (3), 031701, 2025

5 citationsPDF DOI Scholar

This study presents a convolutional autoencoder framework enhanced by an information bottleneck loss to model tensor datatypes in simulations under limited data conditions, improving robustness by filtering nuisance information while retaining relevant features. The approach is combined with hyperparameter optimization and is numerically compared with dimensionality reduction-based surrogate modeling methods to demonstrate improved predictive performance for engineering applications.

Eaira: Establishing a methodology for evaluating ai models as scientific research assistants

Preprint

F Cappello, S Madireddy, R Underwood, …

arXiv preprint arXiv:2502.20309, 2025

11 citationsPDF DOI Scholar

This paper presents EAIRA, a comprehensive methodology developed at Argonne National Laboratory for evaluating Large Language Models as scientific research assistants through multiple choice questions, open responses, lab-style experiments, and field-style experiments. These evaluations provide a holistic assessment of LLMs' factual recall, reasoning, problem-solving, and real-world research interactions across diverse scientific domains.

Efficient Flow Matching using Latent Variables

Preprint

A Samaddar, Y Sun, V Nilsson, …

arXiv preprint arXiv:2505.04486, 2025

2 citationsPDF DOI Scholar

Flow matching models typically overlook the clustering structure in target data, leading to inefficient learning, especially for high-dimensional datasets on low-dimensional manifolds. Latent-CFM addresses this by conditioning on pretrained deep latent features, achieving better generation quality with less training and computation across synthetic, image, and physical spatial field datasets.

Ensembles of Neural Surrogates for Parametric Sensitivity in Ocean Modeling

Preprint

Y Sun, R Egele, SHK Narayanan, …

arXiv preprint arXiv:2508.16489, 2025

DOI Scholar

This research improves ocean simulation accuracy by using deep learning surrogates combined with large-scale hyperparameter search and ensemble learning to enhance forward predictions and sensitivity estimations. The ensemble approach also quantifies epistemic uncertainty, increasing the reliability of neural surrogates for parameter tuning and decision making.

Evaluating the Safety and Skill Reasoning of Large Reasoning Models Under Compute Constraints

Preprint

A Balaji, L Chen, R Thakur, …

arXiv preprint arXiv:2509.18382, 2025

PDF DOI Scholar

This work investigates reducing the computational cost of reasoning language models by using reasoning length constraints and model quantization, focusing on their effect on safety performance. It proposes fine-tuning with length controlled policy optimization and quantization to balance computational efficiency with model safety within user-defined compute limits.

Evaluation of Test-Time Compute Constraints on Safety and Skill Large Reasoning Models

Conference

A Balaji, L Chen, R Thakur, …

Proceedings of the SC'25 Workshops of the International Conference for High …, 2025

1 citationsDOI Scholar

This research examines two compute constraint strategies—reasoning length constraint and model quantization—to manage the trade-off between computational efficiency and safety in reasoning language models. It proposes fine-tuning models with a length-controlled policy optimization method and applying quantization to maximize chain-of-thought generation within user-defined compute limits.

Heimdall: Optimizing storage i/o admission with extensive machine learning pipeline

Conference

DH Kurniawan, RA Putri, P Qin, …

Proceedings of the Twentieth European Conference on Computer Systems, 1109-1125, 2025

4 citationsPDF DOI Scholar

This paper presents Heimdall, a machine learning-based I/O admission policy for flash storage that achieves 93% decision accuracy through domain-specific innovations and fine-tuning. Heimdall offers sub-microsecond inference latency, minimal memory overhead, and reduces average I/O latency by 15-35% compared to state-of-the-art methods, supporting various deployment scenarios.

Large Language Models Inference Engines based on Spiking Neural Networks

Preprint

A Balaji, S Madireddy, P Balaprakash

arXiv preprint arXiv:2510.00133, 2025

1 citationsPDF DOI Scholar

This research addresses the computational challenges of transformer models by exploring spiking neural networks (SNNs) for efficient design. The proposed NeurTransformer methodology combines supervised fine-tuning with existing conversion techniques to optimize transformer-based SNNs for inference, reducing latency caused by high spike time-steps.

LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference

Preprint

K Teja Chitty-Venkata, S Madireddy, M Emani, …

arXiv e-prints, arXiv: 2509.02753, 2025

PDF DOI Scholar

This work shows that traditional pruning methods for Mixture-of-Experts (MoE) models mainly reduce memory usage without significantly boosting inference speed on GPUs. The authors propose LExI, a data-free technique that adaptively selects the number of active experts per layer based on model weights, leading to improved inference efficiency and performance on language and vision benchmarks.

Multi-diagnostic Time Series Generative model for Prediction of the Edge Localized Modes in Tokamak plasmas

Article

A Samaddar, Q Gong, S Madireddy, …

DPP 2025, 2025

PDF DOI Scholar

This research addresses edge-localized modes (ELMs), instabilities in high performance tokamak plasmas driven by edge current density and pressure gradients predicted by magnetohydrodynamic peeling-ballooning theory. It introduces a conceptual model for the evolving magnetic separatrix topology in tokamaks with a dominant lower hyperbolic point and discusses the nonlinear dynamics of ELMs governed by this topology.

Talks

InvitedAug 2025

Argonne, IL

Language Model Evaluation and Safety for Scientific Tasks

Argonne Training Program on Extreme-Scale Computing

ContributedJul 2025

Half Day Tutorial: Evaluation of AI Model Scientific Skills

Trillion Parameter Consortium's 2025 all-hands conference and exhibition

ContributedJul 2025

Multi-diagnostic Generative modeling of Edge-Localized Modes in Tokamak plasma

SciDAC PI Meeting (Poster)

ContributedJul 2025

Scalable Flow based Generative Models for Science

SciDAC PI Meeting (Poster)

ContributedMay 2025

Hamburg, Germany

Half Day Tutorial: Leveraging and Evaluation of LLMs For HPC Research

ISC High Performance Computing Conference

Projects

CETOP - A Center for Edge of Tokamak OPtimization

The project overarching objective is to develop the simulation capability and to perform extended MHD and (drift-gyro) k…

Deep LearningFushion Energy ScienceProbabilistic Models

ML Assisted Equilibrium Reconstruction for Tokamak Experiments and Burning Plasmas

Deelop a framework for efficient and accurate equilibrium reconstructions, by automating and maximizing the information …

Deep LearningFushion Energy ScienceProbabilistic Models

Foundations for Correctness Checkability and Performance Predictability of Systems at Scale (ScaleSTUDS)

Multi-dimensional automated scalability tests, program analysis, performance learning and prediction at various levels o…

Deep LearningScientific Machine LearningSciDAC-RAPIDS

View Project

Probabilistic Machine Learning for Rapid Large-Scale and High-Rate Aerostructure Manufacturing (PRISM)

he project goal is to develop a probabilistic ML framework – PRISM – to improve manufacturing efficiency and demonstrate…

Deep LearningMaterial ScienceScientific Machine Learning

Enabling Cosmic Discoveries in the Exascale Era

The overarching objective of this SciDAC-5 project is to create consistent predictions of the dark and visible Universe …

Deep LearningHigh Energy PhysicsProbabilistic Models

RAPIDS2:- SciDAC Institute for Computer Science, Data, and Artificial Intelligence

The objective of RAPIDS2 is to assist the Office of Science (SC) application teams in overcoming computer science, data,…

Deep LearningScientific Machine LearningSciDAC-RAPIDS

View Project

A Transformative Co-Design Approach to Materials and Computer Architecture Research (Threadwork)

Co-design approach that encompasses neuromorphic computing, systems architecture, and datacentric applications. Focus on…

Deep LearningScientific Machine LearningSystems

View Project

Team

Research Staff

Adarsha Balaji,

Research Staff (2024-Current)

Yixuan Sun

Research Staff (2025-Current)

Post Doctoral Researchers

Anirban Samaddar

NSF-MSGI Fellow (2021)

Graduate Students

Josh Nguyen

Visiting Student (2024-)

Tung Nguyen

Visiting Student (2023-)

Ray Sinurat

Ph.D. Candidate (2021-Current)

Zizhang Chen

Summer Intern (2023)