SoR Project Archive

Welcome to the archive of projects from the Summer of Reproducibility (SoR) program. The program connects students with mentors from institutions worldwide to work on research in reproducibility, benchmarking, and artifact evaluation. The table below lists every project with its title, mentors, student, affiliations, and location, along with links to the project blogs and descriptions.

Year	Title	Mentor	Mentor Affiliation	Student	Student Affiliation	Location	Links
2024	Auto Appendix	Sascha Hunold	Technical University of Vienna	Klaus Kraßnitzer	Technical University of Vienna	Austria	Blog 1 \| Blog 2 \| Blog 3
2024	BenchmarkST: Cross-Platform, Multi-Species Spatial Transcriptomics Gene Imputation Benchmarking	Ziheng Duan	University of California, Irvine	Qianru Zhang	University of Waterloo	Canada	Blog 1 \| Blog 2 \| Blog 3
2024	ML-Powered Problem Detection in Chameleon	Ayse Coskun	Boston University	Syed Qasim	Boston University	US	Blog 1 \| Blog 2
2024	Data leakage in applied ML: reproducing examples of irreproducibility	Fraida Fund	New York University	Kyrillos Ishak	Alexandria University	Egypt	Blog 1 \| Blog 2 \| Blog 3
2024	Data leakage in applied ML: reproducing examples of irreproducibility	Fraida Fund	New York University	Shaivi Malik	Guru Gobind Singh Indraprastha University	India	Blog 1 \| Blog 2 \| Blog 3
2024	EdgeRep: Reproducing and benchmarking edge analytic systems	Yuyang (Roy) Huang, Junchen Jiang	University of Chicago	Rafael Sinjunatha Wulangsih	Bandung Institute of Technology	Indonesia	Blog 1
2024	FEP-Bench: Benchmarking for Enhanced Feature Engineering and Preprocessing in Machine Learning	Yuyang (Roy) Huang, Swami Sundararaman	University of Chicago	Lihaowen Zhu	University of Chicago	US	Blog 1 \| Blog 2 \| Blog 3
2024	FetchPipe: Data Science Pipeline for ML-based Prefetching	Haryadi Gunawi, Daniar Kurniawan	University of Chicago	Peiran Qin	University of Chicago	US	Blog 1 \| Blog 2 \| Blog 3
2024	FSA: Benchmarking Fail-Slow Algorithms	Kexin Pei, Ruidan Li	University of Chicago	Xikang Song	University of Chicago	US	Blog 1 \| Blog 2 \| Blog 3
2024	LAST: Let’s Adapt to System Drift	Ray Andrew Sinurat, Sandeep Madireddy	University of Chicago	Joanna Cheng	Johns Hopkins University	US	Blog 1 \| Blog 2 \| Blog 3
2024	LAST: Let’s Adapt to System Drift	Ray Andrew Sinurat, Sandeep Madireddy	University of Chicago	William Nixon	Bandung Institute of Technology	Indonesia	Blog 1 \| Blog 2 \| Blog 3
2024	OpenMLEC: Open-source MLEC implementation with HDFS on top of ZFS	Meng Wang, Anjus George	Oak Ridge National Laboratory	Jiajun Mao	University of Chicago	US	Blog 1
2024	ReproNB: Reproducibility of Interactive Notebook Systems	Tanu Malik	DePaul University	Nicole Brewer	Arizona State University	US	Blog 1
2024	Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon	Raül Sirvent	Barcelona Supercomputing Center	Archit Dabral	Indian Institute of Technology (BHU)	India	Blog 1 \| Blog 2 \| Blog 3
2024	ScaleRep: Reproducing and benchmarking scalability bugs hiding in cloud systems	Bogdan Stoica, Yang Wang	University of Chicago	Shuang Liang	Ohio State University	US	Blog 1 \| Blog 2 \| Blog 3
2024	ScaleRep: Reproducing and benchmarking scalability bugs hiding in cloud systems	Bogdan Stoica, Yang Wang	University of Chicago	Zahra Nabila Maharani	University Dian Nuswantoro	Indonesia	Blog 1 \| Blog 2 \| Blog 3
2024	SciStream-Rep: An Artifact for Reproducible Benchmarks of Scientific Streaming Applications	Joaquin Chung, Flavio Castro	Argonne National Laboratory	Christopher Acheme	Clemson University	US	Blog 1 \| Blog 2 \| Blog 3
2024	SLICES/pos: Reproducible Experiment Workflows	Georg Carle, Sebastian Gallenmüller	Technical University of Munich	Kilian Warmuth	Technical University of Munich	Germany	Blog 1 \| Blog 2 \| Blog 3
2024	Static Python Perf: Measuring the Cost of Sound Gradual Types	Ben Greenman	University of Utah	Mrigank Pawagi	Indian Institute of Science	India	Blog 1 \| Blog 2 \| Blog 3
2024	Chameleon Trovi Redesign	Mark Powers	University of Chicago	Alicia Esquivel Morel	University of Missouri	US	Blog 1 \| Blog 2 \| Blog 3
2024	Reproducibility in Data Visualization	David Koop	Northern Illinois University	Triveni Gurram	Northern Illinois University	US	Blog 1 \| Blog 2 \| Blog 3
2024	Reproducibility in Data Visualization	David Koop	Northern Illinois University	Arya Sarkar	University of Engineering and Management, Kolkata	India	Blog 1 \| Blog 2 \| Blog 3
2023	Automatic Cluster Performance Shifts Detection Toolkit	Sandeep Madireddy, Ray Andrew Sinurat	Argonne National Laboratory	Kangrui Wang	University of Chicago	US	Blog 1 \| Blog 2
2023	Is Reproducibility Enough? Understanding the Impact of Missing Settings in Artifact Evaluation	Yang Wang, Miao Yu	Ohio State University	Xueyuan Ren	Ohio State University	US	Blog 1 \| Blog 2 \| Blog 3
2023	GPU Emulator for Easy Reproducibility of DNN Training	Vijay Chidambaram	University of Texas at Austin	Haoran Wu	University of Chicago	US	Blog 1 \| Blog 2 \| Blog 3
2023	Reproduce and benchmark self-adaptive edge applications under dynamic resource management	Junchen Jiang	University of Chicago	Faishal Zharfan	Bandung Institute of Technology	Indonesia	Blog 1 \| Blog 2
2023	Reproducible Evaluation of Multi-level Erasure Coding	John Bent, Anjus George	Oak Ridge National Laboratory	Zhiyan “Alex” Wang	University of Chicago	US	Blog 1 \| Blog 2
2023	FlashNet: Towards Reproducible Data Science for Storage System	Haryadi Gunawi	University of Chicago	Maharani Ayu Putri Irawan	Bandung Institute of Technology	Indonesia	Blog 1 \| Blog 2
2023	FlashNet: Towards Reproducible Data Science for Storage System	Haryadi Gunawi	University of Chicago	Eunsoo Justin Shin	University of Chicago	US	Blog 1 \| Blog 2
2023	Reproducible Analysis & Models for Predicting Genomics Workflow Execution Time	In Kee Kim	University of Georgia	Charis Christopher Hulu	Calvin Institute of Technology	Indonesia	Blog 1 \| Blog 2
2023	Reproducible Analysis & Models for Predicting Genomics Workflow Execution Time	In Kee Kim	University of Georgia	Shayantan Banerjee	Indian Institute of Technology Bombay	India	Blog 1 \| Blog 2
2023	Reproducible Analysis & Models for Predicting Genomics Workflow Execution Time	In Kee Kim	University of Georgia	Martin Putra	University of Chicago	US	Blog 1 \| Blog 2 \| Blog 3
2023	Using Reproducibility in Machine Learning Education	Fraida Fund	New York University	Shekhar	New York University	US	Blog 1 \| Blog 2
2023	Using Reproducibility in Machine Learning Education	Fraida Fund	New York University	Jonathan Edwin	Korea University of Science and Technology	South Korea	Blog 1 \| Blog 2
2023	Using Reproducibility in Machine Learning Education	Fraida Fund	New York University	Mohamed Saeed	Alexandria University	Egypt	Blog 1 \| Blog 2 \| Blog 3
2023	noWorkflow	João Felipe Pimentel, Juliana Freire	Northern Arizona University	Jesse Lima	Sao Paulo University	Brazil	Blog 1 \| Blog 2 \| Blog 3
2023	LabOP – an open specification for laboratory protocols that solves common interchange problems stemming from variations in scale, labware, instruments, and automation	Tim Fallon, Dan Bryce	UC San Diego	Luiza Zucchi Hesketh	University of San Diego	US	Blog 1 \| Blog 2
2023	Public Artifact Data and Visualization	Anjo Vahldiek-Oberwagner	Intel Labs	Jiayuan Zhu	Xi’an Jiaotong-Liverpool University	China	Blog 1 \| Blog 2 \| Blog 3
2023	Public Artifact Data and Visualization	Anjo Vahldiek-Oberwagner	Intel Labs	Krishna Madhwani	Indian Institute of Technology (BHU)	India	Blog 1 \| Blog 2 \| Blog 3
2023	ScaleBugs: Reproducible Scalability Bugs	Haryadi Gunawi, Hao-Nan Zhu, Cindy Rubio González	University of Chicago	Goodness Ayinmode	University of Ibadan	Nigeria	Blog 1 \| Blog 2
2023	ScaleBugs: Reproducible Scalability Bugs	Haryadi Gunawi, Hao-Nan Zhu, Cindy Rubio González	University of Chicago	Zahra Nabila Maharani	University Dian Nuswantoro	Indonesia	Blog 1 \| Blog 2 \| Blog 3
2023	Teaching Computer Networks with Reproducible Research	Fraida Fund	New York University	Srishti Jaiswal	Indian Institute of Technology (BHU)	India	Blog 1 \| Blog 2 \| Blog 3