DRBSD 2024
The 10th International Workshop on Data Analysis and Reduction for Big Scientific Data
 
Nov 18th, 2024
Atlanta, GA

DRBSD-10

In cooperation with IEEE Computer Society and ACM

Held in conjunction with SC24: The International Conference for High Performance Computing, Networking, Storage and Analysis

Program

Monday, 18 November 2024
Time: 9:00am - 12:30pm EST; Location: B304
Link to SC24 workshop page

9:00 - 9:05 Opening Remarks and Welcome
9:05 - 9:50 Invited Talk: Learning from Automatically Synthesized Compression Algorithms
Martin Burtscher, Texas State University
9:50 - 10:05 FRSZ2 for In-Register Block Compression Inside GMRES on GPUs
Thomas Grützmacher, Robert Underwood, Sheng Di, Franck Cappello, Hartwig Anzt
Best Paper Award
10:05 - 10:20 An Exploration of How Volume Rendering is Impacted by Lossy Data Reduction
Yanni Etchi, Daoce Wang, Pascal Grosset, Terece Turton, James Ahrens, David Rogers
Best Paper Runner-up
10:20 - 10:25 Break
10:25 - 10:40 SZOps: Scalar Operations for Error-bounded Lossy Compressor for Scientific Data
Tripti Agarwal, Sheng Di, Jiajun Huang, Yafan Huang, Ganesh Gopalakrishnan, Robert Underwood, Kai Zhao, Xin Liang, Guanpeng Li, Franck Cappello
10:40 - 10:55 Enabling Data Reduction for Flash-X Simulations
Rajeev Jain, Houjun Tang, Akash Dhruv, Suren Byna
10:55 - 11:10 BCSR on GPU: A Way Forward Extreme-scale Graph Processing on Accelerator-enabled Frontier Supercomputer
Naw Safrin Sattar, Hao Lu, Feiyi Wang
11:10 - 11:25 Filling the Void: Data-Driven Machine Learning-based Reconstruction of Sampled Spatiotemporal Scientific Simulation Data
Ayan Biswas, Aditi Mishra, Meghanto Majumder, Subhashis Hazarika, Alexander Most, Juan Castorena, Christopher Bryan, Patrick McCormick, James Ahrens, Earl Lawrence, Aric Hagberg
11:25 - 11:40 Enhancing Lossy Compression Through Cross-Field Information for Scientific Applications
Youyuan Liu, Wenqi Jia, Taolue Yang, Miao Yin, Sian Jin
11:40 - 11:55 Shifting Between Compute and Memory Bounds: A Compression-Enabled Roofline Model
Ramasoumya Naraparaju, Tianyu Zhao, Yanting Hu, Dongfang Zhao, Luanzheng Guo, Nathan Tallent
11:55 - 12:10 GPUFastqLZ: An Ultra Fast Compression Methodology for Fastq Sequence Data on GPUs
Taolue Yang, Youyuan Liu, Bo Jiang, Sian Jin
12:10 - 12:25 Accelerating Viz Pipelines Using Near-Data Computing: An Early Experience
Qing Zheng, Brian Atkinson, Daoce Wang, Jason Lee, John Patchett, Dominic Manno, Gary Grider
12:25 - 12:30 Closing Remarks

Topics

A growing disparity between simulation speeds and I/O rates makes it increasingly infeasible for high-performance applications to save all results for offline analysis. By 2024, computers are expected to compute at 10^18 ops/sec but write to disk only at 10^12 bytes/sec: a compute-to-output ratio 200 times worse than on the first petascale system. In this new world, applications must increasingly perform online data analysis and reduction, tasks that introduce algorithmic, implementation, and programming model challenges that are unfamiliar to many scientists and that have major implications for the design and use of various elements of exascale systems.
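The arithmetic behind this comparison can be sketched in a few lines. This is only an illustration using the two rates quoted above (10^18 operations per second and 10^12 bytes per second); the variable names are made up for the example, and the petascale figure is merely what the stated factor of 200 implies, not a measured value.

```python
# Figures quoted in the paragraph above (projected for 2024).
compute_rate = 10**18   # operations per second
write_rate = 10**12     # bytes written to disk per second

# Operations performed for every byte that can be written out.
ops_per_byte = compute_rate // write_rate

# The text states this ratio is 200 times worse than on the first
# petascale system; dividing by 200 gives the ratio that statement
# implies for the earlier system (an inference, not a measurement).
worsening_factor = 200
implied_petascale_ops_per_byte = ops_per_byte // worsening_factor

print(ops_per_byte)                    # 1000000
print(implied_petascale_ops_per_byte)  # 5000
```

In other words, roughly a million operations are performed for each byte that can be saved, which is why reducing data before it reaches storage becomes necessary.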

This trend has spurred interest in high-performance online data analysis and reduction methods, motivated by a desire to conserve I/O bandwidth, storage, and/or power; increase accuracy of data analysis results; and/or make optimal use of parallel platforms, among other factors. This requires our community to understand the clear yet complex relationships between application design, data analysis and reduction methods, programming models, system software, hardware, and other elements of a next-generation High Performance Computer, particularly given constraints such as applicability, fidelity, performance portability, and power efficiency.

There are at least three important questions that our community is striving to answer: (1) whether several orders of magnitude of data reduction are possible for exascale sciences; (2) what the trade-offs are between the performance and accuracy of data reduction; and (3) how to reduce data effectively while preserving the information hidden in large scientific data. Tackling these challenges requires expertise from computer science, mathematics, and application domains to study the problem holistically and to develop solutions and hardened software tools that can be used by production applications.

The goal of this workshop is to provide a focused venue for researchers in all aspects of data reduction and analysis to present their research results, exchange ideas, identify new research directions, and foster new collaborations within the community.

Topics of interest include but are not limited to:

• Data reduction methods for scientific data

  ° Data deduplication methods

  ° Motif-specific methods (structured and unstructured meshes, particles, tensors, ...)

  ° Methods with accuracy guarantees

  ° Feature/QoI-preserving reduction

  ° Optimal design of data reduction methods

  ° Compressed sensing and singular value decomposition

• Metrics to measure reduction quality and provide feedback

• Data analysis and visualization techniques that take advantage of the reduced data

  ° AI/ML methods

  ° Surrogate/reduced-order models

  ° Feature extraction

  ° Visualization techniques

  ° Artifact removal during reconstruction

  ° Methods that take advantage of the reduced data

• Data analysis and reduction co-design

  ° Methods for using accelerators

  ° Accuracy and performance trade-offs on current and emerging hardware

  ° New programming models for managing reduced data

  ° Runtime systems for data reduction

• Large-scale code coupling and workflows

• Experience of applying data reduction and analysis in practical applications or use-cases

  ° State of the practice

  ° Application use-cases which can drive the community to develop MiniApps

Submission

Important Dates

Full Paper submission deadline: August 19, 2024 (AoE), extended from August 16, 2024

Author notification: September 6, 2024

Camera-ready final papers submission deadline: September 27, 2024 (AoE)

AD/AE submission deadline: September 27, 2024 (AoE), revised from October 15, 2024

Remote presentation videos (optional) submission deadline: September 27, 2024 (AoE), revised from October 15, 2024

Submissions

• Papers should be submitted electronically via the SC submission website:

https://submissions.supercomputing.org

• Paper submissions should use the IEEE format; review is single-blind.

http://www.ieee.org/conferences_events/conferences/publishing/templates.html

• DRBSD-10 will accept full papers (10 pages including references/appendix) and short papers (6 pages excluding references/appendix).

• Submitted papers will be evaluated by at least 3 reviewers on the basis of technical merit.

• DRBSD-10 encourages authors to provide an artifact description and evaluation with their submissions. Details of the SC24 Reproducibility Initiative: https://sc24.supercomputing.org/program/papers/reproducibility-initiative.

• DRBSD-10 will select papers for the Best Paper Award and the Best Paper Runner-up Award. All accepted papers will be included in the SC workshop proceedings.

Committee Members

Organizing Committee

Dingwen Tao, Indiana University (Chair)

Sheng Di, Argonne National Laboratory

Ana Gainaru, Oak Ridge National Laboratory

Sian Jin, Temple University, USA

Kento Sato, RIKEN, Japan

Program Chair

Xin Liang, University of Kentucky

Steering Committee

Ian Foster, Argonne National Laboratory/University of Chicago

Scott Klasky, Oak Ridge National Laboratory

Qing Liu, New Jersey Institute of Technology

Todd Munson, Argonne National Laboratory

Technical Program Committee

Martin Burtscher, Texas State University

Michael Bussmann, Helmholtz-Zentrum Dresden-Rossendorf

Jon Calhoun, Clemson University

Franck Cappello, Argonne National Laboratory

Jieyang Chen, University of Alabama at Birmingham

Jong Youl Choi, Oak Ridge National Laboratory

Qian Gong, Oak Ridge National Laboratory

Xubin He, Temple University

Dan Huang, Sun Yat-sen University, China

Sian Jin, Temple University

Samuel Li, National Center for Atmospheric Research

Peter Lindstrom, Lawrence Livermore National Laboratory

Jinyang Liu, University of Houston

Qing Liu, New Jersey Institute of Technology

Tao Lu, DapuStor Corporation, China

Viktor Reshniak, Oak Ridge National Laboratory

Kento Sato, RIKEN, Japan

Robert Underwood, Argonne National Laboratory

Xiaodong Yu, Stevens Institute of Technology

Chengming Zhang, University of Houston

Kai Zhao, Florida State University
