SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

What Can Real Information Content Tell Us about Compressing Climate Model Data?


Workshop: The 8th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-8) in Conjunction with SC22

Authors: Hayden Sather and Alexander Pinard (Colorado School of Mines), Allison Baker (National Center for Atmospheric Research (NCAR)), and Dorit Hammerling (Colorado School of Mines)


Abstract: The massive data volumes produced by climate simulation models create an urgent need for data reduction. Lossy compression can significantly reduce storage requirements; however, as the amount of compression increases, the scientific integrity of the data decreases. One metric for gauging compression quality is the percentage of the real information in the original data that is preserved in the compressed data. We compute the bitwise real information content for several climate variables from the Community Earth System Model Large Ensemble, provided by the National Center for Atmospheric Research. We then investigate how much compression two popular compression algorithms designed for floating-point data can apply to each of these variables while preserving 99% of the real information content. Finally, we demonstrate how the real information content can be used in a straightforward manner to determine compressor settings for our data.
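The abstract does not say how the bitwise real information content is estimated, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes 32-bit floating-point data and treats the information in each bit position as the mutual information between that bit in adjacent array elements, which is one common way to formulate bitwise information. The function names `bitwise_information` and `bits_to_keep` are hypothetical, and the sketch omits any statistical significance filtering of small information values.

```python
import numpy as np


def bitwise_information(a):
    """Estimate per-bit information (in bits) for float32 data.

    Assumption: information in bit position p is the mutual information
    between bit p of each element and bit p of its neighbour in the
    flattened array. Position 0 is the sign bit, 1-8 the exponent,
    9-31 the mantissa.
    """
    a = np.ascontiguousarray(a, dtype=np.float32).ravel()
    bits = a.view(np.uint32)           # reinterpret the raw IEEE 754 bits
    x, y = bits[:-1], bits[1:]         # adjacent pairs
    info = np.zeros(32)
    for pos in range(32):
        shift = np.uint32(31 - pos)
        bx = (x >> shift) & np.uint32(1)
        by = (y >> shift) & np.uint32(1)
        # 2x2 joint probability table: rows index bx, columns index by
        counts = np.bincount((bx * 2 + by).astype(np.int64), minlength=4)
        joint = counts.reshape(2, 2) / bx.size
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        mask = joint > 0               # skip empty cells to avoid log2(0)
        mi = np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask]))
        info[pos] = max(mi, 0.0)       # clip tiny negative rounding error
    return info


def bits_to_keep(info, fraction=0.99):
    """Smallest number of leading bits whose cumulative information
    reaches `fraction` of the total (0 if there is no information)."""
    cum = np.cumsum(info)
    total = cum[-1]
    if total <= 0:
        return 0
    return int(np.searchsorted(cum, fraction * total) + 1)


# Illustrative input only: a smooth synthetic signal, not climate model output.
field = np.sin(np.linspace(0.0, 20.0, 50_000)).astype(np.float32)
info = bitwise_information(field)
keep = bits_to_keep(info, fraction=0.99)
print(f"leading bits needed for 99% of the information: {keep} of 32")
```

Because a float32 value has one sign bit and eight exponent bits, a result of `keep` leading bits would correspond to roughly `keep - 9` mantissa bits. How such a count translates into a setting for a particular compressor is not described in the abstract and is left open here.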




