University of Washington Climate Analysis (UWCA)
Advisor: Professor Fukuda
Student: Jason Woodring
A Multi-Agent Distributed Approach to Analyzing Large Climate Data Sets

World Climate
- Is changing
- Rising temperatures
- It's our fault
- We need to simulate the future changes

The Impact
- Local sea level rise
- Decreased snowpack
- Glaciers in decline
- Stream flow peaks earlier in the year
- Winter floods
- Summer droughts
- Ocean acidification

What's being done about it?
- Simulated world climate data is produced using many different models

- Climate scientists analyze this data in different ways to try to predict extreme weather events

Climate Analysis
- Large climate science facilities may have dedicated computing resources to analyze these climate models.
- Small to medium-sized labs are often resource constrained and create custom tools.

Climate Science Programming
- Climate scientists typically use domain-specific scripting languages (CDO / NCL).
- They use these languages to analyze climate models produced by different scientific agencies.
- These environments can be difficult for the average person to install and work with.

Time of Emergence (ToE)
- The point in time when average conditions consistently exceed some threshold; beyond it, extreme weather events are more likely to happen.
- An essential calculation for predicting climate behavior.

UWCA Web-Based Application

- Performs Time of Emergence calculations
- Parameterizes input
- Utilizes the UW-320 Linux lab computing cluster
- Uses MASS Agents that travel across MASS Places / computing nodes to analyze NetCDF data

Workflow Based
1. Select ToE variable to calculate
2. Select climate model(s) to use
3. Select parameters for ToE calculations
4. Then wait, and view job status
5. Download files when done

MASS Library
- Designed for this type of simulation.
- Abstracts difficult parallel / distributed programming.
- Brings the programmer closer to the data they are working with, rather than the boilerplate code that is usually necessary to achieve these types of simulations.

Input Data
- NetCDF software [4]
- A set of libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
- Free software available
- All sorts of software packages and APIs for working with NetCDF from different environments.

Provenance (to come from)
- Records the input climate model files used
- Tracks all calculation steps and the files produced between them
- Tracks the parameters used in calculations

Implementation
- Used NetBeans and GlassFish
- Enterprise Web Application Maven Archetype project
- Utilizes MASS Places and Agents

Architecture Hardware View

Architecture SAVN Notation

Class Diagram

Infrastructure Setup
- Deployed on Juno
- Port 8080 opened
- Large shared NFS storage for climate models
- NetBeans project folder set up on the file share

GUI
- Application header
- Job creator
- Job status viewer

Algorithms Architecture
- Uses the Template Pattern: provides a skeleton for an algorithm while deferring some steps to subclasses.
- Follows the Hollywood Principle: "Don't call us, we'll call you."

ToE Calculation
Five-step calculation:
1. Find days over threshold
2. Find historical tolerances
3. Find climatology
4. Least squares regression
5. Find ToE
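The five-step skeleton and the Template Pattern described above might be sketched as follows. This is a minimal illustration only: the class and method names (`ToeCalculationTemplate`, `DemoToeCalculation`, and the step methods) are hypothetical and are not taken from the UWCA code base, and each step simply returns a label instead of doing real work.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical skeleton; names are illustrative, not UWCA's actual classes. */
abstract class ToeCalculationTemplate {

    /** Template method: fixes the order of the five steps (Hollywood Principle). */
    public final List<String> run() {
        List<String> completed = new ArrayList<>();
        completed.add(findDaysOverThreshold());
        completed.add(findHistoricalTolerances());
        completed.add(findClimatology());
        completed.add(leastSquaresRegression());
        completed.add(findToe());
        return completed;
    }

    // Subclasses supply the steps; the base class decides when they are called.
    protected abstract String findDaysOverThreshold();
    protected abstract String findHistoricalTolerances();
    protected abstract String findClimatology();
    protected abstract String leastSquaresRegression();
    protected abstract String findToe();
}

/** Toy subclass: each step only reports its name. */
class DemoToeCalculation extends ToeCalculationTemplate {
    @Override protected String findDaysOverThreshold() { return "days over threshold"; }
    @Override protected String findHistoricalTolerances() { return "historical tolerances"; }
    @Override protected String findClimatology() { return "climatology"; }
    @Override protected String leastSquaresRegression() { return "least squares regression"; }
    @Override protected String findToe() { return "ToE"; }
}
```

Calling `new DemoToeCalculation().run()` executes the steps in the fixed order; a different ToE variable would only need a different subclass.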

Step 1: Find Days over Threshold
- Use the Temp Threshold parameter to find days over threshold.
- Every 365 daily temperature values collapse to a single integer, the days-over-threshold count, stored at a year index.

Step 1: Code
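The original code slide is not reproduced here. As a rough sketch of the idea in Step 1, assuming a single grid cell whose daily temperatures are stored as a flat array of 365-day years, and assuming "over" means strictly greater than the threshold, the collapse to yearly counts could look like this. The class name `DaysOverThreshold` is hypothetical, and the sketch uses plain arrays instead of MASS Places and Agents.

```java
/** Hypothetical helper; a plain-array sketch, not the MASS-based implementation. */
final class DaysOverThreshold {
    static final int DAYS_PER_YEAR = 365;

    /**
     * Collapses consecutive daily temperatures for one grid cell into one
     * days-over-threshold count per full 365-day year.
     */
    static int[] countPerYear(double[] dailyTemps, double threshold) {
        int years = dailyTemps.length / DAYS_PER_YEAR; // ignore any partial final year
        int[] counts = new int[years];
        for (int year = 0; year < years; year++) {
            int start = year * DAYS_PER_YEAR;
            for (int day = 0; day < DAYS_PER_YEAR; day++) {
                // Assumption: a day counts only when strictly above the threshold.
                if (dailyTemps[start + day] > threshold) {
                    counts[year]++;
                }
            }
        }
        return counts;
    }
}
```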

Step 2: Find Historical Tolerances
- Agents spawn at year 1950 and travel to year 1999, gathering days-over-threshold values.
- A user parameter specifies the min / max %, and the tolerance values are calculated.

Step 2: Code

Step 3: Find Climatology
- Agents move to year 1980 and travel along the z axis until 2010, gathering days-over-threshold values.
- The climatology is the average of the values collected.

Step 3: Code
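The code slides for Steps 2 and 3 are not reproduced here. The sketch below shows one way the two quantities could be computed for a single grid cell. It rests on an assumption the slides do not confirm: that the min / max % parameters are percentiles of the historical days-over-threshold counts (computed here with the nearest-rank method). The climatology is the plain average, as the slides state. The class name `HistoricalBaseline` is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper for one grid cell; percentile interpretation is an assumption. */
final class HistoricalBaseline {

    /**
     * Nearest-rank percentile of the yearly counts (for example 1950 to 1999).
     * percent must be in (0, 100].
     */
    static int percentile(int[] yearlyCounts, double percent) {
        if (yearlyCounts.length == 0 || percent <= 0.0 || percent > 100.0) {
            throw new IllegalArgumentException("need data and 0 < percent <= 100");
        }
        int[] sorted = yearlyCounts.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percent * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Climatology: the average of the yearly counts gathered (for example 1980 to 2010). */
    static double climatology(int[] yearlyCounts) {
        if (yearlyCounts.length == 0) {
            throw new IllegalArgumentException("need at least one year");
        }
        long sum = 0;
        for (int count : yearlyCounts) {
            sum += count;
        }
        return (double) sum / yearlyCounts.length;
    }
}
```

With this reading, the min tolerance would be something like `percentile(counts, 5)` and the max tolerance `percentile(counts, 95)`; the actual percentages are user parameters.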

Step 4: Least Squares Regression
- All agents move to year 2006 and travel down to year 2099, gathering days-over-threshold values.
- For each (x, y) column of values, slopes and confidence intervals are calculated.

Step 4: Code

Step 4 results in 3 artifacts
- Three 2-dimensional arrays
- These array values represent year 2001

Step 5: Find ToE
- The arrays from step 4 get expanded by 200 years

Step 5: Code
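The code slides for Steps 4 and 5 are not reproduced here. As a hedged sketch for a single grid cell, the regression can be illustrated with an ordinary least-squares line fit, and the ToE search with a scan of the extended trend line for the first year outside the min / max tolerances. The class name `TrendToe`, the `-1` "not found" return value, and the choice to scan only the central trend line (leaving out the confidence-interval arrays) are assumptions made for illustration.

```java
/** Hypothetical single-cell sketch; confidence intervals are deliberately omitted. */
final class TrendToe {

    /** Ordinary least-squares fit of y = intercept + slope * x; returns {slope, intercept}. */
    static double[] fitLine(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired points");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        return new double[] { slope, meanY - slope * meanX };
    }

    /**
     * Scans startYear .. startYear + horizonYears and returns the first year whose
     * trend value is above max or below min, or -1 if the trend never leaves the band.
     */
    static int findToe(double slope, double intercept, int startYear,
                       int horizonYears, double min, double max) {
        for (int year = startYear; year <= startYear + horizonYears; year++) {
            double value = intercept + slope * year;
            if (value > max || value < min) {
                return year;
            }
        }
        return -1;
    }
}
```

For example, a fit over 2006 to 2099 would be followed by `findToe(slope, intercept, 2001, 200, min, max)`, using the tolerances from Step 2.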

ToE Found!
- Start at year 2001 and go down each latitude / longitude column to find the year in which the value exceeds the min or max values.

Final ToE Output
- Three 2-dimensional array NetCDF files, with each cell containing a year value representing the ToE

Performance
- Best timing: 1 node, 8 threads with the current algorithm

Nodes      1 thread   4 threads   8 threads   16 threads   Script
1 node     6:01       5:22        5:00        5:04         21:00
8 nodes    18:26      11:33       12:14       11:27
17 nodes   10:47      11:04       11:39       11:45

Algorithm Challenges
- Reading in the 22 GB of data is time consuming
- Significant heap considerations
- Had to experiment with many different methods of reading in the data
- The current method reads in all data at the master node and distributes it over the network to the slave nodes
- Debugging large data sets
- Debugging remote nodes

Provenance File
- Displays all times for the steps of the job
- Which input files were used
- Which output files were produced

- Which steps were executed for the job
- Which parameters were used for the calculations

Usability Analysis
Questionnaire:
- Learning how to use the tool was quite simple.
- Adaptability of the tool was rated low, as it requires software developer skill sets.

- The amount of provenance and detail was rated highly.
- The climate researcher appreciated the workflow-logging qualities of the provenance file, which provides information on what steps, files, and parameters were used.
- Because it is a web tool, various interested parties, including funding agencies, can access it to perform climate science analysis.

Data Visualization
- The web application features file downloading.
- Users can use their software of choice for visualization.

- Some free options are:
  - Panoply Viewer
  - ncBrowse

Next Steps
- Experiment with the Places reading algorithm
- Add test cases for main classes
- Add new climate models
- More issues are recorded in Redmine for ongoing work

References
[1] Fukuda, M., Stiber, M., Salathe, E., & Kim, W. (2013). CDS&E: Small: Multi-Agent-Based Parallelization of Scientific Data Analysis and Simulation.
[2] Michalakes, J., Dudhia, J., Henderson, T., Klemp, J., Skamarock, W., & Wang, W. (n.d.). The Weather Research and Forecast Model: Software Architecture and Performance.
[3] Fukuda, M. (2010). MASS: Parallel-Computing Library for Multi-Agent Spatial Simulation.

[4] Yasutake, B., Simonson, N., Asuncion, H., Fukuda, M., & Salathe, E. (2014). Supporting Provenance in Climate Research.
[5] Zender, C., & Mangalam, H. (2007). Scaling Properties of Common Statistical Operators for Gridded Datasets. International Journal of High Performance Computing Applications, 21.
[6] Dalton, M., Mote, P., & Snover, A. (2013). Climate Change in the Northwest: Implications for Our Landscapes, Waters, and Communities. Island Press.
[7] Salathe, E., Hamlet, A., & Mass, C. (2014). Estimates of 21st Century Flood Risks in the Pacific Northwest.
[8] Climate Change Impacts and Adaptation in Washington State. (2013). Climate Impacts Group, University of Washington.
[9] Climate Change. (2014, January 1). Retrieved July 11, 2014, from

[10] NetCDF FAQ. (n.d.). Retrieved July 12, 2014, from
[11] Software for Manipulating or Displaying NetCDF Data. (n.d.). Retrieved July 13, 2014, from
[12] Muir, L., Brown, J., Risbey, J., & Wijffels, S. (n.d.). Determining the time of emergence of the climate change signal at regional scales. The Center for Australian Weather and Climate Research, Hobart, Australia.
[13] Hawkins, E., & Sutton, R. (2012). Time of emergence of climate signals. American Geophysical Union.
[14] Keller, K., Joos, F., & Raible, C. (2014). Time of emergence of trends in the ocean biogeochemistry.
[15] Time of Emergence of Climate Change Signals in the Puget Sound Basin Quality Assurance Project Plan. (2013). Climate Impacts Group, University of Washington.
[16] Mahlstein, I., Knutti, R., Solomon, S., & Portmann, R. (2011). Early onset of significant local warming in low latitude countries.
[17] Maraun, D. (2013). When will trends in European mean and heavy daily precipitation emerge?
[18] Ho, C., Hawkins, E., & Shaffrey, L. (2012). Statistical decadal predictions for sea surface temperatures: A benchmark for dynamical GCM predictions.
[19] Capalbo, S., Eigenbrode, S., Glick, P., Littell, J., Raymondi, R., & Reeder, S. (2014). NORTHWEST. In Climate Change Impacts in the United States.
