Seminars and Colloquia

Sensitivity of GEMnet 1.0 (CNN based simulator for GEM) to hyperparameters and architecture

Astrophysics Seminar

Name: Sensitivity of GEMnet 1.0 (CNN based simulator for GEM) to hyperparameters and architecture
Start: 2024-12-16T15:30:00+05:30
End: 2024-12-16T16:30:00+05:30
Location: Auditorium

by Vikram Khade (Environment and Climate Change Canada (ECCC), Montreal, Canada)

Monday Dec 16, 2024, 3:30 PM → 4:30 PM Asia/Kolkata


Description

Abstract

An introduction to Deep Learning is presented, including backpropagation and the hyperparameters of a network. A particular type of Deep Learning network, the Convolutional Neural Network (CNN), which is popular in image processing and computer vision, is explained. A simulator (GEMnet) for the 39 km resolution GEM model for a 6-hour forecast is presented. GEMnet uses the CNN technique, the basics of which are introduced and explained. Ensemble members 1-10 from the GEPS archives, covering January to July of 2017-2023, are used for training and testing. The analysis and 6-hour trials of the 1.5 meter temperature are extracted for the four synoptic hours of every day, and a dataset is prepared for training/validation (2017-2022) and testing (2023). Various CNN architectures are explored. Depending on the architecture, the training MSE loss converges after about 100 epochs. The sensitivity of these results to hyperparameters such as learning rate, type of optimizer, activation, batch size and kernel size is presented. When the maximum number of filters in the CNN is increased from 256 to 512, the test RMSE decreases from 1.6 to 1.45. However, the total number of tunable parameters increases from 775K to 3.3 million, which increases the training time from 5.6 minutes to 6.6 minutes per epoch. The execution time statistics for training and testing on a single GPU are compared between different experiments. Some possibilities for further work are presented. One of the most important is the question of the metric used in the loss function. Currently, GEMnet uses MSE, and it is well known that simulators trained with MSE tend to produce smoother forecasts. Secondly, persistence, though useful, is a low bar for assessing the quality of the forecast. Future directions and the rationale for developing GEM are discussed.
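
The roughly fourfold growth in parameter count when the maximum number of filters doubles follows from how convolution layers are sized: the weight count of a layer scales with the product of its input and output channel counts. The sketch below illustrates this arithmetic only. The abstract does not give the GEMnet layer layout, so the channel widths, the 3x3 kernel and the helper names `conv2d_params` and `stack_params` are assumptions made for illustration, and the totals are not expected to match the 775K and 3.3 million figures quoted above.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable parameters in one 2-D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def stack_params(channels, kernel_size=3):
    """Total parameters of consecutive conv layers with the given channel widths."""
    return sum(
        conv2d_params(c_in, c_out, kernel_size)
        for c_in, c_out in zip(channels, channels[1:])
    )


# Hypothetical channel widths (NOT the actual GEMnet architecture):
# one input field (e.g. temperature), then widening convolution layers.
narrow = [1, 64, 128, 256]   # maximum of 256 filters
wide = [1, 128, 256, 512]    # every width doubled, maximum of 512 filters

ratio = stack_params(wide) / stack_params(narrow)
print(stack_params(narrow), stack_params(wide), round(ratio, 2))
```

Doubling every width multiplies each layer's weight count by about four, which is in line with the ratio of the two parameter counts reported in the abstract (3.3 million / 775K, about 4.3).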

Contact

colloqm@iiap.res.in