Commit efe93580 authored by Julia Wagemann

Dust workshop part 1 and 2

parent 3e2fb4ee
%% Cell type:markdown id: tags:
 
<img src='../img/dust_banner.png' alt='Training school and workshop on dust' align='center' width='100%'></img>
 
<br>
 
%% Cell type:markdown id: tags:
 
# AERONET
 
%% Cell type:markdown id: tags:
 
### About
 
%% Cell type:markdown id: tags:
 
The [AERONET (AErosol RObotic NETwork)](https://aeronet.gsfc.nasa.gov/new_web/index.html) project is a federation of ground-based remote sensing aerosol networks established by NASA and LOA-PHOTONS (CNRS) and is greatly expanded by collaborators from national agencies, institutes, universities, individual scientists, and partners. The program provides a long-term (more than 25 years), continuous and readily accessible public domain database of aerosol optical, microphysical and radiative properties for aerosol research and characterization, validation of satellite retrievals, and synergism with other databases. The network imposes standardization of instruments, calibration, processing and distribution.
 
The AERONET collaboration provides globally distributed observations of spectral aerosol optical depth (AOD), inversion products, and precipitable water in diverse aerosol regimes. Aerosol optical depth data are computed for three data quality levels: Level 1.0 (unscreened), Level 1.5 (cloud-screened), and Level 2.0 (cloud-screened and quality-assured). Inversions, precipitable water, and other AOD-dependent products are derived from these levels and may implement additional quality checks.
 
You can see an overview of all available AERONET Site Names [here](https://aeronet.gsfc.nasa.gov/cgi-bin/draw_map_display_aod_v3?long1=-180&long2=180&lat1=-90&lat2=90&multiplier=2&what_map=4&nachal=1&formatter=0&level=3&place_code=10&place_limit=0).
 
%% Cell type:markdown id: tags:
 
### Basic Facts
 
%% Cell type:markdown id: tags:
 
> **Spatial coverage**: `Observation stations worldwide` <br>
> **Temporal resolution**: `sub-daily and daily` <br>
> **Temporal coverage**: `since 1993` <br>
> **Data format**: `txt` <br>
> **Versions**: `Level 1.0 (unscreened)`, `Level 1.5 (cloud-screened)`, `Level 2.0 (cloud-screened and quality-assured)`
 
%% Cell type:markdown id: tags:
 
### How to access the data
 
%% Cell type:markdown id: tags:
 
AERONET offers a web service which allows you to request and save AERONET data via `wget`, a command to download files from the internet. The AERONET web service endpoint is available under https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3 and detailed documentation of how to construct requests can be found [here](https://aeronet.gsfc.nasa.gov/print_web_data_help_v3_new.html).
 
The first part of this notebook ([1 - Download AERONET data for a specific station and time period](#aeronet_download)) shows you how to request AERONET data with a `wget` command.
 
%% Cell type:markdown id: tags:
 
### Module outline
* [1 - Download AERONET data for a specific station and time period](#aeronet_download)
* [2 - Read the observation data with pandas](#read_aeronet)
* [3 - Visualize all points of AERONET Aerosol Optical Depth in Palma de Mallorca for February 2021](#visualize_all_points)
* [4 - Load and visualize daily Aerosol Optical Depth aggregates together with the Angstrom Exponent](#daily_angstrom)
 
%% Cell type:markdown id: tags:
 
<hr>
 
%% Cell type:markdown id: tags:
 
#### Load libraries
 
%% Cell type:code id: tags:
 
``` python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import wget
```
 
%% Cell type:markdown id: tags:
 
<hr>
 
%% Cell type:markdown id: tags:
 
## <a id='aeronet_download'></a>1. Download AERONET data for a specific station and time period
 
%% Cell type:markdown id: tags:
 
AERONET offers a web service which allows you to request and save AERONET data via `wget`, a command to download files from the internet. The AERONET web service endpoint is available under https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3 and detailed documentation of how to construct requests can be found [here](https://aeronet.gsfc.nasa.gov/print_web_data_help_v3_new.html).
 
An example request from the website looks like this:
 
`wget --no-check-certificate -q -O test.out "https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3?site=Cart_Site&year=2000&month=6&day=1&year2=2000&month2=6&day2=14&AOD15=1&AVG=10"`
 
In this request, you use `wget` to download data from the web service. You tailor your request with a set of keywords, which you append to the service endpoint after a `?` and separate with `&`. However, constructing such requests manually can be cumbersome. For this reason, we will show you how to generate AERONET requests dynamically and send them to the web service with Python.
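
To see how the keywords are attached to the endpoint, the example request can be split into its parts with the Python standard library. This is only an illustrative sketch; the variable names `example`, `endpoint_only` and `keywords` are chosen here and do not appear in the workshop material.

```python
from urllib.parse import urlsplit, parse_qsl

# The example request from the AERONET website (without the wget options)
example = (
    "https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3"
    "?site=Cart_Site&year=2000&month=6&day=1"
    "&year2=2000&month2=6&day2=14&AOD15=1&AVG=10"
)

parts = urlsplit(example)

# Everything before the '?' is the service endpoint
endpoint_only = f"{parts.scheme}://{parts.netloc}{parts.path}"

# Everything after the '?' is a list of keyword=value pairs joined by '&'
keywords = dict(parse_qsl(parts.query))

print(endpoint_only)
for key, value in keywords.items():
    print(f"{key} = {value}")
```

Each printed `key = value` line corresponds to one keyword of the request, such as `site`, `AOD15` or `AVG`.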
 
%% Cell type:markdown id: tags:
 
As a first step, let us create a Python dictionary in which we store all the parameters we would like to use for the request as key-value pairs. You can create a dictionary with curly brackets `{}`. Below, we specify the following parameters:
* `endpoint`: endpoint of the AERONET web service
* `station`: name of the AERONET station
* `year`: start year of the time period of interest
* `month`: start month of the time period of interest
* `day`: start day of the time period of interest
* `year2`: end year of the time period of interest
* `month2`: end month of the time period of interest
* `day2`: end day of the time period of interest
* `AOD15`: data type (here Aerosol Optical Depth Level 1.5); other options include `AOD10`, `AOD20`, etc.
* `AVG`: data format, `AVG=10` - all points, `AVG=20` - daily averages
 
%% Cell type:markdown id: tags:
 
Please note that there are additional parameters that can be set, e.g. `hour`. The keywords below are those we need for requesting all data points of Aerosol Optical Depth Level 1.5 data for the station Palma de Mallorca for February 2021.
 
%% Cell type:code id: tags:
 
``` python
data_dict = {
    'endpoint': 'https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3',
    'station': 'Palma_de_Mallorca',
    'year': 2021,
    'month': 2,
    'day': 1,
    'year2': 2021,
    'month2': 2,
    'day2': 28,
    'AOD15': 1,
    'AVG': 10
}
```
 
%% Cell type:markdown id: tags:
 
<br>
 
%% Cell type:markdown id: tags:
 
In a next step, we construct the final string for the wget request with the `format()` function. You construct a string that contains the dictionary keys in curly brackets as placeholders. At the end of the string, you pass the dictionary to the `format()` function, unpacked with `**`. Printing the resulting URL shows that `format()` replaced each placeholder in curly brackets with the corresponding value from the dictionary.
 
%% Cell type:code id: tags:
 
``` python
url = '{endpoint}?site={station}&year={year}&month={month}&day={day}&year2={year2}&month2={month2}&day2={day2}&AOD15={AOD15}&AVG={AVG}'.format(**data_dict)
url
```
 
%%%% Output: execute_result
 
'https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3?site=Palma_de_Mallorca&year=2021&month=2&day=1&year2=2021&month2=2&day2=28&AOD15=1&AVG=10'
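
%% Cell type:markdown id: tags:

As an alternative to a hand-written format string, the query part of the URL can also be assembled with `urlencode` from the Python standard library. The sketch below is not part of the original workshop code; the names `params` and `url_alt` are chosen for illustration, and the dictionary keys are assumed to be the web service keywords themselves (`site` instead of `station`).

```python
from urllib.parse import urlencode

endpoint = 'https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3'

# Keys are the keywords of the web service, values are the request settings
params = {
    'site': 'Palma_de_Mallorca',
    'year': 2021, 'month': 2, 'day': 1,
    'year2': 2021, 'month2': 2, 'day2': 28,
    'AOD15': 1,
    'AVG': 10,
}

# urlencode joins the key=value pairs with '&' and escapes special characters
url_alt = endpoint + '?' + urlencode(params)
print(url_alt)
```

With this approach, adding a further keyword only requires a new dictionary entry and no change to a format string.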
 
%% Cell type:markdown id: tags:
 
<br>
 
%% Cell type:markdown id: tags:
 
Now we are ready to request the data with the function `download()` from the wget Python library. You pass the function the URL constructed above together with the file path where the downloaded data shall be stored. Let us store the data as a `txt` file in the folder `../data/2_observations/aeronet/`.
 
%% Cell type:code id: tags:
 
``` python
wget.download(url, '../data/2_observations/aeronet/202102_palma_aod15_10.txt')
```
 
%%%% Output: execute_result
 
'../data/2_observations/aeronet/202102_palma_aod15_10 (1).txt'
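
%% Cell type:markdown id: tags:

The ` (1)` suffix in the output above suggests that the target file already existed when the cell was run. A minimal sketch of a helper that skips the download in that case is shown below. The name `download_once` and its `fetch` parameter are hypothetical and not part of the workshop material. The sketch uses `urlretrieve` from the Python standard library in place of `wget.download` and creates the output folder if it is missing.

```python
import os
from urllib.request import urlretrieve


def download_once(url, target, fetch=urlretrieve):
    """Download `url` to `target` unless `target` already exists.

    `fetch` is any function that accepts (url, target); it defaults to
    urllib's urlretrieve. Hypothetical helper for illustration only.
    """
    if os.path.exists(target):
        # File is already on disk: do not request it a second time
        return target

    # Create the output folder if it does not exist yet
    folder = os.path.dirname(target)
    if folder:
        os.makedirs(folder, exist_ok=True)

    fetch(url, target)
    return target
```

A call such as `download_once(url, '../data/2_observations/aeronet/202102_palma_aod15_10.txt')` would then return the same file path on every run and would not create renamed copies.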
 
%% Cell type:markdown id: tags:
 
<br>
 
%% Cell type:markdown id: tags:
 
Let us repeat the data request, this time for the daily averages of Aerosol Optical Depth for February 2021 for the station Palma de Mallorca. The parameter you have to change is `AVG`. By setting it to `20`, you indicate that you are interested in daily averages. You can make the required changes in the dictionary above and re-run the data request. Make sure to also change the name of the output file, e.g. to `202102_palma_aod15_20.txt`.
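
The changes described here could look like the sketch below. The names `data_dict_daily`, `url_daily` and `outfile_daily` are chosen for illustration; the download call is left commented out so that the cell does not contact the web service unless you want it to.

```python
# Same request as before, but with AVG set to 20 (daily averages)
data_dict_daily = {
    'endpoint': 'https://aeronet.gsfc.nasa.gov/cgi-bin/print_web_data_v3',
    'station': 'Palma_de_Mallorca',
    'year': 2021, 'month': 2, 'day': 1,
    'year2': 2021, 'month2': 2, 'day2': 28,
    'AOD15': 1,
    'AVG': 20,
}

url_daily = (
    '{endpoint}?site={station}&year={year}&month={month}&day={day}'
    '&year2={year2}&month2={month2}&day2={day2}'
    '&AOD15={AOD15}&AVG={AVG}'
).format(**data_dict_daily)

# Output file name reflects the changed AVG setting
outfile_daily = '../data/2_observations/aeronet/202102_palma_aod15_20.txt'

print(url_daily)
# import wget
# wget.download(url_daily, outfile_daily)  # uncomment to send the request
```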
 
%% Cell type:markdown id: tags:
 
<br>
 
%% Cell type:markdown id: tags:
 
## <a id='read_aeronet'></a>2. Read the observation data with pandas
 
%% Cell type:markdown id: tags:
 
The next step is to read the downloaded txt file. Let us start with the file of all station measurements for the station Palma de Mallorca in February 2021. The file has the name `202102_palma_aod15_10.txt`. You can read `txt` files with the function `read_table` from the Python library `pandas`. If you inspect the txt file beforehand (you can simply open it in a text editor), you see that the first few lines contain information we do not need in the pandas dataframe. For this reason, we set additional keyword arguments that allow us to specify the columns and rows of interest:
* `delimiter`: specify the delimiter in the text file, e.g. comma
* `header`: specify the index of the row that shall be set as header
* `index_col`: specify the index of the column that shall be set as index
 
You see below that the resulting dataframe has 258 rows and 113 columns.
 
%% Cell type:code id: tags:
 
``` python
df = pd.read_table('../data/2_observations/aeronet/202102_palma_aod15_10.txt', delimiter=',', header=[7], index_col=1)
df
```
 
%%%% Output: execute_result
 
AERONET_Site Time(hh:mm:ss) Day_of_Year \
Date(dd:mm:yyyy)
01:02:2021 Palma_de_Mallorca 07:51:31 32.0
01:02:2021 Palma_de_Mallorca 07:55:22 32.0
01:02:2021 Palma_de_Mallorca 08:00:00 32.0
01:02:2021 Palma_de_Mallorca 08:05:26 32.0
01:02:2021 Palma_de_Mallorca 08:12:02 32.0
... ... ... ...
21:02:2021 Palma_de_Mallorca 08:17:06 52.0
21:02:2021 Palma_de_Mallorca 08:42:16 52.0
21:02:2021 Palma_de_Mallorca 14:34:25 52.0
21:02:2021 Palma_de_Mallorca 14:49:25 52.0
NaN </body></html> NaN NaN
Day_of_Year(Fraction) AOD_1640nm AOD_1020nm AOD_870nm \
Date(dd:mm:yyyy)
01:02:2021 32.327442 -999.0 0.047363 0.051125
01:02:2021 32.330116 -999.0 0.049941 0.053621
01:02:2021 32.333333 -999.0 0.054279 0.058240
01:02:2021 32.337106 -999.0 0.048314 0.052426
01:02:2021 32.341690 -999.0 0.046866 0.050713
... ... ... ... ...
21:02:2021 52.345208 -999.0 0.656322 0.681146
21:02:2021 52.362685 -999.0 0.626930 0.648986
21:02:2021 52.607234 -999.0 2.027537 2.092611
21:02:2021 52.617650 -999.0 2.105442 2.169734
NaN NaN NaN NaN NaN
AOD_865nm AOD_779nm AOD_675nm ... \
Date(dd:mm:yyyy) ...
01:02:2021 -999.0 -999.0 0.054040 ...
01:02:2021 -999.0 -999.0 0.056214 ...
01:02:2021 -999.0 -999.0 0.061025 ...
01:02:2021 -999.0 -999.0 0.055008 ...
01:02:2021 -999.0 -999.0 0.053161 ...
... ... ... ... ...
21:02:2021 -999.0 -999.0 0.700608 ...
21:02:2021 -999.0 -999.0 0.666602 ...
21:02:2021 -999.0 -999.0 2.142720 ...
21:02:2021 -999.0 -999.0 2.221110 ...
NaN NaN NaN NaN ...
Exact_Wavelengths_of_AOD(um)_380nm \
Date(dd:mm:yyyy)
01:02:2021 0.3806
01:02:2021 0.3806
01:02:2021 0.3806
01:02:2021 0.3806
01:02:2021 0.3806
... ...
21:02:2021 0.3806
21:02:2021 0.3806
21:02:2021 0.3806
21:02:2021 0.3806
NaN NaN
Exact_Wavelengths_of_AOD(um)_340nm \
Date(dd:mm:yyyy)
01:02:2021 0.3407
01:02:2021 0.3407
01:02:2021 0.3407
01:02:2021 0.3407
01:02:2021 0.3407
... ...
21:02:2021 0.3409
21:02:2021 0.3409
21:02:2021 0.3409
21:02:2021 0.3409
NaN NaN
Exact_Wavelengths_of_PW(um)_935nm \
Date(dd:mm:yyyy)
01:02:2021 0.9350
01:02:2021 0.9350
01:02:2021 0.9350
01:02:2021 0.9350
01:02:2021 0.9350
... ...
21:02:2021 0.9349
21:02:2021 0.9349
21:02:2021 0.9349
21:02:2021 0.9349
NaN NaN
Exact_Wavelengths_of_AOD(um)_681nm \
Date(dd:mm:yyyy)
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
... ...
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
NaN NaN
Exact_Wavelengths_of_AOD(um)_709nm \
Date(dd:mm:yyyy)
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
... ...
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty \
Date(dd:mm:yyyy)
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
... ...
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty.1 \
Date(dd:mm:yyyy)
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
... ...
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty.2 \
Date(dd:mm:yyyy)
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
... ...
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty.3 \
Date(dd:mm:yyyy)
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
01:02:2021 -999.0
... ...
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
21:02:2021 -999.0
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty<br>
Date(dd:mm:yyyy)
01:02:2021 -999.<br>
01:02:2021 -999.<br>
01:02:2021 -999.<br>
01:02:2021 -999.<br>
01:02:2021 -999.<br>
... ...
21:02:2021 -999.<br>
21:02:2021 -999.<br>
21:02:2021 -999.<br>
21:02:2021 -999.<br>
NaN NaN
[258 rows x 113 columns]
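
%% Cell type:markdown id: tags:

The last row of the output above holds the closing `</body></html>` line of the response rather than a measurement, and the date and time are still plain text. A minimal sketch of one way to handle both is given below. It runs on a small made-up sample that only mimics the layout of the file (a few preamble lines, a header row, data rows, a closing HTML line); the sample text and the names `sample` and `flat` are assumptions for illustration, not the real file content.

```python
import io
import pandas as pd

# Made-up sample with the same layout idea as the downloaded file
sample = """<html><body>
preamble line 1
preamble line 2
AERONET_Site,Date(dd:mm:yyyy),Time(hh:mm:ss),AOD_870nm
Palma_de_Mallorca,01:02:2021,07:51:31,0.051125
Palma_de_Mallorca,21:02:2021,14:49:25,2.169734
</body></html>
"""

# The header sits on the fourth line (index 3), the date is the second column
df_small = pd.read_table(io.StringIO(sample), delimiter=',', header=3, index_col=1)

# Move the date back into a regular column and drop rows without a date,
# which removes the closing HTML line
flat = df_small.reset_index()
flat = flat.dropna(subset=['Date(dd:mm:yyyy)'])

# Combine date and time into one timestamp and use it as the index
stamp = flat['Date(dd:mm:yyyy)'] + ' ' + flat['Time(hh:mm:ss)']
flat['time'] = pd.to_datetime(stamp, format='%d:%m:%Y %H:%M:%S')
flat = flat.set_index('time')

print(flat)
```

For the real file, the same steps would apply to `df` with `header=[7]` as used above; a datetime index is convenient for later time series plots.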
 
%% Cell type:markdown id: tags:
 
<br>
 
%% Cell type:markdown id: tags:
 
In the dataframe above, you see that missing data entries are filled with -999.0. Let us replace those with `NaN` (not a number). You can use the function `replace()` to do so. The resulting dataframe now has `NaN` for all numeric entries with missing data. Only the last column, whose values carry a trailing `<br>` tag and are therefore read as text, keeps its `-999.<br>` entries. This replacement facilitates the plotting of the data.
 
%% Cell type:code id: tags:
 
``` python
df = df.replace(-999.0, np.nan)
df
```
 
%%%% Output: execute_result
 
AERONET_Site Time(hh:mm:ss) Day_of_Year \
Date(dd:mm:yyyy)
01:02:2021 Palma_de_Mallorca 07:51:31 32.0
01:02:2021 Palma_de_Mallorca 07:55:22 32.0
01:02:2021 Palma_de_Mallorca 08:00:00 32.0
01:02:2021 Palma_de_Mallorca 08:05:26 32.0
01:02:2021 Palma_de_Mallorca 08:12:02 32.0
... ... ... ...
21:02:2021 Palma_de_Mallorca 08:17:06 52.0
21:02:2021 Palma_de_Mallorca 08:42:16 52.0
21:02:2021 Palma_de_Mallorca 14:34:25 52.0
21:02:2021 Palma_de_Mallorca 14:49:25 52.0
NaN </body></html> NaN NaN
Day_of_Year(Fraction) AOD_1640nm AOD_1020nm AOD_870nm \
Date(dd:mm:yyyy)
01:02:2021 32.327442 NaN 0.047363 0.051125
01:02:2021 32.330116 NaN 0.049941 0.053621
01:02:2021 32.333333 NaN 0.054279 0.058240
01:02:2021 32.337106 NaN 0.048314 0.052426
01:02:2021 32.341690 NaN 0.046866 0.050713
... ... ... ... ...
21:02:2021 52.345208 NaN 0.656322 0.681146
21:02:2021 52.362685 NaN 0.626930 0.648986
21:02:2021 52.607234 NaN 2.027537 2.092611
21:02:2021 52.617650 NaN 2.105442 2.169734
NaN NaN NaN NaN NaN
AOD_865nm AOD_779nm AOD_675nm ... \
Date(dd:mm:yyyy) ...
01:02:2021 NaN NaN 0.054040 ...
01:02:2021 NaN NaN 0.056214 ...
01:02:2021 NaN NaN 0.061025 ...
01:02:2021 NaN NaN 0.055008 ...
01:02:2021 NaN NaN 0.053161 ...
... ... ... ... ...
21:02:2021 NaN NaN 0.700608 ...
21:02:2021 NaN NaN 0.666602 ...
21:02:2021 NaN NaN 2.142720 ...
21:02:2021 NaN NaN 2.221110 ...
NaN NaN NaN NaN ...
Exact_Wavelengths_of_AOD(um)_380nm \
Date(dd:mm:yyyy)
01:02:2021 0.3806
01:02:2021 0.3806
01:02:2021 0.3806
01:02:2021 0.3806
01:02:2021 0.3806
... ...
21:02:2021 0.3806
21:02:2021 0.3806
21:02:2021 0.3806
21:02:2021 0.3806
NaN NaN
Exact_Wavelengths_of_AOD(um)_340nm \
Date(dd:mm:yyyy)
01:02:2021 0.3407
01:02:2021 0.3407
01:02:2021 0.3407
01:02:2021 0.3407
01:02:2021 0.3407
... ...
21:02:2021 0.3409
21:02:2021 0.3409
21:02:2021 0.3409
21:02:2021 0.3409
NaN NaN
Exact_Wavelengths_of_PW(um)_935nm \
Date(dd:mm:yyyy)
01:02:2021 0.9350
01:02:2021 0.9350
01:02:2021 0.9350
01:02:2021 0.9350
01:02:2021 0.9350
... ...
21:02:2021 0.9349
21:02:2021 0.9349
21:02:2021 0.9349
21:02:2021 0.9349
NaN NaN
Exact_Wavelengths_of_AOD(um)_681nm \
Date(dd:mm:yyyy)
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
... ...
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
NaN NaN
Exact_Wavelengths_of_AOD(um)_709nm \
Date(dd:mm:yyyy)
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
... ...
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty \
Date(dd:mm:yyyy)
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
... ...
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty.1 \
Date(dd:mm:yyyy)
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
... ...
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty.2 \
Date(dd:mm:yyyy)
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
... ...
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty.3 \
Date(dd:mm:yyyy)
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
01:02:2021 NaN
... ...
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
21:02:2021 NaN
NaN NaN
Exact_Wavelengths_of_AOD(um)_Empty<br>
Date(dd:mm:yyyy)
01:02:2021 -999.<br>
01:02:2021 -999.<br>
01:02:2021 -999.<br>
01:02:2021 -999.<br>
01:02:2021 -999.<br>
... ...
21:02:2021 -999.<br>
21:02:2021 -999.<br>
21:02:2021 -999.<br>
21:02:2021 -999.<br>
NaN NaN
[258 rows x 113 columns]
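
%% Cell type:markdown id: tags:

The effect of `replace()` is easier to follow on a small example. The sketch below uses a made-up dataframe (the values and the column name `last_col` are assumptions for illustration) to show that numeric `-999.0` entries become `NaN`, that text entries such as `-999.<br>` are left untouched, and that columns without any valid measurement can then be dropped with `dropna()`.

```python
import numpy as np
import pandas as pd

# Made-up example: one column with data, one fully missing, one text column
demo = pd.DataFrame({
    'AOD_1020nm': [0.047363, -999.0, 0.054279],
    'AOD_865nm': [-999.0, -999.0, -999.0],
    'last_col': ['-999.<br>', '-999.<br>', '-999.<br>'],
})

# Replace the numeric fill value with NaN
cleaned = demo.replace(-999.0, np.nan)

# Count the missing entries per column
missing = cleaned.isna().sum()
print(missing)

# Drop columns that contain no valid measurement at all
cleaned = cleaned.dropna(axis=1, how='all')
print(cleaned.columns.tolist())
```

Whether dropping empty columns is desirable depends on the analysis; for plotting a single wavelength it is not required.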
 
%% Cell type:markdown id: tags:
 
<br>