Using MATLAB, IDL or Python - CSA Guide
USING MATLAB, IDL OR PYTHON
Here are some demonstrations of accessing data (synchronous/direct downloads) using MATLAB, IDL and Python.
An example of a script for performing an Asynchronous request using Python can be found at the bottom of this page.
FORMAT AND STRUCTURE OF DOWNLOAD
The package downloaded by a synchronous data request is a .tgz file. Once gunzipped and untarred, it yields a folder called CSA_Download_yyyyMMdd_hhmm, named after the date and time of the retrieval. Inside this folder is another folder named after the requested dataset, containing the requested data file(s):
CSA_Download_yyyyMMdd_hhmm/<DATASET_ID>/<DATASET_ID>__<START_DATE>_<END_DATE>_<VERSION_NO>.<DELIVERY_FORMAT>
For example:
CSA_Download_20180521_1134/C1_CP_WHI_ELECTRON_DENSITY/C1_CP_WHI_ELECTRON_DENSITY__20111110_180000_20111110_210000_V120703.cef
By default, if this dataset comes with additional information, such as active caveats, this will be included in the package. In our example here, this time interval includes a caveat about calibration, so two folders will be in the package.
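The naming convention above can be taken apart programmatically. The following is a minimal Python sketch, not part of the CSA tools: the helper name parse_csa_file_name is hypothetical, and it assumes that dates are always written as yyyyMMdd_hhmmss, as in the example file name.

```python
import re
from datetime import datetime

# Pattern for <DATASET_ID>__<START_DATE>_<END_DATE>_<VERSION_NO>.<DELIVERY_FORMAT>,
# assuming dates are written as yyyyMMdd_hhmmss as in the example above.
FILE_PATTERN = re.compile(
    r"^(?P<dataset_id>.+?)__"
    r"(?P<start>\d{8}_\d{6})_"
    r"(?P<end>\d{8}_\d{6})_"
    r"(?P<version>[^.]+)\."
    r"(?P<fmt>.+)$"
)


def parse_csa_file_name(file_name):
    """Split a CSA data file name into its components (hypothetical helper)."""
    match = FILE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"Unexpected file name: {file_name}")
    parts = match.groupdict()
    # Convert the two timestamps into datetime objects
    parts["start"] = datetime.strptime(parts["start"], "%Y%m%d_%H%M%S")
    parts["end"] = datetime.strptime(parts["end"], "%Y%m%d_%H%M%S")
    return parts


info = parse_csa_file_name(
    "C1_CP_WHI_ELECTRON_DENSITY__20111110_180000_20111110_210000_V120703.cef"
)
print(info["dataset_id"], info["start"], info["end"], info["version"], info["fmt"])
```

Only the file name is parsed here; the enclosing folder names are ignored.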
MATLAB
Which command to use depends on your version of MATLAB. All three options below work with R2014b or later, but for those versions we recommend websave. You can check your MATLAB version by typing ver at the MATLAB prompt.
Note that since R2013a, the https request is essentially constructed in the same way as detailed in the How-to page but the parameter-value pairs are given as pairs of strings, e.g., 'parameter', 'value'.
MATLAB R2014B OR LATER (I.E., LATEST)
WEBSAVE - DATA REQUEST
With MATLAB R2014b, websave was introduced and recommended over urlwrite.
URL = 'https://csa.esac.esa.int/csa-sl-tap/data';
fileName=tempname;
gzFileName = [fileName '.gz'];
options = weboptions('RequestMethod', 'get', 'Timeout', Inf);
tgzFileName = websave(gzFileName, URL, ...
'RETRIEVAL_TYPE', 'product', ...
'DATASET_ID', 'C1_CP_FGM_SPIN', ...
'START_DATE', '2003-03-03T00:00:00Z', ...
'END_DATE', '2003-03-05T22:10:00Z', ...
'DELIVERY_FORMAT', 'CDF', ...
'DELIVERY_INTERVAL', 'ALL', ...
options);
gunzip(gzFileName);
fileNames=untar(fileName);
for iFile = 1:numel(fileNames), disp(fileNames{iFile}); end
WEBSAVE - METADATA REQUEST
URL = 'https://csa.esac.esa.int/csa-sl-tap/tap/sync';
options = weboptions('RequestMethod', 'get', 'Timeout', Inf);
fn = 'mymetadata.csv';
% call websave directly; wrapping the call in eval is not needed
csvFileName = websave(fn, URL, ...
'REQUEST', 'doQuery', ...
'LANG', 'ADQL', ...
'FORMAT', 'CSV', ...
'QUERY', '(SELECT dataset_id,experiments FROM v_dataset)', ...
options);
MATLAB R2013A TO R2014A
URLWRITE
For MATLAB releases before R2014b (when websave was introduced), use urlwrite, as in the example below.
URL = 'https://csa.esac.esa.int/csa-sl-tap/data';
fileName=tempname;
gzFileName = [fileName '.gz'];
[gzFileName,st]=urlwrite(URL, gzFileName, 'Get', ...
{'RETRIEVAL_TYPE', 'product', ...
'DATASET_ID', 'C1_CP_WHI_ELECTRON_DENSITY', ...
'START_DATE', '2011-11-10T18:00:00Z', ...
'END_DATE', '2011-11-10T21:00:00Z'});
gunzip(gzFileName);
fileNames=untar(fileName);
for iFile = 1:numel(fileNames), disp(fileNames{iFile}); end
MATLAB R2012B OR EARLIER
URLWRITE
Note that this example uses the complete https request as constructed in the How-to page, and not the 'parameter', 'value' pairs in the above examples.
URL_CSA = 'https://csa.esac.esa.int/csa-sl-tap/data?RETRIEVAL_TYPE=product&DATASET_ID=C2_CP_RAP_PAD_L3DD&START_DATE=2010-01-01T00:00:00Z&END_DATE=2010-01-01T23:59:59Z';
fileName = tempname;
gzFileName = [fileName '.gz'];
[gzFileName,st] = urlwrite(URL_CSA,gzFileName);
gunzip(gzFileName);
fileNames = untar(fileName);
for iFile = 1:numel(fileNames), disp(fileNames{iFile}), end
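The complete request string used above is the same set of parameter-value pairs as in the earlier examples, URL-encoded and appended to the endpoint. As a rough illustration, the Python sketch below rebuilds that string from the pairs; build_request_url is a hypothetical helper name, and only the standard library is used.

```python
from urllib.parse import urlencode

BASE_URL = "https://csa.esac.esa.int/csa-sl-tap/data"


def build_request_url(params):
    """Append URL-encoded parameter-value pairs to the data endpoint."""
    # safe=":" keeps the colons in the timestamps unescaped, as in the URL above
    return BASE_URL + "?" + urlencode(params, safe=":")


request_url = build_request_url({
    "RETRIEVAL_TYPE": "product",
    "DATASET_ID": "C2_CP_RAP_PAD_L3DD",
    "START_DATE": "2010-01-01T00:00:00Z",
    "END_DATE": "2010-01-01T23:59:59Z",
})
print(request_url)
```

The printed string is identical to URL_CSA in the MATLAB example above.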
IDL
A complete IDL routine that uses TAP to download data directly from IDL is csa_tap_product.pro, developed by Andrew Walsh. It works on Linux, Mac or Windows. An issue with IDL means that the downloaded files cannot be untarred directly; the directory structure must be built first. The basic function csa_untar.pro scans the package, creates the required directories, and then untars the downloaded package.
WARNING: currently (July 2021), IDL's FILE_GUNZIP has a problem with the extension .tgz and if used, will cause the file to expand forever (or until it fills your hard drive, whichever happens first...). This has been reported to Harris, the owners of IDL. Either change the extension to .tar.gz, or use FILE_UNTAR instead, since the compression is automatically detected; note that the directory structure needs to be in place first (see csa_untar.pro, given above).
If you use IDL versions before 8.4, there may be a problem with certificates - see below.
To download data using IDL, the IDLnetURL object allows IDL to act as a client to an HTTP or FTP server. Since authentication is not necessary for direct data download, the following is a minimal script for using IDL (a more complete procedure is csa_tap_product.pro, given above). The downloaded package will be saved to your home directory - to change this, put a path before the first occurrence of 'csa_buffer.dat'. Note that the string required for the query is a string matching that used for a simple HTTP request and that this script does not uncompress or untar the download (see above for function that can be added):
function csa_product_short
  csa_query = 'RETRIEVAL_TYPE=product&DATASET_ID=C1_CP_FGM_SPIN&START_DATE=2003-03-03T12:00:00Z&END_DATE=2003-03-03T14:00:00Z&DELIVERY_FORMAT=CDF&DELIVERY_INTERVAL=hourly'
  ;Create IDLnetURL object and set properties
  csa_product_obj = obj_new('IDLnetUrl')
  csa_product_obj->SetProperty, VERBOSE=1
  csa_product_obj->SetProperty, url_scheme = 'https'
  csa_product_obj->SetProperty, url_host = 'csa.esac.esa.int'
  csa_product_obj->SetProperty, url_path = 'csa-sl-tap/data'
  csa_product_obj->SetProperty, url_query = csa_query
  ;send request to CSA AIO system, saving response in csa_buffer.dat
  csa_product_response = csa_product_obj->get(filename='csa_buffer.dat')
  ;extract the header to get the correct filename
  csa_product_obj->getproperty, response_header=csa_product_header
  ;check a .tar.gz file was downloaded
  csa_filestart = strpos(csa_product_header,'filename=')
  ;if so, rename buffer to correct filename and return correct filename
  if csa_filestart ne -1 then begin
    csa_fileend = strpos(csa_product_header,'gz"')
    csa_filename = strmid(csa_product_header, csa_filestart+10, csa_fileend-csa_filestart-8)
    csa_dir_end = strpos(csa_product_response,'csa_buffer.dat')
    csa_working_dir = strmid(csa_product_response,0,csa_dir_end)
    file_move, csa_product_response, csa_working_dir+csa_filename
    print, 'Downloaded data to '+csa_working_dir+csa_filename
    outfile = csa_working_dir+csa_filename
    return, outfile
  ;otherwise return 0
  endif else begin
    print, 'Something went wrong.'
    return, 0
  endelse
end
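The IDL function above recovers the package name by searching the response header for filename=. For readers more familiar with Python, the same step might look like the sketch below. The helper name filename_from_header and the sample header are made up for illustration; the sketch assumes the file name appears in double quotes after filename=, which is what the IDL string offsets imply.

```python
import re


def filename_from_header(header_text):
    """Return the file name announced in a response header, or None."""
    # Assumption: the name is enclosed in double quotes after 'filename='
    match = re.search(r'filename="([^"]+)"', header_text)
    return match.group(1) if match else None


# Made-up header for illustration only; a real response header may differ.
sample_header = (
    "HTTP/1.1 200 OK\r\n"
    'Content-Disposition: attachment; filename="CSA_Download_20180521_1134.tgz"\r\n'
)
print(filename_from_header(sample_header))
```

With the requests library, the corresponding header text would normally be read from response.headers.get("Content-Disposition") and checked for None before parsing.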
IDL Versions before 8.4: Important note
If your IDL version is older than 8.4, these programs may fail with the following error message:
% IDLNETURL::GET: CCurlException: Error: Http Get Request Failed. Error = SSL certificate problem: self signed certificate in certificate chain, Curl Error Code = 60..
% Execution halted at: CSA_TAP_PRODUCT
To quickly solve this issue:
- in the product script (csa_tap_product.pro), please add csa_product_obj->SetProperty, ssl_verify_peer = 0
Alternatively, please see the HTTPS SPDF Web Services section for IDL.
PYTHON
This section of code, in Python 3, will allow you to do the same as the previous scripts: download a selection of data to the local directory and uncompress the package.
from requests import get  # to make GET request
import tarfile

def download(url, params, file_name):
    # open in binary mode
    with open(file_name, "wb") as file:
        # get request
        response = get(url, params=params)
        # write to file
        file.write(response.content)

myurl = 'https://csa.esac.esa.int/csa-sl-tap/data'
query_specs = {'RETRIEVAL_TYPE': 'product',
               'DATASET_ID': 'C1_CP_FGM_SPIN',
               'START_DATE': '2003-03-03T12:00:00Z',
               'END_DATE': '2003-03-04T12:00:00Z',
               'DELIVERY_FORMAT': 'CEF',
               'DELIVERY_INTERVAL': 'hourly'}
download(myurl, query_specs, 'tap_download.tgz')

with tarfile.open("tap_download.tgz") as tar:
    tarname = tar.getnames()
    tar.extractall()
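The script above writes whatever the server returns to tap_download.tgz and then opens it as an archive. If a request fails, the saved file may not be an archive at all; this is an assumption about possible failure modes, not something stated in this guide. A cautious sketch is to test the file first. The helper name archive_members is hypothetical, and the demonstration uses throwaway local files rather than a real download.

```python
import os
import tarfile
import tempfile


def archive_members(file_name):
    """Return the member names of a tar archive, or None if it is not one."""
    if not tarfile.is_tarfile(file_name):
        return None
    with tarfile.open(file_name) as tar:
        return tar.getnames()


# Self-contained demonstration with throwaway files (no network access)
work_dir = tempfile.mkdtemp()

# 1. A small gzip-compressed tar archive standing in for a real download
data_path = os.path.join(work_dir, "example.cef")
with open(data_path, "w") as handle:
    handle.write("placeholder\n")
good_path = os.path.join(work_dir, "good.tgz")
with tarfile.open(good_path, "w:gz") as tar:
    tar.add(data_path, arcname="CSA_Download_demo/example.cef")

# 2. A text file standing in for a non-archive response saved under a .tgz name
bad_path = os.path.join(work_dir, "bad.tgz")
with open(bad_path, "w") as handle:
    handle.write("An error message instead of data\n")

print(archive_members(good_path))
print(archive_members(bad_path))
```

In a real script, archive_members('tap_download.tgz') returning None would be the cue to inspect the file contents instead of extracting it.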
Asynchronous data requests from the command line can be tricky, since the cookie only lasts a limited time. This script, written by Alain Barthe, performs such a request: TapAsyncRequest.py
Metadata requests are very similar:
from requests import get  # to make GET request

myurl = 'https://csa.esac.esa.int/csa-sl-tap/tap/sync'
query_specs = {
    'REQUEST': 'doQuery',
    'LANG': 'ADQL',
    'FORMAT': 'CSV',
    'QUERY': 'SELECT dataset_id,experiments FROM v_dataset'}
filename = 'metatest.csv'

def download(url, params, file_name):
    # open in binary mode
    with open(file_name, "wb") as file:
        # get request
        response = get(url, params=params)
        # write to file
        file.write(response.content)
    return response.status_code

print(download(myurl, query_specs, filename))
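Once the metadata has been saved as CSV, it can be read with Python's standard csv module. The sketch below is illustrative only: read_metadata is a hypothetical helper, the sample rows are made up, and it assumes the first line of the response is a header row with the column names used in the query (dataset_id, experiments).

```python
import csv
import io


def read_metadata(csv_text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Made-up rows for illustration; a real response will look different
sample_csv = (
    "dataset_id,experiments\n"
    "C1_CP_FGM_SPIN,FGM\n"
    "C1_CP_WHI_ELECTRON_DENSITY,WHI\n"
)
rows = read_metadata(sample_csv)
for row in rows:
    print(row["dataset_id"], row["experiments"])
```

To apply this to the downloaded file, read its text first, for example with open('metatest.csv', newline='') as f: rows = read_metadata(f.read()).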