Commit aeb21f59 authored by Daniel Lee

Merge tag '2.8.1-rc2'

parents 54234eaf f7c3e671
......@@ -5,6 +5,8 @@ conda/msg-gdal-driver/src/PublicDecompWT.zip
_build
~$*
__pycache__/
epct-webui/cypress/videos
epct-webui/cypress/screenshots
# Dask
**/dask-worker-space
......
repos:
- repo: https://github.com/psf/black
rev: stable
hooks:
- id: black
language_version: python3 # Should be a command that runs python3.6+
\ No newline at end of file
......@@ -4,17 +4,85 @@ All notable changes to this project are documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [2.8.1]
## Unreleased
### Added
- Added a configurable parameter for the EUMETSAT Data Store URL netloc (#700)
### Fixed
- Prevented re-projection of geostationary products to geostationary projections (#688)
- Updated outdated deployment instructions for the DTWS (#683)
- Logging of quota overflows, despite logs contributing to the user quota (#642)
- Report quota path fix (#636)
## [2.8.0]
### Added
- CI/CD: installation of the EUMETSAT package eugene on Linux pipelines so that the related EPS validation tests are not skipped (#667)
- Added customisation time info to the log for ELK reports (#650)
- Basket check functional test (#634)
- Added epct_webui code coverage (#622)
- Added the possibility to specify the path to the general configuration folder via an environment variable (#606)
- DTWS: use an explicit whitelist with administrator usernames (#593)
- Automatic build and test of conda constructor package on Linux (#590, #591)
- Added nginx caching of JS, CSS to DTWS (#571)
- Integration with GEMS monitoring (#547)
- Enforced a configurable customisation time-out (#545)
- Allowing users to housekeep their space (#544)
- Installation without internet on Linux (#540)
- DTWS: enforced a configurable timeout on long-running jobs (#465)
- DTWS: implemented the ability to monitor and control working nodes (#431)
### Changed
- Optimized execution of long tests in CI pipelines (#665)
- Making the dask dashboard of the DTWS accessible and informative (#659)
- Reformatted code to standard (#654)
- Restructured DTWS fair queuing logic and handling of exceptions (#632, #633)
- Renamed the "output_dir" variable to "root_path" and removed the reference to the "test" deployment in epct-restapi/__init__.py (#607, #621)
- Webapp GUI changes for admin user to manage customisations (#600)
- Move scipy requirement from epct/setup.py to epct_plugin_gis/setup.py (#586)
- OAS* and OR1* products functional tests are now included in validation procedures and testing (#535)
- Unit tests coverage for epct-plugin-gis expanded up to 60% (#401, #403, #404, #405, #406, #557)
### Fixed
- Active user detection (#655)
- ELK reports only showing a selection of info and detailed list of customisation reporting id (#646, #648)
- Deleting a process did not work when a generic username was used in the desktop deployment style (#641)
- Reliable killing of processes in the DTWS (#639)
- Restructure logic of api.report_quota for administrator user (#638)
### Removed
- Removed filter feature from UMARF msgclmk backends definition (#601)
## [2.7.0]
### Added
- Automatic build of DT conda constructor package with CI pipelines on Linux (#590, #591)
- New manual job in CI allows creating pdfs for all sphinx documents (#574)
- First automated tests for GUI validation (#567)
- API and REST API report on usage of user quota (#565)
- DTWS: user quota can be activated and configured (#564)
- Added a configurable limit to the max shapefile size allowed to upload (#542)
- Monitoring concurrent users on system (#536)
- Monitoring total customisation time per user (#528)
- Setup a fair-shared queuing (#428)
### Changed
- Improved install instructions for ELK stack and reports (#538)
- Automated additional manual tests PROC_TP_01_03,04,05 and EPCT_ERR_TP_01,02 (#532)
- User quota configuration reviewed and improved (#589)
- EDT: inhibit new customisations if a user has already used their entire quota (#580)
- Updated DTWS user info URL (#570)
- Disabled input product selection when in "epcs" deployment (#552)
- Input product type(s) automatically discovered using Data Store REST API (#546)
- Allowed user configuration of compression wrappers (#541)
- Improvements to ELK stack and reports (#538)
- Automatized multiple manual tests (#532)
### Fixed
- Removed advertising of unavailable features for ASCATL1SZF product (#583)
- Configuration of user info in DTWS deployment was lacking (#570)
### Removed
- DT configuration in Docker container now stored in conda environment as well (#599)
## [2.6.0]
......@@ -27,7 +95,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Improved the display of the basket content (#551)
- Clickable EUMETSAT logo, redirecting to EUMETSAT webpage (#550)
- GUI: showing username after the login (#548)
- Updated API.TC.01.01 to allow for more time (#529)
### Fixed
- Allow any feature relying on log parsing to fail gracefully if the format is unexpected (#477)
......
CONTRIBUTING TO THE EPCT DEVELOPMENT
====================================
# CONTRIBUTING TO THE DATA TAILOR DEVELOPMENT
The document describes the branching and release workflow used for the EPCT.
The following is a set of guidelines for contributing to the Data Tailor.
Branching and release workflow
------------------------------
## Branching and release workflow
The branching and release workflow is very close to GitFlow.
The following main branches always exist:
......@@ -36,4 +34,28 @@ To release a version:
Repository maintainer
* [ ] tag the release
\ No newline at end of file
* [ ] tag the release
## Coding style
### Python
#### Black
The Data Tailor Python code should conform to [Black](https://pypi.org/project/black) code style as
described [here](https://github.com/psf/black/blob/master/docs/the_black_code_style.md) with the
customisation described in the [pyproject.toml](pyproject.toml) file.
To avoid code-style degradation it is recommended to use [pre-commit](https://pre-commit.com/)
(the configuration is provided in the [.pre-commit-config.yaml](.pre-commit-config.yaml) file).
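For a local setup, the hooks can typically be enabled as follows (a minimal sketch, assuming `pre-commit` is installed with `pip`):

```bash
# Install pre-commit and register the hooks defined in .pre-commit-config.yaml
pip install pre-commit
pre-commit install

# Optionally, run all hooks once against the whole code base
pre-commit run --all-files
```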
#### Additional formatting guidelines
In addition to the standards outlined above, the following guidelines are encouraged:
- f-string syntax instead of %-formatting
### Javascript
See the dedicated [CONTRIBUTING.md](epct-webui/CONTRIBUTING.md) file.
......@@ -2,13 +2,18 @@ Install the EUMETSAT Data Tailor
--------------------------------
This document describes how to build and install the EUMETSAT Data Tailor core components and
the customisation plugins from the source code, for the following Operating Systems:
the customisation plugins from the source code on a machine with an internet connection.
The following Operating Systems are supported:
- Ubuntu Linux 18.04 64bit
- CentOS Linux 7 64bit
- Red Hat Enterprise Linux 7 64bit
- Windows 10 Pro 64bit.
Installing the EUMETSAT Data Tailor on a host which has no internet connection is also possible; the procedure is described
in the `Installing EUMETSAT Data Tailor without an internet connection`_ section and is currently available on
Linux machines only.
Hardware pre-requisites
~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -78,7 +83,7 @@ On Linux, it can be activated by default by adding it to `.bashrc`::
echo conda activate epct-2.5 >> $HOME/.bashrc
Once the environment is active, one can use `epct; e.g. to retrieve the version::
Once the environment is active, one can use `epct`; e.g. to retrieve the version::
epct --version
......@@ -104,3 +109,28 @@ change the `conda install` line in the `Installing EUMETSAT Data Tailor`
section and remove unneeded packages. The `epct` (API and CLI) interface
and at least one customisation plugin (typically `epct_plugin_gis`) are
required.
Installing EUMETSAT Data Tailor without an internet connection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is possible to install the EUMETSAT Data Tailor on a Linux machine
without a connection to the internet.
The following installer files need to be available; they can be obtained
from EUMETSAT:
* the installer proper; this is a bash executable named `data-tailor-<version-identifier>.sh`
* a Python wheel package for the `falcon_multipart` package.
Create a `/tmp/conda-channel` folder and copy the falcon-multipart Python wheel file in it::
mkdir /tmp/conda-channel
cp </path/to/unzipped_folder>/falcon_multipart-*.whl /tmp/conda-channel
Then run the installer and follow the instructions::
bash </path/to/folder>/data-tailor-*.sh
......@@ -8,7 +8,7 @@ for a set of EUMETSAT products in native formats.
Some technical aspects related to the renaming are still to be fixed.
Supported platforms and installation
Supported platforms and installation
------------------------------------
The EUMETSAT Data Tailor can be installed on:
......@@ -59,3 +59,61 @@ How to get User Support
Please contact the EUMETSAT User Service Helpdesk using the E-mail: ops@eumetsat.int
Known limitations
-----------------
This section describes the known limitations when working with the EUMETSAT Data Tailor.
General limitations
~~~~~~~~~~~~~~~~~~~~
.. list-table::
:header-rows: 0
:widths: 40 80
* - projection and ROI extraction
- Extraction of a Region of Interest (ROI) can be performed only if the customisation requires a
new output projection
Limitations about input products
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
EPS Native products
'''''''''''''''''''
.. list-table::
:header-rows: 0
:widths: 80 80
* - ASCATL1SZF (EO:EUM:DAT:METOP:ASCSZF1B)
- It is impossible to perform re-projection and ROI-extraction for this product.
* - IASISND02 (EO:EUM:DAT:METOP:IASSND02)
- Only conversion to the NetCDF4 file format is supported. Even if HNO3 and/or O3 data are present
in the EPS-native format, the NetCDF4 output file includes only datasets for CO data,
in line with the corresponding product disseminated via the EUMETSAT Data Centre.
MSG products
''''''''''''
.. list-table::
:header-rows: 0
:widths: 80 80
* - MSGAMVE (EO:EUM:DAT:MSG:AMV, EO:EUM:DAT:MSG:AMV-IODC)
MSGCLAP (EO:EUM:DAT:MSG:CLA, EO:EUM:DAT:MSG:CLA-IODC)
MTGIRSL1
- In the Windows environment, output products in BUFR format are not generated because the ecCodes tool
(https://confluence.ecmwf.int/display/ECC) is not supported
SAF products
''''''''''''
.. list-table::
:header-rows: 0
:widths: 80 80
* - OAS025_BUFR (EO:EUM:DAT:METOP:OAS025)
OASWC12_BUFR (EO:EUM:DAT:METOP:OSI-104)
OR1ASWC12_BUFR (EO:EUM:DAT:METOP:OSI-150-B)
OR1SWW025_BUFR (EO:EUM:DAT:QUIKSCAT:REPSW25)
- In the Windows environment, input products in BUFR format are not supported because the ecCodes tool
(https://confluence.ecmwf.int/display/ECC) is not supported
EUMETSAT Data Tailor Web Service - Operational procedures
*********************************************************
This document describes how to maintain the Data Tailor Web Service (DTWS) on
the EUMETSAT ICSI infrastructure.
Short overview of the architecture
===================================
The DTWS is deployed on a distributed architecture, with:
- one service node, with a public IP associated to `tailor.eumetsat.int`. The node:
- acts as a proxy to the DTWS WebApp (GUI); served on the public address
- acts as a proxy to the DTWS REST API; served on the public address
- acts as a proxy to the DTWS Scheduler dashboard; served on a local address
- serves the ELK stack, providing usage statistics for the DTWS on a local address
- operates the GEMS client which sends DTWS events to the centralised GEMS server.
- one master node, which:
- serves the DTWS public interfaces (GUI, REST API)
- serves the DTWS scheduler and its dashboard
- serves the internal Docker registry used during deployment
- N worker nodes, which execute the product customisations. The number of worker nodes
can be changed at runtime to accommodate changes in the work-load of the DTWS.
Master and worker nodes are part of a Docker Swarm, with the master taking on the
role of swarm manager. When the DTWS is deployed to the Swarm, the following `Docker services`
are created (assuming that, as described in the deployment instructions, the stack is called `dtws`):
- `dtws_dtws-restapi`: the REST API service, on the swarm manager
- `dtws_dtws-scheduler`: the scheduler, on the swarm manager
- `dtws_dtws-webapp`: the GUI service, on the swarm manager
- `dtws_dtws-worker`: the `worker` services that receive customisation requests, on each worker node.
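A quick way to check which of these services exist on the stack is to list them from the swarm manager; a minimal sketch (assuming, as above, that the stack is called `dtws`):
.. code-block::
sudo docker stack services dtws   # lists all services that belong to the dtws stack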
Monitoring
==========
The DTWS provides monitoring capabilities at several levels. They are described below.
DTWS performance (ELK stack)
----------------------------
DTWS performance can be monitored by opening `<service node internal ip>:5607` with a browser,
then accessing the Dashboard.
Scheduler and worker health
----------------------------
The health of scheduler and workers can be monitored by opening `<service node internal ip>:8787` with a browser.
This is mainly useful to:
- check if the worker processes are up and running
- monitor the load on workers.
Note that the number of worker processes should be the number of worker nodes times the number of cores per worker.
Docker Swarm infrastructure
---------------------------
The composition and health of the Docker Swarm infrastructure can be monitored
by accessing the manager node as a superuser (user with sudo privileges), with:
.. code-block::
sudo docker node ls
For a swarm in nominal conditions, the output will show all nodes `Ready` and `Active`,
with one node with Manager Status "Leader".
DTWS Docker services
--------------------
The status of Docker services can be monitored
by accessing the manager node as a superuser (user with sudo privileges), with:
.. code-block::
sudo docker service ls
In nominal conditions, the output will show:
- `dtws_dtws-restapi` service running on 1 node out of 1 available (`REPLICAS` column) and served on port 8080
- `dtws_dtws-scheduler` service running on 1 node out of 1 available and served on ports 8786 (scheduler proper),
8787 (monitoring dashboard) and 9786 (internal communications)
- `dtws_dtws-webapp` service running on 1 node out of 1 available and served on port 8000
- `dtws_dtws-worker` service running on `N` nodes out of `N` available
- `registry` service running on 1 node out of 1 available and served on port 5000.
In non-nominal conditions, some services may not be present (e.g. after a failed deployment) or may not be fully
deployed; for example, the following output:
.. code-block::
ID NAME MODE REPLICAS IMAGE PORTS
...
h6o6gv9d6dew dtws_dtws-worker replicated 8/10 127.0.0.1:5000/dtws:2.7.1
...
would reveal that the `dtws_dtws-worker` service is only active on 8 nodes, even though 10 Docker worker nodes are available.
To monitor a specific Docker service, the following commands may be useful:
.. code-block::
sudo docker service ps <service name> # lists service tasks, can be useful e.g. to detect its latest restart
sudo docker service inspect <service name> # displays information on the service, including start-up parameters
DTWS service logs
-----------------
The DTWS generates usage logs in the following files (the paths assume that the defaults in the deployment guide have been
used):
- `/mnt/dtws-shared/dtws-workspace/epct_restapi_<YYYYMMDD>.log`: logs of the REST API
- `/mnt/dtws-shared/dtws-workspace/<username>/logs/*.log`: logs of user's customisations; the naming convention is:
`<username>_<timestamp>_<plugin>_<product_type>_<applied customisations>_<customisation_id>.log`
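For quick troubleshooting, the logs can be inspected directly from any node that mounts the shared filesystem; a minimal sketch (the date format and the `<username>` placeholder follow the naming above):
.. code-block::
sudo tail -f /mnt/dtws-shared/dtws-workspace/epct_restapi_$(date +%Y%m%d).log   # follow today's REST API log
sudo ls -lt /mnt/dtws-shared/dtws-workspace/<username>/logs/ | head             # most recent customisation logs for a user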
Control
=======
Add a worker node
------------------
Adding a worker node requires the following steps:
- set up the worker node and join it to the swarm as described in the deployment guide `DTWS nodes setup - Worker nodes`
- scale the `dtws_dtws-worker` service up, executing from the master node (where `N` is the previous number of nodes):
.. code-block::
sudo docker service scale dtws_dtws-worker=<N+1>
Verify that the `dtws_dtws-worker` is now using one more node ("replicas") with:
.. code-block::
sudo docker service ls
Remove a worker node
--------------------
To remove a worker node, follow the steps below.
On the master node, check which nodes are part of the swarm:
.. code-block::
sudo docker node ls
Then, scale down the worker service:
.. code-block::
sudo docker service scale dtws_dtws-worker=<N-1>
Now, identify which nodes the worker service is running on:
.. code-block::
sudo docker service ps dtws_dtws-worker
Nodes which are not listed can be removed.
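Before removing a node, it can also be double-checked that no task is still scheduled on it (a sketch; `<node_id>` comes from the `docker node ls` output above):
.. code-block::
sudo docker node ps <node_id>   # lists the tasks currently assigned to the node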
On the worker node to be removed, execute:
.. code-block::
sudo docker swarm leave
Then on the manager node, remove the target node:
.. code-block::
sudo docker node rm --force <node_id>
Restart service
---------------
To restart any docker service running on the swarm, execute from the master node:
.. code-block::
sudo docker service update --force <service_name>
This can be useful e.g. to make the REST API (`dtws_dtws-restapi`) re-read static parts of the configuration,
or to re-deploy the worker service to nodes.
.. note::
The command *restarts* the service. Restarting `dtws_dtws-worker` kills all the customisation processes running
on the worker nodes, although the scheduler should be able to restart them.
Temporary service shutdown
--------------------------
In order to temporarily shut down a service (e.g. for availability testing purposes), first scale it down:
.. code-block::
sudo docker service scale <service_name>=<N-1>
Then to resume the service, scale it back up:
.. code-block::
sudo docker service scale <service_name>=<N>
To confirm that services have resumed correctly, it is always good practice to check their status again with:
.. code-block::
sudo docker service ls
.. note::
Shutting down the `dtws_dtws-scheduler` for too long will make the `dtws_dtws-worker` and `dtws_dtws-restapi`
services lose their connection to it; they will therefore need to be restarted once the scheduler service is
running again.
Permanent service shutdown
--------------------------
To permanently shut down any docker service running on the swarm, execute from the master node:
.. code-block::
sudo docker service rm <service_name>
Purge user space - console
---------------------------
User space can be purged with standard administrator commands from the command line of any node,
as user data are stored in the `/mnt/dtws-shared/dtws-workspace/<username>/` folder on the shared filesystem.
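As a sketch, the usage per user can be inspected and individual items removed with standard commands; which sub-folders are safe to delete depends on the deployment, so check their content before deleting anything:
.. code-block::
sudo du -sh /mnt/dtws-shared/dtws-workspace/*                       # disk usage per user workspace
sudo rm -rf /mnt/dtws-shared/dtws-workspace/<username>/logs/*.log   # example: remove a user's customisation logs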
Purge user space - GUI
----------------------
User space can also be purged from the DTWS GUI by a DTWS administrator.
A DTWS administrator is simply a user in EUMETSAT CAS who is whitelisted in the
DTWS configuration (see below).
To do that, access the GUI as an administrator, and open the process manager panel with the downward arrow button
on the bottom right.
The list of current and past customisations from all users opens.
Mark the customisations to be purged by ticking the corresponding checkboxes, then press the "Delete
Customisations" button below to remove the selected customisations together with their related data and logs.
At the moment it is not possible to filter the list of customisations by user.
Configuration
=============
Understanding the DTWS configuration
-------------------------------------
Where to find it
................
The DTWS configuration proper is stored on the shared filesystem at `/mnt/dtws-shared/dtws-config/epct`, so it is
accessible from all the nodes where the filesystem is mounted. It consists of:
- `epct.yaml`
- `epcs.yaml`
- `epct-webui.yaml`
- low-level configuration files (products, ...)
Users have their own copy of the configuration in `/mnt/dtws-shared/dtws-workspaces/<username>/etc`; if the general configuration
is changed, it is copied over the next time the user makes a request to the REST API.
This implies that most configuration parameters can be changed at runtime and are applied at the
user's next request; the exceptions, which define the basic behaviour of the DTWS, are listed in the
following paragraph.
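To check whether a user's copy is aligned with the general configuration, the two files can be compared directly; a minimal sketch (the exact file name under `etc` is assumed here for illustration):
.. code-block::
diff /mnt/dtws-shared/dtws-config/epct/epcs.yaml /mnt/dtws-shared/dtws-workspaces/<username>/etc/epcs.yaml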
Key configuration parameters requiring a restart of the service(s)
...................................................................
This section lists the key parameters that may make sense to change in an operational system
and that would require a restart of some service in the Docker stack.
In `epct.yaml`:
- `api_base_path`
- `remove_processing_dir`
- `log_level`
After any of the parameters above is changed, the `dtws_dtws-restapi` service needs to be restarted.
In `epcs.yaml`:
- None of interest.
In `epct-webui.yaml`:
- `client_key`.
If this parameter changes, the `dtws_dtws-webapp` service needs to be restarted.
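A restart uses the same command described in the `Restart service` section above; for example:
.. code-block::
sudo docker service update --force dtws_dtws-restapi   # after changing parameters in epct.yaml
sudo docker service update --force dtws_dtws-webapp    # after changing client_key in epct-webui.yaml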
Key configuration parameters not requiring a restart of the service(s)
.......................................................................
This section lists the key parameters that may make sense to change in an operational system,
but do not require a restart of the service.
.. note::
It is advisable not to modify the files being used by the operational system,
as text editors often open temporary files that may cause the DTWS to break.
The best practice is to modify a copy in a temporary directory, then move the updated
configuration file to the configuration directory (see the sketch at the end of this section).
In `epcs.yaml`:
- `fair_queueing` section
- `disk_quota` section
- `administrators` section.
For more details on such sections, refer to the relevant paragraphs below.
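Following the note above, a typical safe edit of a runtime parameter might look like this (a sketch; paths assume the default configuration location):
.. code-block::
cp /mnt/dtws-shared/dtws-config/epct/epcs.yaml /tmp/epcs.yaml    # work on a copy
vi /tmp/epcs.yaml                                                # edit e.g. the disk_quota section
mv /tmp/epcs.yaml /mnt/dtws-shared/dtws-config/epct/epcs.yaml    # move the updated file back into place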
Nginx configuration
--------------------
`Nginx` configuration for the DTWS is found on the service node at `/etc/nginx/sites.d/epcs.conf`.
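After editing this file, the configuration can usually be validated and reloaded as follows (a sketch; assumes `nginx` runs under systemd on the service node):
.. code-block::
sudo nginx -t                 # check the configuration for syntax errors
sudo systemctl reload nginx   # reload without dropping connections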
ELK Stack configuration
-----------------------
The configuration files of the ELK stack components are found on the service node at:
- `/etc/logstash/conf.d/epcs.conf` for `Logstash`
- `/etc/kibana/kibana.yml` for `Kibana`
- `/etc/elasticsearch/elasticsearch.yml` for `Elasticsearch`
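After changing any of these files, the corresponding component typically needs to be restarted; a sketch, assuming the stock systemd unit names on the service node:
.. code-block::
sudo systemctl restart logstash
sudo systemctl restart kibana
sudo systemctl restart elasticsearch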
Manage disk quotas
------------------
Configuration for disk quotas is kept in the `disk_quota` section
of the `epcs.yaml` file.
An example configuration could be:
.. code-block::
disk_quota:
active: true # true ---> disk-quota is ON; false ---> disk-quota is OFF
# max allowed size per user. The value 0 means "unlimited"
user_quota: # [GB]
default: 50
test-user_1: 10
The configuration above activates the management of the disk quota (`active: true`), sets
the limit for all users to 50 GB (`default: 50`), and limits `test-user_1` to 10 GB.
To set a disk quota for another user, say `test-user_2`, to 20 GB, just add the corresponding key-value pair under `user_quota` (e.g. `test-user_2: 20`).