Improved Federation specifications and documentation
Summary of knowledge regarding deployment of MIP Local
This commit is contained in:
dianeperez
2018-01-17 10:53:26 +01:00
parent dd97db66e4
commit e73638534d
20 changed files with 1601 additions and 2 deletions

3
.gitignore vendored Normal file
View File

@@ -0,0 +1,3 @@
.DS_Store
._*
settings.local.*

Binary file not shown.

After

Width:  |  Height:  |  Size: 89 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 78 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 76 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 89 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 90 KiB

View File

@@ -0,0 +1,158 @@
# Firewall Configuration
## Configuration requirements
The Docker *Overlay Network* technology is a Software Defined Network (SDN) system which allows the arbitrary definition of network between docker containers.
To accomplish this, it maintains a database of hosts on which each network is available, and multiplexes the network traffic of the docker container over a single network connection between these hosts. It also allows encryption of the tunneled (application) data.
All management communications between the hosts of the Docker Swarm (cluster, or federation in the case of the MIP) are done over TLS-encrypted connections. The certificates are managed automatically and regenerated every 30 minutes by default.
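For reference, the current certificate settings can be inspected and the rotation adjusted with standard Docker Swarm commands; a minimal sketch (the expiry value shown is purely illustrative):
```sh
# On a manager node: show the swarm CA configuration, including certificate expiry
$ sudo docker info | grep -A 3 'CA Configuration'
# Change the certificate validity period (illustrative value)
$ sudo docker swarm update --cert-expiry 720h
# Force an immediate rotation of the swarm certificates
$ sudo docker swarm ca --rotate
```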
The following ports and protocols are required for the Docker Swarm overlay network to function properly and must be open for connection on the hosts (a quick reachability check is sketched after the list):
* On **all** the Docker hosts which are part of the Swarm (federation):
* **TCP: 7946**
* **UDP: 7956**
* **UDP: 4789**
* **Protocol 50 (ESP)**
* **Only** on the Docker manager hosts of the Swarm (federation) :
* **TCP: 2377**
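A rough reachability check for the TCP ports listed above, run from another host of the federation (a sketch; `<manager-ip>` and `<node-ip>` are placeholders, and the UDP ports and the ESP protocol cannot be probed this way):
```sh
# TCP 2377 must answer on the manager hosts only
$ nc -zv <manager-ip> 2377
# TCP 7946 must answer on every host of the swarm
$ nc -zv <node-ip> 7946
```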
## UFW Configuration for the MIP
The following commands configure and then enable the firewall on Ubuntu with the minimum ports required for the federation networks.
Specific public services provided by the MIP to the end-users will require their own configuration to be added.
1. Check the status of UFW
```sh
$ sudo ufw status
-> Status: inactive
```
2. Allow SSH access
```sh
$ sudo ufw allow ssh
-> Rules updated
-> Rules updated (v6)
```
3. Docker Swarm ports
```sh
$ sudo ufw allow 7946/tcp
$ sudo ufw allow 7946/udp
$ sudo ufw allow 4789/udp
$ sudo ufw allow proto esp from any to any
-> Rules updated
-> Rules updated (v6)
```
4. Docker Swarm ports for Manager nodes
**The following is required only on the Docker Swarm manager computers.**
```sh
$ sudo ufw allow 2377/tcp
-> Rules updated
-> Rules updated (v6)
```
5. Enable UFW to enforce the rules
```sh
$ sudo ufw enable
```
6. Check the status
*The example below has been executed on a worker node of the federation.*
```sh
$ sudo ufw status
Status: active
To Action From
-- ------ ----
22 ALLOW Anywhere
7946/tcp ALLOW Anywhere
7946/udp ALLOW Anywhere
4789/udp ALLOW Anywhere
Anywhere/esp ALLOW Anywhere/esp
22 (v6) ALLOW Anywhere (v6)
7946/tcp (v6) ALLOW Anywhere (v6)
7946/udp (v6) ALLOW Anywhere (v6)
4789/udp (v6) ALLOW Anywhere (v6)
Anywhere/esp (v6) ALLOW Anywhere/esp (v6)
```
## Firewalld Configuration for the MIP
The following commands configure and then enable the firewall on RedHat Enterprise Linux with the minimum ports required for the federation networks.
Specific public services provided by the MIP to the end-users will require their own configuration to be added.
1. Check the status of Firewalld
```sh
$ sudo firewall-cmd --state
running
$ sudo firewall-cmd --list-services
ssh dhcpv6-client
$ sudo firewall-cmd --info-service=ssh
ssh
ports: 22/tcp
protocols:
source-ports:
modules:
destination:
```
2. If needed, enable and start firewalld
```sh
$ sudo systemctl enable firewalld.service
$ sudo systemctl start firewalld.service
```
3. Docker Swarm ports
Do one of the following, assuming you are in the folder containing this file. This installs the service profile and activates it permanently.
* For worker nodes:
```sh
$ sudo cp docker-swarm-worker.xml /etc/firewalld/services/
$ sudo firewall-cmd --permanent --add-service=docker-swarm-worker
success
```
* For Manager nodes
```sh
$ sudo cp docker-swarm-manager.xml /etc/firewalld/services/
$ sudo firewall-cmd --permanent --add-service=docker-swarm-manager
success
```
4. Reload the firewall configuration:
```sh
$ sudo firewall-cmd --reload
success
```
5. Check the status
The service profile added at step 3 (`docker-swarm-worker` on worker nodes, `docker-swarm-manager` on manager nodes) should now appear in the list of active services:
```sh
$ sudo firewall-cmd --zone=public --list-services
```

View File

@@ -0,0 +1,221 @@
# MIP Federation specifications
**Warning:** This document is work in progress. The ongoing work on the second Federation PoC and the Federation demo setup might lead to improvements of the Federation specifications.
Contents:
- [Overview of the Federation](#overview-of-the-federation)
- [MIP Federated requirements](#mip-federated-requirements)
- [MIP Federated deployment](#mip-federated-deployment)
- [Behaviour in case of failure](#behaviour-in-case-of-failure)
- [Security](#security)
## Overview of the Federation
The MIP Federation connects multiple MIP Local instances securely over the web, so that privacy-preserving analyses and queries on the data hosted at the Federation nodes can be performed in a distributed manner from the Federation manager, using the Exareme software.
### Federation architecture
The following schema shows an overview of the working principle of the Federation and of its infrastructure. The Federation is composed of one or more Federation manager nodes and of any number of Federation nodes, usually hospitals hosting a MIP Local instance and sharing data on the Federation.
![Image](Federation_schema.001.jpg)
The Federation Manager server will run Docker engine (as all other MIP nodes). It will create the Federation Swarm (standard Docker functionality), which will make it the Swarm manager.
The Federation Manager server will host the following Federation elements (alongside its MIP Local or just LDSM instance):
- Federation Web Portal (container run locally)
- Federation Swarm Manager
- Consul (container run on the swarm, service published on port 8500)
- Portainer (optional UI for swarm management, container run on the swarm, service published on port 9000)
- Exareme Master (container run on the swarm, service published on port 9090)
The other MIP nodes will host an MIP Local instance, possibly deployed on several servers for improved security. The modifications will be:
- The server dedicated to the Federation (hosting the LDSM) will have internet access.
- The Data Capture and Data Factory might be moved to other servers to improve security.
- The Federation server (or more accurately its Docker engine instance) will join the Federation Swarm.
- The Federation Swarm Manager will remotely start an Exareme worker on the node.
The software Exareme will expose federated analysis functionalities to the Federation Web Portal. Exareme provides several algorithms that can be performed over the data distributed in multiple nodes. Exareme algorithms retrieve only aggregated results from each node to ensure privacy (no individual patient data will leave the servers of the MIP partners). Exareme then combines the partial results in a statistically significant manner before returning results to the Federation Web Portal.
### Regarding Docker swarm
As written in the official documentation, "Docker includes a _swarm mode_ for natively managing a cluster of Docker Engines called a _swarm_". The Docker swarm functionality creates a link among distant Docker engines. A Docker engine can only be part of one swarm, so all the Docker Engine instances running on the Federation servers will be part of the Federation Swarm. (The Federation servers cannot be part of another swarm, assuming the normal and recommended setup where only one Docker engine runs on each server.)
The swarm is created by the Swarm Manager; other Federation nodes will join as Swarm Workers. The Federation Swarm Manager will create a `mip-federation` network shared by the swarm nodes. All communications on this network will be encrypted using the option `--opt encrypted`.
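For illustration, the encrypted overlay network is created from the Swarm Manager roughly as follows; the repository's `setupFederationInfrastructure.sh` performs the equivalent step with additional addressing options:
```sh
docker network create \
    --driver overlay \
    --opt encrypted \
    mip-federation
```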
Docker containers can be run in two ways (a short sketch contrasting the two follows this list):
- On the swarm. To run on the swarm, the containers must be started **from the Swarm Manager**. Containers started directly on the worker nodes cannot join the swarm for security reasons. This means that all Exareme containers (Master and Worker instances) will be started from the Federation Swarm Manager.
- Outside the swarm. Docker containers running outside the swarm can be started locally as usual on the worker nodes. All Docker services composing MIP Local will be run locally, without access to the swarm or the other MIP nodes.
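A minimal sketch of the two modes (image and service names are placeholders, not the actual MIP images):
```sh
# On the swarm: must be run from the Swarm Manager; the container becomes a swarm
# service and can attach to the shared mip-federation overlay network
docker service create --name example-service --network mip-federation <image>

# Outside the swarm: run locally on a node as usual; the container has no access
# to the swarm or to the other MIP nodes
docker run -d --name example-local <image>
```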
### Planned Federation infrastructure
A Federation server is planned in the CHUV infrastructure, along with the hospital's MIP node server.
The Federation server should host the (first) Federation Manager node, as well as the Federation Web Portal providing the MIP federated functionalities.
## MIP Federated requirements
### Federation manager server requirements
- Static IP
- Network configuration:
- TCP: ports 2377 and 7946 must be open and available
- UDP: ports 4789 and 7946 must be open and available
- IP protocol 50 (ESP) must be enabled
- If the configuration uses a whitelist of allowed IP addresses, the IP of all other Federation nodes must be authorised.
The Federation manager server must run an instance of the LDSM as deployed in the MIP, exposing a valid federation view. The LDSM instance must be accessible locally through PostgresRAW-UI on port 31555.
- If the Federation Manager server is a hospital node, it will run a normal MIP Local instance.
- If the Federation Manager server is not a hospital node, it only needs to run an instance of the LDSM containing the research dataset that must be exposed at the Federation level.
### Federation nodes requirements
- Static IP
- Network configuration:
- TCP: port 7946 must be open and available
- UDP: ports 4789 and 7946 must be open and available
- IP protocol 50 (ESP) must be enabled
The node must also host a deployed MIP Local, or at least an LDSM instance. The LDSM instance must be accessible locally through PostgresRAW-UI on port 31555.
## MIP Federated deployment
### Initial setup
This document does not cover the deployment of MIP Local at the Federation nodes. Nor does it cover the deployment and configuration of the Federation Web Portal, for which no information is available yet (12.2017).
In summary, the initial setup expected is the following:
- On the Federation Manager server, Docker engine must be installed and the LDSM deployed, either alone or as part of MIP Local (with the PostgresRaw and PostgresRaw-UI containers configured to expose their services on ports 31432 and 31555, respectively).
- On the other Federation nodes, MIP Local must be deployed including the LDSM, again with the PostgresRaw and PostgresRaw-UI containers configured to expose their services on ports 31432 and 31555, respectively.
- Network access is configured at each node according to the requirements (a quick verification is sketched below).
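A quick way to verify this initial setup on a node (a sketch, assuming the PostgreSQL client tools are installed and PostgresRAW-UI answers plain HTTP on its default port):
```sh
# Docker engine is installed and running
$ docker info --format '{{ .ServerVersion }}'
# PostgresRAW (LDSM) accepts connections on its published port
$ pg_isready -h localhost -p 31432
# PostgresRAW-UI answers on its published port
$ curl -sI http://localhost:31555 | head -n 1
```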
![Image](Federation_schema.002.jpg)
### Deployment of the Federation Manager node
Based on the latest version of the Federation infrastructure schema provided, the Federation Manager node will be a server independent from any particular hospital. Alternatively, any hospital node hosting an instance of MIP Local could be the Federation manager.
In both cases, the Federation Manager server must host a deployed LDSM instance exposing the research data as part of its Federation view.
The Federation Manager server creates the Federation Swarm; it thus becomes the _Swarm Manager_. It also creates a network on the swarm dedicated to the Federation traffic named `mip-federation`.
At creation time, or any time later, two tokens can be retrieved; they allow worker or manager nodes to be added to the swarm.
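Both tokens can be displayed at any time on the Swarm Manager with the standard commands:
```sh
$ sudo docker swarm join-token worker    # join command for worker nodes
$ sudo docker swarm join-token manager   # join command for additional manager nodes
```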
Note: The Swarm Manager can be located on any server running Docker; ideally it should be duplicated on three (or any odd number of) servers for redundancy. We currently assume that the MIP Federation Server of CHUV will be the Swarm Manager (others can be added later using the "manager" token).
Once the Swarm is created, the Exareme master will be run on the swarm. The Federation Web Portal must be configured to access Exareme on the correct port.
#### Deployment steps
- Create the swarm by running the setupFederationInfrastructure.sh script.
```
git clone https://github.com/HBPMedical/Federation-PoC.git
cd Federation-PoC
./setupFederationInfrastructure.sh
```
![Image](Federation_schema.003.jpg)
### Deployment of other MIP nodes
MIP Local will mostly function as before: the Docker containers will be run locally, and can be deployed with the MIP Local deployment scripts (assuming that everything runs on the same server or that the deployment scripts are adapted to deploy individual building blocks).
The only supplementary deployment step to perform at the node is to join the swarm, using the token provided by the swarm manager.
#### Deployment steps
- If needed, retrieve the token on the Federation manager server with the following command:
```
$ sudo docker swarm join-token worker
```
- On the node, use the command retrived at the previous step to join the Federation swarm:
```
$ sudo docker swarm join --token <Swarm Token> <Master Node URL>
```
![Image](Federation_schema.004.jpg)
### Deployment of Exareme and creation of the Federation
Once the worker nodes have joined the swarm, the swarm manager must tag each of them with a representative name (e.g. hospital name) and launch an Exareme worker on each of them. The Exareme worker will access the local LDSM to perform the queries requested by the Exareme master.
- On the Federation manager server, tag the new node(s) with an informative label:
```sh
$ sudo docker node update --label-add name=<Alias> <node hostname>
```
* `<node hostname>` can be found with `docker node ls`
* `<Alias>` will be used when bringing up the services and should be a short descriptive name.
- Restart Exareme taking into account the new node:
```sh
$ sudo ./start.sh <Alias>
```
![Image](Federation_schema.005.jpg)
### Deployment and configuration of the Federation Web Portal
To be defined.
![Image](Federation_schema.006.jpg)
## Behaviour in case of failure
The Swarm functionality of Docker is meant to orchestrate tasks in an unstable environment: "Swarm is resilient to failures and the swarm can recover from any number of temporary node failures (machine reboots or crash with restart) or other transient errors."
If a node crashes or reboots for any reason, Docker should re-join the swarm automatically when restarted (to be confirmed). The manager will then restart the missing services on the swarm and thus restore the previous state as soon as possible.
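The state of the swarm and of its services after such an event can be checked from the Swarm Manager with the standard commands, for example:
```sh
$ sudo docker node ls                     # availability of each node (Ready / Down)
$ sudo docker service ls                  # running vs. expected replicas per service
$ sudo docker service ps <service name>   # task placement and restart history
```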
On the other hand, Exareme will not work properly if any of the expected worker nodes is unavailable, or if their IP addresses change. In case of prolonged unavailability or failure of one worker node, Exareme should be restarted to adapt to the new situation.
**TODO: Check planned upgrades of Exareme for more flexibility regarding failures.**
The swarm cannot recover if it definitively loses its manager (or the quorum of managers) because of "data corruption or hardware failures". In this case, the only option will be to remove the previous swarm and build a new one, meaning that each node will have to perform a "join" command again.
To increase stability, the manager role can be duplicated on several nodes (including worker nodes). For more information, see docker documentation about <a href="https://docs.docker.com/engine/swarm/join-nodes/#join-as-a-manager-node">adding a manager node</a> and <a href="https://docs.docker.com/engine/swarm/admin_guide/#add-manager-nodes-for-fault-tolerance">fault tolerance</a>.
## Security
This section documents a few elements regarding security.
### Swarm join tokens
The tokens allowing a node to join the swarm as a worker or a manager should not be made public. Joining the swarm as a manager, in particular, allows that node to control everything on the swarm. Ideally, the tokens should not leave the manager node except when a new node must join the swarm. There is no need to store these tokens elsewhere, as they can always be retrieved from the manager node.
Furthermore, the tokens can be changed (without impacting the nodes already in the swarm), following the documentation available <a href="https://docs.docker.com/engine/swarm/swarm-mode/#view-the-join-command-or-update-a-swarm-join-token">here</a>. It is recommended to rotate the tokens on a regular basis to improve security.
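Rotation uses the standard swarm commands, for example:
```sh
$ sudo docker swarm join-token --rotate worker
$ sudo docker swarm join-token --rotate manager
```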
### Back up the Swarm
See documentation <a href="https://docs.docker.com/engine/swarm/admin_guide/#back-up-the-swarm">here</a>.
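In short, the documented procedure amounts to stopping Docker on a manager node and archiving its swarm state directory; a minimal sketch (paths to adapt, and a quorum of managers must remain available while the manager being backed up is stopped):
```sh
$ sudo systemctl stop docker
$ sudo tar czvf /tmp/swarm-backup.tar.gz -C /var/lib/docker swarm
$ sudo systemctl start docker
```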

View File

@@ -0,0 +1,480 @@
# MIP Local deployment and documentation
This document summarises the knowledge of DIAS-EPFL regarding the deployment and upgrade process of MIP Local. It is based on the version 2.5.3 released on Nov 13, 2017.
**Disclaimer:** The authors of this document are not in charge of the MIP development and its deployment scripts. They have limited knowledge of most of the elements that are deployed. No guarantees are offered as to the correctness of this document.
See also the official documentation of the deployment scripts project on Github: <a href="https://github.com/HBPMedical/mip-microservices-infrastructure/blob/master/README.md">README</a> file, <a href="https://github.com/HBPMedical/mip-microservices-infrastructure/blob/master/docs/installation/mip-local.md">installation</a> instructions and some <a href="https://github.com/HBPMedical/mip-microservices-infrastructure/blob/master/docs">more documentation</a>.
## Contents
- [Introduction](#introduction)
- [User management](#user-management)
- [Known limitations](#known-limitations)
- [Deployment steps](#deployment-steps)
- [Deployment validation](#deployment-validation)
- [Direct access to the deployed databases](#direct-access-to-the-deployed-databases)
- [Reboot](#reboot)
- [Upgrades](#upgrades)
- [Adding clinical data](#adding-clinical-data)
- [Cleanup MIP installation](#cleanup-mip-installation)
- [Requirements](#requirements)
- [Network configuration](#network-configuration)
- [Troubleshooting](#troubleshooting)
## Introduction
The MIP (Medical Informatics Platform) is a bundle of software developed by the HBP sub-project SP8.
Its goal is to enable research and studies on neurological medical data, locally at one hospital and in a Federated manner across hospitals, while maintaining the privacy of sensitive data. For more information, please refer to "SP8 Medical Informatics Platform Architecture and Deployment Plan" (filename `SGA1_D8.6.1_FINAL_Resubmission`).
The MIP is composed of four main parts:
- Web Portal (interface: metadata about the available data, functionalities for privacy-preserving exploration and analysis of the data).
- Structural software (aka "Hospital Bundle": anonymisation, data harmonisation, query engine, federated query engine).
- Data Factory (extraction of features from medical imaging data).
- Algorithm Factory (library of research algorithms that can be run in the MIP).
It is populated with:
- The research datasets PPMI, ADNI and EDSD.
- Local clinical datasets, once prepared and processed.
The MIP can be deployed using the scripts available in the <a href="https://github.com/HBPMedical/mip-microservices-infrastructure">mip-microservices-infrastructure</a> project on Github.
The software is organised into "building blocks" that should facilitate the deployment of the MIP on two or three servers, in an infrastructure that improves security in order to guarantee data privacy.
Based on the <a href="https://github.com/HBPMedical/mip-microservices-infrastructure/blob/master/roles/mip-local/templates/hosts.j2"> Ansible inventory file</a>, the building blocks are the following:
- infrastructure
- hospital-database
- reference
- data-factory
- algorithm-factory
- web-analytics
This file lists the building blocks that will be installed. In theory, it can be modified before running setup.sh to install only specific blocks (this has not been tested).
**TODO: Test building block deployment and improve documentation. Determine which blocks need to be deployed on the same server, and how to configure the blocks if they are deployed on different servers.**
## Requirements
- Ubuntu 16.04 system (partial support for RHEL).
- Matlab R2016b. (Required for the Data Factory. Alternatively the MIP can be installed without the Data Factory: see below the corresponding deployment option.)
- According to the official documentation, python version 2.7 and the library jmespath need to be installed beforehand.
- For ubuntu:
```
sudo apt install python2.7
sudo ln -s /usr/bin/python2.7 /usr/bin/python
sudo apt install python-jmespath
```
## Network configuration
### Internet access for deployment
Access to the following internet domains is required during the deployment:
**TODO: Get Lille list and reproduce it here**
### Operational firewall configuration
The firewall of the server where MIP is deployed must be set up to deny all incoming connections, except on the following ports:
- 22 for ssh access
- 80 for Web Portal access
- MIP Local requirements
- Federation requirements (see Federation documentation)
- User management requirements (see below)
**TODO: Obtain user management requirement and reproduce it here.**
### MIP Local requirements
Some ports must be open for intra-server connections (accept only requests coming from the local server itself, but on its public address):
- 31432 ("LDSM", PostgresRAW database)
- 31555 (PostgresRAW-UI)
**TODO: Obtain list and reproduce it here.**
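A possible UFW translation of these rules (a sketch, assuming UFW is used; `<server-public-ip>` is a placeholder for the server's own public address):
```sh
# Accept connections to the LDSM and PostgresRAW-UI only from the server itself
$ sudo ufw allow from <server-public-ip> to any port 31432 proto tcp
$ sudo ufw allow from <server-public-ip> to any port 31555 proto tcp
```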
## User management
The Web Portal of MIP Local can be deployed in two settings:
- No user management: anybody who has access to the port 80 of the MIP Local server can access the Web Portal and all the data available in the MIP. This can either be
- Everybody that has access to the local network, if the firewall is open.
- Only users who have access to the server itself, if the firewall prevents external access.
- User authentication required: every user must obtain credentials to access the Web Portal. In this case, user rights and authentication are managed by the main HBP servers, so network access to these servers must be allowed.
Further information:
[//]: # ( from Jacek Manthey to Lille)
[... Users] can create accounts on the HBP Portal (see https://mip.humanbrainproject.eu/intro) through invitation, which means that the access control is not stringent.
[... Only] users that can access [the local] network and have an HBP account would be able to access MIP Local. In case you would need more stringent access control, we would need to implement in your MIP-Local a whitelist of authorized HBP accounts.
In order to activate the user access using the authentication through the HBP Portal, we would need a private DNS alias for your MIP local machine, something like mip.your\_domain\_name. [...]
## Known limitations
The following are known limitations of the deployment scripts, version 2.5.3.
- It is currently not possible to deploy MIP Local with a firewall enabled. Nor can MIP Local run with the firewall up, unless the correct rules are configured (see [MIP Local requirements](#mip-local-requirements)).
- The deployed MIP will include research datasets (PPMI, ADNI and EDSD), but the process to include hospital data in MIP-Local is as yet unclear. **TODO: Obtain information, test, complete dedicated section below**
Note: Clinical data processed and made available in the Local Data Store Mirror (LDSM) will not be visible from the Local Web Portal without further configuration, but they will be available to the Federation if the node is connected (variables included in the CDE only).
## Deployment steps
This section describes how to deploy MIP Local without clinical data, on a clean server. If a previous installation was attempted, please see [Cleanup MIP installation](#cleanup-mip-installation). To add hospital data see the section [Adding clinical data](#adding-clinical-data).
1. Retrieve the information requested for the deployment:
- Matlab installation folder path,
- server's address on the local network,
- credentials for the gitlab repository, to download the research data sets,
- sudo access to the target server.
2. Clone the `mip-microservices-infrastructure` git repo in the desired location (here a `mip-infra` folder):
```sh
git clone https://github.com/HBPMedical/mip-microservices-infrastructure.git mip-infra
cd mip-infra/
./after-git-clone.sh # Need confirmation whether this is needed or not
git checkout tags/2.5.3
./after-update.sh # Need confirmation whether this is needed or not
```
Also check the process as described in the official documentation.
3. Run the configuration script:
```
./common/scripts/configure-mip-local.sh
```
Provide the requested parameters.
Summary of requested input:
```
Where will you install MIP Local?
1) This machine
2) A remote server
>
Does sudo on this machine requires a password?
1) yes
2) no
>
>Which components of MIP Local do you want to install?
1) All 3) Data Factory only
2) Web analytics and databases only
>
Do you want to store research-grade data in CSV files or in a relational database?
1) CSV files
2) Relational database
>
```
WARNING: Both options load the research data (ADNI, PPMI and EDSD) into a relational database. The first option loads the data into the LDSM database using PostgresRAW, and the second into an unofficial postgres database named "research-db".
```
Please enter an id for the main dataset to process, e.g. 'demo' and a
readable label for it, e.g. 'Demo data'
Id for the main dataset >
Label for the main dataset >
Is Matlab 2016b installed on this machine?
1) yes
2) no
>
Enter the root of Matlab installation, e.g. /opt/MATLAB/2016b :
path >
Do you want to send progress and alerts on data processing to a Slack channel?
1) yes
2) no
Do you want to secure access to the local MIP Web portal?
1) yes
2) no
To enable Google analytics, please enter the Google tracker ID or leave this blank to disable it
Google tracker ID >
```
```
TASK [Suggested target server hostname]***********************
ok: [localhost] => {
"ansible_hostname": "suggested_ansible_hostname"
}
TASK [Suggested target server FQDN]***************************
ok: [localhost] => {
"ansible_fqdn": "suggested_ansible_fqdn"
}
TASK [Suggested target server IP address]***********************
ok: [localhost] => {
"msg": "suggested_IP_address"
}
Target server hostname, e.g. myserver . Use ansible_hostname value if you agree with it.
Target server FQDN, e.g. myserver.myorg.com .
If the full server name cannot be reached by DNS (ping myserver.myorg.com fails),
you can use the IP address instead:
```
If unsure that the `suggested_ansible_fqdn` given above is valid, use the `suggested_IP_address` instead. (Or check if ping works on the `suggested_ansible_fqdn` from another computer.)
```
Target server IP address:
Base URL for the frontend, for example http://myserver.myorg.com:7000
```
This is the address the WebPortal will be accessed through.
The server's address must be valid on the local network (check with nslookup).
The port must be open.
```
Username on Gitlab to download private Docker images.
Leave blank if you do not have access to this information:
Password on Gitlab to download private Docker images.
Leave blank if you do not have access to this information:
```
Gitlab access to download the research data docker images.
```
Use research data only? (Y/n):
```
Using only the research data ("Y") should lead directly to a working MIP Local, accessing research data in a table named `mip_cde_features`.
Adding hospital data (i.e. answering "n") requires additional steps: see section [Adding clinical data](#adding-clinical-data).
In this case, MIP Local will use the view named "mip\_local\_features" to access data. This view groups the research and the clinical data in a uniform flat schema. It is automatically created when hospital data, in the form of a CSV file named "harmonized\_clinical\_data", is dropped in the /data/ldsm folder of the MIP Local server. (See [PostgresRAW-UI documentation](https://github.com/HBPMedical/PostgresRAW-UI/blob/master/README.md#3-automated-mip-view-creation) for details.)
```
Generate the PGP key for this user...
[details]
Please select what kind of key you want:
(1) RSA and RSA (default)
(2) DSA and Elgamal
(3) DSA (sign only)
(4) RSA (sign only)
Your selection?
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0)
Is this correct? (y/N)
You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
"Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"
Real name:
Email address:
Comment:
You selected this USER-ID:
[...]
Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit?
You need a Passphrase to protect your secret key.
Enter passphrase:
Repeat passphrase:
```
This information is used by git-crypt to encrypt the sensitive information in the Git repository. This precaution matters if the configuration is uploaded (pushed) to a different server.
4. Once the configuration script ends successfully with a message "Generation of the standard configuration for MIP Local complete!", commit the modifications before continuing.
```
git add .
git commit -m "Configuration for MIP Local"
```
5. Run the setup script, twice if required.
```
./setup.sh
```
The script should end with the following message:
```
PLAY RECAP *************************************************************************************
localhost : ok=?? changed=?? unreachable=0 failed=0
```
## Deployment validation
If the deployment was successful, the Web Portal should be accessible on the `target server IP address` defined at the configuration step.
The Web Portal documentation [HBP\_SP8\_UserGuide\_latest.pdf](https://hbpmedical.github.io/documentation/HBP_SP8_UserGuide_latest.pdf) can help check that the deployed MIP Local is running as expected. The Web Portal should provide similar, though not necessarily identical, results to those shown in the documentation.
[This report](https://drive.google.com/file/d/136RcsLOSECm4ZoLJSORpeM3RLaUdCTVe/view) of a successful deployment can also help check that MIP Local is behaving correctly.
The PostgresRAW-UI can be validated following this <a href="https://drive.google.com/open?id=0B5oCNGEe0yovNWU5eW5LYTAtbWs">test protocol</a>. PostgresRAW-UI should be accessible locally at `http://localhost:31555`.
## Direct access to the deployed databases
The ports and credentials to access the databases used in the MIP can be found in these files:
```
cat install_dir/envs/mip-local/etc/ansible/host_vars/localhost
cat install_dir/vars/hospital-database/endpoints.yml
cat install_dir/vars/reference/endpoints.yml
```
Adapt this command to connect to the databases:
```
psql -U ldsm -p 31432 -h hostname
```
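As a usage example, the following checks that the research data table is visible through the LDSM (a sketch, assuming the research data has been loaded and the `ldsm` role can read it):
```sh
psql -U ldsm -p 31432 -h localhost -c 'SELECT count(*) FROM mip_cde_features;'
```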
## Reboot
The MIP is not automatically restarted if the server is shut down or rebooted.
The last instructions provided to restart it are:
[//]: # (Slack, MIP-Local & IAAN workspace, general channel, 06.12.2017)
```
./common/scripts/fix-mesos-cluster.sh --reset
./setup.sh
```
Before an updated version of the installer can be provided, it might be necessary to:
> stop all services, uninstall mesos, marathon and docker-ce, then run the installer again.
## Upgrades
> When you perform an upgrade, in most cases you will not need to run again the pre-configuration script mip-local-configuration.sh.
>
> In the few cases where that is necessary, for example if you want to install a new component such as the Data Factory or there has been a big update that affects configuration, then you need to be careful about the changes that this script brings to the configuration. For example, passwords are always re-generated. But the passwords for the existing databases should not be modified. To counter that, you can use Git features and do a review on all changes, line by line, and commit only the changes that are actually needed.
**TODO: Clarify procedure. How to guess which changes are needed? Revert at least the changes to `install_dir/envs/mip-local/etc/ansible/host_vars/` or to file `localhost` in particular?**
## Adding clinical data
**TODO: This section needs to be checked, and properly documented. Only general information is available.**
Draft guidelines to add clinical data:
[//]: # (from meeting on January 9th, 2018; untested)
> - Create a clone of gitlab project https://github.com/HBPMedical/mip-cde-meta-db-setup.
> - Modify clm.patch.json so that it can modify the default variables.json file to add the relevant new variables.
> - Adapt first line of Docker file to select / define the version / rename the Docker image, from hbpmip/mip-cde-meta-db-setup to something else (?)
> - Create the docker image and push it to gitlab (?)
> - Once the MIP-Local configuration for the deployment exist, modify (line 20 of) the file
> envs/mip-local/etc/ansible/group_vars/reference to reference the right docker image
> - Run setup.sh so that the new docker image is run and copies the data in the meta-db database
> - Restart all services of the following building blocks from Marathon (scale them down to 0, then up again to 1)
> - web portal
> - woken
> - data factory
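For the last step above, restarting a service from Marathon amounts to scaling the corresponding application down to 0 instances and back up to 1. A minimal sketch using Marathon's REST API, assuming Marathon listens locally on its default port 8080 and `<app-id>` stands for the application to restart:
```sh
# Scale the application down to 0 instances...
curl -X PUT -H 'Content-Type: application/json' \
     -d '{"instances": 0}' http://localhost:8080/v2/apps/<app-id>
# ...then back up to 1 instance
curl -X PUT -H 'Content-Type: application/json' \
     -d '{"instances": 1}' http://localhost:8080/v2/apps/<app-id>
```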
## Cleanup MIP installation
Before attempting a second installation (for instance after updates have been delivered through your Linux distribution's package manager), you will need to follow the steps below to ensure a proper deployment.
Please be advised that these are drastic steps which will entirely remove several software packages and their configuration, as well as any and all data they might store.
### Ubuntu 16.04 LTS
1. Purge installed infrastructure:
```sh
$ sudo apt purge -y --allow-change-held-packages docker-ce marathon zookeeper mesos
```
2. Remove all remaining configuration as it will prevent proper installation:
```sh
$ sudo rm -rf /etc/marathon /etc/mip
$ sudo reboot
$ sudo rm -rf /etc/sysconfig/mesos-agent /etc/sysconfig/mesos-master /var/lib/mesos /var/lib/docker
$ sudo rm -rf /etc/systemd/system/marathon.service.d
$ sudo find /var /etc /usr -name \*marathon\* -delete
$ sudo find /etc /usr /var -name \*mesos\* -delete
$ sudo rm -rf /srv/docker/ldsmdb /srv/docker/research-db
```
------
**WARNING:**
Back up your data before executing the commands above. They will remove anything stored in the databases, as well as anything stored inside Docker images.
------
3. Reload the system initialisation scripts, and reboot:
```sh
$ sudo systemctl daemon-reload
$ sudo reboot
```
4. Manually pre-install the packages. As this requires specifying precise version numbers, this list will quickly become outdated:
```sh
$ sudo apt install -y --allow-downgrades --allow-change-held-packages docker-ce=17.09.0~ce-0~ubuntu
```
## Troubleshooting
[//]: # (from Slack)
> Zookeeper in an unstable state, cannot be restarted
>
> -> `./common/scripts/fix-mesos-cluster.sh --reset`, then `./setup.sh`
See documentation folder on Github for a few specific fixes.

View File

@@ -0,0 +1,10 @@
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>docker-swarm-manager</short>
<description>Firewall configuration for a Docker Swarm manager node.</description>
<port protocol="tcp" port="2377"/>
<port protocol="tcp" port="7946"/>
<port protocol="udp" port="7946"/>
<port protocol="udp" port="4789"/>
<port protocol="esp" port=""/>
</service>

View File

@@ -0,0 +1,9 @@
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>docker-swarm-worker</short>
<description>Firewall configuration for a Docker Swarm worker node.</description>
<port protocol="tcp" port="7946"/>
<port protocol="udp" port="7946"/>
<port protocol="udp" port="4789"/>
<port protocol="esp" port=""/>
</service>

134
README.md
View File

@@ -1,2 +1,132 @@
# mip-federation # MIP Federation deployment scripts and documentation
Scripts and documentation to deploy the MIP Federation
This repository contains all documentation regarding the MIP Federation, and scripts automating its deployment.
## Overview
The MIP Federation connects multiple MIP Local instances securely over the web, so that privacy-preserving analyses and queries on all the Federation nodes' data can be performed in a distributed manner from the Federation manager, using the Exareme software.
Complete documentation of the Federation can be found in <a href="">MIP Federation specifications</a>.
The steps to deploy the Federation are the following:
- Setup the manager node(s).
- Add the worker nodes.
- Add name labels to the nodes to allow proper assignment of the different services.
- Start "services", which are described in docker-compose.yml files: Exareme, Consul and Portainer.
In the following, we use only one master node. More can be added for improved availability.
## Deployment
### Requirements
MIP Local should be installed on the nodes that will join the MIP Federation. To join a node without MIP Local, see section [Adding a node without MIP Local](#adding-a-node-without-mip-local).
The Federation manager server must have a fixed IP address; other nodes must have a public IP, ideally also fixed. The firewall must allow connections on several ports: see details in <a href="">Firewall configuration</a>.
### Deploy the Federation
1. Create the manager node(s).
```sh
$ sudo ./setupFederationInfrastructure.sh
```
The output will include the command to add a node to the swarm.
2. On each worker node (a.k.a. node of the federation), run the swarm join command.
```sh
$ sudo docker swarm join --token <Swarm Token> <Master Node URL>
```
The command to execute on the worker node, including the `Swarm Token` and the `Master Node URL`, is provided when performing point 1. It can be obtained again at any time from the manager, with the following command:
```sh
$ sudo docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-11jmbp9n3rbwyw23m2q51h4jo4o1nus4oqxf3rk7s7lwf7b537-9xakyj8dxmvb0p3ffhpv5y6g3 10.2.1.1:2377
```
3. Add informative name labels for each worker node, on the swarm master.
```sh
$ sudo docker node update --label-add name=<Alias> <node hostname>
```
* `<node hostname>` can be found with `docker node ls`
* `<Alias>` will be used when bringing up the services and should be a short descriptive name.
4. Deploy the Federation service
```sh
$ sudo ./start.sh <Alias>
```
* `<Alias>` will be used when bringing up the services and should be a short descriptive name.
* If you set `SHOW_SETTINGS=true`, all the settings that will be used are printed before anything is done.
## Settings
All the settings have default values, but you can change them by either exporting the setting with its value in your shell, or by creating a `settings.local.sh` file in the same folder as `settings.sh` (see the example after the precedence list below):
```sh
: ${VARIABLE:="Your value"}
```
**Note**: To find the exhaustive list of parameters available please take a look at `settings.default.sh`.
**Note**: If the setting is specific to a node of the federation, you can do this in `settings.local.<Alias>.sh` where `<Alias>` is the short descriptive name given to a node.
Settings are taken in the following order of precedence:
1. Shell Environment, or on the command line
2. Node-specific settings `settings.local.<Alias>.sh`
3. Federation-specific `settings.local.sh`
4. Default settings `settings.default.sh`
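As an example, a hypothetical `settings.local.sh` overriding a couple of defaults could look like this (variable names taken from `settings.default.sh`, values purely illustrative):
```sh
# settings.local.sh -- federation-wide overrides (illustrative values)
: ${EXAREME_WORKERS_WAIT:="3"}   # wait for three Exareme workers
: ${PORTAINER_PORT:="9001"}      # publish Portainer on a different port
```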
## Adding a node without MIP Local
The following is required on all nodes. It is installed by default as part of the MIP, but can be installed manually when MIP Local is not present.
1. Install docker
```sh
$ sudo apt-get update
$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
```
2. Check the fingerprint: `9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88`
```sh
$ sudo apt-key fingerprint 0EBFCD88
pub 4096R/0EBFCD88 2017-02-22
Key fingerprint = 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
uid Docker Release (CE deb) <docker@docker.com>
sub 4096R/F273FCD8 2017-02-22
```
3. Add the Docker official repository
```sh
$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
```
4. Update the index and install docker:
```sh
$ sudo apt-get update
$ sudo apt-get install docker-ce
```
5. TODO: Run PostgresRAW and PostgresRAW-UI, create necessary tables / files, expose on correct ports.

View File

@@ -0,0 +1,76 @@
# Copyright (c) 2016-2017
# Data Intensive Applications and Systems Laboratory (DIAS)
# Ecole Polytechnique Federale de Lausanne
#
# All Rights Reserved.
#
# Permission to use, copy, modify and distribute this software and its
# documentation is hereby granted, provided that both the copyright notice
# and this permission notice appear in all copies of the software, derivative
# works or modified versions, and any portions thereof, and that both notices
# appear in supporting documentation.
#
# This code is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. THE AUTHORS AND ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
# DISCLAIM ANY LIABILITY OF ANY KIND FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE
# USE OF THIS SOFTWARE.
version: '3'
networks:
net-federation:
external:
name: mip-federation
services:
exareme-keystore:
image: ${CONSUL_IMAGE}:${CONSUL_VERSION}
command:
- -server
- -bootstrap
deploy:
restart_policy:
condition: on-failure
delay: 5s
max_attempts: 3
window: 120s
placement:
constraints:
- node.role == manager # Ensures we only start on manager nodes
- node.labels.name == ${FEDERATION_NODE}
networks:
- "net-federation" # Connect the docker container to the global network
exareme-manager:
image: ${EXAREME_IMAGE}:${EXAREME_VERSION}
environment:
- CONSULURL=${EXAREME_KEYSTORE}
- MASTER_FLAG=master
- NODE_NAME=${FEDERATION_NODE}
- EXA_WORKERS_WAIT=${EXAREME_WORKERS_WAIT} # Wait for N workers
- RAWUSERNAME=${LDSM_USERNAME}
- RAWPASSWORD=${LDSM_PASSWORD}
- RAWHOST=${LDSM_HOST}
- RAWPORT=${LDSM_PORT}
- RAWENDPOINT=${EXAREME_LDSM_ENDPOINT}
- RAWRESULTS=${EXAREME_LDSM_RESULTS}
- RAWDATAKEY=${EXAREME_LDSM_DATAKEY}
- MODE=${EXAREME_MODE}
depends_on:
- exareme-keystore
deploy:
restart_policy:
condition: on-failure
delay: 5s
max_attempts: 3
window: 120s
placement:
constraints:
- node.role == manager # Ensures we only start on manager nodes
- node.labels.name == ${FEDERATION_NODE}
ports:
- "9090:9090" # So that we can access the Exareme REST API / interface
networks:
- "net-federation" # Connect the docker container to the global network

52
docker-compose-worker.yml Normal file
View File

@@ -0,0 +1,52 @@
# Copyright (c) 2016-2017
# Data Intensive Applications and Systems Laboratory (DIAS)
# Ecole Polytechnique Federale de Lausanne
#
# All Rights Reserved.
#
# Permission to use, copy, modify and distribute this software and its
# documentation is hereby granted, provided that both the copyright notice
# and this permission notice appear in all copies of the software, derivative
# works or modified versions, and any portions thereof, and that both notices
# appear in supporting documentation.
#
# This code is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. THE AUTHORS AND ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
# DISCLAIM ANY LIABILITY OF ANY KIND FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE
# USE OF THIS SOFTWARE.
version: '3'
networks:
net-federation:
external:
name: mip-federation
services:
exareme:
image: ${EXAREME_IMAGE}:${EXAREME_VERSION}
environment:
- CONSULURL=${EXAREME_KEYSTORE}
- NODE_NAME=${FEDERATION_NODE}
- RAWUSERNAME=${LDSM_USERNAME}
- RAWPASSWORD=${LDSM_PASSWORD}
- RAWHOST=${LDSM_HOST}
- RAWPORT=${LDSM_PORT}
- RAWENDPOINT=${EXAREME_LDSM_ENDPOINT}
- RAWRESULTS=${EXAREME_LDSM_RESULTS}
- RAWDATAKEY=${EXAREME_LDSM_DATAKEY}
- MODE=${EXAREME_MODE}
deploy:
restart_policy:
condition: on-failure
delay: 5s
max_attempts: 3
window: 120s
placement:
constraints:
- node.role == worker # Ensures we only start on worker nodes
- node.labels.name == ${FEDERATION_NODE}
networks:
- "net-federation" # Connect the docker container to the global network

29
settings.default.sh Normal file
View File

@@ -0,0 +1,29 @@
: ${SHOW_SETTINGS:=false}
# Swarm Manager settings
: ${MASTER_IP:=$(wget http://ipinfo.io/ip -qO -)}
# Swarm Management Services
: ${PORTAINER_PORT:="9000"}
# Federation Services
: ${CONSUL_IMAGE:="progrium/consul"}
: ${CONSUL_VERSION:="latest"}
: ${EXAREME_IMAGE:="hbpmip/exareme_dataset"}
: ${EXAREME_VERSION:="demo2"}
: ${EXAREME_ROLE:=""} # The default value is set to the federation node role (worker or manager)
: ${EXAREME_KEYSTORE_PORT:="8500"}
: ${EXAREME_KEYSTORE:="exareme-keystore:${EXAREME_KEYSTORE_PORT}"}
: ${EXAREME_MODE:="global"}
: ${EXAREME_WORKERS_WAIT:="1"} # Wait for N workers
: ${EXAREME_LDSM_ENDPOINT:="query"}
: ${EXAREME_LDSM_RESULTS:="all"}
: ${EXAREME_LDSM_DATAKEY:="output"} # query used with output, query-start with data
# Exareme LDSM Settings
: ${LDSM_USERNAME:="federation"}
: ${LDSM_PASSWORD:="federation"}
: ${LDSM_HOST:=""} # The default value is set to the federation node
: ${LDSM_PORT:="31555"}
: ${FEDERATION_NODE:=""} # Invalid default value; this is a required argument of start.sh

30
settings.sh Normal file
View File

@@ -0,0 +1,30 @@
#Settings are taken in the following order of precedence:
# 1. Shell Environment, or on the command line
# 2. Node-specific settings `settings.local.<Alias>.sh`
if test ! -z "$1" && test -f ./settings.local.$1.sh;
then
. ./settings.local.$1.sh;
fi
# 3. Federation-specific `settings.local.sh`
if test -f ./settings.local.sh;
then
. ./settings.local.sh;
fi
# 4. Default settings `settings.default.sh`
if test -f ./settings.default.sh;
then
. ./settings.default.sh;
fi
if ${SHOW_SETTINGS};
then
echo "Current settings:"
for v in $(grep '^:' settings.default.sh|cut -c 5- |cut -d: -f1)
do
eval "echo $v=\$$v"
done
echo
fi

View File

@@ -0,0 +1,57 @@
#!/bin/sh
# Copyright (c) 2017-2017
# Data Intensive Applications and Systems Laboratory (DIAS)
# Ecole Polytechnique Federale de Lausanne
#
# All Rights Reserved.
#
# Permission to use, copy, modify and distribute this software and its
# documentation is hereby granted, provided that both the copyright notice
# and this permission notice appear in all copies of the software, derivative
# works or modified versions, and any portions thereof, and that both notices
# appear in supporting documentation.
#
# This code is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. THE AUTHORS AND ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
# DISCLAIM ANY LIABILITY OF ANY KIND FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE
# USE OF THIS SOFTWARE.
set -e
# Import settings
. ./settings.sh
# Master node/Manager
(
# Initialize swarm
docker swarm init --advertise-addr=${MASTER_IP}
)
# Portainer, a webUI for Docker Swarm
if true
then
(
portainer_data=/srv/portainer
test -d ${portainer_data} \
|| mkdir -p ${portainer_data} \
|| ( echo Failed to create ${portainer_data}; exit 1 )
docker service create \
--name portainer \
--publish ${PORTAINER_PORT}:9000 \
--constraint 'node.role == manager' \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
--mount type=bind,src=${portainer_data},dst=/data \
portainer/portainer \
-H unix:///var/run/docker.sock
)
fi
docker network create \
--driver=overlay \
--opt encrypted \
--subnet=10.20.30.0/24 \
--ip-range=10.20.30.0/24 \
--gateway=10.20.30.254 \
mip-federation

181
start.sh Executable file
View File

@@ -0,0 +1,181 @@
#!/usr/bin/env bash
# Copyright (c) 2016-2017
# Data Intensive Applications and Systems Laboratory (DIAS)
# Ecole Polytechnique Federale de Lausanne
#
# All Rights Reserved.
#
# Permission to use, copy, modify and distribute this software and its
# documentation is hereby granted, provided that both the copyright notice
# and this permission notice appear in all copies of the software, derivative
# works or modified versions, and any portions thereof, and that both notices
# appear in supporting documentation.
#
# This code is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. THE AUTHORS AND ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
# DISCLAIM ANY LIABILITY OF ANY KIND FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE
# USE OF THIS SOFTWARE.
. ./settings.sh
federation_nodes=""
federation_hosts=""
for h in $(docker node ls --format '{{ .Hostname }}')
do
federation_nodes="${federation_nodes} $(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})"
federation_hosts="${federation_hosts} ${h}"
done
usage() {
cat <<EOT
usage: $0 [-h|--help] (all|nodename [nodename ...])
-h, --help: show this message and exit
all: Start the federation on all the nodes currently known
nodename: one or more nodes on which to deploy the stack
You can use environment variables, or add them into settings.local.sh
to change the default values.
To see the full list, please refer to settings.default.sh
Please find below the list of known Federation nodes:
${federation_nodes}
Errors: This script will exit with the following error codes:
1 No arguments provided
2 Federation node is incorrect
EOT
}
start_node() {
(
FEDERATION_NODE=$1
LDSM_HOST=$2
EXAREME_ROLE=$3
# Export the settings to the docker-compose files
export FEDERATION_NODE
export LDSM_USERNAME LDSM_PASSWORD LDSM_HOST LDSM_PORT
export CONSUL_IMAGE CONSUL_VERSION
export EXAREME_IMAGE EXAREME_VERSION
export EXAREME_ROLE EXAREME_KEYSTORE EXAREME_MODE EXAREME_WORKERS_WAIT
export EXAREME_LDSM_ENDPOINT EXAREME_LDSM_RESULTS EXAREME_LDSM_DATAKEY
# Finally deploy the stack
docker stack deploy -c docker-compose-${EXAREME_ROLE}.yml ${FEDERATION_NODE}
)
}
start_nodes() {
# Make sure we start from empty lists
nodes="$*"
hosts=""
managers=""
workers=""
for n in ${nodes}
do
for h in ${federation_hosts}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
if [ "x${label}" == "x${n}" ];
then
hosts="${hosts} ${h}"
break 1
fi
done
done
# Sort the nodes based on their roles
for h in ${hosts}
do
if [ "manager" == "$(docker node inspect --format '{{ .Spec.Role }}' ${h})" ];
then
managers="${managers} ${h}"
else
workers="${workers} ${h}"
fi
done
# Start all the manager nodes
for h in ${managers}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
dbhost=$(docker node inspect --format '{{ .Status.Addr }}' ${h})
EXAREME_WORKERS_WAIT=$(echo "$workers" | wc -w)
start_node ${label} ${dbhost} manager
done
# Then start all the worker nodes
for h in ${workers}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
dbhost=$(docker node inspect --format '{{ .Status.Addr }}' ${h})
start_node ${label} ${dbhost} worker
done
}
start_all_nodes() {
start_nodes ${federation_nodes}
}
start_one_node() {
for h in ${federation_hosts}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
if [ "x${label}" == "x${FEDERATION_NODE}" ];
then
test -z "${LDSM_HOST}" && \
LDSM_HOST=$(docker node inspect --format '{{ .Status.Addr }}' ${h})
test -z "${EXAREME_ROLE}" && \
EXAREME_ROLE=$(docker node inspect --format '{{ .Spec.Role }}' ${h})
start_node ${label} ${LDSM_HOST} ${EXAREME_ROLE}
break
fi
done
}
if [ $# -lt 1 ];
then
usage
exit 1
fi
if [ $# -eq 1 ];
then
case $1 in
-h|--help)
usage
exit 0
;;
*)
FEDERATION_NODE="$1"
;;
esac
if [ -z "${FEDERATION_NODE}" ]; then
echo "Invalid federation node name"
usage
exit 3
fi
case ${FEDERATION_NODE} in
all)
start_all_nodes
;;
*)
start_one_node ${FEDERATION_NODE}
;;
esac
else
start_nodes $*
fi
exit 0

163
stop.sh Executable file
View File

@@ -0,0 +1,163 @@
#!/usr/bin/env bash
# Copyright (c) 2016-2017
# Data Intensive Applications and Systems Labaratory (DIAS)
# Ecole Polytechnique Federale de Lausanne
#
# All Rights Reserved.
#
# Permission to use, copy, modify and distribute this software and its
# documentation is hereby granted, provided that both the copyright notice
# and this permission notice appear in all copies of the software, derivative
# works or modified versions, and any portions thereof, and that both notices
# appear in supporting documentation.
#
# This code is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. THE AUTHORS AND ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
# DISCLAIM ANY LIABILITY OF ANY KIND FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE
# USE OF THIS SOFTWARE.
. ./settings.sh
federation_nodes=""
federation_hosts=""
for h in $(docker node ls --format '{{ .Hostname }}')
do
federation_nodes="${federation_nodes} $(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})"
federation_hosts="${federation_hosts} ${h}"
done
usage() {
cat <<EOT
usage: $0 [-h|--help] (all|nodename [nodename ...])
-h, --help: show this message and exit
all: Stops the federation on all the nodes currently known
nodename: one or more nodes on which to stop the stack
You can use environment variables, or add them into settings.local.sh
to change the default values.
To see the full list, please refer to settings.default.sh
Please find below the list of known Federation nodes:
${federation_nodes}
Errors: This script will exit with the following error codes:
1 No arguments provided
2 Federation node is incorrect
EOT
}
stop_node() {
(
FEDERATION_NODE=$1
# Export the settings to the docker-compose files
export FEDERATION_NODE
# Finally stop the stack
docker stack rm ${FEDERATION_NODE}
)
}
stop_nodes() {
# Make sure we start from empty lists
nodes="$*"
hosts=""
managers=""
workers=""
for n in ${nodes}
do
for h in ${federation_hosts}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
if [ "x${label}" == "x${n}" ];
then
hosts="${hosts} ${h}"
break 1
fi
done
done
# Sort the nodes based on their roles
for h in ${hosts}
do
if [ "manager" == "$(docker node inspect --format '{{ .Spec.Role }}' ${h})" ];
then
managers="${managers} ${h}"
else
workers="${workers} ${h}"
fi
done
# Stop all the worker nodes
for h in ${workers}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
stop_node ${label}
done
# Then stop all the manager nodes
for h in ${managers}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
stop_node ${label}
done
}
stop_all_nodes() {
stop_nodes ${federation_nodes}
}
stop_one_node() {
for h in ${federation_hosts}
do
label=$(docker node inspect --format '{{ .Spec.Labels.name }}' ${h})
if [ "x${label}" == "x${FEDERATION_NODE}" ];
then
stop_node ${label}
break
fi
done
}
if [ $# -lt 1 ];
then
usage
exit 1
fi
if [ $# -eq 1 ];
then
case $1 in
-h|--help)
usage
exit 0
;;
*)
FEDERATION_NODE="$1"
;;
esac
if [ -z "${FEDERATION_NODE}" ]; then
echo "Invalid federation node name"
usage
exit 3
fi
case ${FEDERATION_NODE} in
all)
stop_all_nodes
;;
*)
stop_one_node ${FEDERATION_NODE}
;;
esac
else
stop_nodes $*
fi
exit 0