
Hetzner Cloud Environment

Project Structure

.
|-- Makefile         # Wrapper to simplify interaction
|-- config.json      # Read by Make, Terraform, Packer
|-- deploymentagent
|-- infrastructure   # Terraform modules
|   |-- compute      # Loads the compute module
|   |-- environment  # Loads the environment module and provides outputs
|   |-- ingress      # Loads the ingress module and provides outputs
|   |-- storage      # Loads the storage module
|   `-- modules      # Contains the code for all the modules
|-- nixos            # NixOS image builder with Packer
|-- secrets.json     # Read by Make, Terraform, Packer
`-- vault            # Policy examples

Overview

Tools and Dependencies

Configuration

Configuring a Hetzner Cloud Project

Login: https://accounts.hetzner.com/login

Visit the projects tab to either create a new project or to pick an existing one. A project will contain resources (servers, snapshots, load balancers, volumes, ..) as well as a security service to manage API tokens and TLS certificates (which can be used with load balancers). Check the links below to see which resources are available and how to use them.

To build and provision resources with Packer and Terraform, an API token is required, which can be created in the Security tab.

Hetzner Cloud Limitations

Floating IPs: Persistent (floating) IP addresses can currently only be assigned to cloud servers. This means that when you delete a load balancer, you also lose the public IP you have been using for the services behind it. You will probably not delete load balancers in the production environment, but in staging and testing environments, load balancers can be scaled up and down via the Hetzner Cloud web UI or their API/Terraform if you want to save some money. There appear to be plans to add support for load balancers with floating IPs.

Certificates: Certificates stored within the security service on Hetzner Cloud cannot be updated, only replaced. Before a certificate can be deleted, it must be dereferenced from the services which were set up on load balancers. For this reason, Certbot needs to be wrapped by a script which takes care of certificate replacement (see infrastructure/modules/compute/certbot.sh). Unfortunately, Hetzner does not keep a public roadmap, but there seem to be plans to add support for Let's Encrypt directly to cloud load balancers as well.

config.json

The config.json and secrets.json files are read by Make, Packer and Terraform. This way, all settings and secrets that differ between environments are stored in a central place, and the HCL files used by Packer and Terraform only need to be touched when the infrastructure itself is "refactored". Due to some technical limitations in Terraform, it can be tricky to track state with backends in different environments. To avoid solutions involving templates or third-party tools such as Terragrunt, a simple wrapper has been included in the Makefile which sets up backends automatically for different environments.
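
To illustrate what such a wrapper has to do per module and environment, here is a minimal Python sketch. It is not the code from the Makefile. It assumes the state is kept in a GitLab-managed Terraform HTTP backend on gitlab.com (suggested by the terraform_gitlab_backend_* keys in secrets.json) and that each state is named "<environment>-<module>"; both the naming scheme and the function name backend_init are hypothetical.

```python
from urllib.parse import quote

GITLAB_API = "https://gitlab.com/api/v4"  # assumption: state is hosted on gitlab.com


def backend_init(module: str, environment: str, secrets: dict):
    """Return (argv, env) for a `terraform init` run against an HTTP backend.

    Assumption: one state per environment/module pair, named
    "<environment>-<module>". The real Makefile may use another scheme.
    """
    # The project may be a numeric ID or a path; a path has to be URL-encoded.
    project = quote(str(secrets["terraform_gitlab_backend_project"]), safe="")
    address = f"{GITLAB_API}/projects/{project}/terraform/state/{environment}-{module}"
    settings = {
        "address": address,
        "lock_address": f"{address}/lock",
        "unlock_address": f"{address}/lock",
        "lock_method": "POST",
        "unlock_method": "DELETE",
    }
    argv = ["terraform", "init", "-reconfigure"]
    argv += [f"-backend-config={key}={value}" for key, value in settings.items()]
    # Credentials go into the environment so they do not show up in the process list.
    env = {
        "TF_HTTP_USERNAME": secrets["terraform_gitlab_backend_username"],
        "TF_HTTP_PASSWORD": secrets["terraform_gitlab_backend_password"],
    }
    return argv, env


# Hypothetical values, for illustration only.
example_secrets = {
    "terraform_gitlab_backend_project": "group/project",
    "terraform_gitlab_backend_username": "user",
    "terraform_gitlab_backend_password": "secret",
}
argv, env = backend_init("compute", "production", example_secrets)
print(" ".join(argv))
```

The -reconfigure flag tells Terraform to ignore a previously initialized backend, which matches the need to re-initialize after switching environments.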

Secrets

secrets.json (with git-secret)

To decrypt the secrets.json file, run the following command in your shell:

git secret reveal

Gitlab

Secrets, such as the SSH key pair for the default system user, are stored (for now) in the Variables section of the Gitlab CI/CD settings page of this Git project.

https://gitlab.com/infektweb/glv5/hetzner-cloud-environment/-/settings/repository/#js-deploy-tokens

id_rsa_operator_pub is baked into the image generated by Packer (see nixos/nix/system.nix)

NixOS

Building NixOS Images (Snapshots) with Packer

The nixos target in the Makefile wraps the execution of Packer to build a NixOS image from the default Ubuntu 20.04 image provided by Hetzner Cloud. Two arguments may be supplied: VERSION= to specify the desired NixOS release (see NixOS Release Notes) and BUILD= to track versions of the images that have been created.

Example:

$ make nixos VERSION=20.09 BUILD=1.0.0

After a successful build, Packer displays the ID of the created snapshot on the very last line of its output. When provisioning servers via Terraform, the image ID to use is read from the nixos_snapshot_id key in the config.json file. In case you missed the ID in the build output, you can query the Hetzner Cloud API like this to retrieve a list of created snapshots.

$ curl -H "Authorization: Bearer $HCLOUD_TOKEN" 'https://api.hetzner.cloud/v1/images' | jq '.images[] | select(.type == "snapshot")'
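
If jq is not at hand, the same filter can be sketched in Python. The function below takes the already parsed JSON response of the images endpoint and keeps only entries of type "snapshot", newest first. The function name and the sample data are made up for illustration; the fields id, type, description and created are assumed to be present as in the Hetzner Cloud API response.

```python
def snapshots(api_response: dict) -> list:
    """Return snapshot images from a parsed GET /v1/images response, newest first."""
    found = [
        image
        for image in api_response.get("images", [])
        if image.get("type") == "snapshot"
    ]
    # Timestamps are ISO 8601 strings, so sorting them as text orders them by time
    # (assuming they all use the same UTC offset).
    return sorted(found, key=lambda image: image.get("created", ""), reverse=True)


# Hypothetical response, shortened to the fields used above.
sample = {
    "images": [
        {"id": 1, "type": "system", "description": "Ubuntu 20.04", "created": "2020-04-23T10:00:00+00:00"},
        {"id": 2, "type": "snapshot", "description": "nixos-20.09-1.0.0", "created": "2021-01-10T08:00:00+00:00"},
        {"id": 3, "type": "snapshot", "description": "nixos-20.09-1.0.1", "created": "2021-01-18T08:00:00+00:00"},
    ]
}
for image in snapshots(sample):
    print(image["id"], image["description"])
```

The ID of the first entry is the one you would typically put into nixos_snapshot_id.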

It makes sense to use the same NixOS image across all environments (testing, staging, production, ..).

Infrastructure

Working with Terraform

Have a look at the Terraform documentation. To learn more about its configuration language HCL, see

  • Resources
  • Variables and Outputs
  • Functions
  • State

Refer to the Provider documentation to see how to manage resources with Terraform on Hetzner Cloud.

Provisioning Infrastructure

Modules Overview

Rough overview of resources and outputs across the four modules.

environment
  - hcloud_network
  - hcloud_network_subnet
  - outputs
    - dc_default_id (identifier of the datacenter in Nuremberg)
    - environment_name (name of the environment, read from config.json)
    - network_primary_id
    - network_subnet_a_id
ingress
  - hcloud_load_balancer
  - hcloud_load_balancer_network (attach to network/subnet configured in environment module)
  - hcloud_load_balancer_service
  - hcloud_load_balancer_target (servers are implicitly assigned to load balancers via their labels)
storage
  - hcloud_volume
  - outputs
    - volume_data1_id
compute
  - hcloud_server
  - hcloud_server_network (attach servers to networks/subnets configured in environment module)
  - hcloud_volume_attachment (directly attach volumes created in the storage module to servers)

Initializing State Backends for Each Module

You will need to (re-)initialize the state backend each time you change environments via config.json (see later sections).

$ make infra-init-backends MODULES="compute" # one module
$ make infra-init-backends MODULES="compute ingress" # multiple modules
$ make infra-init-all-backends # all modules

Applying Modules

You will need to manually confirm with 'yes' before the changes are applied.

$ make infra-apply MODULE=compute

Destroying Modules

$ make infra-destroy MODULE=compute

Operations Guide

Data

Ephemeral Data

  • /opt/
  • /etc/nixos

Persistent Data

/mnt/data

Setting Up a New Environment

The following sections assume the environment to be called 'production'.

Configure Environment in config.json and secrets.json

Set the environment name and the desired NixOS image/snapshot ID in config.json.

config.json:

{
tbd
}

Use your personal Gitlab deployment token and Hetzner Cloud tokens.

secrets.json:

{
    "terraform_gitlab_backend_username": "",
    "terraform_gitlab_backend_password": "",
    "terraform_gitlab_backend_project": "",
    "gitlab_deploy_token_username": "",
    "gitlab_deploy_token_password": "",
    "aws_access_key_id": "",
    "aws_secret_access_key": "",
    "hcloud_token_testing": "",
    "hcloud_token_production": "",
    "vault_db_password_production": ""
}
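
Since most JSON parsers silently keep only the last value when a key appears twice, it can be worth validating secrets.json before using it. The following Python sketch rejects repeated keys and looks up the Hetzner Cloud token for a given environment via the hcloud_token_<environment> naming visible above. The function names are hypothetical, and this is not how Make, Packer or Terraform read the file.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook for json.loads that refuses repeated keys."""
    keys = [key for key, _ in pairs]
    duplicates = sorted({key for key in keys if keys.count(key) > 1})
    if duplicates:
        raise ValueError("duplicate keys: " + ", ".join(duplicates))
    return dict(pairs)


def load_secrets(text: str) -> dict:
    """Parse the contents of secrets.json, failing on duplicate keys."""
    return json.loads(text, object_pairs_hook=reject_duplicates)


def hcloud_token(secrets: dict, environment: str) -> str:
    """Return the Hetzner Cloud token for an environment, e.g. 'production'."""
    key = f"hcloud_token_{environment}"
    token = secrets.get(key, "")
    if not token:
        raise KeyError(f"{key} is missing or empty")
    return token


# Hypothetical content, for illustration only.
secrets = load_secrets('{"hcloud_token_testing": "", "hcloud_token_production": "example-token"}')
print(hcloud_token(secrets, "production"))
```

Failing early on an empty token avoids confusing authentication errors later in Packer or Terraform runs.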

Provisioning Infrastructure with Terraform

Just to be sure, re-initialize all the Terraform state backends for the desired environment.

$ make infra-init-all-backends

Roll out all the resources by applying each Terraform module. The environment module must be applied first, the compute module last.

$ make infra-apply MODULE=environment
$ make infra-apply MODULE=ingress
$ make infra-apply MODULE=storage
$ make infra-apply MODULE=compute

Take note of the public IP from the load balancer (used to access your services) and the server (used to manage the NixOS system) in the Hetzner Cloud web UI or via their API:

$ curl -H "Authorization: Bearer $HCLOUD_TOKEN" 'https://api.hetzner.cloud/v1/servers?label_selector=environment==production' | jq '.servers[].public_net'
$ curl -H "Authorization: Bearer $HCLOUD_TOKEN" 'https://api.hetzner.cloud/v1/load_balancers?label_selector=environment==production' | jq '.load_balancers[].public_net'
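
The public_net objects returned by these calls contain more than the address. As a rough sketch, the Python function below reduces a parsed response to a mapping of resource name to public IPv4 address. It assumes the address is found under public_net.ipv4.ip for both servers and load balancers; the function name and the sample data (using documentation-only IP addresses) are hypothetical.

```python
def public_ipv4s(api_response: dict, kind: str) -> dict:
    """Map resource names to public IPv4 addresses.

    kind is the top-level key of the response: "servers" or "load_balancers".
    Resources without a public IPv4 address map to None.
    """
    result = {}
    for resource in api_response.get(kind, []):
        ipv4 = (resource.get("public_net") or {}).get("ipv4") or {}
        result[resource["name"]] = ipv4.get("ip")
    return result


# Hypothetical responses, shortened to the fields used above.
servers = {"servers": [{"name": "app1", "public_net": {"ipv4": {"ip": "203.0.113.10"}}}]}
balancers = {"load_balancers": [{"name": "lb1", "public_net": {"ipv4": {"ip": "203.0.113.20"}}}]}

print(public_ipv4s(servers, "servers"))
print(public_ipv4s(balancers, "load_balancers"))
```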

You can now connect to the newly created server as user 'operator', using the default key pair stored on Gitlab.

$ ssh operator@168.119.230.44

Changing Passwords of System Users

As a first step you should change the passwords of the root and operator users.

$ sudo -i
$ passwd
$ passwd operator

Configuring Certbot

In case you have an existing configuration for Certbot, you can simply copy it to /mnt/data/letsencrypt. Otherwise, you can set up a new configuration either locally or directly on the server itself.

$ export AWS_ACCESS_KEY_ID="..."
$ export AWS_SECRET_ACCESS_KEY="..."
$ export LETSENCRYPT_DIR=/mnt/data/letsencrypt
$ export DOMAIN_NAME="..." # assumed to be the value of domain_name_production in config.json
$ export domains="..." # list of domain_name_production and domain_alternative_names_production in config.json, each one prefixed with the `-d` flag
$ certbot certonly --dry-run --non-interactive --agree-tos -m webmaster@"$DOMAIN_NAME" --work-dir "$LETSENCRYPT_DIR"/lib --logs-dir "$LETSENCRYPT_DIR"/log --config-dir "$LETSENCRYPT_DIR"/etc --dns-route53 --preferred-challenges dns $domains

At this point you should test whether the configuration works. The --dry-run flag, as supplied in the command above, prevents Certbot from actually creating or renewing the certificate.
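
The value of the domains variable can be derived from config.json instead of being typed by hand. The Python sketch below builds the list of -d flags from the domain_name_<environment> and domain_alternative_names_<environment> keys. Whether the alternative names are stored as a JSON list or as a single string is not known here, so the function accepts both; that, the function name and the sample values are assumptions.

```python
import shlex


def domain_flags(config: dict, environment: str) -> list:
    """Return Certbot domain arguments, e.g. ['-d', 'example.org', '-d', 'www.example.org']."""
    primary = config[f"domain_name_{environment}"]
    alternatives = config.get(f"domain_alternative_names_{environment}", [])
    # Assumption: a JSON list, or a string separated by spaces and/or commas.
    if isinstance(alternatives, str):
        alternatives = alternatives.replace(",", " ").split()
    flags = []
    for name in [primary, *alternatives]:
        flags += ["-d", name]
    return flags


# Hypothetical config.json content, for illustration only.
config = {
    "domain_name_production": "example.org",
    "domain_alternative_names_production": ["www.example.org", "api.example.org"],
}
print(shlex.join(domain_flags(config, "production")))
```

The primary domain is listed first because Certbot names the certificate lineage after the first domain it is given.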

To find out which IAM permissions Certbot needs on Amazon Route53, refer to the Certbot documentation.

Now that the configuration for Certbot is available, rebuild the NixOS system and deploy the certificates to the load balancers.

$ systemctl start nixos-rebuild
$ systemctl start hetzner-certbot
$ journalctl -u hetzner-certbot

Configuring Vault

Creating the Database

Log in as the postgres user and execute the following SQL commands.

CREATE DATABASE vault;

CREATE USER vault WITH ENCRYPTED PASSWORD 'change to value of vault_db_password_$ENVIRONMENT';

GRANT ALL PRIVILEGES ON DATABASE vault TO vault;
\c vault
CREATE TABLE vault_kv_store (
  parent_path TEXT COLLATE "C" NOT NULL,
  path        TEXT COLLATE "C",
  key         TEXT COLLATE "C",
  value       BYTEA,
  CONSTRAINT pkey PRIMARY KEY (path, key)
);
CREATE INDEX parent_path_idx ON vault_kv_store (parent_path);
GRANT ALL PRIVILEGES ON TABLE vault_kv_store TO vault;

Be sure to replace the password with the value which is set for vault_db_password_production in secrets.json.

$ sudo -i
$ su -l postgres
$ psql
[.. SQL commands ..]
$ exit

Afterwards, restart Vault.

$ systemctl restart vault
$ systemctl status vault

Initializing Vault

You can now access Vault on port 9443 via any hostname behind the load balancer, for example https://guidelines.ch:9443/. As a first step, you will need to create a master key (set), which is used to unseal Vault on each startup. To use just one master key, initialize Vault with "Key shares" and "Key threshold" both set to "1". The "initial root token" is used to authenticate as an administrator with the Vault API or web UI. The "key" is used to unseal Vault upon startup.

You can now set up the key-value based secret engine which is supported by the settings package. Be sure to use V2 of the KV engine. See the Vault documentation.

To unseal Vault manually, you can either use curl, the Vault CLI, or use the prompt on the web UI.

$ curl -XPUT http://127.0.0.1:8200/v1/sys/unseal -d '{"key": "master key"}'
$ vault operator unseal
Key (will be hidden):

Unseal Vault Automatically on Startup

You can manually write the created master key to /mnt/data/vault-root-token. If this file exists and contains a valid master key, Vault will be unsealed automatically.
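
How the automatic unsealing is implemented on the server is not shown in this document. As a rough Python sketch of the idea, the function below reads the key file and, if it exists and is not empty, prepares the same PUT request to /v1/sys/unseal as the curl command shown earlier. The function name and the default address are assumptions; whether the key is valid can only be told from Vault's response.

```python
import json
import urllib.request
from pathlib import Path


def unseal_request(key_file: str, addr: str = "http://127.0.0.1:8200"):
    """Build the unseal request, or return None if there is no usable key file."""
    path = Path(key_file)
    if not path.is_file():
        return None
    key = path.read_text().strip()
    if not key:
        return None
    body = json.dumps({"key": key}).encode("utf-8")
    return urllib.request.Request(
        f"{addr}/v1/sys/unseal",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = unseal_request("/mnt/data/vault-root-token")
if request is None:
    print("no key file found, Vault has to be unsealed manually")
else:
    # Sending is left out on purpose; urllib.request.urlopen(request) would do it.
    print("would send", request.get_method(), request.full_url)
```

Keep in mind that storing the unseal key next to the data trades security for convenience, so the file should be readable by as few users as possible.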

Configuring Elasticsearch

This is going to feel a bit hacky, but we need to provision the default set of built-in Elasticsearch users, and the easiest way is to use x-pack. Since we use a non-standard path for the Elasticsearch "home", we need to copy some files to be able to use the elasticsearch-setup-passwords command.

$ export ES_HOME=/mnt/data/elasticsearch/ # currently missing x-pack commands
$ find / -type d -name "jre"
[..]
/nix/store/g67sykn6hfjmgxhvr6cqv5c7v19d6490-openjdk-headless-8u272-b10-jre/lib/openjdk/jre
$ export JAVA_HOME=/nix/store/g67sykn6hfjmgxhvr6cqv5c7v19d6490-openjdk-headless-8u272-b10-jre/lib/openjdk/jre
$ find / -type f -name "elasticsearch-setup-passwords"
[..]
/nix/store/j5s9sb7r2hbkq16afm87rjssic3czrqx-elasticsearch-7.5.1/bin/elasticsearch-setup-passwords
$ cp /nix/store/j5s9sb7r2hbkq16afm87rjssic3czrqx-elasticsearch-7.5.1/bin/x-pack-* /mnt/data/elasticsearch/bin/
$ cp /nix/store/j5s9sb7r2hbkq16afm87rjssic3czrqx-elasticsearch-7.5.1/bin/elasticsearch-setup-passwords /mnt/data/elasticsearch/bin/
$ /mnt/data/elasticsearch/bin/elasticsearch-setup-passwords interactive

Maybe there are better ways to do this using nix-shell. If you prefer the passwords to be generated for you, use the argument auto instead of interactive.

Credentials for Kibana

If you would like to use Kibana (recommended), add the password you set for the 'kibana' user to /mnt/data/kibana-elasticsearch-password (mode 600) and rebuild NixOS with systemctl start nixos-rebuild. Kibana can be accessed on port 8443 via any hostname behind the load balancer, for example https://guidelines.ch:8443/ (sign in with the 'elastic' user).

Configuring Guidelines

CREATE DATABASE guidelines;
CREATE USER guidelines WITH ENCRYPTED PASSWORD 'changeme';
GRANT ALL PRIVILEGES ON DATABASE guidelines TO guidelines;