Homelab: Ansible

Previously, I wrote about the process by which I bootstrapped Arch Linux onto the nodes of my Homelab. Just running Linux is hardly interesting - millions of computers do that every day. The real trick is going from bare Linux installs to deployed services, which requires managing the configuration and state of the servers. In this post, I’m going to introduce some basic concepts from Ansible, and point out some things I wish I’d known when starting off.

At work at Twitter, we leverage Puppet as our tool of choice for configuration management. Puppet is a declarative templating and configuration management language built atop Ruby, one which has seen meaningful improvements in recent releases with the addition of types and other features. And it works great!

Puppet’s execution model - at least as we’ve deployed it - is that users author modules, which declare dependencies (ordered or unordered) on other modules, and declare the intended state of files and packages. This enables Puppet to perform all the usual tree analyses to construct execution plans, and to do only the required work. Since Puppet is declarative, you want to use it only to describe the intended end state of your machines, not mechanically what you want to happen. The entire point of Puppet is that it takes care of that.

Puppet features an agent, which allows hosts in a deployment to self-configure by periodically refreshing their configuration. At scale (as at Twitter) this is a really fantastic tool, because it lets you rely on eventually-applied updates across your fleet.

All this however is work tooling for me, and what’s the fun of doing the same thing at home?

Enter Ansible. Ansible, like Puppet, is a configuration management and templating solution. Unlike Puppet, however, Ansible provides a much more traditional imperative model. Ansible is based on scripts - called playbooks - composed of a list of tasks which execute sequentially.

Ansible’s real strength is that it gives you a way to execute host management actions over SSH, without installing any management system on the remote host. This makes it incredibly easy to automate bootstrapping a system with Ansible, because it imposes almost no dependencies, and management directives are pushed to remote servers rather than pulled.
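
In fact, you don’t even need a playbook to poke at hosts - Ansible’s ad-hoc mode runs a single module straight from the command line. A minimal sketch, assuming an inventory file named hosts like the one we’ll build below:

$ ansible -i hosts all -m ping

Despite the name, the ping module doesn’t send ICMP pings - it SSHes to each host and checks that Ansible can execute Python there, which makes it a handy smoke test for connectivity and credentials.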

Okay, enough kidding around - let’s get to it.

An example

A really simple example of a playbook (call it uptime.yml) could be something like this -

---
- hosts:
    - localhost

  tasks:
    - name: Get host uptime
      command: uptime
      register: uptime

    - name: Print host uptime locally
      debug:
        var: uptime

This is a playbook which runs two tasks against one host - localhost. Ansible executes tasks by SSHing to the specified host(s) and using a shell there to run small generated Python programs. For the most part, each task is implemented by a single module, and each task in a playbook runs sequentially as its own subprocess over the Ansible connection.

This playbook instructs Ansible to connect to your current machine, run a shell command, capture the result of that action as a variable in the Ansible execution, and then print that variable to the console you’re running Ansible from.

We can run this playbook using the ansible-playbook command.

$ ansible-playbook uptime.yml

PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [Get host uptime] *********************************************************
changed: [localhost]

TASK [Print host uptime locally] ***********************************************
ok: [localhost] => {
    "uptime": {
        "changed": true,
        "cmd": [
            "uptime"
        ],
        "delta": "0:00:00.002846",
        "end": "2019-06-23 10:33:11.562630",
        "failed": false,
        "rc": 0,
        "start": "2019-06-23 10:33:11.559784",
        "stderr": "",
        "stderr_lines": [],
        "stdout": " 10:33:11 up  1:11,  2 users,  load average: 0.82, 0.89, 0.91",
        "stdout_lines": [
            " 10:33:11 up  1:11,  2 users,  load average: 0.82, 0.89, 0.91"
        ]
    }
}

PLAY RECAP *********************************************************************
localhost: ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Because Ansible doesn’t depend on a pull-based remote agent, it’s incredibly easy to leverage Ansible for configuration, no matter how much of the system you have stood up yet.

My Ansible experimentation began with not much more than this, and a list of the IP addresses currently assigned to ethos, logos and pathos.

In order to work with my three hosts, there were two last things I had to set up: inventory and host variables.

Inventory

In the example playbook above, I used only one host out of Ansible’s inventory - localhost. Ansible has localhost built in, and doesn’t need any help to run playbooks locally. In order to work with my three other boxes however, I had to tell Ansible what they were and how to reach them.

That’s what Ansible’s inventory system is for. Inventory is how Ansible gets lists of hosts to work on, and maps hosts to groups. For instance, I call my three boxes the apartment_modes (being named after three of the four classical modes of persuasion), and a simple hosts inventory file could be something like this:

hosts

---
apartment_modes:
  vars:
    ansible_user: root
  hosts:
    pathos.apartment.arrdem.com:
      ansible_host: 10.0.0.64
    ethos.apartment.arrdem.com:
      ansible_host: 10.0.0.65
    logos.apartment.arrdem.com:
      ansible_host: 10.0.0.66

If we put this text in the hosts file and use the ansible-inventory command, we can see what Ansible really makes of it.

$ ansible-inventory -i hosts --list
{
  "_meta": {
    "hostvars": {
      "ethos.apartment.arrdem.com": {
        "ansible_host": "10.0.0.65",
        "ansible_user": "root"
      },
      "logos.apartment.arrdem.com": {
        "ansible_host": "10.0.0.66",
        "ansible_user": "root"
      },
      "pathos.apartment.arrdem.com": {
        "ansible_host": "10.0.0.64",
        "ansible_user": "root"
      }
    }
  },
  "all": {
    "children": [
      "apartment_modes",
      "ungrouped"
    ]
  },
  "apartment_modes": {
    "hosts": [
      "ethos.apartment.arrdem.com",
      "logos.apartment.arrdem.com",
      "pathos.apartment.arrdem.com"
    ]
  }
}

This file has done two things. We can see that the group apartment_modes exists, and lists my current three hosts as children. We can also see that in the _meta record of the inventory, each host has an ansible_host - the IP address where the host can be found. Without this data (or DNS records pointing to the hosts you want to act on), Ansible won’t do anything. And we haven’t gotten to building out DNS yet.

Also note that while the ansible_user var was provided on the apartment_modes group, all the members of the group inherited it. This particular variable tells Ansible what user to SSH as, but the group inheritance pattern is usable for a lot more.

Some other interesting things you can do with inventories - groups can contain other groups as children (see the generated all group). Groups inherit their children’s hosts recursively, and you can do some fun stuff with that. It also makes working with host groups easy, because you don’t have to keep writing out all the child hosts all the time.
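
For instance, a hypothetical parent group (home is my name for illustration, not part of my real inventory) could pick up every apartment_modes host just by claiming the group as a child:

---
home:
  children:
    apartment_modes:

A playbook targeting home would then run against ethos, logos and pathos, and any vars set on home would flow down to all of them.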

Unfortunately the upstream docs don’t really make any of this clear - which was an initial source of great frustration to me.

host_vars, group_vars

You don’t have to put all your host-specific configuration in inventory, although you certainly could. There’s a good case to be made that inventory should stay declarative, and that configuration should live on groups except where host-specific values are truly unavoidable.

Ansible gives you the option of factoring your host- and group-specific vars out into eponymous directories - host_vars and group_vars (see the upstream docs). Each directory may contain YAML files, named for the host or group they provide vars for (with or without a .yml extension). My actual inventory relies on host_vars to provide my source of truth for IP addresses and some other data.
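
As a sketch of what that looks like on disk (the contents here are illustrative, not my actual config), a host_vars file for pathos might carry its address assignment:

host_vars/pathos.apartment.arrdem.com

---
ansible_host: 10.0.0.64

With files like that in place, the inventory file itself shrinks back down to a bare declaration of which hosts exist and which groups they belong to.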

Roles

On which note, Ansible also lets you define roles. Roles are reusable fragments of a playbook, which can be parameterized and applied to hosts using a play’s roles keyword (or the include_role module).

One of the first roles I wrote was static-ip, which configures dhcpcd to use a static IP assignment. This let me ensure that all my hosts would be running on the IP addresses I had configured for them in my inventory.
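
Roles follow a conventional directory layout which Ansible discovers by name. For static-ip that layout looks roughly like this (the handlers file is implied by the notify directive below):

roles/
  static-ip/
    tasks/main.yml            # the tasks the role runs
    templates/dhcpcd.conf.j2  # templates the tasks reference
    handlers/main.yml         # handlers referenced by notify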

The role itself is pretty trivial -

roles/static-ip/tasks/main.yml

# All my nodes run on static address assignments,
# combined with address reservations on the router side.
# This task lays down the config(s) required to achieve this.
---
- name: Deploy dhcpcd config
  template:
    src: dhcpcd.conf.j2
    dest: /etc/dhcpcd.conf
  notify:
    - restart dhcpcd

and a corresponding template -

roles/static-ip/templates/dhcpcd.conf.j2

# A sample configuration for dhcpcd.
# See dhcpcd.conf(5) for details.

...

# Use static address assignments
interface {{ ansible_default_ipv4.interface }}
static routers=10.0.0.1
static ip_address={{ ansible_host }}/32
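
The notify directive in the task file above refers to a handler - a task which runs once at the end of the play, and only if some task actually reported a change. A minimal sketch of the handler this role needs:

roles/static-ip/handlers/main.yml

---
- name: restart dhcpcd
  service:
    name: dhcpcd
    state: restarted

This way dhcpcd only gets bounced when the config file really changed.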

And to use the role, all you’d have to do is say (in your playbook of choice) -

---
- hosts:
    - apartment_modes
  roles:
    - role: static-ip

You can customize role application by passing extra vars to the role and so forth, but in this case I want the role to pull from host-specific configuration so that it’s enforced as a source of truth.
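
For a role that does take parameters, that customization looks something like this (static_routers is a hypothetical parameter here, not one static-ip actually takes):

---
- hosts:
    - apartment_modes
  roles:
    - role: static-ip
      vars:
        static_routers: 10.0.0.1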

Modules

Because playbooks, included roles and other fragments execute sequentially, it’s easy to cobble them together just by transcribing the commands you’d otherwise run by hand. Building up playbooks the way you’d build up a Bash script works, but it leaves the declarative (or at least idempotent) properties Ansible’s modules can provide on the table.

The file module for instance lets you declare what the state of a file should be - whether it should be present, who the owning user and group should be, and so forth. The copy and template modules are also fantastic, letting you copy files or render templates onto remote machines. There’s also the package module, which abstracts over the platform’s package manager.
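
To make that concrete, a quick sketch (the path and package names here are stand-ins, but the modules and their parameters are real):

---
- name: Ensure the web root exists with sane ownership
  file:
    path: /srv/www
    state: directory
    owner: root
    group: root
    mode: "0755"

- name: Ensure tmux is installed, whichever package manager this box uses
  package:
    name: tmux
    state: present

Run these twice and the second run reports no changes - the modules describe end states, and only act when reality disagrees.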

In general I’ve found that making more use of modules, rather than writing complicated roles that emulate shell scripts, has made for more reliable and portable playbooks.

And that’s about everything I needed to figure out to get up and running with Ansible! You can actually browse the repo where my current Ansible configuration lives. It’s come a long way to be sure, and this is far from the last post about what has and hasn’t worked out for me.

^d