Homelab: Internal DNS
06 Jul 2019
Previously, I looked at using Ansible and its inventory capabilities to begin managing services and configuration on my homelab.
A core defect in that setup was that I hand-coded the mapping of hostnames to IP addresses in my Ansible inventory because, well, I didn't have DNS set up yet.
But hang on a second. What is DNS and why do I care?
When you type http://foo.com/bar into your browser, that's a URL (Uniform Resource Locator), which is comprised of a couple of segments.
It has a scheme - in this case http - which describes the protocol by which we'll go and fetch the resource.
It also has an authority part - in this case foo.com - a hostname to go and fetch the resource from.
The authority part can have other details like a username and port as well.
For instance arrdem@foo.com:443 would provide a username, hostname and port.
A simple IPv4 or IPv6 address is also legal as a hostname.
A URL may also have a path - in this case /bar - which says what to request from foo.com when you get there.
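These pieces map directly onto Python's standard urllib.parse module, if you want to poke at the anatomy yourself - a quick sketch using the example URL from above:

```python
from urllib.parse import urlsplit

# Split the example URL into its component segments.
u = urlsplit("http://arrdem@foo.com:443/bar")

print(u.scheme)    # the protocol: http
print(u.username)  # from the authority part: arrdem
print(u.hostname)  # from the authority part: foo.com
print(u.port)      # from the authority part: 443
print(u.path)      # what to request once connected: /bar
```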
Making a request from an IP address and port is pretty easy - if you know how to speak the protocol.
You just make a TCP connection to that (host, port) pair and away you go.
But how do you find IP addresses?
I don't want to commit ethos (10.0.0.64), logos (10.0.0.65) and pathos (10.0.0.66) to memory, or build out anything which really depends on those address assignments if I can avoid it.
Enter DNS - the traditional solution to this problem. DNS (the Domain Name System) was created to provide a protocol for mapping names memorable to humans (like ethos, logos and pathos!) to the IP addresses which machines actually use. DNS is a host discovery system - its core purpose is to map a domain name to one or more IP addresses presumed to identify machines somewhere. It does not implement service discovery. Services (programs listening on ports on a machine) are identified by convention. For instance "the" program which speaks HTTP, if any, is listening on port 80; "the" program which speaks SSH, if any, is listening on port 22; and so forth. These conventions worked fine before the advent of modern shared infrastructure or "cloud" hosting, and now pose some challenges I'll talk about later.
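Python's socket.getaddrinfo shows this split nicely: you hand it a name (the host-discovery half, answered by DNS or /etc/hosts) plus a port you chose by convention (the service half, which DNS doesn't answer at all). A sketch - localhost is used here so it runs anywhere, but any public name like example.com would go through DNS proper:

```python
import socket

# Resolve a name to one or more addresses, pairing each with the
# conventional port for the service we want (80 for HTTP).
# The name lookup is host discovery; the port is pure convention.
for family, _, _, _, sockaddr in socket.getaddrinfo(
        "localhost", 80, proto=socket.IPPROTO_TCP):
    print(family.name, sockaddr)
```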
So how does DNS work?
DNS consists of a hierarchy of servers - known as resolvers - which speak the DNS query language.
Each DNS client connects to a few (typically 3 or fewer) resolvers, provided as IP addresses.
For instance 1.1.1.1 is a DNS resolver made public by CloudFlare, and 8.8.8.8 is a DNS resolver made public by Google.
When you make a request of a resolver, you do so by asking it for the records of a name (called a domain).
If the resolver has the data it will serve a response; otherwise it may have to (potentially recursively!) inquire of other resolvers for the data you wanted.
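That recursive behaviour is easy to model as a toy: each resolver answers from its own authoritative data if it can, and otherwise asks an upstream and caches what it learns. This is an illustrative model only, not a real DNS implementation:

```python
class ToyResolver:
    """A toy model of a forwarding resolver - not real DNS."""

    def __init__(self, records=None, forwarder=None):
        self.records = records or {}  # authoritative ("master") data
        self.cache = {}               # answers learned from upstream
        self.forwarder = forwarder    # who to ask when we don't know

    def resolve(self, name):
        if name in self.records:
            return self.records[name]
        if name in self.cache:
            return self.cache[name]
        if self.forwarder is not None:
            answer = self.forwarder.resolve(name)  # potentially recursive!
            self.cache[name] = answer
            return answer
        raise LookupError(name)

# An upstream resolver that knows the wider internet, and a local one
# that only has master data for the lab and forwards everything else.
upstream = ToyResolver(records={"twitter.com": ["104.244.42.65"]})
local = ToyResolver(records={"ethos.apartment.arrdem.com": ["10.0.0.64"]},
                    forwarder=upstream)

print(local.resolve("ethos.apartment.arrdem.com"))  # served from master data
print(local.resolve("twitter.com"))                 # forwarded, then cached
```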
What kind of record(s) live in DNS?
The most basic record is an A record - just an IP address.
We can search DNS for records using the dig tool, as such -
$ dig www.arrdem.com
; <<>> DiG 9.14.3 <<>> www.arrdem.com A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17422
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.arrdem.com. IN A
;; ANSWER SECTION:
www.arrdem.com. 300 IN A 67.166.32.93
;; Query time: 66 msec
;; SERVER: 75.75.75.75#53(75.75.75.75)
;; WHEN: Fri Jul 05 17:32:05 PDT 2019
;; MSG SIZE rcvd: 59
In this response we can see the ANSWER section, which says cryptically
www.arrdem.com. 300 IN A 67.166.32.93
The first element here - www.arrdem.com. - is the full canonical name of the requested record.
The second element - 300 - is the TTL (time to live) of this record in seconds.
This tells resolvers which have to recursively query to get this data how long they may cache it for.
The third element - IN A - denotes the record class (IN, for Internet) and type (A).
Finally we have the actual value - 67.166.32.93 - being the current IP address for my homelab.
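The TTL is what makes caching safe: a resolver's cache is essentially a map from name to (answer, deadline). A minimal sketch of that bookkeeping:

```python
import time

class TtlCache:
    """Minimal sketch of TTL-bounded caching, as a resolver might do it."""

    def __init__(self):
        self._store = {}  # name -> (answer, expiry deadline)

    def put(self, name, answer, ttl):
        # A TTL of 300 means "trust this answer for 300 seconds".
        self._store[name] = (answer, time.monotonic() + ttl)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None
        answer, deadline = entry
        if time.monotonic() >= deadline:
            del self._store[name]  # stale - must re-query upstream
            return None
        return answer

cache = TtlCache()
cache.put("www.arrdem.com.", ["67.166.32.93"], ttl=300)
print(cache.get("www.arrdem.com."))  # → ['67.166.32.93']
```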
An interesting property of DNS is that most records need not be singular. That is, you could dig and get a couple IP addresses back.
Twitter for instance presents two public IPs.
$ dig twitter.com A
; <<>> DiG 9.14.3 <<>> twitter.com A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11939
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;twitter.com. IN A
;; ANSWER SECTION:
twitter.com. 563 IN A 104.244.42.65
twitter.com. 563 IN A 104.244.42.1
;; Query time: 32 msec
;; SERVER: 75.75.75.75#53(75.75.75.75)
;; WHEN: Fri Jul 05 17:34:08 PDT 2019
;; MSG SIZE rcvd: 72
That is, there is not one but two public addresses, either of which could be used to access the service known by the domain name twitter.com if the other fails or is overloaded.
So if you go and connect to http://twitter.com, you'll be connecting to one of those two IP addresses.
This can be used to build client-side load balancing which distributes requests over many hosts, as clients are expected to choose which host to connect to in "round robin" order.
For instance a fleet of tens or more puppet servers, all of which serve the same data, could live behind a single "round robin" A record.
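Client-side round robin over a multi-record answer can be sketched like so: start at a random offset so clients spread out, then rotate through the records for successive connections. (A sketch only - real stub resolvers and clients vary in how they order and reuse answers.)

```python
import itertools
import random

# The two A records from the twitter.com answer above.
addrs = ["104.244.42.65", "104.244.42.1"]

# Shuffle so all clients don't hammer the first record in the answer...
random.shuffle(addrs)
# ...then rotate through the records for successive connections.
rotation = itertools.cycle(addrs)

for _ in range(4):
    print(next(rotation))  # alternates between the two addresses
```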
There's a lot of really interesting stuff you can do with DNS, but for now let's get it up and running in the lab.
The obvious first step would be to reconfigure my router to push the IP addresses of my three nodes as DNS resolvers. Doing so before the resolver(s) are set up, however, would nuke my ability to talk to the outside world (looking at you, stackoverflow), so let's hold off on that.
Instead we'll take advantage of the dig tool's ability to target a specific resolver, eg. dig <address> @<resolver>, to test the resolvers I'm building out before we cut over to them.
Okay. Let’s do this.
BIND setup
There's a number of DNS servers out there - but I'm gonna go with good old bind.
Bind (aka named) uses a three part configuration.
/etc/named.conf tells named what to do - for which the general pattern is to include configurations for domains (called zones) out of /etc/named/data/<zone>.conf.
While bind can do a lot of stuff, all I'm gonna use it for initially is to serve handwritten domain files (AKA zonefiles) out of /etc/named/master/.
Writing this Ansible role is pretty easy -
roles/dns-resolver/tasks/main.yml
---
- name: Install bind
  package:
    name: bind
    state: present
  # Registered so the later tasks can key off whether bind was just installed
  register: installed
  notify:
    - named enable

- name: Create directories
  when: installed.changed
  file:
    path: "{{ item }}"
    state: directory
    owner: root
    group: root
  with_items:
    - /etc/named/data
    - /etc/named/master

- name: Deploy named.conf
  when: installed.changed
  template:
    src: named.conf.j2
    dest: /etc/named.conf
Of slightly more interest is the actual named config template I'm deploying -
roles/dns-resolver/templates/named.conf.j2
acl "subnet" {
    10.0.0.0/24;
    localhost;
    localnets;
};

options {
    directory "/var/named";
    pid-file "/run/named/named.pid";
    listen-on { any; };
    allow-recursion { subnet; localhost; };
    allow-query { subnet; localhost; };
    allow-query-cache { subnet; localhost; };

    forwarders {
{% for node in upstream_dns_resolvers %}
        {{ node }};
{% endfor %}
    };
};

zone "localhost" IN {
    type master;
    file "localhost.zone";
};
This configuration defines an Access Control List (ACL) for my local subnet. It then allows only hosts in the subnet - or the local host - to make queries of this server. We also set up forwarders - hosts which each bind instance will query if the bind instance doesn’t have master data. Elsewhere in Ansible variables, I’m defining
# Everywhere we use the same upstream DNS resolvers.
upstream_dns_resolvers:
  - 1.1.1.1
  - 8.8.8.8
  - 8.8.4.4
Let's create a new Ansible inventory group for the sake of hygiene which will contain our resolvers.
hosts
---
apartment_modes:
  hosts:
    ethos.apartment.arrdem.com:
      ansible_host: 10.0.0.64
    logos.apartment.arrdem.com:
      ansible_host: 10.0.0.65
    pathos.apartment.arrdem.com:
      ansible_host: 10.0.0.66

apartment_resolvers:
  children:
    apartment_modes:
With just this configuration, we can run it against my modes using a really simple playbook
play.yml
---
- hosts:
    - apartment_resolvers
  roles:
    - role: dns-resolver
And run that -
$ ansible-playbook -i hosts play.yml
PLAY [apartment_resolvers] **************************************************************************************
TASK [Gathering Facts] ******************************************************************************************
ok: [ethos.apartment.arrdem.com]
ok: [pathos.apartment.arrdem.com]
ok: [logos.apartment.arrdem.com]
TASK [dns-resolver : Install bind] ******************************************************************************
ok: [ethos.apartment.arrdem.com]
ok: [pathos.apartment.arrdem.com]
ok: [logos.apartment.arrdem.com]
TASK [dns-resolver : Create directories] ************************************************************************
ok: [ethos.apartment.arrdem.com] => (item=/etc/named/data)
ok: [pathos.apartment.arrdem.com] => (item=/etc/named/data)
ok: [logos.apartment.arrdem.com] => (item=/etc/named/data)
ok: [pathos.apartment.arrdem.com] => (item=/etc/named/master)
ok: [logos.apartment.arrdem.com] => (item=/etc/named/master)
ok: [ethos.apartment.arrdem.com] => (item=/etc/named/master)
TASK [dns-resolver : Deploy named.service] **********************************************************************
ok: [ethos.apartment.arrdem.com]
ok: [logos.apartment.arrdem.com]
ok: [pathos.apartment.arrdem.com]
TASK [dns-resolver : Deploy named.conf] *************************************************************************
ok: [logos.apartment.arrdem.com]
ok: [ethos.apartment.arrdem.com]
ok: [pathos.apartment.arrdem.com]
PLAY RECAP ******************************************************************************************************
ethos.apartment.arrdem.com : ok=5 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
logos.apartment.arrdem.com : ok=5 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
pathos.apartment.arrdem.com : ok=5 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Cool!
So now we should be able to run some test DNS queries against these servers.
Most important is my ability to do recursive queries, so let's check twitter.com first.
$ dig twitter.com @10.0.0.64
; <<>> DiG 9.14.3 <<>> twitter.com @10.0.0.64
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27850
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 0f82e839818e3b9cecebcec75d200ee12a4d47ef5312925b (good)
;; QUESTION SECTION:
;twitter.com. IN A
;; ANSWER SECTION:
twitter.com. 1022 IN A 104.244.42.193
twitter.com. 1022 IN A 104.244.42.129
;; Query time: 5 msec
;; SERVER: 10.0.0.64#53(10.0.0.64)
;; WHEN: Fri Jul 05 20:00:45 PDT 2019
;; MSG SIZE rcvd: 100
Heck yeah. Recursive lookups are working.
A first zone
Now let's do what we're here to do - create the apartment.arrdem.com zone.
To keep things simple, I'm gonna handwrite my first zonefile.
roles/dns-zone/templates/apartment.arrdem.com.j2
$ORIGIN apartment.arrdem.com.
$TTL 7200
apartment.arrdem.com. IN SOA ns.apartment.arrdem.com. mail.apartment.arrdem.com. (
    2019070442 ; serial
    43200      ; refresh
    180        ; retry
    1209600    ; expire
    10800      ; negative-caching TTL
)

;;; NS section
@ NS ns.apartment.arrdem.com.
ns IN A 10.0.0.64
ns IN A 10.0.0.65
ns IN A 10.0.0.66

;;; HOSTS
ethos  IN A 10.0.0.64
logos  IN A 10.0.0.65
pathos IN A 10.0.0.66
The ns record is a convention for all the nameservers (resolvers) in the domain.
And I've got an A record for each of my (currently three) machines.
We’ll also need a small template to configure named for each zone -
roles/dns-zone/templates/zone-data.j2
zone "{{ item }}" {
    type master;
    file "/etc/named/master/{{ item }}";
    allow-transfer { none; };
    allow-update { none; };
};
This config just tells named to prohibit dynamic updates or transfers of the domain. We’ve already set global ACLs for querying. As a template, it presumes we’re rendering it from inside a loop over zone names.
All it takes to get this deployed is a pretty simple role -
roles/dns-zone/tasks/main.yml
---
- name: Deploy zonefiles
  with_items: "{{ zones }}"
  template:
    src: "{{ item }}.j2"
    dest: "/etc/named/master/{{ item }}"
  notify:
    - named reload

- name: Deploy zone data
  with_items: "{{ zones }}"
  template:
    src: zone-data.j2
    dest: "/etc/named/data/{{ item }}.conf"

- name: Add zone config
  with_items: "{{ zones }}"
  lineinfile:
    path: /etc/named.conf
    state: present
    line: "include \"/etc/named/data/{{ item }}.conf\";"
That is, we'll apply this role with a list of zones as the variable zones. For each zone we render a template to produce the zonefile, render our config template, and use the lineinfile module to monkeypatch our main /etc/named.conf so that named includes the new zone's config.
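What lineinfile does here is simple enough to sketch: append the include line only if an identical line isn't already present, which is what keeps repeated runs of the play idempotent. Roughly (a toy re-implementation for illustration, not Ansible's actual code):

```python
from pathlib import Path

def ensure_line(path, line):
    """Append `line` to the file at `path` unless it already contains it."""
    p = Path(path)
    text = p.read_text() if p.exists() else ""
    if line not in text.splitlines():
        with p.open("a") as f:
            f.write(line + "\n")
        return True   # changed - Ansible would report 'changed'
    return False      # already present - Ansible would report 'ok'
```

Running it twice with the same include line modifies the file once and is a no-op afterwards, mirroring the changed/ok statuses in the play output below.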
Patching our playbook a tiny bit -
play.yml
---
- hosts:
    - apartment_resolvers
  vars_files:
    - "vars/.yml"
  roles:
    - role: dns-resolver
    - role: dns-zone
      zones:
        - apartment.arrdem.com
And running it -
$ ansible-playbook -i hosts play.yml
PLAY [apartment_resolvers] **************************************************************************************
TASK [Gathering Facts] ******************************************************************************************
ok: [logos.apartment.arrdem.com]
ok: [pathos.apartment.arrdem.com]
ok: [ethos.apartment.arrdem.com]
TASK [dns-resolver : Install bind] ******************************************************************************
ok: [pathos.apartment.arrdem.com]
ok: [logos.apartment.arrdem.com]
ok: [ethos.apartment.arrdem.com]
TASK [dns-resolver : Create directories] ************************************************************************
ok: [pathos.apartment.arrdem.com] => (item=/etc/named/data)
ok: [ethos.apartment.arrdem.com] => (item=/etc/named/data)
ok: [logos.apartment.arrdem.com] => (item=/etc/named/data)
ok: [pathos.apartment.arrdem.com] => (item=/etc/named/master)
ok: [ethos.apartment.arrdem.com] => (item=/etc/named/master)
ok: [logos.apartment.arrdem.com] => (item=/etc/named/master)
TASK [dns-resolver : Deploy named.service] **********************************************************************
ok: [ethos.apartment.arrdem.com]
ok: [logos.apartment.arrdem.com]
ok: [pathos.apartment.arrdem.com]
TASK [dns-resolver : Deploy named.conf] *************************************************************************
ok: [logos.apartment.arrdem.com]
ok: [pathos.apartment.arrdem.com]
ok: [ethos.apartment.arrdem.com]
TASK [dns-zone : Deploy zonefiles] ******************************************************************************
changed: [pathos.apartment.arrdem.com] => (item=apartment.arrdem.com)
changed: [logos.apartment.arrdem.com] => (item=apartment.arrdem.com)
changed: [ethos.apartment.arrdem.com] => (item=apartment.arrdem.com)
TASK [dns-zone : Deploy zone data] ******************************************************************************
ok: [ethos.apartment.arrdem.com] => (item=apartment.arrdem.com)
ok: [logos.apartment.arrdem.com] => (item=apartment.arrdem.com)
ok: [pathos.apartment.arrdem.com] => (item=apartment.arrdem.com)
TASK [dns-zone : Add zone config] *******************************************************************************
changed: [ethos.apartment.arrdem.com] => (item=apartment.arrdem.com)
changed: [pathos.apartment.arrdem.com] => (item=apartment.arrdem.com)
changed: [logos.apartment.arrdem.com] => (item=apartment.arrdem.com)
RUNNING HANDLER [dns-zone : named reload] ***********************************************************************
changed: [pathos.apartment.arrdem.com]
changed: [ethos.apartment.arrdem.com]
changed: [logos.apartment.arrdem.com]
PLAY RECAP ******************************************************************************************************
ethos.apartment.arrdem.com : ok=9 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
logos.apartment.arrdem.com : ok=9 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
pathos.apartment.arrdem.com : ok=9 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Now we should be able to dig ethos, logos and pathos out of DNS!
$ for h in ethos logos pathos; do dig +short ${h}.apartment.arrdem.com @10.0.0.64; done
10.0.0.64
10.0.0.65
10.0.0.66
Heck yeah.
Now if I go into my router, tell it to use my three nodes as DNS resolvers, and reconnect my device so that it gets a fresh resolver config, I'll see my resolvers configured in /etc/resolv.conf
# Generated by resolvconf
domain apartment.arrdem.com
search apartment.arrdem.com arrdem.com
nameserver 10.0.0.64
nameserver 10.0.0.65
nameserver 10.0.0.66
Now I can ssh using DNS names, not IP addresses!
$ ssh arrdem@pathos echo '[$(hostname -f)] Hello, world!'
[pathos.apartment.arrdem.com] Hello, world!
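The reason the bare name pathos works there is the search line in resolv.conf: for short names, the stub resolver tries each search suffix before (or after) the name as given. A rough sketch of that candidate-list logic - note that glibc's real behaviour also hinges on the ndots option, which defaults to 1:

```python
def candidate_names(name, search_domains, ndots=1):
    """Rough sketch of resolv.conf 'search' handling (glibc-style)."""
    qualified = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        # "Dotty" names are tried as-is first.
        return [name] + qualified
    # Short names try the search suffixes first.
    return qualified + [name]

# With my resolv.conf's search list, "pathos" expands like so:
print(candidate_names("pathos", ["apartment.arrdem.com", "arrdem.com"]))
```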
Metaprogramming zones
While the above zonefile for apartment.arrdem.com strictly works - it's also one more thing to update by hand whenever I bring up a new node or service.
I'm gonna be spending a lot of quality time working on the service discovery problem - but let's start with this.
Ansible still has (as ansible_host) the IP address for every device I configure.
So at the very least, one could write this zone -
roles/dns-zone/templates/apartment.arrdem.com.j2
$ORIGIN apartment.arrdem.com.
$TTL 7200
apartment.arrdem.com. IN SOA ns.apartment.arrdem.com. mail.apartment.arrdem.com. (
    {{ ansible_date_time.year }}{{ ansible_date_time.month }}{{ ansible_date_time.day }}42 ; serial
    43200      ; refresh
    180        ; retry
    1209600    ; expire
    10800      ; negative-caching TTL
)

;;; NS section
@ NS ns.apartment.arrdem.com.
{% for node in groups[geo + '_resolvers'] %}
ns IN A {{ hostvars[node]['ansible_host'] }}
{% endfor %}

;;; HOSTS
{% for node in groups['geo_' + geo] %}
{{ node | shortname | format("{0: <16}") }} IN A {{ hostvars[node]['ansible_host'] }}
{% endfor %}
This template will generate an SOA serial by concatenating the date at day precision with a counter I bump by hand.
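That date-plus-counter scheme is easy to check in isolation; here's the same logic as a hypothetical helper (the name soa_serial is mine, not part of the template):

```python
import datetime

def soa_serial(counter, day=None):
    """Build a YYYYMMDDNN-style SOA serial from a date and a hand-bumped counter."""
    day = day or datetime.date.today()
    return int(f"{day:%Y%m%d}{counter:02d}")

print(soa_serial(42, datetime.date(2019, 7, 4)))  # → 2019070442
```

Secondaries compare serials numerically and only transfer when the serial grows, so it must only ever increase - which the date prefix guarantees across days, and the counter within a day.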
Leveraging the fact that there's an apartment_resolvers group in Ansible's inventory, we can introspect it as long as there's a geo variable set.
We can also play the same game to get all the hosts in the geo_apartment group!
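The HOSTS loop in that template boils down to the following, sketched in plain Python with stand-in inventory data (and assuming the custom shortname filter simply strips the domain suffix):

```python
# Stand-in for the inventory data Ansible exposes to the template.
hostvars = {
    "ethos.apartment.arrdem.com": {"ansible_host": "10.0.0.64"},
    "logos.apartment.arrdem.com": {"ansible_host": "10.0.0.65"},
    "pathos.apartment.arrdem.com": {"ansible_host": "10.0.0.66"},
}
groups = {"geo_apartment": list(hostvars)}

def render_host_records(group):
    """Emit one left-padded A record per host, like the template's HOSTS loop."""
    lines = []
    for node in groups[group]:
        shortname = node.split(".")[0]  # what the shortname filter presumably does
        lines.append(f"{shortname:<16} IN A {hostvars[node]['ansible_host']}")
    return "\n".join(lines)

print(render_host_records("geo_apartment"))
```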
So if I tweak my inventory a tiny bit -
hosts
---
apartment_modes:
  hosts:
    ethos.apartment.arrdem.com:
      ansible_host: 10.0.0.64
    logos.apartment.arrdem.com:
      ansible_host: 10.0.0.65
    pathos.apartment.arrdem.com:
      ansible_host: 10.0.0.66

apartment_resolvers:
  children:
    apartment_modes:

geo_apartment:
  vars:
    geo: apartment
  children:
    apartment_modes:
Now if I want to add a half-dozen Raspberry Pis all of a sudden, all I have to do is add them to my Ansible inventory and they'll automatically be added to DNS!
To really see that this works, check out ansible-inventory -i hosts --list with this hosts file.