
Anywhere services with Anycast

I have been looking at Anycast as a way to reduce the management overhead on my network and to let IPs refer to services rather than locations. As I rolled these services out, machines would automatically use the best available instance, and I could worry less about site-specific setups and use the same configuration at every site.

Anycast is a powerful technique used by many organisations on the internet to route traffic to the 'best' location for a given service. Anycast can also be used to sinkhole traffic during a DoS attack, when you want to reduce the impact on innocent bystanders and can afford the downtime.

To support my network as it grows more complex, I currently use an interior routing protocol (OSPF) to distribute routes within a site and an exterior routing protocol (BGP) to export resources between sites. Combined with Anycast, this means a site without a local instance of an Anycasted service forwards that traffic to another site that has one. By adjusting the announcements in BGP I can influence how that traffic flows between sites.

What do I hope to get out of this

This started as an experiment, but as I worked through it some obvious advantages came out of the design and very quickly became desired features. Chief among these are:

  • Ability to reroute traffic to a backup service without reconfiguring individual machines or DNS records.
  • Identical setup for all networks, putting less reliance on DHCP for network information.
  • Easy to take services out of rotation for updates and still have things work.

The Actual Implementation

The implementation is fairly simple and can be summed up in the following rules:

  • 10.254.0.0/16 is reserved for Anycast advertisements.
  • All advertisements in this range are /32s.
  • The bottom two octets encode the port number as a 16-bit value (port 2048 = 0x0800 = 10.254.8.0).
  • All Anycast IPs go on a dummy interface.
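The port-to-address rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names anycast_ip and anycast_port are hypothetical and not part of any tooling described here.

```python
import ipaddress

# The range reserved for Anycast advertisements in the rules above.
ANYCAST_NET = ipaddress.ip_network("10.254.0.0/16")


def anycast_ip(port: int) -> ipaddress.IPv4Address:
    """Map a port number onto the bottom two octets of the Anycast range."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError(f"port out of range: {port}")
    # Adding an int to an IPv4Address offsets it, so the port's high byte
    # becomes the third octet and its low byte the fourth.
    return ANYCAST_NET.network_address + port


def anycast_port(ip: str) -> int:
    """Recover the port number from an address in the Anycast range."""
    addr = ipaddress.ip_address(ip)
    if addr not in ANYCAST_NET:
        raise ValueError(f"{ip} is not in {ANYCAST_NET}")
    return int(addr) - int(ANYCAST_NET.network_address)


if __name__ == "__main__":
    for port in (53, 2048):
        print(port, anycast_ip(port))
```

With this mapping, port 53 lands on 10.254.0.53 and port 2048 on 10.254.8.0, matching the example in the rule.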

For deployment on the server I typically create a dummy interface and assign the IPs to it. I have seen many people use lo for this; however, I much prefer an interface called anycast that I can bring up and down to enable or disable access to the services when there is an issue. The change is propagated over the gateway protocols almost immediately. Another advantage is that this can be done with ip link set anycast down rather than removing the IP by hand from zebra (think of it like a daemon that does ifup/ifdown) and having to re-add it later.

It is common for people without network experience to add the IP address to the main interface of a machine or to reuse one of the IP addresses attached to a link. Either method comes with the additional complication that the services become unavailable if that link goes away. By using a dummy interface that is always up, we isolate link failures from service failures: as long as there is a route to our service IPs, connectivity continues via a different path.
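As a rough sketch of the dummy interface setup, the helper below builds the iproute2 commands for an interface called anycast carrying /32 addresses. It assumes a Linux host with the dummy driver available and root privileges to actually run them; the function names are hypothetical, and how the addresses end up in OSPF or BGP depends on your routing daemon configuration.

```python
import subprocess


def anycast_up_commands(addresses, ifname="anycast"):
    """Build the iproute2 commands that create the dummy interface,
    attach each Anycast address as a /32 and bring the interface up."""
    cmds = [["ip", "link", "add", ifname, "type", "dummy"]]
    for addr in addresses:
        cmds.append(["ip", "addr", "add", f"{addr}/32", "dev", ifname])
    cmds.append(["ip", "link", "set", ifname, "up"])
    return cmds


def anycast_down_command(ifname="anycast"):
    """Taking the interface down leaves the addresses configured on it,
    so bringing it back up later restores the service addresses."""
    return ["ip", "link", "set", ifname, "down"]


def run(cmds):
    """Execute the commands in order (needs root); stop on first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print rather than execute, so the sketch is safe to run anywhere.
    for cmd in anycast_up_commands(["10.254.0.53"]):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it easy to review exactly what would be run before handing it to run().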

Beyond this there is not much more to do. Most of the complexity and 'magic' of using this on your own network comes from thinking through the deployment and how it should work. Attention paid up front to the design and deployment pays dividends in maintenance and when there is an issue with the network.

Services in Use

Currently I am only running a handful of services this way. These line up with the services DHCP can push out to clients asking for information:

  • DNS (10.254.0.53)
  • NTP (ntp.infra.pocketnix.org)
  • SMTP (smtp.infra.pocketnix.org)
  • Syslog (syslog.infra.pocketnix.org)

You may have noticed that everything above has a domain name associated with it apart from DNS. For DNS we need to push out an IP, because resolving a name for the DNS server would itself require the DNS server. We solve this chicken-and-egg problem by using the IP rather than the friendly name. As my DNS records and configs are generated from the same data, I don't have to worry about records drifting from configs.
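To illustrate generating both DNS records and the DHCP-pushed resolver address from one data set, here is a minimal sketch. The ports are the well-known defaults for each service and the resulting addresses simply follow the port-mapping rule; both are assumptions for illustration, not the actual inventory behind this network, and the helper names are hypothetical.

```python
import ipaddress

ANYCAST_BASE = ipaddress.ip_address("10.254.0.0")
DOMAIN = "infra.pocketnix.org"

# Single source of truth: service name -> port.
# Assumed well-known ports; a real inventory may differ.
SERVICES = {"ntp": 123, "smtp": 25, "syslog": 514}


def a_records(services=SERVICES):
    """Render zone-file style A records, one per service, sorted by name."""
    return [
        f"{name}.{DOMAIN}. IN A {ANYCAST_BASE + port}"
        for name, port in sorted(services.items())
    ]


def dhcp_dns_server(port=53):
    """The resolver is handed out by IP, since it cannot resolve itself."""
    return str(ANYCAST_BASE + port)


if __name__ == "__main__":
    print("\n".join(a_records()))
    print("dns-server:", dhcp_dns_server())
```

Because both outputs come from the same dictionary, changing a service in one place updates the records and the configs together.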

If you have read up on Anycast in the past, you may also notice that I have TCP-based services in there and that the general advice is not to do this. As my routing is relatively stable, this is fairly safe. Some of the TCP services also use short-lived rather than persistent connections and have good retry logic, which helps further.

At a more global scale, Cloudflare has done this for their CDN and noted that in practice Anycasting TCP is not as much of an issue as it was thought to be, as seen here and here.

Going Forward

I would like to expand the Anycast network and add another range, likely 10.253.0.0/16, to help with migration of Anycast services and to provide more resilience when a service dies but the networking for it is still active.

There are also plans to wire ExaBGP into services via systemd so that the announcement for a service is only active while the service is up. If a regular health check fails and shuts down the service, all traffic will go to the next Anycast site. As an added bonus, this gives me the option of not running the service on the router: I can host it elsewhere and rely on the routing to ensure traffic is forwarded.
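One way this could look, as a sketch only: ExaBGP can run a helper process and read routing commands from its standard output, so a small health checker can announce the /32 while the service answers and withdraw it when it stops. The TCP connect check, the addresses, the interval and all function names below are assumptions for illustration; the systemd wiring is not shown.

```python
import socket
import sys
import time


def service_is_up(host, port, timeout=1.0):
    """Assumed health check: the service counts as up if it accepts TCP."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def bgp_command(route, healthy):
    """Text command for ExaBGP: announce when healthy, withdraw otherwise."""
    verb = "announce" if healthy else "withdraw"
    return f"{verb} route {route} next-hop self"


def transitions(states, route):
    """Yield a command only when the health state changes."""
    last = None
    for healthy in states:
        if healthy != last:
            yield bgp_command(route, healthy)
            last = healthy


def main(route="10.254.0.53/32", host="127.0.0.1", port=53, interval=5.0):
    """Poll forever; meant to be started by ExaBGP as a helper process."""
    def poll():
        while True:
            yield service_is_up(host, port)
            time.sleep(interval)

    for command in transitions(poll(), route):
        sys.stdout.write(command + "\n")
        sys.stdout.flush()  # ExaBGP reads the pipe line by line
```

Calling main() from a script that ExaBGP launches would tie the announcement to the health of the local service, so a failed check moves traffic to the next Anycast site without touching the interface.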

This is just one part of the updating and modernising of my home network that I am currently undertaking. The plan on paper is to have my home network look more like an enterprise virtualisation cluster and be managed in a similar way. This is both a training exercise on my part and a response to the increasingly large requirements being placed on my setup. Because it is structured like a virtualisation farm, it has also made a great test bed for some of my virtualisation and container projects.