Intercepting UDP DNS packets using NGINX and JavaScript
NGINX (pronounced "engine-x") is more than a proxy or a web server; it is a Swiss army knife. With the addition of NJS, a subset of the JavaScript language that extends NGINX's reach, things got even more interesting. Really interesting.
I recently had to deal with a very specific use case: changing DNS records on the fly without having to update thousands of DNS zones. A kind of reverse proxy for DNS requests. It was a first for me.
DNS apex limitation
I won't be digging into how DNS works because it's a bit out of the scope of this article, but there are four types of DNS servers, which you can read more about in this Cloudflare article.
The one that controls the DNS zone, where all the DNS records of a domain live, is called the authoritative nameserver. If you have ever purchased a domain, it's the name server you set at your registrar, and you use it to configure your mail settings, validate SSL certificates and so on.
Every time you update your DNS zone, the DNS zone must be synced with all DNS servers inside the Authoritative Nameserver cluster. I mean, I hope your hosting provider is not using a single server to deal with HTTP, DNS and mail like it's 2001.
Imagine having to update, with the least amount of delay possible, the main A record (the zone apex record, or ‘naked domain’) of thousands of domains, effectively changing where the traffic for those domains is directed.
Easy, you may say. Just use a CNAME record. The thing is, you can't use a CNAME on the apex record:
ntorga.com. 90 IN CNAME my-load-balancer.aws.com
AWS recommends using the ALIAS DNS record in their article about how to use an AWS load balancer on the apex record. Wait, ALIAS? What? There is no ALIAS DNS record on the list. You're correct, there isn't.
ALIAS, ANAME and CNAME flattening are "virtual" DNS record types created by providers such as Cloudflare, DNS Made Easy, AWS Route 53, etc., but they don't "officially" exist. You are not going to find them in DNS software such as BIND9, although PowerDNS has recently added support for them.
Unfortunately for me, migrating thousands of domains and rewriting the entire DNS API (which is based on BIND9 servers) to use PowerDNS isn't an option, at least for now.
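To illustrate what such a "virtual" record does, here is a minimal sketch of the flattening idea: when the apex is queried, the server resolves the alias target itself and answers with plain A records under the apex name. This is not how any particular provider implements it; `answerApexQuery`, `fakeResolveA`, the zone object and the 192.0.2.x addresses are all hypothetical and exist only for illustration.

```javascript
// Hypothetical zone data: the apex holds an ALIAS-like pointer instead of an A record.
const exampleZone = {
  'ntorga.com.': { alias: 'my-load-balancer.aws.com.', ttl: 90 },
};

// Stand-in for a real lookup of the alias target (assumption: a real server
// would query the target's name servers and honour their TTLs).
function fakeResolveA(name) {
  const table = { 'my-load-balancer.aws.com.': ['192.0.2.10', '192.0.2.11'] };
  return table[name] || [];
}

// Build the "flattened" answer: A records that carry the apex name,
// so the client never sees a CNAME at the apex.
function answerApexQuery(qname, zoneData, resolveA) {
  const entry = zoneData[qname];
  if (!entry || !entry.alias) {
    return []; // nothing to flatten for this name
  }
  return resolveA(entry.alias).map(function (ip) {
    return { name: qname, ttl: entry.ttl, class: 'IN', type: 'A', data: ip };
  });
}

console.log(answerApexQuery('ntorga.com.', exampleZone, fakeResolveA));
```

The client only ever receives A records for the apex, which is why the trick does not need a new record type on the wire.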
The use case
Luckily, the only use case we have right now for ALIAS-like records is due to our professional WAF provider (included for free on our plans) using CNAME to direct the traffic.
The WAF provider does offer an anycast network which does not rely on CNAME to work, but recently we had some BGP issues with the Brazilian point of presence of the anycast network and decided that the CNAME-based network works best.
The problem was that, to use the CNAME approach, our DNS servers would need to be able to resolve ANAME/ALIAS records, and they do not, at least for now.
The solutions
The WAF provider offered their DNS product, but for business reasons, the customers were a bit afraid to go with that solution because it would require:
- a vanity name server feature which wasn't available right away;
- a fail-over strategy in case the WAF provider isn't on the market anymore;
- the migration of all DNS zones;
- changing the name servers for all domains.
Not the ideal solution, we agreed. So, how about a proxy? That was interesting.
First, I looked into DNSCrypt-proxy, which isn't ideal for authoritative name servers, but could work. After testing, it didn't work, mostly due to the lack of encrypted protocols in our BIND9 cluster.
I couldn't spend too much time debugging, so I went to the next solution: NGINX!
We eat NGINX rules for breakfast on the DevOps team since it's one of our main tools. Wait, NGINX can do DNS proxy? Well, sort of...
The MacGyver way
As always, NGINX had an awesome article about using it as a DoT or DoH gateway. It is not focused on intercepting DNS requests and modifying their content, but it did provide a DNS JavaScript library written by TuxInvader that was quite interesting.
It required parsing the UDP packet into JavaScript objects, mainly to get the QNAME (the domain being queried) to route or block the request on the NGINX side, but it also had methods to re-encode the payload of the UDP packet. Hmmmm.
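To make the parsing step concrete, here is a minimal sketch of pulling the QNAME out of a raw DNS query. It is not the API of TuxInvader's library; `parseQuestion` and the sample bytes are hypothetical, and the layout assumed is the standard DNS message format (a 12-byte header followed by length-prefixed labels).

```javascript
// Read the header and the first question of a DNS message.
// `packet` is a Uint8Array holding the raw UDP payload.
function parseQuestion(packet) {
  if (packet.length < 12) {
    throw new Error('DNS message shorter than the 12-byte header');
  }
  const id = (packet[0] << 8) | packet[1];
  const qdcount = (packet[4] << 8) | packet[5];
  if (qdcount < 1) {
    throw new Error('DNS message carries no question');
  }

  const labels = [];
  let offset = 12; // the question section starts right after the header
  while (true) {
    if (offset >= packet.length) {
      throw new Error('truncated QNAME');
    }
    const len = packet[offset];
    offset += 1;
    if (len === 0) {
      break; // zero-length root label terminates the name
    }
    if (len > 63 || offset + len > packet.length) {
      throw new Error('unsupported or truncated label');
    }
    let label = '';
    for (let i = 0; i < len; i++) {
      label += String.fromCharCode(packet[offset + i]);
    }
    labels.push(label);
    offset += len;
  }

  if (offset + 4 > packet.length) {
    throw new Error('truncated question');
  }
  const qtype = (packet[offset] << 8) | packet[offset + 1];
  const qclass = (packet[offset + 2] << 8) | packet[offset + 3];
  // `end` is the offset of the first byte after the question.
  return { id: id, qname: labels.join('.').toLowerCase(), qtype: qtype, qclass: qclass, end: offset + 4 };
}

// Sample query for "ntorga.com", type A, class IN (bytes written by hand).
const sampleQuery = new Uint8Array([
  0x12, 0x34, 0x01, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  6, 0x6e, 0x74, 0x6f, 0x72, 0x67, 0x61, 3, 0x63, 0x6f, 0x6d, 0,
  0x00, 0x01, 0x00, 0x01,
]);
console.log(parseQuestion(sampleQuery).qname); // prints "ntorga.com"
```

With the QNAME in hand, the proxy can decide whether a given query should be passed through untouched or handled specially.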
After digging into the library and understanding how to work with it, I wrote a simple proof of concept.
The goal was to replace the WAF anycast IP address in the response with a different IP address, just to make sure the proxy would be able to intercept only the necessary DNS queries. It worked!
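The idea can be sketched in plain JavaScript as follows. This is not the original proof of concept and does not use the DNS NJS library; `replaceARecords`, `skipName` and `parseIPv4` are hypothetical helpers, and the addresses are documentation ranges. It assumes a standard DNS response and only rewrites A records in the answer section whose address matches exactly, so every other response passes through byte for byte.

```javascript
// Return the offset right after the encoded name that starts at `offset`.
function skipName(packet, offset) {
  while (true) {
    if (offset >= packet.length) throw new Error('truncated name');
    const len = packet[offset];
    if (len === 0) return offset + 1;             // root label ends the name
    if ((len & 0xc0) === 0xc0) return offset + 2; // compression pointer ends the name
    if (len > 63) throw new Error('unsupported label type');
    offset += 1 + len;
  }
}

// Convert dotted-quad text into four byte values.
function parseIPv4(ip) {
  const parts = ip.split('.');
  if (parts.length !== 4) throw new Error('invalid IPv4 address: ' + ip);
  return parts.map(function (p) {
    if (!/^\d{1,3}$/.test(p) || Number(p) > 255) {
      throw new Error('invalid IPv4 address: ' + ip);
    }
    return Number(p);
  });
}

// Copy the response and swap `fromIp` for `toIp` in matching A records.
// An A record's RDATA is always 4 bytes, so no length field needs updating.
function replaceARecords(packet, fromIp, toIp) {
  const from = parseIPv4(fromIp);
  const to = parseIPv4(toIp);
  if (packet.length < 12) throw new Error('DNS message shorter than the header');
  const out = new Uint8Array(packet); // work on a copy, leave the input intact
  const qdcount = (out[4] << 8) | out[5];
  const ancount = (out[6] << 8) | out[7];

  let offset = 12;
  for (let q = 0; q < qdcount; q++) {
    offset = skipName(out, offset) + 4; // QNAME, then QTYPE and QCLASS
  }

  let replaced = 0;
  for (let a = 0; a < ancount; a++) {
    offset = skipName(out, offset);
    if (offset + 10 > out.length) throw new Error('truncated resource record');
    const type = (out[offset] << 8) | out[offset + 1];
    const klass = (out[offset + 2] << 8) | out[offset + 3];
    const rdlength = (out[offset + 8] << 8) | out[offset + 9];
    const rdata = offset + 10; // TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2)
    if (rdata + rdlength > out.length) throw new Error('truncated RDATA');
    const isMatch = type === 1 && klass === 1 && rdlength === 4 &&
      out[rdata] === from[0] && out[rdata + 1] === from[1] &&
      out[rdata + 2] === from[2] && out[rdata + 3] === from[3];
    if (isMatch) {
      out.set(to, rdata);
      replaced += 1;
    }
    offset = rdata + rdlength;
  }
  return { packet: out, replaced: replaced };
}
```

In an NGINX stream setup, a function like this would presumably be called from the NJS handler that sees the upstream response before it is sent back to the client; that wiring is left out here.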
Since we're using NGINX, we can combine this workaround with modules like "ngx_stream_geoip", for example, to further extend the response delivery logic. The possibilities are limitless.
The dnsdist way
During this research, a friend recommended a great tool called dnsdist, developed by the folks at PowerDNS. You can cache responses, load balance granularly between servers, manipulate the DNS objects (question and response) easily and enable DoT and DoH with a single config line.
I played around with dnsdist a bit and started working with Lua code again after years. Lua is awesome. However, at least for now, I haven't been able to find a way to edit the contents of the DNSResponse object, so I can't do what I did with NGINX.
Looking at the current experimental branch (1.8.0), the dnsdist developers are working on a DNSParser feature that might allow editing the response code.
Still on the PowerDNS subject and its projects, I also found that they support something called LUA records. That feature allows records like:
www IN LUA A "ifportup(443, {'192.0.2.1', '192.0.2.2'})"
It intrigued me. A lot. I might reconsider rebuilding the entire DNS structure.
That's it for today. A big thanks to TuxInvader for writing the DNS NJS library. As a side note (if you didn't know yet), it's also worth checking the NGINX "match{}" directive for active health checks using proper responses.