All of the code for this project is available at archit120/DNSWhy

main.py - logic for resolving domain name
packing.py - convert python structures to bytes for network
unpacking.py - convert bytes received from network back to python
network.py - simple UDP code to send and receive message
structures.py - defines a subset of structures in DNS
enums.py - contains subset of enums defined in DNS

I will cover each of these files one-by-one in reverse order.
enums.py contains the enum types defined in RFC 1035. For the sake of brevity, not all of them have been implemented, just the ones I needed to complete this project. These were Opcode, RCode, RType, QType, RClass and QClass.
The R* enums are subsets of their Q* counterparts. However, Python doesn’t support extending an enum that already defines members, so I had to duplicate that information. Because not all possible enum values are implemented, I added an Unknown member to each of these enums. Its value is always an integer that’s outside the maximum possible value for the corresponding field.
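An example of this functionality:
from enum import Enum

class Opcode(Enum):
    QUERY = 0  # standard query is opcode 0 in RFC 1035 (1 is IQUERY)
    Unknown = 16  # outside the 4-bit opcode range (0-15)

    @classmethod
    def _missing_(cls, value: object):
        # Enum calls this hook when no member matches the looked-up value
        return cls.Unknown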
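To see the fallback in action, here is a small sketch using a made-up subset of record types. The RType members and the value chosen for Unknown are my own assumptions for illustration, not a copy of what is in the repository.

```python
from enum import Enum


class RType(Enum):
    """Hypothetical subset of the RFC 1035 record types."""
    A = 1
    NS = 2
    CNAME = 5
    Unknown = 65536  # assumption: a value that cannot fit in the 16-bit TYPE field

    @classmethod
    def _missing_(cls, value: object):
        # Enum calls this hook when no member has the requested value.
        return cls.Unknown


known = RType(5)      # a value that is defined in the subset
fallback = RType(28)  # 28 (AAAA) is not in the subset, so the hook is used
print(known.name, fallback.name)
```

Without the _missing_ hook, the second lookup would raise a ValueError, which is inconvenient when parsing replies that contain record types the code does not care about.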
structures.py is similar to enums.py. It implements the structures required for communication. A lot of effort could have been spared by using dataclasses, but I wanted to target Python 3.6, where they are not available, so this wasn’t an option for me.
network.py is very bare-bones code to send and receive messages using UDP sockets. It depends on packing and unpacking to convert messages to and from the network format.
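A minimal sketch of what such a UDP round trip can look like with the standard socket module. The function name send_and_receive, the timeout default and the port parameter are assumptions made for this example, not the names used in the repository.

```python
import socket


def send_and_receive(payload: bytes, server_ip: str, port: int = 53,
                     timeout: float = 3.0) -> bytes:
    """Send one datagram and return the first reply (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Without a timeout, recvfrom would block forever on a lost packet.
        sock.settimeout(timeout)
        sock.sendto(payload, (server_ip, port))
        # RFC 1035 limits DNS messages carried over UDP to 512 bytes.
        reply, _addr = sock.recvfrom(512)
        return reply
```

It would be called with already packed query bytes and a server address, for example send_and_receive(query_bytes, "198.41.0.4"), and the returned bytes would then go to the unpacking code.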
unpacking.py is one of the more important files. It is responsible for creating Message objects from the received datagram. The code is a natural extension of what we discussed in Part 1. Most of it is pretty uninteresting, except for read_string, which I want to talk about more.
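import struct
from typing import Tuple

def read_string(data: bytes, start_pos: int) -> Tuple[str, int]:
    retstr = ''
    while True:
        length = data[start_pos]
        start_pos += 1
        if length == 0:
            # Zero-length label: end of the name (drop the leading '.')
            return retstr[1:], start_pos
        elif length > 63:
            # Compression pointer: re-read both bytes, clear the two high bits
            length, = struct.unpack(">H", data[start_pos-1:start_pos+1])
            length -= 0xC000
            return (retstr + '.' + read_string(data, length)[0])[1:], start_pos + 1
        # Ordinary label: append it and move past its bytes
        retstr = retstr + '.' + data[start_pos:start_pos+length].decode('ascii')
        start_pos += length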
read_string supports reading domain names as specified in RFC 1035, including compression. The basic idea is pretty simple: it takes as input the entire datagram and a start_pos. The function assumes that start_pos is a valid byte offset in the datagram and starts reading from there. Because labels are restricted to 63 characters, any length (which is the first byte) greater than 63 is treated as a pointer, as described in the compression scheme. Strictly speaking, a pointer has both high-order bits set, so its first byte is at least 192, and the values in between are reserved. For a pointer we read a 2-byte integer and remove the two high-order bits, which leaves the offset in the datagram that the pointer refers to. Once that’s done we can recursively call read_string for this new position. Caching could have been implemented here but wasn’t, because performance is not a concern.
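To make the pointer arithmetic concrete, here is a small self-contained sketch with a hand-built datagram. It is an iterative variant written for this example rather than the recursive read_string above; the name decode_name, the hop limit and the all-zero header bytes are assumptions of the sketch.

```python
from typing import Tuple


def decode_name(data: bytes, pos: int) -> Tuple[str, int]:
    """Return (name, offset just past the name that starts at pos)."""
    labels = []
    end = None  # where the caller should continue after the first pointer
    hops = 0
    while True:
        length = data[pos]
        if length & 0xC0 == 0xC0:
            # Pointer: the offset is the low 14 bits of this byte and the next.
            if end is None:
                end = pos + 2
            pos = ((length & 0x3F) << 8) | data[pos + 1]
            hops += 1
            if hops > len(data):
                raise ValueError("compression pointer loop")
            continue
        if length > 63:
            raise ValueError("reserved label type")
        pos += 1
        if length == 0:
            break  # zero-length label terminates the name
        labels.append(data[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels), (end if end is not None else pos)


# 12 placeholder header bytes, then "example.com" at offset 12,
# then "www" followed by a pointer (0xC00C) back to offset 12.
message = (
    bytes(12)
    + b"\x07example\x03com\x00"
    + b"\x03www\xc0\x0c"
)
print(decode_name(message, 12))
print(decode_name(message, 25))
```

The second call reads the label www, hits the pointer bytes 0xC0 0x0C, masks off the two high bits to get offset 12, and continues reading example.com from there. The returned offset still points just past the two pointer bytes, which is where the next field of the record would start.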
packing.py is simply the reverse of unpacking. Much less functionality is implemented here, because I was only interested in A type queries for a single domain. One interesting thing to note is that integers and enum values in Python have no fixed width, so I had to truncate them explicitly when packing to 16-bit short integer types.
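As a rough illustration of the packing side, here is a sketch that builds a complete A query with struct. The helper names encode_name and build_a_query, the choice of leaving the recursion-desired flag unset, and the example domain are assumptions for this sketch, not the repository’s code.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("label must be 1-63 bytes long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_a_query(name: str, query_id: int) -> bytes:
    """Build a minimal query for an A record (hypothetical helper)."""
    flags = 0  # QR=0 (query), OPCODE=0, RD=0 (assumption: no recursion requested)
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT as six unsigned shorts.
    # Masking with 0xFFFF truncates an oversized Python int to 16 bits.
    header = struct.pack(">6H", query_id & 0xFFFF, flags, 1, 0, 0, 0)
    # Question: QNAME, then QTYPE=1 (A) and QCLASS=1 (IN).
    question = encode_name(name) + struct.pack(">2H", 1, 1)
    return header + question


query = build_a_query("www.example.com", 0x1234)
print(query.hex())
```

struct.pack raises an error if a value does not fit the requested field, which is why the ID is masked before packing.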
The final DNS logic is implemented in main.py. The code is pretty self-explanatory, but it can also be visualized through this flowchart.
Caching could also have been used in the resolving part to reduce the number of queries made, but I didn’t implement that. This logic might not be the most rigorous, but it works well enough for all the cases that I tried.
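The resolution loop can be sketched roughly as follows. This is not the code from main.py: replies are modelled as plain dictionaries, the network is replaced by a lookup table, and every name and address in that table is made up (only the root server address 198.41.0.4 is the one used in this post). The function and key names are assumptions for the sketch.

```python
from typing import Any, Callable, Dict, List

ROOT_SERVER = "198.41.0.4"  # the root server address used in this post

# A reply is a dict with three hypothetical keys:
#   "answers":    list of (record_type, value) pairs
#   "authority":  list of name-server host names
#   "additional": dict mapping name-server host name -> IP address
Reply = Dict[str, Any]
Ask = Callable[[str, str], Reply]


def resolve(name: str, ask: Ask, depth: int = 0, max_hops: int = 16) -> List[str]:
    """Follow referrals from the root until A records for name are found."""
    if depth > 8:
        raise RuntimeError("too many nested lookups")
    server = ROOT_SERVER
    for _ in range(max_hops):
        reply = ask(server, name)
        answers = reply.get("answers", [])
        if answers:
            ips = [value for rtype, value in answers if rtype == "A"]
            if ips:
                return ips
            aliases = [value for rtype, value in answers if rtype == "CNAME"]
            if aliases:
                # Alias: start again from the root with the new name.
                return resolve(aliases[0], ask, depth + 1, max_hops)
            raise LookupError("answers contained neither A nor CNAME records")
        servers = reply.get("authority", [])
        glue = reply.get("additional", {})
        if not servers:
            raise LookupError("no answers and no referral for " + name)
        with_ip = [glue[ns] for ns in servers if ns in glue]
        if with_ip:
            server = with_ip[0]  # first authoritative server with a known IP
        else:
            # No IP in the additional records: resolve the name server first.
            server = resolve(servers[0], ask, depth + 1, max_hops)[0]
    raise RuntimeError("referral chain too long")


# Made-up replies standing in for real servers.
TLD = {"authority": ["ns.tld.test"], "additional": {"ns.tld.test": "192.0.2.1"}}
ZONE = {"authority": ["ns.example.test"],
        "additional": {"ns.example.test": "192.0.2.2"}}
FAKE = {
    (ROOT_SERVER, "www.example.test"): TLD,
    ("192.0.2.1", "www.example.test"): ZONE,
    ("192.0.2.2", "www.example.test"): {"answers": [("CNAME", "cdn.example.test")]},
    (ROOT_SERVER, "cdn.example.test"): TLD,
    ("192.0.2.1", "cdn.example.test"): ZONE,
    ("192.0.2.2", "cdn.example.test"): {"answers": [("A", "192.0.2.10"),
                                                    ("A", "192.0.2.11")]},
}


def fake_ask(server: str, name: str) -> Reply:
    return FAKE[(server, name)]


result = resolve("www.example.test", fake_ask)
print(result)
```

Passing the ask function in as a parameter keeps the loop independent of the socket code, so the same logic can be exercised against a table like this one.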
With all that out of the way, let’s finally look at the output for some websites. Let’s start with the goal, www.citadel.com.
Asking 198.41.0.4 for www.citadel.com
No answer records found. Looking at authoritative records
Found 13 valid authoritative servers. Re-querying the first one with IP present
Name: a.gtld-servers.net, IP: 192.5.6.30
Asking 192.5.6.30 for www.citadel.com
No answer records found. Looking at authoritative records
Found 4 valid authoritative servers. Re-querying the first one with IP present
Name: ns-164.awsdns-20.com, IP: 205.251.192.164
Asking 205.251.192.164 for www.citadel.com
Found answers for target domain. Total 1 answers found
Found a CNAME record. Reasking root server for the new alias
Asking 198.41.0.4 for www.citadel.com.cdn.cloudflare.net
No answer records found. Looking at authoritative records
Found 13 valid authoritative servers. Re-querying the first one with IP present
Asking 192.5.6.30 for www.citadel.com.cdn.cloudflare.net
No answer records found. Looking at authoritative records
Found 5 valid authoritative servers. Re-querying the first one with IP present
Name: ns1.cloudflare.net, IP: 173.245.59.31
Asking 173.245.59.31 for www.citadel.com.cdn.cloudflare.net
Found answers for target domain. Total 2 answers found
Answer found!
104.18.24.189
104.18.25.189
Next, let’s try this blog’s domain.
Asking 198.41.0.4 for archit.me
No answer records found. Looking at authoritative records
Found 5 valid authoritative servers. Re-querying the first one with IP present
Name: a0.nic.me, IP: 199.253.59.1
Asking 199.253.59.1 for archit.me
No answer records found. Looking at authoritative records
Found 4 valid authoritative servers. Re-querying the first one with IP present
IP for none of the authoritative servers included in additional records. Querying for its IP first
Asking 198.41.0.4 for fortaleza.porkbun.com
No answer records found. Looking at authoritative records
Found 13 valid authoritative servers. Re-querying the first one with IP present
Name: a.gtld-servers.net, IP: 192.5.6.30
Asking 192.5.6.30 for fortaleza.porkbun.com
No answer records found. Looking at authoritative records
Found 4 valid authoritative servers. Re-querying the first one with IP present
Name: ns-199.awsdns-24.com, IP: 205.251.192.199
Asking 205.251.192.199 for fortaleza.porkbun.com
Found answers for target domain. Total 2 answers found
Answer found!
52.73.191.223
3.224.31.177
Asking 52.73.191.223 for archit.me
Found answers for target domain. Total 3 answers found
Answer found!
185.199.109.153
185.199.108.153
185.199.110.153
It was pretty fun to read an RFC and implement everything from scratch. I had to use Wireshark a couple of times to figure out the string compression, but it turned out not to be too hard. Writing a blog post along with this was a nice way to make notes on the RFC and was definitely useful in bringing this project to completion. I might make a part 3 where I create an interactive widget to run queries and power it using Heroku or AWS ServerLess. Hopefully more to come soon!