Converting DIG output to JSON

DIG is a powerful tool, mostly used to troubleshoot DNS queries.

However, sometimes we need DNS data for a task in another field of expertise. For example, we may need to limit access to content that moves between servers from time to time, but we can’t use an FQDN in our firewall rulebase because the reverse DNS isn’t accurate.

With DIG we can collect the IP addresses that are returned for a DNS request, for example for google.com.

martijn@monitoring:~$ dig google.com

; <<>> DiG 9.9.5-9+deb8u17-Debian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16736
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             36      IN      A       172.217.17.78

;; Query time: 1 msec
;; SERVER: 195.8.195.8#53(195.8.195.8)
;; WHEN: Tue Mar 10 10:38:02 CET 2020
;; MSG SIZE  rcvd: 55

martijn@monitoring:~$

But manually scraping the default output and maintaining a list is time-consuming. We can make the output cleaner by adding some parameters, for example with the following command:

martijn@monitoring:~$ dig google.com +nocomments +noquestion +noauthority +noadditional +nostats

; <<>> DiG 9.9.5-9+deb8u17-Debian <<>> google.com +nocomments +noquestion +noauthority +noadditional +nostats
;; global options: +cmd
google.com.             288     IN      A       172.217.168.238
martijn@monitoring:~$

While this is already much cleaner, we would still have to process the output manually, or scrape it, before we can continue. We can, however, pipe the output of dig through the powerful awk command and skip the first three lines (an empty line, the version banner, and the global options line).

martijn@monitoring:~$ dig aaaa google.com +nocomments +noquestion +noauthority +noadditional +nostats  | awk '{if (NR>3){print}}'
google.com.             53      IN      AAAA    2a00:1450:400e:80d::200e
martijn@monitoring:~$

And to be honest, yes, we could skip the first three lines with any other tool that can do so, but awk is generally available. Now that only the actual results of the query remain, it is safe to continue with the data.
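Counting lines works as long as dig prints exactly three lines before the answer. A slightly more defensive variant, sketched below, filters on content instead: it drops empty lines and lines starting with `;`, which is how dig marks its banner and comments. The function name `strip_dig_noise` and the sample text are illustrative assumptions, not part of the original commands.

```shell
#!/bin/sh
# Hypothetical helper: keep only resource-record lines from dig output on stdin.
# Assumption: every non-record line is either empty or starts with ';'.
strip_dig_noise() {
    awk 'NF && !/^;/'
}

# Sample input modelled on the dig output shown above (not a live query).
sample=$(printf '\n; <<>> DiG 9.9.5-9+deb8u17-Debian <<>> google.com\n;; global options: +cmd\ngoogle.com.\t\t288\tIN\tA\t172.217.168.238\n')

# Prints only the record line.
printf '%s\n' "$sample" | strip_dig_noise
```

With live data the helper would simply replace the awk step, for example `dig google.com +nocomments +noquestion +noauthority +noadditional +nostats | strip_dig_noise`.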

Every record in the answer follows the same fixed structure.

Query                  TTL      CLASS   TYPE    Content 
google.com.             53      IN      AAAA    2a00:1450:400e:80d::200e 
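To make the column layout concrete, the sketch below labels each field of an answer line with awk, which by default splits on any run of tabs or spaces. The function name `label_record` and the label names are illustrative assumptions. Everything from the fifth field onward is treated as content, since some record types (MX, TXT) contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the five columns of each record line with labels.
label_record() {
    awk '{
        content = $5
        # Rejoin any further fields so content with spaces stays in one piece.
        for (i = 6; i <= NF; i++) content = content " " $i
        printf "name=%s ttl=%s class=%s type=%s content=%s\n", $1, $2, $3, $4, content
    }'
}

# Sample line taken from the AAAA answer above.
printf 'google.com.\t\t53\tIN\tAAAA\t2a00:1450:400e:80d::200e\n' | label_record
```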

In my case I need to process this data in a structured way, and I am able to process either JSON or XML. For this example I will convert the data to JSON. Because the fields are already separated by tabs, we can pull the data through jq. However, we need to keep in mind that some fields are separated by more than one tab, so we need to squeeze those into one.

martijn@monitoring:~$ dig aaaa google.com +nocomments +noquestion +noauthority +noadditional +nostats  | awk '{if (NR>3){print}}' | tr -s '\t' | jq -R 'split("\t") |{Name:.[0],TTL:.[1],Class:.[2],Type:.[3],IpAddress:.[4]}'
{
  "Name": "google.com.",
  "TTL": "76",
  "Class": "IN",
  "Type": "AAAA",
  "IpAddress": "2a00:1450:400e:80e::200e"
}
martijn@monitoring:~$
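The effect of `tr -s '\t'` is easy to miss, so here is a small sketch that counts the tab-separated fields before and after squeezing. awk is used only for counting; the helper name `count_tab_fields` and the sample line are made up for illustration. dig pads short names with an extra tab, and without the squeeze that extra tab would give `split("\t")` an empty element and shift every key by one.

```shell
#!/bin/sh
# Hypothetical helper: report how many tab-separated fields each line has.
count_tab_fields() {
    awk -F '\t' '{ print NF }'
}

# Sample record with two tabs after the name, as dig prints for short names.
line=$(printf 'google.com.\t\t53\tIN\tAAAA\t2a00:1450:400e:80d::200e')

printf '%s\n' "$line" | count_tab_fields                # 6: one field is empty
printf '%s\n' "$line" | tr -s '\t' | count_tab_fields   # 5: the expected layout
```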

The output we have now is valid JSON. However, a DNS query that returns multiple addresses produces a sequence of separate objects rather than a single JSON document. A good example is a query for the microsoft.com domain.

martijn@monitoring:~$ dig a microsoft.com +nocomments +noquestion +noauthority +noadditional +nostats  | awk '{if (NR>3){print}}'| tr -s '\t' | jq -R 'split("\t") |{Name:.[0],TTL:.[1],Class:.[2],Type:.[3],IpAddress:.[4]}'
 {
   "Name": "microsoft.com.",
   "TTL": "3600",
   "Class": "IN",
   "Type": "A",
   "IpAddress": "104.215.148.63"
 }
 {
   "Name": "microsoft.com.",
   "TTL": "3600",
   "Class": "IN",
   "Type": "A",
   "IpAddress": "13.77.161.179"
 }
 {
   "Name": "microsoft.com.",
   "TTL": "3600",
   "Class": "IN",
   "Type": "A",
   "IpAddress": "40.76.4.15"
 }
 {
   "Name": "microsoft.com.",
   "TTL": "3600",
   "Class": "IN",
   "Type": "A",
   "IpAddress": "40.112.72.205"
 }
 {
   "Name": "microsoft.com.",
   "TTL": "3600",
   "Class": "IN",
   "Type": "A",
   "IpAddress": "40.113.200.201"
 }
 martijn@monitoring:~$

As already stated, this output isn’t yet a single valid JSON document. We need to pipe it through jq once more, this time with --slurp, which collects the separate objects into one array.

 martijn@monitoring:~$ dig a microsoft.com +nocomments +noquestion +noauthority +noadditional +nostats  | awk '{if (NR>3){print}}'| tr -s '\t' | jq -R 'split("\t") |{Name:.[0],TTL:.[1],Class:.[2],Type:.[3],IpAddress:.[4]}' | jq --slurp .
 [
   {
     "Name": "microsoft.com.",
     "TTL": "3256",
     "Class": "IN",
     "Type": "A",
     "IpAddress": "104.215.148.63"
   },
   {
     "Name": "microsoft.com.",
     "TTL": "3256",
     "Class": "IN",
     "Type": "A",
     "IpAddress": "13.77.161.179"
   },
   {
     "Name": "microsoft.com.",
     "TTL": "3256",
     "Class": "IN",
     "Type": "A",
     "IpAddress": "40.76.4.15"
   },
   {
     "Name": "microsoft.com.",
     "TTL": "3256",
     "Class": "IN",
     "Type": "A",
     "IpAddress": "40.112.72.205"
   },
   {
     "Name": "microsoft.com.",
     "TTL": "3256",
     "Class": "IN",
     "Type": "A",
     "IpAddress": "40.113.200.201"
   }
 ]
 martijn@monitoring:~$
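The second jq process is not strictly needed. As a sketch, assuming jq 1.5 or later (for `inputs`), a single invocation can read the raw lines, build the objects, and collect them into an array. The function name `records_to_json` is hypothetical, and converting the TTL to a number with `tonumber` is my own addition, not something the pipeline above does.

```shell
#!/bin/sh
# Hypothetical helper: turn tab-separated record lines on stdin into one JSON array.
# Assumes jq >= 1.5. -R reads raw lines; -n keeps jq from consuming the first
# line itself, so `inputs` sees every line. Empty lines are skipped.
records_to_json() {
    tr -s '\t' | jq -R -n '
        [ inputs
          | select(length > 0)
          | split("\t")
          | {Name: .[0], TTL: (.[1] | tonumber), Class: .[2], Type: .[3], IpAddress: .[4]}
        ]'
}

# Sample records modelled on the microsoft.com answer above (not a live query).
printf 'microsoft.com.\t\t3256\tIN\tA\t104.215.148.63\nmicrosoft.com.\t\t3256\tIN\tA\t13.77.161.179\n' | records_to_json
```

The trade-off is one process less against a slightly longer filter; on hosts with an older jq, the two-step pipeline with --slurp remains the safer choice.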

So, basically, to get the result of dig as valid JSON, you can put the whole pipeline in a single call in your bash script:

#!/bin/bash
# Record type and name to look up.
recordtype="A"
fqdn="microsoft.com"
# Query, drop the three banner lines, squeeze tabs, and build one JSON array.
digjson=$( dig "$recordtype" "$fqdn" +nocomments +noquestion +noauthority +noadditional +nostats  | awk '{if (NR>3){print}}'| tr -s '\t' | jq -R 'split("\t") |{Name:.[0],TTL:.[1],Class:.[2],Type:.[3],IpAddress:.[4]}' | jq --slurp . )

Feel free to query your own domain name or a specific record type, preferably by adjusting the variables.
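Once the JSON is in a variable, pulling out just the addresses for a firewall list is a single jq call. The sketch below works on a hard-coded sample so it runs without a DNS lookup. The function name `addresses_from_json` is hypothetical, and restricting the result to A and AAAA records is an assumption about what a rulebase would need.

```shell
#!/bin/sh
# Hypothetical helper: print one address per line from the JSON array on stdin.
addresses_from_json() {
    jq -r '.[] | select(.Type == "A" or .Type == "AAAA") | .IpAddress'
}

# Sample in the same shape as $digjson (values copied from the earlier output).
digjson='[
  {"Name":"microsoft.com.","TTL":"3256","Class":"IN","Type":"A","IpAddress":"104.215.148.63"},
  {"Name":"microsoft.com.","TTL":"3256","Class":"IN","Type":"A","IpAddress":"13.77.161.179"}
]'

printf '%s\n' "$digjson" | addresses_from_json
```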