Decoding hex-obscured IPv4 addresses

amenex

After resolving about two-thirds of a few hundred "unresolvable" hostnames through
a combination of Internet searches and experience, I was left with about a hundred,
two-thirds of which are listed in the attached file.

The sixty-two strings turn out to be hex-coded IPv4 addresses, as shown by these two examples:
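bfbefba5, restated as bx16+f.bx16+e.fx16+b.ax16+5 (that is, 11x16+15.11x16+14.15x16+11.10x16+5), gives 191.190.251.165, and dig -x 191.190.251.165 yields bfbefba5.generic.com
5e049c2c, restated as 5x16+e.0x16+4.9x16+c.2x16+c (that is, 5x16+14.0x16+4.9x16+12.2x16+12), gives 94.4.156.44, and dig -x 94.4.156.44 yields 5e049c2c.generic.com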
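
The restatement above is ordinary base-16 arithmetic, one pair of hex characters per octet. As a minimal sketch (the function name hex2ip is made up for illustration and does not come from the thread), the same calculation can be left to printf, which reads a 0x-prefixed argument as hexadecimal when asked for %d:

```shell
# hex2ip is a hypothetical helper name, not part of the original thread.
hex2ip() {
    # Prefix every two-character pair with 0x, e.g. "0x5e 0x04 0x9c 0x2c";
    # the unquoted substitution is split into four positional parameters.
    set -- $(printf '%s\n' "$1" | sed 's/../0x& /g')
    # %d converts each 0x-prefixed argument from hexadecimal to decimal.
    printf '%d.%d.%d.%d\n' "$@"
}

hex2ip 5e049c2c   # prints 94.4.156.44
```

This assumes the argument is exactly eight hex characters; anything else is not checked.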

They're not proper IPv6 addresses, and I've found just one site that converts
them two hex characters at a time. One of my ponderous awk scripts will separate
them into octet-wise groupings:
awk '{print $0}' trisquel-generic-hex.txt | awk '{gsub(/.{2}/,"&.")}1' '-' | sed 's/\./\t/g' '-' | awk '{print $1"\t"$2"\t"$3"\t"$4}' '-'
But the rest of the printf-based script for the restatement steps in
https://superuser.com/questions/1104234/how-to-convert-a-hexadecimal-value-to-a-standard-ip-address
that I tried to use in the examples eludes me.

George Langford

Attachment: trisquel-generic-hex.txt (558 bytes)
Magic Banana

First of all, your command line is not sensible. Writing 'awk '{print $0}' trisquel-generic-hex.txt | ...' is pointless (just give trisquel-generic-hex.txt to the command after the pipe), and so is writing '-' as the only file argument (standard input is the default). You add dots that you then convert into tabs (write tabs in the first place!), and the last awk does nothing but remove the trailing tabs. In the end, your command line could be:
$ sed 's/../&\t/g' trisquel-generic-hex.txt | cut -f -4

But the rest of the printf-based script for the restatement steps in
https://superuser.com/questions/1104234/how-to-convert-a-hexadecimal-value-to-a-standard-ip-address that I tried to use in the examples eludes me.

printf only takes strings as arguments, not a file. A 'while read' loop is slow but workable if you do not have many lines. The other solution in the same post calls gawk --non-decimal-data. It works (after installing GNU AWK) but is not portable and may cease to work in the future, given the warning at the end of https://www.gnu.org/software/gawk/manual/html_node/Nondecimal-Data.html

Here is a portable solution:
$ awk -F '' 'function d(i) { h = "123456789abcdef"; return 16 * index(h, $i) + index(h, $++i) } { print d(1) "." d(3) "." d(5) "." d(7) }' trisquel-generic-hex.txt
2.122.13.7
2.123.200.237
2.126.136.18
2.127.222.59
2.220.6.66
2.220.199.175
90.194.119.221
90.196.164.162
90.200.2.34
90.201.203.250
90.204.102.2
90.211.54.124
90.213.62.168
90.213.86.21
90.217.212.210
90.218.23.100
90.219.172.21
90.221.50.57
90.222.151.188
94.1.207.245
94.4.156.44
94.6.184.137
94.8.164.237
94.9.28.167
94.10.251.245
177.32.166.215
177.32.226.47
177.80.194.83
177.80.248.29
177.82.179.92
177.83.98.117
177.143.149.177
177.193.138.75
179.153.231.11
179.157.228.44
179.212.4.250
179.213.115.197
179.216.185.242
179.222.87.213
186.206.23.215
186.206.255.45
187.20.49.62
187.23.29.86
187.105.54.116
187.107.63.236
187.183.148.199
189.4.79.179
189.4.94.190
189.7.178.42
189.33.161.86
189.35.223.134
189.62.7.86
189.101.226.52
189.122.59.152
189.122.164.185
189.122.212.166
189.123.7.112
191.178.170.131
191.189.15.193
191.190.251.165
191.191.43.250
201.53.153.196

If you have upper-case letters, either convert them with tr A-F a-f or replace $i and $++i in the above program with tolower($i) and tolower($++i).
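
The program above relies on -F '' to split each line into single characters. GNU AWK and mawk do that, but as far as I know POSIX leaves the behaviour of an empty field separator unspecified. A hedged variant of the same idea that uses substr instead and folds in the tolower step could look like this (hex2ip_awk is a made-up wrapper name, and the lines are assumed to hold exactly eight hex characters):

```shell
# hex2ip_awk is a hypothetical wrapper name, not part of the original thread.
hex2ip_awk() {
    awk '
    # d(p): decimal value of the two-character hex string p.
    # index() returns 0 when the character is absent from h,
    # which is exactly the value wanted for the digit "0".
    function d(p,    h) {
        h = "123456789abcdef"
        p = tolower(p)
        return 16 * index(h, substr(p, 1, 1)) + index(h, substr(p, 2, 1))
    }
    { print d(substr($0, 1, 2)) "." d(substr($0, 3, 2)) "." d(substr($0, 5, 2)) "." d(substr($0, 7, 2)) }
    ' "$@"
}
```

It would be called as hex2ip_awk trisquel-generic-hex.txt, or fed on standard input. Characters outside 0-9 and a-f silently count as 0, just as in the original program.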

amenex

Magic Banana rightly explained that my admittedly bloated awk script could be improved:
awk '{gsub(/.{2}/,"&.")}1' trisquel-generic-hex.txt | sed 's/../&\t/g' trisquel-generic-hex.txt | cut -f -4
which separates the hex-coded octets, thereby facilitating our understanding of Magic
Banana's extraordinarily logical and readable portable solution:
awk -F '' 'function d(i) { h = "123456789abcdef"; return 16 * index(h, $i) + index(h, $++i) } { print d(1) "." d(3) "." d(5) "." d(7) }' trisquel-generic-hex.txt
which is the awk-scripted version of my plain-English script:
5e049c2c separated into four two-character hex strings and calculated by 5x16+e.0x16+4.9x16+c.2x16+c
which is shorter but far, far slower.

Magic Banana

Magic Banana rightly explained that my admittedly bloated awk script could be improved:

What you then write executes awk '{gsub(/.{2}/,"&.")}1' trisquel-generic-hex.txt for absolutely no reason: its output is never read. What I actually wrote, all by itself, sed 's/../&\t/g' trisquel-generic-hex.txt | cut -f -4, is equivalent to the command line you showed.

Also, I forgot to write that I actually timed the 'while read' solution that I deemed slow:
$ while read l; do printf '%d.%d.%d.%d\n' $(echo $l | sed 's/../0x& /g'); done < trisquel-generic-hex.txt
It is about 50 times slower than the AWK program I gave.
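
Much of that cost probably comes from starting a subshell and a sed process for every line, although that is an assumption and was not measured in this thread. As a sketch under that assumption, the loop can stay entirely inside the shell by treating each line as one 32-bit hex number and extracting the octets with shifts and masks (hex2ip_loop is a made-up name):

```shell
# hex2ip_loop is a hypothetical helper name, not part of the original thread.
hex2ip_loop() {
    while read -r l; do
        # Skip empty lines, which would make the arithmetic expansion fail.
        [ -n "$l" ] || continue
        # Read the whole 8-character string as a single hexadecimal number.
        n=$((0x$l))
        # Shift each octet down to the low byte and mask it with 255.
        printf '%d.%d.%d.%d\n' $((n >> 24 & 255)) $((n >> 16 & 255)) $((n >> 8 & 255)) $((n & 255))
    done
}
```

It reads standard input, as in hex2ip_loop < trisquel-generic-hex.txt. Whether it gets anywhere near the AWK program would need to be timed on the real file.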