Syntax problems with nested loops

13 replies
amenex
Joined: 01/03/2015

Here is a script that is intended to generate a series of IPv4 addresses in CIDR/16 address space:

counter=1
while [ $counter -le 4096 ]
do prefix=185.180; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/ >> Temp42120Works.txt
((counter++))
done

The script produces a list of 4,096 IPv4 addresses, each of which starts with 185.180, in about
ten seconds. It works OK when run with bash in the Mate terminal, but not when I invoke it with
/bin/sh, which doesn't like ((counter++)) and, as a result, never stops generating IPv4 addresses.

The following code is an effort to expand the scope of the task to encompass an entire list of
differently prefixed addresses:

retnuoc=0
while [ "retnuoc" -le 26 ]
do NUM=$(`expr $retnuoc + 1`)
counter=1
while [ "$counter" -le 4 ]
prefix=`expr tail -n+$NUM SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1`;
dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/ >> Temp41921F13.txt
counter=$(`expr $counter + 1`)
done
retnuoc=$(`expr $retnuocm+ 1`)
done

The first two octets are contained in the file SS.IPv4-NLU-January2020-26Prefixes.txt (attached)
and extracted with the expression
tail -n+NUM SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1
which works OK when NUM is replaced by a number between 1 and 26 (in the present example).

The second pair of octets is randomly generated with this expression wherein $NUM is replaced by an
exemplar octet pair:
prefix=185.39; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/

Alas, progress has stalled now that the script no longer produces complaints of syntax errors.
If it were to work, 104 IPv4 addresses would be forthcoming, four for each of the twenty six
octet pairs.

George Langford

Attachment                                  Size
SS.IPv4-NLU-January2020-26Prefixes.txt      158 bytes

amenex
Joined: 01/03/2015

Following Jaret's contribution, and also restructuring my script according to this link:
https://www.linuxquestions.org/questions/programming-9/nested-while-loops-for-bash-novice-4175439318/

#!/bin/bash
retnuoc=1
counter=1
for (( retnuoc ; retnuoc < 3 ; retnuoc++ )); do
for (( counter ; counter < 41 ; counter++ )); do # xiferp=(`expr tail -n+$retnuoc SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1`)
prefix=185.45; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/ >> Temp41922A.txt
done
done

This script generates one list of 40 IPv4 addresses while the expression "xiferp=(`expr tail..." is commented out.

The first step in any solution must be to find out why bash doesn't like the syntax of:
xiferp=(`expr tail -n+$retnuoc SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1`)
which also works OK if $retnuoc is replaced by an integer like 1 or 2 (as it's limited in the actual script).

The second step will be to discover how to pass the prefix (i.e., xiferp) extracted from the file's list of 26 prefixes to:
prefix=xiferp; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/
which computes OK if xiferp is replaced by a two-octet stand-in value such as 185.45.

George Langford

Magic Banana

Joined: 07/24/2010

As Jaret explained, /bin/sh used to be the original UNIX shell (the Bourne Shell). On modern systems, it is a symbolic link to another shell, which is only guaranteed to implement the features of the Bourne shell. On Trisquel it is dash by default:
$ ls -l /bin/sh
lrwxrwxrwx 1 root root 4 Apr 17 2018 /bin/sh -> dash

Bash extends the Bourne shell a lot, so using /bin/bash solves your problem, unless you want the script to run on any Bourne-compatible shell. In that case, you can substitute "((counter++))" with "counter=$(expr $counter + 1)"... which is what you actually wrote later, although with useless additional backquotes: `...` is equivalent to $(...), so $(`...`) tries to execute the output of expr as a command. Since the counter drives a loop, another option is to write "for counter in $(seq 4096)". That shorter (two lines saved) and, in my humble opinion, clearer syntax works with /bin/sh.
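
A minimal sketch of the two portable loop forms, assuming a POSIX shell and the seq utility from coreutils. The function names count_while and count_for are made up for illustration, and $((...)) (POSIX arithmetic expansion) is used here in place of expr:

```shell
# Count from 1 to $1 with a while loop. $((...)) is POSIX arithmetic
# expansion, so neither bash's ((counter++)) nor an external expr is needed.
count_while() {
    counter=1
    while [ "$counter" -le "$1" ]
    do
        printf '%s\n' "$counter"
        counter=$((counter + 1))
    done
}

# Same count with a for loop over the output of seq
# (seq is not in POSIX, but it is part of GNU coreutils).
count_for() {
    for counter in $(seq "$1")
    do
        printf '%s\n' "$counter"
    done
}

count_while 3
count_for 3
```

Both functions print the same lines under dash and under bash.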

But you need no counter. The simplest "shell way" to read a file line by line is:
while read line
do
...
done < file

Given several variables as arguments (rather than the single variable, line, above), 'read' splits the line using the characters in $IFS as delimiters (space, tabulation and newline by default, but IFS can be redefined): the first variable receives the first field, the second variable the second field, ..., and the last variable all the remaining fields.
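
As a hedged sketch, the nested loops of the original question could be written around 'while read' as follows. The two sample prefixes, the temporary files and the function name random_addresses are assumptions made for illustration; the prefixes are assumed to end with a dot:

```shell
# Hypothetical sample input: one two-octet prefix per line, ending with a dot.
prefixes=$(mktemp)
printf '185.180.\n185.39.\n' > "$prefixes"
out=$(mktemp)

# Print $2 addresses made of the prefix $1 and two random octets.
random_addresses() {
    n=1
    while [ "$n" -le "$2" ]
    do
        # Two random bytes as unsigned decimals, e.g. " 17 203" -> "17.203"
        suffix=$(od -A n -t u1 -N 2 /dev/urandom | tr -s ' ' . | cut -c 2-)
        printf '%s%s\n' "$1" "$suffix"
        n=$((n + 1))
    done
}

# Outer loop: one iteration per line of the prefix file, no counter needed.
while read -r prefix
do
    random_addresses "$prefix" 4
done < "$prefixes" > "$out"

cat "$out"
```

This stays slow for large inputs, because od, tr and cut are started once per address.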

All that said, shells are slow. If I have properly understood the problem, you can:

  1. repeat each line in "file" four times (or whatever) with awk '{ for (i = 0; ++i <= 4; ) print }' file
  2. generate all the random suffixes with od -A n -t u1 -w2 -N $(expr 8 \* $(wc -l < file)) /dev/urandom | tr -s ' ' .
  3. paste the two outputs with paste -d ''

Three simple instructions. No 'dd' (thanks to the options -N and -w of 'od'). No 'sed'. For better performance, use pipes so that all commands run in parallel. One pipe must be named, because 'paste' takes two inputs. Creating the named pipe in /tmp and removing it afterwards gives this script (which takes as argument the file listing the prefixes):
#!/bin/sh
TMP=$(mktemp -u)
trap "rm $TMP 2>/dev/null" 0
mkfifo $TMP
awk '{ for (i = 0; ++i <= 4; ) print }' "$1" > $TMP &
od -A n -t u1 -w2 -N $(expr 8 \* $(wc -l < "$1")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP -
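
To inspect what each stage produces, here is a sketch of the same three steps with ordinary temporary files instead of the named pipe (slower, but every intermediate result can be looked at). The sample prefixes and variable names are made up; GNU od is assumed for -w, and GNU paste for the empty delimiter:

```shell
# Hypothetical prefix file. No trailing dots: the suffixes bring their own.
prefixes=$(mktemp)
printf '185.180\n185.39\n' > "$prefixes"
copies=3                                   # addresses wanted per prefix
lines=$(wc -l < "$prefixes")               # number of prefixes
bytes=$((2 * copies * lines))              # two random bytes per address

# Step 1: repeat each prefix $copies times.
repeated=$(mktemp)
awk -v n="$copies" '{ for (i = 0; i < n; ++i) print }' "$prefixes" > "$repeated"

# Step 2: -w2 prints two bytes per line; tr turns "  17 203" into ".17.203".
suffixes=$(mktemp)
od -A n -t u1 -w2 -N "$bytes" /dev/urandom | tr -s ' ' . > "$suffixes"

# Step 3: glue the two files line by line, without any delimiter.
result=$(mktemp)
paste -d '' "$repeated" "$suffixes" > "$result"
cat "$result"
```

Because every suffix line starts with a dot, a prefix that already ends with a dot would give addresses containing two consecutive dots.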

amenex
Joined: 01/03/2015

Magic Banana's magnificent script to fill out the listing of prefixes with random octets looks like
this for a specific set of two-octet prefixes:
TMP=$(mktemp -u)
trap "rm $TMP 2>/dev/null" 0
mkfifo $TMP
awk '{ for (i = 0; ++i <= 512; ) print }' "SS.IPv4-NLU-January2020-Prefixes.txt" > $TMP &
od -A n -t u1 -w2 -N $(expr 1024 \* $(wc -l < "SS.IPv4-NLU-January2020-Prefixes.txt")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP - > TempMB42920.5160x512.txt

Here's the associated math that makes it operate cleanly:

512 is the number of copies required for each two-octet prefix.
5,160 is the number of two-octet prefixes in the source file, so 512 x 5,160 = 2,641,920 addresses.
$(expr 1024 ... requests two random bytes (the 3rd and 4th octets) for each of the 512 copies: 2 x 512 = 1,024 bytes per prefix.
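
The same arithmetic as a small sketch in shell (the variable names are chosen here for illustration):

```shell
copies=512       # addresses wanted per prefix
prefixes=5160    # lines in the prefix file
addresses=$((copies * prefixes))
bytes=$((2 * addresses))     # one random byte for each of the two missing octets
echo "$addresses addresses need $bytes random bytes"
```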

Takes about three seconds real time.

The hard part comes after the nmap asn-query script finishes (in a couple of days) and it's
time to remove large numbers of newline (\n) characters. Leafpad can do this reliably, but
takes about fifteen minutes to remove a typical batch, which looks like this:
\n\t\n\nHost script results:\n|_asn-query: See the result for
and then replace it with a simple tab (\t).

Several other such batches of newline characters are similar, but removing just
one newline character at a time without tying it to a specific word removes all
the tabs, which makes the entire file unreadable. Pairing the \n with a preceding
right parenthesis doesn't help, because tr does the translation one character at
a time, so
tr -d 'the quick brown fox jumps over the lazy dog'
erases all the characters of the alphabet regardless of the order in which they appear.

George Langford

Magic Banana

Joined: 07/24/2010

Indeed, ‘tr’ translates, squeezes, and/or deletes *characters*, whereas sed's s command substitutes strings. To substitute a sequence of literal "\n" and/or "\t" with one single tabulation:
$ sed 's/\\[nt]\(\\[nt]\)*/\t/g'

Is this what you want? If you want to substitute every "\n" or "\t" (even when in sequence):
$ sed 's/\\[nt]/\t/g'
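
A small sketch contrasting the two tools on made-up inputs. The function name collapse_escapes is hypothetical, and \t in the replacement part assumes GNU sed:

```shell
# tr works on a set of characters: every "a" and every "n" is deleted,
# wherever it appears and in whatever order.
printf 'banana\n' | tr -d 'an'

# sed works on strings: each run of literal \n and \t (a backslash
# followed by n or t) is replaced with one single tabulation.
collapse_escapes() {
    sed 's/\\[nt]\(\\[nt]\)*/\t/g'
}

printf '%s\n' 'a\n\t\n\nb' | collapse_escapes
```

The first command prints "b"; the second prints "a" and "b" separated by one tabulation.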

amenex
Joined: 01/03/2015

Alas, sed won't let me play with the end-parentheses character, even when I'm using it
as a handle to select the immediately following newline character and to select where
to position the substitute tab: [)n] to be replaced by [)t]. I haven't worked out a
suitable adaptation of Magic Banana's suggested syntax:
$ sed 's/\\[nt]\(\\[nt]\)*/\t/g'

Another tack would be to use tr -d '()' to eliminate the pesky parentheses and then
attach the \n's to the preceding digit (0 through 9) of the IPv4 address.

I was hoping to remove the newline characters that could be associated with adjacent
strings on the same line and replace them with tabs next to those strings, and at last
remove the \n's that appear as the sole characters on the intervening lines.

The goal is to remove all the newline characters except those that are after the
following two types of string sequences:
"Origin AS: 701"
where the last digit is 0 through 9 and
"See the result for 100.19.104.32"
where the last digit is 0 through 9

Success in that endeavour would give one line for each IPv4 address in the nmap results.
The excess tabs are easier to control.
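
One way to read that goal, sketched with awk under stated assumptions: the input contains real newline characters, blank lines can be dropped, and a record ends at a line finishing with "Origin AS: <digits>" or "See the result for <address>". The function name join_records and the sample input are made up for illustration, not taken from the actual nmap output:

```shell
# Join lines with tabulations; keep the newline only after the two kinds
# of lines that end a record.
join_records() {
    awk '
        NF == 0 { next }                      # drop blank lines
        { printf "%s", $0 }                   # print the line without newline
        /Origin AS: [0-9]+$/ || /See the result for [0-9.]+$/ {
            printf "\n"; open = 0; next       # end of a record
        }
        { printf "\t"; open = 1 }             # otherwise continue the record
        END { if (open) printf "\n" }         # close an unfinished last record
    '
}

printf 'A\n\nB\nx Origin AS: 701\nC\nSee the result for 1.2.3.4\n' | join_records
```

With this made-up input, the six input lines become two tab-separated output lines.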

George Langford

Magic Banana

Joined: 07/24/2010

Alas, sed won't let me play with the end-parentheses character
(...)
I was hoping to remove the newline characters

As far as I understand your last post, it was about substituting literal "\t" and "\n". Not parentheses. Not newlines. To be understandable, you could show an input that is as short and simple as possible (but not simpler/shorter: all cases must be present) and the expected output.

That said, and as I have already written to you, I suspect you are not using the proper Nmap output format. https://nmap.org/book/output-formats-grepable-output.html describes a "grepable output" but starts by saying "the XML output format is far more powerful, and is nearly as convenient for experienced users".

amenex
Joined: 01/03/2015

About my playing with the end-of-line character (\n) ...
Applying nMap's grepable-output function, as in this script:
sort TestGrepOutnMapNine.txt |
sudo nmap -Pn -sn -T4 --max-retries 16 --script asn-query -iL '-' -oG - |
grep "Host:" '-' > TestGrepOutnMapNineGrep.txt

eliminates the blank lines without further ado.

It skips the --script asn-query data, but for reconnaissance purposes, that's OK.

The preceding sort operation may seem unnecessary, but it makes the script update
its output to the hard drive every few minutes, which is assuring to watch, even
while it's accumulating data at [only] 100kbytes per hour.

Getting back to the end-of-line characters: the output of a similar nMap scan
without the -oG argument is attached.

Incidentally, with all the worldwide home schooling and online university classes,
these scans are currently (April-May 2020) intolerably slow, 100bytes/sec instead
of the former 70kbytes/sec. Nevertheless, my T420's CPU is working very hard to
process what little data it's getting.

George Langford

Attachment                      Size
nMap-asn-query.output.txt       1.3 KB
TestGrepOutnMapNine.txt         123 bytes

amenex
Joined: 01/03/2015

Alas, I could not make Magic Banana's script for generating randomized IPv4 addresses work,
most certainly because I fail to comprehend sufficiently what's going on. I tried constructing
an od script, but the best I could do was to generate 3rd & 4th octets which violate the 255
maximum of a one-byte octet, with no clear way of limiting their decimal values.
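
One possible cause, offered only as an assumption since the od command that was tried is not shown: the output type decides how many bytes od groups into each number. A sketch with two fixed bytes of value 255 (the function names as_octets and as_words are made up):

```shell
# -t u1 prints every byte on its own, so each value stays within 0..255.
as_octets() { od -A n -t u1; }
# -t u2 groups two bytes into one number, which can reach 65535.
as_words() { od -A n -t u2; }

printf '\377\377' | as_octets
printf '\377\377' | as_words
```

With -t u1, as in Magic Banana's script, every number is a valid octet.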

So I turned to that famous but maligned search engine, Google, where I found this gem:
https://stackoverflow.com/questions/24059607/nmap-taking-a-random-sample-from-a-range-of-ip-ranges-is-combining-ir-and
where it's said:
nmap -n -sL -iL ip_ranges -oG - | awk '/^Host/{print $2}' | shuf -n 100000
I tried that script out, but it looks up all the IP addresses in the entire CIDR block first and then shuffles
the results before selecting a modest number of them ... after we have had to wait for the resolution of them all.

But there's a better one leading the way:
https://unix.stackexchange.com/questions/455081/how-does-this-script-generate-an-ip-address-randomly
where it's said:
nmap -n -iR 10 -sL | awk '/report for/ { print $NF }'
but that's apparently all that any of the many nmap scripts are written to do.

Then it dawned on me that I could pinch the 3rd and 4th octets and graft them onto the first two octets of
my recent-visitor data's collected and not-looked-up IPv4 addresses:
sudo nmap -n -iR 2286000 -sL | awk '/report for/ { print $NF }' - | sed 's/\./\t/g' '-' | awk '{print $3"."$4}' '-' > Octets.3and4.File01.txt
awk '{ for (i = 0; ++i <= 2540; ) print }' "Prefixes.IPv4.txt" > Octets.1and2.File02.txt ;
paste Octets.1and2.File02.txt Octets.3and4.File01.txt | tr -d '\t' > List-IPv4.addresses.txt

I had to run the three scripts in two batches because the output of the second script interferes with entry
of my sudo password; the second and third sections of the script don't have this problem. That's why File01
and File02 are in the opposite of the normal order in the paste script.

The recent-visitor list produced over fifty thousand CIDR/16 prefixes, so I used the split command to
subdivide the main prefix file into batches of 900 addresses each, producing 2.29 million IPv4 addresses
with random 3rd & 4th octets.

I checked for repetitions of those addresses in one of the randomized sets of 3rd & 4th octets:
uniq -c Prefixes.IPv4.May2020.49.File01.txt | sort -nrk 1 > Repeats4901.txt
which found 29 duplicates out of 2.29 million addresses, a minuscule fraction.
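
A caveat worth checking: uniq only compares adjacent lines, so on an unsorted file it can miss repetitions. A hedged sketch that sorts first, with a made-up three-line sample and a hypothetical function name count_duplicates:

```shell
sample=$(mktemp)
printf '1.1.1.1\n2.2.2.2\n1.1.1.1\n' > "$sample"

# Prints nothing: the two copies of 1.1.1.1 are not adjacent.
uniq -d "$sample"

# Sorting first brings identical lines together; uniq -d then prints
# one line per repeated address, and wc -l counts them.
count_duplicates() {
    sort "$1" | uniq -d | wc -l | tr -d ' '
}

count_duplicates "$sample"
```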

George Langford

Attachment                        Size
Prefixes.IPv4_.May2020.49.txt     6.42 KB

Magic Banana

Joined: 07/24/2010

Alas, I could not make Magic Banana's script for generating randomized IPv4 addresses work

What does not work? What do you want to do? Generate whole IPv4 addresses instead of only the last two bytes? Then turn -w2 into -w4 (and double the value in argument of -N):
$ nb=10; od -A n -t u1 -w4 -N $(expr 4 \* $nb) /dev/urandom | tr -s ' ' . | cut -c 2-
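
The same command wrapped in a function, as a sketch (the name random_ipv4 is made up; GNU od is assumed for -w):

```shell
# Print $1 random IPv4 addresses: four random bytes per line, spaces
# turned into dots, and the leading dot cut off.
random_ipv4() {
    od -A n -t u1 -w4 -N "$((4 * $1))" /dev/urandom | tr -s ' ' . | cut -c 2-
}

random_ipv4 5
```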

amenex
Joined: 01/03/2015

Here's what I was trying to accomplish:

Starting with Magic Banana's script:

TMP=$(mktemp -u)
trap "rm $TMP 2>/dev/null" 0
mkfifo $TMP
awk '{ for (i = 0; ++i <= 512; ) print }' "SS.IPv4-NLU-January2020-Prefixes.txt" > $TMP &
od -A n -t u1 -w2 -N $(expr 1024 \* $(wc -l < "SS.IPv4-NLU-January2020-Prefixes.txt")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP - > TempMB42920.5160x512.txt

Which I adjusted to apply to May2020's recent-visitor data with these alterations:
TMP=$(mktemp -u)
trap "rm $TMP 2>/dev/null" 0
mkfifo $TMP
awk '{ for (i = 0; ++i <= 512; ) print }' "Prefixes.May2020.Slash16.txt" > $TMP &
od -A n -t u1 -w2 -N $(expr 1024 \* $(wc -l < "Prefixes.May2020.Slash16.txt")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP - > TempMB42920.5160x512.txt

Trial run, using the single prefix in Prefix.May2020.Slash16.txt ==>

TMP=$(mktemp -u)
trap "rm $TMP 2>/dev/null" 0
mkfifo $TMP
awk '{ for (i = 0; ++i <= 512; ) print }' "Prefix.May2020.Slash16.txt" > $TMP &
od -A n -t u1 -w2 -N $(expr 1024 \* $(wc -l < "Prefix.May2020.Slash16.txt")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP - > Temp0710G.txt

Here's what the terminal looks like after I run this set of scripts:
TMP=$(mktemp -u)
trap "rm $TMP 2>/dev/null" 0
mkfifo $TMP
awk '{ for (i = 0; ++i <= 512; ) print }' "Prefix.May2020.Slash16.txt" > $TMP &
[1] 31825
od -A n -t u1 -w2 -N $(expr 1024 \* $(wc -l < "Prefix.May2020.Slash16.txt")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP - > Temp0710G.txt
[1]+ Done awk '{ for (i = 0; ++i <= 512; ) print }' "Prefix.May2020.Slash16.txt" > $TMP

Temp0710G.txt is just a list of 512 copies of the two-octet prefix (185.5.) without the 3rd & 4th octets that
paste was intended to append.

George Langford

Magic Banana

Joined: 07/24/2010

Which I adjusted to apply to May2020's recent-visitor data with these alterations

You want to write reusable scripts. The input and the output should not be hard-coded. Neither should how many times each prefix is replicated. There can be a default though: 1 looks reasonable. And there should be a help message. For instance, copy-paste that in a file:
#!/bin/sh
# Show the usage when called without argument or with -h/--help.
if [ -z "$1" -o "$1" = "-h" -o "$1" = "--help" ]
then
    printf "Usage: $0 prefixes [nb_of_replications]\n"
    exit
fi
# Without a second argument, each prefix is used once: no need for awk.
if [ -z "$2" ]
then
    od -A n -t u1 -w2 -N $(expr 2 \* $(wc -l < "$1")) /dev/urandom | tr -s ' ' . | paste -d '' "$1" -
    exit
fi
# Otherwise, awk replicates each prefix $2 times into a named pipe read by paste.
TMP=$(mktemp -u)
trap "rm $TMP 2>/dev/null" 0
mkfifo $TMP
awk -v rep=$2 '{ for (i = 0; i != rep; ++i) print }' "$1" > $TMP &
od -A n -t u1 -w2 -N $(expr 2 \* $2 \* $(wc -l < "$1")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP -

Turn that file executable. You now need to call it with at least one argument: the file to process. The second argument, if it exists, must be how many times each prefix is to be replicated. You can redirect the output of that execution. For instance, if the file is ./complement-ipv4, you can execute:
$ ./complement-ipv4 Prefixes.May2020.Slash16.txt 512 > TempMB42920.5160x512.txt

Assuming the prefix file Prefix.May2020.Slash16.txt in your script is the one you attached in your previous post (Prefixes.IPv4_.May2020.49.txt), the only issue I see is that every line ends with a supernumerary dot. Delete it:
$ sed -i 's/\.$//' Prefixes.May2020.Slash16.txt

Of course, you could put the substitution in the script. However, it would take useless computing resources when the prefix file does not have useless dots (that also take disk space).
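
For completeness, a sketch of what putting the substitution in the script could look like: a filter that strips one trailing dot per line, applied to the prefix file before awk or paste read it. The function name strip_trailing_dot is hypothetical:

```shell
# Remove a single dot at the end of each line; other lines pass unchanged.
strip_trailing_dot() {
    sed 's/\.$//'
}

printf '185.5.\n10.0\n' | strip_trailing_dot
```

In the script, "$1" would then be read through this filter, at the cost of one more process per run.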

amenex
Joined: 01/03/2015

After sitting on Magic Banana's suggested executable script for the near-forever four days, I found the
time and concentration to try it.

First, I stripped those trailing dots (which facilitated pasting nmap's random 3rd & 4th octets) with the
sed script suggested by Magic Banana:
sed -i 's/\.$//' Prefixes.IPv4.May2020.Corrected00.txt
which doesn't need or use a renamed output.

Then I copied the suggested script to a file I call MB.IPv4.urandom.txt, saved it as MB.IPv4.urandom.bin,
and made it executable with:
sudo chmod +x MB.IPv4.urandom.bin
I then applied it to the first of the 38 divisions of the CIDR/16 prefixes from May's Current Visitor data:
./MB.IPv4.urandom.bin Prefixes.IPv4.May2020.Corrected00.txt 2560 > Addresses.IPv4.00.678x2560.txt
This script took about three seconds and produced a 25.5 MB file of 1,735,680 addresses: 2,560 for each of the 678 CIDR/16 prefixes.

My nmap method of generating those 3rd & 4th octets takes considerably more scripting and about 45 seconds
to produce about the same number of addresses. Many fewer places to enter new arguments, too.

Nice work, Magic Banana !

George Langford

Magic Banana

Joined: 07/24/2010

After sitting on Magic Banana's suggested executable script for the near-forever four days, I found the time and concentration to try it.

It is the solution I first gave in this thread: https://trisquel.info/forum/syntax-problems-nested-loops#comment-148319

I only:

  • generalized it a bit: $2 replaces 4;
  • added a help message;
  • treated the default value for $2 (1) separately (to avoid running awk for nothing), but I could have simply set $2 to 1.
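
A sketch of that last alternative, assuming the POSIX parameter expansion ${2:-1} is used to supply the default. The function name replicate and the sample file are made up for illustration:

```shell
# Print each line of the file $1 as many times as $2 says (1 by default).
replicate() {
    rep=${2:-1}     # use 1 when the second argument is missing or empty
    awk -v rep="$rep" '{ for (i = 0; i != rep; ++i) print }' "$1"
}

sample=$(mktemp)
printf '185.5\n185.39\n' > "$sample"
replicate "$sample"        # each prefix once
replicate "$sample" 3      # each prefix three times
```

This is simpler, but it runs awk even when each prefix is needed only once.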

which doesn't need or use a renamed output.

'info sed' explains that -i "specifies that files are to be edited in-place".
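
A short sketch of in-place editing on a throwaway file. Appending a suffix to -i (here .bak) keeps a copy of the original; the file contents are made up:

```shell
f=$(mktemp)
printf '185.5.\n185.39.\n' > "$f"

# The file itself is modified; the previous content is kept in $f.bak.
sed -i.bak 's/\.$//' "$f"

cat "$f"
```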

Nice work, Magic Banana !

Thank you.