One more sed example (should be awk?)


I am a translator!

Joined: 08/18/2020

I have a text file with things like

Some text.
nice ASN.1

more nice ASN.1

Some text.
More text.
other nice ASN.1
yet other nice ASN.1

I'd like to get only the lines between -- ASN1START and -- ASN1STOP.

After 1.5 hours of searching through the sed manual, I found section 6.3 on multiline techniques and was happy to see that the following seems to work: sed '/^-- ASN1STOP/!{H;d} ; x ; s/^.*-- ASN1START[^\n]*\(.*\)/\1/'

The example in section 6.3 contains /./{H;$!d}. The condition /./ is met on every line, so here it makes no difference. But if the address were something other than /./ and not met on every line, would the d command be executed only when both that address and $! are met, or whenever $! is met?
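A small experiment can settle this kind of question. The sketch below uses a hypothetical three-line input and the address /2/ in place of /./, and compares the braced form with a variant where d sits outside the braces. Commands inside { } run only on lines selected by the address in front of the block, so in the braced form d needs both /2/ and $! to match.

```shell
# Hypothetical three-line input; only the second line matches /2/.
input=$(printf '1\n2\n3\n')

# d is inside the block: it runs only on lines that match /2/
# and are not the last line. Lines 1 and 3 are printed as usual.
grouped=$(printf '%s\n' "$input" | sed '/2/{H;$!d;}')

# d is outside any block: it runs on every line except the last,
# whether or not the line matches /2/.
ungrouped=$(printf '%s\n' "$input" | sed -e '/2/H' -e '$!d')

printf 'grouped:\n%s\nungrouped:\n%s\n' "$grouped" "$ungrouped"
```

The braced form prints lines 1 and 3, while the unbraced form prints only line 3.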

Then, perhaps there is something simpler than that, with sed or with something else, awk maybe?

Magic Banana

I am a member!

I am a translator!

Joined: 07/24/2010

Using -n (to suppress automatic printing), a selection based on an interval of regular expressions and p to print is simpler:
$ sed -n '/^-- ASN1START$/,/^-- ASN1STOP$/p'
Selection based on an interval of regular expressions works as well with AWK:
$ awk '/^-- ASN1START$/,/^-- ASN1STOP$/'
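To see both commands side by side, here is a minimal sketch on a made-up sample file. The file content and the variable names are assumptions for illustration; the marker lines are assumed to be whole lines, as in the commands above.

```shell
# Hypothetical sample file with two marked blocks.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Some text.
-- ASN1START
nice ASN.1
-- ASN1STOP
More text.
-- ASN1START
other nice ASN.1
-- ASN1STOP
EOF

# Both commands select every line from a START marker to the next STOP marker,
# markers included.
with_sed=$(sed -n '/^-- ASN1START$/,/^-- ASN1STOP$/p' "$sample")
with_awk=$(awk '/^-- ASN1START$/,/^-- ASN1STOP$/' "$sample")

printf '%s\n' "$with_sed"
rm -f "$sample"
```

On this input, sed and awk produce the same six lines: the two blocks with their marker lines, without "Some text." and "More text.".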

If you do not want the lines "-- ASN1START" and "-- ASN1STOP", you can remove them, piping the output to grep:
$ ... | grep -Evxe '-- ASN1ST(ART|OP)'
I assumed "-- ASN1START" and "-- ASN1STOP" must be whole lines.
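As a sketch of the whole pipeline on a made-up five-line input (the input lines and the variable name are assumptions): in the grep call, -E enables the (ART|OP) alternation, -x requires the pattern to match a whole line, -v keeps the lines that do not match, and -e is needed because the pattern itself starts with "--".

```shell
# Select the marked block with sed, then drop the two marker lines with grep.
blocks=$(printf '%s\n' 'Some text.' '-- ASN1START' 'nice ASN.1' '-- ASN1STOP' 'More text.' |
  sed -n '/^-- ASN1START$/,/^-- ASN1STOP$/p' |
  grep -Evxe '-- ASN1ST(ART|OP)')

printf '%s\n' "$blocks"
```

Only "nice ASN.1" remains on this input.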

Another simple solution in AWK is to redefine the record separator (RS variable) and print even-numbered records:
$ awk -v RS='\n-- ASN1ST(ART|OP)\n' 'NR % 2 - 1'
That solution only works if "-- ASN1START" and "-- ASN1STOP" strictly alternate (no two consecutive "-- ASN1START" or two consecutive "-- ASN1STOP"). Also, the first line cannot be "-- ASN1START", because the record separator starts with a newline character.
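The following sketch illustrates both the working case and the first-line caveat. It assumes an awk that accepts a regular expression as RS (GNU awk and mawk do; this is not guaranteed by POSIX). The helper name extract and the input lines are made up for the example.

```shell
# Hypothetical helper wrapping the command above.
extract() { awk -v RS='\n-- ASN1ST(ART|OP)\n' 'NR % 2 - 1'; }

# Markers alternate and the first line is ordinary text:
# record 1 is "Some text.", record 2 is the ASN.1 block.
good=$(printf '%s\n' 'Some text.' '-- ASN1START' 'nice ASN.1' '-- ASN1STOP' 'More text.' | extract)

# The first line is a marker: no newline precedes it, so it is not seen as a
# separator. The record numbering shifts and the wrong text is selected.
bad=$(printf '%s\n' '-- ASN1START' 'nice ASN.1' '-- ASN1STOP' 'More text.' | extract)

printf 'good: %s\nbad: %s\n' "$good" "$bad"
```

The first call yields the ASN.1 line, the second yields "More text." instead.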


I am a translator!

Joined: 08/18/2020

Thanks, your examples with sed are definitely simpler.

I missed that an address range could be made of two regular expressions. I could not find any example of that in the manual, but the possibility is clearly mentioned in the text (in section 4.4).

About awk, I had forgotten that the record separator could be defined as a regular expression rather than a fixed string. As for the command, I guess you meant $ awk -v RS='-- ASN1ST(ART|OP)\n' 'FNR %2 == 0'
(in my file, the wanted text actually falls on even record numbers).

EDIT: You were probably editing your message when I tried it; your example as it stands now is fine. I am not sure what the difference between NR and FNR is.
EDIT 2: I found it: FNR restarts its count with each new input file while NR keeps counting across files, so with a single input file it makes no difference.

Magic Banana

I am a member!

I am a translator!

Joined: 07/24/2010

Exactly. As a consequence, whether it is better to use FNR or NR depends on whether the files are independent or continuations of each other (here with "-- ASN1START" towards the end of a file and "-- ASN1STOP" towards the beginning of the next file). I further edited my previous post: I added that selection based on an interval of regular expressions works as well with AWK and stressed the additional assumptions for the last solution to work.
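A minimal sketch of the difference, using two made-up two-line files (the file names and contents are assumptions for illustration):

```shell
# Two hypothetical input files of two lines each.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first.txt"
printf 'c\nd\n' > "$dir/second.txt"

# Print each line followed by NR (count across all files)
# and FNR (count within the current file).
counters=$(awk '{ print $0, NR, FNR }' "$dir/first.txt" "$dir/second.txt")

printf '%s\n' "$counters"
# a 1 1
# b 2 2
# c 3 1
# d 4 2
rm -rf "$dir"
```

With 'NR % 2 - 1', the parity therefore carries over from one file to the next, whereas with FNR it starts afresh in each file.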