Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

awk: Join lines based on beginning or ending pattern

I have a multiline file where the records are separated by a newline (\n).

A record that is split across several lines is marked at the split by the text break;.

How can I use awk so that a line that begins or ends with break; is joined onto the previous or next line, while the record separator is retained?

Input ($ represents EOL). Multiple break; markers can be ignored or treated as one:

A| break;$
B| break;$
C| break;$
D$
E|$
break; FGH|$
break; IJ| break;$
KLM| break;
NOP$

Desired output:

A|B|C|D$
E|FGH|IJ|KLM|NOP$

Current code (it works for a trailing break; but does not join lines beginning with break; onto the previous one):

awk '{if (sub(/break;$/,"")) printf "%s", $0; else if (sub(/^break;/,"")) printf $0,"%s"; else print $0}' myfile

I suspect the problem is in the else if part, but I can't figure out the correct syntax to join a line onto the previous one if it begins with break;.

Any help would be appreciated but please consider awk solutions only.

UPDATE:

Thank you to everyone who contributed! The suggestions below work, but some records with consecutive break; markers in them still seem to cause problems:

A| break;
B| break;
C| break; break;
break; D$


1 Reply


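You may try this awk. It buffers every line that contains break; or ends with |, and prints the buffer once a line with neither arrives:

awk '!/break;|\|$/{print s $0; s=""; next} {gsub(/ *break; */, ""); s = s $0}' file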

A|B|C|D
E|FGH|IJ|KLM|NOP
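
For the consecutive break; case in the question's update, a line such as "break; D" has to end the record even though it contains break;. A minimal sketch of one way to handle that is to decide joining only from where break; sits on the line: a leading break; joins onto the previous line, and a trailing break; pulls in the next one. The join_breaks wrapper and the variable names starts, ends, pending and buf are hypothetical, chosen for illustration, and the sketch assumes break; never appears as real data inside a record.

```shell
# join_breaks is a hypothetical helper name, not part of the original post.
join_breaks() {
    awk '
    {
        starts = /^ *break;/      # leading break; means: join onto the previous line
        ends   = /break; *$/      # trailing break; means: the next line joins onto this one
        gsub(/ *break; */, "")    # drop every break; marker, repeated ones included
        # Flush the buffered record unless this line continues it.
        if (NR > 1 && !starts && !pending) { print buf; buf = "" }
        buf = buf $0
        pending = ends
    }
    END { if (NR > 0) print buf } # flush the last record
    ' "$@"
}

# The sample from the update; prints A|B|C|D
printf '%s\n' 'A| break;' 'B| break;' 'C| break; break;' 'break; D' | join_breaks
```

Run against the original input, the same sketch should also give the two desired lines, because "E|" is followed by a line that begins with break;. Unlike the one-liner, it does not treat a trailing | on its own as a continuation, so that rule would need to be added back if the data relies on it.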
