Unix Shell Script: Take text file and insert a new line every 94 characters

dirknibleck
I have a text file where all of the text appears on the first line. I need to insert a line break after every 94 characters. How can I go about doing this in a shell script?
Most Valuable Expert 2013
Top Expert 2013

Commented:
If it's really just one line in the input file:

awk 'BEGIN{n=1}{while(substr($0,n,94)){print substr($0,n,94);n+=94}}' inputfile
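To watch the substr loop work without building a 94-character test line, the width can be passed in as a variable. This is only a sketch, assuming a POSIX awk that accepts -v; the function name split_line and the variable w are made up for illustration, and like the one-liner above it assumes a single line of input:

```shell
# split_line WIDTH: hypothetical helper, reads one line on stdin.
split_line() {
    awk -v w="$1" 'BEGIN { n = 1 }
    {
        # print successive w-character chunks until substr() runs off the end
        while (substr($0, n, w) != "") {
            print substr($0, n, w)
            n += w
        }
    }'
}

# 13 characters split at width 5: prints 12345, 67890 and 123 on separate lines
printf '%s\n' 1234567890123 | split_line 5
```

The loop stops as soon as substr() returns an empty string, which is why the last chunk may be shorter than the requested width.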
Hi dirknibleck,

This seems to work:
    perl -0pe 's/.{94}/$&\n/sg' inputfile

Notes:
- If the file is too big to fit in memory, then my method would not be advisable (but may still work, since virtual memory could be used).
- If there is a chance the input file could contain a null (ASCII 0) char, change the "-0pe" to "-0777 -pe" (less concise, but more flexible).
- If you want to replace the input file, you could do this:
    perl -i -0pe 's/.{94}/$&\n/sg' inputfile
- If you want to back up the old version of the input file as part of this, you could change the "-i" to something like "-i.bak" (which would back up to inputfile.bak).
Top Expert 2011

Commented:
@wmp
Your awk won't work.

With test data
--- inputfile ---
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
09876543210987654321098765432109876543210987654321098765432109876543210987654321098765432109876543210987654321
-------------

your script output is
-------
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234
567890

Author

Commented:
woolmilkporc - I'm getting an error because awk won't accept an input line longer than 3000 bytes.
Most Valuable Expert 2013
Top Expert 2013

Commented:
Which OS are you running? My AIX awk is working fine (OK, I only tested up to a 70K line length).

Do you have nawk? Maybe that one would do.

wmp

Author

Commented:
tel2 - do you know how I can use your script with variables? The following treats the input as a file literally named "ach" instead of using the value of the variable ach...


ach=some_changing_filename.txt ; export ach
(perl -0pe 's/.{94}/$&\n/sg' ach) > ach


Author

Commented:
woolmilkporc - HPUX. nawk is not found...

Author

Commented:
tel2 - never mind, I'm an idiot. It should be $ach.
Most Valuable Expert 2013
Top Expert 2013

Commented:
So you will have to download and install gawk (GNU awk) for HPUX:

http://hpux.connect.org.uk/hppd/hpux/Gnu/gawk-4.0.0/

or maybe mawk for HPUX:

http://hpux.connect.org.uk/hppd/hpux/Shells/mawk-1.3.4.0625/

wmp
Hi dirknibleck,
Thanks for the points.
As indicated in the notes in my post:
- If you want to replace the input file, you could do this:
    perl -i -0pe 's/.{94}/$&\n/sg' inputfile

So if your filename is in $ach, you could just go:
    perl -0pe 's/.{94}/$&\n/sg' $ach
Works for me.
Either way, I don't think exporting is necessary, and when I run this:
    ach=some_changing_filename.txt ; export ach
    (perl -0pe 's/.{94}/$&\n/sg' $ach) > $ach
I end up with an empty file, because the shell truncates $ach for the redirection before perl gets to read it.

Hi wmp,
I get the same output wesly_chen got, when I ran your solution on GNU Linux with gawk 3.1.3.
As another example, if I run this:
    cal | gawk 'BEGIN{n=1}{while(substr($0,n,94)){print substr($0,n,94);n+=94}}'
I just get this single line of output:
     August 2011
What do you get on AIX, and what version of gawk are you running?
Top Expert 2011

Commented:
My guess is that with gawk on Linux, the newline character splits the input into separate records, so $0 only holds one line at a time.
Most Valuable Expert 2013
Top Expert 2013

Commented:
Guys,

to test my solution you should use an input line that is more than 94 characters long, don't you think?
Or replace the "94" in my code with e.g. "5" and retry.

And I stated explicitly that this solution as posted will only work for one single line of input!

For multi-line input we will have to experiment with setting RS and FS:

awk 'BEGIN{n=1; RS=""; FS="\n"}{while(substr($0,n,94)){print substr($0,n,94);n+=94}}' inputfile

Note: the above will leave the linefeeds intact, so lines will not be joined.

wmp
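The outputs reported earlier in the thread look consistent with n never being reset: after the first line n has already moved past the end of the line, so substr() returns an empty string for every later line. A sketch of a per-line variant follows, assuming a POSIX awk; split_lines and w are made-up names. Like the RS/FS version above, it keeps the existing linefeeds and does not join lines.

```shell
# split_lines WIDTH [FILE...]: hypothetical helper; wraps every input line separately.
split_lines() {
    width=$1
    shift
    awk -v w="$width" '{
        # n restarts at 1 for each input line
        for (n = 1; n <= length($0); n += w)
            print substr($0, n, w)
    }' "$@"
}

# prints 12345, 67, abcde, fghij and kl on separate lines
printf '%s\n' 1234567 abcdefghijkl | split_lines 5
```

An empty input line produces no output with this loop, so blank lines are dropped.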
