9thTee
asked on
Splitting a long field into multiple shorter fields.
On a Linux machine, I need to split product description fields that are up to 256 characters long into multiple fields no longer than 76 characters. The catch is that I need to split at a space, so each of the resulting fields should be as long as possible, break at a space, and be no longer than 76 characters. I assume awk or sed can do this, but I'm not sure where to start.
Any help would be appreciated.
Hi Tee,
What do you need to do with the items once they're split?
awk is probably the right tool. By default it splits fields on whitespace. The only question is: once split, then what?
Perl is the way to go. Identify all the space positions.
Then cut the data accordingly.
Could probably be optimized slightly, but here's a basic AWK script that seems to get the job done.
BEGIN {
    # set max length of output lines
    maxLen = 76
    # initialize work variables for output line
    outLine = ""
    outLen = 0
}
{
    # loop through all space-delimited fields in this input line
    for (i = 1; i <= NF; i++) {
        # get length of this chunk
        l = length($i)
        # will adding this chunk (plus a space) exceed the max output line length?
        # (skip the check when nothing is pending, so no empty line is printed)
        if (outLine != "" && outLen + l + 1 > maxLen) {
            # print accumulated output line, and clear it
            print outLine
            outLine = ""
        }
        # if first chunk added to output line, no space separator is added
        if (outLine == "") {
            outLine = $i
            outLen = l
        } else {
            outLine = outLine " " $i
            outLen = outLen + l + 1
        }
    }
    # print any pending output line that was built
    if (outLine != "") {
        print outLine
    }
    # reset work variables for the next input line
    outLine = ""
    outLen = 0
}
EDITED: Added comments, and fixed output length calculation.
»bp
ASKER
Hi Bill,
This does exactly what I asked, but I need a small change: for the output, I would like the newly created fields (76 characters maximum) to all be on one line and pipe ("|") delimited. Is that possible?
abc….76 characters max...xyz|abc….76 characters max...xyz|abc….76 characters max...xyz
Thanks,
Mark
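For the one-line, pipe-delimited layout requested above, one possible adaptation of the earlier AWK script is sketched below. This is an illustration only, not the accepted answer from the thread. The shell function name wrap_pipe is hypothetical, as is the file name in the usage note. The sketch assumes that runs of whitespace may be collapsed to single spaces and that a single word longer than the limit is left whole rather than cut.

```shell
# wrap_pipe [maxLen]: hypothetical helper, reads lines on stdin.
# Each input line is re-emitted as ONE output line whose chunks are at most
# maxLen characters (default 76), broken at spaces and joined with "|".
wrap_pipe() {
  awk -v maxLen="${1:-76}" '
  {
    outLine = ""   # chunk currently being built
    result  = ""   # finished chunks, each followed by "|"
    for (i = 1; i <= NF; i++) {
      if (outLine == "") {
        # first word of a chunk: no leading space
        outLine = $i
      } else if (length(outLine) + 1 + length($i) > maxLen) {
        # word does not fit: close the chunk and start a new one
        result = result outLine "|"
        outLine = $i
      } else {
        outLine = outLine " " $i
      }
    }
    # emit all chunks for this input line on a single output line
    print result outLine
  }'
}
```

It could be run as `wrap_pipe 76 < descriptions.txt`, where descriptions.txt stands in for a file with one description per line. Building each input line's result in a string and printing once keeps the fields from one description together on one output line.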
ASKER CERTIFIED SOLUTION
This solution is only available to Experts Exchange members.
ASKER
Perfect, thanks for your help.
Welcome.
»bp
Do you want each "chunk" less than 76 characters output on a separate line?
»bp