Solved

Selective Replace of characters

Posted on 2009-07-13
Last Modified: 2012-06-27
I have a file where each line represents a database record. Each field is fixed width.
I need to add a "prefix" to one of the numeric fields. So in one file every record's value gets 1000 added to it, in a second file 2000, and so on (I know for a fact that the numbers have fewer digits than the amount I'm adding).

I've already made many manipulations to the file using sed, but I can't find a way to do this one. For instance, suppose the lines are of the form shown below. I'm trying to change the 10th character to my prefix and then replace all the consecutive spaces after it with 0. It's this replacement I'm having trouble with.

I know how to identify the fields I want:
sed 's/^\(.\{9\}\) \([ ]*\)\(.*\)$/\11\2\3/g'  #doesn't work because it doesn't replace spaces, just the prefix
I know how to replace spaces with 0:
sed 's/ /0/g'   #which doesn't work either as it replaces all spaces and not just the ones I want.

I'm probably just being dense and can't think outside the box right now. Does anybody have any suggestions? (It doesn't need to use sed at all, if there's a better alternative.)

Sample text:
 
Field01     2     3Field04
Field01    23     5Field04
Field01     5     3Field04
Field01     2     4Field04
 
should become
 
Field01  1002     3Field04
Field01  1023     5Field04
Field01  1005     3Field04
Field01  1002     4Field04

Question by:pauloaguia

Accepted Solution

by: pauloaguia
ID: 24839370
I managed to get an answer on my own. For this sample, the command would be something like the one shown below.

The command is actually a loop. It starts by creating a label a (the :a part), which is the point the loop jumps back to.

The next expression matches any 9 characters, then one more character (the one I want to replace with the prefix), then up to 3 non-space characters (the resulting number will have 3 more digits after the first one), followed by a space and then at least 13 more characters (at least 13, to make sure it won't start matching the spaces after the number).
The substitution is rather simple: everything is kept except for the prefix character, which becomes the prefix, and the first space after it, which gets turned into a 0.

Finally, the 'ta' part means that if a replacement has occurred, the program should jump back to the label :a. Thus the loop is created.
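The label-and-branch idiom can also be tried on its own with a much smaller pattern. What follows is only an illustrative sketch, not the command for this file: pad_demo is a made-up name, and the pattern simply turns leading spaces into zeros, one space per pass.

```shell
# Toy illustration of the ":a ... ta" loop (hypothetical helper, not the real command).
# Each pass converts one leading space, after any zeros already written, into a 0.
# "ta" jumps back to :a only if that substitution changed something.
pad_demo() {
  sed -e :a -e 's/^\(0*\) /\10/' -e ta
}

printf '%s\n' '   42' | pad_demo
```

The line goes through "0  42", "00 42" and "00042"; on the fourth pass nothing matches, so sed falls out of the loop and prints 00042. Note that \10 here is back-reference 1 followed by a literal 0, since sed only has \1 to \9.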

To understand this better, I also added what one of the lines would look like after each iteration...
sed -e :a -e 's/\(.\{9\}\).\([^ ]\{0,3\}\) \(.\{13,\}\)/\11\20\3/g;ta'
 
 
iterations for the first line:
it1: Field01  10 2     3Field04
it2: Field01  1002     3Field04
it3: No match. Don't jump back to :a
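
Since each file needs a different prefix, the same loop could be wrapped in a small shell function. This is just a sketch under a few assumptions: the name add_prefix and its single-digit argument are made up for illustration, the records have the same widths as the sample, and I anchored the pattern with ^ and $, which the one-liner above does not do.

```shell
#!/bin/sh
# add_prefix DIGIT  (hypothetical helper, for illustration only)
# Reads fixed-width records on stdin and writes them to stdout with DIGIT
# in column 10 and the spaces between it and the number turned into zeros.
add_prefix() {
  prefix=$1   # a single digit, e.g. 1 for "add 1000"
  # Double quotes let the shell expand ${prefix}; \$ keeps the end anchor literal.
  sed -e :a \
      -e "s/^\(.\{9\}\).\([^ ]\{0,3\}\) \(.\{13,\}\)\$/\1${prefix}\20\3/" \
      -e ta
}

add_prefix 1 <<'EOF'
Field01     2     3Field04
Field01    23     5Field04
Field01     5     3Field04
Field01     2     4Field04
EOF
```

With the sample records this prints the four "should become" lines from the question, and passing 2 instead of 1 gives 2002, 2023 and so on. One thing I noticed while tracing the pattern: a record whose number already fills columns 11 to 13 (three digits) has no space left to match, so the substitution never fires and the prefix is not written. Such records would need a separate rule.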
