Solved

modify/sort data in text file

Posted on 2011-02-19
Last Modified: 2012-05-11
Hi,

I have a 30,000 line text file and need some help with data formatting.

Let me use an example; say I have lines as below:
C
 host_10_57_31_66
 host_10_57_31_67
 host_10_80_40_201
 host_10_80_40_202
A
 10_14_60_0_22       
 10_14_63_0_24       
 10_14_64_0_24       
B
 host_10_13_5_116
 host_10_13_5_117

I want to modify it as below:

set C address host_10_57_31_66
set C address host_10_57_31_67
set C address host_10_80_40_201
set C address host_10_80_40_202

set A address 10_14_60_0_22       
set A address 10_14_63_0_24       
set A address 10_14_64_0_24       

set B address host_10_13_5_116
set B address host_10_13_5_117

Thank you for all the help.
Question by:dpk_wal
19 Comments
 
LVL 16

Expert Comment

by:sjklein42
ID: 34932610
while ( <> )
{
	s/[\r\n]//g;

	if ( /^([A-Z])/ )
	{
		if ( $letter ne '' ) { print "\n"; }
		$letter = $1;
	}
	elsif ( /^ / )
	{
		$addr = $';
		print "set $letter address $addr\n";
	}
}


>perl foo.pl foo.txt
set C address host_10_57_31_66
set C address host_10_57_31_67
set C address host_10_80_40_201
set C address host_10_80_40_202

set A address 10_14_60_0_22
set A address 10_14_63_0_24
set A address 10_14_64_0_24

set B address host_10_13_5_116
set B address host_10_13_5_117

 
LVL 16

Expert Comment

by:sjklein42
ID: 34932614
Not sure where the "sorting" part of your project (see title) is supposed to come in.
 
LVL 3

Expert Comment

by:gopisera
ID: 34933195
The easy way is to use the sed command to prefix each address line with the common "set ... address" text.
 
LVL 32

Author Comment

by:dpk_wal
ID: 34935428
Thank you for the suggestions; I will try them on Monday and get back to you.

sjklein42:
It kind of gets sorted for me, as C is a set which contains addresses! :)

gopisera:
Sorry, but you would need to give the full command rather than a suggestion; I am a total zero with scripting.

Regards.
 
LVL 32

Author Comment

by:dpk_wal
ID: 34940237
Hi sjklein42,

This works for my sample file, which has A, B, C as address names; in the actual file the names are longer and are a combination of letters, digits, and hyphens (-) or underscores (_).

The sample file format still remains the same: I have the name on the first line, followed by host_ or 10_ entries [all ending with a newline character].

All entries below the address set name always start with host_ or 10_, if that helps you.

Can you please post a modification to your code?

Thank you.
 
LVL 16

Expert Comment

by:sjklein42
ID: 34940269
I am not sure if there is a problem or not.

Please post a new sample input file with the expanded format.  Can you include an example where my solution program does not give the right output?
 
LVL 32

Author Comment

by:dpk_wal
ID: 34940401
Here's the output:
-bash-2.05b$ perl foo.pl d
set  address src_57949
set  address   10_14_60_0_22
set  address   10_14_63_0_24
set  address   10_14_64_0_24
set  address src_CR56066
set  address   host_10_57_31_66
set  address   host_10_57_31_67
set  address   host_10_80_40_201
-bash-2.05b$ cat d
 src_57949
   10_14_60_0_22
   10_14_63_0_24
   10_14_64_0_24
 src_CR56066
   host_10_57_31_66
   host_10_57_31_67
   host_10_80_40_201

where the script works:
-bash-2.05b$ perl foo.pl foo.txt
set C address host_10_57_31_66
set C address host_10_57_31_67
set C address host_10_80_40_201
set C address host_10_80_40_202

set A address 10_14_60_0_22
set A address 10_14_63_0_24
set A address 10_14_64_0_24

set B address host_10_13_5_116
set B address host_10_13_5_117

-bash-2.05b$ cat foo.txt
C
 host_10_57_31_66
 host_10_57_31_67
 host_10_80_40_201
 host_10_80_40_202
A
 10_14_60_0_22
 10_14_63_0_24
 10_14_64_0_24
B
 host_10_13_5_116
 host_10_13_5_117

As I understand it, line 5 of the code is:
       if ( /^([A-Z])/ )
so we are only matching an uppercase letter A-Z at the start of the line; I am not 100% sure, though.

Thank you.
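
The reading above can be checked with a small sketch. The helper name `name_from_line` is hypothetical and only wraps the pattern quoted from the first script; the test strings are taken from the two sample files in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured name if the line matches the
# first script's "name" pattern, or undef if it does not.
sub name_from_line {
    my ($line) = @_;
    return $line =~ /^([A-Z])/ ? $1 : undef;
}

for my $line ( 'C', 'src_57949', ' src_57949', 'CR56066' ) {
    my $name = name_from_line($line);
    printf "%-12s => %s\n", "'$line'",
        defined $name ? "name '$name'" : 'no match';
}
```

Only 'C' and 'CR56066' match, and for 'CR56066' just the first letter is captured, because the pattern takes a single uppercase character anchored at the start of the line. A lowercase or blank-prefixed name such as ' src_57949' never matches, which is consistent with the empty name in the output shown above.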
 
LVL 16

Accepted Solution

by:
sjklein42 earned 500 total points
ID: 34940458
This version should handle names with digits, dashes and underscores, lower and uppercase.

while ( <> )
{
	s/[\r\n]//g;

	if ( /^([A-Z0-9\-\_]+)/i )
	{
		if ( $name ne '' ) { print "\n"; }
		$name = $1;
	}
	elsif ( /^ / )
	{
		$addr = $';
		print "set $name address $addr\n";
	}
}

 
LVL 32

Author Comment

by:dpk_wal
ID: 34941450
Still doesn't work :(

-bash-2.05b$ perl foo.pl d
set  address src_57949
set  address   10_14_60_0_22
set  address   10_14_63_0_24
set  address   10_14_64_0_24
set  address src_CR56066
set  address   host_10_57_31_66
set  address   host_10_57_31_67
set  address   host_10_80_40_201
-bash-2.05b$ cat foo.pl
#!/usr/local/bin/perl
while ( <> )
{
        s/[\r\n]//g;

        if ( /^([A-Z0-9\-\_]+)/i )
        {
                if ( $name ne '' ) { print "\n"; }
                $name = $1;
        }
        elsif ( /^ / )
        {
                $addr = $';
                print "set $name address $addr\n";
        }
}
 
LVL 16

Expert Comment

by:sjklein42
ID: 34941497
Please, I don't have the input data you are using, so I can't try it myself.  Can you post the input data that is not working right?
 
LVL 32

Author Comment

by:dpk_wal
ID: 34941503
I have posted my input data file named "d".

-bash-2.05b$ cat d
 src_57949
   10_14_60_0_22
   10_14_63_0_24
   10_14_64_0_24
 src_CR56066
   host_10_57_31_66
   host_10_57_31_67
   host_10_80_40_201
 
LVL 16

Expert Comment

by:sjklein42
ID: 34941538
The problem is that the new "d" file has one blank character at the beginning of the "name" lines, and three blank characters at the beginning of each of the address lines.

Your original data had no blanks on the name lines and one blank on the address lines.

Is there a consistent rule to be followed here?  How are we to recognize the name lines?  The rule I was using was that there were no blanks at the beginning of the name lines, but that is not the case in your "d" file.

What are the rules for blanks at the beginning of the lines in the input file?
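
One possible rule that does not depend on leading blanks at all is to classify lines by content. The sketch below is not the accepted script; it assumes, based on the asker's description, that every address entry starts with "host_" or "10_" and that no set name does. The sub name `to_set_commands` is hypothetical.

```perl
use strict;
use warnings;

# Sketch: decide "name" vs "address" by content, ignoring indentation.
# Assumption: addresses always start with "host_" or "10_"; any other
# non-blank line is taken to be a set name.
sub to_set_commands {
    my @lines = @_;                       # copy, so the caller's data is untouched
    my ( $name, @out );
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;          # trim blanks, tabs, CR and LF at both ends
        next if $line eq '';              # skip empty lines
        if ( $line =~ /^(?:host_|10_)/ ) {
            # addresses seen before any name are silently dropped
            push @out, "set $name address $line\n" if defined $name;
        }
        else {
            push @out, "\n" if defined $name;   # blank line between sets
            $name = $line;
        }
    }
    return @out;
}

# Part of the "d" file, with one and three leading blanks.
my @sample = (
    " src_57949\n",   "   10_14_60_0_22\n",
    " src_CR56066\n", "   host_10_57_31_66\n",
);
print to_set_commands(@sample);
```

To run it on a real file, the last two statements could be replaced with `print to_set_commands(<>);` and the script called as `perl foo.pl d`. If a set name could itself begin with "10_" or "host_", this rule would misclassify it and an indentation-based rule would be needed instead.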
 
LVL 16

Expert Comment

by:sjklein42
ID: 34941581
This is what I expected to see for input.  I can adjust but we need to be able to describe the right "rule" for leading space characters.

src_57949
 10_14_60_0_22
 10_14_63_0_24
 10_14_64_0_24
src_CR56066
 host_10_57_31_66
 host_10_57_31_67
 host_10_80_40_201 

 
LVL 32

Author Comment

by:dpk_wal
ID: 34941613
I removed all leading spaces; now I do not get anything at all:
-bash-2.05b$ perl foo.pl d







-bash-2.05b$ cat d
src_57949
10_14_60_0_22
10_14_63_0_24
10_14_64_0_24
src_CR56066
host_10_57_31_66
host_10_57_31_67
host_10_80_40_201

-bash-2.05b$ cat foo.pl
#!/usr/local/bin/perl
while ( <> )
{
        s/[\r\n]//g;

        if ( /^([A-Z0-9\-\_]+)/i )
        {
                if ( $name ne '' ) { print "\n"; }
                $name = $1;
        }
        elsif ( /^ / )
        {
                $addr = $';
                print "set $name address $addr\n";
        }
}
 
LVL 16

Expert Comment

by:sjklein42
ID: 34941662
I provided the solution program and debugged the problem with your input.  Why no points?
 
LVL 16

Expert Comment

by:sjklein42
ID: 34941719
dpk_wal,

I am sorry you are getting frustrated, but if you look at the data I posted, there is one leading space on the lines with addresses, and no leading spaces on the lines with names.  You must have changed that when you copied it into your test.

Why do you keep changing the format of the input data?
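
To illustrate why the leading blanks matter, here is a small sketch that applies the accepted script's two patterns to single lines. The helper name `classify` is hypothetical; the patterns are copied from the script.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the two branches of the accepted script:
# returns 'name', 'address', or 'ignored' for one input line.
sub classify {
    my ($line) = @_;
    return 'name'    if $line =~ /^([A-Z0-9\-\_]+)/i;   # no leading blank
    return 'address' if $line =~ /^ /;                  # at least one leading blank
    return 'ignored';
}

for my $line ( 'src_57949', '10_14_60_0_22', ' 10_14_60_0_22',
               '   10_14_60_0_22', ' src_57949' ) {
    printf "%-20s %s\n", "'$line'", classify($line);
}
```

With no leading blanks, both 'src_57949' and '10_14_60_0_22' are classified as names, so the script only ever prints the blank separator line and no "set" lines. With a leading blank on every line, ' src_57949' is classified as an address and no name is ever set, which gives the "set  address ..." output seen earlier.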
 
LVL 32

Author Comment

by:dpk_wal
ID: 34941796
Oops, looks like an error on my part; I wanted to give you 500 points! I will check; thank you for objecting.
 
LVL 16

Expert Comment

by:sjklein42
ID: 34941816
dpk_wal,

Thank you, friend.  But I think you may have clicked the wrong button again.
 
LVL 32

Author Comment

by:dpk_wal
ID: 34942185
Sorry, it has been a crazy day; I think I just clicked on a post with code, and in both cases it was the post where I was showing the output; sorry again!