• Status: Solved

modify/sort data in text file

Hi,

I have a 30,000 line text file and need some help with data formatting.

Let me use an example; say I have lines as below:
C
 host_10_57_31_66
 host_10_57_31_67
 host_10_80_40_201
 host_10_80_40_202
A
 10_14_60_0_22       
 10_14_63_0_24       
 10_14_64_0_24       
B
 host_10_13_5_116
 host_10_13_5_117

I want to modify it as below:

set C address host_10_57_31_66
set C address host_10_57_31_67
set C address host_10_80_40_201
set C address host_10_80_40_202

set A address 10_14_60_0_22       
set A address 10_14_63_0_24       
set A address 10_14_64_0_24       

set B address host_10_13_5_116
set B address host_10_13_5_117

Thank you for all the help.
Asked by: dpk_wal
1 Solution
 
sjklein42 Commented:
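while ( <> )
{
	s/[\r\n]//g;	# strip line endings (Unix or DOS)

	if ( /^([A-Z])/ )	# name line: starts with an upper case letter
	{
		if ( $letter ne '' ) { print "\n"; }	# blank line between groups
		$letter = $1;
	}
	elsif ( /^ / )	# address line: starts with a blank
	{
		$addr = $';	# everything after the leading blank
		print "set $letter address $addr\n";
	}
}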

>perl foo.pl foo.txt
set C address host_10_57_31_66
set C address host_10_57_31_67
set C address host_10_80_40_201
set C address host_10_80_40_202

set A address 10_14_60_0_22
set A address 10_14_63_0_24
set A address 10_14_64_0_24

set B address host_10_13_5_116
set B address host_10_13_5_117

 
sjklein42 Commented:
Not sure where the "sorting" part of your project (see title) is supposed to come in.
 
gopisera Commented:
The easy way is to use the sed command and replace the code with the common lines.
 
dpk_wal (Author) Commented:
Thank you for the suggestions; I will try on Monday and get back to you.

sjklein42:
It kind of gets sorted for me, since C is a set which contains addresses! :)

gopisera:
Sorry, but you would need to give the full command rather than a suggestion; I am a total zero with scripting.

Regards.
 
dpk_wal (Author) Commented:
Hi sjklein42,

This works for my sample file, which has A, B, C as address names; in the actual file the names are longer and are a combination of letters, numbers, and dash (-) or underscore (_).

The file format otherwise remains the same: a name on the first line, followed by host_ or 10_ entries [all ending with a newline character].

All entries below an address set name always start with host_ or 10_, if that helps you.

Can you please post modification to your code.
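
A minimal sketch of the rule described above, not a tested replacement for the posted script: it ignores leading blanks entirely, treats any line that starts with host_ or 10_ as an address, and treats every other non-empty line as a name. The sub name convert and the names grp-A_1 and grp-B_2 are made up for illustration, and the sketch assumes that no name itself begins with host_ or 10_.

```perl
use strict;
use warnings;

# Hypothetical helper: turn name/address lines into "set NAME address ADDR" lines.
# Assumption: address lines start with host_ or 10_ (after optional blanks),
# and every other non-empty line is a name.
sub convert {
    my @lines = @_;
    my ( $name, @out );
    for my $line (@lines) {
        $line =~ s/[\r\n]//g;       # drop line endings (Unix or DOS)
        $line =~ s/^\s+|\s+$//g;    # drop leading and trailing blanks
        next if $line eq '';        # skip empty lines
        if ( $line =~ /^(?:host_|10_)/ ) {
            next unless defined $name;    # address before any name: ignore it
            push @out, "set $name address $line";
        }
        else {
            push @out, '' if defined $name;    # blank line between groups
            $name = $line;
        }
    }
    return @out;
}

# Made-up names, with addresses borrowed from the sample in the question.
my @sample = ( "grp-A_1", " 10_14_60_0_22", " 10_14_63_0_24",
               "grp-B_2", " host_10_13_5_116" );
print "$_\n" for convert(@sample);
```

To run it on a real file, the last two statements could be replaced with print "$_\n" for convert(<>); and the script called as perl foo.pl yourfile. Because the rule looks at the text rather than the indentation, it should not matter how many blanks precede each line.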

Thank you.
 
sjklein42 Commented:
I am not sure if there is a problem or not.

Please post a new sample input file with the expanded format.  Can you include an example where my solution program does not give the right output?
 
dpk_wal (Author) Commented:
Here's the output:
-bash-2.05b$ perl foo.pl d
set  address src_57949
set  address   10_14_60_0_22
set  address   10_14_63_0_24
set  address   10_14_64_0_24
set  address src_CR56066
set  address   host_10_57_31_66
set  address   host_10_57_31_67
set  address   host_10_80_40_201
-bash-2.05b$ cat d
 src_57949
   10_14_60_0_22
   10_14_63_0_24
   10_14_64_0_24
 src_CR56066
   host_10_57_31_66
   host_10_57_31_67
   host_10_80_40_201

where the script works:
-bash-2.05b$ perl foo.pl foo.txt
set C address host_10_57_31_66
set C address host_10_57_31_67
set C address host_10_80_40_201
set C address host_10_80_40_202

set A address 10_14_60_0_22
set A address 10_14_63_0_24
set A address 10_14_64_0_24

set B address host_10_13_5_116
set B address host_10_13_5_117

-bash-2.05b$ cat foo.txt
C
 host_10_57_31_66
 host_10_57_31_67
 host_10_80_40_201
 host_10_80_40_202
A
 10_14_60_0_22
 10_14_63_0_24
 10_14_64_0_24
B
 host_10_13_5_116
 host_10_13_5_117

As I understand it, the relevant line in the code is:
       if ( /^([A-Z])/ )
As far as I can tell, this only matches upper case A-Z; I am not 100% sure, though.
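
To check that reading, here is a small sketch that runs the same two patterns against a few sample lines. The sub name classify is only for illustration. It suggests that case is not the only factor: the pattern is anchored at the start of the line, so a name with a leading blank never matches it and instead falls through to the /^ / branch while the name is still empty.

```perl
use strict;
use warnings;

# Illustrative helper: report which branch of the original script a line takes.
sub classify {
    my ($line) = @_;
    return 'name'    if $line =~ /^([A-Z])/;    # first character is an upper case letter
    return 'address' if $line =~ /^ /;          # first character is a blank
    return 'ignored';                           # neither branch matches
}

# Two lines from foo.txt, one from "d", and the same name without its blank.
for my $line ( "C", " host_10_57_31_66", " src_57949", "src_57949" ) {
    printf "%-20s => %s\n", "'$line'", classify($line);
}
```

With these patterns, " src_57949" is treated as an address and "src_57949" without the blank is skipped altogether, which would fit the "set  address src_57949" lines in the output above.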

Thank you.
 
sjklein42 Commented:
This version should handle names with digits, dashes and underscores, lower and uppercase.

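while ( <> )
{
	s/[\r\n]//g;	# strip line endings (Unix or DOS)

	if ( /^([A-Z0-9\-\_]+)/i )	# name line: starts with a letter, digit, dash or underscore
	{
		if ( $name ne '' ) { print "\n"; }	# blank line between groups
		$name = $1;
	}
	elsif ( /^ / )	# address line: starts with a blank
	{
		$addr = $';	# everything after the leading blank
		print "set $name address $addr\n";
	}
}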

 
dpk_wal (Author) Commented:
Still doesn't work :(

-bash-2.05b$ perl foo.pl d
set  address src_57949
set  address   10_14_60_0_22
set  address   10_14_63_0_24
set  address   10_14_64_0_24
set  address src_CR56066
set  address   host_10_57_31_66
set  address   host_10_57_31_67
set  address   host_10_80_40_201
-bash-2.05b$ cat foo.pl
#!/usr/local/bin/perl
while ( <> )
{
        s/[\r\n]//g;

        if ( /^([A-Z0-9\-\_]+)/i )
        {
                if ( $name ne '' ) { print "\n"; }
                $name = $1;
        }
        elsif ( /^ / )
        {
                $addr = $';
                print "set $name address $addr\n";
        }
}
 
sjklein42 Commented:
Please, I don't have the input data you are using, so I can't try it myself.  Can you post the input data that is not working right?
 
dpk_wal (Author) Commented:
I have posted my input data file named "d".

-bash-2.05b$ cat d
 src_57949
   10_14_60_0_22
   10_14_63_0_24
   10_14_64_0_24
 src_CR56066
   host_10_57_31_66
   host_10_57_31_67
   host_10_80_40_201
 
sjklein42 Commented:
The problem is that the new "d" file  has a blank character at the beginning of the "name" lines, and three blank characters at the beginning of each of the address lines.

Your original data had no blanks on the name lines and one blank on the address lines.

Is there a consistent rule to be followed here?  How are we to recognize the name lines?  The rule I was using was that there were no blanks at the beginning of the name lines, but that is not the case in your "d" file.

What are the rules for blanks at the beginning of the lines in the input file?
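
One candidate rule, offered only as a sketch until the real format is confirmed: take the indentation of the first non-empty line as the "name" indent, and treat any line indented more deeply as an address. The sub name convert_by_indent is hypothetical, and the rule assumes the indentation is consistent within one file. It would not help with a file that has no indentation at all.

```perl
use strict;
use warnings;

# Sketch of an indentation rule (an assumption, not confirmed by the asker):
# the first non-empty line is a name and sets the "name indent";
# any line indented more deeply than that is an address.
sub convert_by_indent {
    my @lines = @_;
    my ( $name, $name_indent, @out );
    for my $line (@lines) {
        $line =~ s/[\r\n]//g;                        # drop line endings
        next unless $line =~ /^(\s*)(\S.*?)\s*$/;    # skip empty or blank-only lines
        my ( $indent, $text ) = ( length $1, $2 );
        $name_indent = $indent unless defined $name_indent;
        if ( $indent > $name_indent ) {
            push @out, "set $name address $text";
        }
        else {
            push @out, '' if defined $name;          # blank line between groups
            $name = $text;
        }
    }
    return @out;
}

# Lines in the style of the "d" file: one blank before names, three before addresses.
my @d = ( " src_57949", "   10_14_60_0_22", " src_CR56066", "   host_10_57_31_66" );
print "$_\n" for convert_by_indent(@d);
```

The same sub should also accept the original layout (no blank before names, one before addresses), since it only compares each line's indent against the first line's.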
 
sjklein42 Commented:
This is what I expected to see for input.  I can adjust but we need to be able to describe the right "rule" for leading space characters.

src_57949
 10_14_60_0_22
 10_14_63_0_24
 10_14_64_0_24
src_CR56066
 host_10_57_31_66
 host_10_57_31_67
 host_10_80_40_201 

 
dpk_wal (Author) Commented:
I removed all leading spaces; now I do not get anything at all:
-bash-2.05b$ perl foo.pl d







-bash-2.05b$ cat d
src_57949
10_14_60_0_22
10_14_63_0_24
10_14_64_0_24
src_CR56066
host_10_57_31_66
host_10_57_31_67
host_10_80_40_201

-bash-2.05b$ cat foo.pl
#!/usr/local/bin/perl
while ( <> )
{
        s/[\r\n]//g;

        if ( /^([A-Z0-9\-\_]+)/i )
        {
                if ( $name ne '' ) { print "\n"; }
                $name = $1;
        }
        elsif ( /^ / )
        {
                $addr = $';
                print "set $name address $addr\n";
        }
}
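
A possible explanation for the blank output, sketched under the assumption that the file now has no leading blanks at all: every line, names and addresses alike, then matches the name pattern, so the script only ever takes the first branch and never prints a "set" line. The sub name count_branches is hypothetical; it just counts how many lines would go down each branch of foo.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many lines take each branch of foo.pl.
sub count_branches {
    my @lines = @_;
    my %count = ( name => 0, address => 0, neither => 0 );
    for my $line (@lines) {
        $line =~ s/[\r\n]//g;    # drop line endings
        if    ( $line =~ /^([A-Z0-9\-\_]+)/i ) { $count{name}++ }
        elsif ( $line =~ /^ / )                { $count{address}++ }
        else                                   { $count{neither}++ }
    }
    return %count;
}

# The eight lines of "d" as shown above, with all leading blanks removed.
my %c = count_branches(
    "src_57949",   "10_14_60_0_22",    "10_14_63_0_24",    "10_14_64_0_24",
    "src_CR56066", "host_10_57_31_66", "host_10_57_31_67", "host_10_80_40_201",
);
print "$_: $c{$_}\n" for sort keys %c;
```

All eight lines count as names here. Since the script prints one newline for each name after the first, that would account for the seven blank lines in the output.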
 
sjklein42 Commented:
I provided the solution program and debugged the problem with your input.  Why no points?
 
sjklein42 Commented:
dpk_wal,

I am sorry you are getting frustrated, but  if you look at the data I posted, there is one leading space on the lines with addresses, and no leading spaces on the lines with names.  You must have changed that when you copied it into your test.

Why do you keep changing the format of the input data?
 
dpk_wal (Author) Commented:
Oops, looks like an error on my part; I wanted to give you 500 points! I will check; thank you for objecting.
 
sjklein42 Commented:
dpk_wal,

Thank you, friend.  But I think you may have clicked the wrong button again.
 
dpk_wal (Author) Commented:
Sorry, it has been a crazy day; I think I just clicked a post with code, and in both cases it was the output I was putting forward. Sorry again!