Joining IP and port number

Hello Experts!

There is a file which includes multiple IP - port combos that look like "10.12.192.1 [newline char, lots of blanks, newline, newline, blanks] 21". I would rather join IP and port number directly (IP:port) instead of stripping the newlines and blanks using regexp, if this is possible.

thanks in advance
Tube
2b3Asked:

andreifCommented:
# open file
open(F,"test.txt") or die("$!\n");
# read text
$data = join("",<F>);
# re-format it
$data =~ s/(\d+\.\d+\.\d+\.\d+)[\n\r\s]+(\d+)[\n\r\s]+/$1:$2\n/sg;
#close file
close(F);

#print it out
print $data;
andreifCommented:
If that doesn't work, post a sample of the data file, please :)
2b3Author Commented:
It seems to work in only about half of all cases.
The combo is always IP [lots of newlines, blanks and, rarely, comments] PORT. The port number is always 21, 22 or 80; the IPs, however, aren't all on a single subnet.

If there is a way to get the IP:port combo WITHOUT using "hard-coded" regexps, I'd appreciate it, as long as it doesn't eat multiple combos by being too greedy ;)
If it works, it's OK though.

source file:  

#some comments here

     

     

       

        81.42.183.56

       

       

       :21

       

       
   
       :80.33.64.201

       

       

       :80

       

       

        216.61.101.30

       

       

       :80

       

       
etc.

#some comments at the end

andreifCommented:
here is another version:


# open file
open(F,"test.txt") or die("$!\n");
# read data
$data = join("", <F>);
# remove comments
$data =~ s/#.*$//gm;

# match IP/PORT pairs; the pattern is fairly strict, not greedy
@matches = ($data =~ /(\d+\.\d+\.\d+\.\d+)[\n\r\s\:]+(\d+)[\n\r\s\:]+/sg);

# print them out
while(@matches) {
        print shift(@matches) .  ':' . shift(@matches) . "\n";
}

close(F);
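
For anyone who wants to try this approach without a file on disk, here is a minimal sketch of the same idea wrapped in a sub. The name extract_pairs and the inline sample are my own, not part of the post above, and I swapped the trailing separator for a lookahead so that a port on the very last line (with no newline after it) is still picked up. Treat it as a sketch under those assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the whole file contents as one string and
# returns a list of "IP:port" strings.
sub extract_pairs {
    my ($data) = @_;
    $data =~ s/#.*$//gm;    # remove comments, as in the post above
    my @pairs;
    # IP, then any run of whitespace/colons, then the port digits.
    # (?![.\d]) stops the "port" from being the first octet of another IP.
    while ($data =~ /(\d+\.\d+\.\d+\.\d+)[\s:]+(\d+)(?![.\d])/g) {
        push @pairs, "$1:$2";
    }
    return @pairs;
}

# Sample shaped like the data posted in the question.
my $sample = "#some comments here\n\n   81.42.183.56\n\n   :21\n\n"
           . "   :80.33.64.201\n\n   :80\n";
print "$_\n" for extract_pairs($sample);
```

On that sample it should print 81.42.183.56:21 and 80.33.64.201:80, one per line.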
FishMongerCommented:
here's another method.

open IP, 'ip_port.txt' or die $!;
{
local $/;    # undefine the input record separator so <IP> reads the whole file
$ip_port = <IP>;
while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+).*?(:\d+)/gs) {
   print "$1$2\n";
   push @ip_port, "$1$2";  #  if you need to put into an array
}
}
FishMongerCommented:
Here's a more optimized regex:


/\G[\D]+(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs
FishMongerCommented:
Actually, that "optimized" regex may not do what's expected, so use this instead:

/\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs
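
Here is a quick way to see one case where the two patterns differ. It is only a sketch: the sample text is made up, and whether this is the case FishMonger had in mind is a guess on my part. Because of the leading \G, the [\D]+ version has to reach the first IP through non-digits only, so a digit inside a leading comment makes it fail outright, while the .*? version simply skips over it.

```perl
use strict;
use warnings;

# Made-up input: the comment contains a digit before the first IP.
my $text = "# 3 hosts follow\n  81.42.183.56\n\n  :21\n";

# List-context /g returns all captures from every successful match.
my @strict = $text =~ /\G[\D]+(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs;
my @lazy   = $text =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs;

print "[\\D]+ version found ", scalar(@strict) / 2, " pair(s)\n";
print ".*? version found ",    scalar(@lazy) / 2,   " pair(s)\n";
```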
FishMongerCommented:
If you want to shorten the code a little, you could rearrange the lines a little and do this:

open IP, 'ip_port.txt' or die $!;
{ # create a bare block so you can localize the input record separator
local $/;
$ip_port = <IP>;
print "$1$2\n" while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs);
}

If you don't need to localize the separator, shorten it to this:

open IP, 'ip_port.txt' or die $!;
undef $/;
$ip_port = <IP>;
print "$1$2\n" while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs);


If you want or need the IP:port pairs in an array, change that one line and add a new print statement.

push @ip, "$1$2" while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs);
print "$_\n" foreach @ip;
fantasy1001Commented:
open FILE, "filename" or die "Error: $!";
while(<FILE>){
   s/#.*//;                                  # drop comments first
   if (/(\d+\.\d+\.\d+\.\d+)/) { $ip = $1 }  # remember the latest IP
   elsif (/:(\d+)/) { $hash{$ip} = $1 }      # store the port for that IP
}
close FILE;

# hash keys come back in no particular order
foreach (keys %hash){
   print "$_:$hash{$_}\n";
}
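
One thing to keep in mind with a hash keyed by IP: if the same IP shows up with more than one port, a plain assignment keeps only the last port. Below is a rough sketch of a hash-of-arrays variant that keeps them all. The sub name ports_by_ip and the 10.0.0.x addresses are made up for illustration, and it takes a list of lines instead of a filehandle just to keep the demo self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: maps each IP to the list of ports seen after it.
sub ports_by_ip {
    my (@lines) = @_;
    my (%ports, $ip);
    for my $line (@lines) {
        $line =~ s/#.*//;                          # ignore comments
        if ($line =~ /(\d+\.\d+\.\d+\.\d+)/) {
            $ip = $1;                              # remember the latest IP
        }
        elsif (defined $ip && $line =~ /:(\d+)/) {
            push @{ $ports{$ip} }, $1;             # append, don't overwrite
        }
    }
    return %ports;
}

# Made-up sample lines.
my %ports = ports_by_ip("10.0.0.1\n", "  :21\n", "  :80\n", "10.0.0.2\n", " :22\n");
for my $ip (sort keys %ports) {
    print "$ip:$_\n" for @{ $ports{$ip} };
}
```

Sorting the keys gives a stable print order, which a bare keys %hash does not.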
FishMongerCommented:
2b3,

The solution that you accepted from andreif is good, but just for the heck of it I thought I'd show you one more method that I believe is the most efficient. Part of the reason is that all but one of the other solutions read the entire input file into a variable; this one doesn't. Instead, it holds only one IP address at a time in a variable (not counting the regex's $1) and prints the IP and port number as soon as the port is found, which keeps resource usage to a minimum. fantasy1001's solution is the next most efficient.

open IP, 'file.txt' or die $!;
while (<IP>) {
   if (/(\d+\.\d+\.\d+\.\d+)/) {
      $ip = $1;
   }
   elsif (/(:\d+)/) {
      print "$ip$1\n";
   }
}
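
If you want to try the line-by-line idea without creating a file, here is a small sketch that runs the same loop over an in-memory filehandle. The sub name pairs_from_handle is made up. It also adds two things that are not in the code above and are my assumptions: comments are stripped first, and the remembered IP is cleared after its port is printed so a stray port line can't reuse a stale IP.

```perl
use strict;
use warnings;

# Hypothetical helper: reads any filehandle line by line and returns the
# "IP:port" strings it finds, holding only the most recent IP in memory.
sub pairs_from_handle {
    my ($fh) = @_;
    my ($ip, @pairs);
    while (my $line = <$fh>) {
        $line =~ s/#.*//;                          # ignore comments
        if ($line =~ /(\d+\.\d+\.\d+\.\d+)/) {
            $ip = $1;                              # remember the latest IP
        }
        elsif (defined $ip && $line =~ /:(\d+)/) {
            push @pairs, "$ip:$1";
            undef $ip;                             # assumption: one port per IP
        }
    }
    return @pairs;
}

# Demo on an in-memory filehandle (a reference to a string).
my $text = "#comment\n   81.42.183.56\n\n   :21\n   :80.33.64.201\n   :80\n";
open my $fh, '<', \$text or die $!;
print "$_\n" for pairs_from_handle($fh);
close $fh;
```

Testing for the IP first matters for a line like ":80.33.64.201", which would otherwise also look like a port line.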