• Status: Solved

Joining IP and port number

Hello Experts!

There is a file which includes multiple IP - port combos that look like "10.12.192.1 [newline char, lots of blanks, newline, newline, blanks] 21". I would rather join IP and port number directly (IP:port) instead of stripping the newlines and blanks using regexp, if this is possible.

thanks in advance
Tube
1 Solution
 
andreif Commented:
# open file
open(F,"test.txt") or die("$!\n");
# read text
$data = join("",<F>);
# re-format it
$data =~ s/(\d+\.\d+\.\d+\.\d+)[\n\r\s]+(\d+)[\n\r\s]+/$1:$2\n/sg;
#close file
close(F);

#print it out
print $data;
 
andreif Commented:
If that doesn't work, post a sample of the data file, please :)
 
2b3 Author Commented:
Seems to work in about half of all cases.
The combo is always IP [lots of newlines, blanks and, rarely, comments] PORT. The port number is either 21, 22 or 80; however, the IPs aren't on a single subnet.

if there is a way to get the IP:port combo WITHOUT using "hard coded" regexps, I'd appreciate it if it does not eat multiple combos because it's too greedy ;)
if it works, it's ok though.

source file:  

#some comments here

     

     

       

        81.42.183.56

       

       

       :21

       

       
   
       :80.33.64.201

       

       

       :80

       

       

        216.61.101.30

       

       

       :80

       

       
etc.

#some comments at the end
 
andreif Commented:
Here is another version:


# open file
open(F,"test.txt") or die("$!\n");
# read data
$data = join("", <F>);
# remove comments
$data =~ s/#.*$//gm;

# match IP and PORT pairs; the pattern is fairly strict, not greedy
@matches = ($data =~ /(\d+\.\d+\.\d+\.\d+)[\n\r\s\:]+(\d+)[\n\r\s\:]+/sg);

# print them out
while(@matches) {
        print shift(@matches) .  ':' . shift(@matches) . "\n";
}

close(F);
 
FishMonger Commented:
Here's another method.

open IP, 'ip_port.txt' or die $!;
{
local $/;  # undefine the input record separator inside this block only (slurp mode)
$ip_port = <IP>;
while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+).*?(:\d+)/gs) {
   print "$1$2\n";
   push @ip_port, "$1$2";  #  if you need to put into an array
}
}
 
FishMonger Commented:
Here's a more optimized regex:


/\G[\D]+(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs
 
FishMonger Commented:
Actually, that "optimized" regex may not do what's expected, so use this instead:

/\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs
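A minimal sketch of how that pattern behaves on data shaped like the sample posted in this thread. The helper name extract_pairs and the $sample string are made up for illustration; they are not part of anyone's posted solution.

```perl
use strict;
use warnings;

# Hypothetical helper: return every "IP:port" pair found in a string.
sub extract_pairs {
    my ($text) = @_;
    my @pairs;
    # .*? lazily skips junk before the IP; [^:]+ runs up to the colon
    # that precedes the port, so one match cannot swallow a second pair.
    while ($text =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs) {
        push @pairs, "$1$2";
    }
    return @pairs;
}

# Made-up sample that mirrors the layout of the posted source file.
my $sample = "#comment\n\n   81.42.183.56\n\n   \n   :21\n\n   216.61.101.30\n  \n   :80\n";
print "$_\n" for extract_pairs($sample);
```

Because [^:]+ stops at the first colon, an IP followed by a comment that itself contains a colon would probably be skipped, so stripping comments first (as andreif's second version does) seems the safer order.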
 
FishMonger Commented:
If you want to shorten the code a little, you could rearrange the lines a little and do this:

open IP, 'ip_port.txt' or die $!;
{ # create a bare block so you can localize the input record separator
local $/;
$ip_port = <IP>;
print "$1$2\n" while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs);
}

If you don't need to localize the separator, shorten it to this:

open IP, 'ip_port.txt' or die $!;
undef $/;
$ip_port = <IP>;
print "$1$2\n" while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs);


If you want/need the IPs in an array, change the one line and add a new print statement.

push @ip, "$1$2" while ($ip_port =~ /\G.*?(\d+\.\d+\.\d+\.\d+)[^:]+(:\d+)/gs);
print "$_\n" foreach @ip;
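On the question of whether to localize the separator: a small sketch of why it can matter. A global undef $/ stays in effect for every later read in the program, while local $/ is undone when the enclosing block or sub exits. The slurp helper and the in-memory $text below are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: slurp a whole filehandle without disturbing $/ for callers.
sub slurp {
    my ($fh) = @_;
    local $/;              # undefined only inside this sub; restored on return
    return scalar <$fh>;   # with $/ undefined, one read returns the whole file
}

my $text = "line one\nline two\n";

open my $in, '<', \$text or die "Cannot open in-memory file: $!";
my $all = slurp($in);
close $in;

# $/ is "\n" again here, so this read goes line by line as usual.
open $in, '<', \$text or die "Cannot open in-memory file: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines read after slurping ", length($all), " characters\n";
```

In a short one-off script the global form is harmless; the localized form is mainly worth it when other code in the same program still reads files line by line.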
 
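fantasy1001 Commented:
open FILE, "filename" or die "Error: $!";
while(<FILE>){
   s/#.*//;                                         # drop comments
   s/\s+//g;                                        # drop blanks and the newline
   if (/(\d+\.\d+\.\d+\.\d+)/) { $ip = $1 }         # remember the latest IP
   elsif (defined $ip && /(:\d+)/) { $hash{$ip} = $1 }  # store its port
}
close FILE;

# note: hash keys come back in no particular order
foreach (keys %hash){
   print $_ . $hash{$_} . "\n";
}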
 
FishMonger Commented:
2b3,

The solution that you accepted from andreif is good, but just for the heck of it I thought I'd show you one more method, which I believe is the most efficient.  Part of the reason is that all but one of the other solutions read the entire input file into a variable; this one doesn't.  Instead, it holds only one IP address at a time in a variable (not counting the $1 variable from the regex) and prints the IP and port number as soon as the port is seen, which keeps resource usage to a minimum.  fantasy1001's solution is the next most efficient.

open IP, 'file.txt' or die $!;
while (<IP>) {
   if (/(\d+\.\d+\.\d+\.\d+)/) {
      $ip = $1;
   }
   elsif (/(:\d+)/) {
      print "$ip$1\n";
   }
}
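A hedged variation on the same line-by-line idea, for anyone who wants it under strict and warnings: it uses a lexical filehandle, drops comments before matching, and ignores a port line that shows up before any IP has been seen. The helper name join_ip_port and the in-memory $data are assumptions for illustration (the sample values are taken from the source file posted earlier), not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle line by line, return "IP:port" strings.
sub join_ip_port {
    my ($fh) = @_;
    my ($ip, @pairs);
    while (my $line = <$fh>) {
        $line =~ s/#.*//;                           # ignore comments
        if ($line =~ /(\d+\.\d+\.\d+\.\d+)/) {
            $ip = $1;                               # remember the most recent IP
        }
        elsif (defined $ip && $line =~ /:(\d+)/) {
            push @pairs, "$ip:$1";                  # pair it with its port
            undef $ip;                              # assume one port per IP
        }
    }
    return @pairs;
}

# In-memory stand-in for the data file, so the sketch runs without test.txt.
my $data = "#some comments here\n\n    81.42.183.56\n\n   \n    :21\n\n"
         . "    216.61.101.30\n\n    :80\n#some comments at the end\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print "$_\n" for join_ip_port($fh);
close $fh;
```

To run it against a real file, open the handle on the file name instead of on \$data; only one line of input is held in memory at a time either way.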