Solved

A proxy regex ?

Posted on 2007-08-07
Last Modified: 2008-12-13
http://www.experts-exchange.com/Web/Web_Languages/PHP/Q_22050007.html

I have read this topic about how to use a regex to extract proxies from a page. Here is the final code by TeRReF:

<?php

$s = '80.249.72.180:80          elite proxy          Algeria (Algiers)
80.249.76.82:80          anonymous          Algeria
190.49.168.251:6588          elite proxy          Argentina (Buenos Aires)
83.160.170.10      8080      transparent      Netherlands      2006-10-25      Whois
203.106.52.102      3128      transparent      Malaysia      2006-10-25      Whois
213.52.140.53      80      anonymous      Great Britain (UK)      2006-10-25      Whois
128.134.137.23:8080 3 Kb/s
138.89.253.5:33322 26 Kb/s
192.38.109.143:3128 23 Kb/s
165.228.131.10      3128      transparent      Australia      2006-10-25      Whois
212.174.34.186      8080      anonymous      Turkey      2006-10-25      Whois
1      202.147.181.2      8080      transparent      Pakistan      2006-11-05      WHOIS
2      62.101.80.187      8080      high anonymity      Italy      2006-11-05      WHOIS
3      84.20.143.8      8080      transparent      Finland      2006-11-05      WHOIS
4      203.115.1.135      80      transparent      Sri Lanka      2006-11-05      WHOIS
yoho.uwaterloo.ca:8000   transparent      Pakistan      2006-11-05      WHOIS
kleinbonum.ethz.ch:8000   elite proxy          Algeria (Algiers)';

preg_match_all('/([\w\d]+\.[\w\d\.]+)[:\s]+(\d{1,5})/i', $s, $matches);
$count = count($matches[1]);
for ($i = 0; $i < $count; $i++)
        $lines[] = $matches[1][$i].':'.$matches[2][$i];

print_r($lines);

?>

It works well with the strings inside $s, but if there is something like abcdef255.255.255.255:80 then the code goes wrong. Can anyone provide an update for this, please? All help will be appreciated. Thanks.
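
To make the failure concrete, here is a minimal sketch that runs the same pattern on the problematic string alone. The variable names $pattern and $m are only illustrative. Because the first character class accepts letters as well as digits, the leading "abcdef" is captured as part of the host.

```php
<?php
// Same pattern as in the code above, applied to the problematic input only.
$pattern = '/([\w\d]+\.[\w\d\.]+)[:\s]+(\d{1,5})/i';
preg_match_all($pattern, 'abcdef255.255.255.255:80', $m);

// $m[1] holds the host part and $m[2] the port of each match.
echo $m[1][0] . ':' . $m[2][0] . "\n"; // prints abcdef255.255.255.255:80
```

The regex cannot tell this apart from a host name such as yoho.uwaterloo.ca, since both start with a run of word characters followed by a dot.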
Question by:phpdotnet
LVL 51

Expert Comment

by:ahoffmann
ID: 19647387
The question here seems to be where you got abcdef255.255.255.255:80 from, rather than how to find a new regex.

As this example, abcdef255.255.255.255:80, is ambiguous, it will be hard to find a regex which also covers your other examples.
LVL 9

Accepted Solution

by:
keteracel earned 1000 total points
ID: 19647433
Try this:

<?php

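// Returns false when the last dot-separated part is numeric (so the string
// looks like an IP address) but an earlier part is not, e.g. "abcdef80.249.76.82".
// Anything whose last part is non-numeric is treated as a host name.
function isValidIpOrHost($str) {
  $parts = explode(".", $str);
  $last = count($parts) - 1;
  $allNumbers = true;

  for ($i = 0; $i <= $last; $i++) {
    if ("".intval($parts[$i]) == $parts[$i]) {
      // Numeric part: if it is the last one, the string is only valid
      // when every earlier part was numeric too.
      if ($i == $last) {
        return $allNumbers;
      }
    }
    else {
      $allNumbers = false;
    }
  }
  return true;
}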

$s = '80.249.72.180:80          elite proxy          Algeria (Algiers)
abcdef80.249.76.82:80          anonymous          Algeria
190.49.168.251:6588          elite proxy          Argentina (Buenos Aires)
83.160.170.10      8080      transparent      Netherlands      2006-10-25      Whois
203.106.52.102      3128      transparent      Malaysia      2006-10-25      Whois
213.52.140.53      80      anonymous      Great Britain (UK)      2006-10-25      Whois
128.134.137.23:8080 3 Kb/s
138.89.253.5:33322 26 Kb/s
192.38.109.143:3128 23 Kb/s
165.228.131.10      3128      transparent      Australia      2006-10-25      Whois
212.174.34.186      8080      anonymous      Turkey      2006-10-25      Whois
1      202.147.181.2      8080      transparent      Pakistan      2006-11-05      WHOIS
2      62.101.80.187      8080      high anonymity      Italy      2006-11-05      WHOIS
3      84.20.143.8      8080      transparent      Finland      2006-11-05      WHOIS
yoho.uwaterloo.ca:8000   transparent      Pakistan      2006-11-05      WHOIS
kleinbonum.ethz.ch:8000   elite proxy          Algeria (Algiers)
4      203.115.1.135      80      transparent      Sri Lanka      2006-11-05      WHOIS
80.126.109.localhost:80     transparent       UK      2006-01-02      
abcdef80.126.109.localhost:80     transparent       UK      2006-01-02      WHOIS';

preg_match_all('/([\w\d]+\.[\w\d\.]+)[:\s]+(\d{1,5})/i', $s, $matches);
$count = count($matches[1]);
$lines = array();
for ($i = 0; $i < $count; $i++)
    if (isValidIpOrHost($matches[1][$i])) {
        $lines[] = $matches[1][$i].':'.$matches[2][$i];
    }

print_r($lines);

?>
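
As a quick check of what the filter accepts and rejects, the following sketch calls isValidIpOrHost() on four of the host strings from $s. The function body is copied from the post above so that the snippet runs on its own. Only the choice of test strings is mine.

```php
<?php
// Copy of isValidIpOrHost() from the post above, so this snippet is self-contained.
function isValidIpOrHost($str) {
  $parts = explode(".", $str);
  $allNumbers = true;

  for ($i = 0; $i < count($parts); $i++) {
    if ("".intval($parts[$i]) == $parts[$i]) {
      if ($i == count($parts) - 1) {
        return $allNumbers;
      }
    }
    else {
      $allNumbers = false;
    }
  }
  return true;
}

var_dump(isValidIpOrHost('80.249.72.180'));        // bool(true): all parts numeric
var_dump(isValidIpOrHost('abcdef80.249.76.82'));   // bool(false): numeric ending, non-numeric start
var_dump(isValidIpOrHost('yoho.uwaterloo.ca'));    // bool(true): last part is not numeric
var_dump(isValidIpOrHost('80.126.109.localhost')); // bool(true): last part is not numeric
```

Note that the function only rejects strings that end in a number but contain a non-numeric part earlier. Anything ending in a non-numeric part is accepted as a host name, which is why both of the "localhost" lines in $s pass the filter.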
LVL 51

Assisted Solution

by:ahoffmann
ahoffmann earned 1000 total points
ID: 19647436
# here is a lazy one, to be improved in many ways
preg_match_all('/(([\w\d]+\.[\w\d\.]+){3-}|(\d+\.){3}\d+)[:\s]+(\d{1,5})/i', $s, $matches);
# you need to adapt your code to the new layout of $matches
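
One caveat when trying the pattern above: "{3-}" is not a quantifier in PCRE, so those characters are matched literally rather than as a repeat count. Presumably "{3,}" (three or more) was meant, but that is an assumption on my part, not something the post states. The small sketch below shows the difference on a toy pattern.

```php
<?php
// "{3-}" does not have quantifier syntax, so PCRE treats it as literal text.
var_dump(preg_match('/a{3-}/', 'aaa'));   // int(0): no literal "{3-}" in the subject
var_dump(preg_match('/a{3-}/', 'a{3-}')); // int(1): matches the literal characters

// "{3,}" is the quantifier for "three or more repetitions".
var_dump(preg_match('/a{3,}/', 'aaa'));   // int(1)
```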

Author Comment

by:phpdotnet
ID: 19650458
Thank you very much for your help. That's all good.