phpdotnet

A proxy regex?

https://www.experts-exchange.com/questions/22050007/stipping-out-proxy-list-from-web-using-php.html

I have read this topic about how to use a regex to strip proxies out of a page. Here is the final code by TeRReF:

<?php

$s = '80.249.72.180:80          elite proxy          Algeria (Algiers)
80.249.76.82:80          anonymous          Algeria
190.49.168.251:6588          elite proxy          Argentina (Buenos Aires)
83.160.170.10      8080      transparent      Netherlands      2006-10-25      Whois
203.106.52.102      3128      transparent      Malaysia      2006-10-25      Whois
213.52.140.53      80      anonymous      Great Britain (UK)      2006-10-25      Whois
128.134.137.23:8080 3 Kb/s
138.89.253.5:33322 26 Kb/s
192.38.109.143:3128 23 Kb/s
165.228.131.10      3128      transparent      Australia      2006-10-25      Whois
212.174.34.186      8080      anonymous      Turkey      2006-10-25      Whois
1      202.147.181.2      8080      transparent      Pakistan      2006-11-05      WHOIS
2      62.101.80.187      8080      high anonymity      Italy      2006-11-05      WHOIS
3      84.20.143.8      8080      transparent      Finland      2006-11-05      WHOIS
4      203.115.1.135      80      transparent      Sri Lanka      2006-11-05      WHOIS
yoho.uwaterloo.ca:8000   transparent      Pakistan      2006-11-05      WHOIS
kleinbonum.ethz.ch:8000   elite proxy          Algeria (Algiers)';

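// Capture a host (dotted name or IP) and a port separated by ':' or whitespace.
preg_match_all('/([\w\d]+\.[\w\d\.]+)[:\s]+(\d{1,5})/i', $s, $matches);
$lines = array();
$count = count($matches[1]);
for ($i = 0; $i < $count; $i++) {
    // Normalize every entry to host:port.
    $lines[] = $matches[1][$i].':'.$matches[2][$i];
}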

print_r($lines);

?>

It works well with the strings inside $s, but if there is something like abcdef255.255.255.255:80, the code goes wrong. Can anyone provide an update for this, please? All help will be appreciated. Thanks.
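To make the failure concrete, here is a minimal reproduction. It applies the pattern from the code above to the problematic string only; the variable name $pattern is illustrative and not part of the original code.

```php
<?php

// The pattern from the question, applied to the problematic input.
$pattern = '/([\w\d]+\.[\w\d\.]+)[:\s]+(\d{1,5})/i';
preg_match_all($pattern, 'abcdef255.255.255.255:80', $matches);

// [\w\d]+ also accepts letters, so the stray prefix ends up in the host part.
echo $matches[1][0] . ':' . $matches[2][0] . "\n"; // abcdef255.255.255.255:80
```

The leading letters are kept because the first character class cannot tell a hostname label from junk glued onto an IP address.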
ahoffmann

The question here seems to be where you got abcdef255.255.255.255:80 from, rather than how to find a new regex.

As this example, abcdef255.255.255.255:80, is ambiguous, it will be hard to find a regex that also covers your other examples.
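One possible way around the ambiguity is to decide up front that a valid dotted quad always wins over a longer hostname reading. The sketch below assumes that rule and should not be read as the accepted solution from this thread. The function name extractProxies and both patterns are hypothetical, introduced here for illustration: each line is first tested against a strict IPv4 pattern with no left boundary (so glued-on letters are dropped), and only if that fails against a hostname pattern.

```php
<?php

// Hypothetical helper, not from the thread: returns "host:port" strings, one per line.
function extractProxies($text)
{
    // One octet in the range 0-255.
    $octet = '(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)';
    // Strict dotted quad with no left boundary, so a glued-on prefix is ignored.
    $ipPattern = '/(' . $octet . '(?:\.' . $octet . '){3})[:\s]+(\d{1,5})\b/';
    // Fallback for hostnames such as yoho.uwaterloo.ca:8000.
    $hostPattern = '/\b([a-z\d-]+(?:\.[a-z\d-]+)+)[:\s]+(\d{1,5})\b/i';

    $proxies = array();
    foreach (preg_split('/\R/', $text) as $line) {
        // Assumption: an IP reading is preferred whenever one is possible.
        if (preg_match($ipPattern, $line, $m) || preg_match($hostPattern, $line, $m)) {
            $proxies[] = $m[1] . ':' . $m[2];
        }
    }
    return $proxies;
}

print_r(extractProxies("abcdef255.255.255.255:80\nyoho.uwaterloo.ca:8000   transparent"));
```

With this rule the first line yields 255.255.255.255:80 and the second yields yoho.uwaterloo.ca:8000. The trade-off is the one raised above: a real hostname that happens to end in something resembling an IP address would be cut down to the IP, so the rule only fits if such hostnames are not expected in the input.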
ASKER CERTIFIED SOLUTION
keteracel

phpdotnet

ASKER

Thank you very much for your help. That's all good.