abangbatax

Stripping out a proxy list from web text using PHP

Hi,

I would like to know how to write PHP code that extracts only the proxy entries from the contents of a textarea.

For example, the textarea contains:

80.249.72.180:80          elite proxy          Algeria (Algiers)
80.249.76.82:80          anonymous          Algeria
190.49.168.251:6588          elite proxy          Argentina (Buenos Aires)
83.160.170.10      8080      transparent      Netherlands      2006-10-25      Whois
203.106.52.102      3128      transparent      Malaysia      2006-10-25      Whois
213.52.140.53      80      anonymous      Great Britain (UK)      2006-10-25      Whois
128.134.137.23:8080 3 Kb/s
138.89.253.5:33322 26 Kb/s
192.38.109.143:3128 23 Kb/s
165.228.131.10      3128      transparent      Australia      2006-10-25      Whois
212.174.34.186      8080      anonymous      Turkey      2006-10-25      Whois
1       202.147.181.2       8080       transparent       Pakistan       2006-11-05       WHOIS
2       62.101.80.187       8080       high anonymity       Italy       2006-11-05       WHOIS
3       84.20.143.8       8080       transparent       Finland       2006-11-05       WHOIS
4       203.115.1.135       80       transparent       Sri Lanka       2006-11-05       WHOIS

Here is the question:

I want to grab (filter) only

proxy:port

and nothing else.

Thanks
ASKER CERTIFIED SOLUTION
TeRReF
abangbatax

ASKER

Hi,

Your answer is almost correct. It works for 4-digit ports (xxxx), but it does not work for
140.134.194.148      49400       TAIWAN (TW)      11/06/2006      Whois Info
It truncates the port to 4940.

Can you fix it?
Try changing this line:
preg_match_all('/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})[:\s]*(\d{1,4})/', $s, $matches);
into
preg_match_all('/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})[:\s]*(\d{1,5})/', $s, $matches);
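To illustrate why that one-character change matters, here is a minimal sketch that runs both patterns against the line from the follow-up. The variable names $four and $five are only for illustration and do not come from the original solution.

```php
<?php
// Sample line taken from the follow-up above: the port has five digits.
$s = '140.134.194.148      49400       TAIWAN (TW)      11/06/2006      Whois Info';

// \d{1,4} is capped at four digits, so the match stops after "4940".
preg_match('/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})[:\s]*(\d{1,4})/', $s, $four);

// \d{1,5} allows a fifth digit, enough for any port up to 65535.
preg_match('/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})[:\s]*(\d{1,5})/', $s, $five);

echo $four[1] . ':' . $four[2] . "\n"; // 140.134.194.148:4940
echo $five[1] . ':' . $five[2] . "\n"; // 140.134.194.148:49400
```

Note that \d{1,5} also accepts values above 65535 (for example 99999), so a range check on the captured port may still be worthwhile.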
OK, it's working.
How about filtering lines like these?
yoho.uwaterloo.ca:8000   transparent      Pakistan      2006-11-05      WHOIS
kleinbonum.ethz.ch:8000   elite proxy          Algeria (Algiers)
Thanks, this will be the last question.
That's a new question since it requires quite an adjustment to the original regular expression. Why didn't you include these in your original question?
OK, never mind.
Thanks a lot anyway.
This should work:

<?php

$s = '80.249.72.180:80          elite proxy          Algeria (Algiers)
80.249.76.82:80          anonymous          Algeria
190.49.168.251:6588          elite proxy          Argentina (Buenos Aires)
83.160.170.10      8080      transparent      Netherlands      2006-10-25      Whois
203.106.52.102      3128      transparent      Malaysia      2006-10-25      Whois
213.52.140.53      80      anonymous      Great Britain (UK)      2006-10-25      Whois
128.134.137.23:8080 3 Kb/s
138.89.253.5:33322 26 Kb/s
192.38.109.143:3128 23 Kb/s
165.228.131.10      3128      transparent      Australia      2006-10-25      Whois
212.174.34.186      8080      anonymous      Turkey      2006-10-25      Whois
1      202.147.181.2      8080      transparent      Pakistan      2006-11-05      WHOIS
2      62.101.80.187      8080      high anonymity      Italy      2006-11-05      WHOIS
3      84.20.143.8      8080      transparent      Finland      2006-11-05      WHOIS
4      203.115.1.135      80      transparent      Sri Lanka      2006-11-05      WHOIS
yoho.uwaterloo.ca:8000   transparent      Pakistan      2006-11-05      WHOIS
kleinbonum.ethz.ch:8000   elite proxy          Algeria (Algiers)';

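// Match a dotted host (IP address or hostname), then a colon or whitespace,
// then a port of 1 to 5 digits. Group 1 is the host, group 2 is the port.
preg_match_all('/([\w\d]+\.[\w\d\.]+)[:\s]+(\d{1,5})/i', $s, $matches);

$lines = array(); // initialise so print_r() works even when nothing matches
$count = count($matches[1]);
for ($i = 0; $i < $count; $i++) {
        // Rebuild each match as host:port
        $lines[] = $matches[1][$i].':'.$matches[2][$i];
}

print_r($lines);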

?>
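If the list comes from several sources, a slightly stricter variant may be useful. The sketch below is not part of the original answer: the function name extract_proxies is hypothetical, and it assumes you also want hyphens allowed in hostnames, ports limited to 1-65535, and duplicates dropped.

```php
<?php
// Hypothetical helper (not from the thread): return unique host:port strings.
function extract_proxies($text)
{
    $proxies = array();
    // Host: labels of letters, digits or hyphens joined by dots (IPs and hostnames).
    // Port: 1 to 5 digits that are not followed by another digit.
    $pattern = '/\b([a-z0-9-]+(?:\.[a-z0-9-]+)+)[:\s]+(\d{1,5})(?!\d)/i';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $port = (int) $m[2];
        if ($port < 1 || $port > 65535) {
            continue; // outside the valid TCP port range
        }
        $proxies[$m[1] . ':' . $port] = true; // array keys remove duplicates
    }
    return array_keys($proxies);
}

$sample = "80.249.72.180:80   elite proxy\n"
        . "140.134.194.148      49400   TAIWAN (TW)\n"
        . "yoho.uwaterloo.ca:8000   transparent\n"
        . "80.249.72.180:80   duplicate";
print_r(extract_proxies($sample));
```

This still treats any dotted token followed by a number as a proxy (for example "version 1.2 3"), so it is only suitable for text that is known to be a proxy list.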
Thanks for all your help!
You're welcome :)