Fernanditos

Modifying multiple lines in text file using PHP

Hi,

I have a .txt file which has thousands of domain names listed in the same format you see in the attached lines.

I need to keep only the .com and .net domains, that is, delete every line that does not contain ".com" or ".net".

I also need to strip everything from the first "," onward, so that the final list looks like this (one domain name per line):

apartemen.com
luvme.com
sandwichbar.net
studio13.com
sunfar.net

Could an expert here please help me find a solution? I think PHP can do it.

Thank you.
apartemen.com,10/18/2010 12:00:00 AM,AUC
beehiveseller.asia,10/18/2010 12:00:00 AM,AUC
berlin.asia,10/18/2010 12:00:00 AM,AUC
besplatno.org,10/18/2010 12:00:00 AM,AUC
dekio.asia,10/18/2010 12:00:00 AM,AUC
edoctor.asia,10/18/2010 12:00:00 AM,AUC
enterbada.asia,10/18/2010 12:00:00 AM,AUC
global-gong.asia,10/18/2010 12:00:00 AM,AUC
globalgong.asia,10/18/2010 12:00:00 AM,AUC
gratuit.asia,10/18/2010 12:00:00 AM,AUC
karafarini.asia,10/18/2010 12:00:00 AM,AUC
lists.asia,10/18/2010 12:00:00 AM,AUC
luvme.com,10/18/2010 12:00:00 AM,AUC
numama.asia,10/18/2010 12:00:00 AM,AUC
sandwichbar.net,10/18/2010 12:00:00 AM,AUC
sandwichbars.asia,10/18/2010 12:00:00 AM,AUC
studio13.com,10/18/2010 12:00:00 AM,AUC
sunfar.net,10/18/2010 12:00:00 AM,AUC


Justin Mathews

If your input file is domains.txt, this Perl command will do it (it edits the file in place and keeps a backup as domains.txt.bak):

perl -i.bak -ne 's/,.+//;print if /\.(net|com)$/' domains.txt
Suggest you get the free program Notepad++ and do this in the text editor.  

ASKER

Thank you jmatix, very interesting. I do not actually have Perl on my server, but I will keep it in mind in case I do not find a solution with PHP. Thank you!
Ray, I have EmEditor, which is great, but I still do not know how to do it with the editor. Do you know?

<?php // RAY_temp_fernanditos.php
error_reporting(E_ALL);
echo "<pre>";

// TEST DATA FROM THE POST AT EE
$str = <<<EOSTR
apartemen.com,10/18/2010 12:00:00 AM,AUC
beehiveseller.asia,10/18/2010 12:00:00 AM,AUC
berlin.asia,10/18/2010 12:00:00 AM,AUC
besplatno.org,10/18/2010 12:00:00 AM,AUC
dekio.asia,10/18/2010 12:00:00 AM,AUC
edoctor.asia,10/18/2010 12:00:00 AM,AUC
enterbada.asia,10/18/2010 12:00:00 AM,AUC
global-gong.asia,10/18/2010 12:00:00 AM,AUC
globalgong.asia,10/18/2010 12:00:00 AM,AUC
gratuit.asia,10/18/2010 12:00:00 AM,AUC
karafarini.asia,10/18/2010 12:00:00 AM,AUC
lists.asia,10/18/2010 12:00:00 AM,AUC
luvme.com,10/18/2010 12:00:00 AM,AUC
numama.asia,10/18/2010 12:00:00 AM,AUC
sandwichbar.net,10/18/2010 12:00:00 AM,AUC
sandwichbars.asia,10/18/2010 12:00:00 AM,AUC
studio13.com,10/18/2010 12:00:00 AM,AUC
sunfar.net,10/18/2010 12:00:00 AM,AUC
EOSTR;

// THE NEEDLES TO SEARCH FOR
$needles = array
( '.com,'
, '.net,'
)
;

// MAKE AN ARRAY FROM THE TEST DATA STRING
// SPLIT ON ANY LINE ENDING (\n, \r\n, \r) SO IT DOES NOT DEPEND ON PHP_EOL MATCHING THE DATA
$arr = preg_split('/\R/', $str);

// ITERATE OVER EACH LINE
foreach ($arr as $key => $val)
{
    // MAN PAGE http://us.php.net/manual/en/function.strpos.php
    if ( (strpos($val, $needles[0]) === FALSE) && (strpos($val, $needles[1]) === FALSE) ) unset($arr[$key]);
}
$new = implode(PHP_EOL, $arr);
echo $new;


Thank you Ray!

The script output:

apartemen.com,10/18/2010 12:00:00 AM,AUC
luvme.com,10/18/2010 12:00:00 AM,AUC
sandwichbar.net,10/18/2010 12:00:00 AM,AUC
studio13.com,10/18/2010 12:00:00 AM,AUC
sunfar.net,10/18/2010 12:00:00 AM,AUC

Is there any way to get only the domain names?

apartemen.com
luvme.com
sandwichbar.net
studio13.com
sunfar.net

Is there any way to read the list from an external .txt file?

Thank you!
Sure!  Please post a link to the external text file.
ASKER CERTIFIED SOLUTION
Ray Paseur
The domain list is a 15 MB text file. I am not sure why you need the real text file, but here it is, compressed: http://musichat.net/domains.rar

Thank you for your great help.
You can test with this smaller file: http://musichat.net/domains.txt
Sorry, I did not see your last post. I tried your solution and it works like a charm! Thank you!
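
The accepted solution's code does not appear in this thread, so for reference here is a minimal sketch of one way to do what was asked: read the list from an external file, keep only the text before the first comma, and keep only .com and .net names. It is not Ray's original answer. The function names extract_domain and filter_domains and the output file name are made up for illustration, the case-insensitive match is a choice of this sketch, and it assumes PHP 7.1 or later and lines in the "domain,date,type" format shown above.

```php
<?php
// Hypothetical helper: returns the domain from one input line,
// or null if the line is not a .com or .net entry.
function extract_domain(string $line): ?string
{
    // Keep only the text before the first comma, without surrounding whitespace.
    $domain = trim(explode(',', $line, 2)[0]);

    // Accept the name only if it ends in .com or .net (case-insensitive).
    return preg_match('/\.(com|net)$/i', $domain) === 1 ? $domain : null;
}

// Hypothetical helper: streams $inPath line by line and writes the kept
// domains to $outPath, one per line. Returns the number of domains written.
function filter_domains(string $inPath, string $outPath): int
{
    $in  = fopen($inPath, 'r');
    $out = fopen($outPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open input or output file.');
    }

    $kept = 0;
    // fgets() reads one line at a time, so the whole file never sits in memory.
    while (($line = fgets($in)) !== false) {
        $domain = extract_domain($line);
        if ($domain !== null) {
            fwrite($out, $domain . PHP_EOL);
            $kept++;
        }
    }

    fclose($in);
    fclose($out);
    return $kept;
}

// Example call (file names are placeholders):
// filter_domains('domains.txt', 'domains_filtered.txt');
```

Reading with fgets() rather than file() or file_get_contents() is a deliberate choice here, since the real list is about 15 MB and streaming keeps memory use flat regardless of file size.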