Modifying multiple lines in text file using PHP

Fernanditos
Hi,

I have a .txt file which has thousands of domain names listed in the same format you see in the attached lines.

I need to be able to pick out only the .net and .com domains, that is, to delete every line that does not contain ".com" or ".net".

I also need to get rid of all characters after the first "," on each line, so that the final list looks like this (one domain name per line):

apartemen.com
luvme.com
sandwichbar.net
studio13.com
sunfar.net

Can an expert here please help me find a solution? I think PHP can make it possible.

Thank you.
apartemen.com,10/18/2010 12:00:00 AM,AUC
beehiveseller.asia,10/18/2010 12:00:00 AM,AUC
berlin.asia,10/18/2010 12:00:00 AM,AUC
besplatno.org,10/18/2010 12:00:00 AM,AUC
dekio.asia,10/18/2010 12:00:00 AM,AUC
edoctor.asia,10/18/2010 12:00:00 AM,AUC
enterbada.asia,10/18/2010 12:00:00 AM,AUC
global-gong.asia,10/18/2010 12:00:00 AM,AUC
globalgong.asia,10/18/2010 12:00:00 AM,AUC
gratuit.asia,10/18/2010 12:00:00 AM,AUC
karafarini.asia,10/18/2010 12:00:00 AM,AUC
lists.asia,10/18/2010 12:00:00 AM,AUC
luvme.com,10/18/2010 12:00:00 AM,AUC
numama.asia,10/18/2010 12:00:00 AM,AUC
sandwichbar.net,10/18/2010 12:00:00 AM,AUC
sandwichbars.asia,10/18/2010 12:00:00 AM,AUC
studio13.com,10/18/2010 12:00:00 AM,AUC
sunfar.net,10/18/2010 12:00:00 AM,AUC

If your input file is domains.txt, this Perl command will do it (it edits the file in place and keeps a backup as domains.txt.bak):

perl -i.bak -ne 's/,.+//;print if /\.(net|com)$/' domains.txt
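
For readers who prefer PHP, roughly the same filter can be sketched as below. This is only an illustration, not part of the original answer: the function name filter_domains is hypothetical, and the sketch works on an array of lines rather than editing a file in place. Like the one-liner, it cuts each line at the first comma and keeps it only if what remains ends in .net or .com.

```php
<?php
// Hypothetical helper (not from the thread): keep only .com/.net entries
// and drop everything from the first comma onward.
function filter_domains(array $lines)
{
    $kept = array();
    foreach ($lines as $line) {
        // Take the text before the first comma and strip surrounding whitespace
        $parts  = explode(',', $line, 2);
        $domain = trim($parts[0]);

        // Keep the entry only if it ends in .net or .com (case-sensitive)
        if (preg_match('/\.(net|com)$/', $domain)) {
            $kept[] = $domain;
        }
    }
    return $kept;
}

// Sample lines taken from the question
$lines = array(
    'apartemen.com,10/18/2010 12:00:00 AM,AUC',
    'berlin.asia,10/18/2010 12:00:00 AM,AUC',
    'sunfar.net,10/18/2010 12:00:00 AM,AUC',
);
echo implode(PHP_EOL, filter_domains($lines)), PHP_EOL;
```

With the three sample lines this prints apartemen.com and sunfar.net, one per line.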
Most Valuable Expert 2011
Top Expert 2016

Commented:
Suggest you get the free program Notepad++ and do this in the text editor.  

Author

Commented:
Thank you jmatix, very interesting. I do not actually have Perl on my server, but I will keep this in case I do not find a solution with PHP. Thank you!

Author

Commented:
Ray, I have EmEditor, which is great, but I still do not know how to do it with the editor. Do you know?

Commented:

<?php // RAY_temp_fernanditos.php
error_reporting(E_ALL);
echo "<pre>";

// TEST DATA FROM THE POST AT EE
$str = <<<EOSTR
apartemen.com,10/18/2010 12:00:00 AM,AUC
beehiveseller.asia,10/18/2010 12:00:00 AM,AUC
berlin.asia,10/18/2010 12:00:00 AM,AUC
besplatno.org,10/18/2010 12:00:00 AM,AUC
dekio.asia,10/18/2010 12:00:00 AM,AUC
edoctor.asia,10/18/2010 12:00:00 AM,AUC
enterbada.asia,10/18/2010 12:00:00 AM,AUC
global-gong.asia,10/18/2010 12:00:00 AM,AUC
globalgong.asia,10/18/2010 12:00:00 AM,AUC
gratuit.asia,10/18/2010 12:00:00 AM,AUC
karafarini.asia,10/18/2010 12:00:00 AM,AUC
lists.asia,10/18/2010 12:00:00 AM,AUC
luvme.com,10/18/2010 12:00:00 AM,AUC
numama.asia,10/18/2010 12:00:00 AM,AUC
sandwichbar.net,10/18/2010 12:00:00 AM,AUC
sandwichbars.asia,10/18/2010 12:00:00 AM,AUC
studio13.com,10/18/2010 12:00:00 AM,AUC
sunfar.net,10/18/2010 12:00:00 AM,AUC
EOSTR;

// THE NEEDLES TO SEARCH FOR
$needles = array
( '.com,'
, '.net,'
)
;

// MAKE AN ARRAY FROM THE TEST DATA STRING
$arr = explode(PHP_EOL, $str);

// ITERATE OVER EACH LINE
foreach ($arr as $key => $val)
{
    // MAN PAGE http://us.php.net/manual/en/function.strpos.php
    if ( (strpos($val, $needles[0]) === FALSE) && (strpos($val, $needles[1]) === FALSE) ) unset($arr[$key]);
}
$new = implode(PHP_EOL, $arr);
echo $new;


Author

Commented:
Thank you Ray!

The script output:

apartemen.com,10/18/2010 12:00:00 AM,AUC
luvme.com,10/18/2010 12:00:00 AM,AUC
sandwichbar.net,10/18/2010 12:00:00 AM,AUC
studio13.com,10/18/2010 12:00:00 AM,AUC
sunfar.net,10/18/2010 12:00:00 AM,AUC

Is there any way to get only the domain names?

apartemen.com
luvme.com
sandwichbar.net
studio13.com
sunfar.net

And is there any way to read the data from an external .txt file?

thank you!

Commented:
Sure!  Please post a link to the external text file.
Commented:
Instead of using the test data string, you would read the external text file into the $str variable.

$str = file_get_contents('path/to/textfile.txt');

<?php // RAY_temp_fernanditos.php
error_reporting(E_ALL);
echo "<pre>";

// TEST DATA FROM THE POST AT EE
$str = <<<EOSTR
apartemen.com,10/18/2010 12:00:00 AM,AUC
beehiveseller.asia,10/18/2010 12:00:00 AM,AUC
berlin.asia,10/18/2010 12:00:00 AM,AUC
besplatno.org,10/18/2010 12:00:00 AM,AUC
dekio.asia,10/18/2010 12:00:00 AM,AUC
edoctor.asia,10/18/2010 12:00:00 AM,AUC
enterbada.asia,10/18/2010 12:00:00 AM,AUC
global-gong.asia,10/18/2010 12:00:00 AM,AUC
globalgong.asia,10/18/2010 12:00:00 AM,AUC
gratuit.asia,10/18/2010 12:00:00 AM,AUC
karafarini.asia,10/18/2010 12:00:00 AM,AUC
lists.asia,10/18/2010 12:00:00 AM,AUC
luvme.com,10/18/2010 12:00:00 AM,AUC
numama.asia,10/18/2010 12:00:00 AM,AUC
sandwichbar.net,10/18/2010 12:00:00 AM,AUC
sandwichbars.asia,10/18/2010 12:00:00 AM,AUC
studio13.com,10/18/2010 12:00:00 AM,AUC
sunfar.net,10/18/2010 12:00:00 AM,AUC
EOSTR;

// THE NEEDLES TO SEARCH FOR
$needles = array
( '.com,'
, '.net,'
)
;

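// MAKE AN ARRAY FROM THE INPUT STRING; \R MATCHES ANY LINE ENDING (LF, CRLF OR CR),
// SO THIS ALSO WORKS WHEN THE FILE'S LINE ENDINGS DIFFER FROM PHP_EOL
$arr = preg_split('/\R/', $str);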

// ITERATE OVER EACH LINE
foreach ($arr as $key => $val)
{
    // MAN PAGE http://us.php.net/manual/en/function.strpos.php
    if ( (strpos($val, $needles[0]) === FALSE) && (strpos($val, $needles[1]) === FALSE) )
    {
        unset($arr[$key]);
    }
    else
    {
        // FIND THE COMMA AT THE END OF THE TLD
        $poz = strpos($val, ',');
        $arr[$key] = substr($val, 0, $poz);
    }
}
$new = implode(PHP_EOL, $arr);
echo $new;
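
If the input file is large, loading it all into one string and one array may use more memory than necessary. A possible variant, offered only as a sketch, reads the file one line at a time and writes the matching domain names straight to a second file. The function name filter_domain_file and the file names domains.txt and domains_filtered.txt are assumptions made for illustration.

```php
<?php
// Hypothetical streaming variant: read $inPath line by line and write only
// the .com/.net domain names to $outPath. Returns the number of domains
// written, or false if either file cannot be opened.
function filter_domain_file($inPath, $outPath)
{
    $in = fopen($inPath, 'r');
    if ($in === false) {
        return false;
    }
    $out = fopen($outPath, 'w');
    if ($out === false) {
        fclose($in);
        return false;
    }

    $count = 0;
    while (($line = fgets($in)) !== false) {
        // Everything before the first comma is the domain name
        $parts  = explode(',', $line, 2);
        $domain = trim($parts[0]);

        // Compare the last four characters against the wanted TLDs
        $tld = substr($domain, -4);
        if ($tld === '.com' || $tld === '.net') {
            fwrite($out, $domain . PHP_EOL);
            $count++;
        }
    }

    fclose($in);
    fclose($out);
    return $count;
}

// Example call; the file names are placeholders
if (file_exists('domains.txt')) {
    $written = filter_domain_file('domains.txt', 'domains_filtered.txt');
    echo "Wrote $written domains" . PHP_EOL;
}
```

Because only one line is held in memory at a time, memory use stays roughly constant regardless of the size of the input.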


Author

Commented:
The domain list is a 15 MB text file. I am not sure why you need the real file, but here it is, compressed: http://musichat.net/domains.rar

thank you for your great help.

Author

Commented:
You can test with this smaller file: http://musichat.net/domains.txt

Author

Commented:
Sorry, I did not see your last post. I tried your solution and it works like a charm! Thank you!
