Solved

Looking for a wildcard to use with Notepad++ to strip out some JavaScript from HTML

Posted on 2014-02-09
Last Modified: 2014-03-03
Hello

I am using Notepad++.

I am looking for a wildcard that will let me search for

<script language  (wild card)  </script>

What I am looking to do is to strip a bunch of javascript out of some html.  

Any suggestions?

Thanks!

Rowby
Question by:Rowby Goren
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39846489
Is a PHP solution of any use?  The regular expression engines in different software packages often have some inconsistencies, especially where metacharacters come into play.

Also, the language attribute may often be omitted, since as a practical matter, there is no script language other than JavaScript.
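
Since the attributes on the opening tag can vary or be missing, a pattern that ignores them may be easier to live with than one that spells out language="JavaScript". The following is only a sketch, assuming PHP 7 or later; the function name strip_script_tags is made up for illustration and is not from this thread.

```php
<?php
// Hypothetical helper (not from the thread): remove every <script>...</script>
// element, whatever attributes the opening tag carries.
function strip_script_tags(string $html): string
{
    // <script\b     opening tag name; \b stops "<scripture>" from matching
    // [^>]*>        any attributes, up to the end of the opening tag
    // .*?           the script body, as short as possible (lazy)
    // </script\s*>  closing tag, optional whitespace before ">"
    // i = case-insensitive, s = "." also matches newlines
    $pattern = '#<script\b[^>]*>.*?</script\s*>#is';

    // preg_replace() returns null on a regex error; fall back to the input
    return preg_replace($pattern, '', $html) ?? $html;
}

$samples = [
    '<p>a</p><script language="JavaScript" type="text/javascript">x();</script><p>b</p>',
    "<p>a</p><SCRIPT>\nx();\n</SCRIPT><p>b</p>",
];
foreach ($samples as $sample) {
    echo strip_script_tags($sample), PHP_EOL;   // prints <p>a</p><p>b</p> both times
}
```

The expression between the # delimiters should also be usable in the Notepad++ Replace dialog with Search Mode set to "Regular expression" and ". matches newline" ticked, leaving "Replace with" empty. Regex engines differ, as noted above, so it is worth trying on a copy of the file first.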
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39846496
Hi Ray,

Yes, that would work too.  

I have several pages    that I need to remove the javascript from.   Here is a more specific "wildcard"

<script language="JavaScript" type="text/javascript"> (then a bunch of javascript -- in some cases it will be slightly different -- which is why I need a wildcard)  
ending with </script>


As I think about this, however, it's all in the mysql database.

I guess I could do an export of the database and put the file in a folder, right?  And the PHP script would work on that file.  Afterwards I'd import the new MySQL file back into the database -- of course protecting myself with a backup....

Thanks!

Rowby
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39846498
If a PHP solution...

<?php // RAY_temp_rowby.php
error_reporting(E_ALL);


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28360528.html


// TEST DATA FROM THE POST AT EE
$htm = <<<EOD
I am looking for a wildcard that will let me search for

<script language  (wild card)  </script>

What I am looking to do is to strip a bunch of javascript out of some html.
EOD;

// A REGULAR EXPRESSION TO ISOLATE JAVASCRIPT
$rgx
= '#'             // REGEX DELIMITER
. '\<script'      // START OF JAVASCRIPT
. '.*?'           // ANYTHING
. '\</script\>'   // END OF JAVASCRIPT
. '#'             // REGEX DELIMITER
. 'i'             // CASE INSENSITIVE
;

// WHAT HAPPENS
echo '<pre>';
echo PHP_EOL . htmlentities($htm);
$new = preg_replace($rgx, '', $htm);
echo PHP_EOL . htmlentities($new);

0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39846499
Yes, I think export - process - import could probably be workable.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39846535
I guess this is where the change to the SQL file would go?

echo PHP_EOL . htmlentities($htm);

I'll resume this in the a.m.???

Thanks!

Rowby
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39846544
Something like that.  You would put the data from the SQL into the $htm variable, then run the regular expression replacement on line 30, and rewrite the data into the SQL file from the $new variable.
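
A rough sketch of that read, replace, rewrite sequence follows. It assumes PHP 7 or later; the file names export.sql and export_clean.sql and the function name strip_scripts_from_file are made up for illustration. It writes to a separate file so the original export stays untouched as a backup.

```php
<?php
// Sketch only: 'export.sql' and 'export_clean.sql' are hypothetical file names.
function strip_scripts_from_file(string $inPath, string $outPath): int
{
    $htm = file_get_contents($inPath);            // data from the SQL export
    if ($htm === false) {
        throw new RuntimeException("Cannot read $inPath");
    }

    $rgx = '#<script.*?</script>#is';             // same idea as the regex above
    $new = preg_replace($rgx, '', $htm, -1, $num);
    if ($new === null) {
        throw new RuntimeException('Regex failed, code ' . preg_last_error());
    }

    // Write to a NEW file so the original export is still there as a backup
    if (file_put_contents($outPath, $new) === false) {
        throw new RuntimeException("Cannot write $outPath");
    }
    return $num;                                  // number of script blocks removed
}

// Only runs if the hypothetical export file is actually present
if (is_readable('export.sql')) {
    $count = strip_scripts_from_file('export.sql', 'export_clean.sql');
    echo "$count REPLACEMENTS", PHP_EOL;
}
```

An SQL dump may escape some characters inside the stored HTML, so it would be sensible to inspect the cleaned file before importing it.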
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39847251
Here is my thought after sleeping on it.

There are about 15 pages that have this issue.  I can copy the html out of Tinymce editor and create temporary static html pages and put those pages in a folder.  And then run the php only on those pages.  Then copy the results back into those pages.

I need to be cautious because it is on a live site.  I don't have a development site to "test" it on.

So if I put those 15 or so pages into a folder, can the php run on all of the html pages in that folder?  Or do they need to run on each page one at a time.

Thanks

Rowby
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39847385
I think what I might do is duplicate one of the pages and call it "testpage" then run the process on that page.  If it looks right, maybe duplicate another page and try it again.  Once you're fairly sure the answer is likely to work for all of them (and you have a backup), I would run the process one-at-a-time.  With only 15 things to fix, it's not an oppressive load.  If it were 1500 or even 150 I would aim for greater automation, but with only 15 I would take the easy way out ;-)
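
For the case where there are many more pages, the "greater automation" could look something like the sketch below. It assumes PHP 7 or later, that the pages sit in one folder as *.html files, and an "_output" suffix for the cleaned copies; the function name strip_scripts_in_folder is hypothetical. The originals are never overwritten.

```php
<?php
// Sketch only: the "_output" suffix and the folder layout are assumptions.
function strip_scripts_in_folder(string $dir): array
{
    $report = [];
    $paths  = glob(rtrim($dir, '/\\') . '/*.html') ?: [];
    foreach ($paths as $path) {
        if (substr($path, -12) === '_output.html') {
            continue;                               // skip files written by an earlier run
        }
        $htm = file_get_contents($path);
        $new = preg_replace('#<script.*?</script>#is', '', $htm, -1, $num);
        if ($new === null) {
            continue;                               // regex error: leave this file alone
        }
        $out = substr($path, 0, -5) . '_output.html';   // page.html -> page_output.html
        file_put_contents($out, $new);
        $report[basename($path)] = $num;            // replacements made per input file
    }
    return $report;
}

// Process the folder that contains this script
foreach (strip_scripts_in_folder(__DIR__) as $file => $num) {
    echo "$file: $num REPLACEMENTS", PHP_EOL;
}
```

With only 15 pages, running one file at a time and checking each result is still the more cautious route.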
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39856372
Hi Ray

I created a folder on my testing server called

email

And I put RAY_temp_rowby.php inside it.

I also put my first test HTML into it.  It's called musicology.html

Do I need to change its name?  

I assume I go to the browser, to the folder and run RAY_temp_rowby.php

BTW I tried that and didn't get any echo.  

Permissions for ray_temp is 644

Rowby

<?php // RAY_temp_rowby.php
error_reporting(E_ALL);


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28360528.html


// TEST DATA FROM THE POST AT EE
$htm = <<<EOD
I am looking for a wildcard that will let me search for

<script language  (wild card)  </script>

What I am looking to do is to strip a bunch of javascript out of some html.
EOD;

// A REGULAR EXPRESSION TO ISOLATE JAVASCRIPT
$rgx
= '#'             // REGEX DELIMITER
. '\<script'      // START OF JAVASCRIPT
. '.*?'           // ANYTHING
. '\</script\>'   // END OF JAVASCRIPT
. '#'             // REGEX DELIMITER
. 'i'             // CASE INSENSITIVE
;

// WHAT HAPPENS
echo '<pre>';
echo PHP_EOL . htmlentities($htm);
$new = preg_replace($rgx, '', $htm);
echo PHP_EOL . htmlentities($new);

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39856391
I renamed the musicology file to htm.

Here is the error log:
[13-Feb-2014 09:47:09 America/Chicago] PHP Warning:  PHP Startup: magickwand: Unable to initialize module
Module compiled with module API=20060613
PHP    compiled with module API=20090626
These options need to match
 in Unknown on line 0
[13-Feb-2014 09:47:09 America/Chicago] PHP Warning:  PHP Startup: imagick: Unable to initialize module
Module compiled with module API=20060613
PHP    compiled with module API=20090626
These options need to match
 in Unknown on line 0
[13-Feb-2014 09:47:09 America/Chicago] PHP Warning:  PHP Startup: PDO: Unable to initialize module
Module compiled with module API=20060613
PHP    compiled with module API=20090626
These options need to match
 in Unknown on line 0
[13-Feb-2014 09:47:09 America/Chicago] PHP Warning:  PHP Startup: pdo_sqlite: Unable to initialize module
Module compiled with module API=20060613
PHP    compiled with module API=20090626
These options need to match
 in Unknown on line 0
[13-Feb-2014 09:47:09 America/Chicago] PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/local/lib/php/extensions/no-debug-non-zts-20060613/sqlite.so' - /usr/local/lib/php/extensions/no-debug-non-zts-20060613/sqlite.so: undefined symbol: third_arg_force_ref in Unknown on line 0
[13-Feb-2014 09:47:09 America/Chicago] PHP Warning:  PHP Startup: pdo_mysql: Unable to initialize module
Module compiled with module API=20060613
PHP    compiled with module API=20090626
These options need to match
 in Unknown on line 0
[13-Feb-2014 09:47:09 America/Chicago] PHP Warning:  PHP Startup: SourceGuardian: Unable to initialize module
Module compiled with module API=20060613
PHP    compiled with module API=20090626
These options need to match
 in Unknown on line 0
[The same seven warnings repeat verbatim at 09:53:31, 09:53:35 and 09:53:38.]

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39856435
I installed XAMPP on my laptop yesterday.  Would it be better if I ran the program there?  Then I can set up the "server" however works best for your script.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39856538
Tried running it on localhost xampp

http://localhost/ucla/emailfix/RAY_temp_rowby.php

Nothing seems to be "happening" when I run the script.  But at least I have it in a controlled environment.

That's a start!

Rowby
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39861239
Yeah, it's hard for me to understand what may be happening over there, especially with PHP startup errors.  Here is a good test script.  If this does not produce output, PHP is not installed or at least not installed correctly.

<?php phpinfo();

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39861419
Hi

Here's the result of phpinfo.  Again I am running it on xampp on my local computer.

Rowby
pphinfo.pdf
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39868881
Hi Ray,

I hope you are not snowbound!

Anyway, when you have a chance please look at my phpinfo results.  They are for the XAMPP that I set up on my local machine.

Thanks!

Rowby
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39870541
Hi, Rowby.  I don't see anything wrong in phpinfo() with the possible exception that the local timezone is set to Berlin.  But that should not matter.  I think I would opt for a complete re-install.  It looks like something is out of whack in the messages like this:

Module compiled with module API=20060613
PHP    compiled with module API=20090626
These options need to match
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39870559
ok   I will do a new install tomorrow :)

Thanks!
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39871116
Hope that helps.  Meanwhile back here in the mid-Atlantic we are melting fast.  It's almost 60 outside today!  Quite a change from 2 weeks ago when we had a high of 14.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39871136
Wow!

I then won't bully you with the weather we've been having in West Los Angeles.

Rowby
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39877710
Hi Ray,

I uninstalled xampp, rebooted windows and downloaded a new copy of xampp and installed it.

I created a subfolder and put the musicology.html file and the attached Ray-temp-rowby.php file.

Also it's not generating an error log.

Do i need to rename the musicology.html file. (I doubt it but thought I'd ask).

: )  :>  :}

Rowby
RAY-temp-rowby.php
musicology.html
browserview.jpg
phpinfo-xammp.pdf
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39878247
This seems to work OK for me.
http://iconoun.com/demo/temp_rowby.php

<?php // demo/temp_rowby.php
error_reporting(E_ALL);


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28360528.html


// TEST DATA FROM THE POST AT EE
$htm = file_get_contents('http://filedb.experts-exchange.com/incoming/2014/02_w08/836108/musicology.html');

// A REGULAR EXPRESSION TO ISOLATE JAVASCRIPT
$rgx
= '#'             // REGEX DELIMITER
. '\<script'      // START OF JAVASCRIPT
. '.*?'           // ANYTHING
. '\</script\>'   // END OF JAVASCRIPT
. '#'             // REGEX DELIMITER
. 'is'            // CASE INSENSITIVE, DOT MATCHES NEWLINE
;

// WHAT HAPPENS
echo '<pre>';
// echo PHP_EOL . htmlentities($htm);
$new = preg_replace($rgx, '', $htm, -1, $num);
echo PHP_EOL . "$num REPLACEMENTS";
echo PHP_EOL . htmlentities($new);
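
The "s" modifier in the regex above is what lets the pattern reach across line breaks: by default "." does not match a newline, so a script block that spans several lines is never matched. A small sketch with made-up test data shows the difference.

```php
<?php
// Made-up test data with a script block that spans several lines.
$htm = "<p>before</p>\n<script type=\"text/javascript\">\nalert('hi');\n</script>\n<p>after</p>";

// Without "s", "." stops at each newline, so the block is never matched.
$without = preg_replace('#<script.*?</script>#i', '', $htm, -1, $n1);

// With "s" (PCRE_DOTALL), "." also matches newlines and the block is removed.
$with = preg_replace('#<script.*?</script>#is', '', $htm, -1, $n2);

echo "WITHOUT s: $n1 REPLACEMENTS", PHP_EOL;   // 0
echo "WITH s:    $n2 REPLACEMENTS", PHP_EOL;   // 1
```

Note that "s" is the dot-matches-newline modifier; the separate "m" modifier only changes how ^ and $ behave and would not help here.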

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39878263
I'll try again on my local setup, and I'll also try a different online domain I have lying around.

Just so I am clear -- once it is run what is supposed to happen?  

A new file, or an overwrite of the existing file?

Rowby
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39878321
Whatever you want to make happen!  All the script does is show how to remove the JavaScript.  You can use the $new variable in any way that makes sense for your app.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39892986
Hi Ray

I got it working fine -- for one file.  Works great.

Got rid of all the javascript.  Perfect!

Let's call that first html file HTML file #1

But when I put in a new file (HTML file #2), completely remove the old HTML file #1, and run the script again, the result of RAY_temp_rowby.php is still the fixed HTML file #1.

Are the results of the script saved elsewhere (not on the server)? I am using XAMPP.  Perhaps in a buffer in my XAMPP server?

Is there something I need to do to "flush out" the previous output of HTML file #1?

I rebooted XAMPP and even rebooted my computer -- but when I run the script again the output is the fixed original file, HTML file #1.

I have a completely new HTML file with a different name and completely new HTML source code (with the JavaScript to be removed), but somehow the output still shows the fixed HTML file #1 results.  HTML file #1 is no longer even in the folder.

????
XAMPP control panel info, if it helps:

7:12:09 AM  [main]       Initializing Control Panel
7:12:09 AM  [main]       Windows Version: Windows 8  64-bit
7:12:09 AM  [main]       XAMPP Version: 1.8.3
7:12:09 AM  [main]       Control Panel Version: 3.2.1  [ Compiled: May 7th 2013 ]
7:12:09 AM  [main]       You are not running with administrator rights! This will work for
7:12:09 AM  [main]       most application stuff but whenever you do something with services
7:12:09 AM  [main]       there will be a security dialogue or things will break! So think
7:12:09 AM  [main]       about running this application with administrator rights!
7:12:09 AM  [main]       XAMPP Installation Directory: "c:\xampp\"
7:12:09 AM  [main]       Checking for prerequisites
7:12:19 AM  [main]       All prerequisites found
7:12:19 AM  [main]       Initializing Modules
7:12:19 AM  [main]       Starting Check-Timer
7:12:19 AM  [main]       Control Panel Ready
7:12:21 AM  [Apache]       Attempting to start Apache app...
7:12:21 AM  [mysql]       Attempting to start MySQL app...
7:12:23 AM  [Apache]       Status change detected: running
7:12:23 AM  [mysql]       Status change detected: running 


I'm assuming this has to do something with "flushing" or clearing "cache" / "buffer" in xampp.

Should I post such a question on an EE forum here to find out how to do that?

Or can it be handled in the script?
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39893112
Please show me the new script, thanks.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39893214
Hi

Here is the version that I'm running...

Rowby

<?php // demo/temp_rowby.php
error_reporting(E_ALL);


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28360528.html


// TEST DATA FROM THE POST AT EE
$htm = file_get_contents('http://filedb.experts-exchange.com/incoming/2014/02_w08/836108/musicology.html');

// A REGULAR EXPRESSION TO ISOLATE JAVASCRIPT
$rgx
= '#'             // REGEX DELIMITER
. '\<script'      // START OF JAVASCRIPT
. '.*?'           // ANYTHING
. '\</script\>'   // END OF JAVASCRIPT
. '#'             // REGEX DELIMITER
. 'is'            // CASE INSENSITIVE, DOT MATCHES NEWLINE
;

// WHAT HAPPENS
echo '<pre>';
// echo PHP_EOL . htmlentities($htm);
$new = preg_replace($rgx, '', $htm, -1, $num);
echo PHP_EOL . "$num REPLACEMENTS";
echo PHP_EOL . htmlentities($new);

0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39893323
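Looks to me like these scripts are reading from the exact same URL.  The script never looks in your local folder; it fetches musicology.html from the EE file server each time, so swapping files on your machine does not change the output.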

OLD: $htm = file_get_contents('http://filedb.experts-exchange.com/incoming/2014/02_w08/836108/musicology.html');
NEW: $htm = file_get_contents('http://filedb.experts-exchange.com/incoming/2014/02_w08/836108/musicology.html');
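
One way to make a script more talkative about where its input comes from is to print the source before reading it. This is only a sketch, assuming PHP 7 or later; describe_source is a hypothetical helper and is not part of the script in this thread.

```php
<?php
// Sketch only: describe_source() is a hypothetical helper for debugging.
function describe_source(string $src): string
{
    // Anything with a scheme such as http:// is fetched over the network,
    // no matter which files sit next to the script.
    $isRemote = (bool) preg_match('#^[a-z][a-z0-9+.-]*://#i', $src);
    if ($isRemote) {
        return "REMOTE: $src";
    }

    // Otherwise treat it as a path on this machine and resolve it fully
    $full = realpath($src);
    return $full === false
        ? "LOCAL (NOT FOUND): $src"
        : 'LOCAL: ' . $full . ' (' . filesize($full) . ' BYTES)';
}

// The URL used by the scripts above, then a local copy next to this script
echo describe_source('http://filedb.experts-exchange.com/incoming/2014/02_w08/836108/musicology.html'), PHP_EOL;
echo describe_source(__DIR__ . '/musicology.html'), PHP_EOL;
```

If the first line printed says REMOTE, the contents of the local folder have no effect on the result.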

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39893547
It's weird...

I totally removed the musicology.htm file from the folder.

Here's a screen shot of the XAMPP folder: [image: screen capture of folder]
Here's the first few lines of the cleaned file  -- it's from the musicology.html file --  even though musicology.html is no longer in that folder.

197 REPLACEMENTS
<p>&nbsp;
	</p>
<p>
	<strong>*click on bold names for bios of students</strong></p>
<table align="left" border="0" style="width: 360px; height: 552px;">
	<tbody>



Here's the first few lines of the hasom-contact-spam.html file which is now in the folder.  But the results that show up start with the above lines from the "ghost" musicology.html file.

<p><span style="font-family: 'Lucida Grande'; font-size: 13px; line-height: 18px;">&nbsp;</span></p><h1 style="font: normal normal bold 23px/29px Helvetica, Arial, sans-serif; color: #292929; margin-top: 0px; margin-right: 0px; margin-bottom: 15px; margin-left: 0px; padding: 0px;"><strong>School of Music</strong></h1><p style="margin-right: 0px; margin-bottom: 1em; margin-left: 0px; margin-top: 0px;">2539 Street Address<br />
Name of School <br />
Los Angeles, California 


Etc.    So it is weird that lines such as "*click on bold names for bios of students</strong></" are showing up even though the file that includes that text is not in the folder :)

???

That's why I think the php file is writing to some sort of a buffer in xampp that needs to be flushed out?????
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39898813
It's a mystery to me where the corrected data is stored and then displayed.

As a test I completely uninstalled xampp and reinstalled it with the new totally different html file that needs cleaning.  The old file never was put into the folder.

Then I ran RAY_temp_rowby.php again and the resulting cleaned file was the same one.  The same "ghost" file that is not even in the folder.

Specifically the one that starts with ....

197 REPLACEMENTS
<p>&nbsp;
      </p>
<p>
      <strong>*click on bold names for bios of students</strong></p>

I thought by uninstalling xampp completely -- any "buffer" or "cache" in xampp would disappear.

So I guess my question to you, Ray, is where is this cleaned up data stored before it is rewritten to the file?  It's nowhere in xampp, apparently.  

Is there a way for your script to be more verbose in its output so I can debug this mystery?  Specifically, perhaps, saying where the data is being stored before it is written to the file.

The new file that is in the folder is called hasom-contact-spam.html

And nowhere in the file is the phrase ">*click on bold names for bios of students".  

?????

:) :) :)

Rowby

Just now, as another test, I removed all HTML files from the folder, leaving only RAY_temp_rowby.php.  Nothing else is in the folder.  However, when I run RAY_temp_rowby.php the contents of the original HTML file (cleaned) still show up.

I checked windows clipboard (even though I doubted it would be there) and it's not there.

???
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39900227
Please post the RAY_temp_rowby.php script as you have it now.  I'll be glad to add comments or whatever I can to make it clear how you can tailor it for your needs.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39900763
Hi

Here's the latest version. I think what might help is, instead of the results being "written" to the RAY-temp_rowby.php file, could they be written to a new file with the same name as the input file, plus the word "output"?

Example:   musicology.html  would output to musicology_output.html   The input files would always be in this format:  name.html      So output file name would always be something like name_output.html  Always html

I must say I am baffled where the current data is being "saved" to because when I look at the Ray-temp_rowby.php file after it is run, it goes back to the "virgin" version of Ray-temp_rowby.php with no clue of where the html was saved.    Enlightenment would be fascinating, but I guess unnecessary as long as, with your revision, I get the output I am looking for :)

In any case I would say an _output suffix would solve the mystery?

Thanks for your expert help!

Rowby
RAY-temp-rowby.php
0
 
LVL 108

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 39901385
OK, I think I understand it better now.  I was just writing the data to the browser so you could see if it looked correct; nothing was being saved on your server.  This version should write it back to the server file system.  It will likely work best if you use a relative path (a local file name such as musicology.html) for the input, because the output name is derived from the input name and PHP cannot write a file back to an http:// URL.  Your script will need write permissions for the directory.  It probably has the right permissions already.  Plug in the file name and give it a try.

<?php // demo/temp_rowby.php
error_reporting(E_ALL);


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28360528.html


// ORIGINAL TEST DATA FILE LINK FROM THE POST AT EE
// (REPLACE WITH A LOCAL FILE NAME, E.G. 'musicology.html', SO THE OUTPUT FILE CAN BE WRITTEN)
$url = 'http://filedb.experts-exchange.com/incoming/2014/02_w08/836108/musicology.html';

// MAKE THE NEW FILE NAME
$arr = explode('.', $url);
$cnt = count($arr)-2;
$arr[$cnt] .= '_output';
$out = implode('.', $arr);

// READ THE INPUT FILE
$htm = file_get_contents($url);

// A REGULAR EXPRESSION TO ISOLATE JAVASCRIPT
$rgx
= '#'             // REGEX DELIMITER
. '\<script'      // START OF JAVASCRIPT
. '.*?'           // ANYTHING
. '\</script\>'   // END OF JAVASCRIPT
. '#'             // REGEX DELIMITER
. 'is'            // CASE INSENSITIVE, DOT MATCHES NEWLINE
;

// NULLIFY THE JAVASCRIPT
$new = preg_replace($rgx, '', $htm, -1, $num);

// WRITE THE NEW FILE
file_put_contents($out, $new);

// SHOW A LINK TO THE NEW FILE
echo PHP_EOL . '<a href="' . $out . '">' . $num . ' REPLACEMENTS HERE: ' . $out . '</a>';
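
The explode/implode step above works for names shaped like name.html. As a possible alternative, PHP's pathinfo() can split the name, which also copes with folder names that contain dots and with files that have no extension. This is only a sketch, assuming PHP 7 or later and local paths; output_name is a hypothetical helper.

```php
<?php
// Hypothetical helper: build "name_output.ext" next to the input file.
function output_name(string $path): string
{
    $info = pathinfo($path);

    // pathinfo() reports "." as the directory for a bare file name
    $dir = ($info['dirname'] ?? '.') === '.' ? '' : $info['dirname'] . '/';

    // Files without an extension simply get the suffix appended
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';

    return $dir . $info['filename'] . '_output' . $ext;
}

echo output_name('musicology.html'), PHP_EOL;                 // musicology_output.html
echo output_name('pages/hasom-contact-spam.html'), PHP_EOL;   // pages/hasom-contact-spam_output.html
echo output_name('notes'), PHP_EOL;                           // notes_output
```

The result could then be passed to file_put_contents() in place of $out.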

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 39901417
Hi Ray

Works perfectly!

I'll award points once I test it on a few more pages.

But I'm sure it is fine!

Appreciations again!

Rowby
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39901491
Glad to help!  

All the best,
~Ray
0
 
LVL 9

Author Closing Comment

by:Rowby Goren
ID: 39901548
Hi Ray

Took care of all the rogue javascript problems.  

Thanks!
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 39901763
Great!  Thanks for the points and thanks for using EE, ~Ray
0
