Question

Free for all links page mods, just need to make it refuse certian words and from any page except certian ones. 500points

Asked by: Caiapfas

I'm trying to stop the spam!! I run a few for all links pages , but the "spammers" are making it so i dont want to.
Check it out , I get 1000 aday.
http://www.ccds.addr.com/wrc/links/addalink.htm

So what I was "trying" to do is, add a few protection measurse.

1. Have the abilty in the script to deny certian words (sex, make money, etc, etc)
2. DENY HTML/CSS
3. Deny anypost except from certian pages
example my page
my friedns page
Reason for this people have programs that add thousands of links everyDAY!!!! and never come to my page!!

I have tryed to make the mods, but i'm busy learning php. thanks

Script start
---------------------------------------------------------------------------------------------------------------
#!/usr/local/bin/perl

# Define Variables

$filename = "addalink.htm";
$linksurl = "http://www.ccds.addr.com/wrc/links/addalink.htm";
$linkscgi = "links.pl";
$linkstitle = "Free for all Links Page";
$database = "database.txt";

# Done
##############################################################################

# Get the input
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});

# Split the name-value pairs
@pairs = split(/&/, $buffer);

foreach $pair (@pairs) {
   ($name, $value) = split(/=/, $pair);

   $value =~ tr/+/ /;
   $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
   $value =~ s/<([^>]|\n)*>//g;
   $value =~ s/<//g;
   $value =~ s/>//g;
   $FORM{$name} = $value;
}

if ($FORM{'url'} eq 'http://' || $FORM{'url'} !~ /^(f|ht)tp:\/\/\w+\.\w+/) {
   &no_url;
}
if (!($FORM{'title'})) {
   &no_title;
}

# Enter our tags and sections into an associative array

%sections = ("busi","Business","comp","Computers","educ","Education",
           "ente","Entertainment","gove","Government",
           "pers","Personal","misc","Miscellaneous");

# Suck previous link file into one big string
open(FILE,"$filename");
@lines = <FILE>;
close(FILE);

$i=1;
foreach $line (@lines) {    
    if ($line =~ /\<li\>\<a href\=\"([^\"]+)\">([^<]+)<\/a>/) {
        if ($FORM{'url'} eq $1) {
            &repeat_url;
        }
        $i++;
    }
}

# Open Link File to Output
open (FILE,">$filename");

foreach $line (@lines) { # For every line in our data

   if ($line =~ /<!--time-->/) {
      @months = ('January','February','March','April','May','June',
             'July','August','September','October','November','December');

      @days = ('Sunday','Monday','Tuesday','Wednesday','Thursday',
             'Friday','Saturday');

      ($sec,$min,$hour,$mday,$mon,$year,$wday) = (localtime(time))[0,1,2,3,4,5,6];
      if ($sec < 10) { $sec = "0$sec"; }
      if ($min < 10) { $min = "0$min"; }
      if ($hour < 10) { $hour = "0$hour"; }
      if ($mday < 10) { $mday = "0$mday"; }
      $year += 1900;
      $date = "on $days[$wday], $months[$mon] $mday, $year at $hour:$min:$sec";
      print FILE "<!--time--><b>Last link was added $date</b><hr>\n";
   }
   elsif ($line =~ /<!--number-->/) {
      print FILE "<!--number--><b>There are <i>$i</i> links on this ";
      print FILE "page.</b><br>\n";
   }
   else {
       print FILE $line;
   }

   foreach $tag ( keys %sections) { # For every tag
      if ( ($FORM{'section'} eq $sections{$tag}) &&
         ($line =~ /<!--$tag-->/) ) {

         print FILE "<li><a href=\"$FORM{'url'}\">$FORM{'title'}</a>\n";
      }
   }
}

close (FILE);

# Return Link File
print "Location: $linksurl\n\n";

if ($database ne '') {
    open (DATABASE,">>$database");
    print DATABASE "$FORM{'url'}\n";
    close(DATABASE);
}

sub no_url {
   print "Content-type: text/html\n\n";
   print "<html><head><title>ERROR: No URL</title></head>\n";
   print "<body bgcolor=#FFFFFF text=#000000><center>";
   print "<h1>No URL</h1></center>\n";
   print "You forgot to enter a url you wanted added to the Free for ";  
   print "all link page.  Another possible problem was that your link ";
   print "was invalid.<p>\n";
   print "<form method=POST action=\"$linkscgi\">\n";
   print "<input type=hidden name=\"title\" value=\"$FORM{'title'}\">\n";
   print "<input type=hidden name=\"section\"";
   print "value=\"$FORM{'section'}\">\n";
   print "URL: <input type=text name=\"url\" size=50><p>\n";
   print "<input type=submit> * <input type=reset>\n";
   print "<hr>\n";
   print "<a href=\"$linksurl\">$linkstitle</a>\n";
   print "</form></body></html>\n";

   exit;
}

sub no_title {
   print "Content-type: text/html\n\n";
   print "<html><head><title>ERROR: No Title</title></head>\n";
   print "<body bgcolor=#FFFFFF text=#000000><center>";
   print "<h1>No Title</h1></center>\n";
   print "You forgot to enter a title you wanted added to the Free for ";
   print "all link page.  Another possible problem is that you title ";
   print "contained illegal characters.<p>\n";
   print "<form method=POST action=\"$linkscgi\">\n";
   print "<input type=hidden name=\"url\" value=\"$FORM{'url'}\">\n";
   print "<input type=hidden name=\"section\"";
   print "value=\"$FORM{'section'}\">\n";
   print "TITLE: <input type=text name=\"title\" size=50><p>\n";
   print "<input type=submit> * <input type=reset>\n";
   print "<hr>\n";
   print "<a href=\"$linksurl\">$linkstitle</a>\n";
   print "</form></body></html>\n";

   exit;
}

sub repeat_url {
   print "Content-type: text/html\n\n";
   print "<html><head><title>ERROR: Repeat URL</title></head>\n";
   print "<body bgcolor=#FFFFFF text=#000000><center><h1>Repeat URL</h1></center>\n";
   print "Sorry, this URL is already in the Free For All Link Page.\n";
   print "You cannot add this URL to it again.  Sorry.<p>\n";
   print "<a href=\"$linksurl\">$linkstitle</a>";
   print "</body></html>\n";

   exit;
}


---------------------------------------------------------------------------------------------------------------

This Question has been solved and asker verified All Experts Exchange premium technology solutions are available to subscription members.

Subscribe now for full access to Experts Exchange and get

Instant Access to this Solution

  • Plus...
  • 30 Day FREE access, no risk, no obligation
  • Collaborate with the world's top tech experts
  • Unlimited access to our exclusive solution database
  • Never be left without tech help again

Subscribe Now

Asked On
2004-07-14 at 12:09:02ID21058885
Topic

Miscellaneous Programming

Participating Experts
10
Points
500
Comments
57

Trusted by hundreds of thousands everyday for fast, accurate and reliable tech support.

  • "The time we save is the biggest benefit of Experts Exchange to Warner Bros. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange." Mike Kapnisakis, Warner Bros.
  • "Our team likes having a resource that is more secure than just using Google and most experts using this service really know their stuff. It's nice to look here first versus using Google." Dayna Sellner, Lockheed Martin
  • "Anytime that I've been stumped with a problem, 9 out of 10 times Experts Exchange has either the accepted solution or an open discussion of the potential solution to the problem." Kenny Red, eBay Inc.

See what Experts Exchange can do for you.

Got a question?

We've got the answer.

Experts Exchange has been collecting answers to technology questions since 1996…3 million and counting! If you have a question, chances are we already have your answer.

Screenshot of Experts Exchange Knowledgebase

Need individual assistance?

Our experts are ready to help.

If you can't find the exact answer you're looking for, ask our exclusive community of 50,000 experts. You’ll get a personalized answer from a trusted professional.

Screenshot of Experts Exchange Knowledgebase

Want to learn from the best?

Read articles from industry experts.

Thousands of free tech tips, tricks, how-to’s and tutorials are available in our peer reviewed articles section. See for yourself how smart our experts are, no login required.

Screenshot of an Article

Working on a long term project?

Store your work and research.

Save solutions to your questions, answers you’ve discovered through searching plus helpful articles in your personal knowledgebase for easy future access.

Screenshot of Experts Exchange Knowledgebase

Access the answers to your technology questions today.

Subscribe Now

30-day free trial. Register in 60 seconds.

What Makes Experts Exchange Unique?

Members of the expert community talk about why the experience at Experts Exchange is different than what you will find anywhere else.

Trusted by the world's most respected brands.

image of each brand's logo

Faithfully serving IT professionals since 1996.

Experts Exchange Logo

Try it out and discover for yourself.

Subscribe Now

30-day free trial. Register in 60 seconds.

Related Solutions

  1. mod_perl?
    Hi, I have Apache 1.3.12 in our Sun unix box. I downloaded mod_perl for install, but from the documentation, I don't know how to change the install directory. The documentation says I need to 1. perl Makefile.pl 2. make 3. make install
  2. Mod Perl on Fedora Core 2
    I installed Fedora Core 2 from cd. With this install i have: mod_perl-1.99_12-2.1 I would assume this means I can develop my own mod perl applications without installing anything else? If i add this to my httpd.conf file Alias /perl/ /home/noc/ PerlModule Apache::Regist...
  3. Apache2 hangs on startup with mod_perl
    Dear Experts, I find myself completely unable to figure out why mod_perl won't run on one of my SuSE servers. The machine is running apache2.0.49, as supplied/packaged by SUSE, and the distribution is 9.1 I have a running Apache server on this machine, but if I enable mod_p...

Free Tech Articles

  1. WARNING: 5 Reasons why you should NEVER fix a computer for free.
    It is in our nature to love the puzzle. We are obsessed. The lot of us. We love puzzles. We love the challenge. We thrive on finding the answer. We hate disarray. It bothers us deep in our soul. W...
  2. SCCM OSD Basic troubleshooting
    SCCM 2007 OSD is a fantastic way to deploy operating systems, however, like most things SCCM issues can sometimes be difficult to resolve due to the sheer volume of logs to sift through and the dispe...
  3. Migrate Small Business Server 2003 to Exchange 2010 and Windows 2008 R2
    This guide is intended to provide step by step instructions on how to migrate from Small Business Server 2003 to Windows 2008 R2 with Exchange 2010. For this migration to work you will need the fo...
  4. Create a Win7 Gadget
    This article shows you how to create a simple "Gadget" -- a sort of mini-application supported by Windows 7 and Vista. Gadgets can be dropped anywhere on the desktop to provide instant information, ...
  5. Outlook continually prompting for username and password
    There have been a lot of questions recently regarding Outlook prompting for a username and password whilst using Exchange 2007. There are a few reasons why this would happen and I will try to cover t...
  6. Backup Exchange 2010 Information Store using Windows Backup
    There seems to be quite a lot of confusion around the ability to backup Exchange 2010 using the built in Windows Backup feature. This stems from the omission of this feature prior to Exchange 2007 s...

Cloud Class Webinars

  1. Avoiding Bugs in Microsoft Access
    Alison Balter takes and in-depth look at avoiding bugs in Access. In this webinar you will learn about using the immediate window to debug your applications, invoking the debugger, using breakpoints to troubleshoot, stepping through code, setting the next statement to execute, ...
  2. Top 10 Best New Features in Visio 2010
    Scott Helmers gives live demonstrations of the top 10 new features in Visio 2010. This webinar will teach you how to create compelling diagrams by adding shapes to the page with a single click, linking the shapes in a diagram to data in Excel (or SQL Server, or SharePoint), ...
  3. IT Consultant Business Secrets Revealed
    Michael Munger, Experts Exchange tech pro and IT consultant, pulls back the curtain on his very successful businesses and answers question on every IT consultant and business owner should know about. He shares secrets on what he did to solve the 5 most common problems in IT, ...
  4. Disaster Recovery and Business Continuity
    Quest CTO, Mike Billon, gives an overview of the steps involved in building a dunamic disaster recovery plan. Through case studies and an examination of software/hardware tooles for monitoring and testing, you'll gain a better understandin of where you are, where you want ...
  5. Organize Your Visio Diagrams with Containers and Lists
    Scott Helmers uses cross functional flowcharts, wireframe diagrams, data graphic legends and seating charts to teach you: how to ustilize all three new structured diagram components in Visio 2010, the best practices for organizeing shapes in previous version of Visio, how to organize ...
  6. How to Us Objects, Properties, Events and Methods in Microsoft Access
    Alison Dalter gives an in-depbth look at objects, properties, events and methods in Microsoft Access. In this webinar you will learn about using the object browser, referring to objects, working with properties and methods, working with object variables, understanding the ...

Join the Community

Give a Little. Get a Lot.

Join the community of experts here and help other tech pros by answering question in your area of expertise. You can earn FREE access to all Experts Exchange's premium features and resources.

Join the Community

Answers

 

by: CaiapfasPosted on 2004-07-14 at 12:14:28ID: 11552591

also it would be nice to have an email field so the user HAS to enter their email and it then saves their email and url in referrers.dat, for my records.

and maybe a way to only allow 2 post perday per ip.

http://www.blahblah.com (youremail@email.com)

 

by: elantraPosted on 2004-07-14 at 12:43:22ID: 11552933

You should build a queue so when links are submitted they must be approved by editors.  Editors can be registered users in which you provide the status because they have been around a while.  This way, each link can be inspected and insured it is valid and in the proper location.

 

by: CaiapfasPosted on 2004-07-14 at 14:23:10ID: 11553898

I want to make it self controled, so i will ad these featured to help control it

 

by: FishMongerPosted on 2004-07-14 at 16:01:50ID: 11554606

I'm surprised that you're only getting 1,000 per day!  Allowing anyone and everyone to post links without being reviewed is only asking for trouble.  You could limit the submissions to allow only 1 link per domain and/or IP address, but that still won't help  by very much.  So, until you decide to moderate the submissions, you won't be able to fix the problem.

On another note, your code could use some work.  You should look into using some of the Perl modules that will make the scripts easier to write, more robust, and more secure.  Here are a few things to read up on and use.

1)enable warnings
2)use the strict pragma in EVERY Perl script you write
3)run CGI scripts in taint mode
4)use the CGI module
5)use file locks  (i.e., flock) for reading/writing the flat file DB files (i.e., database.txt)
6)reading/writing MySQL (or other) database files
7)use one of the modules for calculating and formatting the date (POSIX, Date::Time, Time::Format, etc,)

Here's how the beginning of CGI scripts are generally written.

 #!/usr/bin/perl -Tw

use strict;
use CGI;
use CGI::Carp 'fatalsToBrowser';

 

by: rawiliamPosted on 2004-07-28 at 02:31:10ID: 11654383

I would recommend that you create a procedure so that when users register you send them a link to their e-mail address that needs to be verified - this should help a little with the spam issue. Also, check to see if the spammers are using web based e-mail accounts or not and maybe ban these or put in extra steps (e.g. must be verifued by admin) for those users.

I would also look to build an administrative back end with some kind of analysis as to how many postings are being done per user - blocking the IP address won't stop serious spammers as they will probably be on broadband and the ISPs tend to dynamically change their IP address even when thry're connected. If someone is porting like crazy, then you can assume they are a spammer and ban them (or take their e-mail address and spam them back!)

 

by: CaiapfasPosted on 2004-07-28 at 09:29:50ID: 11658402

I need help writting the perl...i dont want an admin section. to much work. i have way to many sites. juts want to do the aboved mentioned stuff.

 

by: swift99Posted on 2004-07-28 at 09:45:10ID: 11658670

You need to have the capacity to identify mis-spelled words as well, and have a "quarantine" status for those caught by the anti-spam measure that may actually be legitimate.  Maybe use soundex to identify possible issues.

Anti-spam is big business now, because it is so hard to do effectively.

Is it worth your while to invest in a commercial anti-spam engine?

 

by: MichaelChartPosted on 2004-07-28 at 14:43:48ID: 11661827

What you could do, you may not want to of course, is to have it count down from 10 seconds before it adds the link therefore any auto-spamming programs or people who add spam will be discouraged from adding.
Just a suggestion.

 

by: mtgradwellPosted on 2004-07-28 at 15:00:01ID: 11661935

The simplest technique I know is to display a short code. e.g. a random four digit number, and ask the visitor to enter that code. Only accept an entry if the visitor's code matches the displayed code.

Hackers can design a workaround which automatically reads the displayed code, but it wouldn't be worth their while unless your links page is almost vying with google for popularity. In that case you could add refinements like displaying the random code as an image at a randomly chosen location on the page, using random fonts.

 

by: swift99Posted on 2004-07-29 at 06:22:55ID: 11666041

Extending on mtgradwell's thought:

Or use a jpeg to display the image of the digits in a funky font on a patterned background.  That moves spoofing the digits into the realm of advanced image processing, which is beyond the means of the average spammer today (not guaranteeing tomorrow).

 

by: tomputsPosted on 2004-07-29 at 14:12:05ID: 11671411

Another option could be to call the submitted and scan the returned page for certain keywords ("screenscraping"). I'm not sure how this is done with perl, since I've only used asp. A quick search on google shows that you choose from a couple of modules to implement this.

 

by: matthew1471Posted on 2004-07-30 at 01:00:09ID: 11674508

The best thing I would do, is flag by IP, IP addresses can be spoofed, but setting a limit on how many per IP is a good idea, that and e-mail address (but to be honest e-mail addresses are easy to come by)

this'd only work on if it was the same person doing this

http://BlogX.co.uk try flooding that with comments

 

by: CaiapfasPosted on 2004-07-30 at 14:14:28ID: 11681420

any help on the actual programming of the perl to do this


1. Have the abilty in the script to deny certian words (sex, make money, etc, etc)
2. DENY HTML/CSS
3. Deny anypost except from certian pages
example my page
my friedns page
4. Deny more than 3 post per 24 hour period

and


it would be nice to have an email field so the user HAS to enter their email and it then saves their email and url in referrers.dat, for my records.

http://www.blahblah.com (youremail@email.com)

 

by: FishMongerPosted on 2004-07-31 at 09:45:33ID: 11684834

>> 1. Have the abilty in the script to deny certian words (sex, make money, etc, etc)

This is going to be very difficult to write from scratch.  First you need to put together the list of words (AND PHRASES?), then you'll need to take each one of them and list their “normal” misspelling, then build a list of their possible obfuscations.  Once you have that “database” assembled, then you can start working on how to handle the searching/matching.

For this part of the script, I think you'd be better off by following swift99's suggestion of investing in a commercial product.

>> 2. DENY HTML/CSS

Based on the links you currently have, you can filter most of this with 1 or 2 simple regex's.  For example, here's one that will strip out the style tags from the links:

   $FORM{'url'} =~ s/style=[^>]+//i;

If you want to strip out the form submissions, this regex should do the job:

   $FORM{'url'} =~ s!/\?[^"]+!!;

>> 3. Deny anypost except from certian pages

I'm not exactly sure what you mean.

 

by: FishMongerPosted on 2004-07-31 at 10:01:17ID: 11684935

Based on some of my suggestions, I've reworked portions of your script that you may want to review.  Each of the subs uses a different method for outputting the html; however, you should use use only 1 of those methods.  My purpose was to show you the various options.

#!/usr/local/bin/perl

use strict;
use POSIX qw(strftime);
use CGI qw(:all);
use CGI::Carp 'fatalsToBrowser';


# Define Variables

my $filename = 'addalink.htm';
my $linksurl = 'http://www.ccds.addr.com/wrc/links/addalink.htm';
my $linkscgi = 'links.pl';
my $linkstitle = 'Free for all Links Page';
my $database = 'database.txt';

my $q = new CGI;
my %FORM = $q->Vars;


# Done
##############################################################################


&no_url if ($FORM{'url'} eq 'http://' || $FORM{'url'} !~ /^(f|ht)tp:\/\/\w+\.\w+/);
&no_title if (!($FORM{'title'}));

# strip out the style tags
$FORM{'url'} =~ s/style=[^>]+//i;

# strip out the query string
$FORM{'url'} =~ s!/\?[^"]+!!;


# Enter our tags and sections into an associative array

my %sections = (
            busi => 'Business',
            comp => 'Computers',
            educ => 'Education',
            ente => 'Entertainment',
            gove => 'Government',
            pers => 'Personal',
            misc => 'Miscellaneous',
            );


# Suck previous link file into an array

open FILE,$filename or die $!;
my @lines = <FILE>;
close(FILE);

my $i=1;
foreach my $line (@lines) {    
    if ($line =~ /\<li\>\<a href\=\"([^\"]+)\">([^<]+)<\/a>/) {
        if ($FORM{'url'} eq $1) {
            &repeat_url;
        }
        $i++;
    }
}

# Open Link File to Output
open FILE,">$filename" or die $!;

foreach my $line (@lines) { # For every line in our data

   if ($line =~ /<!--time-->/) {
      my $date = strftime("%A, %B %d, %Y at %H:%M:%S %Z", localtime(time));
      print FILE "<!--time--><b>Last link was added $date</b><hr>\n";
   }
   elsif ($line =~ /<!--number-->/) {
      print FILE "<!--number--><b>There are <i>$i</i> links on this ";
      print FILE "page.</b><br>\n";
   }
   else {
       print FILE $line;
   }

   foreach my $tag ( keys %sections) { # For every tag
      if ( ($FORM{'section'} eq $sections{$tag}) &&
         ($line =~ /<!--$tag-->/) ) {

         print FILE "<li><a href=\"$FORM{'url'}\">$FORM{'title'}</a>\n";
      }
   }
}

close (FILE);

# Return Link File
#print "Location: $linksurl\n\n";

if ($database ne '') {
    open (DATABASE,">>$database" or die $!);
    print DATABASE "$FORM{'url'}\n";
    close(DATABASE);
}


sub no_url {  #  output HTML using a here document
   print <<HTML;
      Content-type: text/html\n\n
      <html><head><title>ERROR: No URL</title></head>
      <body bgcolor=#FFFFFF text=#000000><center>
      <h1>No URL</h1></center>
      You forgot to enter a url you wanted added to the Free for all link page.
      Another possible problem was that your link was invalid.<p>
      <form method=POST action="$linkscgi">
      <input type=hidden name="title" value="$FORM{'title'}">
      <input type=hidden name="section""
      value="$FORM{'section'}">
      URL: <input type=text name="url" size=50><p>
      <input type=submit> * <input type=reset>
      <hr>
      <a href="$linksurl">$linkstitle</a>
      </form></body></html>
HTML

   exit;
}

sub no_title {  #  output HTML using the CGI.pm object oriented syntax
   print (
      $q->header,
      $q->start_html(-title=>'ERROR: No Title', -bgcolor=>'#FFFFFF', -text=>'#000000'),
      $q->br."\n",
      $q->h1({-align=>center}, 'No Title') . "\n",
      $q->p('You forgot to enter a title you wanted added to the Free for all link page.',
            $q->br."\n",
            'Another possible problem is that you title contained illegal characters.')."\n",
      $q->start_form(-method=>'POST', -action=>$linkscgi),
      $q->hidden(-name=>'url', -value=>$FORM{'url'})."\n",
      $q->hidden(-name=>'title', -value=>$FORM{'section'})."\n".'TITLE: ',
      $q->textfield(-name=>'title', -size=>50)."\n",
      $q->submit, ' * ', $q->reset,
      $q->hr."\n",
      $q->a({-href=>$linksurl}, $linkstitle)."\n",
      $q->end_form,
      $q->end_html,
   );

   exit;
}

sub repeat_url {  # output using the CGI.pm standard syntax
   print (
      header,
      start_html(-title=>'ERROR: Repeat URL', -bgcolor=>'#FFFFFF', -text=>'#000000')."\n",
      h1({-align=>center},  'Repeat URL')."\n",
      p("Sorry, this URL is already in the Free For All Link Page\n",
        "You cannot add this URL to it again.  Sorry.\n"),
      a({-href=>$linksurl}, $linkstitle),
      end_html,
    );

   exit;
}

 

by: FishMongerPosted on 2004-07-31 at 14:14:45ID: 11685742

You may be interested in using the URI::URL module for parsing the URL.

Detailed Documentation:
http://search.cpan.org/~gaas/libwww-perl-5.10/lib/URI/URL.pm

Lastest Version:
http://search.cpan.org/~gaas/URI-1.31/URI/URL.pm

Here's a test script I ran that parsed the URL from 2 of your links.

use URI::URL;

$url1 = new URI::URL 'http://www.juiceboosted.com/index.php?requestId=2&Id=ix1pq';
$url2 = new URI::URL 'http://www.proofofprofit.com/mikl4545/?Jul28?Jul28?Jul28?Jul28';

foreach $url ($url1, $url2) {
   print 'scheme: ' . $url->scheme . "\n";
   print 'netloc: ' . $url->netloc . "\n";
   print 'path: ' . $url->path . "\n";
   print 'query: ' . $url->query . "\n";
   print $/ x 2;
}

-- output --
scheme: http
netloc: www.juiceboosted.com
path: /index.php
query: requestId=2&Id=ix1pq


scheme: http
netloc: www.proofofprofit.com
path: /mikl4545/
query: Jul28?Jul28?Jul28?Jul28

 

by: FishMongerPosted on 2004-07-31 at 15:50:44ID: 11686034

You might be wondering why I'm suggesting using these modules.  Well, these modules (and others that I haven't mentioned) will make it easier to parse, compare, and save or reject the links that are being submitted.  They will also make your scripts easier to write/read/maintain.

 

by: swift99Posted on 2004-08-02 at 06:57:43ID: 11693651

FishMonger's got the goods here ... follow up on his excellent work.

 

by: FishMongerPosted on 2004-08-02 at 08:41:13ID: 11694778

swift99,  thank you for the compliment. :)


Caiapfas ,

I have another suggestion.  Currently, you're reading-in and rewriting the entire 'addalink.htm' file.  It would be more effiecient to use a (relational) database such as MySQL to store the urls and other related info and have the 'links.pl' script dynamicly create/output the html links page.  If you don't want to use MySQL (or some other relational database), you can use various types of flat files as if they were a relational database.  I found this info in one of Tintin's posts in another question.

http://www.experts-exchange.com/Programming/Programming_Languages/Perl/Q_21076389.html

http://search.cpan.org/~jzucker/DBD-AnyData-0.08/AnyData.pm

This may not be as good as MySQL, but it still would be more effiecient that what you're currently using.

 

by: swift99Posted on 2004-08-02 at 09:00:18ID: 11694952

I recommend Firebird over MySql because Firebird has true two-phase commit in all implementations, but you have to pick and choose MySql implementations to get two-phase commit and the reliability that goes with it.

 

by: CaiapfasPosted on 2004-08-02 at 09:24:21ID: 11695264

FishMonger,


Almost there.


1. Have the abilty in the script to deny certian words (sex, make money, etc, etc)
For this one I can make the list of words, no SWEAT..but can you make the script check badwords.txt before accepting the post. I can handle the rest.

2. DENY HTML/CSS/Javascript
Will it deny javascript/html?

3. Deny anypost except from certian pages
example my page
my friedns page
They have programs out there that just has the url to my perl script and can add links to the page from that program.
or

all you need to steal is this code...

<form method=POST action="http://www.ccds.addr.com/cgi-bin/addalink/links.pl" >

Title: <input type=text name="title" size=30><br>
URL: <input type=text name="url" size=55><br>
Section to be placed in: <select name="section">

<option> Business
<option> Computers
<option> Education
<option> Entertainment
<option> Government
<option> Personal
<option selected> Miscellaneous

</select>

<input type=submit value="Add"> * <input type=reset>

</form>


and you can add to my page...So i want to deny post/accepting post only from certian pages laided out in the perl script

example :

# Referrers

$referrers = "http://www.blablah.com";
$referrers = "http://www.blablah22.com";


4. Deny more than 3 post per 24 hour period from same IP

does it do this?


it would be nice to have an email field so the user HAS to enter their email and it then saves their email and url in referrers.dat, for my records.

http://www.blahblah.com (youremail@email.com)

 

by: FishMongerPosted on 2004-08-02 at 18:29:00ID: 11699954

I'll need you to post some lines from your badwords.txt file, so I can work out how to parse it and do the pattern matching against it.  Do you need to run these tests against the url itself, or the link discription, or both?


>> <form method=POST action="http://www.ccds.addr.com/cgi-bin/addalink/links.pl" >

It's not a good idea to use the full address, it would be much better to use relative paths.  It would be even better to have it refer to itsef (using a variable) and have the links.pl script build the page dynamiclly.

 

by: CaiapfasPosted on 2004-08-03 at 12:14:08ID: 11708706

Fish,

If possible can we make the badword.txt non-case sentive...meaning if i have the word free and the user type freE it still gets denied..

link to file
http://www.ticketstogo.com/aadownload/badwords.txt


and how about a simple list of referrers? to deny post excetp from certian urls.
and does it deny javascript/html?


 

by: FishMongerPosted on 2004-08-04 at 12:03:40ID: 11719486

There are several methods for checking for the bad words.  Here's one method.

open BAD, "badword.txt" or die $!;
@badwords = <BAD>;
close BAD;

$badwords = join('|', @badwords);

&badurl if ($FORM{'title'} =~ /$badwords/i or $FORM{'url'} =~ /$badwords/i);

sub badurl {

   insert code for displayng the rejection page due to inapropriate words

}

I'm tied up on other projects at the moment, but as soon as I can, I'll respond to your other requests.

 

by: FishMongerPosted on 2004-08-04 at 12:06:56ID: 11719527

One small point I forgot; you need to remove the \n from each element of the array, prior to the join statement.

chomp @badwords;
$badwords = join('|', @badwords);

 

by: CaiapfasPosted on 2004-08-04 at 15:19:04ID: 11721357

FishMonger,

I will have to wait for you.. but i'm getting spammed so much on that page.
please helllllllllllpppppppppppppppppppppp

 

by: FishMongerPosted on 2004-08-04 at 18:15:37ID: 11722485

I'm still tied up, so I still can't do any scripting, but I need to know if you have the ability to install a couple modules.  If not, then we'll have to do things the hard way.  My recomendation is to throw away your current script and static html page and use a more robust cgi script, but I don't have the time to write the complete script for you.  My plan is to work within your current approach and show you a few module options that should help resolve some, but not all, of the issues.

 

by: CaiapfasPosted on 2004-08-05 at 11:12:50ID: 11729044

what do you mean install modules? how would i do this, but files in the same direcotry?

also, I think these few measures added will cut down on 90% of my spam issues...
its all i want/need.

 

by: FishMongerPosted on 2004-08-05 at 23:25:38ID: 11733196

Caiapfas,

Unfortunately, I still haven't had the time to work on you script needs.  My work projects and entertaining visiting relatives has kept me pretty busy.  I have a couple of ideas that will help resolve some, if not, most of your concerns, unfortunately I may not be able to work on the code until the weekend.

>> what do you mean install modules? how would i do this

Modules are additional (external) functions and subroutines that you import into your scripts, which will make yours scripts more robust, without having you to “reinvent the wheel” by writing complicated functions.  For example, the cg i module that I showed you is a standard module distributed with Perl that, among other things, simplifies the importing/processing of form submissions.  If you are hosting the site(s) on your own server, installing Perl modules can be as easy as this command:

ppm install module_name

or

perl -MCPAN -e install module_name

However, if the site is hosted by your ISP (or someone else), it can/will be more complicated, but is still do-able.

 

by: FishMongerPosted on 2004-08-08 at 20:20:53ID: 11749703

Caiapfas,

I haven't heard weather or not you'll be able to install any modules. If you will not be able to install any modules, then it will not be beneficial to either of us if I rewrite most or all of your script to use those modules. So instead, I'll make a few suggestions that will help clear up some of the issues.

Your 'url' field requires the person to specify the protocol, however, your page doesn't mention that requirement. So, you might want to add a selection field (just to the left of the url field) listing the accepted protocols with http:// being the default.

Currently, you have 3 separate "rejection" subroutines, of which, 2 are nearly identical. Following along with that approach, one of my prior posts suggested adding another sub for rejecting the "bad words". Instead, It would be better to have 1 rejection subroutine and pass a variable to it that holds the rejection message(s). As you perform each of the tests, i.e., 'no title', 'bad words', 'no url', etc., you can build up the rejection message and then call the rejection sub, if needed.

When testing the URL, you might want to check its length and reject it if it’s greater than X chars.  Currently, some of them are quite long.  You can also do a simple regex and reject if certain keywords like style and script are found.

>> Deny any post except from certain pages
There are a couple of CGI Environmental Variables that might help with this, namely: HTTP_REFERER and REMOTE_ADDR.  For example, as a starting point, you could do something similar to this:

if($ENV{HTTP_REFERER} !~ m!http://(www\.)?$mydomain/!) {
  $reject .= “your rejection message”;
}

If you add a required email field (and test for its validity), the combination of the $ENV{HTTP_ADDR} and the email address can be logged in a referrers.dat file.

Now, everything you need, can be done without the use of any additional modules, however, the modules will make it much easier to design, adds readability and maintainability as well as making it more bullet proof.  And, if you switch to using a relational database instead of the flat files, you'll add speed, and flexibility.

 

by: CaiapfasPosted on 2004-08-08 at 22:36:29ID: 11750084

ok, i perfer not to install any modules.
I only want meager security measures i mentioned for this will cut 95% of the spam.

For the referer denyer...how can i add more than one.

also..could you kindly add the script bits you outlayed on this page in the complete code for me. I tryed many errors =}

and the only thing we havent written code for it seems is the referrers.dat file. I would like the script to add the referrer and email address to this file. and deny adding anymore links ffor the same refferer

and then the very last

only 3 submits in 48 hours per ip, can be logged in the refferers.dat file

 

by: FishMongerPosted on 2004-08-08 at 23:16:25ID: 11750219

Without the benefit of the modules and the database, some of these tasks are going to be difficult and/or make your site run slow due to the extra reading/writing of the flat text files.

My schedule is really full right now, but as soon as I can, I'll work on the script.  However, I can't promise to provide you a complete and fully debugged script that does everything you want.  Afterall you need to remember that people on this site are here providing free tech support, and would prefer to assist you in writing the script, instead of writing it for you.

Here's another recomendation I can make; use an IDE program such as Komodo to help you write and debug your (Perl and PHP) scripts.
http://activestate.com/Products/Komodo/

 

by: FishMongerPosted on 2004-08-08 at 23:46:41ID: 11750316

If you feel that you need someone to write the complete scripts (form script and the processing script) and have them done in a hurry, I suggest closing this question, and post a new question in the Perl section.  Also, be sure to make it clear that you don't want to use any modules that are not in the base distribution.  That way other Perl experts will help and probably come up with solutions that I've missed.

 

by: CaiapfasPosted on 2004-08-09 at 10:23:10ID: 11754980



ok

Where do i put this in the script?
-----------------------

open BAD, "badword.txt" or die $!;
@badwords = <BAD>;
close BAD;

chomp @badwords;
$badwords = join('|', @badwords);

&badurl if ($FORM{'title'} =~ /$badwords/i or $FORM{'url'} =~ /$badwords/i);

sub badurl {

   insert code for displayng the rejection page due to inapropriate words <<<<what goes here text????

}
-----------------------------
where do i insert this

if($ENV{HTTP_REFERER} !~ m!http://(www\.)?$mydomain/!) {
  $reject .= “your rejection message”;
}

and how can i make it allow more than one?
--------------------------


just these 2 measures will save me 1000000000's of submits

 

by: CaiapfasPosted on 2004-08-09 at 10:26:29ID: 11755019

also where do i put this??


  $FORM{'url'} =~ s!/\?[^"]+!!;

 

by: adgPosted on 2004-08-10 at 13:32:56ID: 11767400

Why not use the "ban list" regexp in that other script I did for you.

# Load ban list
my @banlist;
if (-s $banlistfile) {
     open (TXTFILE,"$banlistfile");
     flock(TXTFILE,LOCK_SH);
     eval {
          @banlist = <TXTFILE>;
     };
     flock(TXTFILE,LOCK_UN);
     close(TXTFILE);
     die("Could not load ban file: $@") if ($@);
}

----
and
----

# Returns 1 for banned URL
sub IsBanned {
     my($cUrl) = @_;

     # Scan through the ban list and construct regex's
     my $cExp;
     foreach (@banlist) {
          s/^\s*(.*?)\s*$/$1/;
          $cExp = $_;
          # Escape all non-word characters
          $cExp =~ s/(\W)/\\$1/g;
          # Treat * as ungreedy wildcard
          $cExp =~ s/\\\*/'(.*?)'/ge;
          # Enforce whole string comparison, use * to relax
          $cExp = '(?i)^' . $cExp . '$';

          # Fail if the URL matches pattern
          return 1 if ($cUrl =~ $cExp);
     }
     return 0;
}

----

These implement a "ban list" where you can use wildcards, like this:

Ban EVERYTHING:
*
Ban any URL containing sex anywhere in the link:
*sex*

Ban any URL ENDING with bad. URL containing bad somewhere else is ok:
*bad

Ban testing.mydomain.com, both http: and https:
http*://testing.mydomain.com/*

Ban any URL beginning with www and ending with biz, with cgi-bin somewhere in the path:
http://www.*.biz/*cgi-bin*

Remember, the beginning of the pattern and the end of the pattern must match the exact beginning and ending of the URL being tested, except when the pattern begins or ends with "*".

All patterns are case insensitive.

 

by: CaiapfasPosted on 2004-08-10 at 16:03:48ID: 11768326

adg,

GREAT !! could you help me add these few measures to stop the abuse of the system?

 

by: FishMongerPosted on 2004-08-11 at 12:06:09ID: 11777162

This still needs addtional work, but it meets (most of) your needs.

#!/usr/local/bin/perl -w

use strict;
use POSIX qw(strftime);
use CGI qw(:standard);
use CGI::Carp 'fatalsToBrowser';

# Define Global Variables

my $filename = 'addalink.htm';
my $linksurl = 'http://www.ccds.addr.com/wrc/links/addalink.htm';
my $linkscgi = 'links.pl';
my $linkstitle = 'Free for all Links Page';
my $database = 'database.txt';
my (@bandwords, $bandwords, $msg);
my $url = param{'url'};
my $title = param{'title'};
my %sections = (
           busi => 'Business',
           comp => 'Computers',
           educ => 'Education',
           ente => 'Entertainment',
           gove => 'Government',
           pers => 'Personal',
           misc => 'Miscellaneous',
           );

# Done
########################################################################

if($ENV{HTTP_REFERER} !~ m!http://www.ccds.addr.com/!) {

#   The submission is NOT comming from your domain, so
#   the error handling on this could be either displaying another error page
#   or a redirect to your link submission form

}

open BAD, "bandword.txt" or die $!;
@bandwords = <BAD>;
close BAD;

chomp @bandwords;
$bandwords = join('|', @bandwords);

if (!$title || $title =~ /$bandwords/i) {
   $msg = 'You forgot to enter a title or '.
          'the title contains banned words';
   reject_submission($msg, 'title');
}

if ($url) {
   if ($url !~ /^(f|ht)tp:\/\/\w+\.\w+/) {
      $msg = 'You forgot to specify the protocol, e.g., http://';
      reject_submission($msg, 'url');
   }

   if ($url =~ /style=[^>]+/i || $url =~ /\?[^"]+/) {
      $msg = 'Your url has style tags or is a form submission, ' .
             'which are not allowed.';
      reject_submission($msg, 'url');
   }

   if ($url =~ /$bandwords/i) {
      $msg = 'Your title contains banned words';
      reject_submission($msg, 'url');
   }
}
else {
   $msg = 'You forgot to provide the url for the link';
      reject_submission($msg, 'url');
}

# Suck previous link file into an array

open FILE,$filename or die $!;
my @lines = <FILE>;
close(FILE);

my $i=1;
foreach my $line (@lines) {
   if ($line =~ /\<li\>\<a href\=\"([^\"]+)\">([^<]+)<\/a>/) {
       if ($url eq $1) {
           &repeat_url;
       }
       $i++;
   }
}

# Open Link File to Output
open FILE,">$filename" or die $!;

foreach my $line (@lines) { # For every line in our data

  if ($line =~ /<!--time-->/) {
     my $date = strftime("%A, %B %d, %Y at %H:%M:%S %Z", localtime(time));
     print FILE "<!--time--><b>Last link was added $date</b><hr>\n";
  }
  elsif ($line =~ /<!--number-->/) {
     print FILE "<!--number--><b>There are <i>$i</i> links on this ";
     print FILE "page.</b><br>\n";
  }
  else {
      print FILE $line;
  }

  foreach my $tag ( keys %sections) { # For every tag
      if ( (param{'section'} eq $sections{$tag}) &&
         ($line =~ /<!--$tag-->/) ) {

        print FILE "<li><a href=\"$url\">$title</a>\n";
      }
  }
}

close (FILE);

# Return Link File
#print "Location: $linksurl\n\n";

if ($database ne '') {
   open DATABASE,">>$database" or die $!;
   print DATABASE "$url\n";
   close(DATABASE);
}


sub reject_submission {
   my $msg = shift;
   my $section = shift;

   print <<HTML;
     Content-type: text/html\n\n
     <html><head><title>Link Rejection</title></head>
     <body bgcolor=#FFFFFF text=#000000><center>
     <h1>Link Submission Rejected</h1></center>
     $msg<p>
     <form method=POST action="$linkscgi">
     <input type=hidden name="$section" value="$section">
     <input type=hidden name="section" value="$section">
     $section: <input type=text name="$section" size=50><p>
     <input type=submit> * <input type=reset>
     <hr>
     <a href="$linksurl">$linkstitle</a>
     </form></body></html>
HTML

   exit;
}

sub repeat_url {  # output using the CGI.pm standard syntax
  print (
     header,
     start_html(-title=>'ERROR: Repeat URL', -bgcolor=>'#FFFFFF', -text=>'#000000')."\n",
     h1({-align=>center},  'Repeat URL')."\n",
     p("Sorry, this URL is already in the Free For All Link Page\n",
       "You cannot add this URL to it again.  Sorry.\n"),
     a({-href=>$linksurl}, $linkstitle),
     end_html,
   );

  exit;
}

 

by: CaiapfasPosted on 2004-08-11 at 13:12:17ID: 11777889

i get error


Software error:
Bareword "center" not allowed while "strict subs" in use at /usr163/home/c/c/ccds/public_html/cgi-bin/addalink/links.pl line 152.
Execution of /usr163/home/c/c/ccds/public_html/cgi-bin/addalink/links.pl aborted due to compilation errors.

you can test it at the above link..

p.s how can i log the urls / email addresses?

 

by: FishMongerPosted on 2004-08-11 at 15:44:44ID: 11779158

As you can see, I didn't fully debug it.  The error message is informing use that I forgot to put the quotes around that word.  I also noticed a couple other errors of mine that need to be fixed.

change these lines:
   my $url = param{'url'};
   my $title = param{'title'};

   h1({-align=>center},  'Repeat URL')."\n",

to this:
   my $url = param('url');
   my $title = param('title');

   h1({-align=>'center'},  'Repeat URL')."\n",

Also, take off the openning and closing (  ) on the print statement in the repeat_url subroutine.


>> how can i log the urls / email addresses?

The first step, of course, is to add the email field to your addalink.htm page.  Then in the links.pl, you'd access it with: param('email').  You'll then need to verify that they entered a properly formatted email address, often via a regex.  However IMO, unless you take the next step and verify that it actually is a valid address, the logging of it would be pointless.  There is a CPAN module that will easily test the format and do a DNS MX lookup to verify that it is a valid address,  but you'll need to change your mind about the use/installation of additional modules.

The host name and/or IP address of the visitor can be retreived from 2 of the CGI environment variables i.e., REMOTE_HOST and REMOTE_ADDR.

Once you have the seperate pieces of info, you have several choices on how to handle the logging.  The best method is to use a relational database, but if you want to use a file, I'd suggest using a tied hash.  A tied hash is "linked" or "tied" directly to a file, so as you make changes to the hash, you are actually making changes to the file.  This will allow you to store the info as a "record".  Here's a little more info on using a tied hash.

http://www.perl.com/pub/a/2001/09/04/tiedhash.html
http://www-106.ibm.com/developerworks/linux/library/l-cptied.html

I need to jump onto a couple of other personal projects, but as soon as I can I'll work up some code examples.

 

by: CaiapfasPosted on 2004-08-11 at 17:59:59ID: 11779808

ok, now all sudmissions are being denied due to contaioning no title or bad words which i know it does not have. any help?

 

by: adgPosted on 2004-08-11 at 21:16:41ID: 11780478

Could you upload exactly what you have (but change the extension from .pl to .txt) any post a link to it here?

 

by: FishMongerPosted on 2004-08-12 at 07:36:13ID: 11784170

Lets make a small change to the test.

if (!$title) {
   $msg = 'You forgot to enter a title.';
   reject_submission($msg, 'title');
}
elsif ($title =~ /$bandwords/i) {
   $msg = "The title contained '$&' which is a banned word";
   reject_submission($msg, 'title');
}

 

by: CaiapfasPosted on 2004-08-12 at 07:55:15ID: 11784351

I'm getting the error "The title contained '' which is a banned word'


Badwords.txt - http://www.ccds.addr.com/EExchange/badwords.txt
links.pl - http://www.ccds.addr.com/EExchange/links.txt


To test - http://www.ccds.addr.com/wrc/links/addalink.htm



P.s. adg, could you help me log urls and emails like in the other script?

 

by: ozoPosted on 2004-08-12 at 09:05:31ID: 11785276

$bandwords = join('|', map{quotemeta}@bandwords);
elsif ($title =~ /\b($bandwords)\b/i) {

 

by: CaiapfasPosted on 2004-08-12 at 10:28:21ID: 11786117

that brings this error


syntax error at /usr163/home/c/c/ccds/public_html/cgi-bin/addalink/links.pl line 45, near "elsif"
Execution of /usr163/home/c/c/ccds/public_html/cgi-bin/addalink/links.pl aborted due to compilation errors.

 

by: FishMongerPosted on 2004-08-12 at 18:54:59ID: 11790030

Line 45 should be a blank line, so did you make ozo's corrections to the appropriate lines?

With ozo’s corrections, lines 43 thru 53 should be:

chomp @bandwords;
$bandwords = join('|', map{quotemeta}@bandwords);

if (!$title) {
   $msg = 'You forgot to enter a title.';
   reject_submission($msg, 'title');
}
elsif ($title =~ /\b$bandwords\b/i) {
   $msg = "The title contained '$&' which is a banned word";
   reject_submission($msg, 'title');
}

 

by: CaiapfasPosted on 2004-08-13 at 07:38:26ID: 11793385

ok, now i'm get a 500 error


Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, webmaster@.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.


Check out the script here:
links.pl - http://www.ccds.addr.com/EExchange/links.txt

 

by: adgPosted on 2004-08-13 at 23:23:57ID: 11798832

It might be the first line, try:

#!/usr/bin/perl -w

Try with -w and if it doesn't work try again without the -w, to see if that's killing the script too.

 

by: CaiapfasPosted on 2004-08-14 at 10:28:06ID: 11800908

that wasn't the issue. before I added the lastlines of code it worked fine. but wouldnt add the link. how can i fix it?

 

by: FishMongerPosted on 2004-08-14 at 17:08:59ID: 11802304

#!/usr/local/bin/perl -w

use strict;
use POSIX qw(strftime);
use CGI qw(:standard);
use CGI::Carp qw(fatalsToBrowser);

# Define Global Variables

my $filename = '/usr163/home/c/c/ccds/public_html/wrc/links/addalink.htm';
my $linksurl = 'http://www.ccds.addr.com/wrc/links/addalink.htm';
my $linkscgi = 'http://www.ccds.addr.com/cgi-bin/addalink/links.pl';
my $linkstitle = 'Free for all Links Page';
my $database = '/usr163/home/c/c/ccds/public_html/wrc/links/database.txt';
my (@bandwords, $bandwords, $msg);
my $url = param('url');
my $title = param('title');
my $section = param('section');
my %sections = (
           busi => 'Business',
           comp => 'Computers',
           educ => 'Education',
           ente => 'Entertainment',
           gove => 'Government',
           pers => 'Personal',
           misc => 'Miscellaneous',
           );

# Done
########################################################################

#if($ENV{HTTP_REFERER} !~ m!http://www.ccds.addr.com/!) {

#   The submission is NOT comming from your domain, so
#   the error handling on this could be either displaying another error page
#   or a redirect to your link submission form

#}

if (!$title) {
   $msg = 'You forgot to enter a title.';
   reject_submission($msg, 'title');
}

if (!$url) {
   $msg = 'You forgot to provide the url for the link';
      reject_submission($msg, 'url');
}

open BAD, "badwords.txt" or die $!;
@bandwords = <BAD>;
close BAD;

chomp @bandwords;
$bandwords = join('|', map{quotemeta}@bandwords);

if ($title =~ /(\b$bandwords\b)/i) {
   $msg = "The title contained '$1' which is a banned word";
   reject_submission($msg, 'title');
}

if ($url !~ /^(f|ht)tp:\/\/\w+\.\w+/) {
   $msg = 'You forgot to specify the protocol, e.g., http://';
   reject_submission($msg, 'url');
}

if ($url =~ /style=[^>]+/i || $url =~ /\?[^"]+/) {
   $msg = 'Your url has style tags or is a form submission, ' .
          'which are not allowed.';
   reject_submission($msg, 'url');
}

if ($url =~ /($bandwords)/i) {
   $msg = "The url contained '$1' which is a banned word";
   reject_submission($msg, 'url');
}


# Suck previous link file into an array

open FILE,$filename or die $!;
my @lines = <FILE>;
close(FILE);

my $i=1;
foreach my $line (@lines) {
   if ($line =~ m!<li><a href="([^"]+)">([^<]+)</a>!) {
       if ($url eq $1) {
           &repeat_url;
       }
       $i++;
   }
}

# Open Link File to Output
open FILE,">$filename" or die $!;

foreach my $line (@lines) { # For every line in our data

  if ($line =~ /<!--time-->/) {
     my $date = strftime("%A, %B %d, %Y at %H:%M:%S %Z", localtime(time));
     print FILE "<!--time--><b>Last link was added $date</b><hr>\n";
  }
  elsif ($line =~ /<!--number-->/) {
     print FILE "<!--number--><b>There are <i>$i</i> links on this ";
     print FILE "page.</b><br>\n";
  }
  else {
     print FILE $line;
  }

  foreach my $tag ( keys %sections) { # For every tag
     if ( $section eq $sections{$tag} &&
        ($line =~ /<!--$tag-->/) ) {
        print FILE "<li><a href=\"$url\">$title</a>\n";
      }
  }
}

close (FILE);

# Return Link File
#print "Location: $linksurl\n\n";

if ($database ne '') {
   open DATABASE,">>$database" or die $!;
   print DATABASE "$url\n";
   close(DATABASE);
}


sub reject_submission {
   my $msg = shift;
   my $section = shift;

   print <<HTML;
     Content-type: text/html\n\n
     <html><head><title>Link Rejection</title></head>
     <body bgcolor=#FFFFFF text=#000000><center>
     <h1>Link Submission Rejected</h1></center>
     $msg<p>
     <form method=POST action="$linkscgi">
     <input type=hidden name="$section" value="$section">
     <input type=hidden name="section" value="$section">
     $section: <input type=text name="$section" size=50><p>
     <input type=submit> * <input type=reset>
     <hr>
     <a href="$linksurl">$linkstitle</a>
     </form></body></html>
HTML

   exit;
}

sub repeat_url {  # output using the CGI.pm standard syntax
  print
     header,
     start_html(-title=>'ERROR: Repeat URL', -bgcolor=>'#FFFFFF', -text=>'#000000')."\n",
     h1({-align=>'center'},  'Repeat URL')."\n",
     p("Sorry, this URL is already in the Free For All Link Page\n",
       "You cannot add this URL to it again.  Sorry.\n"),
     a({-href=>$linksurl}, $linkstitle),
     end_html,
   ;

  exit;
}

 

by: FishMongerPosted on 2004-08-14 at 17:13:18ID: 11802316

There still are a number of things that can and should be done to improve that script, but it works.

 

by: CaiapfasPosted on 2004-08-14 at 22:17:41ID: 11803091

it still doesnt work getting a 500 error

 

by: FishMongerPosted on 2004-08-15 at 07:26:26ID: 11804072

Sounds like it's a configuration problem, possibly due to permission settings on the script or the web server isn't configured to execute scripts from that directory, i.e., no script alias.

Try running the script from the command line.  If it works, it confirms that the script is "ok", so you'll need to look at the permissions/configuration.  For example, here are a some example outputs from my test runs.

C:\Testing>links.pl title='sex and the city' url=http://www.hbo.com/city/ section=Miscellaneous
     Content-type: text/html


     <html><head><title>Link Rejection</title></head>
     <body bgcolor=#FFFFFF text=#000000><center>
     <h1>Link Submission Rejected</h1></center>
     The title contained 'sex' which is a banned word<p>
     <form method=POST action="links.pl">
     <input type=hidden name="title" value="title">
     <input type=hidden name="section" value="title">
     title: <input type=text name="title" size=50><p>
     <input type=submit> * <input type=reset>
     <hr>
     <a href="http://www.ccds.addr.com/wrc/links/addalink.htm">Free for all Links Page</a>
     </form></body></html>

C:\Testing>links.pl url=http://www.hbo.com/city/ section=Miscellaneous
     Content-type: text/html


     <html><head><title>Link Rejection</title></head>
     <body bgcolor=#FFFFFF text=#000000><center>
     <h1>Link Submission Rejected</h1></center>
     You forgot to enter a title.<p>
     <form method=POST action="links.pl">
     <input type=hidden name="title" value="title">
     <input type=hidden name="section" value="title">
     title: <input type=text name="title" size=50><p>
     <input type=submit> * <input type=reset>
     <hr>
     <a href="http://www.ccds.addr.com/wrc/links/addalink.htm">Free for all Links Page</a>
     </form></body></html>

C:\Temp>links.pl title=www.hbo.com/city url=www.hbo.com/city/ section=Miscellaneous
     Content-type: text/html


     <html><head><title>Link Rejection</title></head>
     <body bgcolor=#FFFFFF text=#000000><center>
     <h1>Link Submission Rejected</h1></center>
     You forgot to specify the protocol, e.g., http://<p>
     <form method=POST action="links.pl">
     <input type=hidden name="url" value="url">
     <input type=hidden name="section" value="url">
     url: <input type=text name="url" size=50><p>
     <input type=submit> * <input type=reset>
     <hr>
     <a href="http://www.ccds.addr.com/wrc/links/addalink.htm">Free for all Links Page</a>
     </form></body></html>


You should also take note on the link being submitted.  I chose that link (to a popular TV program) to show that just because it has a "bad" word in the title, doesn't always mean that it needs to be rejected.  

 

by: CaiapfasPosted on 2004-08-25 at 10:23:13ID: 11894995

i still cant get this script to work i have it chomd to 755 and have tried 777

 

by: FishMongerPosted on 2004-08-25 at 23:07:44ID: 11899867

I forgot an important caveat; even though a cgi script executes correctly on the command line doesn't mean that it will work in the browser.  I ran a couple tests and found the problems.

1)  The error handling (die) on opening the filehandles works as expected on the command line but we'll need to change it (create another subroutine) so it will work in the browser.

2)  Your apache error log should also record a "Premature end of script headers" error due to the 'print "Location: $linksurl\n\n";' being commented out.

Here's the complete script that I tested, but you'll wantto make a few modifications.  1)  You'll need to make the appropriate changes to paths, but I'd suggest using relative paths like I did, rather than the full paths/urls.  2)  You'll want to modify the additional error subroutine to use a more appropriate error message.  3)  This script is still using a mixture of the methods for outputing the html which was only intended to deminstrate the verious options.  You should decide which method you prefer, and use only that one method.


#!c:/perl/bin/perl -w

use strict;
use POSIX qw(strftime);
use CGI qw(:all);
use CGI::Carp 'fatalsToBrowser';


# Define Variables

my $filename = '../htdocs/addalink.htm';
my $linksurl = 'http://localhost/addalink.htm';
my $linkscgi = './cgi-bin/links.cgi';
my $linkstitle = 'Free for all Links Page';
my (@bandwords, $msg);
my $bandwords = '../htdocs/badwords.txt';
my $url = param('url');
my $title = param('title');
my $section = param('section');
my %sections = (
          busi => 'Business',
          comp => 'Computers',
          educ => 'Education',
          ente => 'Entertainment',
          gove => 'Government',
          pers => 'Personal',
          misc => 'Miscellaneous',
          );


my $q = new CGI;
my %FORM = $q->Vars;


# Done
##############################################################################

#if($ENV{HTTP_REFERER} !~ m!http://www.ccds.addr.com/!) {

#   The submission is NOT comming from your domain, so
#   the error handling on this could be either displaying another error page
#   or a redirect to your link submission form

#}

if (!$title) {
  $msg = 'You forgot to enter a title.';
  reject_submission($msg, 'title');
}

if (!$url) {
  $msg = 'You forgot to provide the url for the link';
     reject_submission($msg, 'url');
}

open BAD, $bandwords or open_error($!, "badwords.txt");
@bandwords = <BAD>;
close BAD;

chomp @bandwords;
$bandwords = join('|', map{quotemeta}@bandwords);

if ($title =~ /(\b$bandwords\b)/i) {
  $msg = "The title contained '$1' which is a banned word";
  reject_submission($msg, 'title');
}

if ($url !~ /^(f|ht)tp:\/\/\w+\.\w+/) {
  $msg = 'You forgot to specify the protocol, e.g., http://';
  reject_submission($msg, 'url');
}

if ($url =~ /style=[^>]+/i || $url =~ /\?[^"]+/) {
  $msg = 'Your url has style tags or is a form submission, ' .
         'which are not allowed.';
  reject_submission($msg, 'url');
}

if ($url =~ /($bandwords)/i) {
  $msg = "The url contained '$1' which is a banned word";
  reject_submission($msg, 'url');
}


# Suck previous link file into an array

open FILE,$filename or open_error($!, $filename);
my @lines = <FILE>;
close(FILE);

my $i=1;
foreach my $line (@lines) {
  if ($line =~ m!<li><a href="([^"]+)">([^<]+)</a>!) {
      if ($url eq $1) {
          &repeat_url;
      }
      $i++;
  }
}

# Open Link File to Output
open FILE,">$filename" or open_error($!, $filename);

foreach my $line (@lines) { # For every line in our data

 if ($line =~ /<!--time-->/) {
    my $date = strftime("%A, %B %d, %Y at %H:%M:%S %Z", localtime(time));
    print FILE "<!--time--><b>Last link was added $date</b><hr>\n";
 }
 elsif ($line =~ /<!--number-->/) {
    print FILE "<!--number--><b>There are <i>$i</i> links on this ";
    print FILE "page.</b><br>\n";
 }
 else {
    print FILE $line;
 }

 foreach my $tag ( keys %sections) { # For every tag
    if ( $section eq $sections{$tag} &&
       ($line =~ /<!--$tag-->/) ) {
       print FILE "<li><a href=\"$url\">$title</a>\n";
     }
 }
}

close (FILE);

# Return Link File
print "Location: $linksurl\n\n";

#if ($database ne '') {
#  open DATABASE,">>$database" or die $!;
#  print DATABASE "$url\n";
#  close(DATABASE);
#}


sub reject_submission {
  my $msg = shift;
  my $section = shift;

  print <<HTML;
    Content-type: text/html\n\n
    <html><head><title>Link Rejection</title></head>
    <body bgcolor=#FFFFFF text=#000000><center>
    <h1>Link Submission Rejected</h1></center>
    $msg<p>
    <form method=POST action="$linkscgi">
    <input type=hidden name="$section" value="$section">
    <input type=hidden name="section" value="$section">
    $section: <input type=text name="$section" size=50><p>
    <input type=submit> * <input type=reset>
    <hr>
    <a href="$linksurl">$linkstitle</a>
    </form></body></html>
HTML

  exit;
}

sub repeat_url {  # output using the CGI.pm standard syntax
 print
    header,
    start_html(-title=>'ERROR: Repeat URL', -bgcolor=>'#FFFFFF', -text=>'#000000')."\n",
    h1({-align=>'center'},  'Repeat URL')."\n",
    p("Sorry, this URL is already in the Free For All Link Page\n",
      "You cannot add this URL to it again.  Sorry.\n"),
    a({-href=>$linksurl}, $linkstitle),
    end_html;

 exit;
}

sub open_error {
   my $error = shift;
   my $file = shift;
   print
      header,
      start_html(-title=>'ERROR: open Failed', -bgcolor=>'#FFFFFF', -text=>'#000000')."\n",
      h1({-align=>center},  'File Handle Error')."\n",
      p("Sorry, an error occured while opening $file<br>$error"),
      a({-href=>$linksurl}, $linkstitle),
      end_html;

   exit;
}

 

by: FishMongerPosted on 2004-08-25 at 23:48:53ID: 11900040

I forgot to mention, that the reject_submission sub needs a minor change so that it will send the proper form info back to the script; i.e., it should send ALL three required fields.

Also, instead of hard coding the $linksurl, declare it like this:

my $linkscgi = url();

http://search.cpan.org/~lds/CGI.pm-3.05/CGI.pm#OBTAINING_THE_SCRIPT'S_URL

If you read the cpan doc on the cgi module, you'll find verious options for obtaining the url as well as other valuable info.

20120131-EE-VQP-002

3 Ways to Join

30-Day Free Trial

The Experts

98% positive feedback on 31,087 answers since March 2000. angeliii is a Microsoft Most Valuable Professional for his work with MS SQL Server & Develoment.

He has also proven his knowledge of Visual Basic Programming, PHP Scripting and Oracle Databases.

The Experts

97% positive feedback on 10,752 answers since July 2000. lrmoore has more than 18 years experience in the networking industry.

The six-time Mircosoft MVPs specialties include firewalls, virtual private networking, and network management.

Testimonials

"...and excellent source for support... Kind of like having your very own IT dept." Electriciansnet

Testimonials

"I was apprehensive at signing up at first. However... it has already made my life as an IT administrator much easier." JaCrews

Testimonials

"WOW! You guys have great, active, and knowledgeable people on here." moore50

Business Clients

Business Clients

In the Press

"If you’ve got a question... Experts Exchange can supply an answer.”

In the Press

"...an invaluable aid for both IT professionals and those who require tech support."

In the Press

"where IT professionals provide quick answers on just about any topic"

Business Account Plans

Loading Advertisement...