Solved

Avoiding spammer "harvesting" of your site's email addresses

Posted on 2001-08-23
Last Modified: 2008-02-01
Having set up a website some four years ago, I'm now on the list of almost every spammer on the net.  Almost every day my server logs show an unidentified bot (i.e. not one I recognise as being from a search engine) coming through the site and harvesting the addresses it finds in the "mailto:" links.

Spam in my inbox is an annoyance, but it costs me nothing but time (I have a flat rate internet account).  Now, however, I want to include in our site a form which allows the visitor to send a message instantly to my pager, which can be done via an email address.

What I *don't* want is for that address to be harvested and to pay 20 cents a message for my mobile screen to be filled with ads urging me to buy CDs crammed with other harvested addresses, hot stock tips, and exhortations to increase the length of parts of my anatomy.

I remember reading once that it *is* possible to sabotage address-harvesting bots (they still get the information but it's unusable and bounces back to the spammer unless they manually edit it?) but still have the form work.  Any idea how?
Question by:Polemic
14 Comments
 
LVL 2

Accepted Solution

by:
tewald earned 33 total points
ID: 6420802
There are a couple of ways you can keep spambots from harvesting your email address:

1) Best solution using JavaScript:

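<SCRIPT TYPE="text/javascript" LANGUAGE="javascript">
<!--
// Build the link from pieces so the complete address never appears as one literal string.
var Domain = "yourISP.com";
var Mailme = "mail" + "to:" + "you@" + Domain;
document.write("<FORM>");
// TYPE="button" keeps the click from also submitting the empty form.
document.write("<INPUT TYPE=\"button\" VALUE=\"Send me some email\" ");
document.write("onClick=\"parent.location=Mailme\"> ");
document.write("</FORM>");
// -->
</SCRIPT>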

2) Good solution using html:

Instead of using the "@" symbol, replace it with the HTML character reference "&#064;" (e.g. you&#064;yourisp.com instead of you@yourisp.com).
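
As a rough illustration of option 2, the same substitution can be applied to every character of the address, not just the "@". The helper name `toCharRefs` is hypothetical (it is not part of any library), and this only raises the bar slightly, since the encoding is easy to reverse.

```javascript
// Hypothetical helper: rewrite each character as an HTML numeric character reference.
function toCharRefs(text) {
  return Array.from(text)
    .map(function (ch) { return "&#" + ch.codePointAt(0) + ";"; })
    .join("");
}

// Paste the printed string into the page in place of the plain address.
console.log(toCharRefs("you@yourisp.com"));
```

Browsers render the references as ordinary characters, so visitors still see and can click a normal address.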
 
LVL 4

Expert Comment

by:heddesheimer
ID: 6420826
Hi tewald,

Don't you think the developers of the bots are smart enough to know that trick and can find &#064; as well as the "@" symbol?
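
To make the concern concrete: undoing numeric character references takes only a few lines, so a harvester that bothers to decode them sees the plain address again. A minimal sketch (the function name `decodeCharRefs` is made up for illustration, and it handles decimal references only):

```javascript
// Replace every decimal character reference (e.g. "&#064;") with the character it stands for.
function decodeCharRefs(html) {
  return html.replace(/&#(\d+);/g, function (match, digits) {
    return String.fromCodePoint(parseInt(digits, 10));
  });
}

console.log(decodeCharRefs("you&#064;yourisp.com")); // prints "you@yourisp.com"
```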
 
LVL 4

Expert Comment

by:heddesheimer
ID: 6420830
Hi Polemic,

The best way to do that is to avoid showing the mail address in the HTML code at all. As a PHP expert I would suggest using PHP :-)

If you are not familiar with PHP: I have set up a little tool where you can compose your own form mailer online and have it sent to you by e-mail. All you need is webspace that supports PHP and lets you send e-mails from the webserver (that is the default setting for most ISPs).

You can give it a try:
http://www.rent-a-tutor.com/tools/makeform.php

Marian
 
LVL 2

Expert Comment

by:tewald
ID: 6420875
heddesheimer, yes, I do believe SOME spambots can identify, translate, and use &#064;; that's why I said it was a "good" HTML solution, not the best solution.  A server-side script (ASP, PHP...) would work too; however, the JavaScript solution would equally prevent spambots from gleaning the email address, since they are NOT intelligent enough to parse JavaScript, nor are they ever likely to be.

tewald
 
LVL 2

Expert Comment

by:ramses
ID: 6421267
Another way to approach this is to use late binding.  I'll explain.

Let's say you have a form like this:

<FORM ACTION="/cgi-bin/form.cgi">
<INPUT TYPE="HIDDEN" NAME="MAILTO" VALUE="joe@dot.com">
<INPUT TYPE="HIDDEN" NAME="REPLYOK" VALUE="ok.html">
...
</FORM>

This is considered early binding because all the variables are already initialised when the page is sent to the parser.

To use late-binding you can go like this:

<FORM NAME="pagerform" ACTION="cgi-bin/form.cgi">
<INPUT TYPE="HIDDEN" NAME="MAILTO" ID="xmail" VALUE="">
...
<BUTTON TYPE="button" ONCLICK="doit()">Submit</BUTTON>
</FORM>

This goes in your <HEAD> section:

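<SCRIPT TYPE="text/javascript" LANGUAGE="JavaScript">
<!--
// Assemble the address only when the visitor clicks the button.
function doit()
{var z='joe';
 var y='dot.com';
 var x=z+'@'+y;
 // document.all is Internet Explorer only; these calls work across browsers.
 document.getElementById('xmail').value=x;
 document.forms['pagerform'].submit();
}
//-->
</SCRIPT>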

I doubt that a robot will actually extract the email address from this.  You can also put the script above in a separate file and link it to your document.  That way, if you look at the source of the document with the form on it, you can't see any email address.

To do this: place the script in a separate file with a .js extension, upload it to your server and place the following line in the head section of your form page:

<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript" SRC="xscript.js"></SCRIPT>

Make sure you replace xscript.js with the filename you've given the script, and include the relative path.
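
A sketch of what such an external file could contain, assuming the hidden field is named MAILTO as in the form above. Passing the form in as an argument, rather than looking it up by name, is a variation and not the exact code posted; `buildAddress` is a hypothetical helper.

```javascript
// Contents of the external .js file (illustrative only).

// Join the parts at run time; 64 is the character code for "@",
// so the file contains neither the full address nor a literal "@".
function buildAddress(user, domain) {
  return user + String.fromCharCode(64) + domain;
}

// Fill the hidden MAILTO field, then submit the form it belongs to.
function doit(form) {
  form.elements["MAILTO"].value = buildAddress("joe", "dot.com");
  form.submit();
}
```

With this version the button would call it as ONCLICK="doit(this.form)".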

BTW: I know it sounds silly to think those bots will obey it, but have you tried setting the robot's name in your robots.txt file?  If their robots do not obey the robots standards, you can sue them, especially if they use the gathered email addresses for unsolicited commercial mail campaigns.

Should you require more info on robots.txt just let me know


Ramses says Roooar
 
LVL 2

Expert Comment

by:ramses
ID: 6421272
Another workaround is to have the form mail to an ordinary address with a fixed subject set on the form page, for example 'PAGER REQUEST'.  Have your mail program filter those messages out and forward them to the REAL address.

plain & simple



Ramses says RoOOOaar
 
LVL 17

Expert Comment

by:dorward
ID: 6421290
I prefer not to publish my email address on a webpage and just have a custom built PHP script form mailer with the address hard coded in to the program (where the user doesn't see it until I reply to their email).
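
The script described here is PHP and is not shown in the thread. Purely to illustrate the same idea, here is a sketch in server-side JavaScript: the recipient is a constant in code that never reaches the browser. The address, the field names `name` and `message`, and the helper `buildPagerMail` are all assumptions, and actually sending the result is left to whatever mail routine the server provides.

```javascript
// Assumption: this runs on the server, so the address is never part of the HTML sent to visitors.
var PAGER_ADDRESS = "pager@example.com"; // placeholder address

// Hypothetical helper: turn submitted form fields into a message description.
function buildPagerMail(fields) {
  // Collapse line breaks so a visitor cannot smuggle extra mail headers into the subject.
  var from = String(fields.name || "anonymous").replace(/[\r\n]+/g, " ");
  return {
    to: PAGER_ADDRESS,
    subject: "Pager message from " + from,
    text: String(fields.message || "")
  };
}
```

The form itself then only needs to post to the script; no MAILTO field, hidden or otherwise, appears in the page.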

 
LVL 2

Expert Comment

by:ramses
ID: 6421300
OK, was just a solution for people who do not have php or other server-side scripts, except for form.cgi
 
LVL 22

Expert Comment

by:CJ_S
ID: 6421527
<a href="mailto:cdevos" onClick="this.href+='@h2d2.nl'">Click here to email me</a>

or

<a href="mailto:cdevos@nowhere.com" onClick="this.href=this.href.replace('nowhere.com', 'h2d2.nl')">Click here to email me</a>

or just tell your users to replace a certain character with the @ like:
Change the - with the @: <a href="mailto:cdevos-h2d2.nl">Click here to email me</a>

regards,
CJ
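
For anyone who wants to try the string handling outside a page first, the second and third variants boil down to one `replace` call each. The function names below are made up for illustration; the addresses are the ones from the examples above.

```javascript
// Second variant: swap the decoy domain for the real one at click time.
function fixDecoyDomain(href) {
  return href.replace("nowhere.com", "h2d2.nl");
}

// Third variant, done by script instead of by the visitor.
// Only the first "-" is replaced, so the user part must not contain a dash itself.
function dashToAt(href) {
  return href.replace("-", "@");
}

console.log(fixDecoyDomain("mailto:cdevos@nowhere.com")); // prints "mailto:cdevos@h2d2.nl"
console.log(dashToAt("mailto:cdevos-h2d2.nl"));           // prints "mailto:cdevos@h2d2.nl"
```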
 

Author Comment

by:Polemic
ID: 6421699
I like the JavaScript-based solutions (tewald and ramses) because they appear as though they'd be too much trouble for a spambot to parse, yet I can still understand them, alter them to suit, and incorporate them easily in the design of the page.  I like ramses's solution slightly better because it keeps all mention of the address out of the page source.

I accept that PHP is probably bullet-proof in this regard, and I wish I knew how to configure it!  Thanks heddesheimer, I tried your cool little tool and got a script from it, but it will be a steep learning curve for me to make that script into a page that mimics the remainder of the site and also carries the various other contact methods (form-to-mail, ICQ panel, and so on) I want on the same page.  I'll try it though, and let you know if I can adapt it.

I would imagine the smarter spammers have figured out the &#064; trick by now, and the thing is, I can't risk too weak a protection... the only real way to test the solution I choose will be to "go live", and if it fails and I end up in the spammers' databases it's too late to go back and try plan B.

The "forwarding email through a filter" theory won't work for two reasons - I don't have an "always on" internet account so emails come into the system spasmodically (albeit several times a day) whereas the essence of a pager is speed; and because filtering the mail sent from the form and forwarding would still forward spam... I'd have to filter on dubious words and manually check the dumped mail... too fiddly and time consuming.

CJ's solutions are intriguing in their simplicity... I'll play round with them and see if I can get them to work.  I'm still suspicious of anything that has the components of my address anywhere near an "@" though... spammers are getting trickier all the time.

Now, just to add to my headaches, my ISP has told me the server on which my site is hosted is closing down, so I have to make urgent plans to shift my site.  So far from worrying about one form, I'll be so caught up with shifting files, logs, emails, and so on, changing DNS data, and generally making sure things run smoothly that I won't be able to implement these suggestions for a few days... perhaps a week.

By all means keep thinking, and I welcome any new suggestions. But wrapping this up might take a while.  Thanks to all who've contributed so far :-)
 
LVL 17

Expert Comment

by:dorward
ID: 6421962
PHP isn't the only solution, you can use ASP, Perl or whatever you like. If you have PHP available on your server please say and I'll write you a quick script.

I would really avoid client-side JavaScript if possible, as (last time I checked) 12% of users have JavaScript unavailable or disabled, and they wouldn't be able to email you at all. Add to that the problems of people who use webmail accounts - the mailto: link is useless to them.
 
LVL 53

Expert Comment

by:COBOLdinosaur
ID: 6629516
This question has been abandoned.  I will make a recommendation to the
moderators on its resolution in a week or two.  I appreciate any comments
that would help me to make a recommendation.

Cd&
 
 
LVL 53

Expert Comment

by:COBOLdinosaur
ID: 6694586
It is time to clean this abandoned question up.  

I am putting it on a clean up list for CS.

<recommendation>
split tewald, ramses, CJ_S

</recommendation>

If anyone participating in the Q disagrees with the recommendation,
please leave a comment for the mods.

Cd&
 

Expert Comment

by:ComTech
ID: 6696323
A three way split will be required here, and will reduce the points to 33.

33=tewald
33=ramses
33=CJ_S

I will accept tewald here and create two new questions for the other Experts.

Best regards,
ComTech