Solved

Avoiding spammer "harvesting" of your site's email addresses

Posted on 2001-08-23
Last Modified: 2008-02-01
Having set up a website some four years ago, I'm now on the list of almost every spammer on the net.  Almost every day my server logs show an unidentified bot (i.e. not one I recognise as being from a search engine) coming through the site and harvesting the addresses it finds in the "mailto:" links.

Spam in my inbox is an annoyance, but it costs me nothing but time (I have a flat rate internet account).  Now, however, I want to include in our site a form which allows the visitor to send a message instantly to my pager, which can be done via an email address.

What I *don't* want is for that address to be harvested and to pay 20 cents a message for my mobile screen to be filled with ads urging me to buy CDs crammed with other harvested addresses, hot stock tips, and exhortations to increase the length of parts of my anatomy.

I remember reading once that it *is* possible to sabotage address-harvesting bots (they still get the information, but it's unusable and bounces back to the spammer unless they manually edit it?) while still having the form work.  Any idea how?
Question by:Polemic
14 Comments
 
LVL 2

Accepted Solution

by:
tewald earned 33 total points
There are a couple of ways to keep spambots from harvesting your email address:

1) Best solution using JavaScript:

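<SCRIPT LANGUAGE="javascript">
<!--
// Build the address from pieces so it never appears whole in the page source.
var Domain = "yourISP.com";
var Mailme = "mail" + "to:" + "you@" + Domain;
document.write("<FORM>");
// TYPE="button" (not "submit") so the click opens the mail link without also submitting the form.
document.write("<INPUT TYPE=\"button\" VALUE=\"Send me some email\" ");
document.write("onClick=\"parent.location=Mailme\"> ");
document.write("</FORM>");
// -->
</SCRIPT>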

2) Good solution using html:

Instead of using the "@" symbol, replace it with the HTML character reference "&#064;" (e.g. you&#064;yourisp.com instead of you@yourisp.com).
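
The character-reference trick described above can be applied to the whole address rather than just the "@". The following is only a minimal sketch: the helper name toCharacterReferences is made up for illustration, and the address is the placeholder from the comment.

```javascript
// Hypothetical helper: turn every character of a string into a decimal
// HTML character reference, so "@" becomes "&#64;".
function toCharacterReferences(text) {
  let out = "";
  for (const ch of text) {
    out += "&#" + ch.codePointAt(0) + ";";
  }
  return out;
}

// Placeholder address from the comment above, not a real one.
const encoded = toCharacterReferences("you@yourisp.com");

// A browser renders the references as ordinary characters, so the link
// still works for visitors while the raw source contains no literal "@".
console.log('<a href="mailto:' + encoded + '">Send me some email</a>');
```

You would run this once and paste the printed markup into the page; nothing needs to run in the visitor's browser.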
 
LVL 4

Expert Comment

by:heddesheimer
Hi tewald,

don't you think the developers of the bots are smart enough to know that trick and can find "&#064;" as well as the "@" symbol?
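
To show how little effort that would take, here is a rough sketch of the decoding step a bot author could add before searching for addresses. The function name is invented for illustration; nothing here is taken from any actual harvester.

```javascript
// Decode decimal (&#64;) and hexadecimal (&#x40;) character references
// back into plain characters.
function decodeCharacterReferences(html) {
  return html.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, function (match, code) {
    const isHex = code[0] === "x" || code[0] === "X";
    const value = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
    // Leave out-of-range references untouched instead of throwing.
    if (value > 0x10ffff) {
      return match;
    }
    return String.fromCodePoint(value);
  });
}

console.log(decodeCharacterReferences("you&#064;yourisp.com")); // prints "you@yourisp.com"
```

A single regular-expression pass undoes the encoding, which is why the character reference only stops the simplest bots.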
 
LVL 4

Expert Comment

by:heddesheimer
Hi Polemic,

the best way to do that is to avoid showing the mail address in the HTML code at all. As a PHP expert I would suggest using PHP :-)

If you are not familiar with PHP: I have set up a little tool where you can compose your own form mailer online and have it sent to you by e-mail. All you need is web space that supports PHP and where you can send e-mails from the web server (that is the default setting for most ISPs).

You can give it a try:
http://www.rent-a-tutor.com/tools/makeform.php

Marian
 
LVL 2

Expert Comment

by:tewald
heddesheimer, yes, I do believe SOME spambots can identify, translate, and utilize &#064;; that's why I said it was a "good" HTML solution, not the best solution.  A server-side script (ASP, PHP...) would work too; however, the JavaScript solution would equally prevent spambots from gleaning the email address, since they are NOT intelligent enough to parse JavaScript, nor are they ever likely to.

tewald
 
LVL 2

Expert Comment

by:ramses
Another way to approach this is to use late binding.  I'll explain.

Let's say you have a form like this:

<FORM ACTION="/cgi-bin/form.cgi">
<INPUT TYPE="HIDDEN" NAME="MAILTO" VALUE="joe@dot.com">
<INPUT TYPE="HIDDEN" NAME="REPLYOK" VALUE="ok.html">
...
</FORM>

This is considered early binding because all the variables are already initialised when the page is sent to the parser.

To use late binding you can do it like this:

<FORM NAME="pagerform" ACTION="cgi-bin/form.cgi">
<INPUT TYPE="HIDDEN" NAME="MAILTO" ID="xmail" VALUE="">
...
<BUTTON TYPE="button" ONCLICK="doit()">Submit</BUTTON>
</FORM>

This goes in your <HEAD> section:

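<SCRIPT TYPE="text/javascript" LANGUAGE="JavaScript">
<!--
// Assemble the address only when the visitor clicks, then submit the form.
function doit()
{var z='joe';
 var y='dot.com';
 var x=z+'@'+y;
 // document.forms works in all browsers; document.all is IE-only.
 document.forms['pagerform'].elements['MAILTO'].value=x;
 document.forms['pagerform'].submit();
}
//-->
</SCRIPT>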

I doubt that a robot will actually extract the email address from this.  You can also put the script above in a separate file and link it to your document.  That way, if you look at the source of the document with the form on it, you can't see any email address.

To do this: place the script in a separate file with a .js extension, upload it to your server and place the following line in the head section of your form page:

<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript" SRC="xscript.js"></SCRIPT>

Make sure you replace xscript.js with the filename you've given the script, and include the relative path.
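
For illustration, the external file could look roughly like the sketch below. It assumes the form and field names from the example above (pagerform, MAILTO) and the placeholder address joe@dot.com; the function names assembleAddress and fillAndSubmit are invented here. Passing the form in as an argument keeps the script free of browser-specific lookups.

```javascript
// Join the parts only at call time, so the complete address never
// appears as one string in the file.
function assembleAddress(user, domain) {
  return user + "@" + domain;
}

// "form" is anything shaped like an HTML form: an "elements" collection
// holding the hidden MAILTO field, plus a submit() method.
function fillAndSubmit(form) {
  form.elements["MAILTO"].value = assembleAddress("joe", "dot.com");
  form.submit();
}

// Stand-in for a real form so the sketch can be run outside a browser.
const fakeForm = {
  elements: { MAILTO: { value: "" } },
  submitCount: 0,
  submit: function () { this.submitCount += 1; }
};
fillAndSubmit(fakeForm);
console.log(fakeForm.elements.MAILTO.value); // prints "joe@dot.com"
```

In the page itself the button would call it as ONCLICK="fillAndSubmit(document.forms['pagerform'])".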

BTW: I know it sounds silly to think those bots will obey it, but have you tried setting the robot's name in your robots.txt file?  If their robots do not obey the robots standard, you can sue them, especially if they use the gathered email addresses for unsolicited commercial mail campaigns.

Should you require more info on robots.txt, just let me know.


Ramses says Roooar
 
LVL 2

Expert Comment

by:ramses
Another workaround is to have the form mail to an ordinary address with the subject set on the form page, for example 'PAGER REQUEST'.  Have your mail program filter those messages out and forward them to the REAL address.

plain & simple
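
The filter rule itself is tiny. A sketch of the decision, assuming the marker subject 'PAGER REQUEST' from above and a hypothetical message object with a subject property (a real mail program would express this as one of its own filter rules rather than as code):

```javascript
// The marker subject the form is assumed to set.
const PAGER_SUBJECT = "PAGER REQUEST";

// Return true only for messages carrying the exact marker subject.
// Surrounding whitespace and letter case are ignored.
function shouldForwardToPager(message) {
  const subject = (message.subject || "").trim().toUpperCase();
  return subject === PAGER_SUBJECT;
}

console.log(shouldForwardToPager({ subject: "Pager Request" }));  // prints true
console.log(shouldForwardToPager({ subject: "Hot stock tips" })); // prints false
```

Anything that arrives at the public address without the marker stays in the ordinary inbox and never reaches the pager.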



Ramses says RoOOOaar
 
LVL 17

Expert Comment

by:dorward
I prefer not to publish my email address on a web page at all and just have a custom-built PHP form mailer script with the address hard-coded into the program (where the user doesn't see it until I reply to their email).
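
The idea carries over to any server-side language. Below is a rough sketch in server-side JavaScript, used here only as an illustration and not the PHP script described above. The pager address, the field names (name, email, message) and the 160-character cap are all assumptions, and the actual sending step is left out.

```javascript
// Hypothetical pager address: it exists only in server-side code and is
// never written into any page sent to the browser.
const PAGER_ADDRESS = "pager@example.com";

// Collapse CR/LF so visitor input cannot smuggle extra mail headers in.
function singleLine(value) {
  return String(value || "").replace(/[\r\n]+/g, " ").trim();
}

// "fields" stands for the decoded form submission.
function buildPagerMessage(fields) {
  return {
    to: PAGER_ADDRESS,
    replyTo: singleLine(fields.email),
    subject: "Pager message from " + singleLine(fields.name),
    // Assumed length cap, since pager messages are short.
    text: String(fields.message || "").slice(0, 160)
  };
}

const example = buildPagerMessage({
  name: "Ann",
  email: "ann@example.org",
  message: "Please call me back"
});
console.log(example.to); // prints "pager@example.com"
```

Because the form page only posts name, email and message, a bot reading the page source finds no address to harvest.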
 
LVL 2

Expert Comment

by:ramses
OK, mine was just a solution for people who do not have PHP or other server-side scripts, except for form.cgi.
 
LVL 22

Expert Comment

by:CJ_S
<a href="mailto:cdevos" onClick="if(this.href.indexOf('@')==-1)this.href+='@h2d2.nl'">Click here to email me</a>

or

<a href="mailto:cdevos@nowhere.com" onClick="this.href=this.href.replace('nowhere.com', 'h2d2.nl')">Click here to email me</a>

or just tell your users to replace a certain character with the @ like:
Change the - with the @: <a href="mailto:cdevos-h2d2.nl">Click here to email me</a>

regards,
CJ
 

Author Comment

by:Polemic
I like the JavaScript-based solutions (tewald and ramses) because they appear as though they'd be too much trouble for a spambot to parse, yet I can still understand them, alter them to suit, and incorporate them easily in the design of the page.  I like ramses's solution slightly better because it keeps all mention of the address off the source of the page.

I accept that PHP is probably bullet-proof in this regard, and I wish I knew how to configure it!  Thanks heddesheimer, I tried your cool little tool and got a script from it, but it will be a steep learning curve for me to make that script into a page that mimics the remainder of the site and also carries the various other contact methods (form-to-mail, ICQ panel, and so on) I want on the same page.  I'll try it though, and let you know if I can adapt it.

I would imagine the smarter spammers have figured out the &#064; trick by now, and the thing is, I can't risk too weak a protection... the only real way to test the solution I choose will be to "go live", and if it fails and I end up in the spammers' databases it's too late to go back and try plan B.

The "forwarding email through a filter" theory won't work for two reasons - I don't have an "always on" internet account so emails come into the system spasmodically (albeit several times a day) whereas the essence of a pager is speed; and because filtering the mail sent from the form and forwarding would still forward spam... I'd have to filter on dubious words and manually check the dumped mail... too fiddly and time consuming.

CJ's solutions are intriguing in their simplicity... I'll play round with them and see if I can get them to work.  I'm still suspicious of anything that has the components of my address anywhere near an "@" though... spammers are getting trickier all the time.

Now, just to add to my headaches, my ISP has told me the server on which my site is hosted is closing down, so I have to make urgent plans to shift my site.  So, far from worrying about one form, I'll be so caught up with shifting files, logs, emails, and so on, changing DNS data, and generally making sure things run smoothly that I won't be able to implement these suggestions for a few days... perhaps a week.

By all means keep thinking, and I welcome any new suggestions. But wrapping this up might take a while.  Thanks to all who've contributed so far :-)
 
LVL 17

Expert Comment

by:dorward
PHP isn't the only solution; you can use ASP, Perl or whatever you like. If you have PHP available on your server, please say so and I'll write you a quick script.

I would really avoid client-side JavaScript if possible, as (last time I checked) 12% of users have JavaScript unavailable or disabled, and they wouldn't be able to email you at all. Add to that the problem of people who use webmail accounts: the mailto: link is useless to them.
 
LVL 53

Expert Comment

by:COBOLdinosaur
This question has been abandoned.  I will make a recommendation to the
moderators on its resolution in a week or two.  I appreciate any comments
that would help me to make a recommendation.

Cd&
 
 
LVL 53

Expert Comment

by:COBOLdinosaur
It is time to clean this abandoned question up.  

I am putting it on a clean up list for CS.

<recommendation>
split tewald, ramses, CJ_S

</recommendation>

If anyone participating in the Q disagrees with the recommendation,
please leave a comment for the mods.

Cd&
 

Expert Comment

by:ComTech
A three way split will be required here, and will reduce the points to 33.

33=tewald
33=ramses
33=CJ_S

I will accept tewald here and create two new questions for the other Experts.

Best regards,
ComTech