This is probably a long shot but I wanted to go ahead and see if it would work anyway.
We run a couple of websites with a lot of unique content that we have written ourselves. However, from time to time we notice our content showing up on other websites without our authorization. I know this is a pretty common problem, and there are plenty of guidelines on the web about how to track these people down and force them to remove the content.
But I wanted to see if there is any way to prevent these websites from getting the content in the first place, without compromising usability or the ability of search engines to index the site. I asked a similar question on Experts Exchange before and nothing viable came out of it (http://www.experts-exchange.com/Security/Misc/Q_22061974.html).
But this time I am thinking about the issue from a slightly different perspective and wanted to see if there is a solution. How do most unauthorized websites get their hands on other people's content anyway? Is it by sending a bot to the content-rich website, by downloading all the web pages to a local drive, by having people copy the content manually, or by some other technique? I am not sure, but I am guessing a fair number of websites do it by sending a bot to crawl the content-rich website.
So is there any way to prevent that? I was thinking of a solution that blocks all robots except the spiders of the major search engines (Google, Yahoo, MSN, Ask). If any other spider crawls the website, we would know it is unauthorized and could automatically block its access, for example by trapping it.
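To make the idea more concrete, here is a rough Python sketch of the kind of logic I have in mind. Everything in it is an assumption on my part: the user-agent tokens for the four search engines, the trap URL `/trap/` (which would be disallowed in robots.txt and linked invisibly from our pages), and the function names are all hypothetical. I also realize a user-agent string can be faked, so a real version would probably need an extra check such as a reverse DNS lookup on the crawler's IP.

```python
import re

# Assumed user-agent tokens for Google, Yahoo, MSN and Ask; these would
# need to be checked against each search engine's own documentation.
ALLOWED_BOT_TOKENS = ("googlebot", "slurp", "msnbot", "teoma")

# Hypothetical trap URL: disallowed in robots.txt and never shown to people,
# so only a crawler that ignores robots.txt should ever request it.
TRAP_PATH = "/trap/"

# Rough hint that a user agent identifies itself as a robot.
BOT_HINT = re.compile(r"bot|crawl|spider", re.IGNORECASE)

# IP addresses blocked so far (in memory only, for this sketch).
blocked_ips = set()


def is_allowed_bot(user_agent):
    """True if the user agent contains one of the allowed crawler tokens."""
    ua = user_agent.lower()
    return any(token in ua for token in ALLOWED_BOT_TOKENS)


def decide(ip, user_agent, path):
    """Return 'allow' or 'block' for a single request."""
    if ip in blocked_ips:
        return "block"
    # Anything that walks into the trap gets blocked from then on.
    if path.startswith(TRAP_PATH):
        blocked_ips.add(ip)
        return "block"
    # A self-declared robot that is not on the allow list gets blocked.
    if BOT_HINT.search(user_agent) and not is_allowed_bot(user_agent):
        blocked_ips.add(ip)
        return "block"
    return "allow"
```

For example, `decide("203.0.113.5", "SomeScraperBot/1.0", "/articles/1")` would return `"block"` and remember that IP, while a normal browser or Googlebot requesting the same page would be allowed.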
Is this a viable solution? Can it really be implemented, and will it work?
Thank you for your time and please let me know if you need me to clarify something.