Finding the source of links to unavailable pages

I have an ASP.NET / ASP site that has been running for more than 15 years. I set it up so that any unhandled exception sends me an email with the error details, using the Application_Error method in Global.asax. Most of the emails report "The file _____.aspx does not exist". These files and their folders did exist at one time, but no longer do. I've searched my entire codebase for links to these files, and I've searched Google with the "link:" prefix, but I can't find any trace of which pages link to the nonexistent files.

I looked at HttpContext.Current.Request.UrlReferrer in the Global.asax Application_Error method to see where the nonexistent files are being referred from, but it is always blank.

I'm thinking of creating a file there with the correct name and reading the browser history from it, but that doesn't seem to be possible? With the History object, I can only go back and forward.

As a last resort, I could display a web page that simply asks the visitor how they got to this page.

Any other methods I can use to track this down?
lefodnes Asked:

Scott Fell, EE MVE (Developer & EE Moderator) Commented:
If you think the links are external, I would start with Google Webmaster Tools and also look at the server logs. Searching Google with "link:" does not seem to work as it once did. Check out the advanced search at http://www.google.com/advanced_search. The search term you are looking for is "allinanchor: mydomain.com/somepage.aspx".
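As a rough illustration of the log check, here is a minimal Python sketch that counts which Referer values show up on requests for the missing files. It assumes the server writes IIS W3C extended logs with the cs-uri-stem and cs(Referer) fields enabled; the function name, the sample lines, and the paths are made up for the example.

```python
from collections import Counter


def referers_for_missing(log_lines, missing_paths):
    """Count (path, referer) pairs for requests to the given missing paths."""
    wanted = {p.lower() for p in missing_paths}
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The #Fields directive names the columns of the lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        path = row.get("cs-uri-stem", "").lower()
        if path in wanted:
            # IIS writes "-" when the request carried no Referer header.
            counts[(path, row.get("cs(Referer)", "-"))] += 1
    return counts


# Hypothetical sample data, not taken from a real log.
sample = [
    "#Fields: date time cs-method cs-uri-stem c-ip cs(Referer) sc-status",
    "2013-01-05 10:00:00 GET /old/page.aspx 10.0.0.1 - 200",
    "2013-01-05 10:00:05 GET /old/page.aspx 10.0.0.2 http://example.com/links.html 200",
    "2013-01-05 10:00:09 GET /default.aspx 10.0.0.3 - 200",
]

for (path, referer), n in referers_for_missing(sample, ["/old/page.aspx"]).items():
    print(path, referer, n)
```

The sketch filters on the requested path rather than on sc-status, so it still finds the requests if the server answers them with a status other than 404.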
CtrlAltDl Commented:
If a web crawler has ever visited that page, it may keep coming back to re-index the old URLs. Even though the file no longer exists and nothing on your site links to it, the crawler will still visit because the URL is still in its archives.

You could put those old file names in your robots.txt file, which tells the webcrawlers not to visit those files.
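For illustration, here is a small Python sketch that checks a set of robots.txt rules with the standard library's urllib.robotparser before they are deployed. The paths and the domain are placeholders, not the real file names, and the rules only affect crawlers that choose to honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one retired page and one retired folder.
rules = """\
User-agent: *
Disallow: /old/page.aspx
Disallow: /retired-folder/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

urls = (
    "http://www.mydomain.com/old/page.aspx",
    "http://www.mydomain.com/retired-folder/info.aspx",
    "http://www.mydomain.com/default.aspx",
)
for url in urls:
    verdict = "allowed" if parser.can_fetch("*", url) else "blocked"
    print(url, verdict)
```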

lefodnes (Author) Commented:
Thanks for your posts so far. I traced the IPs of the failed requests, and most of them belong to DHCP pools for home ADSL connections, not to any web crawlers. I also tried the "allinanchor" search prefix, but got no hits for any of the missing links.
So, back to my original question: is it possible to retrieve the URLs in the web browser's history, so I can send them to the server with Ajax? In general this sounds too rude and would violate the users' privacy, but it's the only method I can think of right now.
CtrlAltDl Commented:
The IPs in the ADSL pool are probably infected computers, and you are seeing traffic from bots or worms. They are probably harvesting email addresses or something similar.

As for accessing their URL history, that isn't possible. You can only get the URL of the referring (previous) page. You can get that with Request.UrlReferrer, but only if their browser sends it.
CtrlAltDl Commented:
Since the referrer is sent as an HTTP request header, it can also be retrieved with Request.Headers("Referer"). Yes, it is misspelled in the HTTP specification, but Microsoft spells it correctly in the UrlReferrer property.

http://en.wikipedia.org/wiki/HTTP_referer
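To make the possible cases concrete, here is a small Python sketch that sorts a raw Referer value into "none", "internal", or "external". A blank value typically means the request did not come from a clicked link (a typed URL, a bookmark, or a client that does not send the header). The function name and the host names are made up for the example.

```python
from urllib.parse import urlsplit


def classify_referer(referer, own_host):
    """Return 'none', 'internal', or 'external' for a raw Referer value."""
    # Empty string, None, or the "-" placeholder used in logs all mean "no referer".
    if not referer or referer == "-":
        return "none"
    host = (urlsplit(referer).hostname or "").lower()
    own = own_host.lower()
    # Treat the site itself and its subdomains (www., etc.) as internal.
    if host == own or host.endswith("." + own):
        return "internal"
    return "external"


for value in ("", "http://www.mydomain.com/menu.aspx", "http://example.com/links.html"):
    print(repr(value), classify_referer(value, "mydomain.com"))
```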
lefodnes (Author) Commented:
It turns out that I have a custom error page that returns HTTP status 200. Because of that, the search engines have kept the old links in their indexes. Even though "allinanchor" did not find the files, I found them far down the search results just by searching for the site name.
I need to fix the HTTP status codes that are returned.
The HTTP referer is always blank in the real error messages, so it is of no help. The mechanism itself works, though: if I create a page with a hyperlink to one of the nonexistent pages and click it, the referer is filled in correctly.
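A minimal Python sketch of how that kind of response could be spotted when auditing the old URLs: given the final status code and body returned for a URL that is known not to exist, it flags the case where the error page arrives with status 200 (often called a "soft 404"). The function name, the marker text, and the sample bodies are assumptions for the example.

```python
def classify(status, body, error_marker):
    """Label the response received for a URL that is known not to exist."""
    shows_error_page = error_marker.lower() in body.lower()
    if status in (404, 410):
        # The server tells clients and crawlers that the page is gone.
        return "ok: reported as missing"
    if status == 200 and shows_error_page:
        # The visitor sees an error page, but crawlers see a normal page.
        return "soft 404: error page served with status 200"
    return "unexpected: status %d" % status


# Hypothetical responses for a missing page.
marker = "does not exist"
print(classify(200, "<h1>Sorry, the file does not exist</h1>", marker))
print(classify(404, "<h1>Sorry, the file does not exist</h1>", marker))
```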
ASP.NET