• Status: Solved

Google SEO: How to trigger a recrawl of a sitemap (to update wrong information)

Hi all

I have an unusual problem.
We have a public portal built with AJAX-based software.
Because AJAX pages don't return useful information to SEO spiders (Googlebot, Bingbot, etc.), we implemented a special
"spider detection": if a spider is detected, we return standard HTML with the correct information for the link (queried from a SQL server).
We have about 38'000 different links.
We submitted a (generated) sitemap with all the links in Google Webmaster Tools.
This worked fine, and our links were indexed correctly.
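
A minimal sketch of what such spider detection might look like, assuming it keys on the User-Agent request header (the post does not say how the portal actually detects spiders). The token list, the function names and the returned HTML are all hypothetical; they only illustrate the idea of returning static HTML to crawlers and the AJAX shell to everyone else.

```python
import re

# Hypothetical token list: substrings that commonly appear in the
# User-Agent strings of the Google, Bing and Yahoo crawlers.
SPIDER_TOKENS = ("googlebot", "bingbot", "slurp")
_SPIDER_RE = re.compile("|".join(re.escape(t) for t in SPIDER_TOKENS), re.IGNORECASE)


def is_spider(user_agent):
    """Return True if the User-Agent header looks like a known crawler."""
    return bool(user_agent) and _SPIDER_RE.search(user_agent) is not None


def render(user_agent, link_id):
    """Pick the response variant: static HTML for spiders, AJAX shell otherwise."""
    if is_spider(user_agent):
        # In the real portal this content would be queried from the SQL server.
        return f"<html><body><h1>Entry {link_id}</h1></body></html>"
    # Everyone else gets the normal AJAX page (placeholder markup).
    return "<html><body><div id='app'></div><script src='app.js'></script></body></html>"
```

If a check like `is_spider` silently stops matching, every crawler receives the AJAX shell, which would produce the kind of content-free index entries described in this thread.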

Unfortunately, an update introduced a bug in our underlying development software, so the spiders were not detected correctly for an extended period. Because they were not detected, the standard AJAX code was returned during that time (no useful information for the spiders).
As a result, all entries were dropped from the indexes (Google, Bing, Yahoo)...
Searches found nothing, and no keywords were stored.
With site:xxxxxx, only the bare links (from the sitemap) were displayed, without any further information.
After a long investigation, I found the problem and was able to implement a workaround, so the detection now works correctly again.
I then had some of the links crawled manually in Google Webmaster Tools.
=> This worked correctly (I was able to find the information in Google Search).
I then re-submitted the sitemap (with the roughly 38'000 links, same version of the sitemap as before).

Problem description:
In the last 3 weeks, maybe 10-20 links (of the roughly 38'000) were crawled automatically by Google.
If I search with site:xxx:
- I can see maybe 60-80 correct (newly crawled) entries - most of them are from manually submitted links in the
- I can see a lot of (wrong) entries with only the links (without any further information)

=> How can I solve this problem and force a recrawl of all (about 38'000) links in the sitemap?
==> As I wrote, I have submitted the sitemap once again, without success:
   => But it was the same version as before (same file date, same entries)
 => Maybe I should update the file date / some content of the sitemap (does this matter)?

Thanks for any advice...
- The same is true in Bing and Yahoo
1 Solution
force a recrawl
You can't. You just have to grin and bear it until Google et al. eventually reindex the site.
Honeymoon (Author) commented:
As I can see in Webmaster Tools, Google is crawling between 1 and 20 pages per day (and I can't see any real progress in the search results from day to day).
=> Maybe the thousands of already stored links without any information have something to do with that?

If Google crawls 10 pages per day (on average) and I have 38'000 pages, I have to wait 3'800 days = more than 10 years?!?
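
The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a constant crawl rate, takes the 38'000 pages and the 1-20 pages/day range from this thread, and uses a hypothetical helper name, `days_to_recrawl`.

```python
import math

TOTAL_PAGES = 38_000  # number of links in the sitemap, from the post


def days_to_recrawl(pages_per_day, total=TOTAL_PAGES):
    """Days needed to crawl every page once at a constant daily rate."""
    return math.ceil(total / pages_per_day)


# Rates taken from the 1-20 pages/day range observed in Webmaster Tools.
for rate in (1, 10, 20):
    days = days_to_recrawl(rate)
    print(f"{rate:>2} pages/day -> {days} days (~{days / 365:.1f} years)")
```

Even at the upper end of the observed range (20 pages per day), a full pass would take 1'900 days under this assumption.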

Is there really no other way?
What you see in WMT is not real time.

You have the advantage that the links are still there, so it will happen fairly quickly, but it's not going to happen in the next week.
You can try increasing the crawl rate in WMT, but that doesn't mean Google will honour it if it doesn't think there is any need.
Honeymoon (Author) commented:
Hi Gary

Additional questions:
- Does Google look:
  - at the file date of the sitemap?
  - at the file size of the sitemap?
  - at the last-modified attribute in the sitemap (for every entry)?

- at the file date of the sitemap?
- at the file size of the sitemap?
I'm not 100% sure; from vague memory, it just downloads it on a somewhat regular basis.

- at the last-modified attribute in the sitemap (for every entry)?
The value makes no difference; Google basically uses the sitemap to learn your site structure/links and nothing more.
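
For reference, the per-entry attribute asked about above corresponds to the `<lastmod>` element of the sitemap protocol. The sketch below shows how a generated sitemap could carry it, using Python's standard `xml.etree.ElementTree`. The URL, the date and the function name `build_sitemap` are hypothetical placeholders, not taken from the portal in question, and nothing here implies that changing the value would speed up a recrawl.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from an iterable of (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # <lastmod> uses the W3C date format; YYYY-MM-DD is the simplest valid form.
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Placeholder entry for illustration only.
xml_text = build_sitemap([("https://example.com/entry/1", date(2015, 1, 31))])
print(xml_text)
```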
Question has a verified solution.