I have a forum with more than 2 million pages indexed in Google. However, I know that a lot of these pages are duplicate content, because the same forum post opens under 3 different URLs.
For example, the same topic opens under these URLs:
1. viewtopic.php?p=xxxx (view a post)
2. viewtopic.php?t=xxxx (view the entire topic, which of course includes the post above)
3. this_is_example_topic.html (cached version of the same topic)
I normally prefer the SEO version of the URL (no. 3), but unfortunately Google has also indexed the viewtopic versions, more than 500,000 times.
I thought about simply blocking the Google crawler in my robots.txt from accessing all viewtopic URLs, which would filter out everything except the SEO URL version. BUT I fear this is going to hurt me and do more harm than good, because it means Google is going to kick hundreds of thousands of URLs out of the index. The content might not be 100% indexed under the SEO URL version, and once I kick the viewtopic URLs out I might lose my traffic.
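For reference, the rule I have in mind would look roughly like the sketch below. It assumes the forum sits at the site root, so the path would need adjusting if it lives in a subdirectory. As far as I understand, robots.txt rules match by path prefix, so this one line should cover both the ?p= and ?t= variants while leaving the .html URLs alone.

```
# Sketch only: assumes viewtopic.php is served from the site root
User-agent: Googlebot
Disallow: /viewtopic.php
```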
I would be grateful for any suggestions or ideas on how to clean up my URLs in Google without causing a big mess.
Thanks in advance.