Experts Exchange versus AllFAQ.org

Experts Exchange has been hit by a screen-scraping site called allfaq.org, which has stolen a good chunk of content. The interesting thing is that they are out-ranking EE on Google in various ways, and we're wondering why. Some factors we've speculated about include:

- .org versus .com classification
- free versus paid
- supporting Google ads
- lower page weight (skimpy markup, no images)

Any ideas on what type of comparative analysis we should run? Much appreciated.
bobexpertAsked:

aikimarkCommented:
Shouldn't this question be in the ZA zone?
bobexpertAuthor Commented:
Nope, I'm asking the community as a whole.
Alan HardistyCo-OwnerCommented:
What about the legal angle? Anything to be done there?
They even reference EE in some of the solutions:
http://w.e-e.com/i9Lj9G
vs
http://www.experts-exchange.com/Programming/Handhelds_-_PDAs/Q_26165524.html?cid=1572#a32655498

URL Shortened to prevent SEO for the 'other' site.

Alan Hardisty
ZA


bobexpertAuthor Commented:
We're exploring it from a legal perspective, but this particular question is meant to focus on why/how they could outrank us at all?
aikimarkCommented:
@bobexpert

You are asking "the world" this question, including a.l.l.f.a.q. [dot] o.r.g.  
Alan HardistyCo-OwnerCommented:
Okay - leaving it to those that know SEO (which is not me) ; )
bobexpertAuthor Commented:
Correct, aikimark. I think we'll survive.
AndyBeardCommented:
Fundamentals

They have lots of links
They have slightly different titles
Slightly different mix of the content on the page
They probably have a different, possibly better internal linking structure. Internal linking structure is something I am pretty good on, but I haven't got much time for analysis.

My explanation of how search really works I normally introduce as my weird and freaky stuff. It is the type of stuff that I discuss with seasoned SEO veterans and blow their minds.

I explained some of it here
http://www.experts-exchange.com/Web_Development/Internet_Marketing/Search_Engine_Optimization_SEO/Q_26140505.html?cid=1131

What it boils down to is that they have authority based on links, so they get allocated some traffic.
EE has tons of content, but is only allocated traffic based on the amount of authority they have which is based upon links, both from external pages and from indexed documents... which is a bit of a chicken/egg situation.

For instance, Experts Exchange could be seen as highly relevant for the search term "Andy Beard": there is now a fair chunk of my original content here, and each piece of that content links to a profile page.
However, Google has a whole load of other content that is also relevant for that query, and chooses to allocate EE traffic for other terms that are more suitable.

Ultimately EE is in some ways fortunate to get the search traffic it does, because I would have expected more links, possibly by improving your existing badges a little - they aren't bad, but they could be improved.
I would construct the 1st click free pages differently, using AJAX that can't be crawled, to get rid of some of the clutter in the code.
There are also ways to improve the internal linking structure a little, but nothing will fix this better than a ton more links. Millions of them.
AndyBeardCommented:
From a legal perspective (I am not a lawyer) my understanding is that users of EE license EE to use their content, but EE doesn't hold copyright in the content.

As they are only grabbing small pieces of EE's collection of licensed copyright works owned by EE contributors, I doubt you are in a position to file a DMCA for anything other than content written by EE staff.

However, I did notice the site is doing automated translation of content, which is now against Google's ToS for their Translate API, and allowing automated translation of content is now against their webmaster guidelines.
The site in general is a poor user experience, having only partial answers, which makes it an MFA (made-for-AdSense) site that might be looked on as against Google's AdSense policy.
It wouldn't be unusual for copyright owners such as EE authors to report such a site by clicking on the logo of an advert on a page displaying their work and reporting it as a violation of the AdSense policy.
Authors so inclined could also file a DMCA if they thought their copyright was being breached.
AndyBeardCommented:
Note: I believe they might have some problems with monetization soon ;)
duzCommented:
Sorry for the late entry but this question was only brought to my attention today by a friendly mod.

Firstly, the scraping site has a better internal linking structure.

Secondly, when you are outranked by a scraper then you can be sure that some of what Google calls "quality signals" are awry on your site. This is one of them:

Look at my comment from 2006 here http://www.experts-exchange.com/Community_Support/Suggestions/Q_21976494.html Quote: "In my opinion the nofollow was applied by EE probably in the mistaken belief that it would somehow improve visibility and rankings in the search engines by hoarding PageRank. Engineering please comment if this is incorrect. The reason why abusing the nofollow in this way will be counterproductive in the long term is not something I want to detail here but suffice to say that if you want a great way to tell search engines that your pages are filled with worthless content then applying nofollow to millions of links on your site is a very good way to do it".

But, you may say, surely all the no-follow links on EE are going to be no-follow links on the scraper.  Yes that is true but the internal links on EE do not have a no-follow and on the scraper site will become valid external links. Think about it.....

These are not the only factors but should be enough to work on for now.

- duz

bobexpertAuthor Commented:
I really appreciate the insight, duz.

One quick follow-up question - can you provide specific examples of how our scraping friends have a better internal linking structure? Articulating would probably help our SEO/engineering staff the most.

Thanks again.
duzCommented:
>can you provide specific examples of how our scraping friends have a better internal linking structure?

Sure, rather than detailed technical analysis I think there is an easy way to visualize the discrepancy as follows.

Take the pages mentioned above by alanhardisty.

Anchor text for internal links on EE page:

About Us, Activesync, Advanced, alanhardisty, Apple, Articles, Ask, Blogs, Browse All, Cancel, Contact Us, Database, Digital Living, dloebig, DROID, EE Blog, Exchange, Experts Exchange, Forgot your password?, Handheld/PDA Prog, Hardware, Help, Help, Home, Internet Rank, Member Login, Microsoft, Networking, OS, Other, Privacy Policy, Programming, Programming, Security, Shorten URL, Sign Up Now!, Site Map, Software, Solutions, Start FREE Trial, Start your 30-day free trial, Start your 30-day free trial, Start your 30-day free trial, Storage, Suggestions, Terms of Use, Testimonials, view the solution free, View the Solution FREE for 30 Days, Visit Experts Exchange, Web Development, Why It Works.

Anchor text for internal links on scraper page:

Active, activesync, bes, BlackBerry, Blackberry Enterprise Server, Browse All Tags, C#, calendar, ce, copy command - without overwrite suggestion, Database choices, desktop, email, emulator, enterprise, error, exchange, file, function problem, Handheld and PDA, Hardware, how, how different is VBA from Office 2000 to Office 2003 and 2007, especially Excel and Access, How do I link one *.Obj rather than another, How to make a ListBox Display Dates from an Access Date/Time field in a desired DD/MM/YYYY Format, HTC DROID Incredible Active Sync Failing, http://www.experts-exchange.com/articles/Software/Server_Software/Email_Servers/Exchange/Exchange-2003-Activesync-Connection-Problems-FAQ.html, Microsoft, Microsoft, Mobile, Network, outlook, palm, pc, pda, pocket, POP3 AND SMTP, Progamming, Questions and answers to issues related to Microsoft: Windows, Applications, Development, Hardware, Server, Internet Protocols, Database, Exchange ., read/ write registry, RIM, Select command for .NET 3.5, send, server, Software, sync, Tags:, Translate to German, Translate to Spainish, Translation into Dutch, Translation into French, Translation into Italian, unable, windows, Windows 2003 Domain Controller fell off domain w/o DCPromo, windows mobile.

You can see at a glance that the scraper site's internal link anchor text is richer in "topical references", and this is the case throughout the site's half a million pages.
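
A minimal sketch of how this kind of side-by-side listing could be automated, assuming Python's standard library and that each page's HTML has already been fetched. The class and function names are hypothetical, and "internal" is taken to mean "same host as the page", which is an assumption rather than anything EE or the scraper defines.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class InternalAnchorCollector(HTMLParser):
    """Collect the visible anchor text of links that stay on the page's host."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.anchors = []   # anchor text of internal links, in page order
        self._href = None   # target of the internal <a> currently open
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        target = urljoin(self.page_url, href)  # resolve relative links
        if urlparse(target).netloc == self.host:
            self._href = target
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join fragments (e.g. text split by <wbr />) and collapse whitespace.
            text = " ".join("".join(self._text).split())
            if text:
                self.anchors.append(text)
            self._href = None


def internal_anchor_texts(html, page_url):
    """Return the distinct internal anchor texts, sorted case-insensitively."""
    parser = InternalAnchorCollector(page_url)
    parser.feed(html)
    parser.close()
    return sorted(set(parser.anchors), key=str.lower)
```

Running `internal_anchor_texts` over the same question on both sites and comparing the two lists would give the "at a glance" view above. It says nothing about link weight or placement, only about the wording of the anchors.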

However, the scraper site's internal linking structure is actually very poor (and on no account should be emulated) but, as I said, it is better than EE's.

Let's hope the appearance of the scraper site will be the catalyst for EE developing the internal linking strategy that should have been in place many moons ago.

- duz
bobexpertAuthor Commented:
That was very helpful. Ok, so last question (swear) from our SEO folks:

Which sites (outside of our screen-scraping friends) would you consider to be using internal linking best practices?

Thanks again, duz.
AndyBeardCommented:
I haven't got time to respond in detail, but I did respond to one question by an EE engineer by email.

I don't fully agree with duz.

My email reply concentrated on flattening site architecture whilst highlighting terms which bring highest conversions.
It is quite possible header terms from top level navigation don't bring in high conversions, thus optimizing anchor text on them won't necessarily be of much consequence to a page 5+ levels down.
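
One way to see how deep pages sit in a site's architecture is to measure click depth from the home page. The following is a rough sketch only: the link graph is a made-up example, the function name is hypothetical, and a real audit would build the graph from a crawl.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link graph: page -> pages it links to.
site = {
    "/": ["/Programming", "/Hardware"],
    "/Programming": ["/Programming/Handhelds"],
    "/Programming/Handhelds": ["/Programming/Handhelds/Q_1.html"],
    "/Hardware": [],
}
depths = click_depths(site, "/")
deep_pages = sorted(page for page, depth in depths.items() if depth >= 3)
print(deep_pages)  # ['/Programming/Handhelds/Q_1.html']
```

Flattening the architecture, in these terms, means adding links so that the pages that convert best end up with a smaller depth value.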
duzCommented:
>Which sites (outside of our screen-scraping friends) would you consider to be using internal linking best practices?

Wikipedia is a very good example.

For sites driven by user generated content (like EE and Wikipedia) allowing users the use of the <a> tag provides the best form of internal linking because the links are all contextual, have in-line anchor text and are useful to the reader. Also they are natural rather than optimized which makes them kinda perfect :)

Google is very good at detecting non-natural internal linking (in fact non-natural linking per se) so you would have to be smart at rectifying the current situation for existing links. One possibility would be to programmatically change site wide all the internal links anchor text to the referred page <title> tag text.

For example the link mentioned above by alanhardisty has an accepted solution with an internal link that looks like this:

<a href="http://www.experts-exchange.com/articles/Software/Server_Software/Email_Servers/Exchange/Exchange-2003-Activesync-Connection-Problems-FAQ.html">http://www.experts<wbr />-exchange.<wbr />com/articl<wbr />es/Softwar<wbr />e/<wbr />Server_S<wbr />oftware/Em<wbr />ail_Server<wbr />s/Exchange<wbr />/Exchange-<wbr />2003-<wbr />Activ<wbr />esync-Conn<wbr />ection-Pro<wbr />blems-FAQ.<wbr />html</a>

A simple script might change this to:

<a href="http://www.experts-exchange.com/articles/Software/Server_Software/Email_Servers/Exchange/Exchange-2003-Activesync-Connection-Problems-FAQ.html">Exchange 2003 - Activesync Connection Problems FAQ - Exchange 2003, Activesync, iPhone, Windows Mobile</a>

It is not quite so easy because in practice the script would have to shorten the referred page <title> tag text before it was used as anchor text.  Also an element of randomization would need to be introduced to avoid the non-natural situation of every internal link pointing to the same page having exactly the same anchor text.

If I were briefing a team to write and execute this script I would suggest they look at the anchor text used in external links to the referred page with a view to capturing it and mating it in various ways with the <title> tag text.
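
A minimal sketch of the shortening and randomization steps, under stated assumptions: titles are split on " - " as in the example above, the word limit is arbitrary, and the function names are hypothetical. It does not attempt the external-anchor-text capture, and it is not a description of anything EE actually runs.

```python
import random
from html import escape


def anchor_variants(title, max_words=6):
    """Build shortened anchor-text candidates from a page <title>."""
    # Assumption: title segments are separated by " - ".
    segments = [s.strip() for s in title.split(" - ") if s.strip()]
    variants = []
    for end in range(1, len(segments) + 1):
        # Take the first `end` segments, then cap the total word count.
        words = " ".join(segments[:end]).split()[:max_words]
        candidate = " ".join(words).rstrip(",;:")
        if candidate and candidate not in variants:
            variants.append(candidate)
    return variants


def build_link(href, title, rng):
    """Return an <a> element whose anchor text is one randomly chosen variant."""
    anchor = rng.choice(anchor_variants(title))
    return '<a href="{}">{}</a>'.format(escape(href, quote=True), escape(anchor))


title = ("Exchange 2003 - Activesync Connection Problems FAQ - "
         "Exchange 2003, Activesync, iPhone, Windows Mobile")
print(anchor_variants(title))
# ['Exchange 2003', 'Exchange 2003 Activesync Connection Problems FAQ']
print(build_link("http://www.example.com/articles/some-article.html", title, random.Random()))
```

Passing in the random generator keeps the choice reproducible when seeded, which makes it easier to check that links to the same page really do end up with varied anchor text.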

Anyway, food for thought.

- duz

bobexpertAuthor Commented:
Impressive as always, guys.