Solved

How to aggregate data from third party sites?

Posted on 2012-12-29
Last Modified: 2013-11-18
Hey guys, I am writing a document for a university project which asks me to cover RDFa, data aggregation and the semantic web. After a week of research I am still not sure I understand all of this. Is it normal for sites to aggregate data from third-party sites? Is this legal? How exactly would this be done?

I have read all of the Wikipedia pages and plenty of other documentation, but I am interested in your opinions on how it should be done and why. How do you draw in information from other sites which is directly related to the content you already have?

I would appreciate any information on this at all!

Many thanks!
Question by:deucalion0
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 38730698
There is not likely to be any one answer to this post, but hopefully you will get some good comments.

RDFa is a syntax (a W3C Recommendation) that allows the use of namespaced attributes in XML or HTML tags.  Perhaps the most popular use of RDFa is Facebook's Open Graph protocol.  It allows and encourages sharing information between web sites about clients' personal interests.  So, yes: it is legal, normal, and widely prevalent.
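
To make the "namespaced attributes" idea concrete, here is a minimal sketch (not a full RDFa processor) of how an aggregating site might read Open Graph style property/content pairs out of a third-party page. It uses only Python's standard library. The sample page, the example.com URL, and the names PropertyCollector and extract_properties are made up for illustration; a real aggregator would fetch the HTML over HTTP and would need to respect the source site's terms of use.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect property/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags are inspected in this sketch; full RDFa allows
        # the property attribute on other elements as well.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        content = attr_map.get("content")
        # Plain <meta name="..."> tags have no property attribute and are skipped.
        if prop and content is not None:
            self.properties[prop] = content


def extract_properties(html_text):
    """Return a dict mapping property names (e.g. "og:title") to their content."""
    parser = PropertyCollector()
    parser.feed(html_text)
    return parser.properties


# Invented sample page standing in for HTML fetched from a third-party site.
SAMPLE_PAGE = """
<html prefix="og: http://ogp.me/ns#">
<head>
<title>Copper Stock Pot</title>
<meta property="og:title" content="Copper Stock Pot" />
<meta property="og:type" content="product" />
<meta property="og:url" content="http://example.com/pots/copper" />
<meta name="description" content="not RDFa; ignored" />
</head>
</html>
"""

if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE_PAGE).items():
        print(name, "=", value)
```

The "og:" prefix is the namespace part: it tells the consumer which vocabulary a property belongs to, so data from many sites can be merged by property name without guessing what each site's markup means.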

But now to the second part of the question, which I find to be a fascinating point of departure for discussion: "information from other sites which is directly related to the content you already have."  The key term here is "content."  Let's consider how this really works and to what effect.

Go online, perhaps to Amazon.com, and search for something fairly specific, say "copper stock pots."  Next, go to a completely unrelated web site like TheOnion.com.  You will find advertisements for copper stock pots turning up in the sidebars!  This is the open graph at work.

There is nothing at all on TheOnion.com about copper stock pots (it's a satirical news site), so this would seem to belie your definition of "related to the content," but I see that whole issue a little differently.  If a web site user is getting something for free, the user is not the customer; the user is the product!  And the site visit is the content.  In other words, free web sites get to choose what content they show you, and (presumably) they strike a balance between showing you content that will keep you coming back and content that will earn them money through advertising impressions and, optimistically, click-throughs to purchases.

That's my broad-brush take on it.  Hope you get some other comments, too.  Happy New Year 2013, ~Ray
 

Author Closing Comment

by:deucalion0
ID: 38744737
Thanks for your advice! I took what you said and looked further into this. It is interesting, but very vague on some points. I was hoping for more replies, but EE seems to be so quiet these days :(

Thank you!!!
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 38745893
Thanks for the points.  Maybe if you post the question again when it is not a holiday weekend, some others will weigh in.  If you feel the answers are vague, you can always ask for clarification, too. ~Ray
