• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 287

Is it possible to normalize an RSS feed to remove duplicate links?

I have an RSS feed from a Twitter list.

It often contains duplicates: tweets that share the same link.

How can I remove the duplicate tweets from the feed?
Asked by: finnstone
1 Solution
 
Scott Fell (EE MVE, Developer & EE Moderator) Commented:
I don't know Perl, but I think I would do this client-side anyway. Create an array of links. For each post, find its links and check whether they are already in the array. If a link is not there, add it to the array. If it is, skip that post. If you are not checking many tweets, there may be little difference between doing this client-side and server-side.
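The approach described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the `(text, links)` tuple layout, the function name `drop_duplicate_posts`, and the sample data are all hypothetical, and the sketch assumes a post should be dropped as soon as any one of its links has appeared in an earlier post.

```python
def drop_duplicate_posts(posts):
    """Return the posts whose links have not appeared in an earlier post.

    `posts` is a hypothetical list of (text, links) tuples, where `links`
    is a list of link strings already extracted from the post.
    """
    seen_links = set()  # plays the role of the "array of links"
    kept = []
    for text, links in posts:
        if any(link in seen_links for link in links):
            continue  # at least one link was used by an earlier post
        seen_links.update(links)
        kept.append((text, links))
    return kept


# Made-up sample data for illustration only.
posts = [
    ("Great read", ["mypage.com/link"]),
    ("RT: Great read", ["mypage.com/link"]),
    ("Something else", ["mypage.com/other"]),
]
unique_posts = drop_duplicate_posts(posts)
```

A set is used instead of a plain list so that each membership check stays fast even when many tweets are processed. Posts without any links are kept under this policy.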
 
finnstone (Author) Commented:
It does not have to be Perl; I didn't know where to post this.
 
Scott Fell (EE MVE, Developer & EE Moderator) Commented:
What language do you work in? I'm sure there are multiple ways to do this in any language.

You would want to look at a post, then find each string starting with "<a " and ending with "</a>". Then, from the complete string <a href="http://mypage.com/link">Link</a>, take just mypage.com/link and add that to your array.

Then, for each post, look for a link. If "mypage.com/link" matches an entry in the array, do not use that link or post.
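One possible way to pull the link out of each anchor and reduce it to the "mypage.com/link" form is sketched below in Python, using the standard library's `html.parser` instead of manual string searching. The names `LinkCollector` and `link_keys` are hypothetical, and the sketch assumes the post body is HTML containing `<a href="...">` tags.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_keys(post_html):
    """Return scheme-less keys such as 'mypage.com/link' for each anchor."""
    collector = LinkCollector()
    collector.feed(post_html)
    collector.close()
    keys = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        # Host names are case-insensitive, paths are not, so only the
        # host is lowercased. The query string is kept if present.
        key = parts.netloc.lower() + parts.path
        if parts.query:
            key += "?" + parts.query
        keys.append(key)
    return keys
```

Dropping the scheme means `http://` and `https://` versions of the same address compare equal, which may or may not be what you want.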

I think you will find this harder than it sounds, though, because the same link may be shortened with different URL shorteners. If that is not a problem, this should work.
 
finnstone (Author) Commented:
I do not. I am going to hire someone, and budget is not a constraint.
 
finnstone (Author) Commented:
Yes, but I think there is a system that could return the true URL.
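The thread does not name a specific system for this. As one hedged possibility, a shortened link can usually be expanded by requesting it and seeing where the HTTP redirects end up. The Python sketch below assumes the shortener answers with ordinary HTTP redirects; `fetch_final_url` and `make_resolver` are hypothetical names, and the cache and fallback behavior are design assumptions, not something from the thread.

```python
import urllib.request


def fetch_final_url(url, timeout=10):
    """Request `url`, let urllib follow redirects, and return the final URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.url  # URL of the resource actually retrieved


def make_resolver(fetch=fetch_final_url):
    """Wrap `fetch` with a cache and a fallback to the original URL."""
    cache = {}

    def resolve(url):
        if url not in cache:
            try:
                cache[url] = fetch(url)
            except OSError:
                # Covers urllib's URLError/HTTPError and socket timeouts:
                # keep the unresolved URL rather than failing outright.
                cache[url] = url
        return cache[url]

    return resolve
```

Calling `resolve = make_resolver()` once and then `resolve(link)` for each link would give a "true URL" to compare, at the cost of one network request per distinct short link. Passing a different `fetch` function makes it possible to try the logic without network access.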
 
Scott Fell (EE MVE, Developer & EE Moderator) Commented:
>Yes, but I think there is a system that could return true URL
There are going to be multiple ways to do this in any given language. No matter what, it will involve some sort of search for a string with given parameters.

How you end up using and displaying the results, and how many tweets you need to check, will help determine the best solution.

If you want the least amount of code, you would use regex. But even regex syntax may differ slightly from language to language.
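As an illustration of the regex route, here is a minimal Python sketch. The pattern and the name `hrefs_by_regex` are hypothetical, and the sketch assumes `href` values are always wrapped in double quotes; single-quoted or unquoted attributes would be missed, so treat it as a starting point only.

```python
import re

# Matches an opening <a ...> tag and captures the double-quoted href value.
HREF_PATTERN = re.compile(r'<a\b[^>]*?\shref="([^"]+)"', re.IGNORECASE)


def hrefs_by_regex(post_html):
    """Return every double-quoted href found in <a> tags of `post_html`."""
    return HREF_PATTERN.findall(post_html)
```

This is short, but a regex does not understand HTML structure, so unusual markup can slip past it.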
Question has a verified solution.