Is it possible to normalize an RSS feed to remove duplicate links?

I have an RSS feed from a Twitter list.

It often contains duplicates: tweets that have the same link.

How can I remove the tweets with duplicate links from the feed?
finnstone Asked:
 
Scott Fell, EE MVE, Developer & EE Moderator, Commented:
>Yes, but I think there is a system that could return the true URL
There are going to be multiple ways to do this in any given language. Whichever you choose, it will involve some sort of search for a string that matches a given pattern.

How you end up using and displaying the feed, and how many tweets you need to check, will help determine the best solution.

If you want the least amount of code, use a regular expression (regex), though regex syntax differs slightly from language to language.
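
As a rough illustration of the regex route, here is a minimal Python sketch. The pattern and the name `find_links` are assumptions made for this example; the pattern simply treats a URL as running until whitespace, a quote, or an angle bracket, so trailing punctuation such as a full stop would be captured as part of the link.

```python
import re

# Assumed pattern: an http(s) URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

def find_links(text):
    """Return every http(s) URL found in text, in order of appearance."""
    return URL_PATTERN.findall(text)
```

Calling `find_links` on the text of a tweet gives the list of links you would then compare against the ones you have already seen.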
 
Scott Fell, EE MVE, Developer & EE Moderator, Commented:
I don't know Perl, but I think I would do this client side anyway. Create an array of links. For each post, first find its links and check whether they are already in the array. If a link is not there, add it to the array. If it is already there, don't use it. If you are not checking that many tweets, there may be no practical difference between doing this client side and server side.
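
The same loop also works directly on the feed. Below is a minimal Python sketch, assuming a plain RSS 2.0 document whose tweet text sits in each item's `title` or `description`; the function name `dedupe_rss` and the URL pattern are made up for this example. A set stands in for the array because membership checks on it are cheap.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: an http(s) URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

def dedupe_rss(rss_xml):
    """Return the feed with every item dropped that repeats an already seen link."""
    root = ET.fromstring(rss_xml)
    seen = set()  # links used by items we decided to keep
    for channel in root.iter("channel"):
        for item in list(channel.findall("item")):
            # Assumption: the tweet text lives in <title> and/or <description>.
            text = " ".join(item.findtext(tag) or "" for tag in ("title", "description"))
            links = set(URL_PATTERN.findall(text))
            if links & seen:
                channel.remove(item)  # shares a link with an earlier item
            else:
                seen |= links
    return ET.tostring(root, encoding="unicode")
```

The first item that carries a link is kept and later items with the same link are removed. Items with no link at all are always kept.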
 
finnstone, Author, Commented:
It does not have to be Perl; I didn't know where to post this.
 
Scott Fell, EE MVE, Developer & EE Moderator, Commented:
What language do you work in? I'm sure there are multiple ways to do this in any language.

Look at each post and find every string that starts with "<a " and ends with "</a>". From the complete string <a href="http://mypage.com/link">Link</a>, take just mypage.com/link and add that to your array.

Then, for each later post, look for a link. If its "mypage.com/link" part matches something already in the array, do not use that link or post.
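
Those two steps can be sketched in Python as follows. This is only an illustration: `LinkCollector`, `link_key`, `post_link_keys` and `keep_unique` are hypothetical names, and it uses the standard library HTML parser rather than string searching to pull out each `href`. The key keeps the host, path and any query string, and drops the scheme and a trailing slash.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def link_key(url):
    """Reduce a URL to host plus path (plus query), e.g. 'mypage.com/link'."""
    parts = urlsplit(url)
    key = parts.netloc.lower() + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key

def post_link_keys(post_html):
    """Return the comparison key for every link in one post."""
    collector = LinkCollector()
    collector.feed(post_html)
    return [link_key(href) for href in collector.hrefs]

def keep_unique(posts):
    """Keep only posts whose links have not appeared in an earlier kept post."""
    seen = set()
    kept = []
    for post in posts:
        keys = post_link_keys(post)
        if any(key in seen for key in keys):
            continue  # a link in this post was already used
        seen.update(keys)
        kept.append(post)
    return kept
```

With this key, `http://mypage.com/link` and `https://MyPage.com/link/` count as the same link.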

I think you will find this harder in practice, though, because the same link may be shortened with different URL shorteners. If that is not a problem, this should work.
 
finnstone, Author, Commented:
I do not. I am going to hire someone and am budget insensitive.
 
finnstone, Author, Commented:
Yes, but I think there is a system that could return the true URL.
Question has a verified solution.