finnstone
asked on
Is it possible to normalize an RSS feed to remove duplicate links?
I have an RSS feed from a Twitter list.
It often has duplicates: tweets that contain the same link.
How can I remove the tweets with duplicate links from the RSS feed?
I don't know Perl, but I think I would do this client-side anyway. Create an array of links. For each post, first find its links and check whether they are already in the array. If a link is not there, add it to the array. If the link is already in the array, then don't use that post. If you are not checking that many tweets, there may be no difference between doing this client-side and server-side.
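The loop described above can be sketched in Python. This is only a rough illustration, assuming each post is available as a plain string; `URL_PATTERN` and `dedupe_posts` are names made up for this example, and the regular expression is a deliberately simple way to spot links.

```python
import re

# Hypothetical, deliberately simple pattern for spotting links in post text.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def dedupe_posts(posts):
    """Keep only posts whose links have not appeared in an earlier post."""
    seen_links = set()  # plays the role of the "array of links"
    kept = []
    for post in posts:
        links = URL_PATTERN.findall(post)
        # Skip the post if any of its links was already seen.
        if any(link in seen_links for link in links):
            continue
        seen_links.update(links)
        kept.append(post)
    return kept
```

A set is used instead of an array so that each lookup stays fast even when the feed is long. Posts without any link are always kept.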
ASKER
It does not have to be Perl; I didn't know where to post this.
What language do you work in? I'm sure there will be multiple ways to do this in any language.
You would want to look at a post, then find each string starting with "<a " and ending in "</a>". Then, from the complete string <a href="http://mypage.com/link">Link</a>, take just mypage.com/link and add that to your array.
Then, for each post, look for a link; if "mypage.com/link" matches what is in the array, do not use the link or post.
I think you will find this harder, though, because the same link may be shortened with different URL shorteners. If that is not a problem, this should work.
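Applied to a whole feed, the idea might look something like the sketch below. It assumes an RSS 2.0 layout where each `<item>` carries its tweet HTML in `<description>`; that layout, and the names `HrefCollector`, `link_key`, and `dedupe_feed`, are assumptions for illustration. It uses Python's HTML parser to read the `href` values rather than searching for "<a " by hand.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_key(href):
    """Reduce http://mypage.com/link to mypage.com/link for comparison."""
    parts = urlsplit(href)
    key = parts.netloc.lower() + parts.path
    if parts.query:
        key += "?" + parts.query
    return key


def dedupe_feed(rss_text):
    """Return the feed XML without items that repeat an earlier item's link."""
    root = ET.fromstring(rss_text)
    seen = set()
    for channel in root.iter("channel"):
        # Iterate over a copy, since items are removed from the channel.
        for item in list(channel.findall("item")):
            collector = HrefCollector()
            collector.feed(item.findtext("description") or "")
            keys = {link_key(h) for h in collector.hrefs}
            if keys & seen:
                channel.remove(item)
            else:
                seen |= keys
    return ET.tostring(root, encoding="unicode")
```

Dropping the scheme means the http and https versions of one address count as the same link. Links shortened by different services would still look different to this code.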
ASKER
I do not. I am going to hire someone and am budget insensitive.
ASKER
Yes, but I think there is a system that could return the true URL.
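One way such a lookup could work, without relying on any particular service, is to follow the HTTP redirects of each shortened link and compare the final addresses. The sketch below uses only Python's standard library; `resolve_url` and `dedupe_by_resolved_link` are hypothetical names, and it assumes the shortener answers HEAD requests, which not every server does.

```python
import urllib.request


def resolve_url(url, timeout=10.0):
    """Follow redirects and return the final URL, or the input on any error."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        # urlopen follows redirects; geturl() reports where the request ended.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.geturl()
    except (OSError, ValueError):
        # Network failures, HTTP errors, and malformed URLs all fall back here.
        return url


def dedupe_by_resolved_link(links, resolver=resolve_url):
    """Keep the first link for each distinct resolved destination."""
    seen = set()
    unique = []
    for link in links:
        final = resolver(link)
        if final not in seen:
            seen.add(final)
            unique.append(link)
    return unique
```

The `resolver` parameter lets a cache or a third-party expansion service be swapped in later. Resolving every link costs one network request each, so caching the results would probably be worthwhile on a busy feed.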