• Status: Solved
  • Priority: Medium
  • Security: Public

Is it possible to normalize an RSS feed to remove duplicate links?

I have an RSS feed from a Twitter list.

It often contains duplicates: tweets that share the same link.

How can I remove the duplicate tweets from the feed?
Asked by: finnstone
1 Solution
 
Scott Fell, EE MVE, Developer Commented:
I don't know Perl, but I think I would do this client-side anyway. Create an array of links. For each post, find its links and check whether they are already in the array. If a link is not there, add it to the array. If it is already there, don't use that post. If you are not checking many tweets, there may be no noticeable difference between doing this client-side and server-side.
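
As a rough illustration of the array-of-links idea, here is a minimal Python sketch (Python is my choice, not something the thread settled on). It assumes each post has already been reduced to a dict with hypothetical "text" and "links" keys, and it uses a set instead of an array so that lookups stay fast.

```python
def drop_duplicate_posts(posts):
    """Keep only posts whose links have not appeared in an earlier post."""
    seen_links = set()
    unique_posts = []
    for post in posts:
        links = post["links"]  # hypothetical key: list of URLs found in the post
        # Skip the whole post if any of its links was already seen.
        if any(link in seen_links for link in links):
            continue
        seen_links.update(links)
        unique_posts.append(post)
    return unique_posts


# Made-up sample data, for illustration only.
posts = [
    {"text": "Great read", "links": ["http://mypage.com/link"]},
    {"text": "Worth a look", "links": ["http://mypage.com/link"]},
    {"text": "Something else", "links": ["http://mypage.com/other"]},
]
unique = drop_duplicate_posts(posts)
```

With this sample data, the second post is dropped because its link matches the first. Posts without any links are always kept.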
 
finnstone (Author) Commented:
It does not have to be Perl; I didn't know where to post this.
 
Scott Fell, EE MVE, Developer Commented:
What language do you work in? I'm sure there will be multiple ways to do this in any language.

You would want to look at a post, then find each string starting with "<a " and ending in "</a>". Then, from the complete string <a href="http://mypage.com/link">Link</a>, take just mypage.com/link and add that to your array.

Then, for each post, look for a link. If its "mypage.com/link" matches an entry already in the array, do not use the link or post.
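
The extraction step above could be sketched in Python roughly as follows. This uses the standard library's HTML parser rather than raw string searching, which is a substitution on my part. The names `LinkCollector` and `link_keys` are hypothetical, and the choice to ignore the scheme and query string when building the key is an assumption that may not suit every feed.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_keys(post_html):
    """Return comparison keys such as 'mypage.com/link' for each link in a post."""
    collector = LinkCollector()
    collector.feed(post_html)
    collector.close()
    keys = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        # Assumption: scheme and query string are ignored when comparing links.
        keys.append(parts.netloc.lower() + parts.path)
    return keys


keys = link_keys('Check this <a href="http://mypage.com/link">Link</a>')
```

The resulting keys can then be checked against the array (or set) of links already seen.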

I think you will find this harder in practice, though, because the same link may be shortened with different URL shorteners. If that is not a problem, this should work.
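
If shortened links do turn out to be a problem, one possible approach is to follow the redirects and compare the final URLs. The sketch below is only an assumption about how that might be done with Python's standard library. `resolve_url` is a hypothetical name, the `opener` parameter exists only so the function can be tried without network access, and some servers may not answer HEAD requests as expected.

```python
import urllib.request


def resolve_url(url, timeout=10.0, opener=urllib.request.urlopen):
    """Follow HTTP redirects and return the final URL, or the input URL on failure."""
    try:
        # HEAD avoids downloading the page body; urlopen follows redirects itself.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return response.geturl()
    except (OSError, ValueError):
        # Network errors, HTTP errors, or a malformed URL: fall back to the input.
        return url
```

Each link would be passed through `resolve_url` before it is compared with the links already seen. Resolving every link adds one network round trip per link, so caching the results is probably worthwhile for larger feeds.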

 
finnstone (Author) Commented:
I do not work in any language myself. I am going to hire someone and am budget-insensitive.
 
finnstone (Author) Commented:
Yes, but I think there is a system that could return the true URL.
 
Scott Fell, EE MVE, Developer Commented:
> Yes, but I think there is a system that could return the true URL
There are going to be multiple ways to do this in any given language. No matter what, it will include some sort of search for a string with given parameters.

How you end up using and displaying the result, and how many tweets you end up checking, will help determine the best solution.

If you want the least amount of code, you would use regex. But even regex syntax may differ slightly from language to language.
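
For a sense of what the regex route might look like, here is a short Python sketch. The pattern is an assumption about typical feed markup (quoted href values on <a> tags) and is not a general HTML parser; `HREF_PATTERN` and `find_hrefs` are hypothetical names.

```python
import re

# Matches <a ... href="..."> or <a ... href='...'> and captures the URL.
HREF_PATTERN = re.compile(
    r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def find_hrefs(post_html):
    """Return every href value found on <a> tags in the given HTML string."""
    return HREF_PATTERN.findall(post_html)


hrefs = find_hrefs('See <a href="http://mypage.com/link">Link</a>')
```

Unquoted href values or unusual markup would need a more careful pattern or a real parser.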