I have been handed a comma-delimited file containing 50,000 tweets, many of which are spam.
Each row is an individual tweet, and I need to remove all suspicious rows from the file.
Can you please recommend a method for identifying the spam lines in this file? Any examples or source code would be greatly appreciated.
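To make the task concrete, here is a rough sketch of the kind of row filtering I have in mind. Everything in it is an assumption for illustration: that the tweet text sits in the first column, that the file is UTF-8, and that the rules (repeated text, several links, many hashtags, a short keyword list) with their thresholds are reasonable. They are placeholders, not a tested definition of spam, and the function names are hypothetical.

```python
import csv
import re

# Hypothetical rules: keywords and thresholds are illustrative guesses only.
SPAM_KEYWORDS = ("winner", "click here", "buy now")
URL_RE = re.compile(r"https?://\S+")
MAX_URLS = 1       # more links than this is treated as suspicious
MAX_HASHTAGS = 5   # more '#' characters than this is treated as suspicious


def looks_like_spam(text, seen):
    """Return True if the tweet text trips any of the placeholder rules."""
    lowered = text.lower()
    normalized = " ".join(lowered.split())
    if normalized in seen:
        # Same text (ignoring case and spacing) already appeared earlier.
        return True
    seen.add(normalized)
    if len(URL_RE.findall(text)) > MAX_URLS:
        return True
    if text.count("#") > MAX_HASHTAGS:
        return True
    # Crude substring match; a real filter would need something better.
    return any(keyword in lowered for keyword in SPAM_KEYWORDS)


def filter_rows(rows, text_column=0):
    """Split rows into (kept, flagged) lists using the rules above."""
    seen = set()
    kept, flagged = [], []
    for row in rows:
        text = row[text_column] if len(row) > text_column else ""
        if looks_like_spam(text, seen):
            flagged.append(row)
        else:
            kept.append(row)
    return kept, flagged


def clean_file(src_path, kept_path, flagged_path, text_column=0):
    """Read a CSV of tweets and write kept and flagged rows to two files."""
    with open(src_path, newline="", encoding="utf-8") as src:
        kept, flagged = filter_rows(csv.reader(src), text_column)
    with open(kept_path, "w", newline="", encoding="utf-8") as out:
        csv.writer(out).writerows(kept)
    with open(flagged_path, "w", newline="", encoding="utf-8") as out:
        csv.writer(out).writerows(flagged)
```

Writing the flagged rows to a separate file instead of deleting them would let me review what was removed. What I do not know is whether hand-written rules like these are good enough, or whether a trained classifier is the better approach.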