I have a list of several hundred thousand transactions. Most of them correspond to the same items, e.g. a used iPhone 4G 8Gb GSM. However, there is no product catalogue, so a given record might have the brand entered as Apple, Mac, or iPhone, and the model as 4G 8Gb, 4Gb 8Gb, 4 8Gb, etc. Each field was filled in with whatever the user wanted.
I want to unify these items and build a starting item catalogue, so that from now on all "identical" items are identified by a single SKU and the catalogue can grow. However, I have no idea how to start. Of course I could assign someone to group similar items by hand according to his or her understanding, but I would like to know whether there is a programmatic way to classify most of these items, maybe using the Amazon API or data mining. Any help would be much appreciated.
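To make the goal concrete, here is a minimal sketch of the kind of grouping I have in mind. It is only an illustration under my own assumptions, not a validated approach: the function names `normalize` and `group_items`, the `(brand, model)` record shape, and the 0.8 similarity threshold are all hypothetical. It uses only Python's standard `re` and `difflib` modules.

```python
import re
from difflib import SequenceMatcher


def normalize(brand: str, model: str) -> str:
    """Lowercase the combined fields, drop punctuation, collapse whitespace."""
    text = f"{brand} {model}".lower()
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def group_items(records, threshold=0.8):
    """Greedy grouping: attach each record to the first group whose
    representative key is at least `threshold` similar, else start a new group.

    `threshold` is an assumed value and would need tuning on real data.
    """
    groups = []  # list of (representative_key, [original records])
    for brand, model in records:
        key = normalize(brand, model)
        for rep, members in groups:
            if SequenceMatcher(None, key, rep).ratio() >= threshold:
                members.append((brand, model))
                break
        else:
            # No existing group was similar enough.
            groups.append((key, [(brand, model)]))
    return groups


if __name__ == "__main__":
    sample = [
        ("Apple", "iPhone 4G 8Gb"),
        ("apple", "iphone 4g 8GB"),
        ("Nokia", "3310"),
    ]
    for rep, members in group_items(sample):
        print(rep, "<-", members)
```

This compares every record against every existing group, so it would be far too slow as written for hundreds of thousands of rows, and plain string similarity would not know that "Mac" and "Apple" can mean the same brand. That is exactly the part I do not know how to solve.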