degaray

Group same items on a single SKU

Hi Experts!

I have a list of hundreds of thousands of transactions. Most of them correspond to the same items, e.g. a used iPhone 4G 8Gb GSM. However, there is no product catalogue, so you can find a record whose brand is Apple, Mac, or iPhone and whose model could be 4G 8Gb, 4Gb 8Gb, 4 8Gb, etc. I think I have made my point: each field was filled in with whatever the user wanted.

I want to unify these items and build a starting item catalog, so that from now on all "identical" items are specified with a single SKU and the catalog can grow. However, I have no idea how to start. Of course I could assign someone to group similar items according to his/her understanding, but I would like to know if there is a programmatic way to classify most of these items, maybe using the Amazon API or data mining. Any help will be much appreciated.

Cheers!
varontron

What format, system, RDBMS, etc. is the data in now?

Assuming there is consistency in the data values, this kind of grouping is pretty basic with SQL for an RDBMS, Perl for a text file (CSV), or XSL for XML.
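
As a rough illustration of that kind of grouping, here is a minimal Python sketch. It assumes the data has been exported to CSV with `brand` and `model` columns; the column names and the sample rows are hypothetical. It only counts exact matches, so it helps only when the values really are consistent.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a CSV export of the transactions.
SAMPLE_CSV = """brand,model
Apple,iPhone 4 8GB
Apple,iPhone 4 8GB
Apple,iPod 8GB
"""


def count_items(csv_text):
    """Count transactions per exact (brand, model) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["brand"], row["model"]) for row in reader)


counts = count_items(SAMPLE_CSV)
for (brand, model), n in counts.most_common():
    print(brand, model, n)
```

Each distinct (brand, model) pair becomes one group, which is roughly what a `GROUP BY` on the same two columns would give in SQL.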
degaray

ASKER

Hi, I have it in SQL Server, and I can export it to CSV or XLS without a problem. The main point is that grouping is not so basic, since the data values are anything but consistent.
You could write a simple VBA function designed to accept the free-hand SKU description and return the actual SKU ID.

It would search the description for specific words, e.g. 8GB, and then return the correct SKU ID, e.g. Apple iPod 8GB.

The problem is that if two items both have 8GB in them, it could return the wrong value.
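
One possible way around that ambiguity is to require several keywords per item and prefer the most specific match. The sketch below shows the idea in Python rather than VBA; the `RULES` table, its labels, and the function names are all hypothetical, not an existing catalog or API.

```python
import re

# Hypothetical rules: each label requires ALL of its keywords to be present.
RULES = [
    ("Apple iPhone 4 8GB", {"iphone", "4", "8gb"}),
    ("Apple iPod 8GB", {"ipod", "8gb"}),
]


def tokenize(description):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))


def match_sku(description):
    """Return the label of the most specific matching rule, or None."""
    tokens = tokenize(description)
    best = None
    for label, required in RULES:
        # A rule matches only if every required keyword is among the tokens.
        if required <= tokens and (best is None or len(required) > len(best[1])):
            best = (label, required)
    return best[0] if best else None


print(match_sku("used iPhone 4 8Gb GSM"))
print(match_sku("Apple Ipod 8Gb"))
print(match_sku("Samsung 8GB"))
```

Because "8gb" alone never satisfies a rule, two items that share it no longer collide. Variants such as "4G" would still need their own keyword or a normalization step first.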
I suggested a VBA function because I found the question on the Excel page, but given the choice I would create a database function to do this.
degaray

ASKER

Right, but I do not have SKUs yet. That is the point. I would like to know if there is any way I could send parameters such as 8gb and iphone and get back the SKU and any additional info.
ASKER CERTIFIED SOLUTION
softpro2k

This solution is only available to members.
degaray

ASKER

I was looking for something more relevant, but anyway I think that can help a little, although it would not distinguish between lemon and lemons or lemon_.
Hello,

The DISTINCT clause should distinguish between lemon and lemons or lemon_.

You can also try a feature in Excel that creates a unique list from an existing list: 'Advanced Filter' under the 'Data' menu. Tick the 'Unique records only' box before proceeding.
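
To make the lemon / lemons / lemon_ point concrete, here is a small Python sketch. It is an illustration only: the `normalize` rule (lowercase, drop punctuation, strip one trailing "s") is a deliberately naive assumption and would need tuning for real product names.

```python
import re


def unique_exact(values):
    """Unique values in first-seen order, compared exactly (like DISTINCT)."""
    return list(dict.fromkeys(values))


def normalize(value):
    """Naive, hypothetical key: lowercase, alphanumerics only, no trailing 's'."""
    key = re.sub(r"[^a-z0-9]", "", value.lower())
    return key[:-1] if key.endswith("s") and len(key) > 1 else key


def unique_normalized(values):
    """Keep the first spelling seen for each normalized key."""
    seen = {}
    for value in values:
        seen.setdefault(normalize(value), value)
    return list(seen.values())


items = ["lemon", "lemons", "lemon_", "lime"]
print(unique_exact(items))       # exact comparison keeps all four spellings
print(unique_normalized(items))  # normalized comparison merges the lemon variants
```

An exact unique list keeps every spelling apart, while comparing on a normalized key is one way to collapse near-duplicates before assigning a single SKU.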

Read the 'Filter Unique Records' section of this article.

Regards,

A. Roy