Solved

Group same items on a single SKU

Posted on 2011-10-18
Medium Priority
226 Views
Last Modified: 2012-06-22
Hi Experts!

I have a listing of hundreds of thousands of transactions. Most of those transactions correspond to the same items, e.g. a used iPhone 4G 8Gb GSM. However, there is no product catalogue, so you can find a record with the brand being Apple, Mac, or iPhone, and the model could be 4G 8Gb, 4Gb 8Gb, 4 8Gb, etc. I think I have made my point: each field was filled in with whatever the user wanted.

I want to unify these items and build a starting item catalog, so that from now on all "identical" items are identified by a single SKU and the catalog can grow. However, I have no idea how to start. Of course I could assign someone to group similar items according to his/her understanding, but I would like to know if there is a programmatic way to classify most of these items, maybe using the Amazon API or data mining. Any help will be much appreciated.

Cheers!
Question by:degaray
8 Comments
 
LVL 3

Expert Comment

by:varontron
ID: 36989316
What format, system, RDBMS, etc. is the data in now?

Assuming there is consistency in the data values, this kind of grouping is pretty basic with SQL for an RDBMS, Perl for a text file (CSV), or XSL for XML.
 
LVL 4

Author Comment

by:degaray
ID: 36989447
Hi, I have it in SQL Server, and I can export it to CSV or XLS without a problem. The main point is that the grouping is not so basic, since the data values are anything but consistent.
 
LVL 3

Expert Comment

by:Andibevan
ID: 36989457
You could write a simple VBA function that accepts the free-hand SKU description and returns the actual SKU ID.

It would search the description for specific words e.g. 8GB and it would then return the correct SKU ID e.g. Apple Ipod 8Gb.  

The problem you would have is that if two items both have 8gb in them it could return the wrong value.
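
As a rough illustration of the keyword lookup described above, here is a minimal sketch in Python rather than VBA. The SKU IDs, the rule table `SKU_RULES`, and the function names are all hypothetical, invented for this example. To address the ambiguity problem, it requires every keyword of a rule to be present and declines to answer when more than one SKU matches.

```python
import re

# Hypothetical rules: each SKU ID maps to keywords that must ALL appear
# in the free-hand description. The SKU IDs are invented for illustration.
SKU_RULES = {
    "APPLE-IPOD-8GB": {"ipod", "8gb"},
    "APPLE-IPHONE4-8GB": {"iphone", "8gb"},
}


def tokenize(description):
    """Lower-case the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))


def lookup_sku(description):
    """Return the single matching SKU ID, or None if zero or several match."""
    tokens = tokenize(description)
    matches = [sku for sku, words in SKU_RULES.items() if words <= tokens]
    # Refuse to guess when the description is ambiguous or unknown.
    return matches[0] if len(matches) == 1 else None
```

Rows for which `lookup_sku` returns `None` would still need manual review. Note that a description written as "8 Gb" (with a space) would not match the keyword "8gb" in this sketch.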

 
LVL 3

Expert Comment

by:Andibevan
ID: 36989464
I said VBA function because I found the question in the Excel section, but given the choice I would create a database function to do this.
 
LVL 4

Author Comment

by:degaray
ID: 36989631
Right, but I do not have SKUs yet. That is the point. I would like to know if there is any way to send parameters such as 8gb and iphone and get back the SKU and any additional info.
 
LVL 5

Accepted Solution

by:
softpro2k earned 1000 total points
ID: 36990877
A database would be the best solution: MySQL, Access, etc.

Then use the DISTINCT clause in your SQL query. That removes duplicate item names and returns only unique items.
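
To make the effect of DISTINCT concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for the poster's SQL Server database. The table name `transactions` and the column `item_name` are assumptions made for this example. It shows that exact duplicates are collapsed, while spelling variants stay as separate rows.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (item_name TEXT)")
conn.executemany(
    "INSERT INTO transactions (item_name) VALUES (?)",
    [("lemon",), ("lemon",), ("lemons",), ("lemon_",)],
)

# DISTINCT drops exact duplicates only.
rows = conn.execute(
    "SELECT DISTINCT item_name FROM transactions ORDER BY item_name"
).fetchall()
unique_names = [name for (name,) in rows]

# The repeated "lemon" collapses to one row; "lemons" and "lemon_"
# remain, because DISTINCT compares whole values.
print(unique_names)
conn.close()
```

Whether values that differ only in letter case are merged depends on the database's collation settings, so results on another engine may differ from this SQLite sketch.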
 
LVL 4

Author Closing Comment

by:degaray
ID: 37020354
I was looking for something more relevant, but anyway I think that can help a little, although it would not distinguish between lemon and lemons or lemon_.
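
If the goal is to treat spellings such as lemon, lemons and lemon_ as one item, one possible approach is to compute a normalized key for each name and group on that key instead of the raw text. The sketch below is only an illustration under that assumption. The functions `normalize` and `group_variants` are hypothetical, and the plural rule is deliberately naive.

```python
import re
from collections import defaultdict


def normalize(name):
    """Build a crude grouping key: lower-case, letters and digits only."""
    key = re.sub(r"[^a-z0-9]+", "", name.lower())
    # Naive plural handling (an assumption): drop one trailing "s"
    # from keys longer than three characters.
    if len(key) > 3 and key.endswith("s"):
        key = key[:-1]
    return key


def group_variants(names):
    """Map each normalized key to the sorted raw spellings seen for it."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items()}


if __name__ == "__main__":
    for key, variants in group_variants(
        ["lemon", "Lemons", "lemon_", "lime"]
    ).items():
        print(key, variants)
```

A rule this simple can also merge items that should stay apart, for example a model name that legitimately ends in "s", so the resulting groups would need a human check before SKUs are assigned.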
 
LVL 5

Expert Comment

by:softpro2k
ID: 37023543
Hello,

The DISTINCT clause should distinguish between lemon and lemons or lemon_.

You can also try an Excel feature that creates a unique list from an existing list. Use 'Advanced Filter' under the 'Data' menu, and tick the 'Unique Records Only' box before proceeding.

Read the 'Filter Unique Records' section of this article.

Regards,

A. Roy
Question has a verified solution.
