Solved

I need some assistance to find and use a sql server algorithm for categorising my list of products

Posted on 2013-05-10
7
314 Views
Last Modified: 2013-05-27
Hi,

I have a table in my sql server database that contains the following columns...

ID
Name
CommonName
Description

Now, I need to add another column called, 'Categorisation' and this column will be used to filter out the different types of product.

Now my idea of doing this was to run some kind of algorithm over the table that will do some clever pattern matching for me over the 3 columns Name,Common Name and Description. I know it won't be 100% accurate but there are 1300 products and I was hoping to at least get some of these products categorised automatically categorised.

Using algorithms to analyse data is a complete new experience for me so a thorough explanation would be greatly appreciated. Thanks.
0
Comment
Question by:jazz__man
  • 2
  • 2
  • 2
  • +1
7 Comments
 
LVL 14

Expert Comment

by:Juan Ocasio
ID: 39155311
You can use LIKE '%prodcut%' in your where clause.

For example:

SELECT * FROM table where Name LIKE '%product%'
OR CommonNam  LIKE '%product%'
OR Description LIKE '%product%'

This may take a LONG time depending on how much data is in your db, so you can do them separately as well

SELECT * FROM table where Name LIKE '%product%'


SELECT * FROM table where CommonNam  LIKE '%product%'


SELECT * FROM table where  Description LIKE '%product%'

the product you're looking for would go between the '% and the  %'
0
 
LVL 69

Expert Comment

by:Qlemo
ID: 39155315
Something like:
update tbl
set Categorisation = case
  when Name like '%Something%' and CommonName like '%Something%' and description like '%Something%' then 'Something'
  when Name ...
end

Open in new window

You can also write individual update commands for each pattern combination, and put the condition into the where clause. It is very similar to above.
0
 
LVL 2

Author Comment

by:jazz__man
ID: 39155343
Qlemo,

Thanks very much, this is very helpful but I would have to write hundreds of case conditions to run this. This is why I was looking for something a bit more mathematical like an algorithm to do a bit of pattern matching for me. There are 900 distinct products in a database of 1300. Now if there were 100 distinct records in 1300 then life would be much easier but im not in this position. Hope this makes sense. Thanks
0
Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

 
LVL 69

Expert Comment

by:Qlemo
ID: 39155367
To develop something smart we would have to know the details. SQL cannot guess, so you have to tell it something concrete.
You could e.g. build it stepwise:
update tbl set Categorisation = 'Something' where name like '\[A-F\]%' escape '\' and Categorisation is null;
update tbl set Categorisation = 'Somethong' where CommonName like '_\[0-9]%' escape '\' and Categorisation is null;

Open in new window

and so on. That will only update rows not already categorized. You should go for the most common categories first, and then treat the more special cases.
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39155609
>>Now, I need to add another column called, 'Categorisation'

Don't you need another TABLE? e.g.
ProductCategory
ID, Product_ID (FK), Category

so for one row in the existing table you may have one or more rows in the new table

then your queries would be far easier and more efficient

select
*
from products p
inner join ProductCategory pc on p.id = pc.product_id
where pc.category ='fish'
or pc.category = 'chips'


+ You could take this a step further and have a table of Categories, so that only allowed categories can be added to a product.

sorry, but I just hate the idea of you evaluating strings (like '%something%') at each and every select statement - it will be a nightmare.

Normalization makes life easier in the long run.

If you go down this route, what you will need to generate would be a set of insert statements to the new table, and this could leverage the table of categories.

let's say you have a table of Categories like this

Fish
Chips
Salt
Vinegar

then you could do something along these lines (to generate the insert data, not as a daily activity)

select
p.*
c.category
from products p
inner join categories p.description like ('%' + c.category + '%')
0
 
LVL 2

Author Comment

by:jazz__man
ID: 39155668
PortletPaul,

You are making me hungry!
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39155731
LOL, maybe that's what's on my mind... [note to self, eat]
0

Featured Post

NAS Cloud Backup Strategies

This article explains backup scenarios when using network storage. We review the so-called “3-2-1 strategy” and summarize the methods you can use to send NAS data to the cloud

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

If you have heard of RFC822 date formats, they can be quite a challenge in SQL Server. RFC822 is an Internet standard format for email message headers, including all dates within those headers. The RFC822 protocols are available in detail at:   ht…
Use this article to create a batch file to backup a Microsoft SQL Server database to a Windows folder.  The folder can be on the local hard drive or on a network share.  This batch file will query the SQL server to get the current date & time and wi…
This video shows how to use Hyena, from SystemTools Software, to bulk import 100 user accounts from an external text file. View in 1080p for best video quality.
The Email Laundry PDF encryption service allows companies to send confidential encrypted  emails to anybody. The PDF document can also contain attachments that are embedded in the encrypted PDF. The password is randomly generated by The Email Laundr…

770 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question