Solved

I need some assistance to find and use a SQL Server algorithm for categorising my list of products

Posted on 2013-05-10
Last Modified: 2013-05-27
Hi,

I have a table in my SQL Server database that contains the following columns:

ID
Name
CommonName
Description

Now, I need to add another column called 'Categorisation', which will be used to filter the different types of product.

My idea was to run some kind of algorithm over the table that does some clever pattern matching across the three columns Name, CommonName and Description. I know it won't be 100% accurate, but there are 1300 products and I was hoping to get at least some of them categorised automatically.

Using algorithms to analyse data is a completely new experience for me, so a thorough explanation would be greatly appreciated. Thanks.
Question by:jazz__man
7 Comments
 
LVL 14

Expert Comment

by:Juan Ocasio
ID: 39155311
You can use LIKE '%product%' in your WHERE clause.

For example:

SELECT * FROM table where Name LIKE '%product%'
OR CommonName LIKE '%product%'
OR Description LIKE '%product%'

This may take a LONG time depending on how much data is in your db, so you can also run the searches separately:

SELECT * FROM table where Name LIKE '%product%'


SELECT * FROM table where CommonName LIKE '%product%'


SELECT * FROM table where Description LIKE '%product%'

The product you're looking for goes between the '% and the %'.
 
LVL 68

Expert Comment

by:Qlemo
ID: 39155315
Something like:
update tbl
set Categorisation = case
  when Name like '%Something%' and CommonName like '%Something%' and description like '%Something%' then 'Something'
  when Name ...
end


You can also write an individual update command for each pattern combination and put the condition into the where clause. It is very similar to the above.
 
LVL 2

Author Comment

by:jazz__man
ID: 39155343
Qlemo,

Thanks very much, this is very helpful, but I would have to write hundreds of case conditions to run this. That is why I was looking for something a bit more mathematical, like an algorithm that does the pattern matching for me. There are 900 distinct products in a database of 1300. If there were only 100 distinct products in 1300, life would be much easier, but I'm not in that position. Hope this makes sense. Thanks
 
LVL 68

Expert Comment

by:Qlemo
ID: 39155367
To develop something smart we would have to know the details. SQL cannot guess, so you have to tell it something concrete.
You could, for example, build it stepwise:
update tbl set Categorisation = 'Something' where name like '[A-F]%' and Categorisation is null;
update tbl set Categorisation = 'Somethong' where CommonName like '_[0-9]%' and Categorisation is null;


and so on. Each statement only updates rows that are not already categorised. You should go for the most common categories first, and then treat the more special cases.
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39155609
>>Now, I need to add another column called, 'Categorisation'

Don't you need another TABLE? e.g.
ProductCategory
ID, Product_ID (FK), Category

so for one row in the existing table you may have one or more rows in the new table

then your queries would be far easier and more efficient

select
*
from products p
inner join ProductCategory pc on p.id = pc.product_id
where pc.category ='fish'
or pc.category = 'chips'


+ You could take this a step further and have a table of Categories, so that only allowed categories can be added to a product.
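
As a rough sketch of that layout: only ProductCategory, its ID, Product_ID and Category columns, and the idea of a Categories table come from the outline above. The data types, constraint names and the unique constraint are assumptions for illustration, and the foreign key assumes products.id is an INT primary key.

```sql
-- Hypothetical lookup table holding the allowed category names
CREATE TABLE Categories (
    Category VARCHAR(50) NOT NULL PRIMARY KEY
);

-- One row per (product, category) pair, so a product can have several categories
CREATE TABLE ProductCategory (
    ID         INT IDENTITY(1,1) PRIMARY KEY,
    Product_ID INT NOT NULL REFERENCES products (id),           -- assumes products.id is the primary key
    Category   VARCHAR(50) NOT NULL REFERENCES Categories (Category),
    CONSTRAINT UQ_ProductCategory UNIQUE (Product_ID, Category) -- no duplicate pairs
);
```

With the foreign key on Category in place, an insert that uses a category missing from Categories is rejected, which is what restricts products to the allowed list.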

sorry, but I just hate the idea of you evaluating strings (like '%something%') at each and every select statement - it will be a nightmare.

Normalization makes life easier in the long run.

If you go down this route, what you will need to generate would be a set of insert statements to the new table, and this could leverage the table of categories.

let's say you have a table of Categories like this

Fish
Chips
Salt
Vinegar

then you could do something along these lines (to generate the insert data, not as a daily activity)

select
p.*,
c.category
from products p
inner join categories c on p.description like ('%' + c.category + '%')
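
One way that select could be turned into the one-off insert, as a sketch only. It assumes a ProductCategory table with Product_ID and Category columns and a Categories table with a Category column, as outlined earlier. Matching on Name and CommonName as well as Description is an extension taken from the original question, not part of the query above.

```sql
-- One-off load: add a row for every product whose text mentions a category name
INSERT INTO ProductCategory (Product_ID, Category)
SELECT p.id, c.Category
FROM products p
INNER JOIN Categories c
    ON  p.Name        LIKE '%' + c.Category + '%'
    OR  p.CommonName  LIKE '%' + c.Category + '%'
    OR  p.Description LIKE '%' + c.Category + '%'
WHERE NOT EXISTS (            -- skip pairs that are already stored, so the load can be re-run
    SELECT 1
    FROM ProductCategory pc
    WHERE pc.Product_ID = p.id
      AND pc.Category   = c.Category
);

-- Products that matched nothing and still need a category assigned by hand
SELECT p.*
FROM products p
WHERE NOT EXISTS (
    SELECT 1 FROM ProductCategory pc WHERE pc.Product_ID = p.id
);
```

Two caveats: whether 'fish' matches 'Fish' depends on the column collation, and a category name containing %, _ or [ would be treated as a LIKE wildcard rather than literal text.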
 
LVL 2

Author Comment

by:jazz__man
ID: 39155668
PortletPaul,

You are making me hungry!
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39155731
LOL, maybe that's what's on my mind... [note to self, eat]