  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 329

I need some assistance finding and using a SQL Server algorithm for categorising my list of products

Hi,

I have a table in my SQL Server database that contains the following columns:

ID
Name
CommonName
Description

Now I need to add another column called 'Categorisation', which will be used to filter the different types of product.

My idea was to run some kind of algorithm over the table that does some clever pattern matching over the three columns Name, CommonName and Description. I know it won't be 100% accurate, but there are 1300 products and I was hoping to get at least some of them categorised automatically.

Using algorithms to analyse data is a completely new experience for me, so a thorough explanation would be greatly appreciated. Thanks.
jazz__man
Asked:
1 Solution
 
Juan Ocasio Commented:
You can use LIKE '%product%' in your WHERE clause.

For example:

SELECT * FROM table WHERE Name LIKE '%product%'
OR CommonName LIKE '%product%'
OR Description LIKE '%product%'

This may take a LONG time depending on how much data is in your db, so you can also run the searches separately:

SELECT * FROM table WHERE Name LIKE '%product%'

SELECT * FROM table WHERE CommonName LIKE '%product%'

SELECT * FROM table WHERE Description LIKE '%product%'

The product you're looking for goes between the two % signs, in place of the word product.
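To avoid retyping the search word in three places, the three conditions can share one variable. This is only a sketch: the table name Products is an assumption (the question never names the table), and the column names are taken from the list in the question.

```sql
-- Assumption: the table is called Products; the question does not name it.
DECLARE @term nvarchar(100) = N'product';  -- the word to search for

SELECT ID, Name, CommonName, Description
FROM Products
WHERE Name        LIKE N'%' + @term + N'%'
   OR CommonName  LIKE N'%' + @term + N'%'
   OR Description LIKE N'%' + @term + N'%';
```

Whether 'product' also matches 'Product' depends on the collation of the columns.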
 
Qlemo (C++ Developer) Commented:
Something like:
update tbl
set Categorisation = case
  when Name like '%Something%' and CommonName like '%Something%' and description like '%Something%' then 'Something'
  when Name ...
end


You can also write individual update commands for each pattern combination, and put the condition into the where clause. It is very similar to above.
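As a rough illustration of the one-update-per-pattern variant, it could look like the following. The table name tbl follows the example above; the categories 'Fish' and 'Chips' and their patterns are invented purely for illustration.

```sql
-- Assumptions: table tbl as in the CASE example; 'Fish' and 'Chips' and the
-- patterns used to detect them are hypothetical.
UPDATE tbl
SET Categorisation = 'Fish'
WHERE Name LIKE '%fish%'
   OR CommonName LIKE '%fish%'
   OR Description LIKE '%fish%';

UPDATE tbl
SET Categorisation = 'Chips'
WHERE Name LIKE '%chip%'
   OR CommonName LIKE '%chip%'
   OR Description LIKE '%chip%';
```

Note that, written this way, a row matching both patterns ends up with the category of whichever statement runs last.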
 
jazz__man (Author) Commented:
Qlemo,

Thanks very much, this is very helpful, but I would have to write hundreds of case conditions to run this. This is why I was looking for something a bit more mathematical, like an algorithm that does some of the pattern matching for me. There are 900 distinct products in a database of 1300. If there were only 100 distinct products among the 1300 then life would be much easier, but I'm not in that position. Hope this makes sense. Thanks
 
Qlemo (C++ Developer) Commented:
To develop something smart we would have to know the details. SQL cannot guess, so you have to tell it something concrete.
You could e.g. build it stepwise:
update tbl set Categorisation = 'Something' where Name like '[A-F]%' and Categorisation is null;
update tbl set Categorisation = 'SomethingElse' where CommonName like '_[0-9]%' and Categorisation is null;


and so on. That will only update rows not already categorized. You should go for the most common categories first, and then treat the more special cases.
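When working stepwise like this, it helps to check after each round how many rows are still uncategorised. A small sketch, assuming the same table tbl with the columns listed in the question:

```sql
-- Rows per category so far; rows not yet categorised are grouped under NULL.
SELECT Categorisation, COUNT(*) AS ProductCount
FROM tbl
GROUP BY Categorisation
ORDER BY ProductCount DESC;

-- The rows that no rule has matched yet, to help decide on the next pattern.
SELECT ID, Name, CommonName, Description
FROM tbl
WHERE Categorisation IS NULL;
```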
 
PortletPaul Commented:
>>Now, I need to add another column called, 'Categorisation'

Don't you need another TABLE? e.g.
ProductCategory
ID, Product_ID (FK), Category

so for one row in the existing table you may have one or more rows in the new table

then your queries would be far easier and more efficient

select
*
from products p
inner join ProductCategory pc on p.id = pc.product_id
where pc.category ='fish'
or pc.category = 'chips'


+ You could take this a step further and have a table of Categories, so that only allowed categories can be added to a product.
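The two tables described so far might be declared roughly like this. It is a sketch under assumptions: the existing table is taken to be called Products with an int primary key ID, and the column sizes and constraint name are guesses.

```sql
-- Assumption: Products(ID int PRIMARY KEY, ...) already exists.
CREATE TABLE Categories (
    Category varchar(50) NOT NULL PRIMARY KEY   -- the list of allowed categories
);

CREATE TABLE ProductCategory (
    ID         int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Product_ID int         NOT NULL REFERENCES Products (ID),
    Category   varchar(50) NOT NULL REFERENCES Categories (Category),
    -- a product can have several categories, but each one only once
    CONSTRAINT UQ_ProductCategory UNIQUE (Product_ID, Category)
);
```

The foreign key on Category is what restricts products to categories that exist in the Categories table.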

Sorry, but I just hate the idea of you evaluating strings (like '%something%') in each and every select statement; it will be a nightmare.

Normalization makes life easier in the long run.

If you go down this route, what you will need to generate would be a set of insert statements to the new table, and this could leverage the table of categories.

let's say you have a table of Categories like this

Fish
Chips
Salt
Vinegar

then you could do something along these lines (to generate the insert data, not as a daily activity)

select
p.*,
c.category
from products p
inner join categories c on p.description like ('%' + c.category + '%')
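Turning that select into the one-off load of the new table could look something like this. It assumes the tables Products, Categories and ProductCategory (Product_ID, Category) exist as described earlier in this comment, and that ProductCategory.ID is generated automatically; matching on Name and CommonName as well as Description is an extension based on the columns in the question.

```sql
-- One-off load: one ProductCategory row per (product, category) pair whose
-- category word appears in any of the three text columns.
INSERT INTO ProductCategory (Product_ID, Category)
SELECT p.ID, c.Category
FROM Products p
INNER JOIN Categories c
    ON p.Name        LIKE '%' + c.Category + '%'
    OR p.CommonName  LIKE '%' + c.Category + '%'
    OR p.Description LIKE '%' + c.Category + '%'
WHERE NOT EXISTS (          -- skip pairs already present, so it can be re-run
    SELECT 1
    FROM ProductCategory pc
    WHERE pc.Product_ID = p.ID
      AND pc.Category   = c.Category
);
```

Adding a new word to Categories and re-running the statement then picks up the additional matches without any new CASE conditions. Whether 'fish' matches 'Fish' depends on the column collation.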
 
jazz__manAuthor Commented:
PortletPaul,

You are making me hungry!
 
PortletPaulCommented:
LOL, maybe that's what's on my mind... [note to self, eat]