Link to home
Start Free TrialLog in
Avatar of dvplayltd
dvplayltdFlag for Bulgaria

asked on

T-SQL index over computer column in a table

Dear experts,

I have a SQL 2008 R2 DB and in my table I have a field DocumentNumber (Num) bigint. I want to provide my C# app to search by any part of this number, therefore I should search in it in nvarchar (with LIKE %234). My question is – which is the classis way:

   1.      To create a separated fields and on Add record to cast the bigint to nvarchar and to index on it. Because this is a official document, it will be very rarely modified / deleted.
  2. To create computed field (      [NumS]  AS (CONVERT([nvarchar],[Num],0)), and to index on it. Should I mark it Is persisted or not?  I think this will be the better way, if the SQL use this column only to create index, while in the point 1 this separated column will take HDD space.

3.      I prefer to use nvarchar, not varchar because if in the feature the document number included letters as well. I know nvarchar get twice more space because of Unicode support, but what about search of index? I mean – a index over nvarchar is equal fast as index over varchar, right ? If there is big differences, may be I'll use varchar.

 
 The more likely usage of search will be the user to search with last 2 or 3 numbers of the document, so the index will be used in real. I expect not more than 50 000 records in that table for few years, but want to do it right, next time it could be 5 000 000 records.
Avatar of ste5an
ste5an
Flag of Germany image

When you want to search for any part, then you need to convert it to VARCHAR() and use a full-text index using the CONTAINS() predicate.

But: Searching only for part makes normally only sense, when there is a structure in the number. But in this case you should split the number in appropriate parts and store those.

And the problem with LIKE is: only searching for 'text%' is supported by indices.
Avatar of dvplayltd

ASKER

To ste5an

The data is actually a number of document, let say 10356. It will be up to user will seach for example for 356 in the end (Like %356), in the start or at any part .

I'm sure I do not need full text index search, it look to me like overwork for that purpose.

Please tell me I should do option 1 or option 2 is the same result in speed of searhing ?
Why you just not change the column data type from bigint to VARCHAR? Then create an index on that field so your queries run faster.
ASKER CERTIFIED SOLUTION
Avatar of PortletPaul
PortletPaul
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
To Vitor Montalvão

"Why you just not change the column data type from bigint to VARCHAR?"
Well, it is a option ..initially it is bigint because when user add a new document I read last number + 1. In that moment I can convert nvarchar to bigint and find the biggest one ... but this will read ALL 50 000 records, right?

With such number of documents (50 000) any option will be OK. My question is primary to do it in right way, also to learn more about this.
If you index the column you can easily find the last record ( MAX(DocumentNumber) ). Problem might be if you want to use letters but then I would go for a solution with 2 columns, one bigint for DocumentNumber and another one a VARCHAR column for DocumentLetter and like that you can even have sequencial document numbers by letters:
SELECT DocumentLetter, MAX(DocumentNumber)
FROM Document
GROUP BY DocumentLetter

Open in new window

That is my idea .. to use 2 columns. The question is- WHAT to be second column - separated or  computed and how index on it will be affected. After few post we are go again to my init question ... when anyone start answer please focus on that.
My idea for a 2nd column it's only if you need to join letters and numbers and to help you getting the next document number quickly. You idea for a 2nd column is to use a computed column but then how you going to deal with a document number that has letters?
Vitor, i doublt you want I to explain you the whole logic.I'm closing this question, it goes to non useful direction ... the question is how index work on computed columns, that is all.
I've requested that this question be closed as follows:

Accepted answer: 0 points for dvplayltd's comment #a40399159

for the following reason:

the answers goes too far from the init question ...
The only solution I can imagine is pretty ugly. REALLY. Don't read ahead...

Create a persited computed column for each digit (0-9) and index these. Then you can query for numbers containing the digits of your pattern. But I'm not sure without tests, if this will give any performance boost.
I'm just giving you other options but PortletPaul was the one that gave you an answer for your question. See ID: 40398996
I don't believe there is reason to close this question.
The question does not mention computed column (until I suggested it) but somehow it became a matter of how to index that computed column. That means you accept the suggestion to use a computed column.

Computed Columns


For the index on the computed column:
Simply created an index on the computed column. The problem is that a simple non-clustered index (or even a clustered index) isn't suited to an "any part" search.

see: Indexes on Computed Columns