Solved

T-SQL index over computer column in a table

Posted on 2014-10-23
13
186 Views
Last Modified: 2014-10-23
Dear experts,

I have a SQL 2008 R2 DB and in my table I have a field DocumentNumber (Num) bigint. I want to provide my C# app to search by any part of this number, therefore I should search in it in nvarchar (with LIKE %234). My question is – which is the classis way:

   1.      To create a separated fields and on Add record to cast the bigint to nvarchar and to index on it. Because this is a official document, it will be very rarely modified / deleted.
  2. To create computed field (      [NumS]  AS (CONVERT([nvarchar],[Num],0)), and to index on it. Should I mark it Is persisted or not?  I think this will be the better way, if the SQL use this column only to create index, while in the point 1 this separated column will take HDD space.

3.      I prefer to use nvarchar, not varchar because if in the feature the document number included letters as well. I know nvarchar get twice more space because of Unicode support, but what about search of index? I mean – a index over nvarchar is equal fast as index over varchar, right ? If there is big differences, may be I'll use varchar.

 
 The more likely usage of search will be the user to search with last 2 or 3 numbers of the document, so the index will be used in real. I expect not more than 50 000 records in that table for few years, but want to do it right, next time it could be 5 000 000 records.
0
Comment
Question by:dvplayltd
  • 5
  • 4
  • 2
  • +1
13 Comments
 
LVL 32

Expert Comment

by:Stefan Hoffmann
ID: 40398787
When you want to search for any part, then you need to convert it to VARCHAR() and use a full-text index using the CONTAINS() predicate.

But: Searching only for part makes normally only sense, when there is a structure in the number. But in this case you should split the number in appropriate parts and store those.

And the problem with LIKE is: only searching for 'text%' is supported by indices.
0
 

Author Comment

by:dvplayltd
ID: 40398798
To ste5an

The data is actually a number of document, let say 10356. It will be up to user will seach for example for 356 in the end (Like %356), in the start or at any part .

I'm sure I do not need full text index search, it look to me like overwork for that purpose.

Please tell me I should do option 1 or option 2 is the same result in speed of searhing ?
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40398985
Why you just not change the column data type from bigint to VARCHAR? Then create an index on that field so your queries run faster.
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 40398996
>>"I prefer to use nvarchar, not varchar because if in the feature the document number included letters"
so make this column nvarchar

if you do wish to persist with the bigint column, then also use a computed column (persisted) that is the nvarchar conversion of the existing column.

BUT: as already mentioned searching for "any part" isn't solved by a standard index
0
 

Author Comment

by:dvplayltd
ID: 40399088
To Vitor Montalvão

"Why you just not change the column data type from bigint to VARCHAR?"
Well, it is a option ..initially it is bigint because when user add a new document I read last number + 1. In that moment I can convert nvarchar to bigint and find the biggest one ... but this will read ALL 50 000 records, right?

With such number of documents (50 000) any option will be OK. My question is primary to do it in right way, also to learn more about this.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40399117
If you index the column you can easily find the last record ( MAX(DocumentNumber) ). Problem might be if you want to use letters but then I would go for a solution with 2 columns, one bigint for DocumentNumber and another one a VARCHAR column for DocumentLetter and like that you can even have sequencial document numbers by letters:
SELECT DocumentLetter, MAX(DocumentNumber)
FROM Document
GROUP BY DocumentLetter

Open in new window

0
Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 

Author Comment

by:dvplayltd
ID: 40399159
That is my idea .. to use 2 columns. The question is- WHAT to be second column - separated or  computed and how index on it will be affected. After few post we are go again to my init question ... when anyone start answer please focus on that.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40399168
My idea for a 2nd column it's only if you need to join letters and numbers and to help you getting the next document number quickly. You idea for a 2nd column is to use a computed column but then how you going to deal with a document number that has letters?
0
 

Author Comment

by:dvplayltd
ID: 40399197
Vitor, i doublt you want I to explain you the whole logic.I'm closing this question, it goes to non useful direction ... the question is how index work on computed columns, that is all.
0
 

Author Comment

by:dvplayltd
ID: 40399230
I've requested that this question be closed as follows:

Accepted answer: 0 points for dvplayltd's comment #a40399159

for the following reason:

the answers goes too far from the init question ...
0
 
LVL 32

Expert Comment

by:Stefan Hoffmann
ID: 40399207
The only solution I can imagine is pretty ugly. REALLY. Don't read ahead...

Create a persited computed column for each digit (0-9) and index these. Then you can query for numbers containing the digits of your pattern. But I'm not sure without tests, if this will give any performance boost.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40399208
I'm just giving you other options but PortletPaul was the one that gave you an answer for your question. See ID: 40398996
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40399231
I don't believe there is reason to close this question.
The question does not mention computed column (until I suggested it) but somehow it became a matter of how to index that computed column. That means you accept the suggestion to use a computed column.

Computed Columns


For the index on the computed column:
Simply created an index on the computed column. The problem is that a simple non-clustered index (or even a clustered index) isn't suited to an "any part" search.

see: Indexes on Computed Columns
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

In this article I will describe the Detach & Attach method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
Via a live example, show how to setup several different housekeeping processes for a SQL Server.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now