[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 606
  • Last Modified:

How can I calculate the amount of disk space needed for a SQL 2005 full text index?

I have a table with 1626078 rows and a column of VARCHAR (MAX). I would like to take a guess and the amount of space I need to ask for when creating the full text index on the VARCHAR (MAX) column.

I did (8002 * 1626078) and it looks like my FTI will take approx 12.5GB. I got the 8002 value from 8000 characters + 2 bytes for the termination. Then multiplied that by the amount of rows in my table.

I am using SQL 2005 so if I am not mistaken the FTI goes in a separate file.

Did I miss anything? Am I heading into disaster?

Thanks experts!
0
sqlagent007
Asked:
sqlagent007
  • 2
2 Solutions
 
Rich WeisslerProfessional Troublemaker^h^h^h^h^hshooterCommented:
A few thoughts:
First, the maximum amount that could be stored in VARCHAR(MAX) is 2 GB, not a page worth (little less than 8k).
Second, the Full Text Index, in 2005 does go to a separate directory... but that isn't a bad thing.

You can grab a random sample of the table (select top (n) percent [...] order by <something randomish>), put the table on a test box, and perform to a full text index on that.  From that, you can establish a rough estimate.  With the information provided in your question, the estimate for the FTI isn't possible.
0
 
Scott PletcherSenior DBACommented:
I would use the actual length of your data, not assume 8000 bytes per row (which could be (way) too low or (way) too high, depending on your specific data).

SELECT SUM(CAST(LEN(varchar_max_column) AS bigint))
FROM dbo.tablename

That will typically be an *over*estimate, since noise words are eliminated, and duplicates words are not physically fully repeated in the index.
0
 
Rich WeisslerProfessional Troublemaker^h^h^h^h^hshooterCommented:
Also remember that the full text index will also be compressed.  ("Rather than construct a B-tree structure based on a value stored in a particular row, MSFTESQL builds an inverted, stacked, compressed index structure based on individual tokens from the text being indexed.")

You can proceed with an estimate of the space consumed by the one column, but use that as an extreme upper limit rather than a realistic estimate of the space which will be consumed by the full text catalog.
0
 
sqlagent007Author Commented:
Thanks experts! That did the trick, the index was very compressed and only took up 50mb.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now