[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

Variable in TABLESAMPLE Clause

Posted on 2011-10-12
6
Medium Priority
?
407 Views
Last Modified: 2012-05-12
Hi

I am trying to base the size of my table sample based on the number of records in the table.  In this example, if my record count is between 2 and 8, then I want my sample size to be 2 records from the all of the records in the table.  If the record count is between 9 and 15, then the sample size is 3.  The error message I get is "Variables are not allowed in the TABLESAMPLE or REPEATABLE clauses"

CREATE PROCEDURE p_CashSample
      @RecordCount Int = NULL OUTPUT      
AS
BEGIN

SET @RecordCount = (SELECT COUNT(CashID) FROM PopulationBondsCash)

IF (@RecordCount BETWEEN 2 AND 8)
      SELECT TOP 2 * FROM PopulationBondsCash TABLESAMPLE (@RecordCount ROWS)
ELSE IF (@RecordCount BETWEEN 9 AND 15)
      SELECT TOP 3 * FROM PopulationBondsCash TABLESAMPLE (@RecordCount ROWS)
ELSE IF (@RecordCount BETWEEN 16 AND 25)
      SELECT TOP 5 * FROM PopulationBondsCash TABLESAMPLE (@RecordCount ROWS)
END
GO
0
Comment
Question by:evanburen
6 Comments
 
LVL 21

Assisted Solution

by:JestersGrind
JestersGrind earned 1000 total points
ID: 36958869
You will probably have to use dynamic SQL like this.

Greg


CREATE PROCEDURE p_CashSample
      @RecordCount Int = NULL OUTPUT      
AS
BEGIN

DECLARE @SQL VARCHAR(MAX),
		@Top VARCHAR(5)

SET @RecordCount = (SELECT COUNT(CashID) FROM PopulationBondsCash)

IF (@RecordCount BETWEEN 2 AND 8)
      SET @Top = '2'
ELSE IF (@RecordCount BETWEEN 9 AND 15)
      SET @Top = '3'
ELSE IF (@RecordCount BETWEEN 16 AND 25)
      SET @Top = '5'

SET @SQL = 'SELECT TOP ' + @Top + ' * FROM PopulationBondsCash TABLESAMPLE (' + CONVERT(VARCHAR(10), @RecordCount) + ' ROWS)'

EXECUTE sp_executesql @SQL

END
GO

Open in new window

0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 1000 total points
ID: 36959792
Or you could scrap the whole TABLESAMPLE approach and use the equivalent:
CREATE PROCEDURE p_CashSample
    @RecordCount Int = NULL OUTPUT
AS 
SELECT  @RecordCount = COUNT(*)
FROM    PopulationBondsCash

SELECT TOP (CASE WHEN @RecordCount BETWEEN 2 AND 8 THEN 2
                 WHEN @RecordCount BETWEEN 9 AND 15 THEN 3
                 WHEN @RecordCount BETWEEN 16 AND 25 THEN 5
                 ELSE 6
            END)
        *
FROM    PopulationBondsCash
ORDER BY NEWID()

Open in new window

0
 
LVL 2

Expert Comment

by:kamalranjan
ID: 36960961
Hi evanburen,

I think you are using 2 techniques simultaneously to narrow down the resultset. Since you are already using TOP keyword with values like 2,3,5 etc, you do not need to use TABLESAMPLE keyword as it would also do the same as TOP keyword. So simply remove the TABLESAMPLE clause and your results will still be good as you want them to narrow down to.

Regards
Kamal
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Closing Comment

by:evanburen
ID: 36961519
I think using the CASE statement with the NEWID() function is easier to read which is the one I'm going with although using dynamic SQL will also work.   Using TOP instead of TABLESAMPLE or NEWID() does not give you a random sample of records. It just gives you the first x records.
0
 
LVL 21

Expert Comment

by:JestersGrind
ID: 36968045
@acperkins

Ordering by NEWID() is brilliant!  Thank you for posting that.  I learned something new.  

Greg

0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 36970313
Greg,

Thanks.  As it turns out I learned something too, as I had to read up on TABLESAMPLE clause.  I am not sure if it was just me, but I was less than impressed by the results produced by it and in fact SQL Server BOL seemed to agree with me:

TABLESAMPLE SYSTEM returns an approximate percentage of rows and generates a random value for each physical 8-KB page in the table. Based on the random value for a page and the percentage specified in the query, a page is either included in the sample or excluded. Each page that is included returns all rows in the sample result set. For example, if you specify TABLESAMPLE SYSTEM 10 PERCENT, SQL Server returns all the rows on approximately 10 percent of the specified data pages of the table. If the rows are evenly distributed on the pages of the table, and if there is a sufficient number of pages in the table, the number of rows returned should approximate the sample size that is requested. However, because the random value that is generated for each page is independent of the values that are generated for any other page, a larger, or smaller, percentage of pages than have been requested might be returned. The TOP(n) operator can be used to limit the number of rows to a specified maximum.
...
The TABLESAMPLE SYSTEM clause should be used with some caution, and with some understanding of some of the implications of using sampling. For example, a join of two tables is likely to return a match for each row in both tables; however, if TABLESAMPLE SYSTEM is specified for either of the two tables, some rows returned from the unsampled table are unlikely to have a matching row in the sampled table. This behavior might lead you to suspect that a data consistency problem exists in the underlying tables, when the data is actually valid. Similarly, if TABLESAMPLE SYSTEM is specified for both tables that are joined, the perceived problem could be even worse.


Anthony
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Composite queries are used to retrieve the results from joining multiple queries after applying any filters. UNION, INTERSECT, MINUS, and UNION ALL are some of the operators used to get certain desired results.​
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
this video summaries big data hadoop online training demo (http://onlineitguru.com/big-data-hadoop-online-training-placement.html) , and covers basics in big data hadoop .
With just a little bit of  SQL and VBA, many doors open to cool things like synchronize a list box to display data relevant to other information on a form.  If you have never written code or looked at an SQL statement before, no problem! ...  give i…

834 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question