• Status: Solved

Random sample of 100 records

How could I get a random sample of 100 records from a very large table?
Asked by: hrolsons
 
Jim Horn (Microsoft SQL Server Developer, Architect, and Author) commented:
Not sure about the 'very large table' part, but otherwise...

SELECT TOP 100 * FROM your_table
ORDER BY NEWID()
 
Jim Horn (Microsoft SQL Server Developer, Architect, and Author) commented:
Tell you what ... How about telling us the business problem that you're trying to tackle, and maybe we'll be able to come up with a better solution.
 
ProjectChampion commented:
Since 2008 R2, SQL Server has a built-in feature for this purpose, i.e. TABLESAMPLE. For instance:

USE AdventureWorks2008R2 ;
GO
SELECT FirstName, LastName
FROM Person.Person
TABLESAMPLE (10 PERCENT) ;
 
hrolsons (author) commented:
@Jim Horn - I've hired someone to edit photographs for me and I want them to edit a random sample of my whole collection to see how they do.  I didn't just want to send him 100 of the same track meet.

@ProjectChampion - How do you apply TABLESAMPLE to a fixed number of rows, like 100?
 
Anthony Perkins commented:
TABLESAMPLE was introduced with SQL Server 2005 and the syntax is:
TABLESAMPLE [SYSTEM] (sample_number [ PERCENT | ROWS ] )

So in your case:
TABLESAMPLE (100 ROWS)
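
For context, the clause goes directly after the table name in the FROM clause. A minimal sketch, reusing the Person.Person table from the AdventureWorks example earlier in the thread (the column list is only illustrative):

```sql
-- Ask for roughly 100 rows. SQL Server converts the row count into a
-- percentage of the table's pages, so the number of rows that comes back
-- is an approximation and can differ from run to run.
SELECT FirstName, LastName
FROM Person.Person
TABLESAMPLE (100 ROWS);
```

Because the sampling works on whole pages, a small table can return noticeably more or fewer rows than requested.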
 
Anthony Perkins commented:
Having said that, TABLESAMPLE is approximate, so if you want exactly 100 rows you would be better off with Jim's solution.
 
Anthony Perkins commented:
And on second thought, after doing some testing with TABLESAMPLE (perhaps I should have done that in the first place), the results are not very random at all (which I believe is akin to saying that someone is not very pregnant :) ).

In fact, SQL Server's Books Online (BOL) states:
The sample does not have to be a truly random sample at the level of individual rows.
...
If you really want a random sample of individual rows, modify your query to filter out rows randomly, instead of using TABLESAMPLE. For example, the following query uses the NEWID function to return approximately one percent of the rows of the Sales.SalesOrderDetail table:
...
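
The query itself is elided in the quote above. As a rough sketch of the same idea (row-level random filtering with NEWID, not necessarily the documentation's exact query), something along these lines should keep about one percent of the rows. The `% 100 = 0` test is my own choice for illustration; SalesOrderID is a column of Sales.SalesOrderDetail in AdventureWorks.

```sql
-- Sketch only: keep each row with a probability of about 1 in 100,
-- decided independently per row rather than per page.
-- SalesOrderID is included in CHECKSUM so that NEWID() is evaluated once per row.
-- "& 0x7fffffff" clears the sign bit so the modulo result is never negative.
SELECT *
FROM Sales.SalesOrderDetail
WHERE (CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff) % 100 = 0;
```

To get exactly 100 rows from a very large table, the same filter could be combined with Jim's approach (TOP 100 with ORDER BY NEWID()), so that only the filtered rows need to be sorted. That assumes the filter leaves at least 100 rows.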