Solved

Random sample of 100 records

Posted on 2014-04-21
Last Modified: 2014-04-25
How could I get a random sample of 100 records from a very large table?
Question by:hrolsons
7 Comments
 
LVL 66

Accepted Solution

by:
Jim Horn earned 2000 total points
ID: 40013616
Not sure about the 'very large table' part, but otherwise...

SELECT TOP 100 * FROM your_table
ORDER BY NEWID()
0
 
LVL 66

Expert Comment

by:Jim Horn
ID: 40014972
Tell you what ... How about telling us the business problem that you're trying to tackle, and maybe we'll be able to come up with a better solution.
0
 
LVL 8

Expert Comment

by:ProjectChampion
ID: 40015214
Since 2008 R2, SQL Server has had a built-in feature for this purpose: TABLESAMPLE. For instance:

USE AdventureWorks2008R2 ;
GO
SELECT FirstName, LastName
FROM Person.Person
TABLESAMPLE (10 PERCENT) ;
0
 

Author Comment

by:hrolsons
ID: 40015520
@Jim Horn - I've hired someone to edit photographs for me and I want them to edit a random sample of my whole collection to see how they do.  I didn't just want to send him 100 of the same track meet.

@ProjectChampion - How do you apply TABLESAMPLE to a fixed number of rows, like 100?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 40015598
TABLESAMPLE was introduced with SQL Server 2005 and the syntax is:
TABLESAMPLE [SYSTEM] (sample_number [ PERCENT | ROWS ] )

So in your case:
TABLESAMPLE (100 ROWS)
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 40015612
Having said that, TABLESAMPLE is approximate, so if you want exactly 100 rows you would be better off with Jim's solution.
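
To make the contrast concrete, here is a minimal sketch of both forms as complete statements. The table name dbo.Photos is hypothetical (the thread never names the table), so substitute your own.

```sql
-- Hypothetical table name: dbo.Photos (an assumption, not from the thread).

-- Approximate: the row count is only a target, so this may return
-- more or fewer than 100 rows.
SELECT *
FROM dbo.Photos TABLESAMPLE (100 ROWS);

-- Exact: always returns 100 rows (if the table has that many),
-- at the cost of sorting every row by a freshly generated GUID.
SELECT TOP (100) *
FROM dbo.Photos
ORDER BY NEWID();
```

The second query is the same technique as the accepted solution. On a very large table the full sort is where the cost goes.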
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 40015632
And on second thoughts, and after doing some testing with TABLESAMPLE (perhaps I should have done that in the first place), the results are not very random at all (which I believe is akin to saying that someone is not very pregnant :) )

In fact SQL Server's BOL states:
The sample does not have to be a truly random sample at the level of individual rows.
...
If you really want a random sample of individual rows, modify your query to filter out rows randomly, instead of using TABLESAMPLE. For example, the following query uses the NEWID function to return approximately one percent of the rows of the Sales.SalesOrderDetail table:
...
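
The documentation's own query is elided above and is not reproduced here. As a rough sketch of the same idea applied to the original question, the following filters rows individually with NEWID and then trims the survivors to exactly 100. The table name dbo.Photos and the 1-in-100 ratio are assumptions chosen for illustration.

```sql
-- Hypothetical table name: dbo.Photos (an assumption, not from the thread).
-- Step 1 (WHERE): keep each row with a probability of about 1 in 100,
--   decided independently per row. The bitmask clears the sign bit so the
--   modulo is always taken on a non-negative integer.
-- Step 2 (TOP ... ORDER BY): shuffle the surviving rows and take 100.
SELECT TOP (100) *
FROM dbo.Photos
WHERE (CHECKSUM(NEWID()) & 2147483647) % 100 = 0
ORDER BY NEWID();
```

This still scans the whole table, but only about one percent of the rows reach the sort. It will return fewer than 100 rows if the filter leaves fewer than 100 survivors, so the table should hold well over 10,000 rows, or the ratio should be loosened.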
0
