ondrejko1
asked on
Subquery in select statement
i have a query that is "kind-of" working properly. Keep in mind, I am not a developer, but trying to learn on my own so forgive me if I do not understand where or why my query is not working.
Here is my code and results set in the attached screenshot.
I have a subquery in the select that I want to retrieve a random stock photography image as well as their own image, which you will see screenshot that contains nulls. The problem is the subquery to pull the stock image records pulls images of the correct type, but it pulls all the same record for all.
Now, before we get to the code and screenshot, I believe i have another subquery select statement that i tested which gives me random records, but my problem is that I do not know how to integrate it into the query below as i keep getting funky SQL errors.
Query 2 Screenshot goes with this query
SELECT top 5 'AAA', (SELECT top 1 [tb_stock_media].[str_file name] as [stock_str_filename]
order by newid()) [stock_str_filename]
from [tb_stock_media]
WHERE [tb_stock_media].[f_subtyp e1_id] = 315 ---which this should be something like = [tb_breeders].[f_subtype1_ id]
order by newid()
Query displaying all correct columns, but as you will see in the attached screenshot that [stock_str_filename] is the same for all.
Query 1 Screenshot goes with this query
query2.jpg
what-record-set-looks-like.jpg
Here is my code and results set in the attached screenshot.
I have a subquery in the select that I want to retrieve a random stock photography image as well as their own image, which you will see screenshot that contains nulls. The problem is the subquery to pull the stock image records pulls images of the correct type, but it pulls all the same record for all.
Now, before we get to the code and screenshot, I believe i have another subquery select statement that i tested which gives me random records, but my problem is that I do not know how to integrate it into the query below as i keep getting funky SQL errors.
Query 2 Screenshot goes with this query
SELECT top 5 'AAA', (SELECT top 1 [tb_stock_media].[str_file
order by newid()) [stock_str_filename]
from [tb_stock_media]
WHERE [tb_stock_media].[f_subtyp
order by newid()
Query displaying all correct columns, but as you will see in the attached screenshot that [stock_str_filename] is the same for all.
Query 1 Screenshot goes with this query
SELECT
TOP 20
NEWID() AS [RandomTemporaryMemberGUID],
[tb_members].[str_first_name],
[tb_members].[str_last_name],
[tb_members].[p_member_id],
[tb_breeders].[p_breeder_id],
[tb_breeders].[str_facility_name],
[tb_breeders].[int_listing_hits],
[tb_breeders].[f_type_id],
[tb_breeders].[f_subtype1_id],
[ref_listings_subtype1].[str_subtype1],
[ref_listings_city_zipcodes].[str_preferredcityname],
[ref_listings_state_provinces].[str_statecode],
--Begin retrieval of stock photo
(select top 1
[tb_stock_media].[str_filename ]
From [dbo].[tb_stock_media] AS [tb_stock_media] WITH (NOLOCK)
where [tb_breeders].[f_subtype1_id] = [tb_stock_media].[f_subtype1_id]
order by newid()
) as [stock_str_filename],
-----END retrieval of stock photo
(SELECT TOP 1
[tb_breeder_gallery].[str_filename]
FROM [dbo].[tb_breeder_gallery] AS [tb_breeder_gallery] WITH (NOLOCK)
WHERE [tb_breeders].[f_member_id] = [tb_breeder_gallery].[f_member_id]
ORDER BY NEWID()
) AS [str_filename]
FROM [dbo].[tb_members] AS [tb_members] WITH (NOLOCK)
INNER JOIN [dbo].[tb_breeders] AS [tb_breeders] WITH (NOLOCK)
ON [tb_members].[p_member_id] = [tb_breeders].[f_member_id]
INNER JOIN ref_listings_subtype1 WITH (NOLOCK)
ON tb_breeders.f_subtype1_id = ref_listings_subtype1.p_subtype1_id
INNER JOIN [dbo].[ref_listings_city_zipcodes] AS [ref_listings_city_zipcodes] WITH (NOLOCK)
ON [tb_breeders].[f_zipcode_id] = [ref_listings_city_zipcodes].[p_zipcode_id]
INNER JOIN [dbo].[ref_listings_state_provinces] AS [ref_listings_state_provinces] WITH (NOLOCK)
ON [ref_listings_city_zipcodes].[f_state_province_id] = [ref_listings_state_provinces].[p_state_province_id]
where 1=1
and [tb_breeders].[f_subtype1_id] = 517
order by 1
query1.jpgquery2.jpg
what-record-set-looks-like.jpg
ASKER
Ok, that makes sense. But as you see in my second individual query such as this one, it does randomize them. So, how do i integrate below into the other query?
SELECT top 5 'AAA', (SELECT top 1 [tb_stock_media].[str_filename] as [stock_str_filename]
order by newid()) [stock_str_filename]
from [tb_stock_media]
WHERE [tb_stock_media].[f_subtype1_id] = 315 ---which this should be something like = [tb_breeders].[f_subtype1_id]
order by newid()
ASKER
Let me clarify, i have hundreds of different random images for the column stock_str_filename, of course i want the same type of image, which is why i specify the same subtypeid. Not sure if that helps or makes more sense.
To troubleshoot the query, reduce the number of fields until you get it to work then re-include other fields. You may confine yourself within few fields and 2 tables. Doing so, you may find the solution yourself.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
vadimrapp1,
Thank you, that worked, but you were correct. It did make it much slower. It used to execute in 0.0, but now takes 2 - 4 seconds. Are there strategies to speed this up?
Thank you, that worked, but you were correct. It did make it much slower. It used to execute in 0.0, but now takes 2 - 4 seconds. Are there strategies to speed this up?
You can't speed it up by definition, as long as you choose to pick the random one by sorting them all - you must sort them all for every row, and it can't be fast.
I would simply write a stored procedure like this: assuming that the random images are numbered by column "id":
minnumber = min(id)
maxnumber = max(id)
randomnumber = rand() * (maxnumber-minnumber) + minnumber
select image where id = randomnumber
(the above is the algorithm, not in any particular language)
This does not even have to be in the sql. You probably have some client portion of your application, like asp.net or something, so if you need N random images, you calculate N random numbers in the client, retrieve those images, and attach one in every row.
I would simply write a stored procedure like this: assuming that the random images are numbered by column "id":
minnumber = min(id)
maxnumber = max(id)
randomnumber = rand() * (maxnumber-minnumber) + minnumber
select image where id = randomnumber
(the above is the algorithm, not in any particular language)
This does not even have to be in the sql. You probably have some client portion of your application, like asp.net or something, so if you need N random images, you calculate N random numbers in the client, retrieve those images, and attach one in every row.
Please post the reduced form of your query and I will see what I can do.
ASKER
This is the new revised query with the above solution implemented. As i said, it is working but takes between 4 - 6 seconds now versus subzero seconds.
SELECT
TOP 10
NEWID() AS [RandomTemporaryMemberGUID],
[tb_members].[str_first_name],
[tb_members].[str_last_name],
[tb_members].[p_member_id],
[tb_breeders].[p_breeder_id],
[tb_breeders].[str_facility_name],
[tb_breeders].[int_listing_hits],
[tb_breeders].[f_type_id],
[tb_breeders].[f_subtype1_id],
[ref_listings_subtype1].[str_subtype1],
[ref_listings_city_zipcodes].[str_preferredcityname],
[ref_listings_state_provinces].[str_statecode],
(select top 1
[tb_stock_media].[str_filename ]
From [dbo].[tb_stock_media] AS [tb_stock_media] WITH (NOLOCK)
where [tb_breeders].[f_subtype1_id] = [tb_stock_media].[f_subtype1_id]
and [tb_stock_media].[is_ad_photo] = 0
order by newid(), LEN([tb_members].[str_first_name])
) as [stock_str_filename],
(SELECT TOP 1
[tb_breeder_gallery].[str_filename]
FROM [dbo].[tb_breeder_gallery] AS [tb_breeder_gallery] WITH (NOLOCK)
WHERE [tb_breeders].[f_member_id] = [tb_breeder_gallery].[f_member_id]
ORDER BY NEWID()
) AS [str_filename]
FROM [dbo].[tb_members] AS [tb_members] WITH (NOLOCK)
INNER JOIN [dbo].[tb_breeders] AS [tb_breeders] WITH (NOLOCK)
ON [tb_members].[p_member_id] = [tb_breeders].[f_member_id]
INNER JOIN ref_listings_subtype1 WITH (NOLOCK)
ON tb_breeders.f_subtype1_id = ref_listings_subtype1.p_subtype1_id
INNER JOIN [dbo].[ref_listings_city_zipcodes] AS [ref_listings_city_zipcodes] WITH (NOLOCK)
ON [tb_breeders].[f_zipcode_id] = [ref_listings_city_zipcodes].[p_zipcode_id]
INNER JOIN [dbo].[ref_listings_state_provinces] AS [ref_listings_state_provinces] WITH (NOLOCK)
ON [ref_listings_city_zipcodes].[f_state_province_id] = [ref_listings_state_provinces].[p_state_province_id]
where 1=1
and [tb_breeders].[f_subtype1_id] = 318
order by 1
Actually more than revised I asked for a reduced query. In other words one that just included the problem sub-query, so that if you comment it out, all is fine.
Having said that I am puzzled by your WHERE clause:
[tb_breeders].[f_subtype1_ id] = 318
If that is for real, why bother with the following condition in your subquery:
where [tb_breeders].[f_subtype1_ id] = [tb_stock_media].[f_subtyp e1_id]
Why not just write:
where [tb_stock_media].[f_subtyp e1_id] = 318
Having said that I am puzzled by your WHERE clause:
[tb_breeders].[f_subtype1_
If that is for real, why bother with the following condition in your subquery:
where [tb_breeders].[f_subtype1_
Why not just write:
where [tb_stock_media].[f_subtyp
ASKER
We could, but because that is a variable that is being passed from the URL, i just joined on that. the subquery that is the problem is this. I need it to pull a random image for each of the records. Does that make sense if you look at the screenshots?
(select top 1
[tb_stock_media].[str_filename ]
From [dbo].[tb_stock_media] AS [tb_stock_media] WITH (NOLOCK)
where [tb_breeders].[f_subtype1_id] = [tb_stock_media].[f_subtype1_id]
and [tb_stock_media].[is_ad_photo] = 0
order by newid(), LEN([tb_members].[str_first_name])
) as [stock_str_filename],
We could, but because that is a variable that is being passed from the URL, i just joined on that
And that is fine, but surely it would be more efficient to do this:
From [dbo].[tb_stock_media] AS [tb_stock_media] WITH (NOLOCK)
where [tb_stock_media].[f_subtyp e1_id] = 318 -- Or use a parameter as in @f_subtype1_id
and [tb_stock_media].[is_ad_ph oto] = 0
Or what am I missing?
And that is fine, but surely it would be more efficient to do this:
From [dbo].[tb_stock_media] AS [tb_stock_media] WITH (NOLOCK)
where [tb_stock_media].[f_subtyp
and [tb_stock_media].[is_ad_ph
Or what am I missing?
ASKER
that absolutely works, i can do
where @Variable = [tb_stock_media].[f_subtyp e1_id]
i just had it = the other column.. so, i guess the question is, how do i make the following more efficient
(select top 1
[tb_stock_media].[str_file name ]
From [dbo].[tb_stock_media] AS [tb_stock_media] WITH (NOLOCK)
where @variable = [tb_stock_media].[f_subtyp e1_id]
and [tb_stock_media].[is_ad_ph oto] = 0
order by newid(), LEN([tb_members].[str_firs t_name])
) as [stock_str_filename],
where @Variable = [tb_stock_media].[f_subtyp
i just had it = the other column.. so, i guess the question is, how do i make the following more efficient
(select top 1
[tb_stock_media].[str_file
From [dbo].[tb_stock_media] AS [tb_stock_media] WITH (NOLOCK)
where @variable = [tb_stock_media].[f_subtyp
and [tb_stock_media].[is_ad_ph
order by newid(), LEN([tb_members].[str_firs
) as [stock_str_filename],
You can't make it more efficient. If you choose to calculate 1,000 random numbers by sorting out 1,000 rows 1,000 times only to pick one at the top, you can't expect anything but million sorts. As I said before, there are many much more sane ways to do it.
ASKER
ok. Thank you for you help.
so, i guess the question is, how do i make the following more efficient
Now that we have clarified that, you have some options:
1. You can place the results of that sub query into a well indexed temp table together with the NEWID() and then JOIN the main query to the temp table.
2. You can change the sub query to a derived table.
3. Use a CTE.
But it looks like you are satisfied with the results after making the change I suggested. Incidentally, make sure all your tables are appropriately indexed. This may explain the sub-par performance.
Now that we have clarified that, you have some options:
1. You can place the results of that sub query into a well indexed temp table together with the NEWID() and then JOIN the main query to the temp table.
2. You can change the sub query to a derived table.
3. Use a CTE.
But it looks like you are satisfied with the results after making the change I suggested. Incidentally, make sure all your tables are appropriately indexed. This may explain the sub-par performance.
That is to be expected: You are JOINing on [tb_breeders].[f_subtype1_