Could you point a way to split a very long MSSQLServer query to make it retrieves just parts of the lines for each time?

Eduardo Fuerte
Eduardo Fuerte used Ask the Experts™
on
Hi Experts

Could you point a way to split a very long MSSQLServer query to make it retrieves just parts of the lines for each time?

I'm running a query that must retrieve 3.200.0000 lines, resulted from some group operations:

     SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome;

Open in new window


Is that a way to partitionate the results in subsets of lines and then assemble it ?

Thanks in advance.
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Did you try OFFSET and FETCH NEXT clauses?
     SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 100000 ROWS
FETCH NEXT 100000 ROWS ONLY;

Open in new window

You may use variables or even expresions for OFFSET and FETCH NEXT.
More info https://docs.microsoft.com/en-us/sql/t-sql/queries/select-order-by-clause-transact-sql
OTOH, you should answer the question why do you need to retrieve 3 mio rows? Who will read it? Shouldn't you consolidate them on the SQL Server?

One more thing: It seems your query is from some view. OFFSET and FETCH are not supported in indexed views or in a view that is defined by using the CHECK OPTION clause. You should also read all other notes on OFFSET and FETCH in the above link.
Eduardo FuerteDeveloper and Analyst

Author

Commented:
Hi Pcelba

We need to migrate all the medical patients from legate system to the new CRM system.
Eduardo FuerteDeveloper and Analyst

Author

Commented:
By going the strategy you pointed, I'm planning to split in 04 parts

     SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 10000000
FETCH NEXT 1000000 ROWS ONLY;



     SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 10000001
FETCH NEXT 1000000 ROWS ONLY;

SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 20000001
FETCH NEXT 1000000 ROWS ONLY;

SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 30000001
FETCH NEXT 1000000 ROWS ONLY;

Open in new window


Am I in the right way?
Ensure you’re charging the right price for your IT

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Eduardo FuerteDeveloper and Analyst

Author

Commented:
Could you check what's wrong when running this:

img001
Eduardo FuerteDeveloper and Analyst

Author

Commented:
Just considering

v_UnificacaoPacientes_v5

Open in new window


Is a view...
Eduardo FuerteDeveloper and Analyst

Author

Commented:
Only missed the word ROWS

OFFSET 100000 ROWS

Open in new window

Yes the ROWS word was missing in my original post and I've fixed it later.
If you are planning just 4 parts then the concept is OK. The OFFSET in the first query should be 0 and all other offsets should be minus one - 1000000, 2000000, ...
Eduardo FuerteDeveloper and Analyst

Author

Commented:
Thank you.

Unfortunatelly the performance is very slow in my case, so I have to adopt another strategy.
Slow performance means missing index on the nome column. Did you look at execution plan? Of course, I don't know the view definition and 3 mio rows from a (possibly) complex view does not help to improve the performance...

Just remember the data porting to a new system is one time task and if it takes hours then it is still OK.
Eduardo FuerteDeveloper and Analyst

Author

Commented:
I had tried to recreate the view as an indexed view on the column name but the specific database doesn't allow that.
I agree with your considerations about data migration is an unique event that where time to be consumed on strategies must to be only the necessary.

What I did is to split the data using the name initials and save the results on .CSV(s) - and then maybe join all the CSV(s).

Another company of the group I found have at about 21 million register. By going this way is slow but possible and I'm getting it.
OFFSET and FETCH does not work on indexed views... and CSV can contain characters which are not processed correctly...

If you are porting data between two different servers then you could have two more options:
1) Create a backup, restore it on the second server and then all queries should work relatively quickly. You may even prepare "export database" on the original server and then port the backup of this db.
2) Create a Linked server on the new machine and execute necessary queries on the linked server. Of course, millions of data rows will still consume time.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial