Avatar of Eduardo Fuerte
Eduardo Fuerte
Flag for Brazil asked on

Could you point a way to split a very long MSSQLServer query to make it retrieves just parts of the lines for each time?

Hi Experts

Could you point a way to split a very long MSSQLServer query to make it retrieves just parts of the lines for each time?

I'm running a query that must retrieve 3.200.0000 lines, resulted from some group operations:

     SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome;

Open in new window


Is that a way to partitionate the results in subsets of lines and then assemble it ?

Thanks in advance.
Microsoft SQL Server

Avatar of undefined
Last Comment
Pavel Celba

8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
Pavel Celba

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
Eduardo Fuerte

ASKER
Hi Pcelba

We need to migrate all the medical patients from legate system to the new CRM system.
Eduardo Fuerte

ASKER
By going the strategy you pointed, I'm planning to split in 04 parts

     SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 10000000
FETCH NEXT 1000000 ROWS ONLY;



     SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 10000001
FETCH NEXT 1000000 ROWS ONLY;

SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 20000001
FETCH NEXT 1000000 ROWS ONLY;

SELECT nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep, 
       max(Cem) AS  cem , max(Infosaude) AS  infosaude, max(Procare) as procare
FROM v_UnificacaoPacientes_v5
GROUP BY nome, Nome_Valido, cpf, CPF_Valido, data_nasc, telefone, endereco, cidade, uf, cep
ORDER BY nome
OFFSET 30000001
FETCH NEXT 1000000 ROWS ONLY;

Open in new window


Am I in the right way?
Eduardo Fuerte

ASKER
Could you check what's wrong when running this:

img001
Your help has saved me hundreds of hours of internet surfing.
fblack61
Eduardo Fuerte

ASKER
Just considering

v_UnificacaoPacientes_v5

Open in new window


Is a view...
Eduardo Fuerte

ASKER
Only missed the word ROWS

OFFSET 100000 ROWS

Open in new window

Pavel Celba

Yes the ROWS word was missing in my original post and I've fixed it later.
If you are planning just 4 parts then the concept is OK. The OFFSET in the first query should be 0 and all other offsets should be minus one - 1000000, 2000000, ...
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Eduardo Fuerte

ASKER
Thank you.

Unfortunatelly the performance is very slow in my case, so I have to adopt another strategy.
Pavel Celba

Slow performance means missing index on the nome column. Did you look at execution plan? Of course, I don't know the view definition and 3 mio rows from a (possibly) complex view does not help to improve the performance...

Just remember the data porting to a new system is one time task and if it takes hours then it is still OK.
Eduardo Fuerte

ASKER
I had tried to recreate the view as an indexed view on the column name but the specific database doesn't allow that.
I agree with your considerations about data migration is an unique event that where time to be consumed on strategies must to be only the necessary.

What I did is to split the data using the name initials and save the results on .CSV(s) - and then maybe join all the CSV(s).

Another company of the group I found have at about 21 million register. By going this way is slow but possible and I'm getting it.
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
Pavel Celba

OFFSET and FETCH does not work on indexed views... and CSV can contain characters which are not processed correctly...

If you are porting data between two different servers then you could have two more options:
1) Create a backup, restore it on the second server and then all queries should work relatively quickly. You may even prepare "export database" on the original server and then port the backup of this db.
2) Create a Linked server on the new machine and execute necessary queries on the linked server. Of course, millions of data rows will still consume time.