justme4me asked:
Improve SSMA performance
I have 2TB of data to copy from Oracle on Linux to SQL Server on Windows.
I am using SSMA. So far, over the past 3 days, only 575,000,000 rows have been migrated.
Is it feasible to change the batch size to 10,000,000?
Is there a way to pause and restart this process without losing or duplicating data already inserted into the SQL Server table?
ASKER
Thank you, but my question is more centrally focused on whether it is advisable to attempt to set the batch size to 10,000,000 (I had an extra zero in my original question). I am not trying to address a timeout issue; I am seeking to shorten the time it is currently taking to transfer the data. It needs to be done in a matter of hours, not days. Thank you.
SOLUTION
Depending on the complexity of the tables, it might be faster to dump the data to CSV on the database server, compress it, move it over, uncompress it, and load it directly on the destination server. That should remove most of the network-layer overhead.
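A minimal sketch of the dump-and-compress step, using Python's csv and gzip modules. The table contents and file name here are hypothetical stand-ins; in practice the export would come from a SQL*Plus/SQLcl spool on the Oracle server, but the idea is the same: the file that crosses the network is the compressed one.

```python
import csv
import gzip

# Hypothetical sample rows standing in for an Oracle result set.
rows = [
    ("1", "alpha", "2020-01-01"),
    ("2", "beta", "2020-01-02"),
]

# Write the CSV through gzip so the file that crosses the network is small.
with gzip.open("export.csv.gz", "wt", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(("id", "name", "created"))  # header row
    writer.writerows(rows)

# On the SQL Server side: uncompress and read back before bulk loading.
with gzip.open("export.csv.gz", "rt", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    header = next(reader)
    loaded = [tuple(r) for r in reader]

assert loaded == rows  # lossless round trip through the compressed file
```

Text like this typically compresses several-fold, which matters far more at 2 TB than any per-row tuning.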
>>Thank you but my question is more centrally focused on whether it is advisable to attempt set the batch size
dfke commented on this and I agree with them: Larger batch sizes will likely slow things down more. Think about it: the database will have to keep track of a really large transaction in case it needs to roll back.
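To illustrate why transaction size matters, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server (the batch size and row shape are assumptions): committing after every batch keeps each open transaction, and any rollback it might need, bounded at the batch size.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")

rows = [(i, "x" * 100) for i in range(10_000)]
BATCH = 1_000  # hypothetical batch size; tune against your hardware

# Insert in batches, committing after each one so the open transaction
# (and the rollback bookkeeping behind it) never grows past BATCH rows.
for start in range(0, len(rows), BATCH):
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows[start:start + BATCH])
    conn.commit()

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
assert count == 10_000
```

A 10,000,000-row batch means the server must be able to roll back 10,000,000 rows at any moment, which is exactly the bookkeeping that slows a bulk load down.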
Why copy?
Create a linked server to the Oracle instance and read the table on Oracle.
You won't have to maintain the data in 2 places then.
Or... why did you insert it into Oracle if you needed it in MSSQL?
ASKER
Vito, I agree, but I did not have any success with installing the correct Oracle OLE DB provider. Therefore, I was unable to use SSIS. For the same reason, the lack of an Oracle OLE DB provider, I am unable to use a linked server.
ASKER
Slightwv,
My first attempt to get the data into SQL Server was to export it to a .CSV file. I could not get the null values preserved, and I kept getting unexpected EOF error messages even though I added a CRLF to the select statement.
I am open to any suggestions to speeding up this process.
Geert Gruwez,
The reason the data had to go to Oracle first is because it's an Oracle backup, which you cannot restore to SQL Server. At least not that I am aware of.
>> did not get the keeping of the null values to work
Depends on how you created the CSV.
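One common convention (an assumption here, not necessarily what SSMA or your export did) is to write an explicit sentinel token for SQL NULL so it can be told apart from an empty string on the load side. A sketch with Python's csv module, where `\N` is the assumed sentinel:

```python
import csv
import io

NULL = r"\N"  # sentinel for SQL NULL; assumes the data never contains a literal \N

rows = [(1, None), (2, ""), (3, "text")]

# Export: replace None with the sentinel before writing each record.
buf = io.StringIO()
writer = csv.writer(buf)
for r in rows:
    writer.writerow([NULL if v is None else v for v in r])

# Import: map the sentinel back to NULL; everything else stays as-is.
buf.seek(0)
decoded = []
for r in csv.reader(buf):
    decoded.append(tuple(None if v == NULL else v for v in r))

# NULL and empty string survive the round trip as distinct values.
assert decoded == [("1", None), ("2", ""), ("3", "text")]
```

Without a convention like this, a plain CSV collapses NULL and empty string into the same empty field, which matches the "keeping of the null values" problem described above.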
>> I parsed the select statement to include CRLF.
So your data can have <crlf> characters in a column? If so, that could cause issues with simple CSV.
Oracle's SQL*Loader can handle multi-line rows, but there needs to be some delimiter to specify when a new data row begins. I have no idea if the SQL Server tools also have this ability.
How much of the data/rows/columns might have this issue?
What other characters are allowed in those columns? Can you replace the <crlf> with some special character that isn't allowed to be in that field, do the CSV migration and change the special character back to a <crlf> after?
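The replace-and-restore idea above can be sketched like this. The sentinel character is an assumption; it must be one that can never legitimately occur in the source columns, which is exactly what needs checking first.

```python
SENTINEL = "\x07"  # BEL character; assumed never to appear in the source data

def encode_field(value: str) -> str:
    """Replace embedded CRLFs so each CSV record stays on one physical line."""
    return value.replace("\r\n", SENTINEL)

def decode_field(value: str) -> str:
    """Restore the CRLFs after the data has been loaded on the SQL Server side."""
    return value.replace(SENTINEL, "\r\n")

original = "line one\r\nline two\r\nline three"
wire = encode_field(original)

assert "\r\n" not in wire              # safe to place in a one-line CSV record
assert decode_field(wire) == original  # lossless round trip
```

On the Oracle side the same substitution could be done in the SELECT itself (e.g. with REPLACE), and reversed with an UPDATE after the bulk load.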
>>At least not that I am aware of.
Yes, I don't think SQL Server or any other tool can read a backup or export DMP file. I'm sure there are tools out there that claim to, but I would stick with Oracle tools.
Is it an actual BACKUP or is it an EXPORT?
ASKER
Slightwv,
I added the CRLF to the end of the select statement.
I initially tried to export to .CSV using only SET MARKUP CSV.
I tried using this method but the issue was the null values throughout the file:
https://asktom.oracle.com/pls/asktom/asktom.search?tag=sqlplus-query-output-to-csv-or-txt-format
>>I added the CRLF to the end of the select statement.
No need unless you wrote your own code to do it.
>>I tried using this method but the issue was the null values throughout the file:
Which method? There are at least 3 different methods talked about in that thread.
I would probably use SQLCl for direct CSV output. It does all the work for you.
A close second is string concatenation, if all the columns' data will be less than 4000 characters total.
ASKER
The file contains multiple varchar(4000) columns.
I've attached the file I referenced above.
I will look into SQLcl and see if it can speed things along.
Thank you for your continued support here.
test.sql
SOLUTION
ASKER
I installed Oracle Client and ODAC components.
SOLUTION
ASKER
Yes. Let me get them.
ASKER
Test connection failed because of an error in initializing provider. The OraOLEDB.Oracle.1 provider is not registered on the local machine.
ASKER
I located the dll and registered it with REGSVR32
SOLUTION
ASKER
No. I downloaded and registered the 64-bit version. REGSVR32 is the proper command to register a DLL on Windows Server 2016. I'm starting to think I need to download, install, and register the 32-bit version. I still have to answer the question: is Visual Studio running in 32-bit mode?
ASKER CERTIFIED SOLUTION
If you encounter timeouts, a larger batch will probably increase the duration even more. Consider disk, network bandwidth, memory, etc. It is advisable to run SSMA on 64-bit Windows with at least 4 GB of RAM.
Instead, by reducing the batch size you will have shorter-duration transactions and thus avoid timeouts. Please note that this is a trial-and-error process as you work out how much to transfer per batch. It is not merely row count, but the size of the rows (how many columns, any large data types (varchar(MAX), image, etc.)) that also needs to be considered.
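One rough way to pick a starting batch size along these lines is to budget bytes per transaction rather than rows. Everything below is an assumption to tune: the column widths mimic a row with several VARCHAR(4000) columns, and the 50 MB budget is just an arbitrary starting point.

```python
# Hypothetical per-column widths in bytes, e.g. several VARCHAR(4000) columns.
column_widths = [4000, 4000, 4000, 8, 23]
row_bytes = sum(column_widths)  # worst-case size of one row

# Budget per transaction: keeps each batch's log/rollback footprint bounded.
BATCH_BUDGET_BYTES = 50 * 1024 * 1024  # 50 MB, an assumed starting point

# Largest whole number of rows that fits inside the budget.
batch_rows = max(1, BATCH_BUDGET_BYTES // row_bytes)
print(batch_rows)
```

With wide rows like these, the row count per batch comes out in the low thousands, not the millions, which is the point: batch size should follow from row width, and from there it is trial and error.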
Cheers