impact analysis on Oracle to MariaDB column type conversion

marrowyung
marrowyung used Ask the Experts™
on
Hi,

We are doing impact analysis on the new type AFTER convert from Oracle 10g/12c to to MariaDB, any information on

1) Is there any impact on the change of type ( in terms of RAM, disk space and CPU) ?
2) What to do to get ride of the impact?
3) URL for the proof?

For the following type conversion:

NUMBER(10,0)  and NUMBER (9,0) convert to BIGINT
NUMBER(3,0) convert to SMALLINT
NUMBER(1,0) convert to TINYINT.
NUMBER without () (e.g. NUMBER) to DOUBLEe
      DATE  to DATETIME
      VARCHAR2() to VARCHAR()  (e.g. VARCHAR2(1 CHAR) will become VARCHAR(1). VARCHAR2(100 CHAR) will become VARCHAR(100)
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
nociSoftware Engineer
Distinguished Expert 2018

Commented:
Oracle: NUMBERS are allways floats: https://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#CNCPT1832
Mysql (Also MariaDB): https://dev.mysql.com/doc/refman/5.5/en/integer-types.html
and: https://dev.mysql.com/doc/refman/5.5/en/integer-types.html
Which you can convert to ints...   because they all have 0 decimal digits behind the decimal dot.
DOUBLE(M,D) would be a derect translation from NUMBER(M,D).

VARCHAR is Equivalent to Oracle VARCHAR2 not many surprises there:
https://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#CNCPT1822
https://dev.mysql.com/doc/refman/5.5/en/char.html

wrt. CPU the *INT are probably faster, as they need less hardware realestate to be active, need no conversions etc.
The type ranges are ok.
wrt. Storage expect to shrink on the per record data size. (slightly depending on number of occurences) from 8 bytes quantaties...
(in RAM there might be a slight difference, probably less than storage) because of extension to natuaral sizes for integers in registers.
if you mean size in buffers, see Storage sizes).
DATETIME also has a time component. Whicht might need the use of DATE formatting functions.
marrowyungSenior Technical architecture (Data)

Author

Commented:
"Oracle: NUMBERS are allways floats: https://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#CNCPT1832
Mysql (Also MariaDB): https://dev.mysql.com/doc/refman/5.5/en/integer-types.html
and: https://dev.mysql.com/doc/refman/5.5/en/integer-types.html
Which you can convert to ints...   because they all have 0 decimal digits behind the decimal dot.
DOUBLE(M,D) would be a derect translation from NUMBER(M,D).
"
I show up the research up there and that is what we should do and you are right on that.

"wrt. CPU the *INT are probably faster, as they need less hardware realestate to be active, need no conversions etc.
The type ranges are ok.
wrt. Storage expect to shrink on the per record data size. (slightly depending on number of occurences) from 8 bytes quantaties..."

you are saying NUMBER(N,0) is slower than int? and record is bigger than int in MariaDB?

"DATETIME also has a time component. Whicht might need the use of DATE formatting functions.
Is this your solution?"

then the impact is ?

so once migrated to oracle, but CPU and storage usage will be lower?


another other impact you can think of ONCE converted to MariaDB type ?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>Storage expect to shrink on the per record data size. (slightly depending on number of occurences) from 8 bytes

You've said this in two questions now.  I don't see where Oracle uses a fixed 8 bytes to store a number.
Oracle Database stores numeric data in variable-length format.
https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/tables-and-table-clusters.html#GUID-AB604FF0-D321-44CC-956E-62A7791EE2FA
Ensure you’re charging the right price for your IT

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
The current docs have the same verbiage in this case so it isn't a major issue but 11.1 has been desupported for a long time now.  Relying on anything in the docs for that version can be problematic.

Anyway that section is the same as the doc link I posted.  I still don't see where it states that the default for a NUMBER is BINARY_DOUBLE but I might have overlooked it.
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
Not sure what you are trying to tell me?  All I was saying is I cannot find where Oracle numbers are fixed 8 bytes.
marrowyungSenior Technical architecture (Data)

Author

Commented:
actually any comparison chart to compare the actually occupied size of each type ?

I also don't see why there are 8 bytes.
marrowyungSenior Technical architecture (Data)

Author

Commented:
hi,

any update for me ?
nociSoftware Engineer
Distinguished Expert 2018

Commented:
Oracle:
According to this (i found through slightvw's  link) a number is from 1-22 bytes long.
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Data-Types.html#GUID-75209AF6-476D-4C44-A5DC-5FA70D701B78
So how long = number(10,0) -- it doesn't tell..
A Double (binary value) is 8 bytes long
(I read an older manual which mapped Doubles to binary doubles Binary doubles (IEEE Float) = 8 bytes.  [ seel below ]

Mysql:
A FLOAT/DOUBLE is 4 or 8 bytes long.   Appearantly depending on the mantissa you give it.
https://dev.mysql.com/doc/refman/5.5/en/floating-point-types.html
Now there is a problem with the description and example here....
The precision in the example specifies  7,4 as values in decimal digits.

THe specification tells  precision 0-23 --> 4 bytes float, 24-53 --> double.
Those precision is binary precision as the mantissa of IEEE Floats for doubles = 53 and for floats - 23 (the remainder of the storage is for the exponent)  
https://en.wikipedia.org/wiki/IEEE_754
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
For Oracle just set up some quick tests and see what takes up what space:
https://docs.oracle.com/en/database/oracle/oracle-database/18/sqlrf/VSIZE.html#GUID-CDDB2A17-9398-4AF8-96FB-4297DDA2665B

drop table tab1 purge;
create table tab1(col1 number(10), col2 number(10,2));
insert into tab1 values(1,1.1);
insert into tab1 values(1234567890,12345678.99);
commit;

select vsize(col1), vsize(col2) from tab1;

Open in new window


My results from that test:
VSIZE(COL1) VSIZE(COL2)
----------- -----------
          2           3
          6           6

Open in new window

marrowyungSenior Technical architecture (Data)

Author

Commented:
slightwv,

That example only show number number(10) and number(10,2) takes up the same sizes of storage only !
but mariadB, no idea.

how about:

  DATE  to DATETIME
      VARCHAR2() to VARCHAR()

what will be the impact?

varchar2() and varchar() no different ?

how about DATE type ?

so only NUMBER to INT will have impact ?

number 22 bytes, double is 8byte long.

noci,

"
THe specification tells  precision 0-23 --> 4 bytes float, 24-53 --> double.
Those precision is binary precision as the mantissa of IEEE Floats for doubles = 53 and for floats - 23 (the remainder of the storage is for the exponent)  "

what has precision 0-23 ? number (0-23) = 4 bytes ?  and number (24-53,0) is double type in MySQL/MariaDB?
marrowyungSenior Technical architecture (Data)

Author

Commented:
only should I conclude that for type conversion:

Number:

Conversion:

1. NUMBER(10,0)  and NUMBER (9,0) convert to BIGINT 
2. NUMBER(3,0) convert to SMALLINT 
3. NUMBER(1,0) convert to TINYINT. 
4. NUMBER without () (e.g. NUMBER) to DOUBLE

   Impact area:
 
   1) Database server CPU:  We expect Less CPU loading in MariadB.
   2) Database server RAM:  We expect Less consuming on RAM in MariadB.
   3) Database server Disk:  We expect less space taken in MariaDB.

       Detail:
 
       i) We expect less loading as INT is better for CPU to process that NUMBER type.
       ii) NUMBER in oracle takes 22 bytes, DOUBLE type in MariaDB is 8byte long. So storage need is less in MariaDB after it convert from Oracle.
       iii) RAM usage therefore less.

Date:

Conversion: DATE  to DATETIME 

   Impact area:

   1) Database server CPU: NA
   2) Database server RAM: NA
   3) Database server Disk: NA

      Detail:
       DATE And DATETIME is equal on both oracle and mariaDB.

Varchar ():

Conversion: VARCHAR2() to VARCHAR() 

    Impact area:

   1) Database server CPU: NA
   2) Database server RAM: NA
   3) Database server Disk: NA

      Detail: VARCHAR2 () in Oracle and VARCHAR() in Maria is the same.


?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>but mariadB, no idea.

I can only help with Oracle.

>>NUMBER in oracle takes 22 bytes

I believe my example showed that isn't accurate.

>>Impact area:

I assume you have tested these assumptions and have verified them?  I haven't seen that discussed in any of your current questions here.
marrowyungSenior Technical architecture (Data)

Author

Commented:
"I believe my example showed that isn't accurate."

so what should be it?

actually for DATE in Oracle and Datetime in MariaDB, how large is it ?

"I assume you have tested these assumptions and have verified them?  "

not migrated yet so I don't know and just do some research there.

" I haven't seen that discussed in any of your current questions here."

but I conclude from here, any thing wrong you can see, you can point it out.
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>so what should be it?

Based on the example I posted and the documentation, number sizes in Oracle appear to be directly related to the value the column contains.

>>not migrated yet so I don't know and just do some research there.

You posted the impacts like they were stated facts.  I'm merely pointing out that I don't think you can state them as facts.

>>any thing wrong you can see, you can point it out.

I feel you are getting way too deep into this.  If your goal is to compare down to the single byte then you will need to just migrate and add everything up.  

If the Oracle database is 1TB, I would expect the MariaDB to be somewhere close.  I wouldn't expect it to be 500M or 2TB after migration.

Create a test table in Oracle with varying data types.  Load it up with sample data, migrate it, compare the sizes and make an educated guess at the complete migration.

You could probably have done this in about 1 day.  You've now been trying to figure this out here in 11 days and counting.

If you want to get a CPU/RAM idea, take an Oracle query that you know is a resource hog, do a quick table migration and run the same query in MariaDB then compare.

Probably between a couple days and a week.

>>actually for DATE in Oracle and Datetime in MariaDB, how large is it ?

The docs have the size for an Oracle DATE.  I assume the MariaDB docs have the same.

I showed you the VSIZE function in Oracle, pick a date column and see the size?
nociSoftware Engineer
Distinguished Expert 2018

Commented:
Size for MYSQL: https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html

Please read the whole page. It does explain the corner cases.

Appearantly someone made a perlscript to do this:
https://dba.stackexchange.com/questions/114471/calculate-row-size-and-max-row-size-for-a-table
marrowyungSenior Technical architecture (Data)

Author

Commented:
slightwv ,

" I'm merely pointing out that I don't think you can state them as facts."

that one is my consolidation, please let me know if you see any problem.

"number sizes in Oracle appear to be directly related to the value the column contains."

I am sorry, what it suppose to mean?

"Size for MYSQL: https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html"

any similiar thing for Oracle ?

"If the Oracle database is 1TB, I would expect the MariaDB to be somewhere close.  I wouldn't expect it to be 500M or 2TB after migration."

I think so, or the different could be only few 100 MBs.

"Create a test table in Oracle with varying data types.  Load it up with sample data, migrate it, compare the sizes and make an educated guess at the complete migration.

You could probably have done this in about 1 day.  You've now been trying to figure this out here in 11 days and counting."

agree, but the migration will not migration only 1 rows of data to it but all, but my tools can't migration all data.  so can't make both table have same number of record except done manually to add 1 to 10 rows only ,so this one is not making it an apple to apple comparison.

" If your goal is to compare down to the single byte then you will need to just migrate and add everything up.  "

why ? it just an analysis and will go to present. so we just need to write down sth make sense.

noci,

"Oracle: NUMBERS are allways floats: "

this let me think that the float is integer and in MariaDB it will be INT too, so speed the same ?

"Appearantly someone made a perlscript "

I don't know perl script.

"VARCHAR is Equivalent to Oracle VARCHAR2 not many surprises there:
https://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#CNCPT1822
https://dev.mysql.com/doc/refman/5.5/en/char.html"

I though varchat2() to varchar() can be straight forward, same thing so change type impact is very small  and only disk space ?

the link shown:
  In oracle : VARCHAR2() has storage size between 1 and 4000 bytes
  In MariaDB: varchar () 65,535 byte.
 
so impact with will CPU time longer to process that as size is bigger and disk size larger ?

DATE and datetime too ? I just want compare these kind of thing and that's why write down the conclusion up there.

now seems really can't find how many types NUMBER(x) takes in Oracle? any idea ?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>"number sizes in Oracle appear to be directly related to the value the column contains."

They are not fixed size.  Did you run the quick test case I posed back in http:#a42803246

It shows this.

>>any similiar thing for Oracle ?

Yes.  In the Oracle docs.  There are several Oracle doc links in this thread that talk about various data type storage options.

I'm not going to go back through them to see if this one has already been posted but here is the Data Types link:
https://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF30020

>>agree, but the migration will not migration only 1 rows of data to it but all

I understand that but for estimation purposes one table, if done properly, should give you decent numbers to formulate an overall estimate.

>>but my tools can't migration all data.  so can't make both table have same number of record except done manually to add 1 to 10 rows only

I cannot help with the tools but it sounds like you need better tools.  All I need is vi and sqlplus and I can load up millions of rows in many tables in Oracle so I do not understand your "1 to 10 rows" restriction.  I've done a little with MySQL and remember being able to execute scripts so I don't see the restriction on the MariaDB side either.

>>not making it an apple to apple comparison.

I mentioned above "If done properly" it can be.  You know the database and data types.  Create a single table that represents the distribution, load several sample sets and measure.

For example:  If the current Oracle database is 50% number columns, 30% varchar2 with an average data length of 200 bytes, 15% dates:  Create a table that has that structure, load up a bunch of rows with random (using the random functions) and measure.

>>why ? it just an analysis and will go to present. so we just need to write down sth make sense.

Exactly.  You don't need a 100% accurate measurement but it seems like that is what you are looking for.

>>In oracle : VARCHAR2() has storage size between 1 and 4000 bytes

With the new Oracle features a varchar2 can now be 32K.

>>so impact with will CPU time longer to process that as size is bigger and disk size larger ?

Only if you store the full 64K?  If a varchar(30000) column in mariaDB only stores a single character, does it use more disk/CPU to read?

>>now seems really can't find how many types NUMBER(x) takes in Oracle? any idea ?

Assuming you meant "bytes" and not "types", I've posted that several times here.
marrowyungSenior Technical architecture (Data)

Author

Commented:
"With the new Oracle features a varchar2 can now be 32K."

so this means to me if we don't change anything and just upgrade oracle from 10g to 12c, it can be 32k instead of 4000  byte ?

"If a varchar(30000) column in mariaDB only stores a single character, does it use more disk/CPU to read?"

but will mariaDB reserve all space varchar(30000) need and just store one character ? or it will keep taking more and more until it reach the maximum storage need by varchar (3000)
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>so this means to me if we don't change anything and just upgrade oracle from 10g to 12c, it can be 32k instead of 4000  byte ?

Yes, the documentation is your friend:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/Data-Types.html#GUID-7B72E154-677A-4342-A1EA-C74C1EA928E6
•32767 bytes or characters if MAX_STRING_SIZE = EXTENDED

Then there is a link to:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/Data-Types.html#GUID-8EFA29E9-E8D8-40A6-A43E-954908C954A4

2.1.1.7 Extended Data Types



Beginning with Oracle Database 12c, you can specify a maximum size of 32767 bytes for the VARCHAR2, NVARCHAR2, and RAW data types. You can control whether your database supports this new maximum size by setting the initialization parameter MAX_STRING_SIZE as follows:

•If MAX_STRING_SIZE = STANDARD, then the size limits for releases prior to Oracle Database 12c apply: 4000 bytes for the VARCHAR2 and NVARCHAR2 data types, and 2000 bytes for the RAW data type. This is the default.


•If MAX_STRING_SIZE = EXTENDED, then the size limit is 32767 bytes for the VARCHAR2, NVARCHAR2, and RAW data types.


>>but will mariaDB reserve all space varchar(30000) need and just store one character ?

By definition the VAR in varchar is Variable.  I'm not going to dig into the MariaDB docs for you but I'm sure that has been covered by a link somewhere in this thread already.
nociSoftware Engineer
Distinguished Expert 2018

Commented:
slightwv: yep you were right it was mentioned in first comment:   https://dev.mysql.com/doc/refman/5.5/en/char.html

Because a length ok 30K requires 2 bytes an empty field will take 2 bytes, and all others will take 2+ sizeof string bytes (UTF-8 encoding might limit the actual visible string though to less than 30K).
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
I read the docs differently:
In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

The way I read it it is the length of the data not the declaration that determines this.  So a varchar(30000) with a single character will need 2 bytes not 3.
marrowyungSenior Technical architecture (Data)

Author

Commented:
slightwv ,

".  I've done a little with MySQL and remember being able to execute scripts so I don't see the restriction on the MariaDB side either."

you export using oracle dump and import to MySQL directly ? data type can be diff ! any problem on that ?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
The only thing that can read an Oracle export DMP file is Oracle import.  It is a binary file.

You can use SQL Developer to "export" to a CSV or insert statements.
marrowyungSenior Technical architecture (Data)

Author

Commented:
"You can use SQL Developer to "export" to a CSV or insert statements"

you mean only export to CSV can be use to import to MySQL from oracle directly ?
marrowyungSenior Technical architecture (Data)

Author

Commented:
actually just want to compare diff after convert the table to mariaDB, so for storage size diff  I think it is really good to create the respective table in MariaDB/MySQL and compare the size this is good on storage comparsion.

CPU and RAM can only relies on running a query and see how CPU and RAM utilize ?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>you mean only export to CSV can be use to import to MySQL from oracle directly ?

"Directly" isn't really correct but, yes.

CSV or generate the insert statements from Oracle and execute them against MySQL.  You would need to validate the syntax to make sure the SQL SQL Developer generates is syntactically correct in MySQL.  If you have dates, I doubt the syntax will be correct so I would probably go with CSV.

>>CPU and RAM can only relies on running a query and see how CPU and RAM utilize ?

RAM can be affected by several factors outside of executing SQL.  I cannot speak to how MySQL works but Oracle can be set up with Automatic Memory Management (AMM) so the memory used can grow and shrink as needed.

CPU also has a lot to do with tuning efforts.  You cannot assume the same SQL will perform the same across two different database platforms.  Each product needs to be tuned based on how it does what it does.  A query optimized for Oracle might perform badly in MySQL because it hasn't been tuned for MySQL.
marrowyungSenior Technical architecture (Data)

Author

Commented:
" If you have dates, I doubt the syntax will be correct so I would probably go with CSV."

yeah, this is what I am wondering and that's why I ask, should not be ! but yes this is the method you mean ?

"RAM can be affected by several factors outside of executing SQL.  I cannot speak to how MySQL works but Oracle can be set up with Automatic Memory Management (AMM) so the memory used can grow and shrink as needed."

I mean how can we measure the RAM impact once move to MariaDB, running query and see how much memory it takes can be one example.

"CPU also has a lot to do with tuning efforts."

then it will be very hard to do compare as the situation is not the SAME as before, index can be all different, how can we compare ?

so this mean only storage impact can be measure ?
nociSoftware Engineer
Distinguished Expert 2018

Commented:
for dates DON't depend on defaults make an explicit  format on a query f.e. ISO 8601 ( https://en.wikipedia.org/wiki/ISO_8601 )
YYYYMMDDThhmmss.sss  (basic combined date time format) or YYYY-MM-DD hh:mm:ss.sss  With times always in a 24 hour format.
(no PM/AM).

You cannot compare RAM between Oracle RDBMS or MySQL/MariaDB.  They work differently.  Oracle has a SGA where MySQl/MariaDB do not.
Oracle runs in multiple processes, MySQL/MariaDB uses a multi threaded approach.
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>so this mean only storage impact can be measure ?

I would focus on overall performance.  Do you really need to measure CPU and RAM usage?  I've never had to do that in my 30 years of database work.  I've never had to go to Management and say "Oracle is better because it uses less CPU than <anything else>".  It may be possible that MariaDB needs more CPU resources or it may be possible that Oracle does to get similar performance but you won't know that until you have properly tuned the system specifically to that product.

Taking 5 million rows in a table and issuing "select * from that_table" without ANY tuning specific to the platform/database product will really not be a fair comparison between the two.
marrowyungSenior Technical architecture (Data)

Author

Commented:
noci,

"YYYYMMDDThhmmss.sss  (basic combined date time format) or YYYY-MM-DD hh:mm:ss.sss  With times always in a 24 hour format."

this already diff and when insert data by CSV, this already cause an error ?

slightwv,

"you won't know that until you have properly tuned the system specifically to that product"

yeah I agree, bu tit is just part of impact analysis.

so in your option only compare queries performance is enough to compare diff DB platform? like once our DB migration to other DB, how is the impact on performance ?

"Taking 5 million rows in a table and issuing "select * from that_table" without ANY tuning specific to the platform/database product will really not be a fair comparison between the two."

sorry, your reason is ?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>sorry, your reason is ?

Memory allocations, disk layouts, block sizes, database tunables, OS parameters, etc...  Oracle likes certain database files on certain types of disks.  If you don't understand Oracle's architecture and only run DBCA to take all the defaults, you shouldn't expect optimal performance  out of the box.  I can only guess MariaDB is the same.

Only after you have thought out the architecture and configured the database to match the date and Operating System can you start to look at optimizing how the applications access the data.

Running setup.exe and clicking "Next" through all the options and attempting to compare performance really isn't a good test.  In my opinion.
marrowyungSenior Technical architecture (Data)

Author

Commented:
"  If you don't understand Oracle's architecture and only run DBCA to take all the defaults, you shouldn't expect optimal performance  out of the box."

I agree ! query tuning is very necessary !

"Only after you have thought out the architecture and configured the database to match the date and Operating System can you start to look at optimizing how the applications access the data."
yeah ! usually this involve a very experienced oracle DBA.

"Running setup.exe and clicking "Next" through all the options and attempting to compare performance really isn't a good test.  In my opinion."

then I can't finish that impact analysis !

I think basic impact analysis for capacity planning is ok !

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial