Sybase ASE 12.5.0.3 - Index creation / build time

Hi Sybase Experts,

We have a requirement to move Sybase ASE 12.5.0.3 (30 databases) from HP-UX to AIX (as per very specific requirements). We plan to use BCP for table/data migration, and after the data migration we need to create/build the indexes. The databases are huge: about 3 TB of data in approximately 10,000 tables with 15,000 indexes, of which about 150 are clustered indexes. We have tables with 10 million to 35 million rows. In our test environment (HP-UX), it took 2 minutes to rebuild an index on a table with 1 million rows. We need inputs on the time required to build the indexes after the BCP data migration, as we have downtime only over a weekend.

Would you please share your experiences about the time required to create an index on a table with a million rows (about 1 GB of data) on AIX with Sybase ASE 12.5.0.3? What are the options to speed up index creation / parallelize the builds so that the data migration fits in the weekend downtime?

Thanks for your time and suggestions!
mganeshAsked:

grant300Commented:
You don't mention what IBM hardware you are running on.  Which P-series machine is it, how many CPUs does it have, how fast are they, how much RAM is there, and what are your disk configuration and connectivity?

You can, of course, do as much BCP and index creation as you like in parallel IF you have adequate horsepower; you will just have to build a bunch of different scripts to run with different connections.

To be honest, I would not do things this way at all for a couple of reasons.  First, you are going to have a heck of a time getting everything moved and indexed in a weekend.  Unless your AIX box is very much faster than the HP box, the math just doesn't work.

If I read your slightly garbled question correctly, you have 10,000 tables.  If you spend just one minute on each one doing the BCP and index builds (which may or may not be optimistic), you will have spent 167 hours, or about 3.5 times what you have available.  You should also plan on doing Update Statistics when you are done so you catch all the non-index columns; that's more time.  Let's assume you have enough horsepower on the new machine to do four parallel operations without degrading run times too badly.  That means you can do it in a weekend IF NOTHING GOES WRONG!!!  That ain't going to happen either.  Take it from someone who has done a bunch of migration projects.

BTW, with migrations like this, plan on doing it three times.  The first is a purely test run to see what the problems are and get an idea of the run times.  The second is a full-up dry run of what you think you are going to do for real, which gives you a chance to work out the last of the kinks.  Then you get to do the real migration.

Going back to the thought I had a couple of paragraphs back, I would not even try this the way you are talking about.  There are a couple of decent options to consider.

First, if your shop has Replication Server, you can set up the new machine, create all the databases, indexes, and objects, and then set up replication.  So what if rep server takes a week to do all the initial population?  Production is not down any longer than it takes to set up rep server.  This also allows you to do testing at your leisure rather than doing two queries at 6:47AM Monday morning and pretending you are production ready.  You may be able to do this with an evaluation or developer's copy of rep server so you don't have to invest too heavily; however, you will have to get a rep server expert in to help out if your shop does not already have that person.

The second approach, and the one I would use, is to move to a newer version of ASE on the AIX box which supports cross-platform restores.  This actually has a number of advantages.  First, 12.5.0.3 was never really all that stable compared with 12.5.1 - 12.5.4, so you would be going to a more stable version.  Second, 12.5.0.3 is no longer supported.  If you don't want to go to 15, you should really be patched up to 12.5.4 anyway just so you are not flying with your pants down.  Third, doing cross-platform restores is the surest and simplest way to make certain you have everything you need, and you don't have problems with things like identity columns and other nastiness.  The one thing you will have to do is delete and update all the statistics.  You can do things like rebuild the clustered indexes BEFORE you do the backups and restores.
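Roughly, the per-database sequence would look something like this (just a sketch - "mydb" and the dump path are placeholders, and you should verify the exact cross-platform dump prerequisites and the sp_post_xpload step against the 12.5.4 manuals before relying on it):

-- on the HP-UX source (once patched to 12.5.4), per database:
use master
go
sp_dboption mydb, "single user", true   -- quiesce the database before a cross-platform dump
go
use mydb
go
checkpoint
go
dump database mydb to "/dumps/mydb.dmp"
go

-- on the AIX target, assuming mydb has already been created with a compatible device layout:
load database mydb from "/dumps/mydb.dmp"
go
online database mydb
go
use mydb
go
sp_post_xpload   -- checks/rebuilds indexes after a cross-platform load; may have little to do between two big-endian boxes
go

After that you delete/update the statistics as noted above.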

For my money, the patch to 12.5.4 and a cross-platform restore is your best shot at actually pulling this off in a weekend.  Rep server is the highest-cost and most complicated option, but it gets you out of the pressure cooker and allows you the most latitude to test and make sure things are working before you pull the big knife switch.  The BCP-and-build-indexes approach is fraught with problems and may not even be practical in the time you have.

BTW, how long does it take to do your nightly full backups of your databases on the HP-UX machine?  One of the nice things about the fact that all this stuff is split up into 30 databases is that you can pipeline the work so that you start your restore on the target as soon as your backup of one database is done.  My guess is that the AIX machine will be faster, so you will finish up the restores shortly after the last backup is completed.

Regards,
Bill
mganeshAuthor Commented:
Thanks Bill for a really quick response. Great explanation.
I agree with your views. We had thought of following approaches:
1) Cross-platform dump/load across HP-UX and AIX with target ASE version 15.0.3 (big bang).
2) Patch ASE 12.5.0.3 on HP-UX to 12.5.3 or higher and then perform cross-platform migration using dump and load (phased)
3) Replication server
I am comfortable with the second approach - I trust there will be no issues with the patch upgrade from 12.5.0.3 to 12.5.4 on HP-UX (the application will not break), followed by dump-and-load to move to AIX.
However, there is strong insistence on an as-is move of the ASE instance (no version change) from HP-UX to AIX. Replication Server seems to be the best-suited option if the ASE version has to remain the same, but it will add cost, as you have mentioned. With this background, I was trying to explore the BCP option for cross-platform data migration. Index creation time is my major concern.
I am sorry, I do not have many details about the hardware at the production site at this moment. Depending on server capacity, parallelism could be built in to speed up index creation with up to 3-4 threads. This is what I am thinking as of now, and hence I wanted views from the experts in this forum.
Thanks.

alpmoonCommented:
I am guessing that it is achievable during a weekend. But it really depends on the number of CPUs and their speed. Also, you need to do a lot of testing to find the best server configuration. I am assuming that only some of those 10,000 tables are big. You should find the best config for small tables, mid-size tables, and big tables, then change the server configuration between index builds. You don't need to run update statistics, as statistics are created during the index build.

For big tables:
Start by trying the index build with 'max parallel degree' at 6 and 'number of sort buffers' at 20000. Set 'number of worker processes' and 'procedure cache' high enough. Increasing the parallel degree may help more.
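For example (a sketch only - the exact parameter names and whether they are dynamic vary a little between 12.5.x releases, so check sp_configure output on your server; the values are illustrative):

sp_configure 'number of worker processes', 40   -- must cover max parallel degree times concurrent sessions; may need a restart on 12.5.x
go
sp_configure 'max parallel degree', 6
go
sp_configure 'number of sort buffers', 20000
go
sp_configure 'procedure cache size', 102400     -- in 2K pages (~200MB); on some 12.5.x levels this is 'procedure cache percent'
go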

mganeshAuthor Commented:
Thanks alpmoon for your response.
50% of the tables have more than 1 million records - the largest tables have up to 35 million rows, with table data sizes in the GBs.
e.g. a few of the top tables:

  rowcnt      TblSize MB
==========   ============
35,022,418      13636.90
 7,972,884       6827.97
 7,843,554       6717.22
 5,979,043       4491.51
 4,392,986       3762.15
 4,767,137       3546.11
17,261,056       3160.59
 6,452,595       2416.01
13,412,844       1982.62
 5,199,808       1558.72

Considering 4 parallel threads, can we expect index creation to be optimized?

Thanks for your time.
mganeshAuthor Commented:
Hi,

Any thoughts on using sybmigrate in this scenario, please?

Thanks.
mganeshAuthor Commented:
I guess with 30 very large databases, sybmigrate may not help, as running more than one session of sybmigrate against the same source and target database path is not allowed.
alpmoonCommented:
To make it clear, you shouldn't try to create more than one index at a time. It would be an I/O-bound process and you may not get any benefit at all (it might be beneficial for small tables). Instead, you should try to use as much CPU as possible to make each build faster. But as you use more engines, the overhead becomes higher as well, so you should find the optimal number of engines. That number might be different for a million-row table than for a 35-million-row table. For small tables, it is not worth it at all.

I suggest using sybmigrate because it uses in-memory bcp very efficiently. You need to configure sybmigrate to use more threads in parallel to insert data. I think the default is three threads to create objects and then three threads to copy data; you should increase that in your scenario. But don't create all the indexes with sybmigrate: use a list in the config file and include only the small indexes in that list. You can't change the server configuration while sybmigrate is running. After sybmigrate finishes its job, create the indexes on the large tables using a parallel sort.
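For the large tables, the parallel sort is requested on the create index statement itself, along these lines (a sketch only - the index, table, and column names are placeholders, and the consumers value has to fit within 'max parallel degree' and the worker processes you configured):

create index ix_big_table_key
    on big_table (key_col1, key_col2)
    with consumers = 6
go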

By the way if these are the top tables you have, a weekend would be more than enough to create all indexes as long as you use ASE efficiently.
grant300Commented:
A number of thoughts...

You still have to update the table statistics even if you have just built the indexes, because you won't have histogram data on the non-index columns.  If you don't have SARGs against non-index columns you can skip that step, but that is rarely the case unless you have either a very simple database (doesn't sound like it here) or way too many indexes on your tables.
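For example, per table (a sketch only - the table and column names are placeholders):

-- index columns get statistics as a side effect of create index;
-- non-index SARG columns need an explicit pass:
update statistics big_table (non_index_col)
go
-- or, more simply but more expensively, histogram everything:
update all statistics big_table
go
sp_recompile big_table    -- so cached plans pick up the new statistics
go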

Patching HP-UX up to 12.5.3 or 12.5.4 and then doing a cross-platform dump/load is the best route.  If you can get management to buy into the patch, you have your solution.

As always, you (they) should install the patch in a test (or dev) environment first.  There are a handful of new reserved words and the query optimizer improved a bit so I would not just patch production and hope.

In terms of parallelism, I was thinking in terms of parallel operations in different databases.  You have 30 databases so you could have one operation going on in each of say 4.  The idea there is that you won't saturate the log device on any one database that way.  Hopefully, the new system has enough spindles and the devices will be spread out enough so you can do this.

Sybmigrate is a reasonable fallback if you cannot get the patch approach approved.  Again, because you have multiple source and target databases, you can run separate sybmigrate sessions in parallel, at least until you suck the life out of the source system.  In most cases, when there is a platform change and new hardware, there is also a substantial bump in horsepower.  You really need to look at the performance on the existing machine and then figure out what the new configuration looks like and extrapolate what you can expect there.

BTW, how long does it take to back up the existing HP-UX production machine?

Regards,
Bill
alpmoonCommented:
Hi Bill,

It might be a long discussion, but I don't think updating statistics on all columns is needed in most situations. If you don't have an index on a column, those stats are only valuable for choosing the table order of a join when there is no useful index. And if it is a critical query, you must have a relevant index to join the large tables anyway. If you meant having statistics on all index columns, that is a different consideration, as it depends on the ASE version.

On the other hand, I agree that upgrading ASE to 12.5.4 and dump and load would be a better option than cross-platform move by sybmigrate (or bcp).
mganeshAuthor Commented:
Hi Bill,
Target machine is likely to be IBM Power 570 series.
http://www-03.ibm.com/systems/power/hardware/570/specs.html
This has 4 GHz processors, whereas the current HP server (HP9000-82) has 1 GHz processors.

From your experience, would you be able to comment if this IBM P 570 machine will be able to take up multiple parallel workers for migration of 3 TB data (10000 tables + 15000 indexes) over weekend?
Thanks for your time.
grant300Commented:
Well, you really have not given me enough information here.

The p570 is a heck of a machine and I highly doubt that your organization has specified it with a single CPU/core.  In fact, the minimum number of 4.2GHz Power 6 cores you can configure is 4 and you can go all the way up to 32.  You can also configure as much as 768GB of RAM.

I am also guessing that the old HP9000 had more than a single processor, but I could be wrong.

Assuming a reasonable minimum configuration for a p570 as 4 cores and 64GB of RAM, I would say you should be able to handle 4 simultaneous operations without any problem at all.  You may find you can handle more than that if you are waiting on normal disk I/O latencies; but not if you have buried the storage subsystem.  I would not use parallel worker threads (e.g. parallel index builds) though as I don't believe you will get as much scalability as you will out of running multiple separate operations, particularly because of logging and aggregation issues.  You have a lot of objects but only a few are large enough to benefit substantially from parallel operations.

There is a large caveat here and that is the storage configuration.  Performance will depend greatly on how many spindles you have, what kind of array hardware, what kind of connectivity, what kind of RAID you are using, and how carefully you distribute db devices across the physical hardware.  More often than not these days, the large spindle sizes are seen as a way to reduce cost and footprint.  Unfortunately, you still need a bunch of spindles when you are doing a lot of random I/O.  RAID-5 has also become ubiquitous and is the worst-case configuration for OLTP databases.

The p570 should run circles around your old HP9000; however, it is possible to configure the storage hardware and/or Sybase so poorly that you won't get all the work done in a weekend.  As I said, you are going to have to do this three times...

Also, to reiterate, given the complexity of your installation and sheer number of databases and objects, the only practical way to do this is a cross platform dump and load.  Any other approach is going to take months of prep and testing, cost a small fortune, and may ultimately fail anyway.  As a consultant, I would most likely walk away from a client who insisted on trying to script 10,000 BCPs and 15,000 index rebuilds and expect to get it done, under the gun, over a single weekend when, in fact, a simple, easy, and robust alternative is available.  In my experience, those folks that insist on making bad decisions in spite of good advice turn around and blame the consultant for the failure anyway.  My two cents....

Regards,
Bill
alpmoonCommented:
I think we need to make some points clearer. There is no need to write scripts to copy data and build all the indexes. All of the data movement and most of the indexes can be handled by sybmigrate; it only requires a list of the indexes to include.

We have tested a similar migration, although ours was a dump and load plus rebuilding some of the indexes, since it was cross-platform. I think we had a larger number of very large tables (10+ million rows), though. One of the largest databases took 20+ hours during the initial test with a serial sort. We then reduced that time to 2:20 (including the dump and load time) by using a 6-way parallel sort. So parallel sort is the main lever for reducing the total migration time; otherwise I don't think you can achieve it in two days. But, as I said earlier, you should use it selectively, only for the large tables.
mganeshAuthor Commented:
Hi Bill,

Thanks for elaborate explanation again.
Sorry, I did not clarify the number of CPUs - the client wants to know what resources will be required to optimize parallelism for this operation, and is not restricting us to fewer CPUs or less RAM.

Amongst the 10,000 tables, 50% have millions of rows - more than 10 years of data resides in the databases. The latest update is that most of the data is loaded through flat files (which may give us an opportunity to preload some of the history data into the target instance ahead of the downtime window).

I am also of the opinion that it may be possible to optimize BCP for the data load, but index building and the optimizing and testing of parallelism is going to be a time-consuming task. The risk is higher than with the dump-and-load method, as the requirement is to complete the prod migration by October. Testing/optimizing BCP or sybmigrate in the DEV/Test environment will certainly need more time before we roll out the prod environment.

Regards,
Ganesh
mganeshAuthor Commented:
hi Alpmoon,

Thanks for your response.
Amongst the 10,000 tables, 50% have millions of rows - more than 10 years of data resides in the databases. Amongst the 15,000 indexes, 150 are clustered indexes. A large chunk of the data is loaded through flat files. The largest database is about 1 TB in size, the 2nd largest is about 500 GB, and the rest of the databases are in the range of 50-300 GB. I agree with you about parallel operations, and we are not considering the serial option for the large databases. My only worry is about likely contention from many parallel threads causing delays in completing the migration.

Regards,
Ganesh
alpmoonCommented:
I am not sure what contention you are worrying about. Unless you have more than 8 CPUs allocated to ASE, you shouldn't run anything else while running a parallel index build. It utilizes all the CPUs set by 'max parallel degree', and the sort buffers are actually allocated from the default data cache. Running anything else at the same time may reduce the benefit of the parallel sort severely.

Therefore you should complete all the bcp loads and build all the small indexes using 4 or more threads (depending on the number of engines), and then build the large indexes one by one using a parallel sort. You need to be careful about when you change the server config to allow the parallel sort, because parallel index creation would be slower for the small tables.
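In other words, the ordering would be something like this (a sketch only - object names are placeholders and the parallel degree is illustrative):

-- phase 1: bcp loads and small-index builds in several parallel sessions,
--          with 'max parallel degree' left at 1 so nothing grabs worker processes
-- phase 2: reconfigure for the parallel sorts
sp_configure 'max parallel degree', 6
go
-- phase 3: the big indexes, one at a time, each using the parallel sort
create index ix_bigtab_key on bigtab (key_col) with consumers = 6
go
-- phase 4: drop back down before any normal work resumes
sp_configure 'max parallel degree', 1
go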
grant300Commented:
This is where it gets tricky.  If you are using sybmigrate to do this, you are going to lose fine-grained control over what is run when, and it will make changing the tuning mid-stream much more difficult.

Has doing this the right way been completely explored and ruled out for some combination of good reasons?  If not, you are spending time throwing deck chairs off of a sinking ship.

Regards,
Bill
alpmoonCommented:
If it involves a partial data load before cutover, some scripting is required anyway. sybmigrate is flexible enough to exclude some of the tables or indexes, but for the data load you can only provide a list of tables; it doesn't allow you to load part of the data into a table. So the databases involving a preload have to be scripted; sybmigrate can only be useful for the other databases.
mganeshAuthor Commented:
Hi Bill,

To answer your question on backup time: the total backup time for 3 TB of data is about 5.5 to 6 hours.

Regards,
Ganesh

grant300Commented:
Ganesh,

That would seem to make the cross-platform dump and load a no-brainer.  You could easily do the dump and load and still have time to do whatever other maintenance seems prudent, e.g. reorgs, update statistics, etc., and, of course, take a fresh dump on the new platform.

Regards,
Bill