Link to home
Start Free TrialLog in
Avatar of learning_sybase
learning_sybaseFlag for India

asked on

very high logical ios for a delete through stored proc.

One of the deletes is taking very long and is showing very huge logical ios but no 0 physical io's.
the same proc is taking 10-12 minutes in prod but in INT its taking too long.
Prod and INT database server has same configurations, same cfg file is used to bring up the database server.

the query plan shows work table in prod but not in INT, though both show table scans.

what can be the reason for this strange behaviour in INT.  The INT is loaded from prod, tables are reorged and update stats performed too.
Avatar of Joe Woodhouse
Joe Woodhouse

The fact that you have different query plans in these two ASEs is the clue. There is something different. Find that difference or differences.

If I were at your site being asked to investigate this I would assume nothing I'd been told was actually true until I could prove that it was.

Same .CFG file? Show me. I'd take the output of sp_configure from both ASEs while they were running, and run them through an sdiff. Ditto for sp_cacheconfig and sp_helpcache.

Normally I'd say since INT is loaded from a dump of PROD all data fragmentation and optimiser statistics must be the same... but then you say that tables are REORGed and UPDATE STATISTICS is also done. Do you mean done in INT after the load? If so then you are introducing changes. I'd want to see the results of an optdiag on all tables involved from both ASEs and likewise check for differences.

Are the queries run by the same ASE login in both ASEs? Is that login the same user in the databases in both ASEs? (Same suid, same uid?) Are there any login triggers and if so are they exactly the same? Logins aren't in the .CFG file nor in the database dump - databases dump their users, but logins are server-wide and live in the master database, which means if the suids are different it might be treated as a different user in the database. Likewise check the login(s) have identical roles.

I suspect it's going to come down to something about tempdb though, because tempdb is local to each ASE and is not included in the .CFG file or the database dump. Are the tempdb database(s) identical? I don't just mean the same size, but identical - same number of rows in sysusages with the same segmap and size values? Is tempdb cached in one ASE but not in the other? (Also not in the .CFG, or in a dump of a user database as database-level cachings are in master..sysattributes.)

I am guessing it might be a combination of the two above items: different logins being bound to different tempdbs which have different layouts and are cached differently.

I suggest you post the query plan showing us what's different in each ASE. It's also worth setting some other options like "show plancost on" and "show missing_stats on".
>> One of the deletes is taking very long and is showing very huge logical ios but no 0 physical io's.

Ideally Physical reads/IO are page reads from the actual table data(disk) to Memory and Logical reads/IO are pages read from memory if they are already available in Memory.
If you are seeing massive Logical IO, then it is trying to perform operation in the memory and yet to get committed to your actual data in disk which will take Physical IO..
Seems like you are trying to do DELETE of lot of records in a single transaction or batch which is taking more logical IO without even getting committed to Disk..
So, try deleting this huge table in batches of 1000 or 10000 depending upon your Server performance so that you see a combination of both Logical as well as Physical IO per batch or transaction..
Avatar of learning_sybase

ASKER

tempdb size were not same. the INT one has less size for data and log. I have increased the size and made same as prod.
Asked the application team to test. Lets see the outcome.

The CFG files, sp_configure output, sp_helpcache, sp_cacheconfig ...all match between INT and PRD.

The reorg for those 3 tables were done after the load in INT from PROD.

I think it must be the tempdb issue , but let me confirm...

Thanks for your help.
Dear Raja,

I undertstand your point but i dont agree it with what is happening here. The data is all in cache hence the logical io's are high but physical io is 0.
If that is the case then physical io should be high too but that is not the case.
I think its a tempdb issue, as the ASE has to create a worktable , do some sorting and then decide what all to delete , but if tempdb is not big enough then it will keep flushing  it and hence it is taking much longer.
Making tempdb big should resolve it, lets see,

thanks
ASKER CERTIFIED SOLUTION
Avatar of Joe Woodhouse
Joe Woodhouse

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
By INT, I'm guessing you mean internal/staging/dev while prod is your production machine.

If this is correct, then could be your prod database has tables populated very differently too.

As Joe mentioned, if you have 2x very different query plans produced, for the exact same SQL statement, then you have difference between your 2x configs. Likely part of the difference relates to data populating your tables.

Likely be helpful if you provide a copy of your SQL statement + EXPLAIN of SQL statement from both INT + PROD.

Someone will likely see the problem quickly + can provide an explanation or solution to the difference.
To be clear for anyone finding this in a later internet search:

We've identified a few areas that aren't the same in the two environments. Sybase is a little sneaky about where some config items live and so even copying the .CFG file and doing a backup/restore of a database isn't enough to guarantee they are identical. OP also confirmed some changes are made after the restore which absolutely affect query plans. We still have an open question about whether the same (server-wide) login is being used to run the query... this can also affect optimiser decisions particularly if anything is set in a login trigger.
Same login was used with same perms.
Making the tempdb of the concerned sever same as Prod resolved it.

So how can we know if its a tempdb contentions? what parameters or numbers or stats can tell us that ? Though there were no tempdb alerts , of getting filled up or something like that.

Thanks Joe
You've pretty much done this the right way - rule out the obvious differences (use the same .CFG file; dump and load the database), and look at the query plans.

Here the big clue was temp tables in one plan but not in the other... tempdb is obviously implicated. But was it the chicken or was it the egg? It could easily have been the other way, where doing the reorgs and update stats changed optimizer decisions.

We can't be certain of what's going on here without enabling a lot more detailed diagnostics, but if I had to guess I would guess that the optimiser wanted to use temp tables but didn't think there was enough space to do what it wanted in your INT environment, and so fell back to a method not quite as good. The optimizer knows how big your tempdb is and makes educated guesses about how much #temp table space will likely be needed.

I have sometimes seen table attributes like expected_row_size make a big difference to tempdb usage, but since you dumped and loaded the database those would all be the same here.

Partly it's just knowing what things feed into optimizer decisions. When you said so much was the same between the environments I simply went down the list of what wouldn't (necessarily) be the same and should therefore be checked. This isn't the first time I've seen tempdb differences have effects way beyond what we might expect.

I'm glad we sorted it for you.