AS/400 File Reorganization...

Posted on 2008-10-21
Last Modified: 2013-12-06
We have an IBM System I with i5/OS V5R4.    We would like to do a file reorganization on the 100 largest files each month.     The problem I am having is trying to get the file information (filename, library, file size) into a file which I can sort.     What is the easiest way I can do that?
Question by:jcurtis1027

Expert Comment

ID: 22771560
Run the following command:
===> dsplib lib(*all) output(*print)

once complete, open iSeries Operations Navigation, select your system i --> Basic Operations --> Printer Output.
The spool file name should be Qpdsplib.  Drag-n-drop to your local hard drive.  You now have a text document you can alter.  Import into excel or another spreadsheet app and you should be able to sort.

It may be easier to run the dsplib command on each library in question individually.  The output would be much easier to work with and take far less time to generate than with lib(*all).

hope that helps....

LVL 34

Accepted Solution

Gary Patterson earned 125 total points
ID: 22773486

Use the DSPFD command with the TYPE(*ATR) FILEATR(*PF) OUTPUT(*FILE) option to dump a list of physical files to an output file.  If your files are in multiple libraries, use wildcards, or execute the command multiple times, ADDing records to the output member rather than REPLACing them.

In addition to the file name, you'll get other useful information, including the record length, number of records, and number of deleted records.  You can name the output file anything you want, and use any tool you like (Query/400, DFU, SQL, etc.) to view the contents of the file and field names, or you can download the file using Client Access and view it in Excel or the format of your choice.    

In the past, I've developed automated reorg processes for clients that work basically like this:

1) Dump a list of candidate files to an output file using DSPFD
2) Select files needing a reorg using SQL, RUNQRY, FMTDTA, OPNQRYF, etc., sorting by the files that contained the most slack space first (number of deleted records * bytes per record)
3) Process each selected file:
3a) Estimate the time to reorg and rebuild indexes.  If the remaining reorg time window is not long enough, stop here.  Otherwise:
3b) Use the RMVLFM command to remove logical file members.
3c) RGZPFM the file
3d) ADDLFM to rebuild access paths

If you are reorganizing principally to recover deleted space, you should consider changing your files to Reuse Deleted Records (CHGPFM command).  

On growing files, you'll never have to reorg again.  Generally, if it is safe to reorg a file, it is safe to Reuse Deleted Records.  Be aware that enabling Reuse Deleted Records can slow down file write activity, since the system has to search out deleted records throughout the file and write to that particular page of the file, rather than just accumulating a block of records and tacking it onto the end of the file in a single operation.

- Gary Patterson
LVL 34

Expert Comment

by:Gary Patterson
ID: 22773530
By the way, there is no reason that you can't use the DSPLIB method hdgh outlines above.  The DSPFD approach does have several advantages, however:

DSPLIB does not offer an OUTPUT(*OUTFILE) option, so you have to parse a spooled file containing page and column headings that you have to deal with.  
DSPLIB includes all objects in a given library, not just physical files.  DSPFD FILEATR(*PF) produces a list of only physical files.
DSPFD gives you deleted records and record length, so you can calculate your space savings.
DSPFD gives you a native database file that you can use, for example, to drive an automated file reorg program or print a report for manual reorgs.

- Gary Patterson
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

LVL 27

Expert Comment

ID: 22773553

> We would like to do a file reorganization on the 100 largest files each month.


It's hard to imagine a system where this would make sense. Is this purely a learning exercise or do you seriously intend to implement something like this?

I did a project a number of years ago for a regional (Washington, Oregon, Idaho, and possibly sections of other states) wholesale food distributor. Their file structure was different from most others I've seen. E.g., there were essentially no "orders" entered from customers; but rather the "line items" were generated by various algorithms that predicted how much of a given commodity food item would need to be loaded onto trucks headed to various cities/towns, based on prior activities. (Simplified, but close enough. It took a while grasp order line items that had no "orders".)

Their files grew and shrank by enormous amounts over a day's processing and reorgs were part of doing business.

But even slightly reasonably designed databases (in DB2 under OS/400 or later) are rarely benefited from such regular reorgs... _especially_ if you're really thinking of the "100 largest files".

OTOH, I can imagine regular _examinations_ of the largest files to see if a reorg is applicable.

But even so, what is the reason for considering the largest, rather than looking for ones that would return some benefit? Maybe it's your _smallest_ files that need a reorg? What has led you to think that "largest" makes a difference?

I really hope that that doesn't come across as negatively critical. I only hope to make certain that the point of reorgs is well understood.

Certainly, one benefit of a reorg _can_ be the recovery of space that was previously used for records that have since been deleted. In those cases, for large files that have high volatility, a reorg can be worthwhile. Ideally, though, it probably should be done once and infrequently after that, rather than monthly. But at that first time, the file attributes should be set so that deleted record space doesn't accumulate in the future.

Also, under some circumstances, you might have volatility that's combined with a lot of sequential-by-key accesses through a specific index. You'd want to reorg those few files according to the order of the particular index that controls the sequential access. But I wouldn't expect the "100 largest files" to all fit in that category. Often, it's either volatile _or_ sequential-by-key; and just as often, one of those doesn't apply.

So, do you have a specific business problem to solve? Maybe there are much better methods. Are you looking to learn how to approach a problem such as you've laid out in order to understand how it would be handled? It's certainly an easy request, and example programs could be supplied. Or has the question been completely misunderstood?

Can you elaborate?

LVL 34

Expert Comment

by:Gary Patterson
ID: 22773798
As usual, Tom brings up a good point (or two).

Reorganizing the 100 largest files (unless you've already done the analysis and determined that the 100 largest also happen to have the largest amount of deleted space) isn't necessarily the best use of reorg time.

One good way to determine the optimum files to reorg is to use the mechanism that I outlined above.  Estimate the number of bytes to be recovered and reorg the files with the highest potential recovery:

Number of deleted records * record length = Estimated space recovered.

This assumes, of course, that deleted space recovery is your main goal, as opposed to physical reordering of records into key sequence.  That requires a different analysis process, including analysis of programs accessing these files.  You want to identify those files that have the highest incidence of sequential-by-key access and where a strong preference for a single index exists.

- Gary

Author Comment

ID: 22777678
Thanks for all your input...  Tom we are a wholesale food distributor (but on the East Coast).    We are looking as part of our month end processing reorg'ing the 100 largest files.    We figure in time these all files will eventually be reorg'ed.    This is more for disk space than performance.
LVL 27

Expert Comment

ID: 22781977

Wholesale food distributor? Wow. Serendipity!

In that case, I can accept the possibility. Whether it's appropriate or not is still open.

Recovery of deleted-record space -- you _might_ be better off setting the files to re-use deleted-record space:

 ==>  chgpf  mylib/mylargefile  REUSEDLT( *YES )

The idea wouldn't "recover" the space, but it would help control the future allocations of space. The data spaces wouldn't be extended unless additional space was needed at run-time after deleted record areas were populated by active records.

The consideration would be whether or not you _need_ to ensure that sufficient space is allocated and available when your month-end runs. You might "recover" the space today, but month-end might crash when it tries to grow the files during month-end run-time. It's always possible that there simply won't be enough available space.

When space is released at the start of the month, it becomes available for anything else to use. How will you guarantee that some rogue query won't chew up just enough space in the 2nd week of the month that you'll exceed your limit in the middle of month-end?

That is, your 100 largest files _probably_ ought to hold on to a lot of space.

You can certainly run reorgs, but I would strongly recommend that the files get explicit size allocations also. The allocated sizes should be large enough to accommodate most normal month-ends with minimal additional allocations.

So, first thing would be to log the sizes of your large files after month-end completes. How many records fit into the sizes that are allocated, including deleted records? Whatever those sizes are, they are close to what ought to be allocated _before_ month-end even begins.

Note that space allocation takes time. You probably don't want your month-end spending more time linking new disk sectors into the file spaces than the time used for actual processing.

Consider running a job right after month-end every month. For each large file:

 ==>  dspfd  mylib/mylargefile type( *MBR ) output( *OUTFILE ) fileatr( *PF ) outfile( datalib/largembrs ) outmbr( *FIRST *ADD )

After all of the large files have their member attributes captured, you can collect the records into a history file to track growth or trends or whatever. You will have a picture of how big each file's _allocation_ ought to be. Once you know that, reorgs could be done to give you lots of room during the month.

But maybe a day before next month-end, run another job:

 ==>  chgpf  mylib/mylargefile  size( &x &y &z ) allocate( *YES )

The &x, &y and &z values would come from however you choose to assign sizes. Perhaps you'll have programming to read the member history and to generate size() values based on your history.

That job would allocate the spaces ahead of time. You'd not only know that space is actually available; you'd also have much of the processing done for allocations without impacting month-end run-times.

Note that the REUSDLT(*YES) attribute probably still is useful.

Also note that the ALLOCATE(*YES) attribute is actually applied to _members_ rather than files. It doesn't actually take effect until the members are "cleared, restored or reorganized". That means that a reorg would be appropriate right after each CHGPF command. It might also mean that similar jobs would be run after month-end -- CHGPF to reduce allocated attributes and RGZPFM to enforce reductions. (CLRPFM would work as well.)

In any case, consider creating a job description that has a library list that includes only those libraries that hold your "100 largest files". You can then submit a job using that job description for this command:

 ==>  dspfd  *USRLIBL/*ALL  type( *MBR ) output( *OUTFILE ) fileatr( *PF ) outfile( datalib/largembrs ) outmbr( *FIRST *ADD )

That job would then produce a member list of the files in the job description library list. You could use SQL DELETE to remove all rows that show smaller than some size limit that you choose. Remaining rows would be your "n largest files". (If you don't have SQL available, there are still plenty of possibilities.)

From that file, you can generate one, two or more other files to drive other jobs -- reorgs, allocations, whatever.

Overall, the whole thing ought to be possible to set up in a day's worth of time (scattered hours over a few days). Once done, it should more or less run itself.

Am I making sense? Plenty of details can be described or lots of different schemes can be proposed. (Or we can just do straight reorgs if you really want to.)


Featured Post

Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Introduction How to create multiboot configuration with XP\Vista and Windows 7 on it? And most important question - how to do this correctly so not to have any kind of nightmares we get when system gets screwed? First of all one should realize t…
I. Introduction There's an interesting discussion going on now in an Experts Exchange Group — Attachments with no extension ( This reminded me of questions tha…
This is used to tweak the memory usage for your computer, it is used for servers more so than workstations but just be careful editing registry settings as it may cause irreversible results. I hold no responsibility for anything you do to the regist…
Hi friends,  in this video  I'll show you how new windows 10 user can learn the using of windows 10. Thank you.

770 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question