Solved

CHGPF vs CRTPF

Posted on 2004-10-29
1,507 Views
Last Modified: 2012-08-13
Hi fellows,

Can someone tell me the difference between CHGPF and CRTPF? Yes, I know the data stays in the file, the LFs stay there, the security is the same, and the triggers are still there.

But what actually happens during a CHGPF???

Thanks a lot
Question by:theo kouwenhoven
    15 Comments
     
    LVL 26

    Expert Comment

    by:tliotta
    murphey2:

    I've never looked at the details, but I wrote my own CHGPF way back in OS/400 V2R2, so I might guess.

    Start with CRTDUPOBJ into QTEMP (or perhaps QRPLOBJ) for the PF with DATA(*YES). Then CRTDUPOBJ the various LFs into the same library. You now have a basic copy of the file, the data, and the logical views.

    Now CRTPF into the original library, replacing the original physical file. Then do a CPYF from the temp copy back to the original with FMTOPT(*MAP *DROP). This reloads the data into the new file definition. Then CRTDUPOBJ the LFs from temp back to the original library. Finally, delete all the temporary stuff.
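    As a rough CL sketch of that sequence (MYLIB, MYPF, MYLF01 and the DDS source file QDDSSRC are placeholder names; a real implementation, as noted below, would also have to preserve authorities, triggers and constraints):

    ```cl
    /* Hypothetical CL sketch of the manual sequence above.            */
    /* MYLIB, MYPF, MYLF01 and QDDSSRC are placeholder names.          */

    /* 1. Duplicate the PF (with data) and its LFs into QTEMP.         */
    CRTDUPOBJ  OBJ(MYPF)   FROMLIB(MYLIB) OBJTYPE(*FILE) TOLIB(QTEMP) DATA(*YES)
    CRTDUPOBJ  OBJ(MYLF01) FROMLIB(MYLIB) OBJTYPE(*FILE) TOLIB(QTEMP)

    /* 2. Drop the originals (LFs first, since they depend on the PF)  */
    /*    and create the PF from the new DDS definition.               */
    DLTF       FILE(MYLIB/MYLF01)
    DLTF       FILE(MYLIB/MYPF)
    CRTPF      FILE(MYLIB/MYPF) SRCFILE(MYLIB/QDDSSRC)

    /* 3. Reload the data, mapping matching fields and dropping        */
    /*    removed ones.                                                */
    CPYF       FROMFILE(QTEMP/MYPF) TOFILE(MYLIB/MYPF) MBROPT(*REPLACE) +
                 FMTOPT(*MAP *DROP)

    /* 4. Re-create the LFs over the new PF, then clean up.            */
    CRTDUPOBJ  OBJ(MYLF01) FROMLIB(QTEMP) OBJTYPE(*FILE) TOLIB(MYLIB)
    DLTF       FILE(QTEMP/MYPF)
    DLTF       FILE(QTEMP/MYLF01)
    ```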

    There are a bunch of problems that must be overcome to make this work with all authorities and triggers and constraints and SQL views, etc.; but that kind of stuff is easy for IBM. With various APIs to retrieve all the info into user spaces or memory beforehand, everything can be recreated.

    This isn't intended to say that IBM's command does it this way; it only gives a conceptual view of the kinds of things that can be done. I have a RGZDBRPFM command on my old web site ( http://zap.to/tl400 ) with a few samples of what can be done to speed the process of rebuilding access paths when the file is large, with many LFs that don't share access paths. My command uses Data Base Relations to cut the re-org time on very large files. I suppose I ought to get back out there and modernize it, but it still works for most cases, so...

    Tom
     
    LVL 16

    Author Comment

    by:theo kouwenhoven
    Yeah yeah, I know.

    I already have my own Zupdpd, and that does the copying, creating, rebuilding, re-triggering and granting, but that was not the question.
    I discovered CHGPF just a couple of weeks ago, and I was wondering whether IBM is as smart as we are :), or whether they have another way to add a column to the file.

    SO ????



     
    LVL 26

    Expert Comment

    by:tliotta
    Good question... there are indeed a couple of apparent oddities with CHGPF. Since IBM has direct access to any system-state/system-domain function, they can pretty much do whatever they choose, regardless of what they let us do (without breaking security).

    But it's the same question that can be asked about all kinds of things -- e.g., IBM can mess around with program-associated spaces for ILE programs, while we can only do so for OPM programs that have been compiled through the trusted translator (i.e., RPG or COBOL but not CL).

    If IBM can directly manipulate object descriptions -- which they obviously can, because they create them -- then there's nothing to stop them from rewriting the file/field descriptions directly, then simply moving bytes between data spaces, adjusting for fields as needed, and associating the new space with the updated object description.

    Without major disassembly of CHGPF, I don't see how we can know. We could make some educated guesses based on what we see. I've seen one or two comments here about it.

    Tom


     
    LVL 16

    Author Comment

    by:theo kouwenhoven
    Hi Tom,

    Yes, that's what I was thinking. If I look at the speed of CHGPF versus the speed of my own program, I have the feeling that CHGPF is faster (maybe just because it is using C instead of CL...).

    Thanks for your reaction. I will leave the question open for a while; I'm almost sure that Dave will add some comment to this.
     
    LVL 14

    Expert Comment

    by:daveslater
    Hi
    A very interesting question that I have been looking into.
    I have found some very dangerous consequences of using CHGPF and will post results in the next few days.

    Dave
     
    LVL 14

    Accepted Solution

    by:daveslater
    Hi chaps, I got it done a bit quicker than I thought.

    Like Tom has stated, unless we decompile the objects we can never know for sure.
    What I can be sure of is that CHGPF uses the SQL ALTER TABLE (AT) function to perform its database changes. It has some basic validation, such as not allowing fields used as indexes to be removed, but this validation is very basic.

    So I thought: let's see what we can do with both CHGPF and ALTER TABLE, to try and identify what IBM has done. I am running V5.1.
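    For reference, the two routes being compared would look something like this; MYLIB, MYPF, NEWFLD and QDDSSRC are placeholder names:

    ```cl
    /* Route 1: CHGPF, pointing at DDS source that now includes NEWFLD. */
    CHGPF FILE(MYLIB/MYPF) SRCFILE(MYLIB/QDDSSRC)

    /* Route 2: the SQL equivalent, e.g. from interactive SQL:          */
    /*   ALTER TABLE MYLIB/MYPF ADD COLUMN NEWFLD CHAR(10)              */
    ```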

    I did a basic test under debug to look at what happened.
    Setup:
    Create a copy of a file and see how long it takes, for timing purposes. Then create some indexes (14 seconds to create the file and 55 seconds to create the indexes).

    Add and drop fields in the file; the copy part of the CHGPF takes 14 seconds.

    Using CPYF to perform the same copy takes 2 minutes, so IBM must be doing something different.

    We can all see the file Q_TA00001, so I am guessing that this is the original file, renamed (it has to be a rename, given the speed).
    The new file is created and the data pointers are mapped (not copied) back into it. Finally the Q_TA00001 file is deleted and the system re-associates the access paths etc. with the new object. This can be achieved because we have an exclusive lock on the file.

    OK, so what would happen if I tried to remove a field that was used as a key?
    This is not possible with CHGPF, but it is with AT.
    When using the AT command to remove a field that is used as a key, you get a warning that data will be lost, and the indexes are then deleted. A bit dangerous, but we still have the original data, so not too bad. Again the copy takes about 14 seconds, so I guess that is via the data pointers.

    How can I prove my data-pointer theory?
    With a keyed physical, the deleted records are removed even when CPYF is used with the COMPRESS(*NO) option; with a non-keyed physical they remain. So I created a keyed physical, deleted some records, and changed it. Result: the deleted records were still there. This looks like it backs up my data-pointer theory.
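    That deleted-records check can be sketched like this; MYLIB/KEYPF, the field name and the DDS source file are placeholders, and DSPFD is used just to read the deleted-record count:

    ```cl
    /* Create a keyed physical from DDS (the DDS has a K entry),        */
    /* write a few records, then delete one or two of them.             */
    CRTPF  FILE(MYLIB/KEYPF) SRCFILE(MYLIB/QDDSSRC)
    /* ... load records, then delete some, e.g. via SQL:                */
    /*   DELETE FROM MYLIB/KEYPF WHERE F01 = 'aaaaaaaaa2'               */

    /* Change the file definition in place.                             */
    CHGPF  FILE(MYLIB/KEYPF) SRCFILE(MYLIB/QDDSSRC)

    /* If CHGPF did a real CPYF-style copy, the deleted-record slots    */
    /* would be gone; check the "Deleted records" count afterwards.     */
    DSPFD  FILE(MYLIB/KEYPF) TYPE(*MBR)
    ```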

    This got my mind in a bit of a state: if it is simply pointer re-allocation without any additional checks, what is to stop us messing up some indexes?

    So I created a file

    F01      Char 10
    F02      Char 10
    F03      Char 10

    Created a unique index on F02 and added the records
    aaaaaaaaa1
    aaaaaaaaa2
    aaaaaaaaa3
    for all three fields,
    then altered the attributes of F02 to CHAR(9) using the CHGPF command.
    It sailed through with only a "might result in data loss" warning. I now have a unique logical with duplicate keys. Not too bad, because the original data is still available.

    OK, so if I can do it via a logical, I should be able to do it on a physical.
    So I recreated the file as a keyed physical and performed the same test, with the same result, i.e. duplicate keys in a unique file.
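    The corruption test could be reproduced roughly like this; the names are placeholders, and it should obviously only be tried on a disposable test box:

    ```cl
    /* DDS for MYLIB/TESTPF: a keyed physical with a UNIQUE key on F02. */
    /*                                                                  */
    /*   A          UNIQUE                                              */
    /*   A          R TESTR                                             */
    /*   A            F01           10A                                 */
    /*   A            F02           10A                                 */
    /*   A            F03           10A                                 */
    /*   A          K F02                                               */
    /*                                                                  */
    /* Load rows whose keys differ only in the last byte:               */
    /*   aaaaaaaaa1 / aaaaaaaaa2 / aaaaaaaaa3                           */
    /* Then shrink F02 to 9A in the DDS and run:                        */
    CHGPF FILE(MYLIB/TESTPF) SRCFILE(MYLIB/QDDSSRC)
    /* Only a "might result in data loss" warning is issued; after the  */
    /* truncation all three keys are 'aaaaaaaaa', i.e. duplicate keys   */
    /* behind a UNIQUE access path.                                     */
    ```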

    This, to me, is proof that only the data pointers are re-mapped, not the actual data. The downside is that you can corrupt your data.

    I did a simple UPDDTA on the file, which generated an error saying you cannot access the file for update (I got a member-full error).

    I wanted to perform more tests on the data, but I am unsure of the consequences of, say, RGZPFM, and do not want to put the production machine in danger (I have changed jobs and do not have my own test system any more).

    If anyone has a test box that can be put in danger, you could try some other stuff on the corrupt indexes and post the results.

    Some other tests:
    Changing keys.
    The data is again mapped quickly (14 seconds), and a QDBSRV job takes on the role of re-building the access path.

    So my conclusion is that CHGPF renames the original database file and then uses the data pointers to populate the data pointers of the new file without performing any consistency checks. Any associations, i.e. access paths and triggers, are transferred to the new object, alleviating the need for them to be re-created.

    Dave

     
    LVL 16

    Author Comment

    by:theo kouwenhoven
    Hi Dave,

    I already thought that this was a typical problem/question for you... :)

    You mentioned ".....not too bad because the original data is still available."
    Do you mean that when you change the field back from 9 to 10 positions, the last position will be available again?

     
    LVL 14

    Expert Comment

    by:daveslater
    Hi

    You mentioned ".....not too bad because the original data is still available."
    That is right, but all of the other data in the file is available via the physical (except for the last byte of the altered field), and the logical file can be deleted.

    But for a keyed physical I am not sure what will happen, and I did not dare to experiment too much.

    The command is designed to be used by programmers, and I guess you need to be careful.

    Dave
     
    LVL 16

    Author Comment

    by:theo kouwenhoven
    Wow, this is more than I expected, but when I look at your whole story, the conclusion is:

    As long as
    - we don't make fields shorter,
    - we don't mess around with indexes,
    it's quite safe to use it.

    Most of the time I will just use it to add fields to a file. So no problem?!?!




     
    LVL 16

    Author Comment

    by:theo kouwenhoven
    The more answers of this level I read, the more I become aware that I don't know very much about the AS/400. That's a bad score after 18 years of AS/400 experience ;-)


     
    LVL 14

    Expert Comment

    by:daveslater
    Hi
    It is one of those things you take on trust that it works, and 99.999% of the time it will.
    Now that we are aware of some limitations, we can make sure our systems remain safe, and that has to be good news.

    Again, indexes in general are fine; it is only unique indexes that are bad news.

    Dave
     
    LVL 14

    Expert Comment

    by:daveslater
    Hi again
    >>> The more answers of this level I read, the more I become aware that I don't know very much about the AS/400. That's a bad score after 18 years of AS/400 experience <<<

    It is not that we don't know much; it's just that the iSeries/AS400 is so vast and so powerful that our accumulated knowledge is probably about 20% of what the beast is capable of.

    If IBM made the details more available, I am sure that no other computer on the planet would come close. Come on, Big Blue, you have the best system out there: let the world know, and let us know the full potential, so we can develop world-beating applications.

    With ILE we now have the capacity to develop flexible, robust, highly scalable applications at great speed, but no one is doing it.

    A slightly bemused Dave
     
    LVL 14

    Expert Comment

    by:daveslater
    PS
    it kept me out of the lounge for a few days **grin**
     
    LVL 16

    Author Comment

    by:theo kouwenhoven
    Yeah,

    The machine is good (the best), the DB fast, the possibilities are endless... but my boss prefers to sell WebSphere and doesn't put any power into AS/400 customers, market or development... only networking and e-junk.

    So I'm afraid the 400 has had its best (longest) time...
     
    LVL 14

    Expert Comment

    by:daveslater
    Hi
    I have just found out some more info.
    The access path is re-built asynchronously instead of synchronously, which is why it is quick and cannot check for duplicates.

    Dave
