Maria Torres
asked on
SAS: COMPRESS is not properly working -- flat file is produced with trailing blanks for each record
A flat file is produced by one of our SAS programs. When the flat file is opened, each record has trailing blanks (approximately 35,000 blank spaces).
Within the program, the COMPRESS option is set to YES. As I understand it, this should remove the blank spaces.
Can someone point me in the right direction as to how to remove the trailing spaces?
Thank you.
ASKER
Thank you Shannon for the prompt response.
Yes, I am new to SAS, which is why my question is vague. Let me explain what we are trying to accomplish.
We have several SAS programs that were originally executed in UNIX. These programs produced flat DAT files.
We have now transferred the SAS programs from UNIX to Windows SAS, with the appropriate modifications. At first, the programs appeared to execute without any issues. But once we reviewed the flat DAT files, we noticed that the file size is 3 times larger than it was when executed in UNIX.
When we opened the DAT file for review, we noticed that each delimited record is padded with spaces. Some records are padded with as many as 3,000 blank spaces.
I'm assuming that in Windows, the generated files' records are padded to a specific size; but in UNIX, the files are not.
We are using the TRIM function when we populate the record before writing it to the file. However, this function is not removing the trailing spaces. We also use the COMPRESS= data set option to compress an individual file.
Now we are trying to figure out a way of generating the files without the records being padded.
Any suggestion is appreciated. Thank you.
ASKER CERTIFIED SOLUTION
ASKER
Thank you. The LRECL worked like a charm.
I think that you have misunderstood a few SAS concepts.
First there is compression of SAS data sets. That is, when SAS writes a data set it is possible to specify that the records be compressed. Hopefully the stored data set will be smaller than it would otherwise be. However, that is not the flat file you need. This compression is specified on a LIBNAME statement ---
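As a rough illustration of data set compression (a sketch only: the libref mylib, the path and the data set names are hypothetical, and the path would have to exist for this to run), the option can be given on the LIBNAME statement or as a data set option. Either way it only affects how the SAS data set is stored, not any external file written with FILE and PUT.

```sas
/* Hypothetical libref and path, for illustration only. */
libname mylib 'C:\data\sas' compress=yes;

/* Every data set created in mylib is now stored compressed. */
data mylib.sample;
    length text $200;
    text = 'short value';
run;

/* The same request for a single data set, using the data set option. */
data work.sample2 (compress=yes);
    length text $200;
    text = 'short value';
run;
```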
Next there is the COMPRESS function, used within a SAS DATA step or in SQL code in PROC SQL (and a few other places that produce content, e.g. compute blocks in PROC REPORT). This removes every occurrence of a nominated character from a string.
There are also the COMPBL, COMPRESS and TRIM functions, which you may have been using to remove blanks.
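A small sketch of how these three functions differ (the variable names and sample strings are made up for illustration):

```sas
data _null_;
    length s $30;
    s = '  one   two  three  ';

    a = compress(s);                  /* removes every blank: onetwothree        */
    b = compbl(s);                    /* collapses each run of blanks to one     */
    d = compress('2024-01-15', '-');  /* removes the nominated character: hyphen */

    /* TRIM only drops trailing blanks, so its effect shows up when the
       result is used in an expression such as a concatenation. */
    e = '[' || trim(s) || ']';

    put a= / b= / d= / e=;
run;
```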
Worth noting here is the case where the character to remove is not specified, so it defaults to a blank.
The important part is that the destination variable has its own length, so the output of COMPRESS is padded with blanks up to the length of the destination variable.
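The padding can be seen in a short sketch like the following (variable names are hypothetical). LENGTH reports the position of the last non-blank character, while VLENGTH reports the declared width of the variable, and a PUT with a fixed-width format writes the full declared width.

```sas
data _null_;
    length src dst $20;            /* both variables are 20 characters wide  */
    src = 'a b c';
    dst = compress(src);           /* blanks removed, then padded back to 20 */

    used_len     = length(dst);    /* ignores the trailing blanks            */
    declared_len = vlength(dst);   /* the width given in the LENGTH statement */
    put used_len= declared_len=;

    /* A fixed-width format writes all 20 characters, padding included. */
    put '[' dst $char20. ']';
run;
```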
========
My guess is that when you are writing out the data for the flat file, you have managed to have either a data set variable or a temporary variable with a length of about 2^15 = 32768 (the maximum length of a SAS character variable is 32,767), which contains all the information required but is padded out to the set length.
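If that guess is right, one way to avoid writing the padding is sketched below. This is only an assumption about what your program looks like: the output path, the variable names and the sample fields are hypothetical. LRECL= on the FILE statement sets the maximum record length, and the $VARYING format writes only as many characters as the length variable says.

```sas
data _null_;
    length line $32767;                        /* wide work variable, as guessed above */

    /* Hypothetical output path; set LRECL= to at least the longest real record. */
    file 'C:\temp\out.dat' lrecl=32767;

    /* Build one delimited record (sample values only). */
    line = catx('|', 'A123', 'Smith', '2024-01-15');

    len = length(line);                        /* position of the last non-blank character */
    put line $varying32767. len;               /* writes len characters, not 32767 */
run;
```

Whether this matches your case depends on how the record is built and written, so treat it as a starting point only.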
If you have specified COMPRESS on a LIBNAME statement it will not do what you want.
Really we need a bit more context about what you are doing to spot the real source of the trouble.
Ian