Trying to create SAS dataset from .dat file in macro

Hello, I am trying to see if there are any errors in the following SAS code. Are there any errors, or is there a better way to write this code? Any comments or suggestions are welcome!

Input: file1.dat  
Note: layout is given for this file

Output: data1.sas7bdat
Note: This output SAS dataset file contains the same data as the input file1. All fields are character.

Process:
1.      Create an output record in data1 for every input record in file1.
2.      Each output record shall contain all input record variables.
3.      Save the data1 dataset in ascending order by var1/var5/var7.
4.      Write the following information to the SAS log:
      a.      Number of records read:    <insert the number>
      b.      Number of records output: <insert the number>


Working Code:

libname lib1 'C:\'; 
filename file1 'C:\file1.dat'; 

%let countin=0;
%let countout=0;

%macro mainmacro;

data lib1.data1;
	drop countin;
 	infile file1;    *Reading the file1.dat file into SAS;
 	input var1 $ 1-19 
 		  var2 $ 20-34 
 		  var3 $ 35-53 
 		  var4 $ 54-68 
 		  var5 $ 69-93
 		  var6 $ 94-98 
 		  var7 $ 99-103 
 		  var8 $ 104-105;
 
		      countin+1;   *Records Input;
		      CALL SYMPUT ('countin', countin);   *Placeholder;
run;

data _null_; 
   set lib1.data1;
    countout+1;   *Records Output;
      CALL SYMPUT ('countout', countout);   *Placeholder;
run;


proc sort data=lib1.data1;     *Sort;
	by var1 var5 var7;	
run;

proc print data=lib1.data1;    * Output;
run;

%put End of the SAS Log;
%put ----------------------;
%put Number of records read:    &countin;
%put Number of records output: &countout;
 
%mend mainmacro;

%mainmacro;


Asked by: labradorchik

IanStatistician commented:
Hi there labradorchik,

A/ Using a macro isn't necessary and ends up confusing you. Initialising the macro variables outside the macro is superfluous. They are global, so the CALL SYMPUT calls inside the macro update those same variables (the macro would get separate copies only if it declared them itself, for example with a %local statement), and they don't need initialising because each one is written in a datastep before it is referenced.

With the necessary logic flow you can use two PUT statements in the first datastep instead of the CALL SYMPUT calls and macro variables.


B/ the comment on the statement

       infile file1;    *Reading the file1.dat file into SAS;

and the way the statement

            CALL SYMPUT ('countin', countin);   *Placeholder;

is used indicate you don't yet appreciate how a SAS datastep works.

Most often (unless there is advanced programming), the statements are executed in order from the top, and the INPUT statement reads the first record. SAS then continues with the processing, with data from just that one record, until it reaches the bottom, that is, the RUN statement. At that stage, unless you have programmed an OUTPUT statement into the datastep, it writes out the data record with an implicit OUTPUT.
It then goes back to the top and does it all again, reading the next record. This is repeated for each data record until the last record of the input has been processed.
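
The loop described above can be watched in the log with a small sketch like the following. It is only an illustration: the dataset name work.demo, the variable id, and the three inline records are made up for the example and are not part of the original program.

```sas
/* Illustration only: work.demo, id and the inline records are hypothetical */
data work.demo;
    put 'Top of iteration ' _n_;   /* runs once per pass through the datastep */
    input id $ 1-3;                /* reads exactly one record per pass */
    put '  read record ' id;
    /* no OUTPUT statement, so SAS writes the observation at the bottom */
    datalines;
A01
B02
C03
;
run;
```

The log should show the "Top of iteration" message four times but only three records read: on the fourth pass the INPUT statement finds no more data and the datastep stops, so work.demo ends up with three observations.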


The comment on the INFILE statement should indicate that it specifies where the next data record is to come from (rather than saying it reads all the data in at once, which is what you implied).
e.g.
             * next record to come from file1 ;

The CALL SYMPUT statement will be executed for every data record, which means 149,999 unnecessary calls for a 150,000-record input file. Only the value from the last call survives as the resulting value when the datastep finishes.

There is an END=xx option available for the INFILE and SET statements which will be useful here. Check it out in the SAS statement reference.
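
As a rough sketch of how END= behaves, the example below writes a small temporary file and reads it back, setting the macro variable once, on the last record. The fileref tmpdat, the flag name last_rec, the counter n_read, and the sample records are assumptions made up for the illustration; a temporary file is used because END= cannot be combined with DATALINES.

```sas
/* Illustration only: tmpdat, last_rec, n_read and the records are hypothetical */
filename tmpdat temp;

data _null_;                 /* write a three-record sample file */
    file tmpdat;
    put 'A01';
    put 'B02';
    put 'C03';
run;

data work.demo;
    infile tmpdat end=last_rec;   /* last_rec becomes 1 on the final record */
    input id $ 1-3;
    n_read + 1;                   /* count each record as it is read */
    if last_rec then call symputx('countin', n_read);   /* one call only */
    drop n_read;
run;

%put Number of records read: &countin;
```

One thing to keep in mind with this pattern: if the input file is empty, the datastep stops at the first INPUT and the macro variable is never set.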

You can accumulate both the input count and the output count in the first data step even though they will be equal in your case. In fact in more complicated processing counting both is a very useful thing to do.  However you should be counting the input records just after (or just before if you prefer) the INPUT statement, and count the output records just after or just before the OUTPUT statement.  In your case where you haven't coded an OUTPUT statement you should put the counter a bit before the RUN statement.

(Remember that you will be using the value of countout for the log message on the last record, so make sure you get it at the right time.)

The second datastep can be done away with.
The proc print step wasn't a requirement and should be removed.
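
Putting these suggestions together, a single-datastep version might look something like the sketch below. It reuses the libname, fileref, and column layout from the question; the flag name last_rec is an assumption introduced here, and the exact wording of the log messages follows the stated requirements rather than anything confirmed in this thread.

```sas
libname lib1 'C:\';
filename file1 'C:\file1.dat';

data lib1.data1;
    infile file1 end=last_rec;   /* next record to come from file1 */
    input var1 $ 1-19
          var2 $ 20-34
          var3 $ 35-53
          var4 $ 54-68
          var5 $ 69-93
          var6 $ 94-98
          var7 $ 99-103
          var8 $ 104-105;
    countin + 1;                 /* counted just after the INPUT statement */

    countout + 1;                /* counted just before the implicit OUTPUT */
    if last_rec then do;         /* write the totals to the log once */
        put 'Number of records read:    ' countin;
        put 'Number of records output: ' countout;
    end;
    drop countin countout;       /* counters are not part of the output */
run;

proc sort data=lib1.data1;
    by var1 var5 var7;
run;
```

If some records in file1.dat could be shorter than 105 characters, it may be worth adding the TRUNCOVER option to the INFILE statement so that a short record is not completed from the following line; whether that applies depends on the actual file.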

Hope that helps with your SAS learning.

Ian

labradorchik (author) commented:
Ian, thank you very much for all your comments and suggestions! I really appreciate them!
The above code worked as it is, but I am currently rewriting it into just one data step (instead of two) and will try not to put my data step inside the macro.

aikimark,
I wish it were just homework... :)
I write my own requirements/specs based on the data that needs to be processed, and then I try to figure out the best way to code it in SAS, so that the code is more efficient and does not take longer to run when 100,000 or more records are processed.