Solved

SAS Data Sets Manipulations

Posted on 2011-03-16
13
610 Views
Last Modified: 2013-11-16
Hello,
I am trying to create multiple datasets named "Data_'Var1'_" from one dataset "OneData".
Dataset "OneData" contains variable "Var1" in its records, so based on that variable I would like to output multiple datasets "Data_'Var1'_" with "Var1" included in those datasets names and only with that particular variable "Var1" .

For example, if dataset "OneData" has only 5 records with "Var1" equal to
03
11
99
75
44

 then
I would have 5 datasets created from "OneData" dataset with the following names:
 
"Data_03_"
"Data_11_"
"Data_99_"
"Data_75_"
"Data_44_"

each of these 5 datasets should only have one record (with that particular "Var1") written to them.
 
Will I be able to create these datasets with the following code?  If not, what should I change or add anything in my codding?

%macro multiple;
   data lib.Data_&i_;
     set lib.OneData
      %do i=03 %to 99;
       %let Var1=%sysfunc(putn(&i,z2.));
     %end;
     ;
   run;
 %mend;

 options mprint;
 %multiple

Thanks!
0
Comment
Question by:labradorchik
  • 7
  • 6
13 Comments
 
LVL 14

Expert Comment

by:Aloysius Low
ID: 35153414
you won't be able to create the tables that you wanted... the quickest and cleanest way is to use the call execute method:

%macro execute;
      data _null_;
            set lib.OneData;
            call execute("data lib.Data_"||trim(left(var1))||"_;");
            call execute("var1 = "||trim(left(var1))||";");
            call execute("run;");
      run;
%mend;
%execute;

alternatively, you could read all the variable values into a global variable, separated by a space, then loop through the global variable... for each value in the global variable (separated by a space), create the dataset that you want and set the value into the dataset
0
 

Author Comment

by:labradorchik
ID: 35209079
Thank you lowaloysius!
I have been testing your macro example, it works, but not in a way I was imaging my output datasets to be.
I guess what I really would like to get as my output: datasets with a name "Data_'Var1'" but all records with "Var1" would have to stay inside of of this dataset.  For example if "OneData" has three records with "Var1"=11 then I would get all those three records in the "Data_11", which will appear only one time.  Right now as my output I am getting three datasets with  the same name "Data_ 11".

Note: For my output none of my datastes should have the same name. So, there will be only one "Data_ 11" dataset even if "OneData" dataset has 3 three records of "Var"=11. All 3 records should be inside of  "Data_ 11" dataset.

What would be a way to read records from "OneData" and output to datasets "Data_'Var1'" with all records included in them for that exact "Var1" value?
0
 

Author Comment

by:labradorchik
ID: 35231197
Hello,
I am still trying to get this one solved. Any help or comments will be greatly appreciated!
Thanks!
0
 
LVL 14

Expert Comment

by:Aloysius Low
ID: 35232223
I'm kinda lost here about the records you are expecting in the output datasets... are you able to give a more concrete example using the original one in your question?
0
 
LVL 14

Expert Comment

by:Aloysius Low
ID: 35232244
could you also post the log which generates 3 of the same data?
0
 

Author Comment

by:labradorchik
ID: 35232460

I am trying to write all three records with var1=28 to "Data_28" dataset, - there should be only one "Data_28" dataset at the end, and not three as I have now. I am just testing this code with only var1=28, but at the end there will be var1= from 01 through 99. I hope this clears up a little bit my previous post. Let me know if you have any other questions. Thank you in advance!!

Here is the log of that code with three records where var1=28:

%macro execute;
369              data _null_;
370                    set lib.OneData;
371                    call execute("data lib.Data_"||trim(left(var1))||";");
372                    call execute("var1 = "||trim(left(var1))||";");
373                    call execute("run;");
374              run;
375        %mend;
376        %execute;
MPRINT(EXECUTE):   data _null_;
MPRINT(EXECUTE):   set lib.OneData;
MPRINT(EXECUTE):   call execute("data lib.Data_"||trim(left(var1))||";");
MPRINT(EXECUTE):   call execute("var1 = "||trim(left(var1))||";");
MPRINT(EXECUTE):   call execute("run;");
MPRINT(EXECUTE):   run;

MPRINT(EXECUTE):   data lib.Data_28;
MPRINT(EXECUTE):   var1 = 28;
MPRINT(EXECUTE):   run;
MPRINT(EXECUTE):   data lib.Data_28;
MPRINT(EXECUTE):   var1 = 28;
MPRINT(EXECUTE):   run;
MPRINT(EXECUTE):   data lib.Data_28;
MPRINT(EXECUTE):   var1 = 28;
MPRINT(EXECUTE):   run;
NOTE: There were 3 observations read from the data set lib.OneData.

NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds
     
NOTE: CALL EXECUTE generated line.
1         + data lib.Data_28;
2         + var1 = 28;
3         + run;

NOTE: Compression was disabled for data set lib.Data_28 because compression overhead wou
ld increase the size of the data set.
NOTE: The data set lib.Data_28 has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds
     
4         + data lib.Data_28;
5         + var1 = 28;
6         + run;

NOTE: Compression was disabled for data set lib.Data_28 because compression overhead wou
ld increase the size of the data set.
NOTE: The data set lib.Data_28 has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.02 seconds
     
7         + data lib.Data_28;
8         + var1 = 28;
9         + run;

NOTE: Compression was disabled for data set lib.Data_28 because compression overhead wou
ld increase the size of the data set.
NOTE: The data set lib.Data_28 has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.02 seconds
0
Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

 
LVL 14

Expert Comment

by:Aloysius Low
ID: 35232547
does this mean that there's 3 records of Var1=28 inside DataOne? and if this happens, what you want is 1 dataset named Data_28 with 3 records of Var1=28?

i'll need to test my code back in office before I can post something (to ensure that it works!), but what i have in mind is to sort the data by Var1 and then have a by statement in the call execute data null step which will check if first.Var1 then execute the data step line, and if last.Var1 then execute the run statement, and if anything in between just output...

(again, this idea needs to be tested when i'm back in office in about 10hrs time :))
0
 

Author Comment

by:labradorchik
ID: 35232688
Sounds good lowaloysius! Thank you in advance!!
And yes, you are absolutely correct: "DataOne" has currently three records with Var=1 and I would like as my output just only 1 dataset named Data_28 with 3 records of Var1=28.
Note: later on "DataOne" will have hundreds of records with Var1= from 01 to 99, so I should only have 99 datasets with multiple records at the end.
0
 
LVL 14

Expert Comment

by:Aloysius Low
ID: 35237563
just tested the code to confirm it's working as expected...

so first step is to sort:
proc sort data=lib.OneData out=work.OneDataSorted;
by Var1;
quit;

then perform the call execute steps by including the by statement in the NULL data step and to check if first.Var1 and last.Var1 to decide what to do:
%macro execute;
  data _null_;
    set work.OneDataSorted;
    by var1;
    if first.var1 then do; /*if current record is first occurrence of the value in Var1, execute the data step to create the dataset*/
      call execute("data lib.Data_"||trim(left(var1))||"_;");
    end;

    call execute("var1 = "||trim(left(var1))||";output;"); /*regardless whether it is the first, last or anywhere in between, output the current record into the output dataset*/

    if last.var1 then do; /*if current record is the last occurrence of the value in Var1, execute the run statement*/
      call execute("run;");
    end;
  run; /*close off the data NULL step*/
%mend;
%execute;
0
 

Author Comment

by:labradorchik
ID: 35240568
Thank you very much for quick response back!

I just run this code and it looks like I have a problem with "output" statement. Do I need to put anything before "output" statement?
I notice if I take "output" statement out, then I only get one record written into Data_28 and not two records as I have right now in OneData dataset. Also, I noticed that only Var1 is written into Data_28 but what about other variables ( I have 10 other variables in the OneData dataset that need to be outputted as well)?

Note: I currently have two records with Var1= 28 in the OneData dataset.
Please see my log:

386  %macro execute;
387    data _null_;
388     set lib.OneData;
389      by Var1;
390       if first.Var1 then do;
391         call execute("data lib.Data_"||trim(left(Var1))||";");
392       end;
393         call execute("Var1 = "||trim(left(Var1))||"output;");
394       if last.Var1 then do;
395         call execute("run;");
396       end;
397    run;
398  %mend;
399  %execute;
MPRINT(EXECUTE):   data _null_;
MPRINT(EXECUTE):   set lib.OneData;
MPRINT(EXECUTE):   by Var1;
MPRINT(EXECUTE):   if first.Var1 then do;
MPRINT(EXECUTE):   call execute("data lib.Data_"||trim(left(Var1))||";");
MPRINT(EXECUTE):   end;
MPRINT(EXECUTE):   call execute("Var1 = "||trim(left(Var1))||"output;");
MPRINT(EXECUTE):   if last.Var1 then do;
MPRINT(EXECUTE):   call execute("run;");
MPRINT(EXECUTE):   end;
MPRINT(EXECUTE):   run;

MPRINT(EXECUTE):   data lib.Data_28;
MPRINT(EXECUTE):   Vat1 = 28output;
MPRINT(EXECUTE):   Var1 = 28output;
MPRINT(EXECUTE):   run;
NOTE: There were 2 observations read from the data set lib.OneData.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds

NOTE: CALL EXECUTE generated line.
1   + data lib.Data_28;
NOTE: Line generated by the CALL EXECUTE routine.
2   + Var1 = 28output;
                   ------
                   22
NOTE: Line generated by the CALL EXECUTE routine.
3   + Var1 = 28output;
                   ------
                   22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, ;, <, <=, <>, =, >, ><,
              >=, AND, EQ, GE, GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, ^=, |, ||, ~=.

4   + run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set lib.Data_28 may be incomplete.  When this step was stopped there were 0 observations and
         2 variables.
WARNING: Data set lib.Data_28 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.02 seconds
0
 

Author Comment

by:labradorchik
ID: 35240632
OK, I just got my second record with adding an extra ; before the output statement. So it worked! :)
Just one last thing: How can I make sure I write all the variable from OneData into Data_28?
At this moment I only see two records with Var1 in Data_28, but no other variables are shown.
0
 
LVL 14

Accepted Solution

by:
Aloysius Low earned 500 total points
ID: 35241219
yes... you'd noticed that your code is only call execute("var1 = "||trim(left(var1))||"output;"); whereas the one i provided was call execute("var1 = "||trim(left(var1))||";output;"); <-- you left out the ";" before output ;)

to output all the variables, i'll need to change the call execute portion:
%macro execute;
  data _null_;
    set work.OneDataSorted;
    by var1;
    if first.var1 then do; /*if current record is first occurrence of the value in Var1, execute the data step to create the dataset*/
      call execute("data lib.Data_"||trim(left(var1))||"_;");
      call execute("  set lib.Onedata;"); /*assuming here lib.OneData is sorted by Var1 already*/
      call execute("  where Var1='"||trim(left(var1))||"';");
    end;

    if last.var1 then do; /*if current record is the last occurrence of the value in Var1, execute the run statement*/
      call execute("run;");
    end;
  run; /*close off the data NULL step*/
%mend;
%execute;

i've not tested this code, so let me know if there are any errors (and post the log too)
0
 

Author Closing Comment

by:labradorchik
ID: 35241308
Wow... what a beauty!! :) Worked perfectly! Thank you!!!!
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Suggested Solutions

Cursors in Oracle: A cursor is used to process individual rows returned by database system for a query. In oracle every SQL statement executed by the oracle server has a private area. This area contains information about the SQL statement and the…
This article explains all about SQL Server Piecemeal Restore with examples in step by step manner.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now