Solved

SAS Data Sets Manipulations

Posted on 2010-09-21
Last Modified: 2013-11-16
I have four SAS datasets as my inputs:
 masterdata
 data1
 data2
 data3

I need to split the "masterdata" file into 3 wave files. For example, every "masterdata" record that matches "data1" should be written to my output dataset "newdata1".
So, my output datasets are "newdata1", "newdata2", and "newdata3".

Note: In all my input datasets, the key variables are 'var3', 'var4', and 'var5'.
Here is what I came up with.  Is this correct?  If not, what do I need to change or add?

proc sort data=lib.masterdata;
 by var3 var4 var5;
run;

proc sort data=lib.data1;
 by var3 var4 var5;
run;

data lib.newdata1;
 merge lib.masterdata(in=a) lib.newdata1(in=b);
  by var3 var4 var5;
 if a and b;
run;


proc sort data=lib.data2;
 by var3 var4 var5;
run;

data lib.newdata2;
 merge lib.masterdata(in=a) lib.data2(in=b);
  by var3 var4 var5;
 if a and b;
run;


proc sort data=lib.data3;
 by var3 var4 var5;
run;

data lib.newdata3;
 merge lib.masterdata(in=a) lib.data3(in=b);
  by var3 var4 var5;
 if a and b;
run;


     
Question by:labradorchik
 

Author Comment

by:labradorchik
ID: 33725010
Sorry, my code should read as below:

proc sort data=lib.masterdata;
 by var3 var4 var5;
run;

proc sort data=lib.data1;
 by var3 var4 var5;
run;

data lib.newdata1;
 merge lib.masterdata(in=a) lib.data1(in=b);
  by var3 var4 var5;
 if a and b;
run;


proc sort data=lib.data2;
 by var3 var4 var5;
run;

data lib.newdata2;
 merge lib.masterdata(in=a) lib.data2(in=b);
  by var3 var4 var5;
 if a and b;
run;


proc sort data=lib.data3;
 by var3 var4 var5;
run;

data lib.newdata3;
 merge lib.masterdata(in=a) lib.data3(in=b);
  by var3 var4 var5;
 if a and b;
run;


Accepted Solution

by:
bradanelson earned 300 total points
ID: 33725328
It looks good.  Are you not getting the results you are expecting?  There are lots of ways to do this, and everyone will have their own opinion on the "best way".  However, if it makes sense to you and it works, you should stick with it.

You would get similar results using PROC SQL without the need to SORT the data:
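%MACRO NewData(dsn);
  /* Inner join: keep masterdata rows whose keys also appear in &dsn.     */
  /* a.* and b.* both include var3-var5, so the log will show "variable   */
  /* already exists" warnings; the first copy of each column is kept.     */
  PROC SQL;
    CREATE TABLE lib.new&dsn AS
    SELECT a.*, b.*
    FROM lib.masterdata a INNER JOIN lib.&dsn b
    ON a.var3=b.var3 AND a.var4=b.var4 AND a.var5=b.var5;
  QUIT;
%MEND NewData;

%NewData(data1);
%NewData(data2);
%NewData(data3);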
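
The join above requests both a.* and b.*, so the key columns are selected twice and PROC SQL flags them in the log as already existing. If only the masterdata columns are wanted in each wave file (an assumption, since the question does not say), a variant along these lines might be tidier. The macro name NewDataExists is hypothetical, and this is a sketch rather than a tested replacement:

```sas
/* Sketch: keep only masterdata columns for rows whose key   */
/* (var3, var4, var5) appears in the given wave dataset.     */
/* Assumes no columns from the wave dataset are needed.      */
%macro NewDataExists(dsn);   /* hypothetical macro name */
  proc sql;
    create table lib.new&dsn as
    select a.*
    from lib.masterdata as a
    where exists (
      select *
      from lib.&dsn as b
      where a.var3 = b.var3
        and a.var4 = b.var4
        and a.var5 = b.var5
    );
  quit;
%mend NewDataExists;

%NewDataExists(data1)
%NewDataExists(data2)
%NewDataExists(data3)
```

Because EXISTS only tests for a matching key, each masterdata row is written at most once even if the wave dataset repeats a key, whereas the inner join would return one row per matching pair.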


Or you could try to MERGE all three datasets in a single MERGE statement:
proc sort data=lib.masterdata;
 by var3 var4 var5;
run;

proc sort data=lib.data1;
 by var3 var4 var5;
run;

proc sort data=lib.data2;
 by var3 var4 var5;
run;

proc sort data=lib.data3;
 by var3 var4 var5;
run;

data lib.newdata1 lib.newdata2 lib.newdata3;
 merge lib.masterdata(in=a) lib.data1(in=b) lib.data2(in=c) lib.data3(in=d);
  by var3 var4 var5;

  if a and b THEN OUTPUT lib.newdata1;
  if a and c THEN OUTPUT lib.newdata2;
  if a and d THEN OUTPUT lib.newdata3;
run;
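
One thing to keep in mind with the single MERGE above: every output dataset receives the variables from all four inputs, and a non-key variable that shares a name across inputs takes its value from a later dataset in the MERGE list. If the wave files are only meant to act as filters (an assumption, not something stated in the question), restricting them to the key variables with KEEP= is one way to avoid that. A minimal sketch, assuming the PROC SORT steps above have already run:

```sas
/* Sketch: use the wave datasets only as key lists, so each  */
/* output contains masterdata's variables and nothing else.  */
/* Assumes all four inputs are already sorted by the keys.   */
data lib.newdata1 lib.newdata2 lib.newdata3;
  merge lib.masterdata(in=a)
        lib.data1(in=b keep=var3 var4 var5)
        lib.data2(in=c keep=var3 var4 var5)
        lib.data3(in=d keep=var3 var4 var5);
  by var3 var4 var5;

  if a and b then output lib.newdata1;
  if a and c then output lib.newdata2;
  if a and d then output lib.newdata3;
run;
```

A masterdata record whose key appears in more than one wave file is written to each matching output, which is the same behavior as the three separate merges.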

Good luck!

Expert Comment

by:Aloysius Low
ID: 33725726
Same comment as bradanelson: I don't see anything wrong with it. But if you have run it and it's not producing the results you are expecting, then do post it here so that we can help.
 

Author Closing Comment

by:labradorchik
ID: 33727117
bradanelson,
Thank you!!