Solved

SAS Data Sets Manipulations

Posted on 2011-02-10
4
643 Views
Last Modified: 2013-11-16
Hello, I am trying to do the following:

Create "NewData" SAS dataset.

Steps:
   1. Read in all records from already existed "OldData1" SAS dataset and sort them by 'Var1', 'Var2', and 'Var3'.
   2. Read in all records from already existed "OldData2" SAS dataset and sort them by 'Var11', 'Var12', and 'Var13'.
         Note: 'Var1', 'Var2', and 'Var3' in "OldData1" have the same info as 'Var11', 'Var12', and 'Var13 in "OldData2".
   3. Merge "OldData1" and "OldData2" by 'Var1', 'Var2', and 'Var3'.
   4. Extract all variables in "OldData2" for matched records in "OldData1".
   5. Keep all records in "OldData1" and count the number of records in "NewData" and send counts to "Data.Log" file.

*************************
The following code is what I have wrote so far. Please let me know if I need to adjust, change or delete anything. I can not run my code through compiler yet because "OldData1" and "OldData2" are not finished yet.
Thank you!


data temp1;
  set lib.OldData1;
 run;

 proc sort data=temp1;
  by Var1 Var2 Var3;
 run;

 data temp2 (rename=(Var1=Var11 Var2=Var12 Var3=Var13));
  set lib.OldData2;
 run;

 proc sort data=temp2;
  by Var1 Var2 Var3;
 run;

 data lib.NewData (drop=count);
   length count 8;
   retain count 0;
  merge temp1 (in=a) temp2 (in=b) end=lastrec;
   by Var1 Var2 Var3;
    if b;
   count + 1;

 if lastrec then do;
  file Data.Log;
    put "Total Records Read-in from all NewData dataset  " count;
 end;
run;


0
Comment
Question by:labradorchik
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
4 Comments
 
LVL 14

Expert Comment

by:Aloysius Low
ID: 34867062
your renaming of variable in data step for temp2 is not correct. it should be var11=var1 and so on

you are supposed to keep all records from olddata1 so the join is left outer join instead of right outer join in your merge step. in this case, you should have done an if a instead of if b.

to keep all variables from olddata2, you could specify in your merge statement that temp1 (keep = var1 var2 var3), or alternatively, when you extract data from olddata1 into temp1, just keep those 3 variables.

you have missed out defining a file statement for your file output.
0
 

Author Comment

by:labradorchik
ID: 34898244
Thank you lowaloysius for your comments!!
My mistake, I meant to code "Var11=Var1", - not "Var1=Var11".
I defined my file statement in my original coding, just did not include it here in my coding question as it seems obvious.
I also changed "if a" instead of "if b".

I have a question about your third comment: What if my "Olddata2" data set has many variables (like 30 or 40), would it be more efficient to keep all variables from "Olddata2" in some other coding way? I just usually use "keep" statement when I have to keep up to 10 variables but not more then that.

Also, when I test my code I do not currently get any errors and I see that I am creating "Data.Log" but nothing is written to it (Data.Log just stays empty) for some reason.  What can be a problem?
0
 
LVL 14

Accepted Solution

by:
Aloysius Low earned 500 total points
ID: 34903087
if you don't specify a keep statement and if there's no drop statement, all variables from that dataset is kept by default. however, the problem here is that you didn't specify any keep or drop variables in your merge statement, hence, all variables from olddata1 and olddata2 would be kept. this is why i asked you to do a keep on olddata1, keeping var1, var2 and var3.

i believe that you would have an error running the code actually, and that the log file isn't the one really generated by your code. as i mentioned, you have not defined a filename statement for your file statement. otherwise, you could also enclose the the filename in quotes i.e. file "D:\logfile.log". if it's still not working, then attaching the output log could help :)
0
 

Author Closing Comment

by:labradorchik
ID: 34961337
Thank you, [b]lowaloysius[/b]!
After a few fixed errors everything worked out fine.
0

Featured Post

Get proactive database performance tuning online

At Percona’s web store you can order full Percona Database Performance Audit in minutes. Find out the health of your database, and how to improve it. Pay online with a credit card. Improve your database performance now!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Never store passwords in plain text or just their hash: it seems a no-brainier, but there are still plenty of people doing that. I present the why and how on this subject, offering my own real life solution that you can implement right away, bringin…
These days, all we hear about hacktivists took down so and so websites and retrieved thousands of user’s data. One of the techniques to get unauthorized access to database is by performing SQL injection. This article is quite lengthy which gives bas…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
This is a high-level webinar that covers the history of enterprise open source database use. It addresses both the advantages companies see in using open source database technologies, as well as the fears and reservations they might have. In this…

617 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question