Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

SAS Data Sets Manipulations

Posted on 2011-02-10
4
Medium Priority
?
649 Views
Last Modified: 2013-11-16
Hello, I am trying to do the following:

Create "NewData" SAS dataset.

Steps:
   1. Read in all records from already existed "OldData1" SAS dataset and sort them by 'Var1', 'Var2', and 'Var3'.
   2. Read in all records from already existed "OldData2" SAS dataset and sort them by 'Var11', 'Var12', and 'Var13'.
         Note: 'Var1', 'Var2', and 'Var3' in "OldData1" have the same info as 'Var11', 'Var12', and 'Var13 in "OldData2".
   3. Merge "OldData1" and "OldData2" by 'Var1', 'Var2', and 'Var3'.
   4. Extract all variables in "OldData2" for matched records in "OldData1".
   5. Keep all records in "OldData1" and count the number of records in "NewData" and send counts to "Data.Log" file.

*************************
The following code is what I have wrote so far. Please let me know if I need to adjust, change or delete anything. I can not run my code through compiler yet because "OldData1" and "OldData2" are not finished yet.
Thank you!


data temp1;
  set lib.OldData1;
 run;

 proc sort data=temp1;
  by Var1 Var2 Var3;
 run;

 data temp2 (rename=(Var1=Var11 Var2=Var12 Var3=Var13));
  set lib.OldData2;
 run;

 proc sort data=temp2;
  by Var1 Var2 Var3;
 run;

 data lib.NewData (drop=count);
   length count 8;
   retain count 0;
  merge temp1 (in=a) temp2 (in=b) end=lastrec;
   by Var1 Var2 Var3;
    if b;
   count + 1;

 if lastrec then do;
  file Data.Log;
    put "Total Records Read-in from all NewData dataset  " count;
 end;
run;


0
Comment
Question by:labradorchik
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
4 Comments
 
LVL 14

Expert Comment

by:Aloysius Low
ID: 34867062
your renaming of variable in data step for temp2 is not correct. it should be var11=var1 and so on

you are supposed to keep all records from olddata1 so the join is left outer join instead of right outer join in your merge step. in this case, you should have done an if a instead of if b.

to keep all variables from olddata2, you could specify in your merge statement that temp1 (keep = var1 var2 var3), or alternatively, when you extract data from olddata1 into temp1, just keep those 3 variables.

you have missed out defining a file statement for your file output.
0
 

Author Comment

by:labradorchik
ID: 34898244
Thank you lowaloysius for your comments!!
My mistake, I meant to code "Var11=Var1", - not "Var1=Var11".
I defined my file statement in my original coding, just did not include it here in my coding question as it seems obvious.
I also changed "if a" instead of "if b".

I have a question about your third comment: What if my "Olddata2" data set has many variables (like 30 or 40), would it be more efficient to keep all variables from "Olddata2" in some other coding way? I just usually use "keep" statement when I have to keep up to 10 variables but not more then that.

Also, when I test my code I do not currently get any errors and I see that I am creating "Data.Log" but nothing is written to it (Data.Log just stays empty) for some reason.  What can be a problem?
0
 
LVL 14

Accepted Solution

by:
Aloysius Low earned 2000 total points
ID: 34903087
if you don't specify a keep statement and if there's no drop statement, all variables from that dataset is kept by default. however, the problem here is that you didn't specify any keep or drop variables in your merge statement, hence, all variables from olddata1 and olddata2 would be kept. this is why i asked you to do a keep on olddata1, keeping var1, var2 and var3.

i believe that you would have an error running the code actually, and that the log file isn't the one really generated by your code. as i mentioned, you have not defined a filename statement for your file statement. otherwise, you could also enclose the the filename in quotes i.e. file "D:\logfile.log". if it's still not working, then attaching the output log could help :)
0
 

Author Closing Comment

by:labradorchik
ID: 34961337
Thank you, [b]lowaloysius[/b]!
After a few fixed errors everything worked out fine.
0

Featured Post

NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article shows the steps required to install WordPress on Azure. Web Apps, Mobile Apps, API Apps, or Functions, in Azure all these run in an App Service plan. WordPress is no exception and requires an App Service Plan and Database to install
In this article, we’ll look at how to deploy ProxySQL.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…

670 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question