SAS Data Sets Manipulations

Posted on 2011-10-13
Last Modified: 2013-11-16

Trying to create a new SAS datasets & ASCII file from already given datasets/ASCII.

Given SAS datasets:
 "Data3<number>" (Data3<number> dataset may have different numbers (from 2 to 18, also number 23 and 34) inside its dataset name).
 "Data4<number>" (Data4<number> dataset may have different numbers (from 2 to 18, also number 23 and 34) inside its dataset name).
Given ASCII file:

Creating SAS datasets:
 "TempData" (temporary dataset)
 "DataFile"  (temporary dataset)
“ Data6”

Creating ASCII file:

Requirements for coding:
 1. Combine datasets "Data1" and "Data2" to "Data5" datatset
 2. Create “Data6”.
       2.1. Sort "Data5" dataset no duplicate keys on ‘Var3’, output to "TempData" temporary dataset (keep only ‘Var3’).
       2.2. For each variable ‘Number’ (this variable has numbers from 2 to 18, also number 23 and 34) in "File1.dat" (except number 34), merge "TempData" with "Data3<number>" using ‘VarID’ in "Data3<number>" = ‘Var3’(11:19) in "TempData".
            2.2.1. One "Data3<number>" to 0, 1 "TempData" (one to one match or one to none).
            2.2.2. Keep only matching "TempData" records.
2.3. Output only a single “Data6” SAS dataset and ASCII file “Data6.dat”containing the "TempData" records for all numbers.

This is what I have coded so far, but I am not sure how valid my code is. Also, testing SAS datasets are not available yet, so I can’t run my code to find out what errors I have.

/*1*/	data dir.Data5;
  	    set dir.Data1 dir.Data2;

/*2*/	2.1.  proc sort data=dir.Data5 out=TempData (keep=Var3) nodup;
                  by Var3;

        2.2.  data DataFile;
                  infile “File1.dat”;
                  input @1 Number $char2.;

        2.3.  data dir.Data6 (keep=VarID); 
                  set DataFile;
                   ptds=’dir: Data3’||number||’.sas7bdat’;
                    if fileexist(ptds) the do;
                     call symput(‘nbr’, number);
                     call execute(‘&nbr’);
                     if (&nbr ne ‘34’) then do;   
                      VarID=substr(Var3,  11,19);
                   merge  TempData(in=a) dir.Data3&nbr (in=b);
                     by VarID;
                  if a then output;

Open in new window

I know that my code looks confusing, but requirements are correct. What am I missing in my code? Also, how would I as well create and output “Data6.dat”?

Any comments are welcome!
Thank you in advance!!

Question by:labradorchik

    Author Comment

    Anyone would like to make a comment on my code? What is the best technique to code for these requirements?  Should I have used MACRO instead?
    Please, any comments are welcome!! :)
    LVL 7

    Accepted Solution

    Item 2.1  When unduplicating with SORT, the NODUP and NODUPKEY options do different things.  Your requirements are to unduplicate based on the value of a Key-- that's what NODUPKEY does.   You're using NODUP and that removes records that are exact duplicates.  

    Tips and Clues -- When unduplicating, SAS keeps the first unique record so sometimes you need to first sort the data to get it in a specific order--oldest record first, maybe, based on a date--before unduplicating.

    Other items not looked at yet.

    Author Comment

    Thank you!
    So, I guess I should have used NODUPKEY instead, like this:

    proc sort data=dir.Data5 out=TempData (keep=Var3) nodupkey;
                 by Var3;

    Or, do I need to use NODBKEY on the TempData dataset in the next step (different proc sort step)?

    Author Closing Comment

    Thank you for your comments!! I just posted another similar question, please see if you can comment on that one. Thank you in advance!!

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    Looking for New Ways to Advertise?

    Engage with tech pros in our community with native advertising, as a Vendor Expert, and more.

    Introduction: I have seen many questions on EE and elsewhere, asking about how to find either gaps in lists of numbers (id field, usually) ranges of values or dates overlapping date ranges combined date ranges I thought it would be a good …
    Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
    Video by: Steve
    Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
    Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

    779 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    14 Experts available now in Live!

    Get 1:1 Help Now