Daniel Addison
asked on
How do I convert SAS code over to SQL code?
I have attached a file with the sas code highlighted in "yellow" needing to be converted over to SQL code. This is a project I am working on for the medical group practice. I have to have the project completed in 3 days. Please see the attachment. Any assistance with this is much appreciated.
Thanks,
sas-code-conversion.docx
Thanks,
sas-code-conversion.docx
This would be more appropriate in the EE Gig system. Have you considered using that?
Hello,
According to your Code.
1. To append multiple file as:
data paids_total;
set paids_parta paids_partb paids_dme;
run;
In SQL is:
proc sql;
create table as paids_total as
select *
from paids_parta
union
select *
from paids_partb
select *
from paids_dme;
2. The If statement is equivalent as a Case statement in SQL
3. Proc sort in SQL is equivalent as Order by
According to your Code.
1. To append multiple file as:
data paids_total;
set paids_parta paids_partb paids_dme;
run;
In SQL is:
proc sql;
create table as paids_total as
select *
from paids_parta
union
select *
from paids_partb
select *
from paids_dme;
2. The If statement is equivalent as a Case statement in SQL
3. Proc sort in SQL is equivalent as Order by
Hi there Daniel,
There is a mountain of work there.
Although the code in the
PROC SQL;
QUIT;
sections read like SQL and do approximately the same as SQL, there can be some differences that may trip you up.
As show above by Alexander, most of the DATA -- RUN; sections can be replaced by SQL code but there are a few problems.
1/ Within a GROUP as set by SQL, all rows are treated equally. However in a SAS dataset order is important and in your code there are a few places where the psudo variables "first" and "last" are used. These allow different processing for the first and last record (think SQL row) in a group (as given in the BY statement).
2/ SAS DATA steps will not access records in anything but sequential order. (There is a possible random access, but normally processing is sequential.) The inbuilt variable _N_ is the record number, and depending on your particular flavour of SQL that you have available you may not have an equivalent SQL column name or function.
3/ Information can be carried across from one record to the next by using "RETAIN", this is something that is difficult in SQL without a lot of bother. You have 8 cases of RETAIN.
4/ Normal SAS has only 2 data types - character and numeric - which is vastly different to what SQL allows.
======
In a few places there is
PROC SORT data=X1 out=Y1;
BY ZZZ;
RUN;
PROC SORT data=X2 out=Y2;
BY ZZZ;
RUN;
DATA Y3;
MERGE Y1(in=A1) Y2(in=A2);
BY ZZZ;
if A1 and A2 then output;
RUN;
corresponds to
create table Y3 as
select Y1.*,
Y2.*
from Y1
join Y2
on Y1.ZZZ = Y2.ZZZ
order by ZZZ;
The " IF A1 and A2" test ensures that the join has both sides. However if within a group there is more than one record unlike SQL which will do a full join, SAS (in this case) will just sequentiall join records. My guess is that in this case there is only one record for each BY group.
=====
Looking at the size of the code (and its long train of processing), my recommendation is to insist you have about a month to change and TEST this processing. There is likely to be many places where a translation will work at better than 98% of lines, it is just those 2% OF CASES that are left over that willl trip you up.
Ian
There is a mountain of work there.
Although the code in the
PROC SQL;
QUIT;
sections read like SQL and do approximately the same as SQL, there can be some differences that may trip you up.
As show above by Alexander, most of the DATA -- RUN; sections can be replaced by SQL code but there are a few problems.
1/ Within a GROUP as set by SQL, all rows are treated equally. However in a SAS dataset order is important and in your code there are a few places where the psudo variables "first" and "last" are used. These allow different processing for the first and last record (think SQL row) in a group (as given in the BY statement).
2/ SAS DATA steps will not access records in anything but sequential order. (There is a possible random access, but normally processing is sequential.) The inbuilt variable _N_ is the record number, and depending on your particular flavour of SQL that you have available you may not have an equivalent SQL column name or function.
3/ Information can be carried across from one record to the next by using "RETAIN", this is something that is difficult in SQL without a lot of bother. You have 8 cases of RETAIN.
4/ Normal SAS has only 2 data types - character and numeric - which is vastly different to what SQL allows.
======
In a few places there is
PROC SORT data=X1 out=Y1;
BY ZZZ;
RUN;
PROC SORT data=X2 out=Y2;
BY ZZZ;
RUN;
DATA Y3;
MERGE Y1(in=A1) Y2(in=A2);
BY ZZZ;
if A1 and A2 then output;
RUN;
corresponds to
create table Y3 as
select Y1.*,
Y2.*
from Y1
join Y2
on Y1.ZZZ = Y2.ZZZ
order by ZZZ;
The " IF A1 and A2" test ensures that the join has both sides. However if within a group there is more than one record unlike SQL which will do a full join, SAS (in this case) will just sequentiall join records. My guess is that in this case there is only one record for each BY group.
=====
Looking at the size of the code (and its long train of processing), my recommendation is to insist you have about a month to change and TEST this processing. There is likely to be many places where a translation will work at better than 98% of lines, it is just those 2% OF CASES that are left over that willl trip you up.
Ian
This question needs an answer!
Become an EE member today
7 DAY FREE TRIALMembers can start a 7-Day Free trial then enjoy unlimited access to the platform.
View membership options
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.