Efficient adjusting of an increment variable to append sets

I have two datasets with the same variables on them that I'd like to append. The only problem is that they both have an ID variable that starts at 1 and increments for each record. When I append them, I want to take the last value of ID in the first dataset and add it to the values of ID in the second dataset, so that when they're appended I have a unique ID variable for each entry that increases as you go down the set.

The main issue is that both of these datasets are big - as in "takes 5 minutes to just run a basic data step on them" big. So while I can think of a few ways to do what I want, they all involve processing one or other of the datasets multiple times, resulting in a slow process. Is there a quick way to do it?

(Small edit to add info: generally, the second dataset is smaller than the first, so it's a little quicker to access as well if that affects the solution.)
Asked by: Confusing

Aloysius Low commented:
Hi,

What I can think of is to run a single data step and, rather than using ID from either table, use _N_ (the data step's automatic iteration counter):

i.e.
data C;
  set A (drop = ID) B (drop = ID);
  ID_NEW = _N_;
run;

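However, this assumes that the IDs in both tables are in running order (1, 2, 3, ... with no gaps), because _N_ simply renumbers the rows by their position in the combined output.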
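
If the original ID values need to be kept rather than renumbered, the same single pass can shift only the second table's IDs. This is a minimal sketch, not something posted in the thread; it assumes the last row of A holds A's highest ID, and the names in_a and offset are illustrative:

```sas
data C;
  set A (in=in_a) B;          /* read all of A first, then B, in one pass */
  retain offset 0;            /* offset keeps its value from row to row */
  if in_a then offset = ID;   /* once A is exhausted, offset = last ID in A */
  else ID = ID + offset;      /* shift each ID from B past the end of A */
  drop offset;                /* helper variable is not written to C */
run;
```

Like the _N_ version, this reads each table once, so the cost should be about the same.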

wigmeister commented:
I would probably do this:

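Take your larger dataset and get the number of obs in that dataset:
data _null_;
  /* NOBS= is filled in at compile time, so no rows are actually read */
  if 0 then set test1 nobs=nobs;
  call symputx('nobs', nobs);  /* store the row count in macro variable &nobs */
  stop;
run;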

Then step through your smaller dataset and increment ID:
data test2;
  set test2;
  id = id + &nobs;
run;

Put them together:
proc append base = test1 data=test2;
run;

You will probably have to adjust the way &nobs is being used in the equation, but you get the idea. This is basically doing the same thing that lowaloysius suggested, except that it processes the smaller dataset (test2) twice, whereas the single 'set' processes both datasets once.
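
One way to avoid the second pass over test2 might be a data step view, so the ID shift is applied while PROC APPEND reads the rows instead of being written back to disk first. This is a sketch only, assuming &nobs already holds the row count of test1; test2_shifted is a hypothetical view name:

```sas
/* Define a view: nothing is read or written until the view is used */
data test2_shifted / view=test2_shifted;
  set test2;
  id = id + &nobs;   /* shift IDs past the end of test1 */
run;

/* Rows are pulled through the view and added to the end of test1 */
proc append base=test1 data=test2_shifted;
run;
```

Whether this is actually faster would need to be timed on the real data.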

I still think there might be a more efficient way somehow using macros, but it hasn't come to mind yet.  If I think of anything I will post back.
Confusing (author) commented:
I actually found a solution myself that is about as efficient as lowaloysius', but for the excellent alternative methods I'm going to split the points between the two of you.
Aloysius Low commented:
Out of curiosity, what is it that you have got?