Solved

Efficient adjusting of an increment variable to append sets

Posted on 2009-07-15
Last Modified: 2013-11-16
I have two datasets with the same variables on them that I'd like to append. The only problem is that they both have an ID variable that starts at 1 and increments for each record. When I append them, I want to take the last value of ID in the first dataset and add it to the values of ID in the second dataset, so that when they're appended I have a unique ID variable for each entry that increases as you go down the set.

The main issue is that both of these datasets are big - as in "takes 5 minutes to just run a basic data step on them" big. So while I can think of a few ways to do what I want, they all involve processing one or other of the datasets multiple times, resulting in a slow process. Is there a quick way to do it?

(Small edit to add info: generally, the second dataset is smaller than the first, so it's a little quicker to access as well if that affects the solution.)
Question by:Confusing
4 Comments
 
LVL 14

Accepted Solution

by:
Aloysius Low earned 1000 total points
ID: 24867569
hi,

what i can think of is to run a data step, but instead of using ID from either table, use _N_ instead

i.e.
data C;
  set A (drop = ID) B (drop = ID);
  ID_NEW = _N_;
run;

however, this assumes that the IDs on both tables are in running order, since the original ID values are dropped and replaced by the record number.
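If the original ID values have to be kept (for example because there are gaps in them), a one-pass variant is also possible. This is only a sketch, assuming A and B are the two datasets from the example above, that ID is numeric, and that the last record of A holds A's highest ID. The names from_a and offset are made up for the illustration.

```sas
data C;
  /* A is read in full before B; from_a is 1 while records come from A */
  set A (in = from_a) B;
  /* offset exists on neither input, so RETAIN carries it across records */
  retain offset 0;
  if from_a then offset = ID;   /* ends up as the last ID seen in A */
  else ID = ID + offset;        /* shift B's IDs past the end of A */
  drop offset;
run;
```

Each dataset is still read only once, and the IDs from A come through unchanged.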
 
LVL 4

Assisted Solution

by:wigmeister
wigmeister earned 1000 total points
ID: 24878036
I would probably do this:

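Take your larger dataset and get the number of obs in that dataset:
data _null_;
  /* nobs= is filled in at compile time, so no records are actually read */
  if 0 then set test1 nobs=nobs;
  /* the macro variable name must be a quoted string */
  call symputx('nobs', nobs);
  stop;
run;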

Then step through your smaller dataset and increment ID:
data test2;
  set test2;
  id = id + &nobs;
run;

Put them together:
proc append base = test1 data=test2;
run;

You will probably have to adjust the way &nobs is being used in the equation, but you get the idea. This is basically doing the same thing that lowaloysius suggested, except that it processes the smaller dataset (test2) twice (once to adjust ID and once in the append), whereas the 'set' approach processes both datasets once.

I still think there might be a more efficient way somehow using macros, but it hasn't come to mind yet.  If I think of anything I will post back.
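One way to avoid the second pass over test2 might be a DATA step view, so that the adjusted IDs are computed while PROC APPEND reads the data. This is a rough sketch rather than a tested solution: the view name test2_shifted is made up, and it assumes test1 is an ordinary SAS dataset whose nobs count equals its last ID (no deleted records, IDs running 1, 2, 3, ...).

```sas
/* Capture the number of records in test1 in the macro variable nobs */
data _null_;
  if 0 then set test1 nobs = nobs;
  call symputx('nobs', nobs);
  stop;
run;

/* A view stores only the program; no adjusted copy of test2 is written */
data test2_shifted / view = test2_shifted;
  set test2;
  id = id + &nobs;
run;

/* test2 is read once here, through the view, and added to the end of test1 */
proc append base = test1 data = test2_shifted;
run;
```

With this layout test1 is never rewritten and test2 is read a single time, at the cost of leaving a view definition behind that can be deleted afterwards.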
 
LVL 1

Author Comment

by:Confusing
ID: 24901530
I actually found a solution myself that is about as efficient as lowaloysius', but for the excellent alternative methods I'm going to split the points between the two of you.
 
LVL 14

Expert Comment

by:Aloysius Low
ID: 24905360
Out of curiosity, what is it that you have got?
