• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 5539
  • Last Modified:

How do I do a Sumif on a number of columns in SAS

consider the following table

Cust Nanme  Product A  Product B  Product C   Product D      
Jason                1              0              1                 0
Michael              0              1              0                 1
Mary                  1              0              0                 0

Its a table with customer as a column and in remaining columns are 1/0 columns which indicate whether a customer has bought product A, B, C, D etc. The table  above only has four products - the actual table I am dealing with has 2000 columns or products, plus millions of customers. According to above table, Jason buys Product A and D. The question I  need to answer is how many customer who buy product A also buy product B, how many customers who buy product A also buy product C, how many customers who buy product A also buy product D. After doing all the A's I would need to do the B's ie How many people who buy B also buy C, how many people who buy B also buy D and so on until all possible product combination pairs are covered. What is the most efficent way to do this in SAS. I was thinking if I could do something similar to Sumif in excel I could get my question answered. However any help or comments much appreciated.
0
hantran99
Asked:
hantran99
1 Solution
 
Kyle AbrahamsSenior .Net DeveloperCommented:
Not an expert on SAS but If you wanted to go that route:
http://core.ecu.edu/psyc/wuenschk/SAS/Help/Sum-SAS.htm

Take a look at the missing data points part.

Essentially in your query I would create a new column, say A_B
if A=1 and B = 1 then A_B= 1; else A_B = 0;

However to do this for each product is a staggering number of columns to keep track of.  

Instead you may want to do something like, if a person bought product A, how many other products did they buy?  Not sure what your exact requirement is.


0
 
ghoshsaikat83Commented:
Hi hantran99,

Find the attached file...this might do the job for you.
Data.xls
0
 
wigmeisterCommented:
In SAS, you can run a Proc Summary on the data using the Class variable.  Then using the _TYPE_ field you can access the record level you desire.  However, given you have almost 2000 variables on millions of records, this will produce a very large dataset.  Quite frankly, this will produce results that will most likely be meaningless, because with that many variables you are likely to have limitless combinations of those variables.

This is too much data for excel to process.

Reference this link for example:
http://www2.sas.com/proceedings/sugi27/p077-27.pdf
0
Microsoft Certification Exam 74-409

VeeamĀ® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
hantran99Author Commented:
All the above is very useful advice.  Let me give them a try and let you know how I go.
hantran99          
0
 
RJB_ACCCommented:
Hi,
no-one seemed to answer your question as I would have thought.
So try this for size. Similar could be done with Macros but I think that it would take longer to process.

data rawdata;
      CustName = 'Jason';   ProductA=1; ProductB=0; ProductC=1; ProductD=0; output;
      CustName = 'Michael'; ProductA=0; ProductB=1; ProductC=0; ProductD=1; output;
      CustName = 'Mary';    ProductA=1; ProductB=0; ProductC=0; ProductD=0; output;
run;

%let Prods = 4;
%let ProdsL1 = %sysevalf(&Prods-1);

data sumif;
      set rawdata end=eof;
      length RowType $12.;
      array Product{4} ProductA--ProductD;
      array Combo[&ProdsL1,&ProdsL1];
      do i = 1 to &ProdsL1;
        do j = i+1 to &Prods;
            if Product[i] and Product[j] then Combo[i,j-1] + 1;
        end;
      end;
      if eof then do;
            RowType = "Results"; output;
            do i = 1 to &ProdsL1;
              do j = i+1 to &Prods;
                  Combo[i,j-1] = i;
              end;
            end;
            RowType = "Product From"; output;
            do i = 1 to &ProdsL1;
              do j = i+1 to &Prods;
                  Combo[i,j-1] = j;
              end;
            end;
            RowType = "Product To"; output;
      end;
      drop CustName ProductA--ProductD i j;
run;
0
 
hantran99Author Commented:
Thank you. I'm still very new with SAS, so thanks for your help.
0
 
RJB_ACCCommented:
No worries, let me know if you have any questions about the code above and how it works.
I'm happy to explain.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now