CalBob

Apriori Algorithm in R

I have what I thought was a well-prepared dataset. I want to use the Apriori algorithm in R to look for associations and come up with some rules. I have about 16,000 rows (one per unique customer) and 179 columns that represent various items/categories. The data looks like this:

Cat1  Cat2  Cat3  Cat4  Cat5 ... Cat179
1,        0,       0,        0,      1,     ...  0
0,        0,       0,        0,      0,     ...  1
0,        1,       1,        0,      0,     ...  0
...

I thought a comma-separated file with binary values (1/0) for each customer and category would do the trick, but after I read in the data using:

>data5 = read.csv("Z:/CUST_DM/data_test.txt",header = TRUE,sep=",")

and then run this command:

> rules = apriori(data5, parameter = list(supp = .001,conf = 0.8))

I get this error:

Error in asMethod(object):
column(s) 1, 2, 3, ...178 not logical or a factor. Discretize the columns first.  

I understand discretization, but apparently not in this context: every value is already a 1 or 0. I even changed the columns from INT to CHAR and received the same error. I originally had the (unique) customer ID in column 1, but I understand that isn't necessary when the data is in this flat-file form. I'm sure there is something obvious I'm missing; I'm new to R.

What am I missing?  Thanks for your input.
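
For anyone hitting the same error: the message suggests that apriori() wants each column to be logical or a factor rather than numeric. One common approach, sketched below, is to turn the 0/1 values into a logical matrix and coerce it to the arules "transactions" class before mining. This is only a sketch under assumptions: it assumes the arules package is loaded, the three-column data5 is a made-up stand-in for the real file, the name item_matrix is illustrative, and the thresholds are loosened to suit the toy data. If the customer ID column is still present it would need to be dropped first. Converting the columns to factors instead would likely make both "0" and "1" show up as items, which is probably not what is wanted here.

```r
library(arules)  # provides apriori(), inspect() and the "transactions" class

# Hypothetical stand-in for data5 (the real file has ~16,000 rows, 179 columns).
data5 <- data.frame(
  Cat1 = c(1, 0, 0, 1),
  Cat2 = c(0, 0, 1, 1),
  Cat3 = c(0, 1, 1, 1)
)

# Turn the 0/1 values into a logical matrix: TRUE means the customer has the item.
item_matrix <- as.matrix(data5) == 1

# Coerce to transactions; each TRUE cell becomes an item in that customer's row.
trans <- as(item_matrix, "transactions")

# Thresholds loosened for the toy data; the post used supp = .001, conf = 0.8.
rules <- apriori(trans, parameter = list(supp = 0.25, conf = 0.8))
inspect(rules)
```

With the real data, the same two conversion lines would be applied to the data frame returned by read.csv, keeping the original supp and conf values.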