• Status: Solved

# What are the best tools for performing data mining cluster analysis

What are the best tools for performing data mining cluster analysis? It does not matter whether they are expensive/cheap, open source/proprietary software. All options are taken into consideration.

For example, I need to select or customize cluster fragmentation rules.

P.S.: I couldn't find appropriate question zones, so I selected these two.
AlexKostrub

Commented:
Might it be easiest to calculate the standard deviation and then specify how many standard deviations from the mean the data is considered to be within a 'cluster'?
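A minimal sketch of that idea for plain numeric data, assuming a cutoff of k standard deviations around the mean. The function name `within_k_sigma` and the sample values are made up for illustration:

```python
import statistics

def within_k_sigma(values, k=2.0):
    """Split values into those within k standard deviations of the mean and the rest."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    inside = [v for v in values if abs(v - mean) <= k * sd]
    outside = [v for v in values if abs(v - mean) > k * sd]
    return inside, outside

# Illustrative data: nine values near 10 and one far away.
data = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
inside, outside = within_k_sigma(data, k=2.0)
print("in cluster:", inside)
print("outside:", outside)
```

Note that this only works when a mean and a standard deviation are meaningful for the data.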

Author Commented:
It can be efficient only for continuous values (i.e. numbers), but I need to analyze discrete values (for example: tuples of fixed cardinality from some set of discrete values), where it is unknown how to set a metric.

Commented:
>for example: tuples of fixed cardinality from some set of discrete values

Not sure I understand the problem. If there's a set of discrete values, why can they not be treated as a continuous set?

Author Commented:
I mean that I have 6 different discrete values in one tuple, and it is unknown how to calculate a mean and a standard deviation.
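When a mean is not defined, one possible workaround is to group on pairwise dissimilarity instead. The sketch below assumes each tuple can be treated as an unordered set of 6 numbers, uses Jaccard distance as the metric, and merges tuples whose distance is under a threshold (single linkage). The metric, the threshold, the helper names and the sample rows are all illustrative assumptions, not something taken from this thread or from the attached data:

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - |intersection| / |union| of two tuples treated as sets."""
    sa, sb = set(a), set(b)
    return 1.0 - len(sa & sb) / len(sa | sb)

def group_by_threshold(tuples, max_dist):
    """Single-linkage grouping: tuples within max_dist of each other share a group."""
    parent = list(range(len(tuples)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(tuples)), 2):
        if jaccard_distance(tuples[i], tuples[j]) <= max_dist:
            parent[find(i)] = find(j)

    groups = {}
    for i, t in enumerate(tuples):
        groups.setdefault(find(i), []).append(t)
    return list(groups.values())

# Made-up rows for illustration only.
rows = [
    (1, 2, 3, 4, 5, 6),
    (1, 2, 3, 4, 5, 7),
    (40, 41, 42, 43, 44, 45),
    (40, 41, 42, 43, 44, 49),
]
groups = group_by_threshold(rows, max_dist=0.5)
for g in groups:
    print(g)
```

Any grouping found this way describes only the rows already observed; the choice of metric and threshold decides what counts as a "pattern".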

Commented:
AlexKostrub,

Patrick

Author Commented:
Each row is a tuple that consists of 6 numbers. My aim is to group these tuples into several groups according to some patterns.
Report3.txt

Commented:
What do the tuples represent?

Author Commented:
Each tuple consists of the numbers of one lottery play.

Commented:
If this is an attempt to predict lottery numbers then I cannot help.

Author Commented:
Not predict but group tuples of numbers into several different classes

Commented:
>Not predict but group tuples of numbers into several different classes

What does that mean?

Author Commented:
This means that I assume lottery winning numbers can be divided into groups. I treat a row from the file above as one unit. I want to divide these units into several groups.

Commented:
AlexKostrub,

I'm afraid I'm leaving this question to others.

Patrick

Commented:
There is no possible way to effectively data mine or do any analysis of any kind on winning lottery numbers. They are random, independent events.
If you flip a fair coin and get heads 5 times in a row, what is the chance that the next flip will be tails? 50%. Every time. It doesn't matter at all, ever, what has happened in the past. There is no effect whatsoever on future events.
Please, do not waste your time and resources on trying to make patterns out of independent random data. It will never work. Guaranteed.
I have a bachelor's degree in math with a statistics emphasis. I also have a master's in computer science and have done my fair share of data mining. Everyone who studies either will agree with me.
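The coin-flip point can be checked with a quick simulation. This is a rough sketch, assuming a fair simulated coin from Python's `random` module; the function name, the number of flips and the seed are arbitrary choices for illustration:

```python
import random

def tails_rate_after_heads_run(n_flips=200_000, run=5, seed=0):
    """Fraction of flips that come up tails right after `run` or more heads in a row."""
    rng = random.Random(seed)
    streak = 0        # current number of consecutive heads
    followups = 0     # flips that were preceded by at least `run` heads
    tails_after = 0   # how many of those flips were tails
    for _ in range(n_flips):
        is_heads = rng.random() < 0.5
        if streak >= run:
            followups += 1
            if not is_heads:
                tails_after += 1
        streak = streak + 1 if is_heads else 0
    return tails_after / followups

rate = tails_rate_after_heads_run()
print(f"share of tails after 5 heads in a row: {rate:.3f}")
```

With enough flips the printed share should sit close to 0.5, which is what independence predicts.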

Commented:
Random data will appear to form patterns, but those patterns are mirages and are not real. The only possible way that analysis of lotteries could be of any use would be if they were generated by an especially poor random number generator, but no reasonably sized lottery or casino would dare do that.

Commented:
Hi

I don't know data mining very well, but I disagree that you cannot find patterns in "chaos". The reason is that you have x numbers and you take x numbers from them, so this is your base pattern in your "chaos" (so it is not true chaos). An example (it depends on the lottery): if you draw 20 numbers out of 80 and you want to hit 10 of them (the lottery I analyzed a while ago), there is a pattern: 10 numbers in a row happen very rarely, but they do happen. Also, if you split the 80 into odd / even, you can get a range of 3-17 odds/evens, so occasionally your lottery is 17 out of 40 where you want to hit 10 (very rare).

Apologies if it isn't applicable to this question, but I thought I would share that because I want to use data mining myself (I need to learn it at some point), not for the lottery but for predicting game results based on carefully selected data (which I don't have yet).
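For reference, how often each odd/even split turns up can be computed directly. This sketch assumes a fair draw of 20 balls from 80, half of them odd, so the number of odd balls follows a hypergeometric distribution; `prob_odd_count` is a name made up for this example:

```python
from math import comb

def prob_odd_count(k, total=80, odd=40, drawn=20):
    """Probability that exactly k of the drawn balls are odd (hypergeometric)."""
    return comb(odd, k) * comb(total - odd, drawn - k) / comb(total, drawn)

dist = {k: prob_odd_count(k) for k in range(21)}
for k, p in dist.items():
    print(f"{k:2d} odd / {20 - k:2d} even: {p:.6f}")
```

The table only says how often each split occurs under a fair draw; it says nothing about which particular numbers will appear.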

Regards
Emil

Commented:
I agree with Tommy. I decided to leave this question to others as I believe it's a total waste of time analysing lottery results in the hope of predicting the winning numbers, and I have no intention of wasting any of my time on such a pointless venture.
Patrick

Commented:
I never said you can't find patterns. You can. All the time. They just don't mean anything because the patterns you find have no bearing on the future. At all. Ever. The end. It's a mathematically proven fact.

Many people try to demonstrate otherwise and even write books and sell their ideas, but they are either seriously misguided or liars who are preying on the simple-minded.

Commented:
Hi

My understanding was that you don't know if a particular customer will buy a particular product from a particular range because you don't know the future, but you can identify more or less what he might buy and when if you have sufficient data.

From the lottery point of view (it depends on the lottery): if a lottery draws 20 balls out of 80 and I want to hit 10, and I know that 10 in a row appears once every 2 years, with a maximum of 4 years, and the last time that occurred was 3.5 years ago, then playing the same numbers (70 combinations) gives me a good chance of winning, if I'm lucky. So let's say it occurs after 8 months (above the maximum), it is a daily draw, and a ticket costs 2 zloty :) Then 70 x 2 zloty = 140 zloty per day x 8 months x 30 days = 33,600 zloty, while hitting 10 pays 100,000 zloty (plus many 9s, 8s and so on), so you win by predicting the future... unless you are very unlucky.
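The arithmetic above can be set against the odds of a single ticket. This is a rough sketch that takes the figures in this comment at face value (70 tickets a day, 2 zloty each, 8 months of 30 days, 100,000 zloty for hitting all 10) and assumes fair, independent draws of 20 balls from 80. It ignores the smaller prizes for 9s and 8s, so it is not a full expected-value calculation:

```python
from math import comb

# Figures quoted in the comment (taken at face value, not verified).
TICKETS_PER_DAY = 70
TICKET_PRICE = 2       # zloty
DAYS = 8 * 30
TOP_PRIZE = 100_000    # zloty for hitting all 10

# Chance that one fixed set of 10 numbers lies entirely among the 20 drawn from 80.
p_hit_all_10 = comb(70, 10) / comb(80, 20)

total_cost = TICKETS_PER_DAY * TICKET_PRICE * DAYS
# Expected top-prize winnings add up across tickets and days, even if tickets overlap.
expected_top_prize = TICKETS_PER_DAY * DAYS * p_hit_all_10 * TOP_PRIZE

print(f"cost: {total_cost} zloty")
print(f"chance per ticket: about 1 in {1 / p_hit_all_10:,.0f}")
print(f"expected top-prize winnings: {expected_top_prize:.2f} zloty")
```

Under these assumptions the time since the last occurrence does not enter the calculation at all, because each draw has the same odds.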

Maybe this is not pure data mining but I think close enough ;)

Regards
Emil

Author Commented:
It is widely known that the probability of any single number being drawn is statistically almost equal for all listed numbers. But I think that some combinations of numbers are more probable than others, and analysis of the numbers drawn in several lotteries shows that this is true. Although it is statistically incorrect to draw conclusions about drawn numbers from a finite dataset of results, in practice it makes sense. My aim is to find such combinations of numbers that are more probable than other combinations.

Commented:
I'm not sure if data mining can answer this question, but my previous approach was odd vs evens to increase probability (less frequent but more probable to hit) obviously by always using the same numbers. This is more efficient if we wait for the right moment, but we still need to be lucky. Having all the balls in a row seems to be good in certain lotteries, but it is rare, may take a long time to occur again, and can be rather expensive, with obviously the risk of getting an unusual scenario.

Anyway, forex seems to be easier than this, especially for short periods of time (minutes = low risk = low gain), using statistics only, as there are certain expected behaviours.

Regards
Emil


Commented:
>My aim is to find such combinations of numbers that are more probable than other combinations.

There are none. There are definitely some that have come up more than others in the past, but that means absolutely nothing. The future will do whatever it pleases.

Commented:
>odd vs evens to increase probability (less frequent but more probable to hit) obviously by always using the same numbers.

That actually doesn't increase your chances at all.

Commented:
>My aim is to find such combinations of numbers that are more probable than other combinations.

I'm afraid that is just not going to happen.