# heuristics / fuzzy logic / AI

I'm trying to design a simulated sports game.  I have come a long way, but I am stuck at the most important part - What goes into deciding what happens each play.

I am looking for something like heuristics, fuzzy logic, artificial intelligence, etc. for PHP.  Could anyone please offer some advice?


Commented:
Using AI in this situation will be unsophisticated at best, so perhaps just generating random numbers for events in your game would be much simpler.

Commented:
I already have it working with "randomizing", but it's not very accurate.  I was hoping for the best possible answer for each scenario rather than something random.

Commented:
what game?  how would human players decide what happens each play?

Commented:
Thanks for getting back to me.

the game is fantasy football.  no user interaction during the simulation.  the user selects their starting lineup, play preference etc. before the game.  once the simulation has started I need something to look at all aspects and decide the outcome.  for example:
If user 1 has a great passing game, what would they do against a user with a great pass defense?  how many yards would they get each play?  what if they have home field advantage?  what if it is grass or turf?  what about weather?

Commented:
I think the best solution to your problem is a function that computes a score. More sophisticated methods of evaluating the score are not necessary.

The evaluation could look like:
a1*P1 + a2*P2 + ... + an*Pn = x
where P1,...,Pn are team or player attributes, or differences between the two teams' attributes. For example, P1 could be (offence_of_team_1 - defence_of_team_2)
and
a1...an are weights that show how important each part of the game is. For example, the multiplier for offence will be bigger than the multiplier for home field advantage.
x is a score, which you need to interpret. If you set everything up symmetrically, then team 1 wins if x > 0 and team 2 wins if x < 0.
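
A minimal sketch of that weighted sum, written in Python for brevity (it ports directly to PHP). The factor names, ratings, and weights are made up for illustration and are not taken from the thread.

```python
def matchup_score(factors, weights):
    """Return x = a1*P1 + a2*P2 + ... + an*Pn for one matchup."""
    return sum(weights[name] * value for name, value in factors.items())


# Hypothetical ratings on an arbitrary scale; each P is (team 1 - team 2).
factors = {
    "pass_offense_vs_pass_defense": 8 - 6,
    "run_offense_vs_run_defense": 5 - 7,
    "home_field": 1,  # +1 if team 1 is at home, -1 if team 2 is
}

# Hypothetical weights: home field counts for less than the matchups.
weights = {
    "pass_offense_vs_pass_defense": 1.0,
    "run_offense_vs_run_defense": 1.0,
    "home_field": 0.5,
}

x = matchup_score(factors, weights)
winner = "team 1" if x > 0 else "team 2" if x < 0 else "tie"
```

Because every factor is a team 1 minus team 2 difference, swapping the teams flips the sign of x, which is what makes the x > 0 / x < 0 reading symmetric.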

Commented:
You can still use random numbers for those kinds of things.  For example, if you have numbers for the values of "passing" and "pass defense", say 1=great and 5=poor, use those to decide the range of the random number.

team1 passing =1
team2 pass defense = 2
passingdifference = 1
team1homefield = 1
turf = 3 (whatever that may mean)
weatherfactor = -1
etc.

then sum the numbers up in a formula and generate a random number between 1 and the sum.  Say the total comes out to 12 out of a possible 18.

So the number generated is between 1 and 12, but the simulation expects between 1 and 18, so that limits the results based on the factors.
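
A rough Python sketch of this idea, using only the six factor values listed above. They sum to 7 rather than 12 because the "etc." factors are not spelled out; the function name and the clamping to at least 1 are assumptions.

```python
import random


def bounded_roll(factors, max_total, rng=random):
    """Roll between 1 and the factor sum, never exceeding max_total."""
    total = sum(factors.values())
    ceiling = max(1, min(total, max_total))  # keep the range valid
    return rng.randint(1, ceiling), ceiling


factors = {
    "team1_passing": 1,
    "team2_pass_defense": 2,
    "passing_difference": 1,
    "team1_homefield": 1,
    "turf": 3,
    "weather": -1,
}

roll, ceiling = bounded_roll(factors, max_total=18)
# The simulation expects 1..18, but this roll can never exceed `ceiling`.
```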


Commented:
@robwetz: You might want to click the "request attention" button and ask a moderator to add this to the Algorithms zone, too.  Just a thought. ~Ray

Commented:
@Ray_Paseur: Thanks for the suggestion, Ray - I just submitted the request.

@ziper: I like your suggestion, but I don't think it fits for what I need.  I haven't done a very good job explaining exactly what I need. (I explained in more detail below)

@yodercm: That's something similar to what I'm doing now.  I agree with you, it does work.  The problem is the result is not what I'm looking for.  Like I said to ziper, I haven't done a good job explaining, so I apologize; let me explain in more detail.

I currently have a bunch of players stored in my database.  all of these players have their stats in a table.  when team1 plays team2, I grab both teams' starting lineups.  each play goes through a loop.  first it checks to see what down it is.  if it's not 4th down, it picks between a running play or a pass play (random depending on user prefs 50/50, 40/60, etc.).  If it's a running play, it grabs the stats for the starting runningback.  -- this is where I'm getting stuck --  if the runningback averages 5.2 yards per carry, and the defense allows 2.4 yards average, what should the gain be?  I'm currently using something similar to what yodercm is suggesting.  the problem with it is it's too repetitious.  a runningback who averages 5.2 yards per carry should have the chance of breaking a 30 yard run, or lose a few yards, depending on the situation.  my point is, I feel like there are way too many factors to just randomize one thing.  hopefully I explained a little better.

Commented:
OK, then randomize the factors.

If the team/player value is 4, add a random number between -2 and 2.

I think if this idea is already in place and not working as you wish, you just need to work more on the formulas for the results.  Put random numbers in all the factors, instead of just the final calculation.
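
As a sketch of what "random numbers in all the factors" might look like in Python: each factor gets its own jitter before the total is computed. The helper names and the default spread of 2 are illustrative assumptions.

```python
import random


def jitter(value, spread=2, rng=random):
    """Return value plus a random integer in [-spread, +spread]."""
    return value + rng.randint(-spread, spread)


def jittered_total(factors, spread=2, rng=random):
    """Sum the factors after jittering each one independently."""
    return sum(jitter(v, spread, rng) for v in factors.values())


# A team/player value of 4 becomes something between 2 and 6 on each play.
player_value_this_play = jitter(4)
```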
Commented:
I'm no statistician, but I do have some thoughts.  I've pondered systems like these for quite some time.  I find them interesting puzzles.

First, dealing with the average attributes of each player means you are necessarily depending on a generalization.  That is why you are seeing such repetition.  To achieve better accuracy, and a wider range of possibilities, establish a curve of that player's actual performance (based on play history), break it into percentile groups, then use a cumulative random to select an "offensive power" for that player during that play.  Do the same thing for the defense.  The random selection should reflect the curve's statistical probability for the result, such that the likelihood of an "average" play is greater than something on the extremes.  This can be easily achieved by simulating dice rolls, such as the sum of 10 random numbers between 0 and 10.  Each percentile group should associate with a definite amount of yardage gained or lost.
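
The dice-roll selection can be sketched in a few lines of Python. The function name is an assumption; the 10 rolls of 0 to 10 come from the paragraph above.

```python
import random


def roll_percentile(rng=random, dice=10, sides=10):
    """Sum `dice` random integers in [0, sides]; by default 0..100."""
    return sum(rng.randint(0, sides) for _ in range(dice))


# Tally a few thousand rolls to see the bell shape around 50.
rng = random.Random(42)
rolls = [roll_percentile(rng) for _ in range(5000)]
middle = sum(40 <= r <= 60 for r in rolls)
tails = sum(r < 20 or r > 80 for r in rolls)
```

One consequence worth knowing: this sum has a standard deviation of about 10, so results below 20 or above 80 are very rare. If the extremes of a player's curve should show up more often, fewer dice with more sides (for example 2 rolls of 0 to 50) gives a wider spread.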

Once you have the offensive and defensive power, whichever is greater is the "winner" for that play.  Now you need a separate algorithm to determine the actual result of the win.  That algorithm should probably consider offensive power, defensive power, and random conditions you introduce into the play (someone left a banana on the field, high wind pushes a pass into a defensive receiver, etc), as well as a standard random deviation.

Using your example of the running back vs. defensive line, the RB has an average of 5.2 yards per carry, but his actual performance on a play-by-play basis ranges from -10 yards (pwned!) to a 60-yard sprint for TD.  After creating his curve and selecting a percentile, say his "power" comes out to 10 yards.  On the defensive side, the power can range from complete failure (that 60-yard sprint) to overrunning the offensive line.  Say defensive power comes out to 3 yards, just a little above average.  The difference is 7, in favor of offense.  The random deviation of the play "instance" can be another random number, say -5 to +5.  You could even modify the range of the deviation based on the difference in powers.  That would allow for a defensive power of -2 combined with an offense power of 50 yielding the 100+ yard punt return.  For this example, say you get -3, so +7 + (-3) = +4 yards as your initial baseline play result.  Add whatever other modifiers you want into the play, adjust for our universe's chaotic nature, so on, ad nauseam.  Some teams statistically do extremely poorly on 3rd down conversions...
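
The arithmetic in this example can be written out as a small Python sketch. The rule for widening the deviation (one extra yard of spread per 10 yards of power gap) is a made-up illustration of "modify the range of the deviation", not something specified above.

```python
import random


def play_result(offense_power, defense_power, deviation):
    """Baseline result: power difference plus the per-play deviation."""
    return (offense_power - defense_power) + deviation


def random_deviation(offense_power, defense_power, base=5, rng=random):
    """Random deviation in [-spread, +spread]; spread grows with the gap."""
    gap = abs(offense_power - defense_power)
    spread = base + gap // 10  # hypothetical widening rule
    return rng.randint(-spread, spread)


# The worked example: offense 10, defense 3, deviation -3 gives +4 yards.
baseline = play_result(10, 3, -3)

# A fresh play with a randomly drawn deviation.
yards = play_result(10, 3, random_deviation(10, 3))
```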

At the end, the play goes into the play history, thus acting as a subtle modifier on future plays.  This allows average good players to (normally) improve over time.  In the long run, you will need to limit that effect with some kind of aging mechanism.  At the same time, you may need to "boost" poor players in order to show improvement through experience.  Statistically, players' abilities will tend to migrate away from the center norm towards the extreme of their initial bias (good player vs bad player).  Without some kind of limiting on this effect, you will eventually end up with gods vs. dweebs.

This is all computationally expensive, but the results tend to follow strong over weak, with just enough random to provide the occasional upset without being too "WTF?!".  (For you Civ fans: what do you mean my tank was killed by a longbowman?)

Commented:
I was hoping for some type of algorithm to help me along, but I think this is exactly what I need.  Thanks for taking the time to explain, routinet.  Thank you everyone else for your responses.

Commented:
@routinet: just to clarify, when you say, "establish a curve of that player's actual performance (based on play history), break it into percentile groups, then "

do you mean: 45% chance of 5 yard run, 15% chance of 20 yard run, etc.; but you kind of lost me on, "use a cumulative random to select an "offensive power" for that player during that play."  what range should be rolled and what makes up the "power"?

I think you explained it, but I'm still a little confused - sorry:
"The random selection should reflect the curve's statistical probability for the result, such that the likelihood of an "average" play is greater than something on the extremes.  This can be easily achieved by simulating dice rolls, such as the sum of 10 random numbers between 0 and 10.  Each percentile group should associate with a definite amount of yardage gained or lost."

Thanks again!

Commented:
I mean generate a statistical analysis of the player's play history, but I may be using incorrect terminology.  Again, I'm no statistician.  :)

Basically, each player will have a history of yardage gained/lost per play.  Take the detail of that history and plot it on a graph, with yardage as the X-axis and frequency as the Y-axis.  That should yield a nice statistical curve, on which you can use the normal statistical methods.  The particular method I'm thinking of is percentiles.  If you 'roll' your dice and come up with 36, then look for the 36th percentile on the graph, and use the related yardage as the base power of that player for that play.  With 10 random numbers between 0 and 10, you should be generating the same kind of selection frequency as indicated by a standard, centered bell curve.

I'd have to locate and dig through my stats 101 book to give any kind of formula help, but I recall this being a relatively standard operation in analysis.  In terms of flow/pseudo-code:

1) fetch player history details and organize as a normal distribution (http://en.wikipedia.org/wiki/Normal_distribution)
2) generate random between 0 and 100 for percentile.  Percentile selection should represent a standard "centered" normal distribution (typical bell curve)
3) modify percentile based on game conditions, such as a "good" day for the player, home field advantage, etc.
4) translate percentile into yardage using (1)
5) a. steps 1-4 for offense
5) b. steps 1-4 for defense
6) compare 5a and 5b for general outcome (gain/loss)
7) modify (6) with random deviations, arbitrary adjustments, and universal entropy (i.e., inject chaos into your order)
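
The steps above can be sketched end to end in Python. Several things here are assumptions: each side's history is a plain list of per-play values on the same yardage scale, the percentile is looked up directly in the sorted history rather than in a fitted curve, the modifiers are simple percentile shifts, and the final deviation is -5 to +5.

```python
import random


def roll_percentile(rng):
    """Step 2: bell-shaped percentile from 10 rolls of 0..10."""
    return sum(rng.randint(0, 10) for _ in range(10))


def yardage_at(history, percentile):
    """Step 4: value found at `percentile` of the sorted history."""
    ordered = sorted(history)  # step 1: organize the detail
    index = round(percentile / 100 * (len(ordered) - 1))
    return ordered[index]


def base_power(history, modifier, rng):
    """Steps 1-4 for one side; `modifier` shifts the percentile (step 3)."""
    percentile = roll_percentile(rng) + modifier
    percentile = max(0, min(100, percentile))  # stay within 0..100
    return yardage_at(history, percentile)


def simulate_play(offense_history, defense_history, rng,
                  offense_mod=0, defense_mod=0):
    offense = base_power(offense_history, offense_mod, rng)  # step 5a
    defense = base_power(defense_history, defense_mod, rng)  # step 5b
    outcome = offense - defense                              # step 6
    return outcome + rng.randint(-5, 5)                      # step 7


# Hypothetical histories, for illustration only.
rb_history = [-4, -1, 0, 2, 3, 3, 4, 5, 5, 6, 8, 12, 30]
defense_history = [0, 1, 2, 2, 3, 3, 4, 6]

rng = random.Random(7)
gain = simulate_play(rb_history, defense_history, rng, offense_mod=5)
```

Here `offense_mod=5` stands in for something like home field advantage: it nudges the rolled percentile up by five points before the yardage lookup.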

Commented:
got it.  this is great. :-)

thanks again, routinet!

Commented:
To be a little more clear, a history of your percentile selection should show a classic normal distribution, as pictured here:

http://z.hubpages.com/u/324995_f520.jpg

Your player's history graph will not be quite so symmetric, of course, but you will still be able to divide into percentile groups.  IIRC, this is an expression of standard deviations from the norm.  Or maybe vice versa.  In any case, you generate your percentile, find where that percentile lies on the player's graph, which should pinpoint a discrete value as the median of that percentile group.  That value becomes your base power.

I do hope someone here knows a bit about statistics and can explain this in proper terms.  I knew enough to pass my final on it about 7-8 years ago, and have not used it since...  :/

Commented:
I have been beating my head against the wall trying to get this to work.  I figured out my mean, median, deviation, etc., but it still doesn't seem right.  The numbers just don't add up.  My frequency ranges from -4 to 67 but the deviation chart goes from -28 to 28.  I attached a screenshot and my excel document so you can see what I mean.  Any more assistance would be greatly appreciated.  Sorry for being so difficult.
yards.xls
Picture-1.png

Commented:
the picture looks like a Gaussian distribution, while the distribution of the numbers in your xls file looks more skewed than a Gaussian distribution.

Commented:
Your graph is not what you want.  You have taken the detail, generalized it into mean and deviation, then attempted to recreate pseudo-detail by graphing it to a standard distribution.  That assumes the presence of symmetry to the mean in your data, which is not the case.  You need to create the same graph from the actual detail.  You'll end up with a similar graph, though as ozo noted it will have a positive skew.

Once plotted, break the graph up into percentiles.  With 365 observations, each percentile will own 3.65 data points.  It is up to you to determine the rounding or interpolation.  From your data, for example, (-4) has a frequency of 4, which is just over 1 percentile.  (-3) has a frequency of 11, which is about 3 percentiles.
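
One way to do that breakdown, sketched in Python: accumulate the frequencies into cumulative percentiles, then look up which yardage a given percentile falls on. Only the total of 365 and the counts for -4 and -3 come from the thread; every other row in the table is invented so the example adds up.

```python
from bisect import bisect_left
from itertools import accumulate


def percentile_table(freq):
    """Return (yardages, cumulative percentiles) from a {yards: count} dict."""
    yards = sorted(freq)
    total = sum(freq.values())
    running = accumulate(freq[y] for y in yards)
    cumulative = [100 * count / total for count in running]
    return yards, cumulative


def yardage_for(percentile, yards, cumulative):
    """Smallest yardage whose cumulative percentile reaches `percentile`."""
    i = bisect_left(cumulative, percentile)
    return yards[min(i, len(yards) - 1)]


# Mostly hypothetical frequency table totalling 365 observations.
freq = {-4: 4, -3: 11, 0: 50, 2: 100, 4: 120, 8: 60, 20: 20}
yards, cumulative = percentile_table(freq)

share_of_minus_4 = 4 / 3.65    # just over 1 percentile
share_of_minus_3 = 11 / 3.65   # about 3 percentiles
base_power = yardage_for(50, yards, cumulative)
```

This version rounds every percentile up to the next yardage present in the data; interpolating between neighbouring yardages is the other option mentioned above.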

Now you need to select the random percentile which will form the basis of that player's power.  Do this by generating 10 random numbers between 0 and 10.  The sum is the percentile you need.  The matching x-axis value will be the base power.  Repeat the same process for the other player, and you have the unmodified result of the play.

Commented:
Once again, you have been very helpful and I appreciate the help.  I promise not to ask any more questions.  :-) Thanks again, routinet!
