jonesgj
asked on
Readability Function
Hi Experts,
I wanted to see if I could produce a readability function to examine the content of a text box. The formulae I need to apply would be:
1 - Flesch Reading Ease
The output of the Flesch Reading Ease formula is a number from 0 to 100, with a higher score indicating easier reading. The average document has a Flesch Reading Ease score between 60 and 70. The formula reads as follows:
206.835 – (1.015 x ASL) – (84.6 x ASW)
where:
ASL = average sentence length (the number of words divided by the number of sentences)
ASW = average number of syllables per word (the number of syllables divided by the number of words)
AND
2 - Flesch-Kincaid Grade Level
The more common Flesch-Kincaid Grade Level formula converts the Reading Ease Score to a U.S. grade-school level.
(.39 x ASL) + (11.8 x ASW) – 15.59
where:
ASL = average sentence length (the number of words divided by the number of sentences)
ASW = average number of syllables per word (the number of syllables divided by the number of words)
I do have another question asking how to examine a textbox content quickly, but I felt this was a different and more complex issue (unless someone has this built already!)
Anyway - hope you can help.
Kind regards
Jonesgj
PS I will be out for the next 6 or so hours, but will log on as soon as I return.
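For anyone wanting to try the two formulas above once the counts are known, here is a minimal sketch in Python (not the asker's language, just a convenient way to check the arithmetic). The function names are made up for illustration, and the word, sentence and syllable counts are assumed to be computed elsewhere.

```python
def _averages(words, sentences, syllables):
    """Return (ASL, ASW) from raw counts, as defined in the question."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    asl = words / sentences   # average sentence length
    asw = syllables / words   # average syllables per word
    return asl, asw


def flesch_reading_ease(words, sentences, syllables):
    """206.835 - (1.015 x ASL) - (84.6 x ASW)"""
    asl, asw = _averages(words, sentences, syllables)
    return 206.835 - 1.015 * asl - 84.6 * asw


def flesch_kincaid_grade(words, sentences, syllables):
    """(0.39 x ASL) + (11.8 x ASW) - 15.59"""
    asl, asw = _averages(words, sentences, syllables)
    return 0.39 * asl + 11.8 * asw - 15.59


if __name__ == "__main__":
    # Hypothetical counts: 100 words, 5 sentences, 150 syllables.
    print(flesch_reading_ease(100, 5, 150))
    print(flesch_kincaid_grade(100, 5, 150))
```

The hard part is still getting the syllable count; the formulas themselves are plain arithmetic.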
Yep... that way makes it very simple to calculate, leaving only the special 'words' containing '.'s (dots) for you to care about when splitting sentences.
By 'words' I'm also considering numbers with decimal separators and such represented by dots.
Alex :p
ASKER
Thanks Guys,
I agree, the ARI looks a lot easier. What I will do over the next few minutes is build something to calculate this, and then compare it against text for which I know the Flesch Reading Ease and Flesch-Kincaid Grade Level scores. If it is close, then I'll use the ARI.
I did find a definition for splitting syllables on the web:
The dividing line between two syllables goes
- before one consonant:
ka-tu, lii-sa, suo-ma-lai-nen
- between two consonants:
kyl-lä, met-sä, haus-ka, a-me-rik-ka
- before the last of three consonants:
rans-ka, kort-ti
- between two vowels which do not form a diphthong:
lu-en, mai-to-a, ha-lu-ai-sin, ra-di-o
Nice... :)
But those last 2 aren't that easy... especially the last one...
ARI sounds great to me... :))))))
Alex :p
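For reference, a minimal sketch of the ARI calculation, assuming the commonly published form of the Automated Readability Index: 4.71 x (characters per word) + 0.5 x (words per sentence) - 21.43. It is in Python and the function name is hypothetical. Note that it needs no syllable counts at all, only characters, words and sentences.

```python
def automated_readability_index(characters, words, sentences):
    """ARI from raw counts.

    characters: letters and digits only (no spaces or punctuation)
    words:      number of words
    sentences:  number of sentences
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


if __name__ == "__main__":
    # "The cat sat. The dog ran." -> 18 characters, 6 words, 2 sentences
    print(automated_readability_index(18, 6, 2))
```

Very short, simple text can score below zero, so the result is best read as a relative measure rather than a literal grade.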
ASKER
Alex .... I have compared the ARI and the Coleman-Liau to what I want and it almost aligns. However, the syllables are the key, as there are some anomalies in the test passages.
For example on a selected 2 pieces of published work
Age -- Flesch -- ARI
15+ -- 70.4 -- 21.51
12+ -- 80.3 -- 20.953
11+ -- 79.5 -- 11.888
10+ -- 82.1 -- 8.910
11+ -- 77.9 -- 8.325 <<< !
As the Flesch calculations are in Word etc., you would have thought someone had done this already?
Anyway, if I looked at the syllable approach again, what would be the best approach ignoring the last two rules? If I could do this, then I could do another comparison.
Thanks again..
Just don't forget that these rules are language specific... not universal.
I think the best approach is to create a list of the consonant chars... something like:
Dim consonants As New ArrayList
consonants.AddRange(New Char() {"b"c, "c"c, "d"c, "f"c, "g"c, "h"c ... and so on})
I'm using an ArrayList so we can check whether a char is a consonant with:
If consonants.Contains("c"c) Then...
This work must be done word by word... letter by letter...
The first rule is simple; you just have to check, for each letter in each word:
- if the letter is a consonant and is not at the beginning or the end of the word, then it marks the end of a syllable, excluding the consonant char.
The second one is much the same as the first, and I think you must evaluate them together.
- if the letter is a consonant and is not at the beginning or the end of the word, then it marks the end of a syllable, including the consonant char.
Alex :p
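One way to evaluate the consonant rules together is to look at each run of consonants sitting between two vowels and break before the last consonant of the run. That covers the one-, two- and three-consonant rules quoted earlier in the thread. The sketch below is in Python rather than VB.NET, and it is an interpretation of those rules, not a tested syllabifier. The names `VOWELS` and `split_syllables` are made up, the vowel set (including y, ä, ö) is an assumption chosen to fit the quoted examples, and the vowel-vowel rule is deliberately not handled.

```python
VOWELS = set("aeiouyäö")  # assumed vowel set; adjust per language


def split_syllables(word):
    """Split a word before the last consonant of each consonant run
    that lies between two vowels. Vowel-vowel breaks are not handled."""
    lower = word.lower()
    n = len(word)
    breaks = []
    seen_vowel = False
    i = 0
    while i < n:
        if lower[i] in VOWELS:
            seen_vowel = True
            i += 1
            continue
        # Find the end of this consonant run.
        j = i
        while j < n and lower[j] not in VOWELS:
            j += 1
        # Only runs with a vowel on both sides produce a break.
        if seen_vowel and j < n:
            breaks.append(j - 1)
        i = j
    parts = []
    start = 0
    for b in breaks:
        parts.append(word[start:b])
        start = b
    parts.append(word[start:])
    return parts


if __name__ == "__main__":
    for w in ["katu", "kyllä", "ranska", "amerikka", "luen"]:
        print("-".join(split_syllables(w)))
```

With this approach the syllable count is just `len(split_syllables(word))`, which amounts to counting vowel groups. Words such as "luen" or "radio" come out one syllable short because the break between two vowels is ignored.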
ASKER
Does anyone know whether I could somehow pass my text to Word 97 or above, and get back either all the stats, the number of syllables, the Flesch Reading Ease score or the Flesch-Kincaid Grade Level score?
Nope... sorry...
Alex :p
ASKER
Thanks Alex .... I'm desperately trying to avoid work! :-)
I can see that...
But at the same time I don't know if Word interop is the salvation for you...
I would try to develop my own code.
Alex :p
ASKER
Thanks Guys.
After much lengthy searching, I have to go with the simpler answer, especially in the interests of time.
Jonesgj
As I see it at first look the biggest problem is that ASW...
I'm not quite seeing how to produce an algorithm to get the number of syllables out of a word!
Despite that, ASL is just a matter of splitting the text into sentences by '.' (dot) and then splitting each sentence by ' ' (space) to get the number of words per sentence.
Sure, there may be exception words, like '.Net' here, that even MS Word trips over...
RegEx can be used... especially for the sentence splitting...
Do you have any idea about that ASW thing?
Alex :p
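A rough sketch of the ASL part using a regular expression, written in Python for brevity. To avoid tripping over words like '.Net' or numbers like 3.14, it only treats '.', '!' or '?' as a sentence end when whitespace follows. That rule is an assumption of this sketch, the function names are hypothetical, and abbreviations such as "e.g." followed by a space would still be split wrongly.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' only when whitespace follows,
    so '.Net' and '3.14' are left intact."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def average_sentence_length(text):
    """ASL = number of words divided by number of sentences."""
    sentences = split_sentences(text)
    if not sentences:
        raise ValueError("text contains no sentences")
    word_count = sum(len(s.split()) for s in sentences)
    return word_count / len(sentences)


if __name__ == "__main__":
    sample = "I use .Net daily. Pi is 3.14. Done."
    print(split_sentences(sample))
    print(average_sentence_length(sample))
```

Words are counted by splitting each sentence on whitespace, as suggested above, so punctuation attached to a word does not affect the count.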