jonesgj


Readability Function

Hi Experts,

I wanted to see if I could produce a readability function to examine the content of a text box. The formulae I need to apply would be:


1 - Flesch Reading Ease
The output of the Flesch Reading Ease formula is a number from 0 to 100, with a higher score indicating easier reading. The average document has a Flesch Reading Ease score between 60 and 70. The formula reads as follows:

206.835 – (1.015 x ASL) – (84.6 x ASW)
where:
ASL = average sentence length (the number of words divided by the number of sentences)
ASW = average number of syllables per word (the number of syllables divided by the number of words)
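
As a rough illustration of the formula above, here is a minimal sketch in Python (not the thread's VB.NET, but the arithmetic ports directly). The function name and the sample counts are made up for the example; getting real word, sentence and syllable counts out of a text box is a separate problem.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease from raw counts, using the formula quoted above."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    asl = words / sentences   # ASL: average sentence length
    asw = syllables / words   # ASW: average syllables per word
    return 206.835 - (1.015 * asl) - (84.6 * asw)


if __name__ == "__main__":
    # Hypothetical counts: 100 words, 5 sentences, 150 syllables.
    print(round(flesch_reading_ease(100, 5, 150), 1))
```

With those hypothetical counts ASL is 20 and ASW is 1.5, so the score lands a little under 60.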

AND

2 - Flesch-Kincaid Grade Level
The more common Flesch-Kincaid Grade Level formula converts the Reading Ease Score to a U.S. grade-school level.
(0.39 x ASL) + (11.8 x ASW) – 15.59

where:
ASL = average sentence length (the number of words divided by the number of sentences)
ASW = average number of syllables per word (the number of syllables divided by the number of words)
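
The grade-level formula takes the same two averages with different weights, so a sketch of it looks almost identical. Again this is Python with a made-up function name and made-up sample counts, only meant to show the arithmetic.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts, using the formula quoted above."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    asl = words / sentences   # ASL: average sentence length
    asw = syllables / words   # ASW: average syllables per word
    return (0.39 * asl) + (11.8 * asw) - 15.59


if __name__ == "__main__":
    # Hypothetical counts: 100 words, 5 sentences, 150 syllables.
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

For the hypothetical counts (ASL 20, ASW 1.5) this comes out at roughly grade 10.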


I do have another question asking how to examine a textbox content quickly, but I felt this was a different and more complex issue (unless someone has this built already!)


Anyway, hope you can help.

Kind regards

Jonesgj


PS I will be out for the next 6 or so hours, but will log on as soon as I return.

Alexandre Simões

Hi...

At first glance, the biggest problem is that ASW...
I'm not quite seeing how to produce an algorithm to get the number of syllables out of a word!

Despite that, ASL is just a matter of splitting the text into sentences on '.' (dot) and then splitting each sentence on ' ' (space) to get the number of words per sentence.
Sure, there may be exceptions, like the '.Net' in this very post, which trips up even MS Word...

RegEx can be used... especially for the sentence splitting...


Do you have any idea about that ASW thing?
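
A minimal sketch of that splitting idea with a regex, in Python. Two things here are assumptions rather than anything stated above: '!' and '?' are also treated as sentence enders, and a dot only counts when it is followed by whitespace or the end of the text, so '.Net' and '3.14' survive. Abbreviations such as 'e.g.' would still be split wrongly.

```python
import re

# One or more enders (. ! ?) followed by whitespace or end of text.
# The lookahead keeps dots inside ".Net" or "3.14" from ending a sentence.
SENTENCE_END = re.compile(r"[.!?]+(?=\s|$)")


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, dropping empty pieces."""
    return [part.strip() for part in SENTENCE_END.split(text) if part.strip()]


def average_sentence_length(text: str) -> float:
    """ASL: number of words divided by number of sentences."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    # Words are simply whitespace-separated tokens.
    word_count = sum(len(sentence.split()) for sentence in sentences)
    return word_count / len(sentences)


if __name__ == "__main__":
    sample = "I use .Net daily. Pi is 3.14 roughly! Ok?"
    print(split_sentences(sample))
    print(average_sentence_length(sample))
```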

Alex :p
ASKER CERTIFIED SOLUTION
Bob Learned
This solution is only available to members.
Yep... that way makes it very simple to calculate, leaving only the special 'words' containing '.'s (dots) for you to care about when splitting sentences.
By 'words' I'm also considering numbers with decimal separators and such represented by dots.

Alex :p
jonesgj

ASKER

Thanks Guys,

I agree, the ARI looks a lot easier. What I will do over the next few minutes is build something to calculate this, and then compare it against text for which I already know the Flesch Reading Ease and Flesch-Kincaid Grade Level scores. If the results are close, I'll use the ARI.
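
For reference, a minimal Python sketch of the ARI (Automated Readability Index) in its commonly published form, which needs only character, word and sentence counts and no syllables. The formula is not quoted anywhere in this thread, so treat the constants as the standard ones rather than as what was suggested here; the function name and sample counts are made up.

```python
def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """ARI in its commonly published form.

    characters: count of letters and digits (no spaces or punctuation).
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


if __name__ == "__main__":
    # Hypothetical counts: 450 characters, 100 words, 5 sentences.
    print(round(automated_readability_index(450, 100, 5), 2))
```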

I did find a definition for splitting syllables on the web:-


The dividing line between two syllables goes
- before one consonant:
ka-tu, lii-sa, suo-ma-lai-nen
- between two consonants:
kyl-lä, met-sä, haus-ka, a-me-rik-ka
- before the last of three consonants:
rans-ka, kort-ti
- between two vowels which do not form a diphthong:
lu-en, mai-to-a, ha-lu-ai-sin, ra-di-o
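
The four rules above can be sketched in Python roughly as follows. The first three collapse into one: break before the last consonant of any consonant run that sits between vowels. The fourth needs a list of diphthongs; the list below is an assumed inventory for Finnish (the language of the examples) and is not taken from the definition itself. Doubled vowels such as 'ii' are also assumed to stay together.

```python
VOWELS = frozenset("aeiouyäö")
# Assumed Finnish diphthong inventory; not part of the quoted rules.
DIPHTHONGS = frozenset({
    "ai", "ei", "oi", "ui", "yi", "äi", "öi",
    "au", "eu", "iu", "ou", "äy", "öy", "ey", "iy",
    "ie", "uo", "yö",
})


def syllabify(word: str) -> list[str]:
    """Split a word into syllables using the four rules quoted above."""
    word = word.lower()
    syllables = []
    current = ""
    seen_vowel = False
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in VOWELS:
            # Rule 4: a vowel right after a finished vowel nucleus starts a new syllable.
            if current and current[-1] in VOWELS:
                syllables.append(current)
                current = ""
            pair = word[i:i + 2]
            # A doubled vowel or a listed diphthong is kept as one nucleus.
            joins = len(pair) == 2 and pair[1] in VOWELS and (
                pair[0] == pair[1] or pair in DIPHTHONGS)
            step = 2 if joins else 1
            current += word[i:i + step]
            i += step
            seen_vowel = True
        else:
            # Rules 1-3: break before the last consonant of a run followed by a vowel.
            next_is_vowel = i + 1 < len(word) and word[i + 1] in VOWELS
            if seen_vowel and next_is_vowel:
                syllables.append(current)
                current = ""
            current += ch
            i += 1
    if current:
        syllables.append(current)
    return syllables


if __name__ == "__main__":
    for w in ["katu", "amerikka", "ranska", "haluaisin", "radio"]:
        print("-".join(syllabify(w)))
```

The syllable count for a word is then just the length of the returned list.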
Nice... :)
But those last 2 aren't that easy... especially the last one...

ARI sounds great to me... :))))))

Alex :p
jonesgj

ASKER

Alex .... I have compared the ARI and the Coleman-Liau to what I want and it almost aligns. However, the syllables are the key, as there are some anomalies in the test passages.

For example, on 2 selected pieces of published work:

Age   --  Flesch  --  ARI
15+   --  70.4    --  21.51
12+   --  80.3    --  20.953
11+   --  79.5    --  11.888
10+   --  82.1    --  8.910
11+   --  77.9    --  8.325  <<< !

As the Flesch calculations are in Word etc., you would have thought someone had done this already?

Anyway, if I looked at the syllable approach again, what would be the best approach, ignoring the last two rules? If I could do this, I could then do another comparison.

Thanks again..


Just don't forget that these rules are language specific... not universal.

I think the best approach is to create a list of the consonant chars... something like:

        Dim consonants As New ArrayList
        ' AddRange copies the chars into the list; the c suffix makes each literal a Char
        consonants.AddRange(New Char() {"b"c, "c"c, "d"c, "f"c, "g"c, "h"c ... and so on})

I'm using an ArrayList so we can check whether a char is a consonant with:

        If consonants.Contains("c"c) Then...


This work must be done word by word... letter by letter...

The first rule is simple. For each letter in each word:
- if the letter is a consonant and is at neither the beginning nor the end of the word, then it marks the end of a syllable, excluding the consonant char.

The second one is much the same as the first, and I think you must evaluate them together.
- if the letter is a consonant followed by another consonant, and is at neither the beginning nor the end of the word, then it marks the end of a syllable, including the consonant char.
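
A rough Python sketch of that letter-by-letter scan, using only the consonant rules and ignoring the vowel rule, so neighbouring vowels always count as one syllable. The consonant set, the choice to treat 'y' as a vowel, and both function names are assumptions for illustration. Expect it to undercount words like 'radio' and overcount English silent-e words like 'ease'.

```python
# Assumed consonant set; anything alphabetic that is not in it is treated as a vowel.
CONSONANTS = frozenset("bcdfghjklmnpqrstvwxz")


def count_syllables(word: str) -> int:
    """Count syllables by counting consonant-rule boundaries, plus one."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return 0
    boundaries = 0
    seen_vowel = False
    for i, ch in enumerate(letters):
        if ch in CONSONANTS:
            # A boundary falls before a consonant that has a vowel somewhere
            # before it and a vowel directly after it (so never at word start or end).
            next_is_vowel = i + 1 < len(letters) and letters[i + 1] not in CONSONANTS
            if seen_vowel and next_is_vowel:
                boundaries += 1
        else:
            seen_vowel = True
    return boundaries + 1


def average_syllables_per_word(text: str) -> float:
    """ASW: number of syllables divided by number of words."""
    counts = [count_syllables(w) for w in text.split()]
    counts = [c for c in counts if c > 0]   # skip tokens with no letters
    return sum(counts) / len(counts) if counts else 0.0


if __name__ == "__main__":
    for w in ["katu", "amerikka", "readability", "radio"]:
        print(w, count_syllables(w))
```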


Alex :p
jonesgj

ASKER

Does anyone know whether I could somehow pass my text to Word 97 or above, and get back either all the stats, the number of syllables, the Flesch Reading Ease score, or the Flesch-Kincaid Grade Level score?


Nope... sorry...

Alex :p
jonesgj

ASKER

Thanks Alex .... I'm desperately trying to avoid work!   :-)
I can see that...
But at the same time I don't know if Word interop is the salvation for you...

I would try to develop my own code.

Alex :p
jonesgj

ASKER

Thanks Guys.

After much lengthy searching, I have to go with the simpler answer, especially in the interests of time.

Jonesgj