  1. #1
    Platinum Lounger
    Join Date
    Feb 2001
    Location
    Yilgarn region of Toronto, Ontario
    Posts
    5,453
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Syllables of a word (Word97SR2+)

    I wanted some simple code to compute the Gunning-Fog index (readability) of text, and for that I thought I needed to determine the number of syllables in an English word.

    Here's the complete template.

    The syllable-counting code is not complete. It still needs a lot of adjustment and fine-tuning for special cases.

    Examine the Module MAIN for the two test macros.

  2. #2
    Star Lounger
    Join Date
    Dec 2000
    Location
    Tacoma, Washington, USA
    Posts
    68
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Syllables of a word (Word97SR2+)

    BTW, have you tried Tools>Options Spelling & Grammer "Show readability statistics"?
    Among other things it reports the Flesch Reading Ease score, which rates text on a 100-point scale; the higher the score, the easier it is to understand the document. For most standard documents, aim for a score of approximately 60 to 70.

    The formula for the Flesch Reading Ease score is:

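    206.835 - (1.015 x ASL) - (84.6 x ASW)

    where ASL is the average sentence length (words divided by sentences) and ASW is the average number of syllables per word (syllables divided by words).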
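    For anyone who would rather read those built-in figures from a macro than from the dialog, here is a minimal sketch. It assumes the installed version of Word exposes the ReadabilityStatistics collection on a Range and that the proofing tools for the document's language are present; the macro name is made up for illustration.

```vba
Sub ShowReadabilityStatistics()
    ' Hypothetical helper name. Asking for ReadabilityStatistics makes Word
    ' run its grammar check over the range, so it can be slow on long text.
    Dim stat As ReadabilityStatistic
    Dim strReport As String

    For Each stat In ActiveDocument.Content.ReadabilityStatistics
        ' Each item is a name/value pair, e.g. "Flesch Reading Ease".
        strReport = strReport & stat.Name & ": " & stat.Value & vbCrLf
    Next stat

    MsgBox strReport, vbInformation, "Readability statistics"
End Sub
```

    The list Word returns does not include a Gunning-Fog figure, which is why that one still has to be coded by hand.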

  3. #3
    Platinum Lounger

    Re: Syllables of a word (Word97SR2+)

    > Tools>Options Spelling & Grammer (ooops!)

    Thank you, and yes, I have. However, the motivation for coding the FOG index was its appearance in the textbook I'm using for a Business English course.

    I'd love to hear your feedback on the syllable-counting portion of the code.

    Most algorithms for readability seem to focus on length of sentence and count of "complex" words. What else might one use?

  4. #4
    Platinum Lounger

    Re: Syllables of a word (Word97SR2+)

    Here is "better" code using a crude Regular Expression; I'm not happy with the definition of a "word", but it's a start ....
    <pre>' The RegExp object needs a reference to "Microsoft VBScript Regular
    ' Expressions" (Tools > References in the VBA editor).
    Sub FOG()
        ' A selection shorter than 100 characters is treated as "nothing
        ' selected", so the whole document is scored instead.
        If Len(Selection.Text) < 100 Then
            MsgBox sngFogIndex(ActiveDocument.Range)
        Else
            MsgBox sngFogIndex(Selection.Range)
        End If
    End Sub

    Public Function sngFogIndex(rng As Range) As Single
        Dim sngWords As Single
        Dim sngSentences As Single
        Dim sngAverageWordsPerSentence As Single
        Dim sngLongWords As Single
        Dim sngPercentageLongWords As Single
        Dim sngResult As Single

        sngWords = rng.Words.Count
        sngSentences = rng.Sentences.Count
        sngAverageWordsPerSentence = sngWords / sngSentences

        ' "Long" words are those whose syllable estimate exceeds 3.
        sngLongWords = lngCountWordsSyllables(rng, 3)
        sngPercentageLongWords = 100 * sngLongWords / sngWords

        sngResult = sngAverageWordsPerSentence + sngPercentageLongWords
        sngFogIndex = 0.4 * sngResult
    End Function

    ' Returns the number of words in rng whose syllable estimate is greater
    ' than lngCount.
    Public Function lngCountWordsSyllables(rng As Range, lngCount As Long) As Long
        Dim wd As Range
        Dim lngResult As Long

        For Each wd In rng.Words
            Application.StatusBar = wd.Text   ' progress indicator
            If lngSyllables(wd.Text) > lngCount Then
                lngResult = lngResult + 1
            End If
        Next wd
        lngCountWordsSyllables = lngResult
    End Function

    ' Crude syllable estimate: the number of matches of "optional vowels
    ' followed by one or more non-vowels" in strText.
    Public Function lngSyllables(strText As String) As Long
        Dim re As New RegExp
        Dim matches As MatchCollection

        With re
            .Global = True
            .Pattern = "[aeiou]*[^aeiou]+"
            Set matches = .Execute(strText)
            lngSyllables = matches.Count
        End With
    End Function

    'Sub TESTlngSyllables()
    ' MsgBox lngSyllables("establish")
    'End Sub</pre>
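    For reference, the quantity that sngFogIndex works out can be written as a single formula. The symbols W, S and L are shorthand introduced here (they do not appear in the code) for sngWords, sngSentences and sngLongWords.

```latex
\[
\mathrm{FOG} = 0.4\left(\frac{W}{S} + 100\cdot\frac{L}{W}\right)
\]
```

    As a made-up illustration, a passage with W = 100 words, S = 5 sentences and L = 10 long words gives 0.4 x (20 + 10) = 12.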


  5. #5
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,054
    Thanks
    2
    Thanked 417 Times in 346 Posts

    Re: Syllables of a word (Word97SR2+)

    Hi Chris,

    I think the line:
    .Pattern = "[aeiou]*[^aeiou]+"
    in the lngSyllables function should be changed to:
    .Pattern = "[^bcdfghjklmnpqrstvwxz]+"

    The reasons are twofold:
    [aeiou]*
    does nothing in this context and
    .Pattern = "[aeiou]*[^aeiou]+"
    with or without "[aeiou]*" doesn't pick up syllables that form word endings, such as in "baby", but interprets "Dad" as having two syllables.

    Both approaches fail, though, when the word ends in a silent vowel (eg "babe") or has "es" to form the plural without sounding the "e". Still, counting these as adding to the text's complexity might be reasonable, on the premise that they take extra effort to apply correctly (kinda like saying they're silent syllables).
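    One possible way to soften the silent-vowel problem is sketched below. It is a rough heuristic, not a fix: it lower-cases the word, strips everything that isn't a letter, drops a single trailing "e" (unless the word ends in "le", as in "table"), and then counts runs of vowels with "y" treated as a vowel. The function name is hypothetical, and unsounded "es" plurals are still miscounted.

```vba
' Needs a reference to "Microsoft VBScript Regular Expressions".
' lngSyllablesSketch is a hypothetical name, not part of the posted template.
Public Function lngSyllablesSketch(ByVal strText As String) As Long
    Dim re As New RegExp
    Dim strWord As String
    Dim lngCount As Long

    re.Global = True

    ' Lower-case and keep letters only, so trailing spaces, punctuation
    ' and capitals can't be mistaken for part of a syllable.
    re.Pattern = "[^a-z]"
    strWord = re.Replace(LCase$(strText), "")
    If Len(strWord) = 0 Then Exit Function      ' returns 0

    ' Assumption: a final "e" is silent unless the word ends in "le".
    If Len(strWord) > 2 And Right$(strWord, 1) = "e" _
            And Right$(strWord, 2) <> "le" Then
        strWord = Left$(strWord, Len(strWord) - 1)
    End If

    ' Each run of vowels (including "y") is taken as one syllable.
    re.Pattern = "[aeiouy]+"
    lngCount = re.Execute(strWord).Count

    ' Every real word has at least one syllable ("the" ends up here).
    If lngCount = 0 Then lngCount = 1
    lngSyllablesSketch = lngCount
End Function

Sub TESTlngSyllablesSketch()
    Debug.Print lngSyllablesSketch("babe")       ' 1
    Debug.Print lngSyllablesSketch("baby")       ' 2
    Debug.Print lngSyllablesSketch("Dad")        ' 1
    Debug.Print lngSyllablesSketch("establish")  ' 3
End Sub
```

    Because it changes which words reach three syllables, results from this version are not comparable with figures produced by either pattern above.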

    Later:
    Regarding the word count in the sngFogIndex function, I think you should use:
    sngWords = rng.ComputeStatistics(wdStatisticWords)
    instead of
    sngWords = rng.Words.Count
    That's because .Words.Count counts punctuation marks and paragraph marks as well as words, while .ComputeStatistics(wdStatisticWords) returns a count of words only. You can see the effect of this if you run the code on a simple document and add/delete some empty paras. This may address your concerns over the definition of a Word.
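    A quick way to see the difference on your own text is to show the two counts side by side. This is only a small sketch; the macro name is made up.

```vba
Sub CompareWordCounts()
    ' Hypothetical helper: reports both word counts for the whole document.
    Dim rng As Range
    Set rng = ActiveDocument.Content

    MsgBox "Words.Count: " & rng.Words.Count & vbCrLf & _
           "ComputeStatistics(wdStatisticWords): " & _
           rng.ComputeStatistics(wdStatisticWords)
End Sub
```

    Run it, add a few empty paragraphs, and run it again: only the first figure should move.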

    There is also a problem with the Sentence count, in that Word doesn't differentiate between an abbreviation followed by a period and a sentence. For example, "Hello, Mr. Chips." counts as two sentences. I imagine one could code around that with a great deal of effort ...
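    A very partial workaround, shown only as a sketch: skip any "sentence" that ends in a known abbreviation, on the assumption that Word split it there by mistake. The abbreviation list and the function name are hypothetical, the comparison is case-sensitive, and a real sentence that happens to end in one of these abbreviations will be undercounted.

```vba
' lngSentencesSketch is a hypothetical name, not part of the posted template.
Public Function lngSentencesSketch(rng As Range) As Long
    Dim varAbbrev As Variant
    Dim sen As Range
    Dim strSen As String
    Dim i As Long
    Dim blnAbbrev As Boolean
    Dim lngResult As Long

    ' Illustrative list only; extend it to suit your own documents.
    varAbbrev = Array("Mr.", "Mrs.", "Ms.", "Dr.", "e.g.", "i.e.")

    For Each sen In rng.Sentences
        ' Word includes the trailing space in each sentence, so trim it.
        strSen = Trim$(sen.Text)

        blnAbbrev = False
        For i = LBound(varAbbrev) To UBound(varAbbrev)
            If Right$(strSen, Len(varAbbrev(i))) = varAbbrev(i) Then
                blnAbbrev = True
                Exit For
            End If
        Next i

        ' Count the piece only if it doesn't stop at an abbreviation.
        If Not blnAbbrev Then lngResult = lngResult + 1
    Next sen

    lngSentencesSketch = lngResult
End Function
```

    If Word splits "Hello, Mr. Chips." after "Mr." as described, the first piece is skipped and the whole thing counts once.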

    Cheers,

    Paul Edstein
    [MS MVP - Word]

  6. #6
    Platinum Lounger

    Re: Syllables of a word (Word97SR2+)

    Macropod, thanks for the constructive criticism.

    I agree with all your points. The code is somewhat slipshod in its components, but serves me well for gaining a rough value of the FOG index. In particular, given a chunk of text, such as a one-page memo, I needed a means to observe if the FOG index had improved between draft and final version, so almost any method would suffice - as long as the tool was consistent.

    > .Pattern = "[^bcdfghjklmnpqrstvwxz]+"
    Quite so. My RegExp skills are still nascent, and I'd toyed with a more complex string, but settled for what I thought was "a series of vowel strings followed by consonant strings". Your specification says, I think, "a series of strings of anything that isn't a consonant" (in effect the vowel runs, "y" included), which is more accurate.

    >ends in a silent vowel (eg "babe")
    Quite so, and here too I was aware that my string wasn't perfect, but figured (again) that as long as the defect was applied consistently, I'd be OK. In this case I figured that my syllable count would tend to be elevated, so that the FOG index would be elevated, and I had to bear in mind that if I got a value of 10.7 it was probably, really, a little less. Nonetheless, a comparison of before/after should indicate if any change had taken place.

    > sngWords = rng.ComputeStatistics(wdStatisticWords)
    Again, agreed. I think I once found four different ways of counting words, each providing a different result. In debugging this code I'd noticed "." as a word, and shrugged again, on my grounds that consistent defects provided a consistent comparison.

    > problem with the Sentence count
    Right, and yes, effort, which I didn't want to spend for this purpose. In class we might devote ten minutes to the FOG index, and my goal is to provide a handy tool that gives a general idea ("Here's a little template on a floppy; if you are worried about viruses, the code is short and you can see it all ....") without offering a major opus involving complex iteration.


    In particular, I don't like the loop through the .Words in the range. I feel that some of those loops take an absurd amount of time. I'd be inclined to use a fast method to extract strings and process only those. (Your homework assignment (grin!) is to devise a RegExp that will quickly extract only strings of length > 5 from a string of text; if nothing else this ought to isolate candidates for three syllables (3 vowels, 2 consonants or longer) and reduce the iterations.)
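    One way the homework might be attempted, as a sketch only: a pattern that matches runs of six or more letters between word boundaries, applied once to the text of the range. The macro name is made up, "word" here means letters only (so hyphenated words and words with apostrophes are split), and whether this is really faster than looping through .Words has not been measured.

```vba
' Needs a reference to "Microsoft VBScript Regular Expressions".
Sub ListLongWordCandidates()
    ' Hypothetical helper: prints every string of more than 5 letters.
    Dim re As New RegExp
    Dim matches As MatchCollection
    Dim m As Match

    With re
        .Global = True
        .IgnoreCase = True
        .Pattern = "\b[a-z]{6,}\b"   ' six or more letters, i.e. length > 5
        Set matches = .Execute(ActiveDocument.Content.Text)
    End With

    Debug.Print matches.Count & " candidate(s)"
    For Each m In matches
        ' Only these strings would need to go to the syllable counter.
        Debug.Print m.Value
    Next m
End Sub
```

    The candidates appear in the Immediate window; in the real code each m.Value would be passed to the syllable function instead of printed.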

    Thanks again for the input. I will incorporate your suggestions and compare the two styles.

    And now on to your next message ..... (later: I noticed that you combined the two into one post. Oh well ...)

    >I noticed that you haven't counted 'y'

    Right. Same reasons as above. I went for a quick-and-dirty that gave results.

  7. #7
    Platinum Lounger

    Re: Syllables of a word (Word97SR2+)

    Results of a quick test:
    I tested a single one-page memo (letter), using both the entire document, and just the body - the paragraphs between "Dear Sir" and "Yours sincerely".
    <pre>            All         Body
    Me          11.16308    14.0771
    Pattern     12.49641    15.42976
    Words       12.7947     15.4073
    </pre>

    A better pattern ("[^bcdfghjklmnpqrstvwxz]+") gave me a higher FOG index; that suggests that your pattern found more 3+syllable words than did mine. I'd go with that, because I'm not even sure that my pattern detected trailing syllables. ("Babe" is two syllables in most of the pop-songs of my age group!)
    A better word count raised the FOG index above my original method, BUT in combination with your better syllable count, the FOG index dropped (15.42976 to 15.4073). Presumably a more accurate (lower) value for AverageWordsPerSentence outweighed a more accurate and higher value for PercentageLongWords.

    Regardless, applying your two corrections shows me that my FOG index for this single memo is significantly higher than I had thought, and since my purpose is to gain an alert whenever the wording gets too complex, this is good. (The theory is a FOG index between 8 and 12 is OK, but outside those bounds is too simple or too complex.)

    So, your fixes are very much IN! Thanks again.
