Crash words in the Eloquence speech synthesizer

The Eloquence speech synthesizer is an old but popular synth used in many screen readers, such as JAWS, Window-Eyes, and System Access, as well as in cellphones and notetakers.

It’s widely used, responsive, and very understandable at high speeds. Despite that, it has at least one problem: it can crash when it reads certain words. Because of the way screen readers are designed, when the synthesizer crashes, your screen reader goes with it. After that happens, it can be difficult to reload, sometimes requiring the dismissal of error dialogs.

This leads to a question. Are the crashes simple crashes, or can they lead to buffer overflows and the ability to execute code via a carefully crafted string? If code execution is possible, it might be difficult to get a payload through the preprocessing most screen readers do, but some other program that uses Eloquence might do less preprocessing. I don’t have the knowledge and experience to check for myself, but someone might.

What can we do about this? As long as Eloquence continues to be used, not much, unless the screen reader vendors release patches. GW Micro has an Eloquence Fix script, which can easily be updated as new words are found. I don’t know of any fixes for the others. Recent versions of JAWS fixed a few of the crash words, but not all.

Unless strings are converted from Unicode before checking (e.g. Python’s s.encode('mbcs')), several characters in the Unicode range can be substituted for the letters, defeating any simple regular expression. I think that the GW Micro script already does this.

In the following section, delete the slash (/) character from the words listed. I’ve put it there to avoid crashing anyone using Eloquence to read this page. “re:” at the beginning of a line means that the line is a regular expression describing part of the word. See the comments on the line before it. These regular expressions might not be 100% correct, or might pick up false positives. Experiment a bit.
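To illustrate the point above about converting from Unicode before checking, here is a minimal Python sketch. It is not the GW Micro script and not what any screen reader actually does. The function name fold_for_matching and the choice of cp1252 are assumptions for illustration. s.encode('mbcs') only exists on Windows, so this sketch approximates it with NFKC normalization followed by a round trip through a fixed code page.

```python
import unicodedata


def fold_for_matching(text: str, codepage: str = "cp1252") -> str:
    """Fold look-alike Unicode characters before running crash-word checks.

    Hypothetical helper: an approximation of s.encode('mbcs'), which is
    only available on Windows.
    """
    # NFKC maps compatibility characters (for example fullwidth letters)
    # to their plain equivalents.
    folded = unicodedata.normalize("NFKC", text)
    # Round-trip through the code page; anything it cannot represent
    # becomes "?", so it can no longer stand in for a letter.
    return folded.encode(codepage, errors="replace").decode(codepage)
```

Run any regular expression checks on the folded string rather than on the raw input. Characters that NFKC does not map to a plain letter (for example Cyrillic look-alikes) end up as "?" here, which may or may not match how a given screen reader converts them.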

We need to worry about prefixes/suffixes for this one, e.g. re-, anti-, -ing, etc.
re: c/aesur.+

These next few all have one common pattern: a prefix ("h'", "j'", "s'", "x'", "z'") followed by something else,
followed by "'re" or "'ve". I won't list the prefix/suffix, just the middle part.
Concatenate the middle parts with apostrophes to get endless combinations, e.g. prefix+a'b'c+suffix.
They don't seem to work in a sentence,
but if you can get them to read on their own they should crash. Experiment a bit.
Empty - prefix+suffix
s, d, hs, js, ll, xs, zs
re: bj+s
re: bx+s
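The prefix / middle / suffix description above can be turned into a single detection pattern. This is only a sketch under a literal reading of the list: the names MIDDLE, APOSTROPHE_CRASH_RE and has_apostrophe_crash_word are made up for illustration, and the pattern assumes straight apostrophes and that the empty case really is prefix immediately followed by suffix. It has not been tested against Eloquence itself.

```python
import re

# Middle parts listed in the post; "+" means the preceding letter may repeat.
MIDDLE = r"(?:s|d|hs|js|ll|xs|zs|bj+s|bx+s)"

# prefix, then zero or more middle parts joined by apostrophes, then suffix.
APOSTROPHE_CRASH_RE = re.compile(
    rf"\b[hjsxz]'(?:{MIDDLE}(?:'{MIDDLE})*)?'(?:re|ve)\b",
    re.IGNORECASE,
)


def has_apostrophe_crash_word(text: str) -> bool:
    """Return True if text contains a word shaped like the pattern above."""
    return APOSTROPHE_CRASH_RE.search(text) is not None
```

Ordinary contractions such as "they're" or "you've" do not match, because the letter before the apostrophe is not one of the listed prefixes.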

Next, we have word+hesday and word+hesway, e.g.
wed/hesday. I think the word needs to end in a consonant.
But we also have, at the end of a word:
re: hh+s[dw]ay

This one crashes when it is next to a number or by itself, but not as part of a word, e.g. nietzsche won't crash.
tz/sche
re: \d+:\d+(1st|2nd|3rd|...)
e.g. 2:33 rd
Some of them don't work, like 4th. I don't know why.
re: (un|re|non)cosp
juaras - this one is weird.
Add text after it, and it'll eventually crash. Example:
juaras/aaaaaaaaa aa
Also juares, juaros, juarus.

Last updated: June 21, 2015

14 Responses to “Crash words in the Eloquence speech synthesizer”

  1. Dentin Says:

    Thanks for this list Tyler. I believe we’ve managed to mask all of these patterns out using the socket filters on Alter Aeon; it took a few tries to get patterns that don’t trigger in general usage, but I think we’re there now.

    If you find any that still make it through the filters, drop me a line and I’ll take care of it.

  2. Dentin Says:

    Tyler,

    Have you found anything new recently? I haven’t heard of any issues since the last round of filter updates.

  3. Bill Says:

    Hi, Tyler. Thanks for this summary. I’ve written an initial voxin driver for a new speech back-end for Orca and NVDA, and this was very useful.

  4. Dentin Says:

    FYI Bill, I’ve not seen any other instances of crash words come up since June, so this list is probably pretty close to complete. Tyler did a bang-up job here.

  5. Dentin Says:

    I just got a crash report on the words ‘web hes kill’ and ‘ged hes kill’ (remove spaces for the actual crash words.) I’m trying to get people to verify it now.

    Tyler, would it be possible for you to narrow down the matching string for this? I currently have ‘hes k’ masked out, which I’m guessing will be sufficient, but it might have a more general pattern.

    Thanks. Your work has been extremely helpful.

  6. Dentin Says:

    I have independent verification on those two words. No other shorter matches appear to trigger it. Validated on Jaws 12 and 13, as well as another unspecified version.

  7. Tyler Spivey Says:

    You opened up a whole new can of worms. Here we go.
    web hes kill – crashes
    web hes kil (with only one l) – no crash
    wed hes kill – crashes, as I expected
    So that same rule applies. That’s why jed hes kill will crash it. Now, let’s try something with the same format that won’t match your hesk pattern:
    web hes bill – crashes
    web hes sill – crashes

    Searching around a bit, I found:
    web hes {reich, rob, rock, root, thorpe, tide, side} – crashes

  8. Dentin Says:

    Thank one of my jerk players for this can of worms. I have no idea where he got it from.

    Do you think that a better pattern than hesk might be to mask “dhes” and “bhes”? With that many trailing words I don’t think a trailing match is going to do the job.

  9. Tyler Spivey Says:

    If you block “dhes”, you’ll end up blocking adhesives.
    Letting it read the output of this, I found a few more:
    for i in we{a..z}{a..z}{a..z}skill;do echo $i;done

    we[bdfjlmnqvz] {hhs,hes} kill

  10. Dentin Says:

    Does the we[bdfjlmnqvz] {hhs,hes} pattern also work with other trailing words, like rob/rock/root/tide/side etc?

    This one looks like a real nightmare to filter properly.

  11. Tyler Spivey Says:

    It does. e.g. weq hes tide will crash.
    I thought the first part didn’t have to be 3 letters, so I started trying things. We get at least these that aren’t 3 letters:
    {ed, dead, undead, yb} – combine those with the rest, and it dies.

    The only common thing seems to be the second part, and I guess the first part has to end in one of the letters from the last comment.

    Given that logic, adhesives should crash it, but it doesn’t. Ok, so the third part can only start with a consonant, which narrows it down even further.

    If you can afford to put a regex parser in your input path (PCRE or something), you should be able to match:
    ([bdfjlmnqvz](hhs|hes)[bcdfghjklmnpqrstvwxz])
    And globally replace with ****, since it should always match 4 characters. Problem is figuring out whether that occurs in a real english word or not, or if there’s something really obvious that can be done to make it crash again.
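A minimal Python sketch of the masking idea in the comment above, using the standard re module in place of PCRE. The names CRASH_RUN_RE and mask_crash_runs are hypothetical. Note that the pattern as written spans five characters (one letter, three for hhs or hes, one consonant), so the sketch replaces each match with asterisks of the same length instead of a fixed ****.

```python
import re

# Pattern from the comment above, with a non-capturing inner group.
CRASH_RUN_RE = re.compile(
    r"[bdfjlmnqvz](?:hhs|hes)[bcdfghjklmnpqrstvwxz]",
    re.IGNORECASE,
)


def mask_crash_runs(text: str) -> str:
    """Replace every match with asterisks, keeping the text length the same."""
    return CRASH_RUN_RE.sub(lambda m: "*" * len(m.group(0)), text)
```

With this pattern, mask_crash_runs("adhesives") returns the word unchanged, because the letter after "hes" is a vowel. Whether the pattern also hits real English words, or can be sidestepped, is the open question the comment raises.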

  12. Dentin Says:

    In the immediate short term, because I don’t have easy regex matching in the server, I’m going to just mask out hhs and [bdfjlmnqvz]hes. I’m ok with losing ‘adhesives’ for now.

    I did scan for hhs and [bdfjlmnqvz]hes in our data files, and the only matches were a bogus player name and the word adhesives, so this looks reasonably safe.

    Thanks for your help.

  13. Daniel Parker Says:

    Thanks for this list. I had a friend tell me about another crash string: any number, followed by a colon, followed by this string (without spaces or quotes, of course): “2 2 n d”. It worked. Have never seen this pattern on other lists of Eloquence crash words so I thought I’d put it out there.

    • Dennis Towne Says:

      Yeah, we’ve known about that one for a while, but I’m not sure why it never got posted here. The four sets of five character expressions I use for filtering are:

      : [0-9] 1 s t
      : [0-9] 2 n d
      : [0-9] 3 r d
      : [0-9] [0456789] t h
