Stolen from Becky S3o…

“Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a total mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.”
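The scrambling trick the quote describes (keep the first and last letter, shuffle the middle) is easy to reproduce. Here's a quick sketch in Python; the word splitting is naive and ignores punctuation, so it's just for playing around:

```python
import random

def scramble_word(word):
    """Shuffle the interior letters of a word, keeping first and last fixed."""
    if len(word) <= 3:
        return word  # nothing interior to shuffle
    inner = list(word[1:-1])
    random.shuffle(inner)
    return word[0] + "".join(inner) + word[-1]

def scramble_text(text):
    # Naive: splits on whitespace only, so punctuation stays glued to words.
    return " ".join(scramble_word(w) for w in text.split())

print(scramble_text("according to a researcher at Cambridge University"))
```

Run it a few times and you'll get a different mess each run, but one you can probably still read.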

Interesting, huh? Maybe not. I dunno. One thing I learned in my linguistics classes is that humans use a lot of grammar as we read. It's not that we start from letters, then figure out words, then take the words and figure out sentences, and so forth. We actively apply grammar while we're reading, so it's almost top-down instead of bottom-up. You can see this in the kinds of mistakes we regularly make. The classic example is:

“The horse raced past the barn fell.”

Most people, when they read that, see it as ungrammatical. But it's all in how you parse it. If you take "raced" not as something the horse itself did, but as something done to the horse, more like "The horse that was raced past the barn fell," then the sentence makes sense. But most people don't parse it that way. This is taken as evidence that we don't take in the whole sentence and then parse it; we build the parse from a grammar as we go, and commit to a reading before the sentence is over.
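The two readings can be sketched as bracketed structures. This is a toy notation in Python tuples, not any particular linguistic formalism, but it shows where the garden-path reading goes wrong:

```python
# Garden-path reading: "raced" taken as the main verb.
# Once you commit to this, "fell" has nowhere to attach.
garden_path = ("S",
    ("NP", "the horse"),
    ("VP", "raced", ("PP", "past the barn")),
    "fell")  # stranded word -> the parse fails

# Correct reading: "raced past the barn" is a reduced relative
# clause modifying "the horse", and "fell" is the main verb.
correct = ("S",
    ("NP", "the horse", ("RelClause", "raced past the barn")),
    ("VP", "fell"))

print(garden_path[-1])   # the leftover word in the bad parse
print(correct[-1])       # a complete verb phrase in the good parse
```

The point is that readers build the first structure incrementally and have to backtrack when "fell" arrives, which is why the sentence feels broken even though it's grammatical.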

Anyway, that's why most text and speech recognition programs don't do too well compared to humans. Their grammars are too limited, they don't use enough linguistic information, and they're almost completely bottom-up in their processing. No machine could read the scrambled paragraph at the top, for instance. So it really makes sense to build more grammatical understanding into text/speech recognition systems to improve them. But there are drawbacks which I don't remember. Computational complexity? Overspecificity? Something like that.
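One simple way recognizers do fold in linguistic information is to rescore acoustically similar candidates with a language model. Here's a toy sketch: a tiny bigram model built from a made-up corpus picks the grammatical candidate over a sound-alike (the classic "recognize speech" vs. "wreck a nice beach" pair). The corpus and scoring are invented for illustration; real systems use probabilities over huge corpora, but the idea is the same:

```python
from collections import defaultdict

# Toy training corpus (made up for this sketch).
corpus = [
    "the horse fell near the barn",
    "the horse raced past the barn",
    "recognize speech with a grammar",
]

# Count word bigrams, with <s> and </s> as sentence boundaries.
bigrams = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    for a, b in zip(words, words[1:]):
        bigrams[a][b] += 1

def score(sentence):
    """Count how many of the sentence's bigrams were seen in training."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    return sum(1 for a, b in zip(words, words[1:]) if bigrams[a][b] > 0)

# Two candidates an acoustic model might confuse:
candidates = [
    "recognize speech with a grammar",
    "wreck a nice beach with a grammar",
]
best = max(candidates, key=score)
print(best)  # the candidate the language model prefers
```

Even this crude count-based score prefers the word sequence it has seen before, which is exactly the kind of top-down knowledge a purely bottom-up recognizer lacks.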

Speech recognition is actually really interesting. There are so many problems with it. Like, most people don't realize that the basic units their speech breaks down into, the phonemes, don't correspond to written syllables. That becomes a problem because when a machine doesn't understand someone, the speaker tends to talk slower and "clearer," but they start splitting up their pronunciation along written syllables instead of their natural phonemes, and that just confuses the system more.

Whatever, boring, I guess.