The Myth of Ambiguity


This comes as a little addition to my previous entry, The Phoneticization of Chinese, which talked about why it sucks that Chinese is not a deterministically phonetic script (read about ‘determinism’ with regard to scripts at the end of this piece), why it should be one, and, finally, an example of a simple system that could implement a phonetic script for Chinese. One of the big questions that comes out of it is this: if Chinese writing becomes completely phonetic, will readers lose the ability to disambiguate easily? The answer is “Yes and No”; I shall tackle both sides of the answer below.

Yes, they will lose the ability to disambiguate

If you’re walking down the street today and you see the character 医, you know it’s not ‘one’ or ‘clothes’ but something to do with doctors or medicine: perhaps a clinic or a hospital. If you see ‘Yī’ instead, you do not have that same power of disambiguation. Unless there is some obvious context, you could be left guessing whether the sign marks a clothes shop or a clinic. With toneless pinyin, which is very common on street signs in China and Taiwan, you are left with even more possibilities as to what ‘Yi’ could mean. Similarly, with the 心言水木 system you no longer have that power of disambiguation; you do, however, know the tone, because it is embedded in the character.

The basic limitation of a phoneticization system for Chinese is that it relies on context: speakers can easily disambiguate between terms when they are placed (a) in context and (b) in polysyllabic words. So, for example, if somebody just threw the word ‘zhǎo’ at you, you might not be able to tell the person whether it meant ‘to look for’, ‘to ask for’ or ‘to give change’, but if the same person gave you ‘xúnzhǎo’ instead, you could say, without hesitation, that it meant ‘to seek/look for’. However, Classical Chinese was never meant to be spoken aloud, so it does not resolve its meanings through context and polysyllabic words to the same degree as Spoken Chinese does. A phonetic system for written Classical Chinese would therefore be infeasible, as the text would become even more indecipherable than it is today. Another area where such a phonetic system would be infeasible is the transcription of Chinese names, which usually string together words of different meanings, and those meanings tend not to be distinguishable from context.
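To make the point about polysyllabic words concrete, here is a minimal sketch in Python. The lexicon is a toy one built only from the ‘zhǎo’ and ‘xúnzhǎo’ examples above, and the names LEXICON, candidate_meanings and is_ambiguous are my own, purely for illustration; a real dictionary would be far larger and would also have to weigh the surrounding sentence.

```python
# Toy lexicon taken from the examples in the text (an assumption for
# illustration, not a real dictionary).
LEXICON = {
    "zhǎo": ["to look for", "to ask for", "to give change"],
    "xúnzhǎo": ["to seek/look for"],
}


def candidate_meanings(word):
    """Return every meaning the toy lexicon lists for a phonetic spelling."""
    return LEXICON.get(word, [])


def is_ambiguous(word):
    """A spelling is ambiguous here if the lexicon lists more than one meaning."""
    return len(candidate_meanings(word)) > 1


if __name__ == "__main__":
    for word in ("zhǎo", "xúnzhǎo"):
        print(word, candidate_meanings(word), "ambiguous:", is_ambiguous(word))
```

The bare syllable comes back with three candidates, while the two-syllable word comes back with exactly one, which is all the listener needs.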

No, they will not lose the ability to disambiguate

For this part, I draw my comparisons from the two languages I am most familiar with: English and Hindi. When students start learning Chinese and start using computers for typing, they soon realize how many different characters have the exact same pronunciation. And, usually, it’s quite disheartening to find out that there is so much more yet to be learnt. In reality, most of the characters that come at the ends of those lists are so rare that you may never encounter them in your entire life. However, let’s say that there are ten to twenty-odd relatively frequently used meanings for each syllable in Chinese; is this not the case in other languages such as English and Hindi?

First of all, let me take a somewhat amusing example from Hindi. We have a word, kal or कल (pronounced ‘cull’ with an unaspirated ‘k’) that means both ‘tomorrow’ and ‘yesterday’; similarly, another word parson or परसों (pronounced ‘per-so-ng’) means both ‘the day after tomorrow’ and ‘the day before yesterday’. Yet, never in my life have I been confused about which one it meant; in fact, I never even actively thought about the fact that they were the same word until a friend of mine pointed it out to me sometime after I arrived in the United States at the age of eighteen. Of course, if you handed me the word kal randomly and out of context, I suppose I wouldn’t be able to tell you what it meant, but if you put it in a sentence (and Hindi has tense), it is always unambiguous what it means. Clearly, this particular double meaning would never work in Chinese, because Chinese has no tense and relies on these very time words to distinguish between past, present and future.

Now, some examples from English. Let’s consider the word “fine”. According to one dictionary, this word has 18 meanings as an adjective, 4 meanings as an adverb, 2 meanings as an intransitive verb, 3 meanings as a transitive verb and 1 meaning as a noun. Similarly, the word ‘bank’ has 18 meanings, the word ‘draft’ has 38 meanings, and so on. Yet, any capable speaker of the English language can determine which meaning applies when with trivial ease, using context. For example, “You are fine” and “You are fined” have two distinct meanings which are incredibly hard to mix up. In fact, my claim is that you can make English completely and deterministically phonetic by having only one standard way of writing every single sound (converting ‘rhyme’ to ‘rime’, etc.) and still not lose an ounce of meaning in an English sentence. What is my claim based on? It’s based on the fact that it is already done every single day when people speak English. There is no distinction between ‘two’, ‘to’ and ‘too’ when I speak, yet my listener knows which one I’m referring to. Similarly, when Chinese people speak Chinese, they do not speak in Chinese characters; the closest written equivalent of what they say is some sort of phonetic system like Pinyin or 心言水木. The fact that dictation (tīngxiě) exists in Chinese should be proof enough that a completely phonetic system works!

Deterministic and Non-deterministic Phoneticism

Chinese and English, in my opinion, both use a “phonetic” system, but the degree of phonetic value differs between the two. If phonetic value is given a number between 0 and 100, then Chinese is somewhere in the 10s or 20s, definitely below the minimum passing grade, and English is somewhere in the high 80s. Basically, given a random Chinese character, you have a 10-20% chance of getting the sound of the character right (my estimates are generous; the actual chances may be even lower), and with English you have an 80%-something chance of getting the sound precisely right. Neither is 100% phonetic, and neither is 100% detached from phonetics. For example, if every single Chinese character refused to give a clue about its pronunciation, like 天 or 木, then Chinese characters would be at the 0% mark. If English were transcribed in such a way that you could always figure out the accurate pronunciation of words you’d never seen before, then English would be at the 100% mark.

With Pinyin and 心言水木, the plan is to take Chinese to that same 100% mark. If you hear a word and you have the Chinese ear, you have a 100% probability of identifying the correct Pinyin for it, and if you know the correct Pinyin for a word, you have a 100% probability of being able to say it aloud accurately. Attaining this 100% probability in both directions is what it means for a script to be deterministic.
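The two-way condition can be sketched as a small Python check: a script counts as deterministic here if every spelling has exactly one sound and every sound has exactly one spelling. The function name is_deterministic, the sample word lists and the rough sound labels (“tu”, “rid”, “yi1” and so on) are my own simplifications for illustration, not a real phonetic transcription.

```python
from collections import defaultdict


def is_deterministic(pairs):
    """Check that (spelling, sound) pairs form a one-to-one mapping.

    Returns True only if each spelling maps to a single sound AND each
    sound maps to a single spelling.
    """
    sounds_of = defaultdict(set)     # spelling -> sounds seen for it
    spellings_of = defaultdict(set)  # sound -> spellings seen for it
    for spelling, sound in pairs:
        sounds_of[spelling].add(sound)
        spellings_of[sound].add(spelling)
    one_sound_each = all(len(s) == 1 for s in sounds_of.values())
    one_spelling_each = all(len(s) == 1 for s in spellings_of.values())
    return one_sound_each and one_spelling_each


# Hypothetical samples with simplified sound labels (assumptions).
ENGLISH_SAMPLE = [
    ("two", "tu"), ("to", "tu"), ("too", "tu"),  # one sound, three spellings
    ("read", "rid"), ("read", "red"),            # one spelling, two sounds
]
PINYIN_SAMPLE = [("yī", "yi1"), ("zhǎo", "zhao3"), ("xún", "xun2")]

if __name__ == "__main__":
    print("English sample deterministic:", is_deterministic(ENGLISH_SAMPLE))
    print("Pinyin sample deterministic:", is_deterministic(PINYIN_SAMPLE))
```

Note that this only tests the sound-to-spelling relationship; it says nothing about meaning, which is exactly the part left to context.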



