Tibetan Input System Principles

Contributor(s): THL Staff, Steven Weinberger, Chris Fynn.


Simply having a Tibetan font does not mean that you can use Tibetan on a computer. You also need a keyboard or input mechanism to type Tibetan efficiently, and that input mechanism must work with the various types of software you typically use. One of the problems concerning the use of Tibetan fonts in digital contexts has been the limited and non-standard nature of Tibetan script input tools and software. Keyboards are also usually platform-specific, so that a given keyboard works only on Windows, Macintosh, or Linux. This page presents the different types of keyboards or input systems in terms of their principles (Wylie, Sambhota, etc.) and says nothing about their specific implementations on particular operating systems. For details on the latter, see Tibetan Input Tools for Linux, Tibetan Input Tools for Macintosh, Tibetan Input Tools for Windows, and Tibetan Input Tools for Browsers or Cross-Platform.

The input method is wholly separate from both the glyphs and the encoding. It is the way a user strikes keys to produce these glyphs on the screen. There is no intrinsic connection between the input method and the glyphs or encoding, and thus a single font can be invoked through multiple types of input methods and keyboards. In addition, having a font does not entail that you have a keyboard or viable input method: if the operating system doesn't provide one, it must be obtained or created separately. Thus support for Tibetan script doesn't necessarily mean support for the keyboard input method one prefers.

Any font could theoretically work with any keyboard. For example, the key "q" on a standard English keyboard could just as easily produce "a", "b", "c" or any other letter if a keyboard defined that way were in use. The standard English keyboard is called a QWERTY keyboard because the keys on the top row of letters read, from left to right, Q, W, E, R, T, and Y. The great advantage of the QWERTY keyboard is that it is widely used, so that people can buy a computer with the expectation that its keyboard will have this layout printed on the keys and that it will be supported by all operating systems and software.
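The separation between input method and font can be illustrated with a minimal Python sketch. It assumes the keyboard outputs Unicode Tibetan characters; both keymaps are hypothetical fragments invented for illustration, not complete descriptions of any real layout. The first assigns keys by sound, the second by alphabetical sequence, yet both produce the same text.

```python
# Hypothetical fragments of two keymaps; neither is a complete real layout.
KA, KHA, GA = "\u0f40", "\u0f41", "\u0f42"  # Unicode Tibetan letters ka, kha, ga

PHONETIC = {"k": KA, "K": KHA, "g": GA}     # keys chosen by sound
SEQUENTIAL = {"q": KA, "w": KHA, "e": GA}   # keys chosen by alphabetical order


def type_keys(keymap, keys):
    """Translate a sequence of key presses into the characters they produce."""
    return "".join(keymap[key] for key in keys)


# Different keystrokes, identical text: the font never sees which keys were hit.
same = type_keys(PHONETIC, "kKg") == type_keys(SEQUENTIAL, "qwe")
print(same)
```

Because the output is the same string of characters in both cases, any font that covers those characters can display it, whichever keyboard was used.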

Unfortunately, Tibetan keyboards, or input systems, have never undergone such a standardization. There are thus many different systems with dedicated communities of users who have learned to touch type rapidly with them, and who are completely at a loss if forced to use a different keyboard.

General Types

There are two basic issues that account for the variability in Tibetan keyboards. The first is the basis on which one correlates Tibetan letters to keys on the keyboard, and the second is how vertical stacks are formed. Since the early 1980s multiple types of Tibetan keyboards have emerged on the basis of these variables, and communities of users have grown up around them who are devoted to these keyboards and have learned to use them for rapid input of Tibetan. The keyboards can be classified into the following four types: phonetic, ka-kha-ga-nga (QWERTY sequence), frequency, and arbitrary.

Of the individual keyboards, the most common at present are Sambhota Keymap 1, Tibetan Computer Company Keyboard #1, and THL Extended Wylie. Other keyboards such as ACIP and Beida have far more specialized user bases that have grown used to them. However, we expect that in the future a new frequency-based keyboard will become the norm among Tibetans in China, while the THL Extended Wylie system is likely to be the norm among international users of Tibetan.

Letter-Keys Correlation Schemes

There are three basic variants: phonetic correlations (Tibetan letters are correlated to roman letters on the basis of phonetic resemblance), alphabetical sequence (Tibetan letters are associated with keys in Tibetan alphabetical order, starting from the roman letter located in the top left-hand corner of the keyboard), and frequency (the Tibetan letters most commonly used are correlated to the keys most easily reached in touch typing). Another issue is whether each Tibetan letter is assigned to a key on a one-to-one basis, or whether, as in some phonetic keyboards, certain Tibetan letters must be input by hitting two keys. Examples of the latter are the Tibetan letter "kha" being input by "kh", or by "shift k".

People who know English, whether Tibetans or non-Tibetans, tend to prefer phonetic keyboards of the Wylie type. This is because the phonetic associations make such keyboards considerably easier to learn and remember for those already accustomed to an English keyboard. However, Tibetans who don't know English tend to find the principle of hitting two keys (k+h for kh) for one Tibetan letter confusing. Without question a well-designed frequency keyboard would be best for Tibetans who do not know English, but establishing one such keyboard across a wide variety of Tibetans is a political task, not a technical one. It remains to be seen what will happen in this regard.
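The difference between two-key and one-key assignments can be sketched as follows. This is a simplified Python illustration that assumes Unicode output and ignores vowels and stacks; the tables are hypothetical fragments, and the longest-match rule is one plausible way a two-key scheme could be resolved, not a description of any particular keyboard's implementation.

```python
KA, KHA, GA = "\u0f40", "\u0f41", "\u0f42"  # Unicode Tibetan letters ka, kha, ga

TWO_KEY = {"k": KA, "kh": KHA, "g": GA}  # "kha" typed as k followed by h
ONE_KEY = {"k": KA, "K": KHA, "g": GA}   # "kha" typed as shift-k


def convert(keys, table):
    """Convert keystrokes to letters, preferring the longest matching entry."""
    longest = max(len(entry) for entry in table)
    out, i = [], 0
    while i < len(keys):
        for size in range(longest, 0, -1):
            chunk = keys[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no mapping for {keys[i]!r}")
    return "".join(out)


print(convert("khg", TWO_KEY) == convert("Kg", ONE_KEY))
```

With the two-key table the converter has to look ahead to decide whether "k" stands alone or begins "kh", whereas with the one-key table every keystroke maps directly to a letter.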

Stacking Method Variants

There are three basic variants: intelligent keyboard methods that require no stacking key of any type, a "stacking key" that signals that a vertical stack of Tibetan letters is to follow, and a "modifier key" that signals that a given letter is to be subjoined in a stack to what preceded it.

The best practice is to use no stacking key at all. Users simply type the Tibetan letters and the keyboard sorts out the stacks. This requires a more complex keyboard from the programming point of view, but it is much more convenient for the user, and for this reason most users, including Tibetans, prefer such a scheme.

The use of a special stacking key involves hitting a key once, such as "f", "h", or "+", to signify that all the following letters are part of a vertical stack. When a stacking key is used, the programming of the keyboard is simpler, but the user must type an additional key, which is less efficient. In addition, the keyboard knows when to stop stacking essentially because it runs out of precomposed glyphs of stacks. Such methods can thus be problematic for the more complex and unusual stacks used to represent Sanskrit and mantric syllables.

The use of a special modifier key involves typing the first letter of a vertical stack and then using a modifier key, such as "shift", for each subsequent letter in the stack to indicate that it should be subjoined. Unlike the stacking key, the modifier key is pressed simultaneously with the letter in question (shift k, etc.), and it is pressed for each item in the stack after the first letter rather than only once at the beginning of the stack.

The most typical keys used for stacking are "+" and "shift": you type the uppermost letter of the stack, and each subsequent letter in the stack is preceded by the stacking key to indicate that it is part of the stack. Thus the "rgya" stack might be input as "r+g+y" or "r <shift>g <shift>y".
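The "r+g+y" convention can be sketched in Python as follows. This is a minimal illustration that assumes Unicode output, where the subjoined form of each consonant used here sits 0x50 code points above its base form; older fonts with precomposed stack glyphs work differently. The key table is a hypothetical fragment.

```python
# Hypothetical fragment of a phonetic key table (Unicode base consonants).
BASE = {"k": "\u0f40", "g": "\u0f42", "y": "\u0f61", "r": "\u0f62"}

# In the Unicode Tibetan block, subjoined consonants U+0F90..U+0FBC
# mirror the base consonants U+0F40..U+0F6C at a fixed offset.
SUBJOIN_OFFSET = 0x50


def stack(keys):
    """Convert keystrokes, treating "+" as 'subjoin the next letter'."""
    out = []
    subjoin_next = False
    for key in keys:
        if key == "+":
            subjoin_next = True
            continue
        ch = BASE[key]
        if subjoin_next:
            ch = chr(ord(ch) + SUBJOIN_OFFSET)
            subjoin_next = False
        out.append(ch)
    return "".join(out)


rgya = stack("r+g+y")
print([f"U+{ord(ch):04X}" for ch in rgya])
```

A modifier-key scheme could reuse the same logic, with shifted keys standing in for the "+" prefix.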

Inputting Unusual Stacks

Unusual stacks of letters and vowels can be input with the Microsoft Himalaya font. Enter the consonant(s) first, then any subjoined letters and the first vowel that appears under the consonants. For each additional vowel under the consonant, use the stacking keystroke for your particular input system, often a plus sign (+), followed by the vowel. After entering all the vowels below the consonant, enter the vowels above the consonant in the same way, using a plus sign and the vowel. Begin with the vowel immediately above the consonant and continue up the stack. If you are working in a word processing file, you will need to greatly increase the line height to display all the vowels. For example, ཀྱོོོོིིེེུུུ is entered using a Wylie keyboard with the following keystrokes: kyu+u+u+o+o+o+o+i+i+e+e
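The keystroke sequence in this example can be traced with a short Python sketch. It is a simplified model, not the behavior of any actual keyboard: it assumes Unicode output, handles only the handful of keys used in the example, and treats "+" purely as a signal that another vowel follows on the same stack. The tables are hypothetical fragments.

```python
import unicodedata

# Hypothetical fragments covering only the keys used in the example.
CONSONANT = {"k": "\u0f40"}   # ka
SUBJOINED = {"y": "\u0fb1"}   # subjoined ya
VOWEL = {"i": "\u0f72", "u": "\u0f74", "e": "\u0f7a", "o": "\u0f7c"}


def convert(keys):
    """Turn keystrokes into the sequence of characters as typed."""
    out = []
    for key in keys:
        if key == "+":
            continue  # "+" only announces another vowel on the same stack
        if key in VOWEL:
            out.append(VOWEL[key])
        elif out and key in SUBJOINED:
            out.append(SUBJOINED[key])
        else:
            out.append(CONSONANT[key])
    return "".join(out)


text = convert("kyu+u+u+o+o+o+o+i+i+e+e")
for ch in text:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The result is one base consonant followed by twelve combining characters, which is why the line height has to be increased to display the full stack.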

1. Phonetic Keyboards

Phonetic keyboards are based upon phonetic connections between the roman script letters found on keyboards (a, b, c, etc.) and the Tibetan alphabet. Thus the Tibetan letter "ka" is invoked with the key "k", "ta" with "t" and so forth. There are four major forms of these phonetic keyboards.

1. Wylie keyboards: these rely upon the international standard Wylie transliteration scheme as a base. Such a scheme, however, is inadequate for complex uses, since the original Wylie scheme covered only core characters and could not represent, for example, the complex stacks used for Sanskrit conjuncts in Tibetan. These keyboards are thus outdated and of limited utility at present.

2. THL Extended Wylie keyboards: these rely upon an extension of the Wylie scheme governed by THL as a base. It has comprehensive coverage of all possible Tibetan conjuncts. English speakers and international users tend to prefer it (or some other variety of extended Wylie).

3. ACIP: this relies upon a transliteration scheme developed by the Asian Classics Input Project (ACIP), and diverges significantly from Wylie. However, it does offer comprehensive coverage of all possible Tibetan conjuncts. It is used primarily by India-based monastic input groups belonging to ACIP, and is unlikely ever to achieve a user base outside of that.

4. Sambhota Keymap One: this is a partial Wylie keyboard that, however, uses a + key to create stacks, and represents the aspirated letters of the second column with the capital of the corresponding first-column letter rather than by adding an "h" (thus "K" instead of "kh", "T" instead of "th" and so forth). Tibetans who don't know English could find this easier since each Tibetan letter has a unique corresponding key. It is widely used by Tibetans in China, India and Nepal at present. However, it may ultimately be displaced by new frequency keyboards.

2. QWERTY Keyboards

QWERTY keyboards are based simply upon the layout of the standard international keyboard, so named because the letters Q, W, E, R, T and Y appear on the top row of keys from left to right. The most common form simply lists out the Tibetan alphabet starting with "ka" on Q, and proceeds from the top left to the right, and then down to the next row. Thus in a QWERTY keyboard, "Q" generates "ka", "W" generates "kha", "E" generates "ga", "R" generates "nga" and so forth. In an AZERTY (French) keyboard, "A" generates "ka" and so forth. When the sequence reaches the end of that row, it continues with the leftmost alphabetical key on the next row down, and so on through the third and lowest row of alphabetical keys. The four vowels are usually located in the middle of the second row. Such keyboards are thus commonly referred to as "ka-kha-ga-nga" keyboards.

Tibetan Computer Company Keyboard #1: this is one of the most popular instances of this type, and has been widely used by Tibetans in Nepal and India.

Bhutanese Layout Unicode Keyboard: this relatively recent keyboard places the vowels in the middle of the top row, which was considered a better place than the middle of the center row since it makes some of the more frequent consonants easier to reach. It uses a "modifier key" for stacks, namely the "shift" key. It is increasingly becoming a standard among Bhutanese since it is now the official layout for all Bhutanese Government organizations.
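The sequential assignment that defines this family of keyboards can be sketched in Python. The sketch assumes Unicode output and covers only the top row of letter keys; it ignores where the vowels are placed and is not a reproduction of either keyboard named above. The function name is hypothetical.

```python
# First ten letters of the Tibetan alphabet as Unicode characters:
# ka, kha, ga, nga, ca, cha, ja, nya, ta, tha.
ALPHABET_START = "\u0f40\u0f41\u0f42\u0f44\u0f45\u0f46\u0f47\u0f49\u0f4f\u0f50"


def sequential_layout(top_row):
    """Assign letters in alphabetical order to the keys of one row."""
    return dict(zip(top_row, ALPHABET_START))


qwerty = sequential_layout("qwertyuiop")
azerty = sequential_layout("azertyuiop")

# The same letter lands on whichever key is physically first in the row.
print(qwerty["q"] == azerty["a"])
```

The mapping depends only on the physical order of the keys, which is why the same scheme yields "Q" for "ka" on a QWERTY keyboard and "A" for "ka" on an AZERTY one.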

3. Frequency Keyboards

Frequency keyboards correlate the most frequently used Tibetan letters with the keys that are easiest for a typist to reach, while the least frequently used letters are assigned to the keys that are hardest to reach. Theoretically such a keyboard is ideal, since a typist trained to use it should find it the most convenient. The history of the English keyboard, however, shows how hard it is to displace an established layout. The standard international QWERTY keyboard is often said to be one of the worst possible keyboards for English, and to have been set up originally to discourage typists from typing fast, since typewriters at the time would otherwise jam. Once it became established, it proved impossible to overcome the existing user and manufacturing base in order to implement a more optimal keyboard (e.g. the Dvorak keyboard). All current Tibetan keyboards have fairly modest user bases, and there is no significant manufacturing issue since standard international keyboards are used. Even so, in the absence of an organized national push in China, it seems unlikely that any frequency-based keyboard will succeed in displacing the several existing Tibetan keyboards. However, Tibetans in China are currently trying to design new frequency keyboards, and these may yet win the day for the majority of Tibetans.

Tibetan Computer Company Keyboard #2 is an instance of a keyboard designed for optimal use on a frequency basis, but to date it has not developed any significant user base.

The Beidafazheng Keyboard exists in two versions, the more recent of which was designed on the basis of extensive research into frequency issues. Both versions have a highly skilled but very limited user base within the Chinese publishing industry. Historically they have been tied to the non-Unicode Beidafazheng and Huaguang fonts used in Chinese publishing, but they may be migrated to Unicode in the coming years.
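The principle behind a frequency keyboard can be sketched in Python: count how often each letter occurs in a body of text, then hand out keys in order of how easy they are to reach. Everything specific here is an assumption made for illustration, in particular the ease ranking of the keys and the tiny sample text; a real design would rest on a large corpus and on ergonomic research, and this is not the method used by any keyboard named above.

```python
from collections import Counter

# Hypothetical ranking of keys from easiest to hardest to reach.
KEYS_BY_EASE = "fjdkslaghrueiwoqptyvmcnbxz"


def build_layout(corpus):
    """Map the easiest keys to the most frequent Tibetan consonants."""
    counts = Counter(ch for ch in corpus if "\u0f40" <= ch <= "\u0f6c")
    ranked = [ch for ch, _ in counts.most_common()]
    return dict(zip(KEYS_BY_EASE, ranked))


# Toy sample: ga three times, sa twice, ka once.
sample = "\u0f42\u0f42\u0f42\u0f66\u0f66\u0f40"
layout = build_layout(sample)
print(layout)
```

In this toy sample the most frequent letter is assigned to the first key in the ease ranking, the next most frequent to the second, and so on.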

4. Arbitrary Keyboards

Arbitrary keyboards are keyboards whose key associations appear to have been made without any rational basis, whether phonetic, frequency-based, or derived from the international keyboard layout.