“Phasa Sipsongbanna” ภาษาธรรม สิบสองปันนา 傣文字母歌. Sung by / ขับร้องโดย: Ee Orn Noi อีอ่อนน้อย (of Tambon Sang Yong, Muang Laa, Xishuangbanna แห่ง ต.ส่างยอง อ.เมืองล้า แคว้นสิบสองปันนา) This is an educational version of the song with images and subtitling (in Tai Tham script) by Mr. “ThaiEsan” of Thailand. URL: https://www.youtube.com/watch?v=1tdp_OTZdS4 .

Designing a Tai Tham Unicode Font. By Ed Trager for the Libre Graphics Meeting 2014, Leipzig, Germany, April 02-05, 2014.

What is Tai Tham?

Tai Tham is an Indic-derived script of South East Asia that is closely related to modern Burmese script and the old Mon script. The script derives from the Pallava script which is itself descended from the ancient Brāhmī script of the Indian subcontinent.

For centuries throughout the northern regions of Southeast Asia in the Xishuangbanna (西双版纳 สิบสองปันนา) region of Yunnan in China, in the Lanna (ล้านนา) region of Northern Thailand, in the Shan state (รัฐฉาน) of Myanmar around the city of Kengtung (เชียงตุง) and in Laos, the Tai Tham script (อักษรธรรมล้านนา หรือ ตั๋วเมือง) has been used extensively to preserve important religious and cultural texts.

In Buddhist monasteries throughout the region, texts in Sanskrit, Pali, and local Tai (傣 ไท) languages were traditionally written on palm leaf manuscripts (贝叶经 คัมภีร์ใบลาน).

While scripts such as Central Thai and Laotian have replaced Tai Tham in the regions of Thailand and Laos where Tai Tham was traditionally used, Tai Tham remains the only script of the Tai Kuen (ไทเขิน) people in the Shan state of Myanmar, and it also continues to be taught in monasteries by the Tai Lue (傣仂 ไทลื้อ) people in the Xishuangbanna region in China.

In the current Digital Age, there is renewed interest in using the Tai Tham script to preserve the cultural heritage of the past and to educate the youth of a new generation. Historically, only monks and a few other educated people had access to Tai Tham texts. Now digital technology, the internet, and the continuing development of Unicode as the worldwide standard for the exchange of textual information are beginning to make it possible for many more people to access documents written in Tai Tham.

I want to mention a few examples to give you a sense of what is happening.

L'École française d'Extrême-Orient (EFEO, http://www.efeo.fr) in Paris, in cooperation with the Princess Maha Chakri Sirindhorn Anthropology Centre in Bangkok (ศูนย์มานุษยวิทยาสิรินธร, http://www.sac.or.th ), has made available online a collection of Lanna palm leaf manuscript chronicles, traditional stories and Buddhist narratives called tamnan (ตำนาน, http://www.efeo.fr/lanna_manuscripts/).

Another notable example is the Digital Library of Lao Manuscripts (ຫໍສະໝຸດດິຈິ​ຕອລໜັງສື​​ໃບລານລາວ, http://laomanuscripts.net), an online collaboration of the National Library of Laos (ຫໍສະໝຸດ​ແຫ່ງ​ຊາດ​ລາວ), the University of Passau and the Staatsbibliothek zu Berlin Preußischer Kulturbesitz in Germany.

But it is not just scholarly collaborations with European institutions that are occurring. There is a lot of local activity in Southeast Asia as well.

The Center for the Study of Palm Leaf Manuscripts at the Lanna Institute at Chiangmai Rajabhat University (ศูนย์ใบลานศึกษา สถาบันล้านนา มหาวิทยาลัยราชภัฏเชียงใหม่) maintains a Facebook page which highlights many interesting events and activities related to Tai Tham culture.

But what is missing, not only from these online resources, but in general? High-quality Unicode fonts, keyboard and input methods, and software capable of handling Tai Tham are required.

Much work remains before people can conveniently use Tai Tham on computers and on the internet. For example, currently neither the EFEO project nor the Lao Manuscript project allows the user to search directly using Tai Tham.

And the problem is not just an “online” problem. Books are being published in Thailand using digital fonts that are sub-optimal. The example word circled in red on the left shows a common problem of “glyph crowding”. On the right, the word circled in red is reproduced using the Hariphunchai font in which I have made an effort to address exactly these kinds of problems.

I have even seen this glyph crowding problem in the two-volume Mae Fah Luang (แม่ฟ้าหลวง) Lanna-Thai Dictionary published in 1991. Produced by the Princess Mother Foundation in cooperation with Siam Commercial Bank to commemorate the 90th royal birthday anniversary of Her Royal Highness the Princess Mother Sangwan Mahidol, the Mae Fah Luang dictionary is considered an authoritative dictionary of the Northern Thai language.

Because of the inadequacy of fonts, a number of books, especially various Tai Tham primers such as the one shown here, have been reproduced from hand-written manuscripts.

So with all of this going on, I decided to try my hand at creating a Tai Tham font ...

The goal of the Hariphunchai Tai Tham Font Project is to create a freely available, professional-quality Unicode Tai Tham font licensed under the terms of the Open Font License (OFL, http://scripts.sil.org/OFL).

Creating such a font is easier said than done. There is a rich manuscript tradition for Tai Tham, but not a rich typographic tradition. Therefore, it was necessary to take inspiration from this rich manuscript tradition ...

Although we rarely think about it, the reality is that the tools and the materials used for writing influence the forms of letters that are produced. For example, we can trace the development of serifs in Latin to the technology of incising the letters on stone.

Likewise, we can trace the very rounded forms of Tai Tham letters to the technology of inscribing letters on palm leaves.

Since the technology behind traditional Tai Tham manuscripts is quite different from what we are familiar with in the West, let’s briefly look at how traditional Tai Tham palm leaf manuscripts are created.

First leaves are cut from a “laan” fan palm tree.

Two species of fan palm trees in the genus Corypha are used for manuscripts.

The leaves are trimmed, boiled in a herbal mixture to prevent insect damage, and then dried.

Here you can see the leaves being laid out to dry.

Here you can see the dried trimmed palm leaves ready to be inscribed using the styli with sharp needle points called “lek jaan” (เหล็กจาร). Here also you can see a cloth dauber and bottles containing ink that is applied afterwards.

Taut string or cord soaked in a mixture of resin and carbon soot is used to snap ruled lines onto the prepared palm leaves. After that, the lek jaan เหล็กจาร stylus is used to inscribe text, usually on both sides of the prepared palm leaves.

Here’s a closer look.

As you can see here, the letters are literally scratched on the surface ...

To improve clarity after inscription, the surface is wiped down with a mixture of resin and carbon soot ink in order to darken the inscribed letters.

A finished folio inscribed by a young monk.

Finished folios ...

After the inscription process is complete, holes are punched through and the leaves are strung together with cord or string. This process is called “saay sanong” (สายสนอง). Twenty-four leaves are strung together into a bundle called a “phuk” (ผูก). A complete manuscript may consist of one or more “phuk”.

Drawing inspiration from hand written Tai Tham manuscripts may be a good idea, but it is a difficult task. Of course there is the problem of local variation because no two hands are the same. But, on top of that, there is also the problem of regional variation. So, for a project such as Hariphunchai, we need to try to find the commonalities that are shared by all scribes. We want to capture those common features of shape, styling, structure, and spacing into a modern font that all readers of Tai Tham will recognize as being clean, legible, and pleasing to the eye.

In the slide here we see regional variation in the shape of subjoined na in manuscripts from Xishuangbanna (Yunnan Province, China), Keng Tung (Shan State, Myanmar), and Chiangmai (Thailand).

Now that we have seen how the script has been written traditionally, let’s examine some of the characteristics of the Tai Tham writing system.

Tai Tham is an abugida. Consonants have an inherent vowel sound “a”. The inherent vowel sound can be modified by adding different signs.

Tai Tham is written so that it is hanging from a baseline as shown here.

And just like in Thai, Lao, and Burmese, there are no spaces between words. Spaces are used to mark natural pauses between phrases and sentences.

Structurally, Tai Tham consists of base consonants that are surrounded by vowel signs, tone marks, or even subjoined consonants.

In general, consonants are written from left to right. However, MEDIAL RA is an exception and is written in front of a base consonant.

While MEDIAL RA is the only consonant that can occur before a base consonant, many other consonants can be written BELOW a base consonant in a SUBJOINED form, as shown here. As illustrated here, the shape of the subjoined consonant NG has not changed. It is just a smaller version of the original.

But in other cases, the shape of the consonant changes when it is written in a subjoined position.

Finally, a consonant may be modified by one or more vowel signs which may precede, follow after, sit on top, or hang below a base consonant.

There are also tone marks which sit above the base consonant.

Putting this all together, here we have the word for “Buddha” with all the customary honorifics present. Notice that we have a prefixed MEDIAL RA, a subjoined vowel u, a subjoined consonant th, a tone mark, and even a two-part vowel that surrounds the final base consonant. Yes, it is a bit complicated!

Here’s a larger view of the preceding word for Buddha.

I started almost all of the glyph design work in Inkscape. All tools have their advantages and disadvantages. Here I will discuss some of the reasons why I like working in Inkscape. First, the width of strokes in the Hariphunchai font remains constant over most of a glyph, except for the initial loop and tail endings. In Inkscape, I found it easiest to start a glyph design by working with a simple Bézier line segment or stroke (black). I could then add the initial loop (red). For the tail, I could also start with a single Bézier curve and, once I got close to the desired curve, convert that line stroke to an outlined path (purple) and then narrow the end of the tail until it appeared correct.

Another reason I liked working in Inkscape was that I could take existing glyphs, make them semi-transparent or change colors, and then use them as templates for other characters. This again helped make the design of certain glyphs very efficient and ensured that my glyph designs were consistent.

Another good thing about using Inkscape for glyph design is that it is very easy to use as a kind of design sandbox where you can quickly manipulate different possible glyph designs (black) and look at them side-by-side in the context of other finished glyph designs (purple).

I am not saying that you can’t do this in FontForge. Of course FontForge also provides tools and views that allow you to see how a sequence of glyphs looks when placed on a line next to one another. However, as an experienced user of Inkscape, I found that Inkscape met most of my needs in ways that were very efficient, intuitive, and fast.

Of course Inkscape has some drawbacks as well. One problem I encountered frequently when I joined different pieces of a glyph together is that the resulting union set would contain some points that were too close together.

I often did not notice these problems until after I had imported the glyph outlines into FontForge. FontForge will issue warnings when points are too close together (among other things). This problem actually frustrated me quite a lot, because then I would have to remove the offending point (using either Inkscape or FontForge) and then try to get the curves back to where I wanted them again. It was actually quite annoying!

Well, the reality is that we do not live in a perfect world. Overall, using Inkscape and FontForge together was a reasonable decision.

Using Inkscape of course meant that I ended up creating hundreds of individual SVG files where many of the SVG files contained the outline for a single glyph. Single-glyph SVG files were named according to the Unicode code point, e.g., 1A20.svg, 1A21.svg, etc.

To facilitate project management, I created a nested file tree of all of the SVG glyph files with consonants filed under a “consonants” folder, vowel signs in a “vowels” folder, subjoined consonants in a “subjoined” folder, and so on.
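The naming and filing convention described above lends itself to simple scripting. The following is a minimal sketch, not the project's actual tooling: the function names and the 4-to-6 hex digit pattern are assumptions made for illustration. It walks a glyph tree and maps each code point to the folder and file that hold its outline.

```python
import re
from pathlib import Path

# Matches names such as "1A20.svg": four to six hex digits, then ".svg".
# (The exact digit count is an assumption made for this sketch.)
CODEPOINT_SVG = re.compile(r"^([0-9A-Fa-f]{4,6})\.svg$")


def codepoint_from_filename(name):
    """Return the code point encoded in a file name, or None if it does not match."""
    match = CODEPOINT_SVG.match(name)
    return int(match.group(1), 16) if match else None


def index_glyph_tree(root):
    """Map code point -> (category folder name, path) for each matching SVG under root."""
    index = {}
    for path in sorted(Path(root).rglob("*.svg")):
        codepoint = codepoint_from_filename(path.name)
        if codepoint is not None:
            # The immediate parent folder ("consonants", "vowels", ...) is the category.
            index[codepoint] = (path.parent.name, path)
    return index
```

An index like this could then drive a batch import, since each entry already says which Unicode slot an outline belongs to. Files whose names do not encode a code point are simply skipped.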

I then created a project on Sourceforge. You can find the online code repository at http://sourceforge.net/projects/hariphunchai/.

I then imported the SVG outlines into FontForge. Dave Crossland describes how this is done at http://understandinglimited.com/2007/10/29/inscape2fontforge/.

Just like Inkscape, FontForge has advantages and disadvantages. Let’s talk first about what I believe are some of the advantages of FontForge.

For the Hariphunchai font, as you might expect I needed to create a number of anchor attachment points for the various vowel signs, tone markers, and subjoined consonants that may surround any given base consonant. Here you can see the anchors in blue. Initially I created a minimalistic set of anchor classes. But later I realized that an expanded set of anchor classes would allow me to position glyphs with greater flexibility and precision.

However a greater number of anchor classes also presented a problem if I had to manually add and then position all those anchor points for each and every single base consonant …

Fortunately however, FontForge’s Spline Font Database (SFD) format is just a text file!

As a result, after manually adding and setting the positions of all the anchor points for one base consonant, such as the consonant U+1A23 shown in the previous slide, I could then go into “command line mode” with vi and just copy the anchor point list created for U+1A23 out to all of the similarly-shaped “cousin” consonants.

Of course not every “cousin” consonant was exactly like U+1A23, and so I still needed to carefully review and in many cases perform manual adjustments back inside FontForge’s graphical interface. Nevertheless, using this method over a small set of base consonants that were carefully chosen because they represented the basic set of “shape variants” for the Tai Tham script most definitely increased my productivity and made the process of reviewing and tweaking the positions for individual base consonants much more enjoyable.
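I did this copying by hand in vi, but the same idea can be sketched as a small script. The sketch below is an illustration only: it assumes that each glyph record in the SFD file runs from a "StartChar:" line to an "EndChar" line and stores its anchors on lines beginning with "AnchorPoint:", and the glyph names used (such as uni1A23) are hypothetical. Placing the copied lines just before "LayerCount:" or "Fore" is likewise an assumption, so the result should always be reviewed in FontForge.

```python
def copy_anchor_points(sfd_text, source, targets):
    """Copy the AnchorPoint lines of glyph `source` into every glyph named in `targets`.

    Existing AnchorPoint lines in the target glyphs are replaced.
    """
    lines = sfd_text.splitlines()

    # Pass 1: collect the anchor lines of the source glyph.
    anchors, current = [], None
    for line in lines:
        if line.startswith("StartChar: "):
            current = line.split(": ", 1)[1].strip()
        elif line == "EndChar":
            current = None
        elif current == source and line.startswith("AnchorPoint: "):
            anchors.append(line)

    # Pass 2: rewrite the target glyphs.
    out, current, pending = [], None, False
    for line in lines:
        if line.startswith("StartChar: "):
            current = line.split(": ", 1)[1].strip()
            pending = current in targets
        elif current in targets and line.startswith("AnchorPoint: "):
            continue  # drop the target's old anchors
        elif pending and line.startswith(("LayerCount:", "Fore", "EndChar")):
            out.extend(anchors)  # insert copied anchors ahead of the outline data
            pending = False
        if line == "EndChar":
            current = None
        out.append(line)
    return "\n".join(out) + "\n"
```

Because the copied coordinates are only a starting point, each “cousin” consonant still needs the manual review described above.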

I won’t lie to you: It took me a while to figure out how to use OpenType GSUB substitution and GPOS positioning. But eventually I figured out how to get the stacked vowels and tone marks and the subjoined consonants to appear correctly. I had available to me two test “platforms”: (1) HarfBuzz’s “hb-view” tool, and (2) Firefox or, to be more precise, Aurora, which is the pre-release nightly build of Firefox.

But then, after all of my struggles and just when I thought that things were starting to go swimmingly well, one day in late December of 2013, I woke up and found this: the dreaded dotted circle was now appearing in Aurora where it had not before!

I was at a loss for an explanation. Had I somehow done something wrong? Or had something changed in HarfBuzz? I wrote to the HarfBuzz mailing list to find out.

My plea for help was answered by Martin Hosken who wrote “The answer is simple but insidious …” Uh oh, this did not sound good!

“The normalization for Tai Tham … is broken”

The Unicode stability policy is very frustrating: mistakes can get encoded into the standard and may never be fixed!

But fortunately, within a few days Jonathan Kew had proposed a patch to Behdad Esfahbod …

The patch simply reverses the mistake of the Unicode Technical Committee by making U+1A60 TAI THAM SAKOT have lower priority so that it comes after any tone marks in a word.
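The effect can be reproduced with Python's standard unicodedata module. In the Unicode data, U+1A60 TAI THAM SIGN SAKOT has canonical combining class 9, while the tone marks have class 230, so canonical reordering moves SAKOT in front of a preceding tone mark and separates it from the consonant it is supposed to subjoin. The sketch below is an illustration of the idea only, not HarfBuzz's code: the helper names are hypothetical, and the override value 254 is an assumption (any class above 230 has the same effect here).

```python
import unicodedata

SAKOT = "\u1A60"   # TAI THAM SIGN SAKOT, canonical combining class 9
TONE_1 = "\u1A75"  # TAI THAM SIGN TONE-1, canonical combining class 230


def modified_ccc(ch):
    """Combining class with an assumed override that sorts SAKOT after tone marks."""
    if ch == SAKOT:
        return 254  # assumed value for this sketch
    return unicodedata.combining(ch)


def reorder_marks(text, ccc=unicodedata.combining):
    """Stable-sort each run of combining marks (class > 0) by combining class."""
    out, run = [], []
    for ch in text:
        if ccc(ch) > 0:
            run.append(ch)
        else:
            out.extend(sorted(run, key=ccc))  # sorted() is stable
            run = []
            out.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)


# Base consonant, tone mark, then SAKOT + consonant to be subjoined.
typed = "\u1A20" + TONE_1 + SAKOT + "\u1A20"
normalized = unicodedata.normalize("NFD", typed)      # SAKOT is moved before the tone mark
restored = reorder_marks(normalized, modified_ccc)    # SAKOT follows the tone mark again
```

With the override in place, SAKOT once again sits directly before the consonant it subjoins, which is the order a font's shaping rules expect.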

This example illustrates one of the benefits of collaboration within the larger Open Source development community. Because Tai Tham is a “new” script in Unicode, no type layout engine had ever been thoroughly tested to see if it supported the complex layout that Tai Tham requires. While Behdad Esfahbod and the other developers of HarfBuzz had added code in HarfBuzz to support Indic and Indic-derived scripts such as Tai Tham, until a Unicode Tai Tham font became available, there was really no way to know whether HarfBuzz truly supported Tai Tham or not.

On the other side of things, for me as a developer of a Unicode Tai Tham font, I didn’t really have any OpenType-based platform that would fully support my font. So it has been a bit of a “chicken and egg” kind of problem. Fortunately, the HarfBuzz team has been very responsive and thus the problem described here was quickly resolved.

By early February 2014, the patched version of HarfBuzz finally made its way into Aurora, the nightly development version of the Firefox browser. This meant that I finally had an easy-to-use “platform” that could render my OpenType-based Tai Tham font.

Having resolved the “show stopping” Tai Tham layout issues, I was then free to pursue some of the artistic and aesthetic goals. One of the things I did was to add a few additional ligatures that, based on my examination of manuscripts (especially some of the manuscripts in the EFEO online collection), really stood out in my mind.

Once I got to the stage where my font basically worked, then I really needed a platform where I could type longer passages of text in Tai Tham to see how things looked. Now of course input methods for Tai Tham are not widely available. So I had to write my own. I started by writing a very crude input method to use in my web-based Key Curry application (http://unifont.org/keycurry/) which you see here.

Later I learned from Theppitak Karoonboonyanan that, regardless of the fact that the Unicode Consortium chose to encode Tai Tham in logical order in the backing store, people in Thailand would certainly expect to be able to type Tai Tham using a visual input method similar to that which is used for the Central Thai and Laotian scripts. I therefore sat down and wrote a “proof of concept” input method for Tai Tham that is based on the Thai Ketmanee keyboard layout and allows the user to type prefixed vowels and the MEDIAL RA in visual order. The input method engine in Key Curry automatically converts prefixed vowels and MEDIAL RA from visual to logical order as required by Unicode.
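The core of such a conversion can be sketched in a few lines. This is a simplified illustration, not the Key Curry engine: the function names are hypothetical, the consonant range U+1A20 to U+1A4C and the logical order "consonant, MEDIAL RA, prefixed vowel" are assumptions of the sketch, and it deliberately ignores subjoined consonants, which a real input method must also handle.

```python
MEDIAL_RA = "\u1A55"  # TAI THAM CONSONANT SIGN MEDIAL RA
# Vowel signs written to the left of the base consonant: E, AE, OO, AI, THAM AI.
PREFIXED_VOWELS = {"\u1A6E", "\u1A6F", "\u1A70", "\u1A71", "\u1A72"}


def is_base_consonant(ch):
    """True for Tai Tham consonant letters (range assumed for this sketch)."""
    return "\u1A20" <= ch <= "\u1A4C"


def visual_to_logical(typed):
    """Move prefixed vowels and MEDIAL RA typed before a consonant to after it."""
    out, held = [], []
    for ch in typed:
        if ch == MEDIAL_RA or ch in PREFIXED_VOWELS:
            held.append(ch)  # wait for the base consonant that follows
        elif is_base_consonant(ch) and held:
            out.append(ch)
            # Logical order assumed here: consonant, MEDIAL RA, then the vowel.
            out.extend(c for c in held if c == MEDIAL_RA)
            out.extend(c for c in held if c != MEDIAL_RA)
            held = []
        else:
            out.extend(held)  # nothing to attach to; emit as typed
            held = []
            out.append(ch)
    out.extend(held)
    return "".join(out)
```

For example, typing VOWEL SIGN E, then MEDIAL RA, then a consonant yields the consonant first, followed by MEDIAL RA and the vowel sign, while signs that are already typed after their consonant pass through unchanged.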
