Zawgyi and Unicode in Myanmar
by Aung Kham (www.myanmar-entrepreneur.com)
August 8, 2014
What is Language?
Language (according to Wikipedia) is the human capacity for acquiring and using complex systems of communication, and a language is any specific example of such a system. The scientific study of language is called linguistics.
Machine language is a set of instructions executed directly by a computer's central processing unit (CPU).
What is Myanmar Language?
Myanmar is the official name of the language that people in Myanmar use to communicate with each other, both in speech and in digital form. There are many languages in Myanmar (owing to the diversity of its ethnic minorities), such as Burmese, Shan, Karen and Mon. All Myanmar languages that use the same script must share a single, unified encoding (a "uni-script").
- Zawgyi
Because Zawgyi was the first system through which people encountered Myanmar fonts and keyboards, nearly 100 percent of Myanmar internet users use Zawgyi for communication.[a][b][c]
- English
50 percent of Myanmar people use both Zawgyi and English (US) for communication.
- Myanglish (Myanmar words spelled out in English letters so that they sound like Myanmar when read)
20 percent of Myanmar internet users type on an English keyboard, spelling out words that sound like Myanmar when read aloud.
- Unicode
Unicode is used by only a few people; the percentage is very small. However, government websites in Myanmar use Myanmar Unicode when publishing information.
What is Zawgyi?
Zawgyi is a font and keyboard package, with its own character encoding, that was created for Burmese-language communication. It was produced by the Alpha Mandalay company in Myanmar around 2004. It became the first system widely used for communicating via digital devices such as computers and mobile phones. Visually, it shows the correct characters to readers. Technically, however, it does not follow the rules laid down by the Unicode standard (http://www.unicode.org). (Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems.)
Visually, people can read it, but computers cannot process it reliably. This is due to errors in the encoding scheme. For example, Zawgyi has no text support for Shan, Karen and other minority languages of Myanmar.
When two or more people communicate by email or other digital means, they all have to install the same fonts and keyboard (those produced by Alpha Zawgyi). People communicate through Internet browsers, and when they want to find something they use search engines. If the browser could process the language natively, no one would need to install any fonts or keyboards. The problem is that no browser or search engine supports Zawgyi, because Zawgyi is technically incorrect: it renders correctly for readers but cannot be processed correctly by machines.
So what are the disadvantages of using Zawgyi and other non-Unicode encodings?
The main problem in Zawgyi is about processing. No computer can process Zawgyi in the ideal way.
The following facts are from http://my.wikipedia.org/wiki/Wikipedia:Font
Example -
The result of finding လ+ ူ+ မ+ ွ+ု ေရး
The result of finding လ+ ူ+မ+ ု +ွ ေရး
Zawgyi renders both sequences identically, so users see the same text. A device, however, treats them as two different strings, because the characters were typed in a different order. In the example above, Google returns different results for the two searches. Unicode prescribes an order in which characters must be written, whereas Zawgyi imposes no such rule.
Machines have to be programmed and cannot intelligently process Zawgyi and other non-Unicode encodings. This creates future problems and blocks further development such as:
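The effect of typing order on searching can be illustrated with a minimal Python sketch. The two strings below are hypothetical examples built from code points in the Unicode Myanmar block (they are not the exact Zawgyi values from the Wikipedia example), and `find` is an illustrative helper, not a real search engine API. Both strings contain the same characters, only in a different order, which is enough for a program to treat them as unrelated.

```python
def codepoints(text):
    """List the code points of a string, e.g. ['U+1019', 'U+103D', 'U+102F']."""
    return [f"U+{ord(ch):04X}" for ch in text]


def find(query, documents):
    """Naive exact-substring search, standing in for a search engine."""
    return [doc for doc in documents if query in doc]


# Hypothetical example: the same two marks typed after the letter MA (U+1019)
# in two different orders.
first = "\u1019\u103D\u102F"   # MA + MEDIAL WA + VOWEL SIGN U
second = "\u1019\u102F\u103D"  # MA + VOWEL SIGN U + MEDIAL WA

documents = [first]

print(codepoints(first))
print(codepoints(second))
print("same characters:", sorted(first) == sorted(second))
print("equal strings:  ", first == second)
print("search for first: ", len(find(first, documents)), "hit(s)")
print("search for second:", len(find(second, documents)), "hit(s)")
```

A document stored with one ordering is simply not found by a query typed with the other ordering, even though a reader may see the same word on screen.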
It also affects people from other countries, who may not know which encoding we are using. All they see are square boxes. They may never get the chance to learn about the Myanmar language, and even if they install Zawgyi, they may have trouble because their computers and other users rely on Unicode. They will want to use Google Translate and other software to translate and read about Myanmar, but this is impossible because translation apps cannot handle text written with Zawgyi fonts and keyboards.
What is Myanmar unicode?
Myanmar Unicode is an encoding for the Myanmar writing system produced according to the rules laid down by the Unicode Consortium, the global standards body for text encoding (www.unicode.org).
There are more than 14 Myanmar Unicode fonts, all of which use the same encoding, standardized by Unicode.org. The most widely used is 'Myanmar 3'. To learn more about Unicode, see http://www.myanmarlanguage.org; the builder of that website is a Myanmar Unicode supporter. Many fonts and keyboards claim to be Unicode.
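To make "encoding" concrete: in Unicode, every Myanmar character has one fixed code point and one official name, regardless of which font draws it. The short Python sketch below uses only the standard library `unicodedata` module; the helper name `describe` is chosen here for illustration.

```python
import unicodedata


def describe(text):
    """Return (code point, official Unicode name, UTF-8 bytes as hex) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), ch.encode("utf-8").hex())
        for ch in text
    ]


# The letter KA followed by the vowel sign E, stored in Unicode order.
for row in describe("\u1000\u1031"):
    print(row)
```

Any Unicode-compliant font, keyboard or search engine agrees on these values, which is why text written in one Unicode font can be read in another.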
Problem with Myanmar Unicode
Windows versions before 2008 did not support Myanmar Unicode out of the box. From the 2008 Windows versions onwards, users do not need to install anything to read Myanmar Unicode. Mac OS also supports Myanmar Unicode. Firefox and Google services (including Google Chrome) have also moved from Zawgyi towards supporting Myanmar Unicode. For example, Google now automatically converts Zawgyi to Myanmar Unicode[f][g][h][i][j][k][l]. But because most Myanmar people still use Windows 7 and still install Zawgyi, they have problems reading Unicode. Changing the computer's language setting does not by itself enable Myanmar Unicode either, even on Windows 8 and Mac OS. So we still have problems because Myanmar internet users have not changed their habits.
Why is it hard for people to switch from Zawgyi to Unicode?
Changing is tedious and time-consuming. The main problem is that the way of typing is not the same. People are familiar with the Zawgyi typing layout, and unfortunately the typing layout for Unicode is a little different from Zawgyi's.[m][n]
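Automatic conversion of this kind first has to guess whether a piece of text is Zawgyi or Unicode. The following is a minimal Python sketch of one possible heuristic, not the method Google or any other service actually uses. It rests on one assumption: Zawgyi stores the vowel sign ေ (U+1031) before the consonant, in visual order, whereas Unicode stores it after the consonant and any medials. The function name and the character ranges are illustrative, and the check catches only some Zawgyi text.

```python
E_VOWEL = "\u1031"  # MYANMAR VOWEL SIGN E


def is_consonant(ch):
    # Basic Myanmar consonants, U+1000 (KA) to U+1021 (A).
    return "\u1000" <= ch <= "\u1021"


def is_medial(ch):
    # Medial signs YA, RA, WA, HA: U+103B to U+103E.
    return "\u103B" <= ch <= "\u103E"


def looks_like_zawgyi(text):
    """Heuristic: True if U+1031 appears where Unicode order does not allow it.

    In Unicode order the vowel sign E follows a consonant or a medial.
    Finding it at the start of the text, or after anything else, suggests
    visual (Zawgyi-style) ordering. This is a rough guess, not a proof.
    """
    prev = ""
    for ch in text:
        if ch == E_VOWEL and not (prev and (is_consonant(prev) or is_medial(prev))):
            return True
        prev = ch
    return False


print(looks_like_zawgyi("\u1031\u1000"))  # E typed before KA (visual order)
print(looks_like_zawgyi("\u1000\u1031"))  # KA followed by E (Unicode order)
```

A production detector would need many more rules or a statistical model, since a lot of Zawgyi text does not trip this single check.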
So what are the solutions at the moment?
There is only one way forward: we have to use Unicode in the future to develop Myanmar's ICT sector. But because people find it hard to change, the following are possible solutions.
Converters such as
Unfortunately, some people cannot be bothered to use such a program, or do not know how to. Still, there are other solutions, such as software that auto-corrects the keyboard typing layout.[o][p][q]
Some software lets people type with the Zawgyi keyboard layout and then processes the input and outputs Myanmar Unicode. The problem with this approach is that:[r][s][t][u]
These are all still short-term solutions. The long-term solution is to encourage people to use Myanmar Unicode, and that is what Unicode supporters in Myanmar are doing. Not only are all government sites in Myanmar making the change, but more than 50 websites in Myanmar now use the Unicode standard. However, most of the websites people visit, such as myanmarmobileapp.com and many others, run only on the Zawgyi font. They embed the font, which means it does not matter whether you have Zawgyi or Myanmar Unicode installed: the web page is shown in the Zawgyi font because it is embedded. A site can embed either Zawgyi or Unicode; it is the owner's choice.
There are also some websites that accommodate both groups of users by serving their content in both encodings.
Can there be a Zawgyi unicode?
Some say that the Zawgyi developers have already developed a Unicode version of the Zawgyi fonts and keyboard, but it still requires a software installation[v][w][x][y][z] to adjust. It would be of great assistance to the Myanmar language if Zawgyi became Unicode-compliant.
How about embedding fonts?
Embedding fonts in a website is a technique for showing only the preferred font (in this case Zawgyi or Unicode) when a user visits the site. It exists so that users can get around the incompatibility between different encoding standards, especially when the PC does not have the fonts installed. The PC simply renders text with the fonts it receives from the website. The problem is that this does not work in mobile browsers; the technology still needs to be developed.
Even if the font is embedded, site owners still face the dilemma of whether to use Unicode or Zawgyi as their font of choice. If Zawgyi is not optimized and Unicode is chosen, this may still be a better solution. Some say that font embedding slows down the site. In conclusion, font embedding is not a lasting solution either, only a quick short-term one.
There are still many other related issues to talk about. For example, people know about the Win Myanmar font, but Win (like Zawgyi) is not Unicode. Win Myanmar Systems, better known as Win Myanmar Fonts, has been the nation's de facto standard for Myanmar language processing since 1992. It translates MS Windows and Excel menus into the Myanmar language. The Win Myanmar font system for Microsoft Windows is very common in Myanmar for digitally processing the Burmese script. Win Myanmar Systems enables you to type, design, or print in the Myanmar language right on your computer. The software supports both non-Unicode ASCII formats and ISO-compliant Unicode formats. Often called Win Fonts, they are not just fonts, as they come with very user-friendly keyboard drivers for Windows (compatible with Windows 3.0 to Windows 8). However, in this conversation about Zawgyi and Myanmar Unicode, Win Myanmar is a side topic that is worth knowing about but is not directly involved.
References -
http://www.unicode.org/notes/tn11/myanmar_uni-v2.pdf
http://my.wikipedia.org/wiki/Wikipedia:Font
https://code.google.com/p/zawgyi/wiki/WhyUnicode
http://www.myanmarlanguage.org
Update (April, 2015)
The latest Android versions, such as Lollipop, automatically support Myanmar Unicode. [aa][ab][ac][ad][ae][af][ag][ah][ai][aj]The latest Mac OS also has built-in Myanmar language support that follows the Unicode standard, and Windows already has Myanmar Unicode as well. This means all major global tech products already have built-in Myanmar language support for Myanmar people. Even so, Myanmar people cannot just start changing their habits.
Update (December 2016)
I have heard people complaining that Unicode-compatible keyboards are not natural, requiring people to type in a very special way. This is not true, since a smart keyboard can rearrange the input characters automatically, which makes typing much easier.
Note that Android has updated its Myanmar keyboard (Unicode) with reordering rules that make it easier to type. Feedback would be great. And how can people be more informed on Unicode keyboards?
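The kind of automatic rearranging a smart keyboard performs can be sketched in a few lines of Python. This is a simplified illustration, not the actual rule set of the Android keyboard: it handles only the case where ေ (U+1031) is typed before a consonant, and moves it after that consonant and any medial signs that follow. The function name `reorder_typed` and the character ranges are chosen here for illustration; real keyboards also have to deal with stacked consonants and other marks.

```python
E_VOWEL = "\u1031"  # MYANMAR VOWEL SIGN E


def is_consonant(ch):
    # Basic Myanmar consonants, U+1000 (KA) to U+1021 (A).
    return "\u1000" <= ch <= "\u1021"


def is_medial(ch):
    # Medial signs YA, RA, WA, HA: U+103B to U+103E.
    return "\u103B" <= ch <= "\u103E"


def reorder_typed(text):
    """Move a vowel sign E typed before a consonant to its Unicode position."""
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == E_VOWEL and i + 1 < n and is_consonant(text[i + 1]):
            j = i + 1
            out.append(text[j])          # the consonant comes first
            j += 1
            while j < n and is_medial(text[j]):
                out.append(text[j])      # then any medial signs
                j += 1
            out.append(E_VOWEL)          # then the vowel sign E
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)


typed = "\u1031\u1000"              # user types E, then KA (visual order)
stored = reorder_typed(typed)       # keyboard stores KA, then E
print([f"U+{ord(c):04X}" for c in stored])
```

With a rule like this, people can keep typing in the order they see the characters on screen, while the stored text stays in Unicode order.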
[a]Is this still correct? Any update?
[b]From a survey of websites, about 90% are in Zawgyi. However, most people using social networks use Zawgyi. It's not 100%, though.
[c]Thanks, is there any place online that tracks this data?
[d]I've been working to detect Zawgyi in web pages for the Google index. However, the variations in what people type for search means that the results are not good.
Another big issue: good search depends on "segmenting", breaking text into individual words. This is difficult with Unicode, but almost impossible with Zawgyi.
[e]Kindly noted.
[f]How is this being done? Do you mean in search engines?
Currently on my computer (OSX) I use Google Docs, and only Zawgyi displays correctly even though I have Unicode as the default font.
[g]This was written by someone else. Google detects Zawgyi in web pages that it reads in search and converts internally to Unicode for indexing. There is still a lot to be done.
[h]Re: Google Docs, you may need to tell your browser about the Unicode font installed on your system. I know that this is necessary in Chrome.
[i]Craig, I see that Google now has a supported Myanmar Web font "Padauk" which is a great step forward, however when I go to print from Google Sheets the font ordering is incorrect and there is no way to fix since the printing is done not on computer but on Google Server to PDF. Is there anyone I can talk to about this?
[j]Please let me know your environment for using Google Sheets, including type of browser and operating system version.
If you are willing, please share a link to a sample page that gives you this problem.
[k]I'd like to see your results and the characters in the spreadsheet to better understand this.
[l]Using newest OSX with Chrome, but also tried in Safari and other staff tried on PC with Chrome and Firefox.
When viewing the page it seems to use local fonts. We can display both Zawgyi and Unicode fonts correctly to edit and manipulate (depending on which font we have set as default in browser). When we click PRINT PREVIEW it also displays correctly. But the final print produced by Google (which I assume is done in the cloud based on forums) must use a different font rendering than the computer browser.
I manually converted a whole sheet to Unicode to try to export using Padauk font since it was Web font. The results were better, but there were several issues including the ေ coming after the consonant rather than before.
I can include screenshots but not sure the best way to do that in comments. Should we communicate via email?
+knoxjohnny@gmail.com
[m]What can be done to make typing easier, more like Zawgyi input? Is it enough to reorder the code points when the user types them in a non-Unicode order? For example, ေ + က (1031 + 1000) can be changed to ကေ (1000 + 1031) by a smarter keyboard.
It would be very helpful if you could provide a list of reordering that would make typing easier.
[n]Let me take him and do it in a week.
[o]I've built this simple website that lets people look at Zawgyi and other data using fonts. It is not a converter for production, but may be helpful for people to understand the problems.
http://zawgyi-unicode-test.appspot.com/burmese/
Let me know what you think.
[p]It is 404 not found
[q]Please try again: http://zawgyi-unicode-test.appspot.com/burmese/
[r]What do you think of MM3 and Myansan keyboard layouts for Unicode? Are there others we should consider?
[s]What are the softwares that I can buy to type in Zawgyi way and output as Unicode? Thanks in advance.
[t]I think you are asking for a keyboard that has a Zawgyi layout but produces Unicode output.
I'm sorry but I don't know which one would meet your needs. There are many ways of arranging the keys on a keyboard, and the layout actually does not depend on the font.
For Android, this search returns many keyboards: https://play.google.com/store/search?q=burmese%20android%20keyboard
I suggest looking through this list to find one that works well for you. Good luck, and please let me know if you find something good.
[u]I tried all the keyboards available on Android. Simple one-by-one typing is OK, pronunciation-wise. But if I need to put one character on top of another, I cannot find a keyboard better and simpler than Zawgyi (which is not Unicode, which I do not mind as long as it gives me what I need; for problems with viewing on other PCs, I make a PDF so everyone sees the same thing I see).
I have seen professional typists in Myanmar use other software. They have a long list of codes that they need to type in to put "one character on top of each other", which I cannot find online or in any other available software for online use or Android.
So, unicode or not, I have to go back to Zawgyi on Windows 7 to get what I need. :(
[v]Hi, Craig. I typeset Burmese documents in InDesign CS5/CS6 using Zawgyi, but there is one glyph that does not display correctly. Now that I know Zawgyi-One (the one font InDesign uses) is not fully Unicode compliant, what are other compatible solutions? Thanks!
[w]Hello, and thanks for your note. Zawgyi will display lots of things badly, so I'm not surprised with the problems. Zawgyi-One (and other Zawgyi versions) are *far* from Unicode compliant, and probably never will be.
A better solution is to use a Unicode font for all the text. I don't know InDesign, but I expect that it could use other fonts.
http://www.unicode.org/faq/myanmar.html should give you more information.
I can give you more info with my email: ccornelius@google.com. Feel free to send me a note.
[x]One more thing: this page may be of interest to you. It offers detection and conversion for Zawgyi text to Unicode: http://zawgyi-unicode-test.appspot.com/convertui/
Contact me with any questions.
[y]We use CS4 and CS5 and have lots of problems with Adobe and Myanmar fonts. Often times we have to use old Legacy fonts like Win-Inwa (pre-Zawgyi) because of the variety of font faces for graphic design. Most of these are not yet available for Unicode and Unicode does not render correct in the older versions of Adobe (not sure about current CC).
[z]Actually I've heard that Adobe now fully supports Unicode in newer versions, but again there is limited font selection for graphic design. The ability to transcode old legacy fonts into new Unicode fonts would be very useful for the transition.
[aa]What is most needed to help people make the transition to Unicode? Education, better keyboards, more fonts? I would appreciate your thoughts on this.
[ab]Better keyboards already exist, more fonts are already there. A very large amount of content is in Zawgyi. People want to read this content. When communicating, the other party will have Zawgyi only, so if they type in Unicode the other party can not read it. What is needed is automatic conversions by Facebook, Google, and other online services.
[ac]Thank you for your note. I'm working on some ideas to make Google Search work better in Unicode, and also to provide basic support for Zawgyi queries.
A big challenge is the phone vendors that replace Burmese Unicode fonts with Zawgyi.
[ad]May be you can suggest Google about this issue and influence Myanmar government. Then Myanmar government will influence the vendors. Just a dynamic thought. I know this is hard to implement.
[ae]Currently the social pressure is to use Zawgyi even if the logical choice is to use Unicode. I think that businesses need a top-down decision from the government to make the switch. It is not profitable for businesses to use Unicode currently because of user demand. What is needed is for there to be incentives or fines related to font usage. If Facebook and Google auto-converted hosted content and online searches to Unicode, and the telecoms (MPT, Ooredoo, and Telenor) auto-converted texts to Unicode, it would make Unicode the new cool tool to have so you can join the communication. Once the switch happens I don't think anyone would ever look back. As it stands the transition will never just happen, because both fonts are mutually exclusive and don't allow for a gradual transition.
[af]I know that inside Facebook app, if you select Burmese as the main language, it will show you an option to convert Zawgyi to Unicode. That's how I have been using Facebook lately to read my friends' posts. Personally I would never step on Zawgyi. No matter how convenient to use Zawgyi or familiar with that, it's not a forward looking approach and a huge setback for the country to join international standard.
[ag]This is great news! This type of tool may be the answer, although currently there is little incentive for people to switch. For example, companies like ours want to use Unicode, but are forced to type in or convert to Zawgyi because all of our users just communicate in Zawgyi. In principle we totally agree that Unicode is the answer, but in practice, if we switched to Unicode almost none of our users would interact with our page or products.
If the government were to mandate that all companies need to use Unicode by default and convert then Facebook and Google by default could render all posts in Unicode (even if typed in Zawgyi) it would make a mass push toward Unicode. Especially if also backed by MPT, Telenor and Ooredoo sending mass texts telling everyone it is happening and some public announcements leading up.
Once the public gained critical mass I don't think anyone would look back.
[ah]One final note: a "transition" step could be auto-converting to Unicode but forcing users to click "CONVERT to Zawgyi", since I'm sure Facebook doesn't want to abandon users with phones that they don't know how to upgrade. But adding that extra step and making Unicode the default no matter who posts would turn the tide.
[ai]Actually they put support for both fonts in the languages menu. So now there's Myanmar and Myanmar (Zawgyi). And the "don't convert" option is there but it's not working. Could be a bug.
[aj]FYI, the Myanmar Computer Federation wrote this letter supporting Unicode in 2014: http://www.unicode.org/L2/L2014/14141-myanmar-comp.pdf