Thursday 28 July 2011

Unicode 6.0

Phonetic-symbol anoraks/nerds/geeks can have hours of fun browsing the Unicode Standard, the repository of all the characters that can be displayed on a modern computer screen (blog, 22 Jan 2007). If you haven’t got the book (which is hefty), browse online.

Now there’s a new version of the Standard, 6.0 (well, it came out last October, actually). Unlike previous versions, it has not been published as a printed book, but is available only online.

So what’s new in version 6.0? In brief: there are 2,088 new characters, including (I quote)
• over 1,000 additional symbols—chief among them the additional emoji symbols, which are especially important for mobile phones
• the new official Indian currency symbol: the Indian Rupee Sign
• 222 additional CJK Unified Ideographs in common use in China, Taiwan, and Japan
• 603 additional characters for African language support, including extensions to the Tifinagh, Ethiopic, and Bamum scripts
• three additional scripts: Mandaic, Batak, and Brahmi

There are also extensive technical changes to do with character properties and format specifications.

Two new Cyrillic characters cater for Azerbaijani. Two new Arabic characters and ten new Devanagari characters cater for Kashmiri. Thirty-two new Ethiopic characters cater for Gamo-Gofa-Dawro, Basketo, and Gumuz. Complete new blocks of letters cater for Mandaic, for Batak, and for Brāhmī.

Is there anything of particular interest to phoneticians and IPA users?

How about a symbol for a voiceless retroflex lateral fricative? A sort of combination of ɬ and ɭ? It’s not (yet) an official IPA symbol, but it’s a logical combination of the two. Here it is, U+A78E: ꞎ. (Unicode numbers are given in hexadecimal and prefixed with the identifier U+.)

If you’ve always wanted a COMBINING DOUBLE INVERTED BREVE BELOW, it’s now available. But unless you’re a Uralic Phonetic Alphabet aficionado, you’ll have managed without.

Do you have a use for subscript h k l m n p s t? I doubt it. Even if you do, you’d probably simply use the subscripting tag <sub> </sub>, as I have just done. In Unicode 6.0 they’re ready-made at U+2095 to U+209C.
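
For anyone who wants to poke at these code points directly, here is a minimal Python sketch of what the U+ notation amounts to: the digits after U+ are simply a hexadecimal number, which the built-in chr() turns into a character. The helper names from_u_notation and describe are my own inventions for illustration, not part of any library. Whether the character's name is found depends on which Unicode version your Python build ships with, so the sketch falls back to a placeholder rather than assuming it.

```python
import unicodedata


def from_u_notation(label: str) -> str:
    """Turn a label such as 'U+A78E' into the character it identifies."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"not a U+ label: {label!r}")
    # The digits after 'U+' are hexadecimal, so parse them in base 16.
    return chr(int(label[2:], 16))


def describe(label: str) -> str:
    """Return the label, the character, and its Unicode name if known."""
    ch = from_u_notation(label)
    # unicodedata only knows the Unicode version bundled with this Python,
    # so supply a fallback instead of letting the lookup raise ValueError.
    name = unicodedata.name(ch, "<name not in this Python's Unicode data>")
    return f"{label}  {ch}  {name}"


# The retroflex lateral fricative symbol mentioned above.
print(describe("U+A78E"))

# The ready-made subscript letters occupy U+2095 to U+209C inclusive.
subscripts = [chr(cp) for cp in range(0x2095, 0x209C + 1)]
print(" ".join(subscripts))
```

Whether the printed characters show up as proper glyphs or as empty boxes depends entirely on the fonts installed on your machine, not on Python.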

Students of the minority languages of China may welcome three new Bopomofo characters to cater for Hmu and Ge. (Bopomofo is a phonetic notation system based on Chinese characters.)

It’s one thing to have a symbol recognized in Unicode and assigned a U+ number. It’s something else for the new symbol to become available in an available font. We’ll just have to wait and see if and when these new characters make an appearance in documents on our display screens.

Don’t hold your breath.

11 comments:

  1. Unfortunately, <sub> </sub> is not accepted in comments on this blog!

  2. "something else for the new symbol to become available in an available font"

    Exactly. Just the other day, I was reading a few India-related articles on Wikipedia and was wondering what that ugly low-resolution image for the rupee symbol was doing there... Even in the actual Rupee article the character is only used once; the rest of the mentions use the image.

  3. For a character to be accepted by the Unicode Consortium, the submission is required to demonstrate its use, so it is clear that someone has a need for these subscripts. To write them off as substitutable by <sub> is missing the point of the Unicode standard: it is not simply the character set for HTML, it is designed to be universal across ALL platforms and applications.

    As for inclusion in fonts: the labiodental flap was introduced into Unicode three years ago, and is already available in nine fonts (http://www.fileformat.info/info/unicode/char/2c71/fontsupport.htm) which isn't bad going really. No breath-holding is required for the subscripts themselves: they are already in Symbola. (It should be noted there will always be at least one font implementing new characters as the Unicode consortium do not approve submissions unless there is at least one font which implements the character: for them to print it with!)

    This page is a very useful resource for anyone who wants to track down the implementation of any Unicode character:

    http://www.fileformat.info/info/unicode/char/search.htm

  4. @Stuart Brown: "It should be noted there will always be at least one font implementing new characters as the Unicode consortium do not approve submissions unless there is at least one font which implements the character: for them to print it with!"

    I can see that there must be a glyph requiring a codepoint, but is it kosher for a font implementation to jump the gun by assigning the glyph to the proposed codepoint before it's been approved by the consortium? I thought the Private Use Area was for implementing unapproved characters; so any pre-existing font implementation would need to be amended post approval to move the character to its permanent codepoint.

  5. The people who contribute fonts for use in the Unicode books are of course on the inside track for the location of the character. Many such fonts come from Evertype, Michael Everson's company.

  6. Although not quite on a par with the nasal ingressive voiceless velar trill (see blog entry dated Monday, 6 April 2009), I feel that an ingressive version of the above mentioned voiceless retroflex lateral fricative also deserves a symbol. I think it was probably one of the sounds used in an advert for instant coffee, where the hostess goes into the kitchen and pretends to be using a percolator.

