Discussion:
Xlib UTF-8 support
Mirco Bakker
2006-12-07 02:17:32 UTC
Hi

After reading the UTF-8 and Unicode FAQ, I tried to write a small X application that displays a variety of characters from different character sets. While most European characters (e.g. German äüö, French éèà) work fine, Russian and Asian characters are either not displayed at all or come out scrambled.

The program (written in C) uses only standard Xlib. The text is drawn with XmbDrawString() (AFAIK the function of choice). I also tried Xutf8DrawString() (X_HAVE_UTF8_STRING is set), with the same result. After Googling for hours I found a few outdated reports that Xlib has a bug in handling UTF-8 strings (or fonts). Is this still true, or is my code crap?

TIA, Mirco
Michael B Allen
2006-12-07 03:06:09 UTC
Two things. First, I believe Pango is becoming the de facto method for
rendering non-Latin1 text in general purpose applications (I've never
used it but from installing apps I can see more and more apps depend on
it). Second, make sure you're in a UTF-8 locale. If you're not,
UTF-8 text will not be rendered properly.
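
A minimal sketch of such a locale check, assuming a POSIX system. The helper names codeset_is_utf8() and locale_is_utf8() are made up for illustration; setlocale() and nl_langinfo(CODESET) are the standard calls doing the work.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: returns 1 if a codeset name denotes UTF-8.
 * Compared case-insensitively; "UTF-8" and "utf8" are both accepted. */
int codeset_is_utf8(const char *codeset)
{
    if (codeset == NULL)
        return 0;
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "UTF8") == 0;
}

/* Hypothetical helper: adopts the locale from the environment
 * (LC_ALL, LC_CTYPE, LANG) and reports whether its codeset is UTF-8. */
int locale_is_utf8(void)
{
    const char *codeset;

    if (setlocale(LC_ALL, "") == NULL)
        return 0;               /* the environment names an unknown locale */
    codeset = nl_langinfo(CODESET);
    assert(codeset != NULL);    /* POSIX returns "" rather than NULL */
    return codeset_is_utf8(codeset);
}
```

Calling locale_is_utf8() once at startup, before any font set is created, and printing a warning when it returns 0 would catch the case where the program runs under, say, LANG=C.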

Mike

On Thu, 07 Dec 2006 03:17:32 +0100
Post by Mirco Bakker
Hi
After reading the UTF-8 and Unicode FAQ, I tried to write a small X application that displays a variety of characters from different character sets. While most European characters (e.g. German äüö, French éèà) work fine, Russian and Asian characters are either not displayed at all or come out scrambled.
The program (written in C) uses only standard Xlib. The text is drawn with XmbDrawString() (AFAIK the function of choice). I also tried Xutf8DrawString() (X_HAVE_UTF8_STRING is set), with the same result. After Googling for hours I found a few outdated reports that Xlib has a bug in handling UTF-8 strings (or fonts). Is this still true, or is my code crap?
TIA, Mirco
--
Michael B Allen
PHP Active Directory SSO
http://www.ioplex.com/
Rich Felker
2006-12-07 04:28:36 UTC
Post by Michael B Allen
Two things. First, I believe Pango is becoming the de facto method for
rendering non-Latin1 text in general purpose applications (I've never
I'm hoping we can remedy this situation. Xft/pango is extremely slow
compared to the core X font system, and there's nothing wrong with the
core system as long as the X/font server could communicate
OpenType/AAT/etc. tables to Xlib for Xlib to use in correctly choosing
glyphs.

Unfortunately we're a long way from having something like this
working, but in the meantime Xlib and core fonts should work fine for
UTF-8 as long as you don't need context-sensitive glyphs.
Post by Michael B Allen
used it but from installing apps I can see more and more apps depend on
it). Second, make sure you're in a UTF-8 locale. If you're not,
UTF-8 text will not be rendered properly.
Also make sure a font with iso10646-1 encoding is selected... Any
other ideas?
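
As a rough sketch of that font check: a full XLFD font name ends with its charset registry and encoding, so a plain suffix comparison is enough to tell whether a fully specified name is an iso10646-1 font. The helper name xlfd_is_iso10646() is hypothetical, and the sketch assumes the name is a complete XLFD (for example one returned by XListFonts), not a wildcard pattern or an alias such as "fixed".

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: returns 1 if a full XLFD font name ends in the
 * registry/encoding pair "iso10646-1" (case-insensitive), 0 otherwise.
 * Aliases and wildcard patterns are not resolved, so they yield 0. */
int xlfd_is_iso10646(const char *xlfd)
{
    static const char suffix[] = "-iso10646-1";
    const size_t suffix_len = sizeof suffix - 1;
    size_t name_len;

    assert(xlfd != NULL);
    name_len = strlen(xlfd);
    if (name_len < suffix_len)
        return 0;
    /* Compare only the tail of the name against the suffix. */
    return strcasecmp(xlfd + name_len - suffix_len, suffix) == 0;
}
```

A font whose name ends in, say, iso8859-1 can only ever show Latin-1 characters, which would match the symptom of äüö working while Cyrillic and CJK text does not.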

BTW I don't know what policy on this list is, but in general it's
considered bad to top-post on lists I think.

Rich
Jiro SEKIBA
2006-12-07 04:36:01 UTC
At Thu, 07 Dec 2006 03:17:32 +0100,
Post by Mirco Bakker
The program (written in C) uses only standard Xlib. The text is drawn with XmbDrawString() (AFAIK the function of choice). I also tried Xutf8DrawString() (X_HAVE_UTF8_STRING is set), with the same result. After Googling for hours I found a few outdated reports that Xlib has a bug in handling UTF-8 strings (or fonts). Is this still true, or is my code crap?
X's UTF-8 support is OK, but only a few fonts have all glyphs.

If you specify a single iso10646-1 font, check its glyph table
with xfd -fn 'font'; you'll see that only some glyphs are defined.
Other fonts, such as -gnu-unifont, define more glyphs.

Or you can use a font set instead of a single iso10646-1 font.
Try specifying legacy fonts as a comma-separated list,
e.g. "a14,k14,*" ('*' is a wildcard).

Regards
--
Jiro SEKIBA <***@sekiba.com>
Rich Felker
2006-12-08 05:36:05 UTC
Post by Jiro SEKIBA
At Thu, 07 Dec 2006 03:17:32 +0100,
Post by Mirco Bakker
The program (written in C) uses only standard Xlib. The text is
drawn with XmbDrawString() (AFAIK the function of choice). I also
tried Xutf8DrawString() (X_HAVE_UTF8_STRING is set), with the same
result. After Googling for hours I found a few outdated reports
that Xlib has a bug in handling UTF-8 strings (or fonts). Is this
still true, or is my code crap?
X's UTF-8 support is OK, but only a few fonts have all glyphs.
A few? Actually no fonts have all glyphs. :(
Part of this is just incompleteness, but part of it is the
insufficiency of 1-1 character/glyph mapping.
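
To illustrate the 1-1 mapping problem with a small sketch: the same visible letter can be one code point or two, so a font that maps each code point to exactly one glyph cannot draw the second form as a single accented letter. The function name utf8_codepoints() is made up for this example, and the code assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts Unicode code points in a well-formed,
 * NUL-terminated UTF-8 string by counting every byte that is not a
 * continuation byte (bit pattern 10xxxxxx). */
size_t utf8_codepoints(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For instance, utf8_codepoints("\xC3\xA9") gives 1 for the precomposed é (U+00E9), while utf8_codepoints("e\xCC\x81") gives 2 for "e" followed by the combining acute accent (U+0301), although both should look the same on screen.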
Post by Jiro SEKIBA
Or you can use a font set instead of a single iso10646-1 font.
Try specifying legacy fonts as a comma-separated list,
e.g. "a14,k14,*" ('*' is a wildcard).
Hm? What programs will use this?

Rich
Peter Lunicks
2006-12-08 06:55:46 UTC
Post by Rich Felker
Post by Jiro SEKIBA
Or you can use a font set instead of a single iso10646-1 font.
Try specifying legacy fonts as a comma-separated list,
e.g. "a14,k14,*" ('*' is a wildcard).
Hm? What programs will use this?
Rich
Well, emacs supports fontsets. They are based on designating certain fonts to
be used for certain ranges/sets of characters. I don't know about other
programs or about Xlib itself, though.

For emacs fontsets see
http://www.gnu.org/software/emacs/manual/emacs.html#Fontsets

PL
Jiro SEKIBA
2006-12-10 01:57:39 UTC
At Fri, 8 Dec 2006 00:36:05 -0500,
Post by Rich Felker
Post by Jiro SEKIBA
X's UTF-8 support is OK, but only a few fonts have all glyphs.
A few? Actually no fonts have all glyphs. :(
Oh, well, that's correct :(. A few have sufficient glyphs, I should say.
(What counts as "sufficient" depends on how you use them, though.)
Post by Rich Felker
Post by Jiro SEKIBA
Or you can use a font set instead of a single iso10646-1 font.
Try specifying legacy fonts as a comma-separated list,
e.g. "a14,k14,*" ('*' is a wildcard).
Hm? What programs will use this?
XCreateFontSet() accepts that syntax, which means any application using
the Xmb*/Xwc* API can accept such a font set specification.
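
A minimal sketch of how that might look in a small Xlib program, assuming a UTF-8 locale and a source file saved as UTF-8. The base font name list "a14,k14,*" is the one from the thread; whether a14 and k14 exist depends on the installed fonts. The sample text, window size, and build line are assumptions for illustration.

```c
/* Assumed build line: cc fontset.c -o fontset -lX11 */
#include <X11/Xlib.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Sample text; this source file is assumed to be saved as UTF-8. */
    const char *text = "äüö éèà / Привет / 日本語";
    char **missing = NULL;
    char *def_string = NULL;
    int nmissing = 0, i, screen;
    Display *dpy;
    XFontSet fs;
    Window win;
    GC gc;

    /* The font set is built for the current locale, so set it first. */
    if (setlocale(LC_ALL, "") == NULL || !XSupportsLocale()) {
        fprintf(stderr, "locale not supported by C library or Xlib\n");
        return 1;
    }
    dpy = XOpenDisplay(NULL);
    if (dpy == NULL) {
        fprintf(stderr, "cannot open display\n");
        return 1;
    }

    /* Comma-separated base font name list; '*' matches any font. */
    fs = XCreateFontSet(dpy, "a14,k14,*", &missing, &nmissing, &def_string);
    if (fs == NULL) {
        fprintf(stderr, "cannot create font set\n");
        XCloseDisplay(dpy);
        return 1;
    }
    /* Charsets the locale needs but for which no font was found. */
    for (i = 0; i < nmissing; i++)
        fprintf(stderr, "no font for charset %s\n", missing[i]);
    if (missing != NULL)
        XFreeStringList(missing);

    screen = DefaultScreen(dpy);
    win = XCreateSimpleWindow(dpy, RootWindow(dpy, screen), 0, 0, 480, 80, 1,
                              BlackPixel(dpy, screen),
                              WhitePixel(dpy, screen));
    XSelectInput(dpy, win, ExposureMask | KeyPressMask);
    XMapWindow(dpy, win);
    gc = DefaultGC(dpy, screen);

    for (;;) {
        XEvent ev;

        XNextEvent(dpy, &ev);
        if (ev.type == KeyPress)
            break;                      /* any key quits */
        if (ev.type == Expose) {
#ifdef X_HAVE_UTF8_STRING
            Xutf8DrawString(dpy, win, fs, gc, 10, 40,
                            text, (int)strlen(text));
#else
            /* Interprets the bytes in the locale's encoding instead. */
            XmbDrawString(dpy, win, fs, gc, 10, 40,
                          text, (int)strlen(text));
#endif
        }
    }

    XFreeFontSet(dpy, fs);
    XCloseDisplay(dpy);
    return 0;
}
```

The missing charset list printed on stderr is probably the most useful diagnostic here: if it is non-empty, characters from those charsets have no font in the set and will not be drawn correctly.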

Regards
--
Jiro SEKIBA <***@sekiba.com>