Whole article taken from the Gentoo Forums.

This is a static version of Enabling Japanese at the Gentoo Wiki. Previous versions are archived here and here. This page was updated on 3 February 2006.



Enabling Japanese



Support is available in the Desktop Environments forum. Make sure to include all the appropriate versions of things - like kde-3.3.4.

Of all languages to learn, Japanese is known as one of the most challenging - not because of the spoken language, but because of the written language. The objective of this HOWTO is to make your Gentoo box work with that written language. For this, there are two sections: Japanese Fonts, and Japanese Input. Those setting up input should, of course, set up their fonts first. New installations will want to make sure they have the proper USE flags set, as outlined below.

---



Japanese Fonts



If you simply want to read Japanese text, say, in Mozilla Firefox, you need to install fonts. A good sign that you have not installed the proper fonts is that the following characters appear as boxes with numbers inside: 日本語フォント

emerge media-fonts/kochi-substitute    # for Japanese
emerge media-fonts/arphicfonts         # for Chinese
emerge media-fonts/baekmuk-fonts       # for Korean


It never hurts to get them all.
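
One way to check from a terminal whether fontconfig sees any font that covers Japanese is sketched below. It assumes the fontconfig tools (fc-list) are installed; the helper name count_ja_fonts is made up for illustration.

```shell
# Hypothetical helper: count the font families that fontconfig reports
# as covering Japanese. Assumes fc-list (from fontconfig) is on the PATH.
count_ja_fonts() {
    # ":lang=ja" restricts the listing to fonts tagged for Japanese;
    # "family" prints only the family name of each match.
    fc-list :lang=ja family | sort -u | wc -l
}

if command -v fc-list >/dev/null 2>&1; then
    echo "Japanese-capable font families: $(count_ja_fonts)"
else
    echo "fc-list not found; is fontconfig installed?"
fi
```

A count of zero suggests the fonts above are not installed yet, or that the font cache needs refreshing.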
There are other CJK and Unicode fonts available in the portage tree, which can be found with emerge --search fonts, with some notable exceptions. Bitstream Cyberbit is available in an ebuild outside of portage, due to licensing questions. Arial Unicode MS is another great font, which you may or may not have access to. There have been reports of errors in emulators while using this font, but the following procedure can be used for any Microsoft-provided TrueType fonts you may find:



emerge cabextract

Find a copy of aruniupd.exe - online availability changes.

cabextract aruniupd.exe

For system-wide installation use

cp *.ttf /usr/share/fonts/


for local installation (no root access)

cp *.ttf ~/.fonts/

Then

fc-cache -fv

Programs will probably have to be restarted to access new fonts.
Arial Unicode MS is now available to your system. Web browsers like Firefox will probably need to have it selected in their settings. Specifically, in Mozilla Firefox, see Preferences >> General >> Fonts & colors >> Fonts for: Japanese


Java 1.4.x


This has been tested on blackdown-jdk-1.4.2.03 :

cd $JAVA_HOME/jre/lib/

cp font.properties font.properties.old
sed -e "s/-watanabe-mincho/-misc-Kochi Mincho-medium/g" -e "s/-wadalab-gothic-medium/-misc-Kochi Gothic-medium/g" font.properties.ja > font.properties

echo 'appendedfontpath=/usr/share/fonts/kochi-substitute' >> font.properties
/usr/sbin/env-update && source /etc/profile


Java 1.5 (unverified)


frostschutz says:

According to some docs I've read, Java 1.5 is supposed to support 'fallback fonts' without having to add them explicitly to fonts.properties. So all you have to do is to create a .../jre/lib/fonts/fallback/ directory and put at least one Unicode font with Japanese support in there (or, since these fonts tend to get very big, just a symlink to an existing font in your /usr/share/fonts/ directory).
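
The fallback directory described above could be set up roughly as follows. Like the note itself, this is unverified. It assumes JAVA_HOME points at a Java 1.5 installation, and the font file name in the example call (kochi-gothic-subst.ttf) is an assumption, so check what is actually in your font directory. The helper name link_fallback_font is made up for illustration.

```shell
# Hypothetical helper: link one font into Java's fallback directory.
#   $1 = Java home (for example "$JAVA_HOME")
#   $2 = full path to an existing font with Japanese glyphs
link_fallback_font() {
    java_home=$1
    font=$2
    fallback_dir="$java_home/jre/lib/fonts/fallback"

    # Refuse to create a dangling symlink.
    [ -f "$font" ] || { echo "no such font: $font" >&2; return 1; }

    # Create the directory if needed, then symlink instead of copying,
    # since these fonts tend to be large.
    mkdir -p "$fallback_dir" &&
        ln -sf "$font" "$fallback_dir/"
}

# Example call (the font file name is an assumption; adjust to your system):
# link_fallback_font "$JAVA_HOME" /usr/share/fonts/kochi-substitute/kochi-gothic-subst.ttf
```

Java programs will probably have to be restarted before they notice the new font.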

Japanese Input


Fonts are not enough for you? Good. Let's prep your system for input support. It should be noted that this process is quite similar for Chinese, Korean, and a host of other languages.

Setting Locale


Using Japanese characters means using characters outside the default POSIX character set, namely Unicode characters. To input them, you need to allow their use on your system.


locale


LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

All of the entries should be either blank or say "POSIX", unless your locale has been previously set. If so, you need to figure out where. ; )

locale -a


de_DE.utf8
en_GB.utf8
en_US.utf8
fr_FR.utf8
ja_JP.utf8

This gives a list of all the Unicode locales available on your system. This list can be expanded or limited by editing your needed locales, should you be missing an entry. You are choosing the language you want your menus to be in, NOT the one you are currently setting up input for. For example, a Frenchman wanting to write Japanese would choose fr_FR.utf8 from this list.
Now, continuing with the Frenchman example:


echo 'LANG="fr_FR.UTF-8"' >> /etc/env.d/02locale

env-update
>>> Regenerating /etc/ld.so.cache...
source /etc/profile


Notice the change from utf8 to UTF-8. It is required since all UTF-8-enabled locales are specified in terms of UTF-8 and not utf8. Make sure it has taken effect.

locale


LANG=fr_FR.UTF-8
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER="fr_FR.UTF-8"
LC_NAME="fr_FR.UTF-8"
LC_ADDRESS="fr_FR.UTF-8"
LC_TELEPHONE="fr_FR.UTF-8"
LC_MEASUREMENT="fr_FR.UTF-8"
LC_IDENTIFICATION="fr_FR.UTF-8"
LC_ALL=


If not, restart, reboot, and ask questions afterwards.

Ok, one answer: /etc/env.d/02locale is used because of precedent, and is outlined as such in Using UTF-8 in Gentoo, a good thing to read if you have issues at this point or later.
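
If you would rather script the availability check than read the locale -a list by eye, something like the following may help. It is a sketch, not part of the official procedure: the helper name locale_available is made up, and it simply compares names case-insensitively while ignoring the hyphen, so that fr_FR.UTF-8 matches the fr_FR.utf8 spelling that locale -a prints.

```shell
# Hypothetical helper: succeed if the locale named in $1 (for example
# fr_FR.UTF-8) appears in the list read from standard input.
# Both sides are lowercased and stripped of hyphens before comparing,
# because `locale -a` prints fr_FR.utf8 rather than fr_FR.UTF-8.
locale_available() {
    want=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF "$want"
}

wanted="fr_FR.UTF-8"
if locale -a 2>/dev/null | locale_available "$wanted"; then
    echo "$wanted is available"
else
    echo "$wanted is missing; add it to your locales before setting LANG"
fi
```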

Setting USE flags



Next, you need to add the following USE flags to your make.conf, if they do not already exist:


cjk - standing for 'Chinese Japanese Korean' - gives support for Hanzi-inspired characters (two-byte characters, kanji, the reason you get all those accented 'a's).
nls - 'native language support' - supposedly for enabling other languages in your interface, the nls flag could be used by some ebuilds as an 'other language support' switch. Enable this as one of many safeguards to ensure that Japanese localization is compiled in.
immqt-bc - lets Qt handle different input methods.
-immqt - This is explicitly disabled because it conflicts with immqt-bc. Setting this flag would require recompiling all programs that depend on Qt3, and it has broken things in the past. This recommendation will change with Qt4.
unicode - Unicode is the pot every character is thrown in (except cursive Hebrew, apparently ^.^; )
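
Put together, the corresponding line in /etc/make.conf might look like the fragment below. It shows only the flags from this list. A real USE line will almost certainly carry other flags too, so merge these into your existing line rather than replacing it.

```shell
# /etc/make.conf (fragment; make.conf uses shell-style variable assignments)
# Only the flags discussed above are shown. Merge them into your existing
# USE line instead of overwriting it.
USE="cjk nls immqt-bc -immqt unicode"
```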


With these flags set in your /etc/make.conf, you should make sure all your currently portage-installed packages have the correct support built in. New systems should make sure to do this early (if not recompiling all packages), to avoid rebuilding more packages than necessary.

emerge world --newuse


Input Methods



Now, Japanese has both kana and kanji - you need a dictionary to give you possible kanji. Anthy is different from other systems available because it does not require any services to be started.

emerge anthy	

Now that the dictionary is installed, an additional input method will be built.
UIM, the Universal Input Method, is what routes keyboard input.

emerge uim	

On its own, UIM is enough (under gtk+) to handle Japanese input. You can check this from the text entry context menu of most gtk+ programs (excluding Firefox), in which UIM-anthy will be one of the new choices. UIM, in fact, becomes the default gtk+ input method once installed - and it has a Gnome control panel available if you are satisfied with switching methods via keyboard. (Qt requires an export QT_IM_MODULE=uim statement.)

Graphical Input Method selection


SCIM, the Smart Common Input Method, provides a taskbar icon and menu for switching between input methods. It is especially good for computers with more than two methods available - or for people that prefer mouse access.

emerge scim-uim	

Qt needs an additional step to use scim - emerge scim-qtimm. GTK+-only users do not need to do this, though.
Now that everything is installed, we just need to tell everything to use scim. The following can go in /etc/xprofile for all users, or your own ~/.xprofile.

export XMODIFIERS=@im=SCIM
export GTK_IM_MODULE=scim
export QT_IM_MODULE=scim



Wrapping up


To actually use your input method, you will at least have to env-update; source /etc/profile and restart X11; you may possibly have to reboot.
Once you have done so, start up a text editing program like kwrite or gedit. A keyboard icon will appear in the system tray that lets you select from your different input methods.
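
If the icon does not show up, it may be worth confirming that the variables from your xprofile actually reached the session. The following is a small sketch to run in a terminal inside X. The helper name check_im_var is made up, and the expected values are simply the ones exported earlier for scim.

```shell
# Hypothetical helper: report whether an input-method variable has the
# expected value.  $1 = variable name, $2 = expected value.
check_im_var() {
    # Indirectly read the variable named in $1 (empty if it is unset).
    eval "actual=\${$1-}"
    if [ "$actual" = "$2" ]; then
        echo "ok: $1=$actual"
    else
        echo "check: $1 is '$actual', expected '$2'"
        return 1
    fi
}

# "|| true" keeps going so that every variable gets reported.
check_im_var XMODIFIERS    "@im=SCIM" || true
check_im_var GTK_IM_MODULE "scim"     || true
check_im_var QT_IM_MODULE  "scim"     || true
```

Any line starting with "check:" points at a variable that was not picked up, which usually means X has not been restarted since the xprofile was edited.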

Once you are using an input method, like uim-anthy, there are several modes to choose from: raw input, hiragana, katakana, half-width katakana, and a typewriter-like variation of the Latin alphabet. Start typing in Hiragana mode, and your text will be converted as the appropriate kana are found. The spacebar brings up a list of possible kanji and cycles through it, and hitting enter accepts and uses the replacement. More keyboard combinations are at uim-anthy.



Notes


CJK fonts sometimes cause xorg-x11 compiled with the flag hardened to fail when starting up. Reference

"To enable UTF-8 on the console, you should edit /etc/rc.conf and set UNICODE="yes", and also read the comments in that file"
"Alternate WMs" Reference
GMplayer just doesn't, okay?

If you get letters that are inconsistent with the font you expected, you are not using raw input mode. Try some other modes.
The SCIM button can seem to flash or temporarily disappear. This is because scim keeps settings per program - Firefox input could be in Japanese while Gedit is in another language.
Gjiten & Kiten (part of kdeedu) are Japanese dictionary programs, using EDICT. Gjiten is more comprehensive, but requires you to manually install dictionaries. Nihongo Benkyo is another possibility; see Bug 112894 for ebuilds.

We'll get to Qt4 when KDE does.



See also



HOWTO Make your system use unicode/utf-8

Inputting Japanese text in Linux and some BSDs

Linux Internationalization HOWTO

SCIM wiki

Anthy Wiki (Japanese)

UIM

More on Chinese fonts