23. MFS and National Language Support

This section was written by Oleg V. Zhirov <O.V.Zhirov@inp.nsk.su>, Aug 3, 1998

23.1. MFS and National Language Support

The main problem is that *nix and DOS use codesets which can differ. In Russia, for example, the most popular codeset for *nix is koi8-r, while the DOS standard is the so-called `alternative' codeset, cp866.
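To make the difference concrete, here is a minimal sketch in Python (not part of DOSEMU; the sample word is only an illustration) showing that the same Russian text is stored as different bytes in the two codesets, and that bytes written by DOS are misread when interpreted as koi8-r:

```python
# Sample text: the Russian word for "file" (f-a-j-l), given as Unicode escapes.
word = "\u0444\u0430\u0439\u043b"

koi8 = word.encode("koi8_r")  # bytes as a koi8-r Linux system stores them
dos = word.encode("cp866")    # bytes as DOS stores them

# The two byte sequences differ, although they represent the same word.
print("koi8-r:", koi8.hex(), " cp866:", dos.hex())

# Interpreting DOS bytes as koi8-r yields different characters.
misread = dos.decode("koi8_r")
print("misread equals original:", misread == word)

# An explicit cp866 -> koi8-r conversion restores the intended bytes.
converted = dos.decode("cp866").encode("koi8_r")
print("converted equals koi8-r bytes:", converted == koi8)
```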

As long as DOSEMU accesses DOS partitions directly, through the original DOS (V)FAT drivers, it does not matter that the Linux locale is set to koi8-r, and everything works correctly. However, with direct access you cannot run more than one copy of DOSEMU at the same time.

A more elegant solution (which is in fact heavily supported by recent DOSEMU development) is to mount all DOS partitions into the Linux filesystem and access them through the `internal network' system MFS (Mach File System). This lets you use the security of a real *nix (Linux) filesystem, and you can run as many simultaneous DOSEMU sessions as you want ;-).

However, new problems occur:

  1. Mounting DOS partitions into Linux. Currently, the Linux kernel (linux-2.0.34) supports mounting both MSDOS and VFAT partitions. There is no problem if your DOS partition has true MSDOS format. The case is more ambiguous if you have a VFAT partition. You can mount it as VFAT and get access to `long names', which looks like a very attractive option. But in this case you *lose* all access to _short_filenames_, the DOS aliases of long filenames. To access short filenames you need to mount the VFAT partition as MSDOS (!) (I am not even sure that this is completely safe (!)).

  2. IMPORTANT: when mounting a DOS partition, you can optionally convert filenames from the DOS codeset to the Linux one, if you want to see filenames correctly (in Russia, cp866 -> koi8-r). Otherwise you need to support both codesets in Linux. (Currently I do no conversion and have both codesets on my console.)

  3. Bug in MFS: transferring directory/file names from Linux to DOS via MFS also involves some name conversion from the *nix standard to the DOS one. In the original MFS release (current in dosemu-0.97.10), many of the corresponding locale-dependent (!!!) char/string operations on _DOS_names_ were performed with the _Linux_ locale setting (in my case koi8-r). As a result, one gets garbage instead of the original filenames.

  4. If you mount a VFAT partition in Linux as VFAT, some filenames are long. In fact, the current dosemu MFS can `mangle' them, converting them into short aliases. There is NO way to reconstruct the true DOS alias from the _long_name_, since the index value in a DOS-created SFNAME~index alias depends exclusively on the _history_ of file creation. As a result, `mangled' names differ from the true DOS short names.
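Problem (3) can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the actual MFS code: the helper lower_in_codeset() is hypothetical and imitates a byte-by-byte, locale-dependent tolower() for a given codeset. Applying it with koi8-r rules to a name stored in cp866 corrupts the name, while cp866 rules leave it intact:

```python
def lower_in_codeset(raw: bytes, codeset: str) -> bytes:
    """Hypothetical model of a locale-dependent tolower(), applied per byte."""
    out = bytearray()
    for b in raw:
        ch = bytes([b]).decode(codeset)   # what this byte means in the locale
        try:
            low = ch.lower().encode(codeset)
        except UnicodeEncodeError:
            low = bytes([b])              # no lowercase form in this codeset
        out += low if len(low) == 1 else bytes([b])
    return bytes(out)

# A lowercase Russian name (p-i-s-'-m-o, "letter") as DOS stores it in cp866.
name = "\u043f\u0438\u0441\u044c\u043c\u043e"
dos_name = name.encode("cp866")

right = lower_in_codeset(dos_name, "cp866")   # DOS rules: nothing to change
wrong = lower_in_codeset(dos_name, "koi8_r")  # Linux locale rules: bytes change

print("cp866 rules keep the name :", right == dos_name)
print("koi8-r rules keep the name:", wrong == dos_name)
print("name seen by DOS afterwards:", wrong.decode("cp866"))
```

The corruption happens because some bytes that are lowercase letters in cp866 are uppercase letters in koi8-r, so they get `lowered' to bytes that DOS displays as unrelated characters.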

23.2. Patching of MFS

The patch presented here cures problem (3) only. Summary of the modification:

To patch MFS, put patch.mfs into the directory dosemu-0.97.10/src/dosext/mfs/ and issue the command

    patch -p1 <patch.mfs

and then compile dosemu as usual.

To mount a DOS VFAT partition into the Linux filesystem I use something like

    mount -t vfat -o noexec,umask=022,gid=107,codepage=866,iocharset=cp866 /dev/hda1 /dos_C

(or the corresponding line in /etc/fstab). This way I get DOS names in the `alternative' codepage 866. To work with them in Linux I switch the console into `alternative' charset mode (see, e.g., the CYRILLIC-HOWTO).

NOTE: The VFAT module in linux-2.0.34 seems to be buggy: if you create a filename with lowercase Russian letters from Linux, you get a file that is not accessible from DOS (or even Win95). Probably the people who wrote the VFAT support forgot, when creating the short DOS alias from a long name (or when handling short names), to convert chars with codes > 127 to upper case? (At first glance, they use the tolower() and toupper() functions, which work only for ASCII codes <= 127.) Fortunately, DOS creates filenames properly, even with Cyrillic letters, and creating files with Russian names from DOS is completely safe.
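The suspected cause can be sketched in Python. This only illustrates the guess above and is not taken from the kernel source: Python's bytes.upper() converts ASCII letters only, which is how an ASCII-only toupper() behaves, whereas a codepage-aware conversion also handles the Cyrillic bytes above 127.

```python
# A lowercase Russian name with an ASCII extension, stored in cp866.
long_name = "\u0444\u0430\u0439\u043b.txt"
raw = long_name.encode("cp866")

# ASCII-only uppercasing: bytes above 127 are left untouched.
ascii_upper = raw.upper()

# Codepage-aware uppercasing: decode as cp866, uppercase, encode again.
cp866_upper = raw.decode("cp866").upper().encode("cp866")

print("ASCII-only result :", ascii_upper.hex())
print("cp866-aware result:", cp866_upper.hex())
print("results agree     :", ascii_upper == cp866_upper)
```

In the ASCII-only result the Cyrillic part stays lowercase, so the name would not match the all-uppercase form that DOS expects for a short name.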

In DOSEMU I boot DOS (DOS-7) from an hdimage. To access the DOS partition in DOSEMU I have two lines in config.sys

         install=c:\subst.exe L: C:\
         install=c:\lredir.exe C: linux\fs/dos_C

An obvious disadvantage of my approach: long filenames get short aliases slightly different from those seen with direct partition access. In practice this does not cause problems, since DOS software usually does not use long filenames, so files with long filenames are irrelevant.
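Why the aliases cannot match can be seen from a toy model in Python. The helper assign_aliases() is hypothetical and greatly simplified (real DOS alias generation has more rules); it only captures the point that the ~index is handed out in order of file creation, so the same set of long names can end up with different aliases:

```python
def assign_aliases(long_names):
    """Toy model: give each long name a BASE~N.EXT alias in creation order."""
    counters = {}
    aliases = {}
    for name in long_names:
        stem, dot, ext = name.rpartition(".")
        if not dot:                      # name has no extension
            stem, ext = name, ""
        base = stem.replace(" ", "").upper()[:6]
        ext = ext.upper()[:3]
        n = counters.get((base, ext), 0) + 1   # next free index for this base
        counters[(base, ext)] = n
        aliases[name] = f"{base}~{n}" + (f".{ext}" if ext else "")
    return aliases

# The same two files, created in opposite order.
first = assign_aliases(["Long report one.txt", "Long report two.txt"])
second = assign_aliases(["Long report two.txt", "Long report one.txt"])

for name in sorted(first):
    print(name, "->", first[name], "or", second[name])
```

Since the long names alone do not record the creation order, a mangling scheme that sees only the long names cannot know which index DOS actually assigned.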

23.3. TODO:

  1. In the Linux kernel: fix the upper/lowercase bugs in the VFAT module for chars with codes > 127.

  2. In the Linux kernel: the VFAT module should provide access to the short DOS aliases of long names, even if the VFAT partition is mounted in long-name support mode (with -t vfat).

  3. In dosemu MFS: MFS should use the short DOS aliases that actually exist for VFAT long names.