From Oem to Ansi
By Uwe Holz
Last Update:
Sunday, December 28, 2003
More information about the Oem and Ansi character sets can be found at Microsoft's homepage.
Data conversion from old DOS-based projects often requires Oem to Ansi translation. You might say that this is not a big deal, since Windows has all the functions required for the job built in. That's true, but what if:
- You are developing a platform independent tool?
- You have to support different language settings?
- You need a way to support a language where you don't even know the rules for this conversion?
The following article describes a generic way to solve all these problems with a template-based Oem to Ansi translation.
Technical Background
Windows programs typically use Ansi characters, which are unknown to DOS. The DOS character set (known as the DOS/OEM character set) includes various line-drawing and other characters that are unknown under Windows. Both character sets are identical for character codes 0 through 127. Windows comes with functions such as OemToAnsi(), OemToChar(), OemToCharBuff() and others to translate from Oem to Ansi.
To make things more complicated, this translation differs from language to language, and there is no one-to-one mapping between the ANSI and OEM character sets even within a single language. It's pretty obvious that this kind of translation is a constant source of trouble.
The solution
The best way to solve problems like this is to make the solution fully configurable, which requires a configuration file containing the Oem codes and their corresponding Ansi codes. The program reads the values from that file at startup and bases all Oem to Ansi translations on them.
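To make the idea concrete, here is a minimal sketch of a table-driven translation in C. It is not the tool's actual source: the names OemTable, oem_table_init(), oem_table_set() and oem_to_ansi() are hypothetical and chosen only for illustration. The sketch assumes one Ansi byte per Oem byte and starts from the identity mapping, since codes 0 through 127 are the same in both character sets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: one Ansi code for every possible byte value. */
typedef struct {
    unsigned char map[256];
} OemTable;

/* Start from the identity mapping; codes 0..127 never change. */
void oem_table_init(OemTable *t)
{
    int i;
    assert(t != NULL);
    for (i = 0; i < 256; i++)
        t->map[i] = (unsigned char)i;
}

/* Register one "oem=ansi" pair as read from the configuration file.
   Returns 0 on success, -1 if a code is outside the allowed range. */
int oem_table_set(OemTable *t, int oem, int ansi)
{
    assert(t != NULL);
    if (oem < 128 || oem > 255 || ansi < 0 || ansi > 255)
        return -1;
    t->map[oem] = (unsigned char)ansi;
    return 0;
}

/* Translate a buffer in place, one byte at a time. */
void oem_to_ansi(const OemTable *t, unsigned char *buf, size_t len)
{
    size_t i;
    assert(t != NULL && buf != NULL);
    for (i = 0; i < len; i++)
        buf[i] = t->map[buf[i]];
}
```

Because the lookup is a plain array access, supporting another language only means filling the table with different values; the translation loop itself stays untouched.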
The CGI version of the tool can be tested right here. Just type in the name of the file you want to upload to the Oem2Ansi utility and you will receive the resulting file as a web download in return:
Note: Specify only text files. Other formats containing binary data are not supported. Your data will be converted but not stored on the server in any way. If you have any doubt about it, just take a look at the oem2ansi source code to verify this.
Command line arguments
The command line version of the tool can be downloaded and tested. It comes with source code. By default it uses the German Oem to Ansi translation (setup file german.oem), which can be changed with the command line option oemtab=mylanguage.oem. The other possible command line argument is the file to be translated:
c:\> oem2ansi.exe file=<DBF/CSV source file> [<oemtab=mylanguage.oem>]
Linux version:
lx:/temp # oem2ansi.bin file=<DBF/CSV source file> [<oemtab=mylanguage.oem>]
Sample 1: Calling oem2ansi.exe on Windows (W9x, W2K, XP) in order to convert the file readme.txt, which contains German Oem strings:
c:\temp> oem2ansi.exe file=readme.txt oemtab=german.oem
Sample 2: Calling the Linux version in order to convert the file readme.ger:
lx:/tmp # oem2ansi.bin file=readme.ger
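For readers who want to see how such key=value arguments can be handled, the following is a rough sketch in C. It is an illustration only, not the code shipped with the tool: the struct Oem2AnsiArgs and the function parse_args() are hypothetical names, and rejecting unknown arguments is an assumption about the desired behaviour. The default german.oem follows the description above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the two documented arguments. */
typedef struct {
    const char *file;    /* value of file=...,   NULL if not given      */
    const char *oemtab;  /* value of oemtab=..., defaults to german.oem */
} Oem2AnsiArgs;

/* Returns 0 on success, -1 if file= is missing or empty, or if an
   unknown argument is found (the latter is an assumption). */
int parse_args(int argc, char **argv, Oem2AnsiArgs *out)
{
    int i;
    assert(out != NULL);
    out->file = NULL;
    out->oemtab = "german.oem";
    for (i = 1; i < argc; i++) {
        if (strncmp(argv[i], "file=", 5) == 0)
            out->file = argv[i] + 5;
        else if (strncmp(argv[i], "oemtab=", 7) == 0)
            out->oemtab = argv[i] + 7;
        else
            return -1;
    }
    return (out->file != NULL && out->file[0] != '\0') ? 0 : -1;
}
```

With the arguments of Sample 2, this sketch would leave oemtab at german.oem and set file to readme.ger.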
If your favoured platform is neither Win32 nor Linux, just use the available platform-independent source code and recompile it on your machine. Take the Linux-specific project file makefile.linux and adjust it to the appropriate GNU C++ compiler.
How to create a new language template?
Passing a different language setting (mylanguage.oem) to the tool assumes that mylanguage.oem is available. If the .oem file for the language you are looking for is not part of the current archive, it can easily be created manually. Each Oem code (128 through 255) on the left side must have its language-dependent Ansi counterpart on the right side. The German version looks as follows:
; File: german.oem
[OEM2ANSI]
128=199
129=252
130=233
...
253=178
254=166
255=160
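A file in this format is simple to read back. The sketch below shows one possible way to do it in C; the function names oem_parse_line() and oem_load_table() are hypothetical and this is not the tool's own parser. It assumes that lines starting with ';' are comments, that lines starting with '[' are section headers that can be skipped, and that every other non-empty line has the form oem=ansi.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one line of a .oem file into table[].
   Returns 1 if a mapping was stored, 0 if the line was skipped
   (comment, section header, blank), -1 if it is malformed or
   a code is out of range. */
int oem_parse_line(const char *line, unsigned char table[256])
{
    int oem, ansi;
    assert(line != NULL && table != NULL);
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == ';' || *line == '[' || *line == '\0' ||
        *line == '\r' || *line == '\n')
        return 0;
    if (sscanf(line, "%d =%d", &oem, &ansi) != 2)
        return -1;
    if (oem < 128 || oem > 255 || ansi < 0 || ansi > 255)
        return -1;
    table[oem] = (unsigned char)ansi;
    return 1;
}

/* Load a whole .oem file. The table starts as the identity mapping,
   so codes without an entry stay unchanged.
   Returns the number of mappings read, or -1 if the file cannot be opened. */
int oem_load_table(const char *path, unsigned char table[256])
{
    char line[128];
    int i, count = 0;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    for (i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
    while (fgets(line, sizeof line, fp) != NULL)
        if (oem_parse_line(line, table) == 1)
            count++;
    fclose(fp);
    return count;
}
```

A complete language file would yield 128 mappings, one for each code from 128 through 255; a lower return value from oem_load_table() hints at missing or malformed lines.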
If you don't know the codes for the proper Oem to Ansi translation of the desired language, ask your current Windows installation. With the appropriate language set in Windows, change to the bin directory of your oem2ansi installation folder and launch oemtab.exe. Assuming your installation directory is c:\temp\oem2ansi-1.0, do the following:
c:\> cd \temp\oem2ansi-1.0\bin
c:\temp\oem2ansi-1.0\bin> oemtab.exe
The desired .oem file will be created. The source code is pretty simple. The tool does nothing more than call the Win32 API function OemToCharBuff() for every single character code from 128 through 255 and store both values in a text file:
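for (i = 128; i < 256; i++)
{
    c = (unsigned char)i;                       /* Oem code to translate */
    OemToCharBuff((char *)&c, (char *)&c, 1);   /* translate one byte in place */
    sprintf(szLine, "%d=%u\r\n", i, (unsigned)(unsigned char)c);   /* e.g. "129=252" */
    strcat(szOutBuf, szLine);                   /* append the line just formatted */
}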
If you are interested in the complete source code for this small helper utility, just download file doemtab-10.zip.