COM in the 32-Bit World
COM made an interesting break with both operating systems: it supports only Unicode, and ANSI is left out in the cold. If you cannot speak Unicode at some level (even if that means nothing more than calling MultiByteToWideChar and WideCharToMultiByte to convert your strings), then you cannot speak to COM. With so much of even the basic functionality in the 32-bit Windows shell requiring Unicode, every application must do at least a little work in Unicode. Strings in 32-bit COM (OLESTRs and BSTRs) are always Unicode. Of course, most applications support it minimally by calling the conversion functions with the system default codepage (CP_ACP), the one codepage guaranteed to always be supported.
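As a rough illustration of what that minimal support amounts to, the sketch below uses Python's built-in codecs as a stand-in for the two Win32 calls. It is not the Win32 API itself: the helper names `ansi_to_wide` and `wide_to_ansi` are hypothetical, and `cp1252` is only an assumption about what the system default codepage (CP_ACP) might be on a Western European machine.

```python
# Stand-in for the Win32 conversion calls, using Python's codecs.
# Assumption: the system default codepage (CP_ACP) is Windows-1252.
ANSI_CODEPAGE = "cp1252"


def ansi_to_wide(ansi_bytes: bytes, codepage: str = ANSI_CODEPAGE) -> bytes:
    """Roughly the job of MultiByteToWideChar: codepage bytes -> UTF-16LE bytes."""
    return ansi_bytes.decode(codepage).encode("utf-16-le")


def wide_to_ansi(wide_bytes: bytes, codepage: str = ANSI_CODEPAGE) -> bytes:
    """Roughly the job of WideCharToMultiByte: UTF-16LE bytes -> codepage bytes.

    Characters the codepage cannot represent are replaced with '?'.
    """
    return wide_bytes.decode("utf-16-le").encode(codepage, errors="replace")


ansi = "£10".encode(ANSI_CODEPAGE)   # one byte per character in the ANSI codepage
wide = ansi_to_wide(ansi)            # two bytes per character in UTF-16LE
round_trip = wide_to_ansi(wide)      # back to the original ANSI bytes
```

The round trip is lossless here only because every character in the string exists in the chosen codepage; text outside that codepage would not survive the trip back to ANSI.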
The core operating system only added a few API calls to the list that would support both Unicode and ANSI (seen in Table 6.2).
Table 6.2 The Win32 API Calls for Which Windows 98 Added Unicode Support
What It Does
Appends one string to another
Copies a string to a buffer
However, many new interfaces were added, such as new shell extensions and integrated browsing enhancements. These are all COM interfaces and thus only support Unicode.
Windows Millennium Edition (Windows Me)
Windows Me did not really add much to the equation. According to Microsoft, it is the last version of the Win9x code base that will ever ship, but, in fairness, the company has been saying this since the OSR2 release of Windows 95. The basic issues I mentioned in connection with Windows 95 and 98 apply to Windows Me.
Two sets of APIs that have had Unicode support added in Millennium are those related to the Input Method Manager (IMM) API, which I discuss further in Chapter 8, and the Geographical Information Management (Geo) API. Geo is used by many of the Windows Me components that map locale information to a geographical location.
Data Storage Engines
The engines themselves, whether SQL Server, Jet, FoxPro, or other, initially stayed away from the Unicode world, preferring the provincial world where a single codepage is all that would be needed. Although both Jet and SQL Server could do their own string normalization in many cases (see Chapter 12 for more information on this topic), it was done only for performance reasons and not to support the notion of multiple codepages in the same file. Both products were a step beyond the operating system notion of the default codepage. You could explicitly choose to use any single codepage that the OS supported, but you were still limited to a single codepage.
The more recent versions of Jet and SQL Server, however, do support Unicode as a native format: In Jet, everything was moved to Unicode; in SQL Server, you could choose between ANSI and Unicode. Other engines (such as FoxPro) have no native Unicode support at the database engine level.
Data Access Methods
Unlike the engine itself, most of the data access methods (ADO, OLE DB, DAO, and RDO) are COM components that only support Unicode. So what do data layers do when they must speak Unicode if the underlying engine does not? Well, simply speaking, they convert to and from Unicode, using either the default system codepage, or in the case of FoxPro, Jet, and SQL Server, the codepage of their choice. Obviously there is a lot of room here for conversion errors.
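The two kinds of conversion error this invites can be sketched with Python's codecs. This is an illustration only, not how any of the data layers named above are implemented: `to_engine` and `from_engine` are hypothetical helpers, and the choice of the Greek (`cp1253`) and Western European (`cp1252`) codepages is an assumption made for the example.

```python
def to_engine(text: str, codepage: str) -> bytes:
    """Hypothetical data-layer step: Unicode string -> the engine's single codepage."""
    return text.encode(codepage, errors="replace")


def from_engine(raw: bytes, codepage: str) -> str:
    """Hypothetical data-layer step: engine bytes -> Unicode string."""
    return raw.decode(codepage, errors="replace")


greek = "γιατί"

# Case 1: the engine stores Greek text, and the layer reads it back
# with the same codepage. The text survives.
stored = to_engine(greek, "cp1253")
right = from_engine(stored, "cp1253")

# Case 2: the same bytes read back with the wrong codepage.
# Every byte decodes to *something*, so the result is silent mojibake.
wrong = from_engine(stored, "cp1252")

# Case 3: the engine's codepage has no Greek letters at all.
# Each character is replaced with '?' and the original text is gone.
lost = from_engine(to_engine(greek, "cp1252"), "cp1252")
```

Case 2 is the harder one to catch in practice, because nothing fails: the data simply comes back as a different string of the same length.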
The move to Unicode by the data engines not only made most conversion errors go away (if everything can stay in one format, there is nothing to convert incorrectly!), it also improved performance because so many conversion calls went away! Chapter 12 has more information on why and where there are sometimes still problems in this area.
The popular Visual Basic author Bruce McKinney once stated, “Someday there will be Unicode data file formats, but it might not happen in your lifetime.” How wrong this turned out to be! Over the first three 32-bit versions of Office, all the major applications (Word, Excel, Access, and PowerPoint) have moved to both Unicode file formats and Unicode executables. Even in the world of text files (which are usually stored in ANSI format), provisions have been made to avoid making assumptions about the codepage of the file.
To give a specific example, this very book was written, edited, and laid out by the publisher using Word 2000. Why? Because in many cases, I wanted to support multilingual text. I did not want the publisher to use QuarkXPress, a very popular program in publishing circles, because it has the exact same limitations I have been describing in other programs. I have had to deal with the limitations of such packages for years in the articles I have written (and Quark, Inc. definitely is a standard for many publishers), but for this book it was important to be able to treat all languages as equal. By moving to Word 2000, I am able to include Hindi text such as “आप यहाँ पर क्यों आना चाहते हैं?” or Thai text such as “ทำไมคุณถึงต้องเข้ามาชมเว็บไซต์นี้?” without requiring the use of special screenshots for each bit of text. I will discuss this further in Chapter 10, “Handling Localized Resources with Satellite DLLs.”
For the curious, translations for the previous Hindi and Thai texts are given in Table 6.3, in many languages (perhaps even yours!). These translations were produced for many of the locales used on the trigeminal.com Web site.
Table 6.3 Look, Ma, No Bitmaps! Many Ways to Say the Same Phrase (Showing Off the Capabilities of My Publisher!)
आप यहाँ पर क्यों आना चाहते हैं?
или защо Ви трябва да идвате тук?
That is, why would you want to be here?
örneğin; Neden bu sitede olmayı isteyeceginiz gibi?
d.h. warum lohnt es sich, hier zu sein?
i.e., pour quelles raisons désirez-vous explorer ce site?
δηλαδή, γιατί θέλετε να είστε εδώ;
כלומר למה בכלל תרצה להיות כאן?
d.w.z., waarom wilt u hier zijn?
m.a.o. vad gör du här?
porque é que tu queres estar aqui?
возможно это то, что Вам надо
ejemplo, ¿Porqué deseas estar aquí?
in altre parole, perché potreste voler visitare queste pagine?
cu alte cuvinte, de ce sunteti aici?
ஏன்நீ இஙுகு வரவேண்டு?
Windows 2000, known while under development as NT5, simply continued the tradition of NT 3.1, 3.51, and 4.0. It did pick up the new shell from Windows 98 and addressed many usability complaints. But from the globalization standpoint, it moved much closer to the worldwide EXE model throughout: there were no longer bug fixes that existed only for specific languages! Support for MUI (the MultiLanguage User Interface) proved that Windows 2000 was a worldwide operating system.
Some applications, unfortunately, are still stuck with ANSI, most notably Internet Information Server, but these applications have been clearly put on notice where they need to be heading: Unicode.
Yet another model was used for the smallest operating system: Windows CE is closest to COM in that it supports only Unicode at the API level. However, because only a limited number of applications support pure Unicode, and because a smaller device has only a limited amount of space for codepage translation tables, Windows CE applications are still limited in the number of codepages they can use. It is clearly, however, a step in the right direction.
Visual Basic in the 32-Bit World
And at last I come to the most important RAD tool in terms of this book: the 32-bit versions of Visual Basic! There are many issues that surround Unicode support in VB:
Of course, the order in which I have presented these points would lead anyone to believe the final answer to the question “Is VB Unicode?” would be “Yes, but…”, and maybe that is the best answer to give. Visual Basic is indeed Unicode, with its Unicode string storage and Unicode interfaces, but as the data engines and access methods learned, there is a lot more to supporting Unicode than making sure the front door supports it, especially if you want to get the benefits of Unicode. If you think about it, Visual Basic forms gain nothing from their Unicode interfaces, nothing at all. Why is that? Well, when they are used, a single codepage is required. Therefore, the only thing that the Unicode interfaces of the forms package give VB is compatibility with COM; none of the benefits inherent in Unicode, such as being able to support many languages/locales, are available here.