挪威的森林污的部分:Win32ASM学习
来源:百度文库 编辑:中财网 时间:2024/04/29 14:09:38
Win32ASM
1 INTRODUCTION 6
1.1 WHAT THIS DOCUMENT IS NOT ABOUT 7
1.2 WHAT THIS DOCUMENT IS ABOUT (AND PREREQUISITES) 7
1.3 KEEP THE BALL ROLLING... 10
2 PRODUCT CHOICES 11
2.1 CHOOSING AN ASSEMBLER 11
2.1.1 MASM 6.11a, 6.11d and 6.12 11
2.1.1.1 MASM availability (and uncertain future) 11
2.1.1.2 MASM capabilities 12
2.1.2 TASM 5.0 13
2.1.3 Other assemblers 13
2.2 CHOOSING A LINKER 13
2.3 CHOOSING A DEBUGGER 14
2.4 CHOOSING A GUI IDE 14
2.4.1 The bad news is... 14
2.4.2 The good news is... 15
2.4.2.1 Programmer's IDE for Windows 95/NT v2.3 15
2.4.2.2 Watcom's 10.6 IDE. 15
3 BUILDING AN ASSEMBLY LANGUAGE WIN32 APPLICATION 17
3.1 USING MASM 17
3.1.1 MASM Vs ML 17
3.1.2 MASM Documentation 17
3.1.3 Your MASM source code 18
3.1.3.1 Use of registers 18
3.1.3.2 Function call conventions 20
3.1.3.2.1 The naming convention 20
3.1.3.2.2 The parameter passing convention 21
3.1.3.3 Win32 (and other) function prototypes 22
3.1.3.4 The INCLUDELIB directive 23
3.1.3.5 Segments and sections 24
3.1.3.6 Alignment issues 25
3.1.3.7 END statement and Entry Point. 26
3.1.4 MASM options 27
3.1.5 Miscellaneous OS and systems issues 28
3.1.5.1 Beware of the CLI 28
3.1.5.2 Beware of the STD 29
3.1.6 Various MASM goodies 29
3.1.6.1 Data Types 29
3.1.6.2 Base and Index 30
3.1.6.3 Structures and Unions 30
3.1.6.4 Local directive 30
3.1.6.5 INVOKE through a function pointer 31
3.1.6.6 Global labels 32
3.1.6.7 Structured programming directives 32
3.1.6.8 Structure addressing 33
3.1.6.9 Use of SIZEOF & LENGTHOF 34
3.1.6.10 Use of TYPEDEF 34
3.1.6.11 Use of ALIAS 34
3.1.7 MASM bugs and shortcomings 35
3.1.7.1 Invalid code generation in INVOKE using 16 bit parameters 35
(or a mix of 16 and 32 bit)
3.1.7.2 The infamous 512 bytes buffer 35
3.1.7.3 INVOKE and forward references 36
3.1.7.4 Macro limitations 36
3.1.7.5 Listing generation 36
3.1.7.6 Missing conditions in structuring directives 36
3.1.7.7 Major flaws in the MASM macro language 38
3.2 USING LINK 39
3.2.1 Libraries 39
3.2.2 Debugging options 39
3.2.3 Linking an .EXE file 40
3.2.3.1 Linking a Console executable 40
3.2.3.2 Linking a Windows executable 40
3.2.4 Linking a DLL file 40
3.2.5 Advanced linking techniques 41
3.2.5.1 Grouped Sections 41
3.2.5.2 DLL forwarders 42
3.2.5.3 Weak Externals 43
3.3 DEBUGGING AN ASSEMBLY LANGUAGE WIN32 APPLICATION 43
4 VARIOUS GRIPES 44
4.1 THE ABSENCE OF LDT SUPPORT IN INTEL-BASED PLATFORMS 44
5 WIN32ASM TOOLKIT 49
5.1 THE EXAMPLE FILES 49
5.2 THE INCLUDE FILES 49
5.2.1 General Include files 50
5.2.1.1 Win32Inc.equ 50
5.2.1.1.1 UnicAnsi.equ 50
5.2.1.1.1.1 The UnicAnsiExtern macro: 50
5.2.1.1.1.2 The String macro 51
5.2.1.1.2 Win32Types.equ 51
5.2.1.1.3 Win32Defs.equ 51
5.2.1.1.4 Win32Strs.equ 51
5.2.1.2 Win32Res.equ 51
5.2.2 API header include files 51
5.2.2.1 CommCtl32.equ 52
5.2.2.2 CommDlg32.equ 52
5.2.2.3 GDI32.equ 52
5.2.2.4 Kernel32.equ 52
5.2.2.5 TAPI32.equ 52
5.2.2.6 User32.equ 52
5.2.2.7 WinMM.equ 52
5.2.2.8 WinSpool.equ 52
5.3 THE MACRO FILES 52
5.3.1 Instr.mac 52
5.3.1.1 Structuring directive extensions 52
5.3.1.1.1 .BLOCK & ENDBLOCK 52
5.3.1.1.2 FOREVER 53
5.3.1.1.3 Condition mnemonics in structuring directives 53
5.3.1.2 Saving and restoring registers 53
5.3.1.3 UnusedParm 54
5.3.1.4 Internal consistency checking macros 54
5.3.1.4.1 MUSTBE 54
5.3.1.4.2 MUSTBEM 55
5.3.1.4.3 MUSTBEMGLE 55
5.3.1.4.4 SHOULDBE 55
5.3.1.5 Enumeration macros 55
5.3.1.6 Breakpoint macros 56
5.3.2 InitExit.mac: Runtime Initialization / Termination Macros 56
5.4 THE SERVICE ROUTINES 60
5.4.1 FatalError 60
6 BIBLIOGRAPHY 60
6.1 [BOOTH, 96.01] 60
6.2 [BRAIN, 96.01] 60
6.3 [INTEL, 95.01] 61
6.4 [PETZOLD 96.01] 61
6.5 [PIETREK 95.01] 61
6.6 [RECTOR & AL, 96.01] 61
6.7 [RICHTER 96.01] 61
6.8 [RICHTER 97.01] 61
6.9 [SCHULMAN 94.01] 61
Disclaimer
This documentation and associated files is provided "as is" and any express or
implied warranties, including, but not limited to, the implied warranties of
merchantibility and fitness for a particular purpose are disclaimed. In no
event shall the author be liable for any direct, indirect, incidental, special,
exemplary, or consequential damages (including, but not limited to, procurement
of substitute goods or services; loss of use, data, or profits; or business
interruption) however caused and on any theory of liability, whether in
contract, strict liability, or tort (including negligence or otherwise) arising
in any way out of the use of this software, even if advised of the possibility
of such damage.
Distribution
Since this documentation and associated files are declared as Public Domain,
you are allowed to distribute it without any restrictions on any storage or
communications media, as long as you use the distribution self-extracting file
without any modification, using the same file name (Win32ASM) as the original
file.
Trademarks
All brand names and product names used in this documentation are trademarks,
registered trademarks, or trade names of their respective holders.
1 Introduction
==============
Microsoft never documented the way to develop applications for Win32 Intel
platforms using assembly language. The only assembly language documentation
that Microsoft ever produced on the topic is the Win95 DDK, dedicated to the
development of virtual drivers (VxD) in the Win95 environment, and it scarcely
covers any Ring 3 application programming matter.
In addition; development in a Win32 environment requires the use of numerous
reference data, such as function prototypes; structures, type and constant
definitions, macros and other data. Microsoft released these items as C header
files (.H files) in the Win32 SDK, but no equivalent files were ever published
for assembly language.
This complete lack of documentation and tools propagates the illusion that
developing in assembly language for Win32 is something that simply could not be
done. This is aggravated by answers commonly provided by Microsoft's developer's
support staff, claiming that "assembly language programming for Win32 is not
supported by Microsoft" and even more often that "No, it cannot be done."
The truth is very different. Programming in assembly language for Win32 is
indeed very possible,
* it is just as simple to achieve as with a High Level Language (HLL),
* there is nothing magic about it,
* the tools are the same, and
* with some initial explanations and road-mapping, the considerable C-oriented
documentation that Microsoft (and others) have released can be used to program
in assembly language.
Moreover, programming in the Win32 environment is paradoxically considerably
easier now than it has ever been:
* The Win32 API provides a vastly improved equivalent to the standard runtime
library assembly language programmers never had before,
* the new operating environment, with its true multi-tasking services, provides
a new context and new challenges for high performance assembly language
applications,
* the Win32 environment offers the assembly language programmer additional
debugging aids, tools and protection that never existed before.
Two critical keys are missing today to make the above facts obvious: Assembly
specific documentation, and include files describing various symbols such as
function prototypes, typedefs, structures and constant definitions.
This quick documentation tries to fill a part of the first shortcoming, and we
hope the accompanying set of include files will be one step in the right
direction toward remedying the second.
The bottom line is:
Programming in assembly language for the Intel PC platform has never been
easier than it is today in the Win32 environment, and we hope this document
will help you take advantage of the new opportunities this opens.
1.1 What this document is not about
* This document is not a tutorial on assembly language programming
* This document is not a tutorial about the Win32 API or Win32 programming
issues. We realize that there is a need for initiation textbooks on the
topic, but we believe that the first step to accomplish is simply to show
that assembly language programming on Win32 platforms is easy. Once this
point becomes more obvious, we hope that other individuals will start
developing new material (or adapting from existing bases).
* This document does not discuss the relevance of assembly language programming
in today's world, nor the compared benefits and drawbacks of assembly language
programming vs. HLL language programming. We assume that anyone reading this
document has the intellectual ability to decide
* whether assembly language is relevant, useful, fun, efficient, cost effective
and/or needed, either generally or in any specific case, and
* whether the much touted benefits of platform independence are relevant to
one's particular situation, application and marketing strategies.
* This document does not reproduce information that can be found in the
Microsoft documentation. Although we realize that it would be easier for the
reader, it creates at least three problems:
* The Microsoft material is copyrighted
* The material to copy and/or paraphrase would be huge,
* As time and new releases go by, the Microsoft material is frequently updated,
and we simply could not keep up with it.
For these reasons, whenever relevant documentation already exists, this document
will only provide "pointers" to the Microsoft documentation (or to any other
documentation, for that matter).
* Finally, this document is not be considered as a finished product, nor are the accompanying files. This only reflects an ongoing work, subject to changes, corrections and (many) potential improvements. Contributions are welcome.
1.2 What this document is about (and prerequisites)
This document explains ways to develop, link and debug arbitrary large assembly language projects in (and for) an Intel WIntel32 environment (i.e. Win95 or NT for Intel platforms). It describes various tricks, tips and recipes that we collected the hard way, by trial and error, documentation reading, debugger abuse and various other techniques.
It is intended for assembly language programmers
* who are already fluent in Intel (32-bit) assembly language,
* who know enough Win32 programming to write (or at least read and fully
understand) a Win32 program in the C language,
* who have access to and understanding of the Microsoft's Win32 SDK
documentation and tools and
* who are looking for all the useful details that Microsoft carefully refrained
from explicitly documenting.
The "official" information about Win32 programming can be found in Microsoft
Developer's Network (MSDN). The Win32 SDK is delivered with MSDN Level 2
(a.k.a. "Professional") and upper level subscriptions.
Building applications using the techniques indicated in this documentation
requires access to Microsoft's Win32SDK documentation and Microsoft's Win32
import library (.LIB) files. Both are available as part of Microsoft's
Microsoft Developer's Network (MSDN). Both are also distributed with various
32-bit compiler sets, but as this document assumes and documents the use of the
Microsoft Assembler, you must insure that the import .LIB you are using are
somehow compatible with the regular Microsoft programming tools.
Talking about documentation, there is one public domain piece of documentation
that anyone interested in Win32 programming should get:
In March 1996, Sven B. Schreiber released to the public domain a wonderful set
of tools, with complete source and documentation. The whole set is available on
the Internet at Sven's site, as file
ftp://ftp.orgon.com/pub/asm/WALK32_1.ZIP
It is also available from several other sources (try your favorite search engine
and/or Archie program), and on Compuserve, GO PCPROG, Library 1 (Assembler)
I could not use Sven's tools, first because I found them too late and then
because I needed Microsoft's COFF format compatibility and symbolic debugging
support.
But I still wish I had found WALK32 earlier than I did. In addition to the
collection of tools it provides, WALK32 includes remarkable documentation about
Win32 and Win32 programming that can be used in any context. The documentation
exposes many sound and clean programming techniques, including a few that I
wish Microsoft had thought about when they originally designed the Windows API.
Whatever the tools you end up using, there are a lot of ideas, techniques and
code that you can reuse from WALK32, and you shouldn't pass it.
Sven also published an article in the November 96 issue of Dr Dobb's Journal
that shows a special application of WALK32 in Netware programming. The source
code (including a mini WALK32 environment) can be found on www.ddj.com
Lots of other information can be found in any good computer book shop, as Win32
is a fashionable enough topic these days. But keep in mind that:
* Many books claim to cover Win32 programming; but not so many cover it
properly.
* Many books are mere copying and/or rephrasing of Microsoft documentation and
examples.
* No book so far really covers Win32 programming for assembly language. So
assembly language programmers have to read at least another programming
language in order to find the information they need about Win32. The "other
language" that is needed is C. The official Win32 SDK by Microsoft describes
most of the interface to Win32 in term of C language and C data structures.
All function prototypes, structure descriptions and constant definitions are
described in ".H" (C header) files in the MS SDK.
* There are many other books about Win32 and other languages (C++, Delphi and
Visual Basic come to mind). We do not recommend attempting to use any of
these books for our particular purpose, as the languages they cover offer a
higher level of abstraction than C: As such, they tend to hide interfacing
details away from the programmer and to make transposition to assembly
language harder. It often becomes mostly impossible to relate the examples
to the underlying machine (assembly) implementation. Be particularly careful
when selecting new books about Win32 programming, as an increasing number of
books cover the topic exclusively through C++ and MFC without ever mentioning
this fact on the cover.
The short bibliography at the end of this document mentions a few reference
books and magazines we found useful (and sometimes more) in discovering
assembly language issues for Win32.
If we had to pick only ONE third party Win32 book, it would likely be "Advanced
Windows (Third edition)" [Richter 97.01]. This book clearly explains all the
important mechanisms in Win32 (with C code examples), covers most differences
between the Win95 and NT (including NT 4.0) implementations, and exposes many,
many of the pitfalls and oddly documented aspects you have to know when
programming for Win32. Be warned: This book is probably not for the beginning
Windows programmer. Beginning Windows programmers might want to start with
"Programming Windows 95" [Petzold 96.01].
Finally, Matt Pietrek's "Windows 95 System Programming Secrets" [Pietrek 95.01]
covers many aspects of Windows 95 internals and contains lots of invaluable
information about many Win32 topics.
For those readers who own a Win32 compatible HLL compiler, another source of
information could sometimes be the .ASM files that are delivered with the
source of their runtime library. Some interesting pieces can be found there,
possibly bringing information about some advanced topics like Structured Error
Handling (SEH).
Last but not least, this document refers to a number of advanced (and sometimes
not so well known) features of MASM 6.1x. It assumes that you have access to the
MASM Programmer's Guide, either in its paper form or its electronic form. The
electronic form is available in MSDN Archive Edition, "Product Documentation/
Languages/ Macro Assembler 6.1 (16-bit)". Do not be fooled by the "16-bit"
mention in the table of contents of the electronic documentation: this
documentation is the image of the latest printed documentation and does handle
the 32-bit features of MASM as well. It is very unfortunate that Microsoft
decided to bury the MASM documentation in the archive CD-ROM rather than in
the MSDN Library were it belongs, since MASM 6.1 is a dual, 16-bit and 32-bit
product, and its 32-bit part is alive and well. It is specially inconsistent,
now that Microsoft has brought MASM back to the MSDN Universal CD-ROMs, that
the MASM documentation has not been restored as well.
At the time of this writing, there is a section about MASM in the MSDN Library
[Product Documentation \ Languages \ Macro Assembler 6.11 for Windows NT
(32-bit)], but it unfortunately contains no more than a few release notes about
MASM 6.11.
For those using the electronic documentation in MSDN Archive Edition, and since
the electronic documentation does not provide table of contents and/or page
numbers, we will attempt to reference the MASM documentation through its
"hierarchical path", as describing the tree organization that appears using
MSDN's INFOVIEW viewer.
References to the documentation will thus look such as in:
"Chapter 1 Understanding Global Concepts/ Language Components of MASM/
Statements".
Unless specified otherwise, all references will be to the Programmers Guide, in
MSDN Archive Edition, "Product Documentation/ Languages/ Macro Assembler 6.1
(16-bit)"
At this time, we (unfortunately) do not know of any third party book that could
be considered as an exhaustive (and, one would hope, improved) replacement for
the MASM Programmer's Guide.
1.3 Keep the ball rolling...
This documentation can be improved.
If you feel that you hit a stumbling block in Win32 programming that you think
is related to the use of assembly language, please Email me at the address
below. If the topic falls indeed in the category we are trying to cover; and if
we can provide a solution, we will add it to this documentation.
If you discover some interesting piece of information, if you work out some
solution to a tough Win32 assembly-language problem or more generally feel you
have something that could be useful to other Win32 assembly language programmers
and that you would like to contribute, please Email it.
My own, single resources to improve the situation as we know it today are
limited. But there are undoubtedly enough of us assembly language programmers
around the world to create the missing pieces and make life easier to all.
Philippe Auphelle
philippe@irci.ihub.com
2 Product choices
2.1 Choosing an assembler
2.1.1 MASM 6.11a, 6.11d and 6.12
2.1.1.1 MASM availability (and uncertain future)
At the time of this writing, and to the best of our knowledge, MASM 6.11a is the
latest commercially available incarnation of Microsoft Macro Assembler. MASM.
MASM 6.11a is not the latest release of the software, though: it can be patched
to 6.11c, 6.11d and 6.12 (see below).
Until recently (September/ October 1997), one could question Microsoft's
willingness to keep on supporting MASM:
* MASM did not appear anywhere in Microsoft's Web site, not even in the
developer product list.
* MASM did not appear as a product in any of the MSDN Universal CD-ROM disks,
although MSDN Universal contains by definition all of the current Microsoft
development products.
* The documentation for MASM only appears in the MSDN Archive edition, "Product
Documentation/ Languages/ Macro Assembler 6.1 (16-bit)." Since the MSDN Archive
CD-ROM is dedicated to obsolete products, one can question the actual position
of MASM in the Microsoft product line.
* Visual Studio (aka Developer Studio), Microsoft's universal IDE (Integrated
Development Environment), has provision for supporting nearly all Microsoft
translators. "Nearly", that is, except for MASM, and even though the current
version of MASM supports just everything it needs to in term of IDE prerequisite
functions: Debugging, local / global symbol handling, COFF code format, etc...
(more about this later)
Then by the end of August 1997, several things happened:
Microsoft posted a patch on the Microsoft Web Site. This patch turns any MASM
6.11, 6.11a, or 6.11d. to the brand new MASM 6.12. The patch is available at:
http://support.microsoft.com/support/kb/articles/Q173/1/68.asp
At about the same time, MASM 6.11a was rehabilitated as a Microsoft product, as
can now be seen at
http://www.microsoft.com/products/developer.htm
http://www.microsoft.com/products/prodref/450_ov.htm
Finally, MASM 6.1 was (at last!) included in the MSDN Universal Edition
(Level 4), starting with the October 97 issue. The CD-ROM version also contains
the patches required to build a MASM 6.12 image.
So at the time of this writing, MASM 6.12 is the latest version of Microsoft
assembler.
According to the README.TXT file delivered with the 6.12 patch, 6.12 corrects a
number of the bugs that plagued 6.11, and brings Pentium Pro (a.k.a. 686) and
MMX instruction set support to MASM.
Most of this document was written during the MASM 6.11d area, so a few of the
bugs and limitations that we mention here might have been fixed in 6.12.
Likewise, new bugs that could have been implemented in MASM 6.12 are not
covered yet.
2.1.1.2 MASM capabilities
MASM 6.1x contains a number of advanced features that are very useful to the
Win32 programmer. Several of these capabilities were actually introduced with
6.0 and 6.1 (but not all of them worked properly back then). Among these
features are:
* Full Win32 support. MASM 6.1x is a true Win32 console application, and
supports long file names for source, object, listing and symbol files.
* Support of both objects formats, Intel OMF ("old") and COFF (Win32).
* Support of function prototyping (headers), as well as assembly time parameter
checking.
* HLL interfacing features, with automatic handling of parameters, both at call
time and inside the called function. There is even TYPEDEF support.
* Support of several ways of handling external definition, including support of
"soft" externals through the EXTERNDEF directive (users of the late and missed
OPTASM top assembler will appreciate this one).
* Generation of "decorated" names allowing link-time checking of the number of
parameters passed to a function. This can be used with Win32 import libraries
to prevent parameter list errors from crashing processes
* Support of local (stack) variables symbolically, including at debug time.
* Support of structured programming directives. These brings the capacity of
building "GOTO less" (JMP less would be more appropriate here) programs.
Traditional labels and jump instructions are of course still available and can
be used to construct assembly language programs in the great tradition of
assembly spaghetti style, that gave assembly language maintenance its
incomparable reputation.
* Support of the INCLUDELIB directive to simplify the linking process and allow
the automatic invocation of import libraries by the linker.
* Last but not least; MASM is now a very fast assembler when running under
Win95 or NT. Providing the Win32 system has enough memory, its assembly speed
doesn't significantly suffer from the compiling of very large and numerous
include files and/or complex macros. When initiated through my editor, a typical
source with a few includes and a number of complex macros assembles in a few
seconds on my relatively slow Pentium 60. This includes MASM load time, and the
fact that listing file generation is always turned on in my configuration.
MASM is far from being perfect, though. It would certainly require a few
improvements in some areas (more on this later). But it is a very good tool as
it is.
2.1.2 TASM 5.0
TASM is Borland's assembly language.
TASM 4.x already supported 32-bit programming (it required a patch to run in a
Win95 DOS box, though).
When Borland announced the brand new TASM 5.0 in March 96, I rushed my order,
thinking it would solve most of the remaining problems I had with MASM (like
TASM had done in the past). Unfortunately, it turned out to be different: TASM,
that once used to be more compatible with MASM than MASM itself, now fails to
fully support a number of the new MASM capabilities that appeared with MASM
6.10. Several of these being features that make programming to Win32 much
easier, I regretfully had to go the MASM way rather than the Borland one.
2.1.3 Other assemblers
To the best of our knowledge, there is today no other 32-bit assembler that is
suitable to full Win32 support (debugging code included), and that has a syntax
compatible with that of MASM.
The best Microsoft compatible assembler ever, OPTASM from SLR systems,
unfortunately disappeared without ever reaching 32-bit state.
WATCOM has an assembler, but that is too far from MASM compatibility to be
really useful out of the WATCOM world.
There is also a GNU assembler, but the exotic GNU syntax (AT&T) is way too far
from the original Intel/Microsoft syntax commonly supported in the Wintel world
to fit my integration needs.
2.2 Choosing a linker
The choice of the Microsoft assembler and the debugging formats it supports
induces the choice of a Codeview compatible linker. The choice of the COFF
format reduces the choice yet a bit more.
The Microsoft linker fits the bill. I successively used :
Link32.exe 1.00 Probably came from the NT SDK/DDK
Link.exe 3.00.5270 From VC++ 4.0
Link.exe 4.20.6164 From VC++ 4.2
Link.exe 5.00.7022 From VC++5.0
The WATCOM linker might work too, but since the Microsoft linker is widely
available and did what I expected, there was little incentive to look for
something else, and I didn't check it further. At the time of this writing,
other linkers I know of either don't support the Codeview debugging format
(Borland) or only support OMF object format (Symantec, Borland).
Those deciding to favor the OMF object format might want to consider the
excellent, feature rich and superfast OPTLINK linker: As far as I know, it is
not sold anymore as a stand alone product but now belongs to the Symantec C++
development suite.
2.3 Choosing a debugger
There are three debuggers I know of that offer full symbolic debugging
compatible with the MASM / Microsoft Link combination:
* Microsoft own Developer Studio GUI debugger,
* WATCOM's GUI debugger, sold with their WATCOM C++ package. I used the one
sold with 10.6 for a while and it worked quite well. Mostly comparable to the
Microsoft debugger, although generally faster to load and operate. I haven't
upgraded my WATCOM compiler to version 11 yet, so I can't tell whether anything
has changed in the latest version.
* Numega's SoftIce 3.x (or upper) debugger. This one sits in a class of its
own. It supports full symbolic debugging like the two previous ones, has an
order of magnitude more capabilities than any other Win32 debugger and allows
tracing inside the OS, access to the OS internal structures and much more. It
is not a GUI debugger: it will either switch the screen to character mode when
the debugger is entered (preserving the GUI screen), use an alternate screen
(off a second video adapter, either monochrome or VGA) or allow remote control
through an other machine.
2.4 Choosing a GUI IDE
By the current definition (and available products), a GUI IDE (Integrated
Development Environment) is a piece of software that consistently puts together
an editor, a "make" facility, a linker, at least one language translator and a
debugger.
The choice of an IDE ended up being all too simple:
2.4.1 The bad news is...
I currently know of no "full" IDE that can directly and naturally home MASM
development under Win32 (I'd love to be proved wrong on this one!)
One could expect that Microsoft's overhyped Visual (Developer) Studio would
have the minimal support required to integrate Microsoft's own MASM. But
unfortunately, as I mentioned earlier, at the time of this writing, it doesn't.
Obviously, the question was asked often enough to bore to tears the folks at
Microsoft's developers support. So they published a Knowledge Base article on
the topic:
Using the Development Studio or Visual Workbench with MASM
Article ID: Q106399
Revision Date: 07-DEC-1995
The article presents four pitiful kludges requiring the user to do most of the
work by hand for each project to be managed through the IDE. But unfortunately,
Microsoft didn't bite the bullet and incorporate MASM inside Developer (Visual)
Studio. Someone at Microsoft has still to realize that the whole purpose of the
IDE is to allow the programmer to merely click or drag'n'drop source files in
the module tree, not to impose further configuration chores.
2.4.2 The good news is...
With quite a bit of work, it is possible to assemble a set of tools that has
many of the characteristics of a full-fledged IDE. Once the initial integration
is done, creating new projects and managing existing ones is just as simple as
one would expect using a specially designed IDE.
2.4.2.1 Programmer's IDE for Windows 95/NT v2.3
First, here is a site that I recently found and that everyone should know about:
http://www.execpc.com/~sbd/
This page, "Software By Design", belongs to Gregory Braun. Gregory Braun wrote
a whole collection of small and beautiful freeware utilities, most of them in
both 16 and 32-bit versions.
Among these, PMAN32.ZIP, a "Programmer's IDE for Windows 95/NT v2.3", a very
simple but nicely crafted general purpose programmer's IDE, designed to
synchronize a command line code translator such as MASM with an editor, a
debugger, a linker and a make program. It looks like PMAN32 needs Link and
Nmake (because it generates directives files for these two utilities).
PMAN allows one to configure a "project", defining the compiling and linking
parameters. PMAN holds the list of modules and generates the Nmake file and
link parameter file. PMAN relies on file associations to let you switch from
one file to the next by clicking on modules in its file list. Well, just try
it: This is a no risk offer, it's an 80K download.
Gregory Braun also developed a free hex editor, a free toolbar, a free notebook,
a free CardFile-like phonebook / dialer, and even a game, BattleStar. Last time
I checked, there were 18 different nice, well-behaved programs, all under 100K,
requiring no setup, runtime, bulky DLL and or complex installation. And all the
ones I tried worked great.
2.4.2.2 Watcom's 10.6 IDE.
The Watcom10.6 C/C++ compiler comes with an IDE.
The Watcom10.6 IDE is not as flashy as the Borland and Microsoft ones. But it
contains nearly everything that's needed for the job.
"Nearly" only, because the way one configures the Watcom IDE is by modifying
a set of text files (.CFG extension), and none of the syntax of those files
is officially documented. The configuration syntax is quite complex, the
configuration files are large and customization is quite painful.
In other words, Watcom created their IDE as a customizable tool but
unfortunately, did not intend to go as far as making it a fully open, tool
independent program.
But with a some imagination and the help of a user-contributed file found on
the Watcom CompuServe forum and documenting part of the .CFG files, it was
possible to tailor the WatcomIDE to make it use MASM, LINK and all their
options instead of native Watcom tools.
Once this had been done, replacing the support for the Watcom debugger with
that of the Developer Studio or SoftICE was not such a complex matter.
The benefits are:
* The resulting IDE uses its native WATCOM make program and provides a graphical
interface to it (no fiddling with yet another disgusting "make" syntax).
* After proper configuration, The Watcom IDE gives access to all of the options
for all tools, (MASM, LINK,... ) as radio buttons, checkboxes, edit boxes, etc.
These are maintained individually on a file by file basis and can be changed
through the GUI interface.
* Finally, the Watcom IDE supports multiple targets and cascaded dependencies,
like a set of libraries and several .EXE files. If you modify one of the source
files for a library, and this library is used to link an .EXE files, rebuilding
the .EXE will first automatically rebuild the library (and recompile the source
file). Unfortunately, there does not seem to be support for non-Watcom include
files.
WATCOM supplies a text editor, but I didn't like it too much. I personally use
American Cybernetics' Multi Edit for Windows (MEW). It has off-the-shelf
support for all the language translators I ever heard of, plus many more I never
thought could even exist. It does in-editor compiles either synchronously or
asynchronously. It has myriads of customization capabilities, all accessible
through nice hierarchical menus. And if that were not enough, it has a full
macro language, and macros are provided with sources so you can change them to
your will (I actually hardly ever needed this).
Last but not least, MEW has built-in support for several IDEs from various
vendors, including the WATCOM one. Practically, this means that clicking on an
error line reported by MASM to the WATCOM IDE brings up the MEW editor with its
cursor pointing to the right line number. Pressing the "build" button in the
IDE automatically directs the editor to save the source files before the IDE
launches its "make" session. Etc...
Last time I checked, you could get an evaluation copy of MEW 7.1x from
www.amcyber.com. And NO, they don't bribe me for plugs!
Alternately, here is freeware bargain:
There are dozens of text editors available on the Internet, as substitute for
NotePad and/or WordPad. But there are not so many that are good programmer's
editors.
Here is the best we found:
"Programmer's File Editor" (PFE32), written by Alan Phillips. PFE32 is a
full-fledged programmer's editor and supports a compiler / assembler, keyboard
macros and other features. Last but not least, there is even an Alpha and a
PowerPC version.
PFE32 can be downloaded from
http://www.lancs.ac.uk/people/cpaap/pfe
and many other sites (hint: use an Archie search).
At the time I'm writing this, the name of the file is PFE0701I.ZIP, but this is
likely to change as new versions are published. Go to the URL above if in doubt.
The author can also be reached at
A.Phillips@lancaster.ac.uk
It was a lot of boring work to put all the IDE pieces together (specially
customizing the Watcom IDE), but retrospectively, the result was probably worth
the effort. I just wish one day, the good people at American Cybernetics bit
the bullet and added to their great editor a fully configurable GUI "make", as
well as the minimal additional support required to interface with the most
common linkers and debuggers. This would turn a great editor into the first
fast, compiler-independent GUI IDE, something the development world is missing.
If anyone out there has found any useful, fully GUI combination of tools
covering the same (or a larger) area, I'd love to hear about it!
3 Building an assembly language Win32 application
3.1 Using MASM
3.1.1 MASM Vs ML
The latest incarnation of MASM is not MASM.EXE, as once was. The true name of
MASM is now ML.EXE.
MASM.EXE still exists in Microsoft's Assembly language package, but only as a
mere shell that provides command line compatibility to older versions of MASM.
Both the MASM and ML names are used in this text and always refer to the latest,
greatest ML.EXE.
The latest version of ML is version 6.12.
MASM 6.11d is available on MSDN Level 2 (and higher levels), on the NT DDK
CD-ROM, in the \DDK\BIN\I386\FREE directory (files ML.EXE & ML.ERR), and can
be turned into 6.12 by a patch we mentioned above.
The previous version, 6.11c, is also available on the Win95 DDK CD-ROM, but I
don't know of any valid reason to use it rather than 6.11d (or 6.12).
3.1.2 MASM Documentation
Anyone who wants to program in assembly language might want to buy the Microsoft
MASM 6.1x package, whether or not one already owns a copy of ML.EXE through the
MSDN CD-ROMs: The reason is the Programmer's Guide book included in the official
MASM package. This book doesn't exist in any other current source (specifically,
it does NOT exist in MSDN Library, unlike all other Microsoft development
products documentation). Nor do we have any third party substitute, like a good
reference book on assembly language covering all the features of ML.
The worst part of the story is that the MASM programmer's guide is truly a
terrible piece of documentation: It contains many invaluable little pieces of
information. But they have been carefully buried all over the book in weird and
absurd places, like Easter eggs, apparently to make absolutely sure only people
very serious about assembly would possibly find them. That is, if they really
tried hard enough... And no other book but the MASM Programmer's guide documents
these things... The other manuals in the pack are obsolete and mostly useless
for Win32 programming, as are most other programs on the MASM 6.1x distribution
diskettes.
Another thing you need to do to complete your quest for MASM documentation is
to search the whole MSDN library CD for the string "6.11." This will pull
various items that include READMEs and knowledge base articles, detailing a
number of small improvements, limitations and features that are not mentioned
anywhere else. Some of these knowledge base articles are either obsolete, or
wrong, or both, but heck, you can't win all the time.
Finally, the README file included in the MASM 6.12 patch contains a number of
documentation updates and clarifications (some of them identical to previously
published Knowledge Base articles.)
Read them carefully.
3.1.3 Your MASM source code
3.1.3.1 Use of registers
Upon return from a Win32 function, the function return value, if any, can be
found in EAX.
All other values are returned through variables passed in the function parameter
list you defined for the call.
A Win32 function that you call will always preserve the segment registers and
the EBX, EDI, ESI and EBP registers. Conversely, ECX and EDX are considered
scratch registers and are always undefined upon return from a Win32 function.
EAX is never preserved either, and as we saw above, it most of the times
contains a return value. Otherwise, it is void.
This convention is derived from the 32-bit PASCAL convention for register use.
A little known fact is also that a Pascal procedure expects the direction flag
to be clear upon entry, and must keep it clear upon exit (MASM Programmer's
Guide, page 311).
If your assembly language PROC uses any of these precious registers, you might
have to preserve them too. The simplest way to do so with MASM is by using the
USES clause of the PROC directive, as follows:
Foo PROC USES EBX ESI ;Proc only changes EBX and ESI internally.
MASM will generate the appropriate PUSHes upon entry and the required POPs
automatically before each subsequent RET instruction in the PROC. Of course,
since you are undoubtedly programming in a structured way, there will be a
single exit point to your function, a single RET instruction, and this will
never generate more than the minimum number of POPs.
EBP is a special case, that you might not find too often in the USES list.
If your function uses any local data (aka "dynamic" data, defined on the
stack through the LOCAL statement), or if your function is called with stack
parameters (declared in the PROTO / PROC definition), then MASM will generate
the appropriate stack frame. It will set EBP as its addressing base to access
both local variable and parameters. In this situation, MASM will also
automatically generate code to save and restore EBP, so it would be a waste to
mention the EBP register in the USES list.
Be especially careful to NOT change EBP in any PROC that is defined as taking
parameters and/or using local data - or at least, to use EBP very carefully in
such cases. Using the command line switches /Fl /Sg creates an expanded listing
file showing all generated instruction and helps in mastering these delicate
situations.
The above rules about register preservation might or might not apply to your
own functions, though.
The general rules are as follows:
Win32 functions that you call from your own code do not care about the entry
contents of the EBX, EDI, ESI and EBP registers.
Win32 functions currently do care about the content of the segment registers,
though, assume them to follow the holly FLAT model, and don't bother reloading
them upon entry into system code. In other words, if you ever happen to play
with segment registers, Win32 is very likely to expect the segment registers
upon function entry to be in the same state as they were when your process
initially got control.
When calling functions that you wrote from your own code, it is obviously your
own business to decide whether the calling function, the called function or
neither of them will save and restore registers. You might be able to spare
quite a few cycles by only saving registers when you know it is really needed:
it is not because MASM gives you some HLL-like facilities that you should start
generating code like that of a compiler!
On the other hand, functions that you wrote and that you register as a callback
with any Win32 or other function external to your code should always play by
the rules: The upper level Win32 function that will eventually call your code
certainly doesn't expect you to change any of EBX, EDI, ESI, EBP and the segment
registers before returning control. So callback functions that you write should
always return with these registers unaltered.
The very same rule applies to the DLL functions you export: the code calling
your DLL expects your code to play by the rules and respect EBX, EDI, ESI and
EBP.
A quick one about the segment registers: you will probably not need to do
anything with the segment registers. As you certainly know, Win32 uses the
FLAT model, where all segment registers contain descriptors mapping the whole
logical address space your application (process) sees. The way the flat model
was implemented by Microsoft is too restrictive in my opinion, and deprives the
programmer of a very useful and efficient native CPU mechanism, as is explained
below in "The absence of LDT support in Intel-based platforms", page 41.
But this is unfortunately the way it is today. Unless proved otherwise, it's a
no-no to change DS, ES, CS, SS or FS. The GS register doesn't seem to be used,
but to my knowledge, nothing has been published on the topic so far and I did
not investigate any further yet personally. If you ever try to play with GS,
you should at the very least realize that you are doing it at your own risks.
In any case, in the great tradition, although Microsoft didn't let the Ring 3
application programmer use segmentation and segment registers, they permitted
themselves to use it, even in your own Ring 3 code:
If you look at the FS register, you'll notice that at any time, the FS register
contains a valid selector. This has been documented in [Pietrek 95.01]. The
selector in FS actually points to a Thread Information Block (TIB). The TIB
contains various thread-dependant items. The contents of the TIB are used by
many Win32 syscalls; and changing the FS register is very likely to crash your
process automagically - and very soon.
3.1.3.2 Function call conventions
Interfacing with Win32 is designed for HLLs, and MASM makes provision for
predefined "Function Call Conventions", that let it take care of a few boring
matters for you. MASM allows you to pick one from C, PASCAL, STDCALL, SYSCALL.
But Win32 always requires and only supports STDCALL, with one exception
(explained below).
The function call convention is used in conjunction with the PROTO, PROC and
INVOKES directives. You define a default for the whole module by using the
.MODEL directive ahead of your source file, and this saves you from repeating
the calling convention for each subsequent PROC or PROTO you define.
.386 ;(could be 486, 586,or 686 too)
.MODEL FLAT,STDCALL
Defining a given default calling convention in .MODEL still allows you to
override the default for any given function. The calling convention you
explicitly state in specific PROC and/or PROTO directives overrides the one
in .MODEL
The calling convention defines two different aspects of function interfacing:
* Naming
* Parameter passing
STDCALL is an hybrid of the C convention, for the naming and order or
parameters of the stack, and PASCAL convention, for the removal of parameters
from the stack.
3.1.3.2.1 The naming convention
The naming convention defines the way the names you assign to symbols in your
modules appear outside of your module, i.e. at link time.
The STDCALL naming convention tells MASM to "decorate" subsequent PROC names,
prefixing them with an underscore (_) and postfixing them with a @ sign,
followed by the number of bytes in the parameter list (this is the type of
decoration the Microsoft C compilers generate).
For instance, a function Foo that would take two DWORD parameters would have
its name turned into the external (link time) name
_Foo@8
by the assembler (since two DWORD generate 8 bytes of parameters). This trick
is used by the linker to perform some brute force (but very useful) parameter
consistency check against the Win32 import libraries. If you inadvertently
specified the wrong number of parameters in the PROC and PROTO definition,
MASM would generate the wrong "@x" postfix value and the link would fail with
an undefined reference, pointing at the error. Believe me, this is much better
than getting some unpredictable behavior at runtime because of a parameter
error, disguised into a stack error...
3.1.3.2.2 The parameter passing convention
The parameter passing convention defines the way the parameters in the parameter
list are pushed on the stack, and the way they are removed from the stack.
The STDCALL calling convention defines that parameters are pushed on the stack
right to left (i.e. the last parameter of the list is pushed first). It also
defines that the callee (rather than the caller) will remove the parameters
from the stack.
Having the callee remove the parameters has two benefits:
* it saves an instruction for each call in the code (as the stack cleanup code
does not need to be repeated in the code after each call)
* it is much more likely to trigger a trap (exception) if the caller calls using
the wrong parameter list: as the callee will remove what should be the number
of bytes in the parameter list, the resulting stack misalignment is likely to
trigger an exception very soon. Using the C convention, the calling code would
always consistently remove the number of bytes it placed on the stack, and the
error could go unnoticed for a long time.
Like with any good rule, there is one exception to the use of STDCALL in Win32:
functions accepting a variable number of arguments require that you define them
as using the C calling convention. With such functions, the callee can't remove
the parameters from the stack, since it doesn't know until runtime how many of
them where pushed in the first place. So the caller does the cleanup. To the
best of my knowledge, this exception only applies to ONE function in the whole,
huge Win32 API: the wsprintf function. Other functions requiring a variable
number of data items (like wvsprinter and FormatMessage) actually take a fixed
number of parameters including a "va_list" parameter instead, i.e. a pointer to
a list of arguments that can be variable.
The MASM Programmer's Guide claims (in Chapter12/ Naming and Calling
Conventions/ The STDCALL and SYSCALL Calling Conventions, page 311) that
prototyping an STDCALL function with the VARARG keyword achieves the same effect
as declaring it C: But it appears that in fact, MASM flags this as a programmer
error (true in 6.11d, not checked in 6.12).
Using PROTO definition and INVOKE statements, MASM will generate the right set
of instructions to push (and for the C function, to remove) parameters from the
stack.
I highly recommend that you use PROTO definitions to define external functions
(whether imported or exported), and the INVOKE directive (rather than a series
of PUSH followed by a CALL) to generate the boring list of PUSHes needed to call
Win32 functions. There is a small limitation to the use of this directive (see
below, "The infamous 512 bytes buffer, page 32"), but it can be dealt with.
PROTO definitions work in a way similar to function prototypes for C compilers:
they define function headers that the assembler uses
* to generate the right calling sequence when the function is invoked but not
defined in the module,
* to check parameter consistency in the module where the function is actually
defined.
3.1.3.3 Win32 (and other) function prototypes
The best way to handle Win32 API calls is to create include files containing
PROTO directives for the various Win32 functions, as well as the equates and
structures they require.
After a while, we found out that the simplest way to organize this was to stick
to the scheme used in the Win32 SDK: whenever possible,
* use an include file for each import library (or DLL) that is needed from the
SDK, and
* give the include file the same name as the import .LIB (and DLL) with some
distinctive extension name (like .EQU, .INC or HDR),
* When defining the PROTO headers, keep exactly the same parameter names as the
ones used in the SDK. This is not mandatory, as parameter names are just
placeholders in the PROTOs, but it will help you relating the SDK documentation
with your ASM documentation
* When defining structures and equates, stick exactly to the spelling and case
of the original equates, structure names and structure members. If you are
using TYPEDEFs, do this for TYPEDEFs too. For equates, structures and structure
members, this is a must, as these names are the ones you will find in the SDK
documentation, examples and third party books. You don't want to spend half your
development time looking up the nice original name you found for that error code
or structure member, do you? There is one case when this rule cannot apply, and
this is when a structure member collides with a MASM reserved word, like for
instance, OFFSET. In this case, use a simple, consistent, automatic renaming
convention to create a non-colliding name. AND clearly document the trick for
posterity. I personally decided to prefix the name with an underscore ('_').
Following these simple rules makes it very easy to organize your include files.
Lookup a function in the MSDN InfoView documentation; constants, return codes
and structure names are the same as the ones mentioned in the SDK so you can use
them as is. Now press the "QuickInfo" button, look at the name of the import
library, et voila, you directly know which EQUate (PROTO) file to include in
your source file.
Both MASM and TASM come with utilities to convert C header files (.H files) to
include files (.INC files). The Microsoft one is H2INC.EXE, while the TASM one
is H2ASH32.EXE. I ended up not using them, because:
* neither is able to properly compile original unmodified .H files. Both spit
tens of errors compiling WINBASE.H, for instance. And manually tweaking copies
of the original .H files to get them to properly convert is a pain.
* I decided I preferred to know what I imported in my files rather than
blindely importing whatever the converter converted. There are many things in
the .H files that might have some historical value (for 16-bit portability, for
instance), but do not mean a thing in a new Win32 assembly context.
* I didn't like the twisted syntax the Microsoft tool (H2INC) generates for
function prototypes: a TYPEDEF with an artificial name (PROTO_
followed by a PROTO referencing the TYPEDEF.
As a result, I chose to create include files for each of the libraries I needed
manually, and to add function prototypes, structures and equate definitions
along the way, as I needed them.
The ideal solution to this problem could be an assembler that would properly
compile and interpret a large subset of the standard .H files, smartly enough
to gobble irrelevant errors. Hey, one can dream a little! On the dark side,
this would likely slowdown assembly by a large amount.
Alternately, an H2INC-style utility that would be able to properly and cleanly
compile any of the existing .H files to .ASM source would certainly help.
3.1.3.4 The INCLUDELIB directive
A very useful MASM feature is the INCLUDELIB MASM directive (MASM Programmer's
Guide, "Chapter 8/ Developing Libraries/ Associating Libraries with Modules",
p. 222): Insert an
INCLUDELIB KERNEL32.LIB
directive ahead of the include file that contains the PROTO definitions for the
KERNEL32 functions, for instance, and MASM will automatically generates a linker
directive (embedded in your object code) adding KERNEL32.LIB to the list of
libraries to search at link time.
MS Link is able to recognize and process the embedded directive. I didn't check
with other linkers, but I would assume they understand the embedded directive
the same way.
By using this technique, the only other thing that the linker needs to resolve
external references is a LIBPATH switch on the command line: A command-line
switch like /LIBPATH:G:\WinSDK\LIB tells MS Link where the Win32 import
libraries files (that you might define using INCLUDELIB) can be found. If you
have several groups of libraries, use the /LIBPATH directive several times on
the command line.
One of the MASM-related Knowledge Base articles claims that INCLUDELIB is not
supported with LINK: Don't believe it, INCLUDELIB does work just great, at
least with recent the 3 most recent MS LINK implementations I checked.
Additional tricks:
If you need to include several libraries, use several INCLUDELIB directive.
INCLUDELIB can be used to pass other directives to the linker. What INCLUDELIB
does is to embed a "-defaultlib:" directive in the special ".drctve"
pseudo-section (see Microsoft definition of the COFF format for more
information on that special section).
The (dirty) kludge we use here is that INCLUDELIB passes everything that
follows it "as is" to the linker.
So a line such as
INCLUDELIB Kernel32.lib -verbose
Will actually pass
-defaultlib:Kernel32.lib -verbose
as a additional parameter line to the linker...
The END directive acts about the same as the INCLUDELIB directive. It only
generates an
-entry
parameter rather than a -defaultlib one.
One can regret that Microsoft did not make provision for a generic LINKDIRECTIVE
verb instead of (or in addition to) creating the specialized INCLUDELIB and END
directives.
3.1.3.5 Segments and sections
There are no more " segments " in Win32 world, because they have been replaced
by " sections. " You can still define sections the old way, using SEGMENT
directives. The best way, adequate in most application, is to use simplified
directives as offered by MASM.
There are mostly 4 predefined section names that are useful in Win32
programming:
.CODE
.DATA
.DATA?
.CONST
.CODE, as the name implies, defines the .CODE section. Don't you ever try to
write to it!
.DATA defines the initialized data section.
.DATA? defines an uninitialized data section.
.CONST defines a read-only section for constants. Can't be written to either.
This defines "pre-canned" sections. But you can generate any customized section
with any name and attributes using the segment directive. Be aware that this
grows the size of the resulting .EXE file, and that this also wastes some
memory: Sections are usually aligned on 4K page boundaries, so needlessly
creating many sections containing each a few bytes could potentially waste
nearly 4K bytes per section, even if the actual negative impact might be
somewhat reduced by the VM management logic.
There are a few cases where you might want or need to declare additional
sections, though, and let the linker reorganize the data at link time. This is
the case if for some reason, data that you declare scattered all around many
modules needs to be consolidated / concatenated in the runtime program image.
This could also be the case with statically allocated memory that you would
want to manipulate at runtime in a special way through the virtual memory set
of system calls (now see why the sections have to be page aligned?)
3.1.3.6 Alignment issues
Alignment issues are critical to performance in 32-bit systems. This is
something you should never overlook when programming in 32-bit assembly
language, as misalignment will go unnoticed but silently kill your performances.
It is particularly important to remember this point when coding byte strings and
structures in data sections: Coding something like
ALIGN DWORD ;This is equivalent to ALIGN 4.
MyString BYTE 'FooBar',0
SomeDWORD DWORD 0 ;Watch out! Not aligned!
might results in severly degraded performances.
The variable "someDWORD" and the following DWORDs are not properly aligned, and
can considerably slow operations down if accessed very frequently.
There are several ways to handle this:
* Manually insert ALIGN DWORD directives after each non-DWORD directive,
* Grouping data items by size, and prefixing each group with an ALIGN
directive,
* Creating additional sections (possibly by defining macros to define a
.DATABYTE and a .DATAWORD sections). This is a costly solution, though
(sections are allocated with a 4K page granularity).
Likewise, when coding structure, always group data items in such a way that
* DWORDs are always aligned on a DWORD boundary
* Words are always aligned on a WORD boundary
* When using bytes, group them by 4, or by 2 with an adjacent word, etc...
Finally, do not forget to mention structure alignment in your structures, such
as this:
Foo STRUCT DWORD ;Equivalent to STRUCT 4
DWFoo0 DWORD -1
BBar BYTE 1
DWFoo1 DWORD -1 ;Padding will be added before
;DWFoo1 to achieve proper alignment.
Foo ENDS
The alignment rules for structures are documented in MASM Programmer's Guide,
"Chapter 5/ Structures and Unions/ Declaring Structures and Unions/ Alignement
Value and Offsets for Structures", page 119. One thing that I have not seen
documented is that the alignment specification after the STRUCT keyword can be
a type specifier directive (like the DWORD in the example above), instead of a
number (1, 2 or 4).
Look at the code generated by the above structure: MASM will pad item BBar with
zeroes to respect the alignment inside the structure.
Since VC++4 / VC++5, Visual C++ started to default to aligning structures on
QWORD boundaries. There is no real hardware reason to follow this rule with the
current machines, as the cache lines for the Intel 486 and Pentium processors
are 16 bytes and 32 bytes, respectively. Other Intel recommendations suggest
aligning data the following way:
WORD data should not cross a DWORD boundary,
DWORD data should be aligned on a DWORD boundary,
QWORD data (double precision reals) should be aligned on an 8-byte boundary.
So in our 32-bit Intel world, I currently see no real reason to align on QWORDs.
I suspect this change is nothing more than Microsoft's anticipation of the use
of 64-bit machines, where the performance hit might occur when the native word
format (QWORD) alignment is not respected. On 64-bit machines, it is safe to
assume that the C integer will be... 64 bits, and that by simply aligning
structures to QWORDs, the alignment issues would be solved. Previous experience
suggests that portability issues are not limited in any way to simple word size
and alignment matters, but this is yet another story.
The exact architecture of the future Intel machines is not known at this time,
anyway, and the only points that have been disclosed tend to indicate that they
will use "dual mode" machines, able to execute either in 32-bit or in 64-bit
mode.
Now, considering that:
* assembly language is not that portable anyway,
* we don't know what the 32-bit code performance will be like on these machines,
but might assume that Intel will make their best to make it look excellent to
ease the transition, and
* nobody expects the 32-bit machines and code to disappear overnight (heck, most
of the world is still mostly running 16 bits DOS code!),
The bottom line is, "I am not sure this is worth bothering at this point."
A last word on alignment:
Sven B. Schreiber brought to my attention that alignment issues are not only a
performance issue: they are also reliability issue. Since NT 3.51, some APIs
will simply crash your process if you pass them parameters that are not aligned
on DWORD boundaries. Sven makes a special mention of resource data such as
BITMAP structures that must always be DWORD aligned.
3.1.3.7 END statement and Entry Point.
Note: The following applies to 6.11d. This issue is documented as "fixed" in
MASM 6.12 README.TXT file, but I have not checked it yet.
If you're using MS Link, you are likely to discover that putting a label field
in the END directive of your main program to specify the entry point does not
seem to work:
END Start ;Will seemingly be ignored by MS Link.
This behavior seems to be consistent with the Knowledge Base article
"32-Bit Flat Memory Model MASM Code for Windows NT", Article ID: Q94314,
Revision Date: 23-JAN-1995.
Do NOT believe this article, it is all wrong. There is a bug in the way the END
directive is implemented, but it can be worked around easily.
Here is the scoop: MASM 6.11d does process the Start label in the END directive
and generates an embedded /ENTRY directive in the object code. The only problem
is that it does not generate the right label in the "/entry:" directive it
writes to the .OBJ file.
If an STDCALL directive is in effect, and the entry procedure is Start, the
"END Start" directive will generate an inline
-Entry:_Start"
directive.
Now, note that LINK will in turn decorate the name it gets in the "entry"
parameter and internally change it at link time to "__Start", that it expects
to be a PUBLIC in the object file.
Strike 2: There can't be neither _Start nor __Start externally defined in the
MASM module, because Win32 requires the use of STDCALL. And STDCALL will change
the "Start" in the source code into _Start@0.
So MASM should be consistent with itself (and with LINK...), and apply the
default interface convention to the END directive, thus generating an inline
/Entry:Start@0
link directive for everything to work fine.
Since MASM is inconsistent, we have to fix the problem ourselves. This can be
done by replacing the END directive with an ENTRY directive, defined by the
following macro:
ENTRY MACRO EntryPoint:REQ
LOCAL EntryPoint
IF @Version GE 611
ALIAS <_&EntryPoint&@0>=<&EntryPoint>
ENDIF
END &EntryPoint
ENDM
With this macro used in your code in place of the END directive, the code line
ENTRY Start
should work just fine, by instructing the linker to use the right label in
replacement to the one it can't find. See the explanation of the ALIAS directive
below (Use of ALIAS, page 31) for more details on this magic.
3.1.4 MASM options
You have to use the /Cp (or at least the /Cx) MASM command line option.
/Cp forces all identifiers to be case sensitive. All Win32 symbols are case
sensitive.
You also have the option of using /Cx instead of /Cp, but I wouldn't recommend
it: With /Cx, external and public symbols are case sensitive while others are
not. This is the kind of sloppiness that is confusing at best, and usually
strikes back when you expect it the least: If you suddenly decide to reorganize
your code modules (by breaking out a very large code file, for instance),
symbols that used to be internal might need to become external. And this will
suddenly reveal discrepancies in the case usage. Since code restructuring is
already a delicate operation, you usually don't want this extra problem to
surface at that time.
You will likely want to use the /c command line parameter, too: this will
prevent MASM from automatically attempting to launch LINK. For all but the
smallest projects, you will want to control the launching of LINK by other
means.
Use /COFF code generation in ML. Link only knows COFF natively and you won't
need to undergo a COFF to OMF conversion. The call to the OMF-to-COFF converter
is handled by LINK (by shelling out to an external conversion program), but this
only slows down the linking process.
Use /Zi to get symbolic debugging information.
This should work with any debugger directly supporting the Microsoft's Codeview
debugging information. We successfully tested Microsoft's MSDEV IDE debugger,
Watcom's 10.6 debugger as well as with SoftICE for Win95 version 3.x.
Using the /W3 option to turn maximum warning level on might be useful too.
You might want to use /Sc to get instruction timings in the listing file.
Timings depends on the definition you gave of the target CPU (.386, .486, .586,
etc...).
This facility is quite useful, but:
* Keep in mind that this only shows brute force timings, and does not take into
account other factors like CPU pipes, AGI stalls, cache side effects, alignment
effects, etc... See [Booth, 96.01] or [Intel, 95.01] for more details on timings
and processor dependant optimization.
* MASM unfortunately doesn't show timings (or cumulated timings) for the code
it generates through structured programming directives and/or macros.
3.1.5 Miscellaneous OS and systems issues
3.1.5.1 Beware of the CLI
At least with Win95, beware of the CLI instruction.
CLI is a privileged instruction, and as such, should not be used when running
protected mode. This also applies to several other instructions, such as of
course STI and a handful of others. One would expect a process using it from
ring 3 code to be trapped by an OS exception, and the offending process
terminated.
Not so with Win95: One day, porting some code from the DOS world, I
accidentally left in a CLI instruction (with no subsequent STI nor POPF to
reset the Interrupt flag to its original value) in a Win32 program. This
resulted in a bug that it took me a long while to figure out. The problem of
course did not show up at the place where the CLI instruction was. And to make
things worse, that code was not always called either. When it happened, though,
the execution of the CLI from a thread simply messed up the inter-thread
cooperation with no apparent cause, and the resulted was that the whole process
seemed to internally hang... without freezing the rest of the system. At some
point, I came to suspect a leftover CLI, ran a text search for CLI on the whole
pile of sources and bingo, found the offending line. Everything went back to
normal after I removed it...
When I told this story to Sven, he quickly pointed out that this Win95 behavior
was carefully documented in "Unauthorized Windows 95" [Schulman 94.01], pages
319-331. Schulman documents the effect of CLI in Win95 under various conditions.
The most worrisome ones are when the CLI (with no matching STI) happens in a
16-bit regular DOS program running in Intel's CPU V86 mode.
The effect I saw happens when running under protected mode, and affects only
the faulty process.
Under the same circumstances, NT would quite simply trap the process with a
"Privileged Instruction" exception.
3.1.5.2 Beware of the STD
If you ever set the Direction flag through the STD instruction, do NOT forget to
clear it before reentering the OS. In a CallBack or DLL, CLD before returning to
the caller. In your own code, CLD before calling any OS function.
Windows 95 is likely to crash if you forget to CLD. It seems that the kernel
entry points do not insure the state of the D flag before starting to execute,
and that at least some portions of the kernel code and/or device drivers expect
and assume the direction flag to be clear when they get control.
I have no idea whether this affects NT.
In any case, just play it safe and do not forget to CLD after you STDed.
3.1.6 Various MASM goodies
The following are a few little known MASM capabilities or obscure point of
documentation that could easily be missed:
3.1.6.1 Data Types
The "old" data definition directives (DB, DW, DD) have been extended and
superseded by newer, more precise ones (Don't gasp; the "old" ones are still
valid):
BYTE, SBYTE, WORD, SWORD, DWORD, SDWORD, FWORD, QWORD, TBYTE, REAL4, REAL8,
REAL10.
They are defined in MASM Programmer's Guide, "Chapter 4/ Declaring Integer
variables/ Allocating Memory for Integer Variables", page 86, and their 16- and
32-bit C equivalents are defined "Chapter 12/ The MASM/High-Level-Language
Interface/ The C/MASM Interface", page 315.
3.1.6.2 Base and Index
The same registers can be used as either base or index registers. Both uses are
not equivalent, specially in term of performances (refer to documents on Intel
CPUs optimization). In normal situations, the "natural" syntax will achieve what
you want. The syntax to control the base address register you want is defined
in MASM Programmer's Guide, "Chapter 3/ Operands/ Indirect Memory Operands with
32-bit Registers/ Scaling factors (end of paragraph)", page 70.
3.1.6.3 Structures and Unions
MASM gives the programmer the ability to define nested structures, unions, and
combinations of both.
This is defined in MASM Programmer's Guide, "Chapter 5/ Structures and Unions/
Declaring Structure and Union Variables", page 121.
One thing that is not defined so well (no example) is how to declare an array
of uninitialized structures.
Providing Item is defined as a structure,
Foo Item 10 DUP ({})
the above will define an array, Foo, of 10 uninitialized structures of type Item.
There is more information about this topic in the README file for the MASM 6.12
patch (c.f. supra.)
3.1.6.4 Local directive
The LOCAL directive is quite a useful one.
Contrary to popular belief, it is not limited to declaring DWORDs, though: Even
strings or arrays can be defined the following way:
LOCAL EXP_BUF[5+1]:BYTE
(Do not forget the alignment issues!)
This funny syntax is documented in the MASM Programmer's Guide, but you are
likely to miss it, unless you look very carefully at the code fragment at
"Chapter 7/ Procedures/ Creating Local Variables Automatically", page 192: the
example to look at is the "aproc" procedure, and the way it defines a local
array of words...
You can't initialize a local var. The whole LOCAL space is merely allocated all
at once on the stack at runtime upon proc entry and you have to MOV any initial
value there all by yourself (look at the code the LOCAL statement generates).
But you can of course use symbolic addressing to do so, and the assembler will
automatically generate the corresponding EBP-based addressing.
It is probably useful here to mention a little know but very important
characteristic of the Win32 stack management:
Sven B. Schreiber brought to my attention the sequence of code that the C
compilers generate to "probe" the stack when assigning large LOCAL blocks.
"Large" is defined here as near or above 4K (a VM page). The problem seemed to
be that if a program would unexpectedly hit a stack page that had not been
committed, the whole process would suddenly disappear with no warning or
message. Not even a "Poof!" or a cloud of smoke... This did not seem very clear
at that point, as the stack probe would not seem to accomplish more than what
the "faulty" program would do, i.e. merely touch the uncommitted page. I finally
found the whole story in Richter's "Advanced Windows" [Richter 97.01], in the
chapter titled "Using Virtual Memory in Your Own Applications", subtitle "A
Thread's Stack."
The problem is that the stack normally grows and shrinks "sequentially." The OS
tracks the current bottom of the stack by trapping accesses to the lowest page
that is currently committed, the "guard page." If an access is done in that
page, the OS commits yet another page immediately below, and this new page
becomes the guard page. Trouble occurs if an application allocates more than 4K
(a page) of data at one time and manages to directly access one page below the
guard page, effectively jumping over the guard page and defeating the stack
growth logic. This should normally trigger an access violation, but it instead
silently kills the process, just like a stack overflow would do.
This problem is taken care of by the "stack checking" logic of the compilers:
when the compiler detects that a function is allocating more than 4K of local
data, it generates code that "touches" the allocated data sequentially, from
top to bottom, 4K at a time. Whenever the guard page is touched, and new page is
committed 4K below it and the newly committed page becomes the guard page, and
the compiler stack probe routine prevents a stack fault from occurring this way.
Once again, this is fully documented with all gory details in the aforementioned
Richter book.
The bad news is that this case is not currently handled by the MASM prologue
code generated in STDCALL functions.
But there is good news too:
* One is that this problem is limited to functions that allocate 4K or more of
local variables at a time, that there are probably not so many such functions,
and that it is easy for the programmer to track these manually and add a call
to a "stack touch" loop in these functions.
* Another good news: MASM makes provision for fully customizing the prologue
and epilogue code that are generated upon PROC entry and exit. This is the
right place to write the stack touch logic. The epilogue code has access to
everything it needs, such as the size of the LOCALs for the function, the
function's .MODEL, etc...The way to achieve this is documented in the MASM
Programmer's Guide, "Chapter 7/ Procedures/ Generating Prologue and Epilogue
Code", page 198. One of the include files provided with MASM, "PROLOGUE.INC",
gives an example of a customized prologue code in a 16-bit environment.
3.1.6.5 INVOKE through a function pointer
This is specially needed when the function takes parameters.
The way to do this requires a TYPEDEF and is defined in "Chapter 7/ Procedures/
Calling Procedures with Invoke/ Invoking Procedures Indirectly" on top of page
198 in the MASM Programmer's Guide.
Without parameters, CALL to a DWORD variable can be used: it is more
straightforward to code (no TYPEDEF, etc...), and supports forward references.
3.1.6.6 Global labels
Labels are normally local to PROCs and can't be made external:
Foo: ;This label is local to the PROC where it is defined.
You can make it global by using the following notation:
Foo:: ;This label is now global for the whole module.
(MASM Programmer's Guide, "Chapter 8/ Declaring Symbols Public and External/
Using EXTERNDEF/ (look at label "codelabel" in the example) page 215)
Remember that most local labels can be eliminated by the use of structured
programming directives.
3.1.6.7 Structured programming directives
Look carefully at, and do use the ML (MASM) structured programming directions.
MASM has .IF / .ELSE / .ELSEIF /.ENDIF, .WHILE / .ENDW(HILE) / .BREAK / .BREAK
.IF / .CONTINUE / CONTINUE .IF / .REPEAT / .UNTIL (MASM Programmer's Guide,
"Chapter 7/ Jumps/ Conditional Jumps/ Decision Directives", pages 171, up to and
including "Chapter 7/ Loops/ Writing Loop Conditions/ Expression Evaluation",
page 179).
These directives generate jumps and conditional jumps, with the exception of the
.IF directive that can also generate logic instructions. The generated code is
very easy to control, and as such, fully stays under programmer's control. The
removal of extraneous labels and jumps from the source code makes the code much
easier to read at virtually no efficiency cost. It also relieves the programmer
from having to find zillions of meaningful labels.
The list of expression operators accepted by the conditional directives (for
both .IF and loop directives) is defined at "Chapter 7/ Loops/ Writing Loop
Conditions/ Expression Operators", page 178.
Expressions are normally unsigned, but they can be forced to be signed. The way
to accomplish this is defined in the paragraph starting at "Chapter 7/ Loops/
Writing Loop Conditions/ Signed and Unsigned Operands", page 178.
The only shortcoming is that there are some limitations in the .IF capability
of testing for preexisting several complex combinations of condition codes that
have JMP equivalent, like BE (below or equal), etc. But since most of the times,
the .IF instruction generates the condition codes itself, the problem seldom
occurs.
3.1.6.8 Structure addressing
Pre 5.1 MASM versions used to accept structure member names with no mention of
the parent structure, and about any notation, including the '+' sign, etc...
Post-5.1 versions of MASM will only accept to address a structure member through
a base register if it has been told that the base register actually points to
the relevant structure.
Item STRUCT
Foo DWORD Foo
Bar DWORD Bar
Item ENDS
MyItem Item <>
One way to do this is by fully qualifying the structure path, such as
MOV EAX,MyItem.Foo
Yet another (needlessly clumsy) notation is
MOV EAX,(Item PTR[EBX]).Foo
Finally, one can use the ASSUME directive. This way is particularly convenient
in the most common case, when several members of the same structure are
manipulated in the same code area:
MOV EBX,OFFSET MyItem ;Get access to MyItem.
ASSUME EBX:PTR Item ;Tell MASM what EBX points to.
MOV EAX,[EBX].Foo ;Works even if labels Foo and Bar are
MOV ECX,[EBX].Bar ;used in many distinct structs.
ASSUME EBX:Nothing ;Tell MASM we're done with EBX.
This form as the added benefit of checking for erroneous base register use.
MASM will flag as an error any attempt to use the wrong register for addressing
the structure.
Alternately, it is possible to disable checking through the OPTION OLDSTRUCTS
directive, but this is not recommended: it prevents the use of identical member
names in two distinct structures, the nesting of structures and many other
useful features. This likely to bite you in some of the numerous Win32
structures, where the identical symbols are often use in distinct structures.
All of this (and some more) is documented in the MASM Programmer's Guide, but
the information is once again oddly split in two parts:
* under "Chapter 5/ Structures and Unions/ Referencing Structures, Unions and
Fields" Page 126, (where you would expect to find it),
* under "Appendix A, Differences between MASM 6.1 and MASM 5.1, OPTION
OLDSTRUCTS (text and examples)", pages 370-371, (where you would probably not
expect to find it...)
3.1.6.9 Use of SIZEOF & LENGTHOF
SIZEOF and LENGTHOF replace the old brain-damaged (and hardly usable) SIZE and
LENGTH of the previous versions of MASM.
The SIZEOF operator is probably the most useful, and works with about anything
the assembler knows of: BYTE strings, arrays, structures, TYPEDEFs, etc...
For instance, SIZEOF DWORD is a valid expression that yields "4."
It is considerably more useful than the previous SIZE operators, and simpler
than the "classical" way of computing the length of an item:
Foo BYTE 'Some string here'
FooLng = $-Foo
For details about SIZEOF, LENGTHOF and TYPE, see page 108 of the MASM
Programmer's Guide.
3.1.6.10 Use of TYPEDEF
MASM 6.1x offers a TYPEDEF feature, comparable to the one in C. I ended up
deciding to generally not use it, as it introduced a bit too much constraints
to my own taste for an assembly language: One has to draw a line between high
level and a low level, and my personal decision was to draw it here.
I didn't investigate too much either into the whereabouts of using TYPEDEFs.
There is an exception to the rule, though, and this is the use of INVOKE
through pointers: This requires the use of TYPEDEF to generate the right calling
sequences, see "Chapter 7/ Procedures/ Calling Procedures with Invoke/ Invoking
Procedures Indirectly" on top of page 198 in the MASM Programmer's Guide.
3.1.6.11 Use of ALIAS
See below, "Weak Externals", page 40.
3.1.7 MASM bugs and shortcomings
3.1.7.1 Invalid code generation in INVOKE using 16 bit parameters
(or a mix of 16 and 32 bit)
Note: This 6.11d bug has not yet been checked in MASM 6.12
When one declares an 8 or 16 bit parameter in an INVOKE list, MASM gets very
confused: it tries to be smart and to generate code extending the parameter on
the stack to 32 bits, but gets hopelessly confused in the data size of the set
of PUSHes and POPs that it generates. The exact error depends on the exact code
pattern being assembled, but the net result is always inconsistent generated
code, and a stack structure that doesn't match the instructions MASM generated
(carefully look at the extended listing).
The result is a GPF that seems to strike from nowhere. This occurs when the
current code segment is a 32-bit one ("USE32"), which is always the case when
programming for Win32.
The resulting GPF is such that when it happens, it looks like it can't be
tracked: ESP gets loaded with an invalid value so when the process crashes, you
have completely lost the stack context and/or the value of EIP, and don't know
anymore where the error came from. Even single stepping in the code is confusing,
as when one experiences this for the first time, the problem seems to strike
from nowhere (like a fault that would happen in some unrelated and unknown code
portion).
A similar incorrect set of instructions is generated if you mention segment
registers in an INVOKE list (not that usual, though).
You have been warned!
3.1.7.2 The infamous 512 bytes buffer
There is a very annoying limitation in ML (MASM): Its input (parsing) buffer,
aka "logical line" is only 512 bytes. This is documented as such in the
Programmer's Guide, "Chapter 1/ Language Components of MASM/ Statements", page
22, so at this point, we have to call it a feature.
But when coding a Win32 INVOKE with a long parm list, you might want to code
(and document your code) such as this:
INVOKE CreateProcess,
OFFSET lpApplicationName, ;=> to EXE name
OFFSET lpCommandLine, ;=> to command line string
OFFSET lpProcessAttributes, ;=> to process sec attribs
OFFSET lpThreadAttributes, ;=> to thread sec attribs
bInheritHandles, ;handle inheritance flag
dwCreationFlags, ;creation flags
OFFSET lpEnvironment, ;=> to new env block
OFFSET lpCurrentDirectory, ;=> to current dir name
OFFSET lpStartupInfo, ;=> to STARTUPINFO
OFFSET lpProcessInformation ;=> to PROCESS_INFORMATION
Well, forget it; the comments count in the 512 bytes, so the above whole
"logical line" (that doesn't use TAB characters but spaces) doesn't fit in the
512 buffer. MASM will flag it as an error (Booo). However long the INVOKE is
(and some of them ARE very long), it has to fit in 512 bytes... So you have to
remove comments and leading spaces, pack several parameters per line, etc...
You can NOT get away by using the continuation character (\) nor any other trick.
A similar problem is likely to happen with macros generating long byte strings.
VERY frustrating.
3.1.7.3 INVOKE and forward references
INVOKE doesn't take forward references to PROTO definitions, although MASM is
defined by Microsoft as an N-pass assembler, and although a CALL instruction
does accept forward references.
So you have to move PROTO definitions ahead of their first invocation.
Likewise, ADDR is sometimes required instead of OFFSET in INVOKE parameter
lists, but ADDR does not support forward references (while OFFSET does).
3.1.7.4 Macro limitations
MASM doesn't recognize strings and complains about unmatched items when it finds
symbols such as '<' and '(' in string parameters.
3.1.7.5 Listing generation
A number of minor syntax errors generate a "fatal error." An example is typing
something like
Foo PROC USES EAX,ESI ;Extraneous ","
but there are many other potential (and very benign) causes for a fatal error.
Unfortunately, the Fatal Error prevents the .LST (listing) file from being
generated. This is particularly painful when debugging complex macros, since
the only way to debug macros is precisely to generate a listing with full code
expansion.
So one ends up with a macro that generate offending code that can't be seen
because the generated code can't be listed. Yet another very frustrating
situation.
3.1.7.6 Missing conditions in structuring directives
This one belongs to the "shortcomings" category.
As documented in the MASM Programmer's Manual, "Chapter 7/ Loops/ Writing Loop
Conditions/ Expression Operators", page 178, all possible conditions can be
tested by creating complex expressions, such as in
.IF (EAX > 0) && (DWORD PTR [EBX] == 0)
NOP
.ENDIF
that generates
00000052 2 83 F8 00 * cmp eax, 000h
00000055 7m,3 76 06 * jbe @C0006
00000057 5 83 3B 00 * cmp dword ptr [ebx], 000h
0000005A 7m,3 75 01 * jne @C0006
0000005C 3 90 nop
.ENDIF
0000005D *@C0006:
But quite often, one needs to test preexisting condition codes, such as those
resulting from an arithmetic operation (e.g. SUB EAX,EBX), or those returned by
a routine.
To handle these cases, the authors of MASM created some special (and somewhat
redundant) symbolic for directly testing preexisting condition flags: ZERO?,
CARRY?, OVERFLOW?, SIGN? and PARITY?.
They can be used as in
.IF CARRY? ;Generates a JNC
The MASM authors apparently didn't think about the obvious, that of simply
deriving all the existing Intel J
in their structured programming directive: allowing expressions such as .IF Z?,
.WHILE C?, .UNTIL S?, .BREAK .IF P?, CONTINUE IF AO?, etc,... would have been
simple, intuitive and exhaustive.
This oversight is unfortunate for two reasons:
First,
.IF !CARRY? ; The only direct way to generate a JC
is not as readable as
.IF ABOVE? ; Is more mnemonic.
At the opposite of MASM, the Intel mnemonics define various synonyms for the
same conditions to improve code readability. For instance, Intel defines both
a "JZ" (Zero) and a "JE" (Equal), that are exactly the same instruction. But
testing for ZERO makes sense after a subtraction while testing for EQUAL is
intuitive after a comparison. Ditto for JL and JNGE, JGE and JNL, etc...
But the most annoying part is this story is that the existing predefined symbols
don't allow generation of some of the less usual "combo" jumps, those that test
for multiple conditions at once, such as G, GE, BE, LE, and their negations. I
have not found any way to solve this one.
Even trying to use combined flags does not work: a "JL" for instance takes a
jump when the SIGN? flag is not equal to the OVERFLOW? flag.
Let's try this:
.IF Sign? == Overflow? ;This should generate a "JL"
Ooops!
error A2154: syntax error in control-flow directive
00000060 7m,3 75 01 * jne @C0008
(Boooo!)...
Another example: A "JA" takes its jump if both Carry and Zero are false. The
inverse of a JA is a JBE, and is taken when either Carry or Zero is True.
So the following expression should generate a "JA":
.IF CARRY? || ZERO? ;Should generate a JA
But instead, it generates the right logical sequence in the most inefficient way.
00000060 7m,3 72 02 * jb @C0009
00000062 7m,3 75 01 * jne @C0008
The bottom line is that we have no way to generate any of the jump instructions
that test combined flags, nor to redefine the right mnemonics for them.
3.1.7.7 Major flaws in the MASM macro language
The macro language as it is in MASM 6.1x is barely usable for complex macros
that need to perform true parsing tasks to generate complex structures or
streams. A good macro language should be able to simply describe the syntax of
any instruction or directive consistent with the existing language. This is far
from being the case with MASM.
The successive piling of features since the original MASM 1.0 (back in 1981!)
resulted in what the macro language is today, that is a rather inconsistent,
limited and much too complex language.
The current macro language does not allow the creation of macro instructions
that would behave exactly as native instructions, directive and/or data
definition streams: the macro notation comes in the way.
Literals passed as operands to macro instruction that would take a list of
operands the same way as a BYTE directive, for instance, can't include special
symbols such as "!", "<" and ">", because they have special meanings to the
macro language. The "!" is the forcing character for parameters of macros, while
the rule is different in the rest of MASM: For instance, forcing a quote in a
quoted string is forced by doubling the quote, as in :
BYTE "Foo ""Bar"" "
The use of some older features of the macro language additionally preclude the
use of the quote, double quote, backslash, percent and ampersand symbols.
The macro syntax does not allow one to parse the "label" field of a macro
invocation as a parameter and use it as such to generate a label somewhere in
the generated code. Etc...
As an example, here is a problem to solve:
Use the macro language to define a directive that would transparently generate
a Unicode strings using a syntax exactly compatible with that of the native BYTE
directive, for instance.
Go to "The String macro", page 42, for the best solution we found so far. And
remember that the result does not reflect in any way the pain it was to achieve
it. The challenge is open, by the way: anyone with a better solution to the
problem, please Email!
The way things are, I don't see any way this situation could be fixed by
"enhancing" the macro language again.
If MASM ever goes back to the development cycle one day, or if a new MASM
compatible assembler is ever developed, I would gladly vote for the creation
of a brand new, incompatible but consistent macro language, and rewrite all my
existing macros from scratch so they can still compile the old source code
syntax. I guess such an enhanced MASM could also insure backward compatibility
by taking an OPTION OLDMACROS directive to hide the new syntax and re-enable
the existing brain-damaged 6.1x syntax.
3.2 Using LINK
3.2.1 Libraries
Use the /LIBPATH: switch to specify a single directory path where default
libraries can be found. If you need to specify multiple directories, you need
to use the /LIBPATH: switch several times.
The default library files that will be looked for using these path will
typically be those you specified using INCLUDELIB statements in your prototype
include files (see 3.1.3.3, "Win32 (and other) function prototypes".
This is the simplest way to let the linker reach the Win32 import libraries from
the Win32SDK:
/LIBPATH:C:\Win32SDK\LIB
3.2.2 Debugging options
For debugging, use the three following switches:
/DEBUG
/DEBUGTYPE:CV
/PDB:none.
Don't use
DEBUGTYPE:COFF or
DEBUGTYPE:BOTH.
The "real" full-featured debugging information that symbolic debuggers use is
actually CodeView format information, so /DEBUGTYPE:CV should be all you need.
In addition, using the COFF debugging information will sometimes (depending on
some obscure pattern in you code) result in a failed ASSERT in LINK.EXE (crash),
apparently due to some disagreement on the COFF debug records between ML and
LINK (Note: The 6.12 README.TXT mentions that work has been done to fix problems
in that area, but we have not checked it yet).
Using PDB:NONE includes the debugging info into the .EXE file, which is probably
the simplest way to have it: all the debugging information the debugger needs
is in the .EXE file, the debugger doesn't have to locate and open a separate
/PDB file. Using a separate PDB file is fine, and will make the .EXE much
smaller, but adding the risk of having de synchronized .EXE and debugging info,
and leaving it to you to make sure the debugger finds the .PDB file.
3.2.3 Linking an .EXE file
3.2.3.1 Linking a Console executable
Use the /SUBSYSTEM:CONSOLE switch.
The (large) capabilities accessible to Console programs are defined in the Win32
SDK (MSDN Level 2 and upper), in
Win32 Programmer's Reference
Overviews,
System Services
Consoles and Character-Mode Support
3.2.3.2 Linking a Windows executable
Use the /SUBSYSTEM:WINDOWS switch.
3.2.4 Linking a DLL file
Use the /DLL switch.
You might also add the /SUBSYSTEM:WINDOWS or /SUBSYSTEM:CONSOLE switches.
If you don't do it, according to the generated PE code, Link will assume
/SUBSYSTEM:WINDOWS.
The subsystem switch definitely reflects in the generated PE file, as can be
told by running a PE dump program. But to the best of my knowledge, this does
not makes any difference to the Win32 loaders, though: Under Win95, a console
application doesn't care to call functions in a DLL declared as Windows. I
have not checked at this point whether it makes any difference to a Windows
application that the DLL is declared as Console. Nor did I test any of these
under NT.
But I don't see any reason why it would matter, as many Windows DLLs are used
by both console and windows programs, and I have never heard about any special
PE restriction.
About everything you need to know about DLLs is defined in the Win32 SDK CD-ROM
(MSDN level 2 and upper). It's more precisely defined in:
Win32 Programmer's Reference
Overviews
System Services
Dynamic-Link Library
The above section explains how a DLL is defined by an entry point function (with
its initialization / exit sub-functions), and a number of exported functions the
DLL exposes to the outside world.
Another great place for DLL information is [Richter 97.01]. It specially covers
what you should not, ever do in one of the DLLEntryPoint routines, and why not
knowing it can nicely deadlock you program. I am not sure this later fact is
documented anywhere else.
The only other thing you need to realize to connect the above pieces is that the
name of the entry point function is defined through the /ENTRY: directive of
the LINK utility (the same directive that is used to define the program/process
entry point in an .EXE).
At this point, you will have about all the first level information you need to
know about DLLs, and especially about the entry point function.
The rest of what you need explains how to tell LINK to generate your DLL, and
this is documented in the MSDN Library CD-ROM:
Product Documentation
Languages
Visual C++ x.y
User's Guides
Visual C++ User's Guide
LINK Reference
Module-Definition (.DEF) Files
The functions your DLL expose will be defined in the EXPORTS section of the
.DEF files. Alternately, the other way to define exports is through the use of
a command line switches (/EXPORT:). For any large project, I tend to like the
.DEF file approach better, but this is largely a matter of personal preference
(and of the building tools one uses).
As you can see, we picked up the documentation of the VC++ linker that we took
from recent MSDN Library documentation. About the same documentation applies to
several earlier versions of the 32-bit LINK utility, probably up to its earliest
release that used to be known as LINK32.
3.2.5 Advanced linking techniques
3.2.5.1 Grouped Sections
There is a little know characteristic in the COFF/PE specifications (MSDN,
"Portable Executable and Common Object File Format (PE/COFF) Specification 4.1")
that can prove very useful for program construction.
Here is an excerpt of the specification:
The "$" character (dollar sign) has a special interpretation in section names
in object files.
When determining the image section that will contain the contents of an object
section, the linker discards the "$" and all characters following it. Thus, an
object section named .text$X will actually contribute to the .text section in
the image.
However, the characters following the "$" determine the ordering of the
contributions to the image section. All contributions with the same
object-section name will be allocated contiguously in the image, and the blocks
of contributions will be sorted in lexical order by object-section name.
Therefore, everything in object files with section name .text$X will end up
together, after the .text$W contributions and before the .text$Y contributions.
The section name in an image file will never contain a "$" character.
Using this feature, the linker can be used to construct tables of related
objects. The related objects can be declared in different modules, but the
linker will be able to consolidate / concatenate them in an orderly way in the
resulting image, building structures that the program will be able to use at
run time.
An example of this use can be found in ": Runtime Initialization / Termination
Macros", page 48
3.2.5.2 DLL forwarders
First, the good news:
The PE format offers a very interesting (but barely documented) option to DLL
creators: DLL forwarders. A DLL forwarder allows the DLL programmer to specify
an entry point in a DLL and "forward" the DLL call at runtime to another
function in another DLL. A typical use for this technique is in implementing a
"shell" to another DLL, for instance, that would implement front end to certain
functions and transparently forward the call of others to existing functions in
the shelled DLL. The nice thing about function forwarders is that being
implemented in the image format and in the loader, they don't require the
writing of a single line of code. In addition, the OS overhead of function
forwarding is minimal.
The only place this capability is documented in the official Microsoft
documentation is in the MSDN library, under
Specifications
Portable Executable and Common Object File Format (PE/COFF) Specification 4.1
Unfortunately, while the PE/COFF documentation describes how this capability is
implemented in the PE file, it doesn't describe how to talk any given linker
into generating the corresponding image feature. And of course, the official
MS Link documentation doesn't describe how to instruct LINK to create a DLL
forwarder for a given DLL entry.
Matt Pietrek mentioned the existence of this feature in [Pietrek 95.01] as well
as in several articles in the Microsoft Systems Journal, but never got around
to study how the feature could be implemented using Microsoft Link.
Jeffrey Richter finally provided the answer in [Richter 96.01].
From a .DEF file, the general definition of an EXPORTS table entry looks like:
entryname[=internalname] [@ordinal[NONAME]] [DATA] [PRIVATE]
The trick is a very simple one: A DLL forwarder is created by placing the
address of the target function as the optional internalname, such as:
SomeFunc=OtherDLL.SomeOtherFunc
When defining exports from the command line, the syntax is:
/export:SomeFunc=OtherDLL.SomeOtherFunc
I have found that at least with my MASM / linker couple, using this feature
forced me to specify decorated names in the .DEF file where the forwarder was
defined. This might or might not be true of other versions of the software.
Without using forwarders, you don't have to bother as the linker handles the
decoration automatically.
If you get undefined symbols corresponding to names involved in forwarders, try
decorating the names manually. This will likely propagate the error up the DLL
chain, back to the main .EXE, and you will have to fix the upstream .DEF files
accordingly.
Now for the bad news:
DLL forwarders are not implemented in the Win95 loader. You will get an
OS-generated runtime error complaining about a missing DLL symbol if you try to
use them.
3.2.5.3 Weak Externals
Weak Externals are a way to provide link-time default replacements to undefined
externals.
Excerpt from MSDN, "Portable Executable and Common Object File Format (PE/COFF)
Specification 4.1"
"Weak externals" are a mechanism for object files allowing flexibility at link
time. A module can contain an unresolved external symbol (sym1), but it can also
include an auxiliary record indicating that if sym1 is not present at link time,
another external symbol (sym2) is used to resolve references instead.
If a definition of sym1 is linked, then an external reference to the symbol is
resolved normally. If a definition of sym1 is not linked, then all references
to the weak external for sym1 refer to sym2 instead. The external symbol, sym2,
must always be linked; typically it is defined in the module containing the weak
reference to sym1.
Weak external can be implemented in MASM 6.1x using the ALIAS directive (only
very poorly described in a MASM 6.11 release note). In the example described
above, the directive would be:
ALIAS
Beware: Any syntax errors in a ALIAS definitions (and/or reference to a missing
symbol) usually trigger page faults in MASM 6.11d (Owell...).
Note: The README.TXT file for MASM 6.12 claims that a number of Access Violation
causes have been fixed, but we have not checked this one at this time.
3.3 Debugging an assembly language Win32 application
If you followed the above rules and used the options mentioned above for
assembling and linking, you ended up with an executable with full CodeView
debug information.
Any debugger supporting CodeView debugging info should be able to support your
executable file and offer full symbolic / source debugging. I have successfully
used the WATCOM debugger (version 10.6), Microsoft's Developer Studio debugger
and Numega's SoftIce 3.x.
If Visual Studio (aka Developer Studio) is loaded on your machine, you can use
it as a mere debugger, even if you never use its IDE to develop your MASM
application: If you want to debug your great new FUBAR.EXE application on the
current drive, and providing you installed VC++ on drive G: under \MSDEV, just
run:
G:\MSDEV\BIN\MSDEV.EXE FUBAR.EXE
This will launch the Visual Studio IDE right into the debugger and allow you to
start debugging. All symbolic facilities should be there.
The best tool I have found so far for debugging Win32 applications is Numega's
SoftIce 3.x., and the best setup I found for it was to run it on a single
machine, using a second video controller and dedicating a small alternate screen
to the debugger.
Unfortunately, although it claims to fully support MASM, SoftICE doesn't
support the complete legal character set that MASM allows (as defined in MASM
Programmer's Guide, "Chapter 1/ Language Components of MASM/ Identifiers" page
9): as a result, labels including special characters such as '$' and '?' are not
supported and can not be accessed or used in expressions with SoftICE. This is
quite unfortunate, since MASM itself does generates labels with '?' symbols for
local symbols - such as those generated by the structured programming macros.
In addition, there are quite a few libraries that use the perfectly legal '$'
and '?' characters, and debugging code using these libraries is very akward.
4 Various gripes
4.1 The absence of LDT support in Intel-based platforms
Microsoft decided a few years ago that since NT was to be multi-platform, the
only MS blessed way of programming was to use C (or C++). From that day,
Microsoft seemingly stopped caring about assembly language, to the point of
mostly ignoring it as they do today. What MS might not have anticipated is that
the MIPS people, soon followed by the PowerPC folks, would drop off the NT
market, and that the remaining non-Intel NT platform (the DEC Alpha) would
represent a minuscule part of the market, making the whole portability issue a
very moot point.
The MS folks went as far as preventing use of a key feature of the Intel CPU,
seemingly because they didn't have any exact counterpart on other (RISC)
platforms. Did you ever notice that there is no way to benefit from segmentation
in user mode under WIntel32 platforms?
I know that the forced use of segmentation with 16-bitness in previous times
gave a terrible reputation to segmentation: It was a nightmare in 16BitLand to
manipulate large pieces of data broken in 64k segments.
But segments have another use and benefit:
They provide a very efficient way to implement multi-instantiation: By simply
changing the place where the data segment registers point it is possible to
implement code reentrancy and re-instantiation in a fully code-transparent
transparent way. And as the segment registers are part of the CPU context, the
OS can automatically keep separate data context for each thread running off the
same piece of object code.
So by initially manipulating the segment registers, a process thread could
launch many threads running the same code, with each thread running off its own
data area (materialized by its own descriptor) The benefit is, no addressing
restriction nor inefficiency in the available addressing modes. This segment
mechanism was explictely designed in the CPU as the simplest and one of the most
efficient way to program a piece of code running a single application for many
users simultaneously, for instance.
Well, the problem is that Win32 provides no documented way to let a Ring 3
(user mode) process allocate an LDT entry. This means one can't allocate a
bunch of memory from the OS, ask the OS to create a selector for it and start a
new instance of a thread that would use this new data area as its data segment.
The only mechanism that Microsoft offers to achieve multi-instantiation is what
they call Thread Local Storage (TLS): It is implemented in two different ways,
Dynamic TLS and Static TLS (see [Richter 97.01] for details).
Dynamic TLS allows the programmer to use an OS allocated 64 DWORD array to
maintain thread-specific pointer. This implies:
* That the number of data items that can be tracked this way is limited to 64,
* that all instantiated memory accesses are done at best through an indirection
* that access to the array is accomplished by system calls, making the mechanism
even less efficient
The least inefficient way, static TLS, is still hardly acceptable: Compile-time
storage is allocated in the .TLS section, and the OS replicate the TLS segment
for each new thread that is started. This means that each thread in the process
gets a block of TLS storage the size of all the TLS data from all the threads...
In other words, the TLS section is allocated as if it where global for all the
threads, and the threads that don't need any instantiation data still get it.
In addition, and as pointed out by [Richter 97.01], "on an x86 CPU, three
additional machine instructions are generated for every reference to a static
TLS variable."
This is something most C programmers don't see, since the extra overhead only
appears in the code generated by the compiler, that most C programmers don't
look at or really care about. The problem is completely hidden at the C source
level. But it certainly doesn't look so to the assembly language programmer,
and the efficiency of the resulting code is obviously much lower. Furthermore,
using TLS, the programmer looses in the process the automatic inter-thread
protection the use of segmentation would have provided: attempts to access data
outside a memory segment is something an Intel CPU automatically trap.
Finally, static TLS can only be used with implicitly loaded DLLs. No Win32 OS is
able to properly initialize TLS storage for explicitly loaded DLLs (loaded via
LoadLibrary). For details, see "Static Thread-Local Storage", [Richter 97.01].
The bottom line is that, at least for the Intel implementation, TLS is hardly
more than a dirty and very inefficient kludge.
Apart from TLS, the only other official way to achieve multi-instantiation in
theWin32 world is to create multiple processes (rather than multiple threads).
This belongs to the steam-hammer action category though, and doesn't compete
for efficiency with the lightweight multi-threaded way:
* Each process requires a separate .EXE file,
* Each process requires reloading / remapping the same memory image for each
instantiation of the process.
* Changing process context is more costly than changing thread context (all
process-local items are part of the process context and don't need to be
changed when switching thread context inside the same process).
* Using separate processes precludes the use of any of the lightweight
intra-process synchronization and data sharing mechanisms such as global memory
items, critical sections, etc,... And since the inter-process synchronization /
communication mechanisms have to cross process boundaries, they are much more
costly than the intra-process ones.
Whenever we had a chance to ask, we only got two explanations so far from
Microsoft personnel about this missing feature:
The first one is the "flat model" dogma: "Win32 uses the flat model, and this
model precludes the use of segmentation."
This is simply not true. Using the flat model never prevented the use of
segmentation, as the Intel CPU documentation clearly states. There is no such
thing in the Intel CPU as a "flat model bit", and using a flat model is a pure
programming convention / convenience. No CPU-inherent technical limitation
prevents a programmer from occasionally using a segment register for any reason.
As we mentioned above, the best evidence is that Microsoft themselves use
segmentation in the Win32 world: in any Win32 thread, the FS register always
contains a special descriptor, that doesn't follow the flat model rules, and is
used to access the TID (Thread Information Block, see [Pietrek 95.01]). The
lack of access to segmentation from Ring 3 code only comes from an OS design
decision.
The second explanation Microsoft commonly gives about the lack of access to
segmentation from Ring 3 code is the need for portability, and the lack of
hardware mechanisms to implement segmentation on non-Intel platforms. As we
already mentioned, this looks to us as a very moot point, as
* portability is non-existent in the Win9x world anyway, since there is no such
thing as a non-Intel-compatible Win9x platform, and conversely Win9x-specific
features are only supported on Wintel32 platforms,
* portability is of little use in the NT world, and it is even diminishing:
cumulated sales of NT on non-Intel platforms are a milli- or micro-market, even
shrinking now the MIPS and the PowerPC contenders threw the towel,
* there is only one non-Intel machine left in the arena (the Digital Alpha), it
is currently marginal, and this situation doesn't seem likely to improve:
AutoDesk, makers of AutoCAD, one of the major selling application for the Alpha,
recently decided to drop support for the NT/Alpha platform: there was not enough
demand...
* from an API point of view, it would only take two or three Intel-specific NT
system calls to implement this mechanism, and last but not least,
* it's ultimately the responsibility of the people designing application
software to decide whether portability is (or is not) a more desirable goal to
achieve than efficiency on any given platform they decide to choose.
We think there is clearly a case here for Microsoft implementing a few
Intel-specific syscalls and allowing proper thread instantiation through the use
of selectors and Ring 3 LDT manipulation.
Since we are not likely to see Microsoft change their position on this point
in the short term (!), here are some alternate solutions for those assembly
language programmers that the TLS kludges do not satisfy:
One way could be to restrict all data to local (stack based) storage, but there
are some problems with that approach:
* it severely limits the addressing modes available to the programmer. This is
particularly limiting when complex, nested structures / arrays need to be
accessed,
* it imposes severe architectural constraint to the programmer: there are many
programming situations where global data access is required, specially in
time-critical real-time applications, when the database is large enough. A
number of Win32 constructs actually require global data access,
Another variant could be:
1. Program the thread to instantiate, grouping its instance data together,
2. compute at runtime the size of the static RAM database for the thread,
3. allocate a chunk of memory of the same size,
4. initialize the new chunk as needed (possibly by a mere memory move from the
original to the new chunk),
5. compute the offset between the original memory block and the new chunk,
6. load a base register with the result and
7. address each and every instantiated variable through based addressing, using
a different chunk (with a different base) for each thread to instantiate.
This solution is slightly better: it provides a "global" database with no size
limitation, while still leaving stack based addressing to parameter passing and
true local storage.
But still far from ideal:
* it sacrifices a precious and scarce base register for the whole life of the
thread,
* it precludes any access to the direct addressing mode, the simplest an least
error prone of all,
* it prevents the use of base + index addressing inside the thread instantiated
data block (since base is already used to maintain access to the data block),
seriously complicating access to complex data structures, and
* if the programmer forgets to use base addressing on any single instruction,
the program will access the RAM instance of the original thread rather than
that of the thread it's running in (ouch), creating very hard to track bugs.
But everything being relative, keep in mind that the same kind of error is even
more likely to happen using the much more twisted TLS addressing ways.
The irony is that nearly the same logic as we described above would apply if LDT
selector allocation were allowed: step 5 above (and following ones) would be
replaced by something like
5. Allocate an LDT selector
6. Load a segment register with the result
7. Address each and every variable just as you would if this thread were alone:
It is actually alone to access this memory segment.
The main difference is that each memory access could be then be achieved safely
and efficiently using any and all of the addressing methods the CPU offers.
5 Win32ASM Toolkit
The toolkit contains an undefined number of files. Undefined, because
* the toolkit is a never ending work,
* the documentation always tend to lag behind,
* We have already postponed the release of this document too much, waiting for
planned enhancements that did not make it,
* it is probably better to provide a file with no documentation (or an obsolete
documentation) than no file at all,
* these files should be considered as "work in progress."
For these reasons, we can not guarantee that files associated with this
documentation match exactly what is described, nor can we insure that they are
fully stable. In other words, and as stated in the disclaimer ahead of this
document, you are using this code at your own risks...
5.1 The Example files
There are at least two example files with source code along with this document:
Win32Proto and Win32DLL.
Win32Proto is a skeleton Win32 program including a tool bar, some dialog boxes,
and a small bunch of gadgets. It can be used as a framework for real projects.
Win32DLL demonstrates how to build a 100% MASM DLL, and also how a DLL can be
called from a 100% MASM .EXE program.
See the corresponding subdirectories and the included README files for more
details.
5.2 The include files
5.2.1 General Include files
5.2.1.1 Win32Inc.equ
This include file is generally found ahead of each and every Win32ASM
application module.
It includes four other include files, that are needed in about all
circumstances:
Include UnicAnsi.EQU ;Unicode / ANSI stuff
Include Win32Types.EQU ;Various typedefs
Include Win32Defs.EQU ;Many equates
Include Win32Strs.EQU ;Various Windows structures
These four files and their contents are described below.
5.2.1.1.1 UnicAnsi.equ
This include file handles the character set issues. At the time I am writing
this, I am far from having covered, or even started studying the topic: All my
current works has to be Win95 compatible, and I thus I have to stick to ANSI
representation (Win95 does not support Unicode format). So the only part of
UnicAnsi.equ I am actually using today is the UnicAnsiExtern macro (see below).
Most of the other material in this file directly comes from Sven B. Schreiber's
Walk32 work, mentioned in this document, and has not yet been used or tested in
the Win32ASM environment. I could even have broken it while reshuffling it
around and not have realized it yet.
5.2.1.1.1.1 The UnicAnsiExtern macro:
Sven resolved the Unicode/ANSI issues at link time, in his own linker. Since I
had to use the Microsoft linker, I had to solve the problem another way.
I ended up doing it about the same way Microsoft does it in their "C" headers,
by using a compile time macro.
Win32ASM uses the UnicAnsiExtern macro to turn a function name into a TEXTEQU
containing its character set dependant name. The macro uses the Unicode switch
to postpend an "A" (for ANSI) or a "W" (for Wide) and compose the true external
name of the function.
The UnicAnsiExtern macro has to reference each character set dependent function
ahead of its PROTO definition. You will find groups of UnicAnsiExtern macros
ahead of any Win32 API include file that contains charset dependent functions.
At assembly time, any reference to the generic name of a function is
automatically changed by the UnicAnsiExtern macro into the relevant charset
dependent name.
If during your own development, you add a Win32 PROTO to some Win32 API equate
file and get a undefined reference at link time, double-check the function name
in the Win32SDK: if you spelled it properly (case included), you might have hit
a function that is character set dependent. In this case, add an UnicAnsiExtern
entry with the name of the new function ahead of the equate file, before your
PROTO definition.
5.2.1.1.1.2 The String macro
5.2.1.1.2 Win32Types.equ
This file contains various TYPEDEF definitions.
A number of these TYPEDEFs are used in various structures in the Win32Strs.equ
file.
5.2.1.1.3 Win32Defs.equ
This file contains miscellaneous Win32 EQUate and TEXTEQU definitions.
5.2.1.1.4 Win32Strs.equ
This file contains a number of Win32 structure definitions.
5.2.1.2 Win32Res.equ
This file contains numerous EQUates related to resource definitions.
5.2.2 API header include files
These include correspond to Win32 DLLs and their Import libraries.
Each .equ file contains
An INCLUDELIB referencing the import lib, instructing LINK to pull it in,
the function headers (PROTOs) corresponding to the .DLL of the same name,
related structures, equates and constants.
I only started to organize the API header include files this way recently. In
addition, I started by using existing include files for Win equates: First, the
file provided by Microsoft in the DDK and derived from the Winbase.h file, then
later those compiled by Sven B. Schreiber in Walk32. And finally, a number of
equates and structures are not specific to a specific DLL but are used instead
by several.
As a result, a number of structures (too many structures), equates and constants
are not located in the .equ file they should belong to, but can be found in one
of the "General Include files" (see 5.1.1) instead.
In addition, as I explained above, I only create and fill these files on a "as
needed" basis. So they can in no way be considered exhaustive. The WinMM.equ
file, for instance, contains a single PROTO definition at the time I am writing
this. Since my programming is more oriented toward console (service)
applications, not that much of the Windows API is covered at this point.
This situation should slowly improve with time, as I keep on adding new
functions, structures and reorganizing this file set as needed.
5.2.2.1 CommCtl32.equ
5.2.2.2 CommDlg32.equ
5.2.2.3 GDI32.equ
5.2.2.4 Kernel32.equ
5.2.2.5 TAPI32.equ
5.2.2.6 User32.equ
5.2.2.7 WinMM.equ
5.2.2.8 WinSpool.equ
5.3 The macro files
5.3.1 Instr.mac
INSTR.MAC contains various utility macros. Some of them more or less extend the
set of instruction / directives of MASM, thus the name of the macro file.
5.3.1.1 Structuring directive extensions
5.3.1.1.1 .BLOCK & ENDBLOCK
.BLOCK and .ENDBLOCK generate no code. They simply define a block of code with
a single exit point, located after the ENDBLOCK. One can jump out of a BLOCK
through the regular structuring directives, such as .BREAK, .BREAK .IF, etc...
.BLOCK is a synonym for ".REPEAT" and .ENDBLOCK a synonym for ".UNTIL 1," but
the .BLOCK and .ENDBLOCK name make their purpose more obvious and increase code
readability.
.BLOCK ;We just sent DLE ++ DLE 0.
CALL FBIIGetParms ;get other end's parms,
.BREAK .IF CARRY? ;Drop out if error.
CALL FBIISendParms ;queue our parameters packet,
.BREAK .IF CARRY?
CALL FBIxWait4Tx ;Wait until we're acked to go to data
.BREAK .IF CARRY?
CALL FBIxRelTxSem ;Release the Tx semaphore we just used.
CALL FBTxQuotePatch ;Update the quote table.
CALL FBTxInit1Init ;Initiator, end of init, Rxer and Txer.
CALL FBRxInit1Init
CLC
.ENDBLOCK
5.3.1.1.2 FOREVER
.FOREVER is a termination for a .REPEAT loop that unconditionally jumps back to
the head of the loop. It is synonym to ".UNTIL 0", but here again; readability
is the key:
.REPEAT / .FOREVER is simply more explicit than .REPEAT / UNTIL 0.
.REPEAT
INVOKE GetMessage,
OFFSET winMsg, ;Adress of Msg structure,
0, ;Window to get msg from,
0, ;Filter min,
0 ;filter max.
.BREAK .IF (EAX == 0) ;Can this ever happen here?...
INC StatTAPIMsgs ;Count messages we see.
; $Display 'Got a Win message',$EOL
INVOKE DispatchMessage, ;Dispatch msg to proper winproc, and
OFFSET winMsg ;loop again.
.FOREVER
5.3.1.1.3 Condition mnemonics in structuring directives
We have seen above in "Missing conditions in structuring directives", page 33,
that the mnemonics the structuring directives allow are very limited set of
condition mnemonics and that the structuring directives don't allow the direct
generation of all legal jumps the Intel CPUs can handle.
We can't fix the latter, but we can slightly improve the former:
The INSTR32.MAC file adds 3 mnemonics:
EQUAL? Synonym of ZERO?
BELOW? Synonym of CARRY?
ABOVEorEQUAL Synonym of !CARRY?
5.3.1.2 Saving and restoring registers
The SAVE and RESTORE macros generate multiple PUSH and POP instructions. They
are designed in such a way that the same list of registers (in the same order)
can be used for both SAVE and RESTORE, reducing a possible cause for errors.
In addition, the "F" register is handled specially and generates the right
instruction (PUSHFD or POPFD).
SAVE EAX,EBX,EDI
CALL FooBar
RESTORE EAX,EBX,EDI
5.3.1.3 UnusedParm
The UnusedParm macro is useful in PROCs where some entry parameters are defined
but not used. This happens very often in CALLBACK procedures, for instance.
In this case, if the warning level is set to the maximum value as it should,
MASM generates a warning.
UnusedParm allows you to disable the warning (and document the fact that
actually not using the parameter is not a bug).
MSGPROC WinProcCMD_ID_HELP_ABOUT
INVOKE DialogBoxParam,
hInst, ;Process instance,
IDD_ABOUTBOX, ;"About" box template resource,
hWnd, ;owner window,
OFFSET AboutDlgProc, ;dialog box procedure,
0 ;lparam for WM_DIALOGBOX message.
XOR EAX,EAX
RET
UnusedParm wMsg
UnusedParm wParam
UnusedParm lParam
WinProcCMD_ID_HELP_ABOUT ENDP
5.3.1.4 Internal consistency checking macros
These macros are more or less equivalent to the ASSERT macros present in some
HLL. The only reason their name is not ASSERT is that I created them before
ASSERT became popular in C (and yes, we're still talking about a post-neolithic
era).
All of these macros take a condition code as their first parameter. As the name
of the macro imply, the condition mentioned must be realized, or something
terrible will happen. For the MUSTBE family of macros, failure will yield a
call to a FatalError routine. As the name suggests, FatalError ends up killing
the calling process. But before doing so, it displays as much pertinent
information about the problem as it can, the minimum being the address where
the problem occured. The minimum FatalError routine is a breakpoint (INT 3),
that will either directly invoke a debugger (if one capable enough is present
in the machine) or invoke the OS "process abend" routine, that will display the
registers at the time of the problem (and optionally offer to invoke a debugger).
The top of stack will show the address where the INT 3 occured.
There is a more powerful, full fledged FatalError routine as part of this
package (see page 51).
The other variations of the MUSTBE macro take a second parameter, a message
that is displayed before the faulty process exits.
5.3.1.4.1 MUSTBE
This routine does not take any message. The only data the FatalError routine
has are the contents of the registers and condition codes, and the address
where the problem occurred (sitting at the top of the stack).
AND EAX,OnLine ;Already online?
MUSTBE Z ;Yes, can't be here. Just crash.
5.3.1.4.2 MUSTBEM
This routine takes an additional, optional parameter, an error message string.
The message is generated in the .CONST section, headed by a byte length, and an
INVOKE FatalError is generated, with a pointer to the aforementionned message.
CMP hCall,0 ;Check call handle:
MUSTBEM E,'FoolineMakeCall: Call handle already active ?!'
5.3.1.4.3 MUSTBEMGLE
Same as MUSTBEM, but the macro invokes a variation of the FatalError routine,
FatalErrorGLE. FatalErrorGLE invokes GetLastError, formats the corresponding
OS error message and present that information together with all the relevant
information available to the regular FatalError routine.
INVOKE SetConsoleCtrlHandler,
ADDR BruteForceExit,
TRUE
OR EAX,EAX
MUSTBEMGLE NZ,'FooMain: SetConsoleControlHandler failed'
5.3.1.4.4 SHOULDBE
This macro is in essence equivalent to the MUSTBE macro with a major difference:
it invokes a "Warning" routine instead of a "FatalError" routine, and "Warning"
is supposed to return to the point it was called without changing any register
or condition code. Although I have had the "SHOULDBE" macro around forever,
I never got around implementing the Warning routine. The FatalError ended up
being sufficient for my own use.
5.3.1.5 Enumeration macros
The enumeration macro allows the generation of a 0 based sequence of numeric
equates, with associated symbols. This macro is useful for defining symbolic
corresponding to offset of entries into tables, arrays, etc...
ENUM defines the beginning of an enumeration. It takes a type of data as its
required parameter. This can be a predefined type (such as WORD or DWORD), a
structure, a typdef, etc...
ENUMITEM takes a symbolic name as its required parameter. It will assign the
name to the generated offset.
ENUMEND defines the end of the enumeration table.
Example of use:
ENUM WORD
ENUMITEM FBTxStIdle ;TXer does plenty of nothing.
ENUMITEM FBTxStHead2 ;About to send 'B', second header byte.
ENUMITEM FBTxStData ;About to send a data byte.
ENUMITEM FBTxStCRC8 ;TxEr about to send CRC-8 (negociation).
ENUMITEM FBTxStCRC16L ;TxEr about to send CRC-16, LSB
ENUMITEM FBTxStCRC16M ;TxEr about to send CRC-16, MSB
ENUMITEM FBTxStDLEd ;TXer about to send DLE'd data.
ENUMITEM FBTxStRing ;Transmitting from ring buffer.
ENUMITEM FBTxStChain ;Txer about to check for chained
;transmission.
ENUMEND
The above example will equate FBTxStIdle to 0, FBTxStHead2 to 2, etc,...
5.3.1.6 Breakpoint macros
Debug generates an INT 1 (debugger call). Executing it normally results in the
debugger getting the focus.
Break is the regular INT 3 breakpoint.
Break2 generates a software NMI, INT 2.
If your machine contains a debugger configured to trap either breakpoint,
executing Break or Break2 will wake up the debugger.
Otherwise, the OS will raise the exception dialog box (which might give you a
chance to raise the debugger).
5.3.2 InitExit.mac: Runtime Initialization / Termination Macros
These macros are designed to resolve the boring problem of initialization /
termination of library routines.
The problem is the following:
Initialization routine(s) must be executed before the main logic of a program
can be started.
A typical use for an initialization routine is to create resources, such as
critical sections, that will be required later by routines called in a
multithreaded context. Because of the multithreaded context, the resources must
obviously be created and initialized before the threads that use them are
created. Otherwise, a race condition could occur when two threads are trying at
the same time to create a critical section, for instance. And it is a deadly
sin to initialize a critical section twice...
So the only solution is to execute all the initialization routines at a time
when the main thread of the process is the only one running. Of course, it is
possible to call these routines explicitly from the startup code of your
program. But for large programs, this is boring, error prone and often wasteful.
* Boring, because you have to look up each routine you use, check for an
initialization routine and invoke it.
* Error prone, because if you forget to insert the initialization call in the
startup code whenever you reference a new library routine, disaster will strike
at some point. And then, later, during testing, you might figure out that you
forgot to call the termination routine and some cleanup action never gets
properly performed.
* Wasteful, because if one day, you stop invoking a given routine in your
program, you will likely forget to remove the initialization and/or termination
routine, and even if this does no harm, this will result in the linker still
pulling the whole library member in your code, because the initialization code
is still invoked there.
The solution to this are the four Init/Exit macros.
You code the $InitRoutine macro in the same module as the library routine
itself, together with the initialization code for the routine. Ditto for the
$ExitRoutine macro. So the library routine and everything related to its
initialization and termination can be coded in the same source module, that of
the runtime library, and only there. From them on, you don't have to think
about initializing and terminating a library routine anymore.
If you call a library routine from within your code, this will pull its
initialization code at the same time. And the initialization code of all
library routines will be invoked all at once, in the startup code of your
program, but you will not even have to know which routines actually require
initialization and which do not.
Stop using some library routine, and it will not be pulled in the .EXE file
anymore. Nor will its initialization code.
So where is the trick?
The trick is in the linker (and its "Grouped Section" feature mentioned above)
and in four rather simple macros.
The macros create two sections. One is used to build a table containing the
addresses of all those initialization routines. The other one is used to build
a table containing the addresses of all the termination routines.
The $InitRoutine is used to declare an entry point to an initialization
routine. Each $InitRoutine macro will define the address of its initialization
routine and put it in the initialization section. Ditto for each $ExitRoutine,
that will put the address of its exit routine in the termination section. If
convenient, a single module might have as many $InitRoutine and $ExitRoutine
macro calls as needed.
When the linker pulls a library routine, it will find their initialization and
termination sections (if any) and concatenate them with those of the other
library routines used by the program.
The only thing that the application startup code will need to do is to declare
a $RunInitRoutines macro in its startup code, before any thread is created. The
$RunInitRoutine does not take any parameter and does not even "know" whether
there is any initialization routine to call in the whole project. If there is
no initialization to perform, the initialization section will be empty.
The $RunInitRoutine generate a very simple loop that calls in turn each address
in the initialization section.
Ditto for the $RunExitRoutines macro, that is called by the application in its
final code, presumably after all active threads have terminated.
An example is probably appropriate at this point:
The modules requiring initialization and / or termination look like this:
In a module defining memory pools handling routines:
; Declare all routines that require initialization.
$InitRoutine MPInitialize ;Initialize memory pools
$InitRoutine MPCritSectInit ;Initialize critical section
;for the mempool routines.
$ExitRoutine MPTerminate ;Cleanup / release memory pools
MPInitialize PROC
; Initialization code here...
RET
MPInitialize ENDP
MPCritSectInit PROC
; Initialization code here...
RET
MPCritSectInit ENDP
MPTerminate PROC
; Initialization code here...
RET
MPTerminate ENDP
The memory pool handling routines above need both initialization and termination.
In a module defining message queues handling routines:
; Declare the routine that requires initialization.
$InitRoutine MQInitialize ;Initialize message queue
MQInitialize PROC
; Initialization code here...
RET
MQInitialize ENDP
The message queue routines above require an initialization routine but no
termination routine.
The main code for a process using the initialization macros looks like this:
.CODE
MainProc PROC
; At this point, no other thread is running, so none of the
; initialization routines encurs the risk of a race condition.
$RunInitRoutines ;Run all initialization routines.
;Carry will be set if some init
;routine failed. In this case,
;execution of init routines will
;stop after the failing routine.
.IF !CARRY ;If no initialization error,
CALL MyMainCode ;the main code is here...
.ENDIF
SAVE EAX
$RunExitRoutines ;Now execute all the registered
RESTORE EAX ;Exit routines.
INVOKE ExitProcess, ;All done,
EAX ;pass exit retcode.
MainProc ENDP
Here is the detailed, "under the hood" view of the four macros. You do not need
to know exactly how this works to use it, anyway. If you don't care, just skip
to the next paragraph.
The $InitRoutine macro allows automatic, application-wide registering of the
initialization routines.
The $ExitRoutine macro accomplishes the same for termination routines.
The $InitRoutine is used in any module containing initialization routine(s)
that must be executed before the main logic of a program can be started.
$InitRoutine and $ExitRoutine are used to declare initialization and exit
routines.
$InitRoutine declares a section, naming it @Init$
$ExitRoutine declares a section, naming it @Exit$
So if invoked in .ASM module FOO, $InitRoutine and $ExitRoutine will create
segments named @Init$FOO and @Exit$FOO respectively. Both macros are called
with the name of a routine. The macro will generate a DWORD pointer to the
routine, and place this pointer in the @Init$ (or @Exit$) segment/section.
When a section name contains a '$' sign, a PE linker processes it specially, as
mentionned in "Grouped Sections", page 38
As a result, the contents of all "@Init$
concatenated in the "@Init" section and the contents of all "@Exit$
sections will be concatenated in the "@Exit" section. The linker sorts the
sections fragments by alphabetical order of
So the @$InitRoutine macros of all modules contribute to the construction of a
global table containing all the addresses of the initialization routines and
located in section "@Init".
Likewise, the $ExitRoutine macros contribute to the construction of a global
table containing all the addresses of the Exit routines and located in section
"@Exit".
The $RunInitRoutines and $RunExitRoutines in the startup / exit code of the
application put all of this together: they create the "@Init$" and "@Exit$"
segments (that will end up ahead of all other @Init and @Exit segments in
alphabetical collating sequence and contain the label at the top of the table),
and "@Init$zzzzzzzz" / "@Exit$zzzzzzzz" (that will hopefully end up
alphabetically after all other @Init and @Exit segments and contain a DWORD 0
as an end of table marker).
Finally, both the $RunInitRoutines and the $RunExitRoutines macro generate a
short code loop that goes down its associated list and calls each address in
the list.
In case the order of execution of some Init and/or Exit routines must be
ordered differently, it is possible to pass a second (optional) parameter to
the $InitRoutine ($ExitRoutine) declaration.
This second parameter is concatenated in the segment name ahead between the
@Init$ (@Exit$) and before the
linking order inside the group section and force the Init (Exit) routines to
execute in any suitable order (rather than by alphabetical order of modules).
For instance, a "Console Log" routine might need initialization before any
other routine so the other initialization routines might use the Console Log
PROCs to log what they did. Passing a second parameter of "0" (lowest in
collating sequence) might force the console log init routine to move ahead of
the list (providing no other less important module uses this same value and no
module is named "0.ASM").
5.4 The service routines
5.4.1 FatalError
See DEBUG.ASM.
6 Bibliography
6.1 [Booth, 96.01]
Rick Booth
Inner Loops
Addison Wesley Developer's Press
6.2 [Brain, 96.01]
Marshall Brain
Win32 System Services (Second edition)
Prentice Hall PTR
6.3 [Intel, 95.01]
AP-526 Application Note
Optimizations For Intel's 32-Bit Processors
(available as electronic documentation at www.intel.com)
6.4 [Petzold 96.01]
Charles Petzold
Programming Windows 95
Microsoft Press
6.5 [Pietrek 95.01]
Matt Pietrek
Windows 95 Systems Programming Secrets
IDG Books
6.6 [Rector & al, 96.01]
Rector & Newcomer
Win32 Programming,
Addison Wesley Developers Press
6.7 [Richter 96.01]
Jeffrey Richter
Win32 Q & A
in Microsoft Systems Journal, Sept 96, Vol. 9
6.8 [Richter 97.01]
Jeffrey Richter
Advanced Windows (Third edition)
Microsoft Press
6.9 [Schulman 94.01]
Andrew Schulman
Unauthorized Windows 95
IDG Books
Win32汇编语言教程
一、引言
Win32应用程序一般使用C语言编程,但是在某些需要进行深层编程的情况下,例如Win32应用程序执行机制分析、病毒清除、加密解密等深层编程,或者对于某些速度要求较高的程序,需要使用汇编语言(甚至机器语言)直接编写Win32应用程序。Win32应用程序虽然和其他32位应用程序(例如32位保护模式DOS程序)一样可以使用386汇编语言和保护模式编程,但是Win32应用程序的执行机制与其他32位应用程序有一定的差别,例如消息循环、动态链接等,Win32汇编语言也有其特殊的编程方式。目前国内极少看到有关Win32汇编语言的资料,市面上的汇编语言书籍一般只介绍DOS实模式汇编语言和386保护模式汇编语言,金山公司的《深入Windows编程》一书虽然介绍了使用汇编语言写Windows应用程序的方法,可惜该书只介绍了Win16汇编语言。为了使大家能对Win32汇编语言的基本编程方法有一定的了解,近日得闲,笔者编写了本教程,旨在抛砖引玉,如果本教程能够带领你走进神秘的Win32汇编语言世界,笔者心愿足矣。使用本教程,要求读者具有C语言编写Win32应用程序(Win32SDK编程)的基础。
二、进行Win32汇编语言编程的基本软件
进行Win32汇编语言编程,应该准备下列基本软件:
1、MASM 6.11以上版本的汇编器
MASM是Microsoft公司的汇编器,这是最基本的软件,必需MASM 6.11以上版本才能够汇编Win32汇编语言源程序。不过进行Win32汇编语言编程不必要全套的MASM 6.11,只要一个ML.EXE文件就可以了,Windows 95 DDK中带有MASM 6.11c的ML.EXE文
件,Windows 98 DDK中带有MASM 6.11d的ML.EXE文件,都可以使用。Turbo MASM 5.0(TASM)是Borland公司的汇编器,也可以用来汇编Win32汇编语言源程序,但是TASM的部分语法与MASM不同,用于MASM的Win32汇编语言源程序可能需要修改后才能用TASM汇编。本教程中的所有Win32汇编语言源程序都基于MASM。
2、Win32SDK
进行Win32汇编语言编程需要用到Win32SDK中的资源编译器(RC.EXE)和连接器(LINK.EXE),还需要用到Win32SDK中的引入库文件(KERNEL32.LIB、USER32.LIB、GDI32.LIB等)。如果没有Win32SDK,Platform SDK也可以,还可以安装Visual C++ 2.0以上版本的Visual C++,笔者使用的是Visual C++ 6.0。Borland C++ 4.0以上版本的Borland C++也可以使用,只是资源编译器和连接器的
文件名不同,分别是BRC.EXE(BRC32.EXE)和TLINK.EXE(TLINK32.EXE),选项也不尽相同,另外Borland C++不支持COFF格式的OBJ文件,汇编时不能使用/coff选项。
3、汇编语言编辑器
一个普通的文本编辑器,用于编辑Win32汇编语言源程序。EDIT、PWB等都可以,Visual C++等编程语言中的编辑器也可以,甚至WORD、WPS 97等可以编辑文本文件的字处理软件都可以,不过笔者推荐使用ASMEDIT,这是一个专用的汇编语言编辑
器,效果非常好。Win32汇编语言一般使用命令行方式汇编连接,经过一定的设置也可以在某些集成
环境(PWB、Visual C++、ASMEDIT等)下汇编连接,还可以使用NMAKE工具,不过本教程中只使用命令行方式汇编连接,也不使用NMAKE工具。
三、ANSI字符集API与UNICODE字符集API
Win32 API中凡是与字符有关的API都有两种不同的类型:ANSI字符集API和UNICODE字符集API,分别对应ANSI字符和UNICODE字符,Windows NT支持两种类型的API,Windows 95/98只支持ANSI字符集API。在WINDOWS.H头文件和其他Win32 API定义
头文件中,凡是与字符有关的API都有两种不同的定义,ANSI字符集API以API名称加字符“A”表示,UNICODE字符集API以API名称加字符“W”表示,并使用条件编译和宏定义实现自动根据当前字符集使用对应的API定义,例如GetModuleHandle函数的定义(包括在WINBASE.H头文件中):
WINBASEAPI
HMODULE
WINAPI
GetModuleHandleA(
LPCSTR lpModuleName
);
WINBASEAPI
HMODULE
WINAPI
GetModuleHandleW(
LPCWSTR lpModuleName
);
#ifdef UNICODE
#define GetModuleHandle GetModuleHandleW
#else
#define GetModuleHandle GetModuleHandleA
#endif // !UNICODE
与字符有关的数据结构也有类似的定义。本教程考虑到汇编语言使用条件汇编会导致不太直观,全部使用ANSI字符集API,这样也可以保证在Windows 95/98和Windows NT环境下的兼容性,所以本教程中许多API名称和数据结构的名称都加有“A”字符,读者可以方便地改用UNICODE字符集API。
四、一个简单的Win32汇编语言程序
读者可能一听到“汇编语言”四个字就觉得十分头疼!汇编语言给人的第一印象就是一大堆难以看懂又不直观的指令,而且不结构化,大量的标号、无条件跳转指令(JMP)和条件跳转指令让你难以看懂程序;过程(或者函数)的调用参数传递又不直观,要么直接使用寄存器传递参数,不符合结构化程序设计原则;要么使用堆栈传递参数,又不能有效地检验参数类型……想必Win32汇编语言更麻烦吧!还好,MASM 6.0以上版本的汇编器提供了很多结构化汇编语言伪指令,可以方便地实现汇编语言结构化程序设计,当你看完本教程以后,你可能会感觉到:Win32汇编语言并不比C语言麻烦多少。(如果读者看不懂本教程中的汇编语言源程序也不要紧,可以对照MASM 6.11的帮助看)和C语言Win32编程需要WINDOWS.H头文件和其他Win32 API定义头文件定义常量、数据结构和API一样,Win32汇编语言也需要包含文件(INC文件)定义常量、数据结构和API。不过笔者找了很长时间也没有找到一个完整的可用于Win32汇编语言的WINDOWS.INC文件或者WIN32.INC文件(倒是找到了用于Win16汇编语言的WINDOWS.INC文件),Turbo MASM 5.0中提供的WIN32.INC文件也不完整,只能用于自带的WAP32例子程序,而且与MASM 6.11不太兼容(听说NASM中有完整的WIN32.INC文件,可惜没有找到,也不知道与MASM 6.11是否兼容)。笔者只好自己定义常量、数据结构和API(根据WINDOWS.H头文件和其他Win32 API定义头文件定义),不过这倒
带来了不少好处——可以更好地了解Win32汇编语言的编程方法和原理。笔者编写了一个简单的Win32汇编语言程序,该程序的功能很简单:在屏幕上显示一个消息框。本程序只调用了两个API函数:MessageBox函数和ExitProcess函数,程序如下:
包含文件(MSGBOX.INC):
UINT TYPEDEF DWORD
LPSTR TYPEDEF PTR BYTE
LPCSTR TYPEDEF LPSTR
PVOID TYPEDEF PTR
HANDLE TYPEDEF PVOID
HWND TYPEDEF HANDLE
MB_ICONINFORMATION = 00000040h
MB_OK = 00000000h
MessageBoxA PROTO stdcall, :HWND,PCSTR,PCSTR,:UINT
ExitProcess PROTO stdcall, :UINT
源程序(MSGBOX.ASM):
.386p
.MODEL flat,stdcall
INCLUDE MSGBOX.INC
.STACK 4096
.DATA
WindowTitle BYTE 'MsgBox',0
Message1 BYTE 'This is a simple MessageBox
Win32 application.',0
.CODE
_startbr>
INVOKE MessageBoxA,0,ADDR Message1,ADDR WindowTitle,
MB_ICONINFORMATION or MB_OK
INVOKE ExitProcess,0
PUBLIC _start
END
汇编连接本程序的命令如下:
ml /c /coff /Cp msgbox.asm
link /subsystem:windows /entry:_start msgbox.obj kernel32.lib user32.lib
汇编命令中的/c选项表示只汇编,不自动连接;/coff选项表示生成COFF格式的OBJ文件(如果使用Borland的连接器不能使用/coff参数);/Cp选项表示标识符区分大小写。连接命令中/subsystem:windows选项表示连接器生成普通Windows可执行文件;/entry:_start选项表示程序入口点是_start标识符。连接时连接KERNEL32.LIB和USER32.LIB引入库。运行汇编连接后生成的MSGBOX.EXE文件,屏幕上将显示出一个消息框,消息框的标题是“MsgBox”,消息框中的字符串是“This is a simple
MessageBox Win32 application.”。Win32汇编语言源程序应该由.386p伪指令和.MODEL flat,stdcall伪指令开始,指示汇编器汇编386保护模式指令,并使用平坦内存模式(Win32内存模式)和stdcall函数调用方式(Win32标准函数调用方式)。PROTO伪指令定义函数原型(与C语言中函数原型的定义相似),可以定义函数名、调用方式和参数,INVOKE伪指令调用由PROTO伪指令定义的函数,可以方便地传递参数和检查参数类型。MSGBOX.INC文件中使用PROTO伪指令定义API函数,MSGBOX.ASM文件中使用INVOKE伪指令调用API函数,可见MASM 6.0以上版本的汇编器提供的结构化汇编语言伪指令大大简化了Win32汇编语言编程(本程序一条汇编语言指令也没有用到)。本程序调用了MessageBox函数显示消息框以后,调用了ExitProcess函数终止程序的执行,ExitProcess函数的作用是终止当前进程。
Win32应用程序一般使用C语言编程,但是在某些需要进行深层编程的情况下,例如Win32应用程序执行机制分析、病毒清除、加密解密等深层编程中,或者对于某些速度要求较高的程序,需要使用汇编语言(甚至机器语言)直接编写Win32应用程序。Win32应用程序虽然和其他32位应用程序(例如32位保护模式DOS程序)一样可以使用386汇编语言和保护模式编程,但是Win32应用程序的执行机制与其他32位应用程序存在一定的差别,例如消息循环、动态链接等,而且Win32汇编语言也有其特殊的编程方式。
1 进行Win32汇编语言编程的基本软件
进行Win32汇编语言编程,应该准备下列基本软件:
1.1 MASM 6.11以上版本的汇编器
MASM是Microsoft公司的汇编器,这是最基本的软件,只有MASM 6.11以上版本才能够汇编Win32汇编语言源程序。不过进行Win32汇编语言编程不需要全套的MASM6.11,只要一个ML.EXE文件就可以了,Windows 95 DDK中带有MASM 6.11c的ML.EXE文件,Windows 98 DDK中带有MASM6.11d的ML.EXE文件,都可以使用。Turbo MASM 5.0(TASM)是Borland公司的汇编器,也可以用来汇编Win32汇编语言源程序,但是TASM的部分语法与MASM不同。所以还是建议使用Microsoft公司的MASM6.11以上版本。
1.2 Win32SDK
进行Win32汇编语言编程需要用到Win32SDK中的资源编译器(RC.EXE)和连接器(LINK.EXE),还需要用到Win32SDK中的引入库文件(KERNEL32.LIB,USER32.LIB,GDI32.LIB等)。如果没有Win32SDK,用PlatformSDK也可以,还可以安装2.0以上版本的Visual C++。
1.3 汇编语言编辑器
任选一种文本编辑器,就可用于编辑Win32汇编语言源程序。如:EDIT、记事本等都可以,VisualC++等编程语言中的编辑器也可以,甚至WORD、WPS 97等可以编辑文本文件的字处理软件都可以。推荐使用ASMEDIT,这是一款专用的汇编语言编辑器。
2 一个简单的Win32汇编语言程序
汇编语言给人的印象往往就是一大堆难以看懂的指令,而且程序非结构化,大量的标号、无条件跳转指令(JMP)和条件跳转指令让你难以读懂程序;过程(或者函数)的调用参数传递又不直观,要么直接使用寄存器传递参数,不符合结构化程序设计原则;要么使用堆栈传递参数,又不能有效地检验参数类型……想必Win32汇编语言更麻烦吧!其实不然,MASM6.0以上版本的汇编器提供了很多结构化汇编语言伪指令,可以方便地实现汇编语言结构化程序设计,实际感觉是Win32汇编语言并不比C语言麻烦多少。与C语言编写Win32应用程序需要WINDOWS.H头文件和其他Win32API定义头文件定义常量、数据结构和API一样,Win32汇编语言也需要包含文件(INC文件)定义常量、数据结构和API。下面编写了一个简单的Win32汇编语言程序,该程序的功能很简单:在屏幕上显示一个消息框。本程序只调用了两个API函数:MessageBox函数和ExitProcess函数,程序如下:
包含文件(MSGBOX.INC):
UINT TYPEDEF DWORD
LPSTR TYPEDEF PTR BYTE
LPCSTR TYPEDEF LPSTR
PVOID TYPEDEF PTR
HANDLE TYPEDEF PVOID
HWND TYPEDEF HANDLE
MB_ICONINFORMATION = 00000040h
MB_OK = 00000000h
MessageBoxA PROTO stdcall, :HWND,:LPCSTR,:LPCSTR,:UINT
ExitProcess PROTO stdcall, :UINT
源程序(MSGBOX.ASM):
.386p
.MODEL flat,stdcall
INCLUDE MSGBOX.INC
.STACK 4096
.DATA
WindowTitle BYTE 'MsgBox’,0
Message1 BYTE’This is a simple MessageBox Win32 application.’,0
.CODE
_start:
INVOKE MessageBoxA,0,ADDR Message1,ADDR WindowTitle,
MB_ICONINFORMATION or MB_OK
INVOKE ExitProcess,0
PUBLIC _start
END
汇编连接本程序的命令如下:
ml /c /coff /Cp msgbox.asm
link /subsystem:windows /entry:_start msgbox.objkernel32.lib user32.lib
汇编命令中的/c选项表示只汇编,不自动连接;/coff选项表示生成COFF格式的OBJ文件(如果使用Borland的连接器不能使用/coff参数);/Cp选项表示标识符区分大小写。连接命令中/subsystem:windows选项表示连接器生成普通Windows可执行文件;/entry:_start选项表示程序入口点是_start标识符。连接时连接KERNEL32.LIB和USER32.LIB引入库。运行汇编连接后生成的MSGBOX.EXE文件,屏幕上将显示出一个消息框,消息框的标题是“MsgBox”,消息框中的字符串是“This is a simple MessageBox Win32application.”。Win32汇编语言源程序应该由.386p伪指令和.MODELflat,stdcall伪指令开始,指示汇编器汇编386保护模式指令,并使用平坦内存模式(Win32内存模式)和stdcall函数调用方式(Win32标准函数调用方式)。PROTO伪指令定义函数原型(与C语言中函数原型的定义相似),可以定义函数名、调用方式和参数,INVOKE伪指令调用由PROTO伪指令定义的函数,可以方便地传递参数和检查参数类型。MSGBOX.INC文件中使用PROTO伪指令定义API函数,MSGBOX.ASM文件中使用INVOKE伪指令调用API函数,可见MASM6.0以上版本的汇编器提供的结构化汇编语言伪指令大大简化了Win32汇编语言编程(本程序一条汇编语言指令也没有用到)。本程序调用了MessageBox函数显示消息框以后,调用了ExitProcess函数终止程序的执行,ExitProcess函数的作用是终止当前进程。
Win32应用程序的入口点是WinMain函数,实际上WinMain函数是被C语言的初始化和结束代码调用的,Win32应用程序的真正入口点和DOS应用程序没有什么区别,都是在文件头中指定的应用程序起始点。Win32汇编语言没有C语言的初始化和结束代码,必须自己编写初始化和结束代码调用WinMain函数(过程),WinMain函数的原型是:
int WINAPI WinMain(HINSTANCE hInstance,HINSTANCE hPrevInstance,LPSTR lpCmdLine,int nShowCmd);
WinMain函数有4个参数,分别是:
hInstance——应用程序当前实例的句柄,可以通过调用GetModuleHandle函数获取。
hPrevInstance——应用程序前一个实例的句柄,Win32中当前地址空间中不会有应用程序的其他实例在运行,该参数通常设置为NULL(提供该参数只是便于移植Win16应用程序源程序)。
lpCmdLine——命令行参数,可以通过调用GetCommandLine函数获取。
nShowCmd——主窗口的显示状态,可以设置成SW_ SHOWDEFAULT(缺省状态)。
3 资源在Win32汇编语言程序中的使用
资源在Win32应用程序中是很重要的。Win32汇编语言程序中使用资源的方法与C语言程序没有很大的差别,都可以用资源编辑工具生成资源源文件和资源头文件,然后使用资源编译器编译资源源文件,将生成的资源文件(RES文件)与汇编器生成的目标文件和引入库文件连接在一起就可以了(资源头文件需要移植到汇编语言上,建立一个资源包含文件)。
4 控制台Win32汇编语言程序
控制台Win32应用程序运行在控制台(MS-DOS窗口)下,与DOS下的C语言程序十分相似,程序入口点是main函数,使用标准C语言I/O函数进行I/O,也可以调用API。实际上控制台Win32应用程序与普通Win32应用程序没有本质上的区别,标准C语言I/O函数实际上还是调用了API,在控制台上进行I/O。控制台Win32汇编语言程序与C语言程序有一定的差别,需要获取控制台I/O句柄,然后使用控制台I/O句柄进行I/O(与文件句柄I/O相似),下面以MASM6.11中自带的控制台Win32汇编语言程序实例(HELLO.ASM)为例,程序如下:
.386
.MODEL flat, stdcall
STD_OUTPUT_HANDLE EQU -11
GetStdHandle PROTO NEAR32 stdcall, nStdHandle:DWORD
WriteFile PROTO NEAR32 stdcall,
hFile:DWORD,lpBuffer:NEAR32, nNumberOfBytesToWrite:DWORD,
lpNumberOfBytesWritten:NEAR32,lpOverlapped:NEAR32
ExitProcess PROTO NEAR32 stdcall,dwExitCode:DWORD
.STACK 4096
.DATA
msg DB "Hello, world.", 13, 10
written DD 0
hStdOut DD 0
.CODE
_start:
INVOKE GetStdHandle,STD_OUTPUT_HANDLE ;Standard output handle
mov hStdOut, eax
INVOKE WriteFile,hStdOut, ; File handle for screen
NEAR32PTR msg, ; Address of string
LENGTHOFmsg, ; Length of string
NEAR32PTR written, ; Bytes written
0 ; Overlapped mode
INVOKE ExitProcess,0 ; Result code for parent process
PUBLIC _start
END
汇编连接本程序的命令如下:
ml /c /coff /Cp hello.asm
link /subsystem:console /entry:_start hello.obj kernel32.lib
连接命令中/subsystem:console选项表示连接器生成控制台Win32应用程序。在MS-DOS窗口(控制台)下运行汇编连接后生成的HELLO.EXE文件,将会像MS-DOS程序一样显示出“Hello, world.”字符串。本程序调用了GetStdHandle函数获取标准控制台输出设备句柄,然后调用WriteFile函数向标准控制台输出设备句柄写字符串,完成控制台字符串输出,最后调用了ExitProcess函数终止程序的执行。
5 结语
MASM 6.0以上版本的汇编器提供了很多结构化汇编语言伪指令,大大简化了Win32汇编语言编程。因而,Win32汇编语言在一定程度上削弱了以往的汇编语言(如DOS实模式汇编语言和386保护模式汇编语言)复杂、难懂的特点。尽管与高级语言相比,Win32汇编语言仍然比较复杂,但是Win32汇编语言在某些特殊方面有高级语言不可比拟的优点,如果你正在想编程清除Win32病毒(例如CIH病毒),或者你正在编写对速度要求较高的程序(例如大量计算的程序),不妨试试Win32汇编语言,或许能够解决你的燃眉之急。
参考文献
[1] 严义,包健,周尉.Win32汇编语言程序设计教程[M].北京:机械工业出版社,2004.
[2] 罗云彬.Windows环境下32位汇编语言程序设计[M].北京:电子工业出版社,2002.
(责任编辑:刘翠玲)
───────────────
第一作者简介:孙彦生,男,1966年5月生,1988年毕业于鞍山科技大学计算机科学及应用专业,讲师,山西工程职业技术学院计算机工程系,山西省太原市新建路,030009.
Win32 Assembler Language and Win32 Application ProgramDesign
SUN Yansheng
ABSTRACT: This paper introduces the Win32 assemblerlanguage, and probes into the Win32 application program design by using Win32 assemblerlanguage.
KEY WORDS: Win32 assembler language; Win32 applicationprogram; program design