Implementing the DLL file format in x86 NASM

~
Member
Posts: 1228
Joined: Tue Mar 06, 2007 11:17 am
Libera.chat IRC: ArcheFire

Post by ~ »

I'm trying to learn how to implement DLLs in pure assembly, by hand, without a linker, just NASM.

My intention is to overcome technical limitations that wouldn't easily let me develop more recent code in Win9x or Linux. I want to stop using Visual Studio for natively porting simpler programs to Win9x and later to my own system.

So far I've only learned how to make simple PE EXEs like this one:
2017-10-22--ClockCount.zip

I know that a DLL mostly adds export tables alongside the import tables, but does anybody know of low-level references that would let me structure a DLL directly in assembly, just like the sample PE EXE above?
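For orientation, the export side of a DLL hangs off one 40-byte record, named IMAGE_EXPORT_DIRECTORY in the PE/COFF documentation, which the first data directory entry points to. Below is a minimal sketch of that record's field order and sizes. Python is used only to spell out the layout; in NASM the same fields would be a run of `dd`/`dw` lines. The helper names and every RVA in the usage note are made up for illustration.

```python
import struct

# Field order and sizes of IMAGE_EXPORT_DIRECTORY (little-endian, 40 bytes).
EXPORT_DIRECTORY = struct.Struct("<IIHHIIIIIII")

def pack_export_directory(name_rva, ordinal_base, n_functions, n_names,
                          functions_rva, names_rva, ordinals_rva):
    """Pack one export directory record (hypothetical helper).
    Every *_rva value is an offset from the image base, not a file offset."""
    return EXPORT_DIRECTORY.pack(
        0,              # Characteristics (reserved, 0)
        0,              # TimeDateStamp
        0,              # MajorVersion
        0,              # MinorVersion
        name_rva,       # Name: RVA of the ASCIIZ DLL name
        ordinal_base,   # Base: first ordinal, usually 1
        n_functions,    # NumberOfFunctions: entries in the address table
        n_names,        # NumberOfNames: entries in the name and ordinal tables
        functions_rva,  # AddressOfFunctions: array of DWORD RVAs
        names_rva,      # AddressOfNames: array of DWORD RVAs to ASCIIZ names
        ordinals_rva,   # AddressOfNameOrdinals: array of WORD indexes
    )

def name_and_ordinal_tables(functions):
    """functions: export names in address-table order (hypothetical helper).
    Returns the names sorted bytewise, plus each name's index into the
    address table. The name pointer table has to be in this sorted order
    because it is searched with a binary search."""
    pairs = sorted((name.encode("ascii"), index)
                   for index, name in enumerate(functions))
    return [p[0] for p in pairs], [p[1] for p in pairs]
```

For example, `pack_export_directory(0x3100, 1, 2, 2, 0x3028, 0x3030, 0x3038)` yields the 40 bytes for a DLL with two named exports, with all seven RVAs invented.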
alexfru
Member
Posts: 1112
Joined: Tue Mar 04, 2014 5:27 am

Post by alexfru »

Octocontrabass
Member
Posts: 5586
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

~ wrote:without a linker, just NASM.
Why not write your own linker that can be built with just NASM? That way, you can use NASM to create your linker and your object files, then use your linker to turn your object files into the final EXE or DLL file.
~
Member
Posts: 1228
Joined: Tue Mar 06, 2007 11:17 am
Libera.chat IRC: ArcheFire

Post by ~ »

I have managed to implement a first attempt at a DLL, called WinAPI32.dll, intended to import the entirety of the standard WinAPI and MSVC library from Windows 9x to Windows 10, and then export any custom functions and, later, data.

It will only load from a simple program, also written purely in assembly, because of a bug or weakness I haven't been able to find yet (that's why I'm asking for help):

See the binary if you want a fast preview of the implementation result:
http://sourceforge.net/projects/x86wina ... p/download

Mirror (copy the EXE and the DLL into the same directory):
http://sourceforge.net/projects/x86wina ... p/download


Here is the full source code for the DLL and the DLL-loading program. So far I can only load the produced DLL with this program, but not with other standard programs.


Can you find any bug in the DLL or EXE code that prevents the DLL from being loaded and called from any program without crashing? I mainly used Iczelion's PE tutorials 7 and 17 to understand what was missing, mostly the export table, which I placed at the very end of the import data section.

I will keep trying, but I need to find the error that makes the DLL crash when I try to load it with the Iczelion tools from PE tutorials 7 or 17 (getting export information and calling an exported DLL function).
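Without the source in front of us the cause can only be guessed at, but a few header facts are worth ruling out first when a hand-built DLL loads in one process and crashes in another. The sketch below reads them from a PE32 file: whether the COFF Characteristics field carries the DLL flag (0x2000), the preferred ImageBase, the entry point RVA, and whether the export (index 0) and base relocation (index 5) data directories are populated. A DLL without relocations can only be mapped at its preferred base, and an entry point is called like DllMain (stdcall, three arguments, nonzero return for success), so both are common suspects. This is an assumption about where to look, not a diagnosis; `inspect_pe32` is a hypothetical helper name.

```python
import struct

IMAGE_FILE_DLL = 0x2000            # COFF Characteristics flag marking a DLL
DIR_EXPORT, DIR_BASERELOC = 0, 5   # data directory indexes

def inspect_pe32(image):
    """Return a few loader-relevant header facts from PE32 (32-bit) bytes."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)    # e_lfanew
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_off + 4                                    # 20-byte COFF header
    (characteristics,) = struct.unpack_from("<H", image, coff + 18)
    opt = coff + 20                                      # optional header
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header")
    (entry_rva,) = struct.unpack_from("<I", image, opt + 16)
    (image_base,) = struct.unpack_from("<I", image, opt + 28)
    (n_dirs,) = struct.unpack_from("<I", image, opt + 92)

    def directory(index):
        # Each data directory entry is (RVA, size), 8 bytes, starting at +96.
        if index >= n_dirs:
            return (0, 0)
        return struct.unpack_from("<II", image, opt + 96 + 8 * index)

    reloc_rva, reloc_size = directory(DIR_BASERELOC)
    return {
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
        "image_base": image_base,
        "entry_rva": entry_rva,
        "export": directory(DIR_EXPORT),
        "has_relocations": reloc_rva != 0 and reloc_size != 0,
    }
```

Usage would be along the lines of `inspect_pe32(open("WinAPI32.dll", "rb").read())`. If `has_relocations` comes back false, the DLL will fail or misbehave in any process where its preferred base is already taken.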


But at least with this I can start creating standard functions exported in a DLL that I can ship later in an upgraded, more compatible DLL. It's a good start.
Last edited by ~ on Fri Nov 10, 2017 10:19 pm, edited 2 times in total.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Post by Solar »

~ wrote:I'm trying to learn how to implement DLLs in pure assembly, by hand, without a linker, just NASM.

My intention is to overcome technical limitations that wouldn't easily let me develop more recent code in Win9x or Linux. I want to stop using Visual Studio for natively porting simpler programs to Win9x and later to my own system.
Before we talk about implementation, we should talk about architecture. Before we talk about architecture, we should talk about intentions.

I, for one, have successfully removed the requirement for MSVC from my 9-to-5 job completely, and am building Windows libraries (both static .a and dynamic .dll), executables, and indeed complete setup.exe / .msi distribution packages on Linux (via MXE). But I don't know if that would be a (better?) solution for you because I don't really know what you are trying to do and why...

(Let's just say it's somewhat of a "smell", when somebody sets out to implement a new way to do things which can already be done, that this somebody might not be aware of the old way(s), or should be given a hand in using them instead of with what he thinks he has to do. While we are all here because we have a desire to "reinvent the wheel", we should focus on the wheels that need reinventing, instead of reinventing every round thing in sight.)
Every good solution is obvious once you've found it.
~
Member
Posts: 1228
Joined: Tue Mar 06, 2007 11:17 am
Libera.chat IRC: ArcheFire

Post by ~ »

I need to make compilers for each language variety, version, language, and CPU architecture. I need that to be able to natively use code from any language in my OS, quickly and more easily than with GCC, although it would be good to at least have a GCC that only produces raw binaries in the specified order of the code without any additional padding.

I need to reach C++ level to understand what's going on and to have enough advanced references to implement any language feature.

I have these ideas so far to start a C or C++ compiler for pure x86 raw binaries with embedded NASM assembly:

Mark the comment areas and skip them from then on.

Parse the preprocessor code, starting with #if... and #include, at several nesting levels, until no unresolved preprocessor identifiers remain.

Generate new code with everything included and post-processed (this could be just for debugging what the compiler sees, and we might keep using the original code).

Resolve the declared data types.

Collect the functions and variables that are present, first globally (with very stable functions that work like logical OPCODES capable of obtaining them in any scope). Mark which functions and other elements have header declarations and which don't.
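The first step in the list above (marking comment areas) can be sketched as a small scanner. This is an illustration in Python rather than NASM, under stated assumptions: it handles `//` and `/* */` comments and skips string and character literals, but it does not handle backslash-newline continuations inside `//` comments, trigraphs, or C++ raw strings. The function names are hypothetical.

```python
def comment_spans(src):
    """Return (start, end) index pairs of C/C++ comments in src.
    String and character literals are skipped so that "//" inside a
    literal is not mistaken for a comment."""
    spans = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if ch in "\"'":                        # string or char literal
            quote = ch
            i += 1
            while i < n and src[i] != quote:
                i += 2 if src[i] == "\\" else 1    # step over escapes
            i += 1                             # closing quote
        elif src.startswith("//", i):          # line comment
            end = src.find("\n", i)
            end = n if end == -1 else end
            spans.append((i, end))
            i = end
        elif src.startswith("/*", i):          # block comment
            end = src.find("*/", i + 2)
            end = n if end == -1 else end + 2
            spans.append((i, end))
            i = end
        else:
            i += 1
    return spans

def blank_comments(src):
    """Replace each comment with spaces, keeping newlines so that line
    numbers in later diagnostics still match the original file."""
    out = list(src)
    for start, end in comment_spans(src):
        for k in range(start, end):
            if out[k] != "\n":
                out[k] = " "
    return "".join(out)
```

Blanking rather than deleting keeps every later stage (preprocessor, type resolution) working on text whose offsets and line numbers agree with the source file.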

I want compilers that can freely nest assembly for any purpose, that don't require linkers and only produce the target assembly code, and that can include active code anywhere, as in JavaScript, not just inside functions. Like JavaScript, they should recognize functions and any other code element by first parsing everything to see all existing declarations, and then resolving the rest of the code in any order without assuming that functions are unresolved. Above all, I need compilers that understand the existing standard languages in all their versions individually, in a cleanly packed toolchain collection for each language, so that I can use all the existing code and libraries.
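The "parse all declarations first, then resolve in any order" idea is a two-pass scheme. A minimal sketch follows, using a toy syntax invented purely for this illustration (`func NAME` declares, `call NAME` uses). It is not C, C++, or JavaScript, and `resolve` is a hypothetical name.

```python
import re

# Toy syntax, invented for this sketch: "func NAME" declares, "call NAME" uses.
DECL = re.compile(r"^\s*func\s+(\w+)")
USE = re.compile(r"\bcall\s+(\w+)")

def resolve(lines):
    """Two passes: collect every declaration, then resolve every use."""
    # Pass 1: record each declared name and the line it appears on.
    declared = {}
    for number, line in enumerate(lines, start=1):
        match = DECL.match(line)
        if match:
            declared[match.group(1)] = number
    # Pass 2: resolve uses against the complete table, so a use may
    # appear before its declaration.
    resolved, unresolved = [], []
    for number, line in enumerate(lines, start=1):
        for name in USE.findall(line):
            if name in declared:
                resolved.append((number, name, declared[name]))
            else:
                unresolved.append((number, name))
    return resolved, unresolved
```

This only works within what the compiler can see at once. A name defined in a separately compiled file stays in `unresolved`, which is the gap a linker normally fills.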


From there, continuing to program an operating system and applications will become much friendlier: I could easily use existing code and ship it with clean and simple access to the existing languages, without needing heavy tooling like Visual Studio and SDKs, since this would be an open and standardized toolchain based on finished language and ABI standards.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
~ wrote:...
About half of programming languages can't be compiled to native (e.g. they include features that would require run-time code generation). NASM only supports 80x86 and won't help much for all the other CPU architectures. C++ isn't a superset of all possible language features and lacks some features that other languages have.

You will die of old age before you have completed 1% of this. Continuing programming an operating system and applications while you are dead is extremely hard (and being dead won't make anything more friendly or allow you to use existing code).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Post by Solar »

~ wrote:I need to make compilers for each language variety, version, language, and CPU architecture.

[...]

I want compilers that can freely nest assembly for any purpose, that don't require linkers and only produce the target assembly code, and that can include active code anywhere, as in JavaScript, not just inside functions. Like JavaScript, they should recognize functions and any other code element by first parsing everything to see all existing declarations, and then resolving the rest of the code in any order without assuming that functions are unresolved. Above all, I need compilers that understand the existing standard languages in all their versions individually, in a cleanly packed toolchain collection for each language, so that I can use all the existing code and libraries.
It is as I feared. What you are describing -- at a very coarse and abstract level -- is, basically, reinventing all the wheels. So you want to reimplement compilers for all languages in all their versions to achieve a not-very-well-defined goal of being...
~ wrote:...able to natively use code from any language in my OS, quickly and more easily than with GCC
When you e.g. say you want to implement a C/C++ compiler that "doesn't need a linker", I cringe. Because I have this nagging suspicion that you don't really know what current implementations actually do, why they are doing it the way they do, why shortcomings you might perceive in the state-of-the-art actually exist, or how to address them in a way that would actually move the state-of-the-art forward in a meaningful way. (Doing it for one language would be no mean feat, but you airily set out to do it for "all languages in all their versions"... that's preposterous.)

I mean, what (little you said about what) you want to achieve sounds nice at first glance, but... have you thought this through before jumping at implementation? Do you have some architecture jotted down for this, explaining your roadmap in some detail?

Because right now it sounds like a lot of "I don't understand all this, I'll just implement it the way I think it should work, starting with the easy part I did understand". If your goals were so beneficial and so easily achievable, don't you think somebody would already have done this? So there are apparently problems with "just doing this". Have you identified them?

Lots of work has flowed into the existing toolchains. Changing their modus operandi first requires understanding what has been done before, and making others understand what you are really aiming for. Because, let's face it, even a simple C compiler is nothing done easily by a single person; you will need help at some point.

And a fully compliant C++ compiler? Bigger names than yours are falling short of that. C++ is orders of magnitude more difficult to "get right" than C is; it's not just "C with classes" anymore, not by a long shot.

Not even beginning to speak of runtime-supported languages like Java...