Schol-R-LEA wrote: While I do see your last point, and even agree with it, I am not certain how this is to be accomplished - either there would have to be a registrar of the bindings for all possible external references (which the compiler would need to be able to access even if the external element doesn't exist yet - or to put a different angle on it, the compiler could be put in the position of creating a placeholder GUID for someone else's code, with no prior guarantee that the code will ever even be written, that the registrar will be accessible to a linker, or that the external element would match the signature assumed if it does), or a guaranteed way to regenerate exactly the same symbol identifier consistently without communication with the library, which is pretty much the opposite of what a GUID is meant to be (since the whole point of those is that there is only a vanishingly small chance of it ever being generated twice).
But perhaps I am missing what you have in mind. Please, feel free to elaborate. If you have a solution, I would be interested in hearing it, as it has bearing on some related problems I have for my own planned designs (which do indeed call for just such a universal distributed database of exported and imported code, just in a language context far removed from C or C++, and only bearing a distant similarity to even the more established branches of the Lisp family which are its closest relatives).
Actually, no. I wasn't thinking about a registrar. When I said database, I meant something else. But let me focus on the more down-to-earth usage first.
The standard workflow is for the library authors to create the header files along with the source files and to build the library ELF files/archives for distribution. A client portion of the header files is supplied to the client code authors. Those headers, along with the rest of the project files, are used to build the client executable. This is what I consider to be the classic scenario. Of course, there are other possibilities. The client code authors may have no official header files, but be in possession of the library ELF files/archives and know the exported functions' signatures. Or they may lack even the library ELF files/archives at the early development stages, but have information about the library functions' signatures, provided in good faith by the library's vendors.
For the classic workflow, the changes are very small. In the final stages of development, before distribution, a GUID generator is used to make one unique id for each exported library routine/symbol. The client headers (which are also included by the library sources themselves, as is usual) are modified by appending an "__attribute__((guid("guid-string")))" to the previously naked declarations. Now gcc knows enough to begin generating a .symbol.guids section in the output objects, which the linker cooks into the final ELF file/archive as metadata for the exported (i.e. global, visible and defined) symbols. Similarly, by including those headers, the client executable will be enriched with metadata for the imported (i.e. global and undefined) symbols. For shared object libraries, the loader will now be able to bind the symbols using GUIDs instead of strings. For static libraries, the link editor will be able to do the symbol resolution using GUIDs.
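To make the binding step concrete, here is a minimal sketch in C of what GUID-keyed resolution could look like. Everything in it is an assumption for illustration: no current gcc has a guid attribute, so LIB_GUID expands to nothing; struct guid_symbol, resolve_by_guid and bind_imports are hypothetical names standing in for the .symbol.guids records and for the work the loader or link editor would do; and the GUID string is a made-up placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: in a toolchain with GUID support this would expand to
 * __attribute__((guid(s))). Today's compilers have no such attribute,
 * so it expands to nothing and the declaration stays valid C. */
#define LIB_GUID(s)

/* What a client header line might look like (placeholder GUID). */
int lib_open(const char *path) LIB_GUID("6f9619ff-8b86-d011-b42d-00c04fc964ff");

/* Hypothetical record of the proposed .symbol.guids metadata. */
struct guid_symbol {
    const char *guid; /* text form here; a real format would likely use 16 raw bytes */
    const char *name; /* human-readable name, kept only for diagnostics */
    void       *addr; /* address of the definition; NULL for an unbound import */
};

/* Find an export by GUID. The name plays no part in the match. */
void *resolve_by_guid(const struct guid_symbol *exports, size_t n_exports,
                      const char *guid)
{
    assert(guid != NULL);
    for (size_t i = 0; i < n_exports; i++) {
        if (strcmp(exports[i].guid, guid) == 0)
            return exports[i].addr;
    }
    return NULL;
}

/* Bind every import against the export table.
 * Returns the number of imports left unresolved. */
size_t bind_imports(struct guid_symbol *imports, size_t n_imports,
                    const struct guid_symbol *exports, size_t n_exports)
{
    size_t unresolved = 0;
    for (size_t i = 0; i < n_imports; i++) {
        imports[i].addr = resolve_by_guid(exports, n_exports, imports[i].guid);
        if (imports[i].addr == NULL)
            unresolved++;
    }
    return unresolved;
}
```

Since only the GUID is compared, the client may call the routine by a different name than the library does and the binding still succeeds, which is the point of the scheme.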
For the case in which the signatures are known and the library itself is present, but there are no official headers, a tool from the toolchain can be used to list the GUID metadata. Think "objdump --guid" or something similar. Alternatively, a source code generator could be fed the library ELF file/archive directly and emit a "#pragma guid symbol = guid-string" header (analogous to the "#pragma weak symbol1 = symbol2" currently available).
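The generator half of that idea can be sketched as follows. This is only an illustration under assumptions: the "#pragma guid" form is the hypothetical one proposed above, struct guid_entry stands in for whatever the tool would read out of the library's GUID metadata (the reading itself is not shown), and format_guid_pragma and emit_guid_header are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical pair as it might be extracted from a library's metadata. */
struct guid_entry {
    const char *symbol; /* exported identifier */
    const char *guid;   /* GUID in text form */
};

/* Format one "#pragma guid symbol = guid-string" line into buf.
 * Returns 0 on success, -1 if the line does not fit. */
int format_guid_pragma(char *buf, size_t size, const struct guid_entry *e)
{
    assert(buf != NULL && e != NULL);
    int n = snprintf(buf, size, "#pragma guid %s = %s", e->symbol, e->guid);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Write one pragma line per entry to out.
 * Returns 0 on success, -1 on a formatting or I/O error. */
int emit_guid_header(FILE *out, const struct guid_entry *entries, size_t count)
{
    char line[256];
    for (size_t i = 0; i < count; i++) {
        if (format_guid_pragma(line, sizeof line, &entries[i]) != 0)
            return -1;
        if (fprintf(out, "%s\n", line) < 0)
            return -1;
    }
    return 0;
}
```

Calling emit_guid_header(stdout, entries, count) would print the header body, which the client could then include next to its own hand-written declarations.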
For the case in which the signatures are known and the library and header files are both unavailable, the client source can be developed without GUIDs, using plain identifiers. When the library is finalized, the GUIDs or library files can be distributed to the clients, who can then amend their projects. Or the authors, if they have indeed frozen the library interface at the preliminary stages, could generate the interface GUIDs and provide them to the clients without the executable code.
Regarding my databases rant - I was obviously mostly drooling over futuristic concepts. Instead of headers, you can distribute a dataset that provides the mapping from human-readable names to GUIDs. In the more primitive case, a command line tool can then be used to import this dataset into any "namespace" of an identifier database. In the more advanced case, an IDE will enable you to fine-tune the import process. The database holds all mappings that the toolchain uses to resolve identifiers during the build. The attraction is that the semantic and syntactic aspects become separated. The IDE can easily query and edit such a database, to track dependencies or to refactor the symbolic contents of a project without impacting the bindings beneath it. The internal references can be represented by a graph. However, the symbols at the boundary cannot be part of the graph itself, so they need to be keyed somehow. A GUID key would solve that.
There are a lot of details that need to be clarified even in theory, such as a way to update import datasets without breaking the graph. Also, this is more or less how IDEs work today, except that the code remains stored in an inefficient human-readable form and the IDE's database is not reused by the toolchain. Newer programming languages are designed with the foresight of being machine-parseable, and this ameliorates the issue somewhat. Something of this kind has been tried before commercially (I cannot recall the product), but I don't think it caught on. Then again, it might have been just a timing issue. Or a financial one.
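As a rough illustration of the name-to-GUID database, here is a toy in-memory version in C. It is a sketch under assumptions, not a design: struct iddb and the iddb_* functions are hypothetical names, the fixed-size table stands in for a real store, and the internal reference graph is left out entirely. It only shows the one property argued for above, namely that renaming an identifier inside a namespace leaves the GUID binding untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IDDB_MAX 64

/* One mapping: (namespace, human-readable name) -> GUID. */
struct iddb_entry {
    char ns[32];
    char name[64];
    char guid[37]; /* 36 characters of text GUID plus terminator */
};

/* Toy database; zero-initialise before use. */
struct iddb {
    struct iddb_entry entries[IDDB_MAX];
    size_t count;
};

/* Index of (ns, name) in the table, or -1 if absent. */
static int iddb_index(const struct iddb *db, const char *ns, const char *name)
{
    assert(db != NULL);
    for (size_t i = 0; i < db->count; i++) {
        if (strcmp(db->entries[i].ns, ns) == 0 &&
            strcmp(db->entries[i].name, name) == 0)
            return (int)i;
    }
    return -1;
}

/* Import one mapping into a namespace. Returns 0, or -1 if the table is
 * full, the name is already taken there, or a string is too long. */
int iddb_import(struct iddb *db, const char *ns, const char *name,
                const char *guid)
{
    if (db->count >= IDDB_MAX || iddb_index(db, ns, name) >= 0)
        return -1;
    struct iddb_entry *e = &db->entries[db->count];
    if (strlen(ns) >= sizeof e->ns || strlen(name) >= sizeof e->name ||
        strlen(guid) >= sizeof e->guid)
        return -1;
    strcpy(e->ns, ns);     /* lengths were checked above */
    strcpy(e->name, name);
    strcpy(e->guid, guid);
    db->count++;
    return 0;
}

/* GUID bound to (ns, name), or NULL if there is no such mapping. */
const char *iddb_lookup(const struct iddb *db, const char *ns, const char *name)
{
    int i = iddb_index(db, ns, name);
    return i < 0 ? NULL : db->entries[i].guid;
}

/* Rename an identifier; the GUID, and so the binding, is unchanged. */
int iddb_rename(struct iddb *db, const char *ns, const char *old_name,
                const char *new_name)
{
    int i = iddb_index(db, ns, old_name);
    if (i < 0 || iddb_index(db, ns, new_name) >= 0 ||
        strlen(new_name) >= sizeof db->entries[i].name)
        return -1;
    strcpy(db->entries[i].name, new_name);
    return 0;
}
```

After iddb_rename, a lookup under the new name returns the same GUID as before, so nothing at the binary boundary would need to be rebuilt or relinked because of the refactoring.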