MACROS in Java

Neo · Post by **Neo** » Sun Mar 06, 2005 5:51 am

Yup, i tried that and got it working. Thanks a lot.
Isn't there a better way than this? Not specifiying args but using them doesn't seem like a proper way.

ark · Post by **ark** » Fri Mar 18, 2005 10:55 pm

On C++ templates...what is it about them, exactly, that prevents them from supporting parametric polymorphism? It seems to me that if you do as Java does and constrain your class hierarchy to have an ultimate, single base class, do not use your templates to operate on primitives, and then make all your template parameters pointers you can get exactly the same effect, assuming you have a compiler with half a brain (ultimately, this is exactly the Java implementation). And even if you don't do this, the template mechanism provides support for a bit more smarts than text-replacement macros that are not processed by the compiler proper and thus have no concept of scoping, type systems, and the very rules of the language itself.

Solar · Post by **Solar** » Mon Mar 21, 2005 1:19 am

Joel wrote: On C++ templates...what is it about them, exactly, that prevents them from supporting parametric polymorphism?

The fact that they are, unlike Java "Generics", a compile time feature? (Which is why they are much faster than Java "Generics", and give you compile-time diagnostics too...)

mystran · Post by **mystran** » Mon Mar 21, 2005 11:57 am

Except that you are not constrained to compile-time features really, because you have RTTI in there too if you want, and even if you didn't templates would be sufficiently powerful to implement it anyway.

So basicly, can do whatever you want with C++ templates (they are Turing complete modulo recursion limitations in compilers), but it might need a bit more work than it first looks like.

creichen · Post by **creichen** » Thu Mar 24, 2005 6:26 pm

Hi,

Solar wrote:
Joel wrote: On C++ templates...what is it about them, exactly, that prevents them from supporting parametric polymorphism?
The fact that they are, unlike Java "Generics", a compile time feature? (Which is why they are much faster than Java "Generics", and give you compile-time diagnostics too...)

No, that's not quite correct. Generics (parametric types) are part of the static Java type system, and thus a compile-time feature (recall static = compile time, dynamic = run-time).

The practical difference is two-fold:

(1) You can compile programs with generics/parametric types in separation from programs with template type parameters; in particular, you can build libraries of them. Compare this to the C++ STL where typically most or all of the implementation is part of the header files.

(2) You can compile a "generic" piece of code and get all error messages for it once and for all. With C++ templates, you cannot compile the template before usage, and during usage you may be confronted with error messages that pertain to implementation details of the templated library code you want to use (because its type parameters cannot be constrainted, see below).

A direct ramification of this is that it's much easier to get away with broken C++ library code (and, conversely, to stumble upon broken C++ library code), because the templated code itself is inherently not completely type-checkable up-front. There's another implementation issue, but let me get to that a little later.

Java allows its generics to be restricted to conform to a given interface. For example, you can write a piece of code like this (not syntax-checked, I'm afraid):

Code: Select all

public class BinaryTreeSet <T extends java.util.Comparable> {
...
   public void insert (T value) { ... }
   public java.util.List<T> getAllValues () { ... }
   public boolean contains (T value) { ... }
...
}

which, in its implementation, can then use the "compareTo()" method defined in [tt]java.util.Comparable[/tt] on elements of type [tt]T[/tt] whereever it needs to do so. The restriction [tt]T extends java.util.Comparable[/tt] guarantees that type [tt]T[/tt] has this method.

Thus, once it has been typechecked and compiled, it can be used anywhere with type parameter [tt]T[/tt] instantiated to any class that implements the aforementioned interface. Now let's compare how an equivalent C++ templated class would behave from two points of view:

Library user: In C++, if you use the [tt]BinaryTreeSet[/tt] class with an improper type, you won't get an error message for the instantiation point. You MAY get an error message later on, if one of the member functions you use happens to use the (nonexisting) "compareTo()" member function of your type parameter's class. If you don't get an error message, things will probably work, but if you do get it, you will have to figure out which of your type parameters the error message was caused by, what member function precisely you need to add to your class, and what this function is supposed to do.
In Java, the interface you're forced to implement precisely specifies all this information and all relevant requirements. Also, you will get the error message at the type instantiation point.

Library designer: Since you cannot constrain your type parameters in C++, it is very hard to design the generic library up-front interface-wise; while you may wind up with an informal write-up of the member functions you want to require, the compiler won't help you much if you accidently mis-type one of them, and if you later on change part of your implementation (not the interface) to use one of the member functions you informally restricted the type parameter to provide, you may wind up breaking code that was previously working. None of these things are an issue with Java's generics.

Simply put, Generics maintain information hiding across separate generic modules-- templates give you the neccessary power, but not the safety of generics.

My next posting will be on implementation differences of these features.

-- Christoph

creichen · Post by **creichen** » Thu Mar 24, 2005 6:50 pm

Hi again,

onto implementation. C++ template type parameters and Java generics are implemented quite differently:

In C++, using a template will instantiate the entire templated class for the specified type parameter(s). This means that separate code will be emitted for each use of the class; if you use linked lists for int, char, string and linked-list-of-int, four different code fragments will be generated.

In Java, generics are implemented through boxing (as in most SML implementations) and type erasure. Boxing is a little trick which ensures that all values and objects have a sufficiently similar structure at run-time that the code using them doesn't need to know much about them statically: For objects, boxing/unboxing does nothing, for values (int, char etc.), boxing places the value inside a "wrapper object", and unboxing retrieves the value again. Thus, the assembly code generated by the JIT system only needs to know that each object occupies 8 bytes (4 bytes on 32 bit machines), which represent a pointer to a standard Java (in this case) object, and call sites ensure that proper boxing/unboxing is performed implicitly.
Now, this would all be quite nice if the Java programmers hadn't been fairly lazy: The way in which they implemented parametric polymorphism beyond boxing was to simply insert implicit typecasts. While this gives you all the static checking of generics, it means that, at run-time, you still need to waste execution time checking that the type of an object returned from e.g. a container class is the expected one, effectively duplicating the work of the static typechecker dynamically.

Let's compare these two approaches:

Memory usage for C++ is linear in the number of instances (for different types) of a given templated class (instruction memory) and linear in the number of objects stored inside it (data memory). In Java, instruction memory usage is constant and comparable to C++ for one instance, data memor usage is linear in the number of objects stored inside it, though use on primitive values (int, char etc.) results in increasing the constant factor.

Runtime performance for the C++ case should be better, since templated classes can be optimised for whichever type they're instantiated on (e.g. allowing dynamic dispatch to be resolved statically; this kind of inlining or guarded inlining is one of the few useful optimisations in modern OO languages on modern computers, particularly in Java). However, Java wastes less i-cache memory; depending on a given concrete application, this might be the deciding factor (if you're continuously cross-referencing a couple of hashmaps for different types, for example). (Of course, Java also has the side effect of causing the i-cache to be flushed rather aggressively due to JITting, but this is a different matter).

Note that these two discussions are based on the theory behind those things. Actual benchmarks would be needed to present a complete argument, and I'm too lazy to generate those ATM-- I personally would expect Java to perform worse on "traditional" code with few uses of parametric types, particularly on older machines where memory isn't as big of an issue, and better in very fancy code, particularly on newer machines; but again, I have no numbers to back this up.

-- Christoph

mystran · Post by **mystran** » Thu Mar 24, 2005 7:09 pm

Ch. Reichenbach wrote: No, that's not quite correct. Generics (parametric types) are part of the static Java type system, and thus a compile-time feature (recall static = compile time, dynamic = run-time).

Let's be exact; there are two aspects of type system: checkability and implementation. So we really have several kinds of typesystems (in terms of static/dynamic).

Basicly, you can have a typesystem where compiler can check types at runtime, in which case it is called "static" but you might still need to dispatch using something like RTTI at runtime.

For example, one can define (in a generic ML typesystem functional language) a function like this:

Code: Select all

length :: [a] -> Num
length [] = 0
length (x:xs) = 1 + (length xs)

Now, that calculates a length of list. It's type is declared above, although any half-decent compiler would infer it automatically anyway. Because it can be infered, it obvious is statically typecheckable. And because we have a static type for it, we obviously can check all uses of it statically too.

But what is the type [a]?

[a] is usually syntax for something like [tt]list a[/tt] and : would be a infix list-contructor, the prefix version of which is known as "cons" in Lisp world at least. [a] is a parametric type in the sense that a is a semantic variable. The type could be defined as something like:

[a] = [] | a:[a]

That is, list of a's is either an empty list or an a consed together with a list of a's. So it's an union of two types really; a sum-type.

As such, while we can typecheck any program that calls our length-function for any kinds of lists, say [Num] or [[Num]] or [[Num]->Num] or whatever you happen to have, we still need to check types at runtime when we dispatch length; we have to check whether we have a cons-constructor (:) or an empty list ([]), so that we can get the right definition of length.

We could have much more interesting types too. In most ML-typesystem variations there aren't really that many limitations on what kinds of unions you can have (most of them have to do with what kind of recursion is allowed anyway, as infinite types can cause interesting problems). You can have for example (this is from 'A gentle introduction to Haskell'):

Code: Select all

data Color              = Red | Green | Blue | Indigo | Violet

For you C programmers out there, that's basicly your "enum", and for the mathematicians, each of the color names is a zero-argument type-constructor. Again, anything using "Color" is definitely type-checkable statically (at compile time) but the actual dispatch of any Color values (and hence of types) is actually a runtime operation (believe me here). Notice that any of the constructors COULD have any number of arguments, the above is just a simple case.

So, basicly, next time you say something about "static" or "dynamic", please be clear on what aspect of th typesystem you might be talking about.

creichen · Post by **creichen** » Thu Mar 24, 2005 8:13 pm

Hi,

you have some interesting points there, mystran, and I should have clarified that I was merely referring to checkability here-- however, this is what type systems are usually associated with (in my limited experience), whereas matters like dynamic dispatch are of relevance even to languages like Smalltalk or SCI which spend preciously little time pondering types (as opposed to objects).

(By the way, something like RTTI is not generally needed for dynamic dispatch; the implementations of dynamic dispatch I have seen simply used vtables or implicit parameters. I would be quite interested if you could point me to a language implementation that uses RTTI by default, because I'm curious why they would do that; I believe that some experimental Java JITs use it for optimisation purposes, but please don't quote me on that...)

Back to the main point: In a language such as the one you're using (Haskell or Miranda, judging from the syntax) there is quite definitely no type-checking needed at run-time, at least not in the sense that any of the current literature I've seen refers to that term. "Type checking" would mean that some representation of the actual type being used is needed for some test which may result in failure (type checking failed); all that we do here is figure out which of the possible values we're dealing with and act appropriately. Now, values may (of course) be responsible for program errors in one way or another, but these are quite different from type errors (at least on the level of most programming languages; theoretically, you can reduce one to the other if you have something like the Calculus of Constructions, as used in the Cayenne language).

(Another note: Dynamic dispatch for the "Color" values you refer to is not needed either. In fact, implementations of strict languages (at least SML/NJ and O'Caml) do not implement it as such (as far as I remember) but, instead, as a value comparison (dynamic dispatch = load function address from statically determined point in function address table whose structure is known (at least to that statically determined point) and jump). IIRC, only lazy languages (such as Haskell), which need to potentially "force" their "thunks" in this case, implement it as such (but, as indicated, for a good reason).

I'm not 100% certain about these points, however; this is what I remember from reading up on those things in the past. If this interests you, I can dig out the G-machine implementation docs and "Compiling with Continuations" again and try to find the relevant sections; that way, we could make sure that my memory isn't cheating on me here).

But I digressed rather sharply from your main point, which is still valid: The type system is responsible not only for checking type correctness, but also for giving some guideline to the compiler as to what code it is supposed to emit (i.e., where dynamic dispatch is needed, where dynamic type checks need to be inserted for type safety etc.). However, there are many other factors that affect the implementation (Can a value escape (can it be stack-inlined)? Is a given recursive call tail-recursive? Do we need to spill registers?), so I'm not sure whether the language implementation should be seen primarily as a function of the type system-- if someone put a gun to my head, I'd take the position that the primary purpose of a type system is to guarantee that an abstracted version of the program follows its abstract (explicit or implicit) specification (how useful this is depends on the expressive power of the type system and the ingenuity of the programmer), and anything else that falls out of the type system (such as hints for efficient implementation) are a nice bonus.
(If the person putting a gun to my head would insist on a different opinion, I might re-consider taking this position, but I digress again.)

-- Christoph

mystran · Post by **mystran** » Thu Mar 24, 2005 9:25 pm

Let me first make it clear that I was completely ignoring all implementation issues, and as such I didn't quite realize that if someone WOULD consider implementation, then some of my choices for terms might have meaning, and a mess would result. Sorry about that.

Ch. Reichenbach wrote: you have some interesting points there, mystran, and I should have clarified that I was merely referring to checkability here--

But you notice that C++ templates and Java generics are equivalent in terms of compile-time checkability. Only when we consider separate compilation (which is overrated anyway) are they different (in terms of compile-time chekability, that is).

(By the way, something like RTTI is not generally needed for dynamic dispatch; ...)

Sorry about using the terms "RTTI" and "dynamic dispatch" in quite a bit more generic sense than normal. Basicly, structured tags (whether arbitary or vtable pointers) are used a lot, and that is more or less what I had in mind. I agree that you don't need general RTTI (as in take any object and I'll tell it's type). You might want that for GC though.

Back to the main point: In a language such as the one you're using (Haskell or Miranda, judging from the syntax) there is quite definitely no type-checking needed at run-time, at least not in the sense that any of the current literature I've seen refers to that term.

That's my point. No type errors can occur at runtime. If you don't do downcasting (eg. from parent to child) in C++ or Java, then they are type-checkable in the same sense, and can hence be called "static with some loopholes".

(Another note: Dynamic dispatch for the "Color" values you refer to is not needed either. In fact, implementations of strict languages (at least SML/NJ and O'Caml) do not implement it as such (as far as I remember) but, instead, as a value comparison (dynamic dispatch = load function address from statically determined point in function address table whose structure is known (at least to that statically determined point) and jump).

Ouch, agreed. I was again imprecise. I didn't realize that dynamic dispatch has a (bit too) "well" defined meaning as an implementation technique. Ok, but the point remains: you have to identify the type constructor. The easiest way is to have a tagged structure, (in which case the Color example naturally degenerates to a single integer, as non of the constructors have any paramaters).

Indeed, you don't need to load new code at runtime. But what I was trying to say is that you still need to somehow figure out which constructor (or value if you like) you are dealing with. Since in the general case any of them can have parameters, you need to have tagged structures or something similar. Since one can have a recursively typed collection (eg, list, tree,...) of "arbitary" sum-types, tagged structures are usually the easy way. Whether your tag is an integer or a vtable-pointer is irrelevant.

IIRC, only lazy languages (such as Haskell), which need to potentially "force" their "thunks" in this case, implement it as such (but, as indicated, for a good reason).

I don't quite understand what you are trying to say here. Yes, in a lazy language you need to identify at runtime whether you have a value or a thunk, but then again how does this change anything (except evaluation which becomes a bit trickier)? Ok, using a function pointers might now be a better idea but we are drowning into implementation details aren't we. That was not my intention.

If this interests you, I can dig out the G-machine implementation docs and "Compiling with Continuations" again and try to find the relevant sections; [...]

That is probably not necessary. I am already aware of how the G-machine works.

However, there are many other factors that affect the implementation [...]

Urgh. My idea was more like: typesystem provides means for checking the type-correctness of a program (compiletime=static or runtime=dynamic) but can force some type-identification to be made at runtime too, which sometimes causes confusion when people mix these two ideas together. All too many people think that "static typing" means that no type identification is needed at runtime, while in fact the function of static typing is simply to ensure that no runtime type errors can occur.

My other idea was to cause some confusion. ;D

(If the person putting a gun to my head would insist on a different opinion, I might re-consider taking this position, but I digress again.)

I would find it unlike that a person holding your view would put a gun on your head to make you agree. Such brutal methods are more often found in certain other circles.

Solar · Post by **Solar** » Fri Mar 25, 2005 3:18 am

Ch. Reichenbach wrote: (1) You can compile programs with generics/parametric types in separation from programs with template type parameters; in particular, you can build libraries of them. Compare this to the C++ STL where typically most or all of the implementation is part of the header files.

"export" has been part of the C++ standard since 1998. It's only that compiler vendors so far haven't put much of an effort behind actually implementing it (except for the EDG compiler). With "export", template declaration and definition could be properly seperated, too.

Blame the vendors, not the language.

Other than that, I'm not that much into Java internals, so I can't really comment.

Joel (not logged in) · Fri Mar 25, 2005 9:54 am

http://www.research.att.com/~bs/bs_faq2 ... onstraints

Also...while it is true that C++ sometimes generates different code for the same template used on types, this is not always the case. There is certainly room for compiler optimization (vector of int vs. vector of long int don't need separate object code on a 32-bit x86 machine, for example). And in any case, as far as I can determine this is still parametric polymorphism -- I haven't seen any definition of the term that requires there to be a one-to-one correspondence between source code and object code.

Also, Java achieves this ability primarily through constraining all its objects to be referenced by pointer and to be a subclass of an ultimate base-class (Object) in a single-inheritance hierarchy. Under the same constraints, you can expect to see similar abilities in C++.

Candy · Post by **Candy** » Sun Mar 27, 2005 7:20 am

Joel (not logged in) wrote: Also, Java achieves this ability primarily through constraining...

Which is something C++ doesn't do and for which I very much prefer C++. Allow me to make the choices, I'm a software engineer for a good reason.

... all its objects to be referenced by pointer and to be a subclass of an ultimate base-class (Object) in a single-inheritance hierarchy. Under the same constraints, you can expect to see similar abilities in C++.

Which means that c++ can already do what Java could do with those generics. And it can do more. Which part of it being able to do more don't you like?

Colonel Kernel · Post by **Colonel Kernel** » Sun Mar 27, 2005 11:55 am

I've actually used some pretty crazy C++ template techniques in production code (the $$$ variety). Lots of stuff with Loki typelists -- automatically generating implementations of the Visitor, Abstract Factory, and Bridge patterns mostly. So, I can appreciate the power of templates over generics.

That said, I really wish "export" had been implemented, even though I know it would be ridiculously difficult to do so. Having to distribute large complex libraries with all the code in the headers is a real pain. Template diagnostics are also less than ideal.

consider separate compilation (which is overrated anyway)

This I very much disagree with, mostly for commercial reasons. I work on a product that is actually an SDK (Win32 and .NET-based). The SDK is distributed as a set of libraries (DLLs, static libs, and headers) plus a bunch of sample source that uses the libraries. If we were to add some of the powerful template stuff I've used on other projects, we would have to distribute it as source, and if customers were to use it incorrectly, we'd force them to wade through all the nasty compiler diagnostics. Not a good thing.

My practical experience with templates has shown me that when used for metaprogramming, they are still very much an experimental feature. (I've crashed VC++ 7.1 because the depth of its template instantiation recursion seems to be sorely limited.) I look forward to seeing what the next big thing is after generics... maybe someone will get everything right next time.

Joel (not logged in) · Sun Mar 27, 2005 2:00 pm

Candy wrote: Which means that c++ can already do what Java could do with those generics. And it can do more. Which part of it being able to do more don't you like?

That was my point.

Candy · Post by **Candy** » Mon Mar 28, 2005 12:09 am

Joel (not logged in) wrote: That was my point.

Oh... ok... think I like you then

Seriously, think we've kind of gone offtopic in here... if we're going to keep discussing C++ vs Java it might be a better idea to split the thread.

OSDev.org

MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java

Re:MACROS in Java