OSDev.org

Posted: **Tue May 06, 2008 1:31 am**

Shark8 wrote:
iammisc wrote:When it comes to pointer arithmetic, 4 + 1 is not 5 because a pointer is not a number(although it may be represented as one).
Right. Let's use a fictional computer with $100 bytes of memory for the example. (That's $00 to $FF.) The reason that having a pointer incremented by the size of the pointer can be seen as undesirable is that the intervening addresses ARE valid memory locations.
ie
const
I: Integer = $01;
var
P: pointer absolute I; {Make the pointer's value the contents of I.}

begin
p:= Ptr( $01 );
Inc(p);
Writeln( 'The value of P is:', i );
end;

What is the value of P at the end, assuming the pointer size is a word (16-bit)? You see, the reason that some people don't like that Inc to go to the pointer-size offset {i.e. Inc(P, SizeOf(P));} is that the memory locations $02 & $03 are valid, though unattainable under only Inc/Dec pointer arithmetic of that style. {At least that's my opinion.}

Your argument is only true for certain values of 'valid'. What do I mean by that? Let's take the (non-fictional) MIPS architecture as an example. If, as in your example, the pointer is incremented by the word size, then the expression "*++ptr" will write to the next word in memory.

However, if (as you are suggesting), ++ only increments the pointer location by 1 byte, then to write to the next word in memory you'd have to do something like "ptr += sizeof(ptr); *ptr = ...", which IMHO is most ugly.

Not only that, but this expression: "*++ptr" will fail dramatically with an "Address alignment error (load or instruction fetch)". Nice going.

In all honesty though, I do wonder why you gave a Pascal (is it Pascal?) example in a war about C - do you mean to say that Pascal has the same semantics as far as pointer incrementation goes? That's what I pick up from your text, and as such I can't see why you gripe at C for it when other languages do it too.

Cheers,

James

Posted: **Tue May 06, 2008 3:32 pm**

JamesM wrote:Your argument is only true for certain values of 'valid'. What do I mean by that? Let's take the (non-fictional) MIPS architecture as an example. If, as in your example, the pointer is incremented by the word size, then the expression "*++ptr" will write to the next word in memory.

However, if (as you are suggesting), ++ only increments the pointer location by 1 byte, then to write to the next word in memory you'd have to do something like "ptr += sizeof(ptr); *ptr = ...", which IMHO is most ugly.

Quite right, it is ugly. I've never used pointer arithmetic... Lists, assigns, sizings, deletes yes, working with arrays, yes. But I've never actually had to DO pointer arithmetic.

JamesM wrote:Not only that, but this expression: "*++ptr" will fail dramatically with an "Address alignment error (load or instruction fetch)". Nice going.

Yeah, as I recall all MIPS instructions are aligned. But I didn't know that the data was too.... interesting. But I also did mention that $02 & $03 were valid in the example.

JamesM wrote:In all honesty though, I do wonder why you gave a Pascal (is it Pascal?) example in a war about C - do you mean to say that Pascal has the same semantics as far as pointer incrementation goes? That's what I pick up from your text, and as such I can't see why you gripe at C for it when other languages do it too.

Actually, I used Pascal-style syntax because I'm really rusty on my C programming... I'm not even sure it is VALID Pascal code. (The * syntax is counterintuitive in that it is context-sensitive.) Pointer arithmetic may work like C's in Pascal, it may not... that is, if pointer arithmetic is even implemented in the language. (Like I've said, I've never actually had to use pointer arithmetic, so it may have a completely different syntax... I suppose I should go look it up.)

It is a point of style. But my point is that both ways (inc-by-next-valid-location, or the inc-by-size-of-data) make sense.

Posted: **Tue May 06, 2008 4:54 pm**

@Shark8: I say your statements are false because in many ways they are not criticizing C but rather stating your own personal opinions as to why C is bad (and why it would be better if it were more like Pascal). I hate to break it to you, but C is not Pascal. Therefore, obviously, its semantics are different. A pointer is an integer, and for a pointer in C, arithmetic is done in increments of the pointed-to object's size. That is just the way C does it, and it is well understood and defined. If you don't like it personally, it does not make it bad. You are right in saying that the addresses between two integers are also addressable, but in the context of a pointer to an integer, the next integer is not 1 byte away. This makes sense to me; it might not to you, but that does not make it bad. Also, it is not as if you cannot do integer-like arithmetic on a pointer: just cast it to an integer of sufficient size. This is no different from Pascal, where you would have to explicitly add the size.

For example, to move an already defined pointer unsigned int* p by an explicit byte offset (here 4, the size of an int on a typical 32-bit target):

Code: Select all

p = (unsigned int*) (((uintptr_t)p) + 4);  /* uintptr_t comes from <stdint.h> */

I probably don't need all those parentheses, but I think they prevent confusion.
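
To make the two kinds of stepping concrete, here is a minimal sketch in C. The helper names `bytes_moved_by_elements` and `advance_bytes` are made up for this illustration and appear nowhere in the thread; the sketch steps through `char *` rather than an integer cast, which is a swapped-in technique that gives the same byte-wise result for pointers into one array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many BYTES does p + n cover?
   p + n must stay inside the same array (or one past its end). */
static ptrdiff_t bytes_moved_by_elements(const unsigned int *p, ptrdiff_t n)
{
    assert(p != NULL);
    const unsigned int *q = p + n;              /* element-wise stepping */
    return (const char *)q - (const char *)p;   /* distance in bytes */
}

/* Hypothetical helper: advance by a raw byte count instead.
   The caller must keep the result suitably aligned for unsigned int. */
static const unsigned int *advance_bytes(const unsigned int *p, size_t nbytes)
{
    assert(p != NULL);
    return (const unsigned int *)((const char *)p + nbytes);
}
```

Under these assumptions, `bytes_moved_by_elements(a, 1)` yields `sizeof(unsigned int)` for any array `a`, and `advance_bytes(a, sizeof(unsigned int))` lands on the same address as `a + 1`.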

Overall, my point is that you cannot say a language is inherently bad or badly designed just because you do not like its well-understood semantics or because you have never needed to use a particular feature.

Posted: **Fri May 09, 2008 12:23 pm**

iammisc wrote:@Shark8: I say your statements are false because in many ways they are not criticizing C but rather your own personal opinions as to why C is bad (and why it would be better if it was more like Pascal).

I thought the question was "Who here doesn't use C? And why?"... I am explaining reasons why I personally don't like C. (Or are personal opinions invalid?)

Also, where did I say that I wanted C to be more like Pascal? (Or that it was inherently bad?) The only place you can even begin to say that is when I said that Pascal's units (in the uses clause) get including done right, as opposed to having a textual-include, am I not right?

If the question is "Why don't you use C?" then can't I answer with:
1) "Because I find the syntax horrid.",
2) "Because I think that a language shouldn't go out of its way to have 'gotchas'." (Like assignments allowable in conditional tests, or given or is it invalid? {think comments}),
3) "Because I disagree with some important design decisions {and/or implementations}."
OR 4) "I find that, in general, it encourages bad programming practices."

As to the failings of C as a high-level language, that I could rant about. But like I keep trying to tell you, I don't use C/C++ unless I have to, as in academia.

Posted: **Fri May 09, 2008 3:14 pm**

Academia is not real life. Academics can make some assumptions that are just not possible in real life.

However, your arguments I cannot flaw as they are 100% subjective.

Posted: **Fri May 09, 2008 6:53 pm**

JamesM wrote:Academia is not real life. Academics can make some assumptions that are just not possible in real life.

However, your arguments I cannot flaw as they are 100% subjective.

My point exactly. It's a subjective question.
As for the array/address thing, I was merely trying to illustrate that there are other alternatives {to how one interprets pointer arithmetic} which may or may not make more sense to any certain individual. (I.e. addressing the earlier poster who mentioned pointers being incremented by a numeric value of four.) Having the behavior of incrementing to the next record-sized block seems to me to be kind of dangerous, inviting disaster... after all, what if some other variable lives at that location? (All you have is the space that points to a record/location; in all those examples I don't recall the space being allocated there, so... what's to say that rcd_ptr++ is a valid record?)

Posted: **Sat May 10, 2008 12:41 am**

And who says ES:ESI points to a valid record? In assembler, C/C++, Pascal, C#, or any other language you need to specify boundary conditions and validation testing, otherwise your code will be buggy anyway.
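
As a rough illustration of what such a boundary condition can look like in C, here is a sketch that walks a block of records with a pointer and stops at an explicit end. The `struct record` layout and the function name `sum_values` are invented for this example; the only language rule relied on is that a one-past-the-end pointer may be formed and compared, but not dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type, for illustration only. */
struct record {
    int id;
    int value;
};

/* Sum the value fields of count records starting at first.
   The caller states how many records exist; the loop never
   dereferences anything at or beyond first + count. */
static long sum_values(const struct record *first, size_t count)
{
    if (count == 0) {
        return 0;                       /* nothing to read, pointer unused */
    }
    assert(first != NULL);

    const struct record *end = first + count;   /* one past the last record */
    long total = 0;
    for (const struct record *p = first; p != end; ++p) {
        total += p->value;
    }
    return total;
}
```

Whether `++p` lands on a valid record is then decided by the caller-supplied count, not by the increment itself.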

I am a fervent user of C/C++, but I also know that you can make really bad decisions in that language. On the other hand, I also know a lot of assembler, as I am an embedded software specialist, and the situation does not improve with assembler.

In my opinion it is not the language that counts but the mindset and skill of the programmer. I have seen a lot of code in my life, including object-oriented assembly written by an artist (in my opinion he was), and ditto with C and JavaScript.

Posted: **Sat May 10, 2008 3:35 am**

Shark8 wrote:Like assignments allowable in conditional tests...

That's one of the problems that stems not so much from the language, but from people not realizing that compiler warnings are something helpful, not a nuisance to be > /dev/null'ed.
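
For anyone who has not run into it, the slip under discussion looks roughly like the sketch below. The function names are invented for the example. GCC reports the marked line when `-Wall` is enabled (it suggests parentheses around an assignment used as a truth value); other compilers may word it differently.

```c
#include <assert.h>

/* Intended test: is x equal to zero? */
static int is_zero(int x)
{
    return x == 0 ? 1 : 0;
}

/* The classic slip: '=' instead of '=='. x is overwritten with 0 and the
   condition is the assigned value (0), so the branch is never taken. */
static int is_zero_buggy(int x)
{
    if (x = 0) {            /* legal C, but flagged by gcc -Wall */
        return 1;
    }
    return 0;
}

/* Small self-check showing the difference in behaviour. */
static void demo(void)
{
    assert(is_zero(0) == 1);
    assert(is_zero_buggy(0) == 0);   /* the bug: zero is not recognised */
}
```

Both functions compile; only the warning tells them apart before run time.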

Posted: **Sat May 10, 2008 9:58 am**

Solar wrote:That's one of the problems that stems not so much from the language, but from people not realizing that compiler warnings are something helpful, not a nuisance to be > /dev/null'ed.

I read somewhere that K&R never intended for C to be used without lint. I wonder if there's any truth to that. It would explain a lot...

Posted: **Sat May 10, 2008 10:46 am**

Solar wrote: 1) Pointers behave the way they do because the semantics of "++ptr" is "point to the next element". If ptr points to an integer, that means "point to the next integer". Not the second half of the current one, the next one. There is nothing "wrong" or "right" about this behaviour compared to what you would prefer, it's just the way C works. It ain't "broken", it's just not what you liked, so it's a matter of dislike, not breakage.

While you didn't reply to me, I must say that it is indeed not breakage, but that does not make it less horrible.
Apart from that, the semantics of the increment and decrement operators of C++ are very clearly described in the C++ ANSI standard. Sorry to disappoint you, but here it is: "The operand of prefix ++ is modified by adding 1, or set to true if it is bool (this use is deprecated)."

It does not mention special cases for pointers. So what you said is actually wrong. There are no such semantics defined for that operator.

So 4+1 = 8 must come from the binary + operator (we assume they use the semantics of + when they said "adding", because otherwise using ++ for pointer arithmetic would be a violation of the standard). The semantics of + are described as follows: "The result of the binary + operator is the sum of the operands".

And after that, they contradict themselves by making an exception for arrays, and defining non-array pointers to behave like they point to the first element of an array with 1 element.

Did you catch that? Arrays are defined in terms of pointers (elsewhere), and pointers are defined in terms of arrays.

So yes, it's correct (if we make some implicit assumptions about the standard that do not hold true outside the C++ world). That does not make it better or worse.

----

iammisc wrote:@Shark8: I say your statements are false because in many ways they are not criticizing C but rather your own personal opinions as to why C is bad

Which is what this thread was about. Not "why is C bad", but "why don't you use C". Read the first post. All the answers to the last question will obviously be opinions.

If you don't like it personally, it does not make it bad

Read the original question: "why don't you use C". "Because the way it works makes no sense to me" surely has to be a perfectly valid answer.

----

iammisc wrote:When it comes to pointer arithmetic, 4 + 1 is not 5 because a pointer is not a number(although it may be represented as one).

iammisc wrote:A pointer is an integer

It's an integer, but not a number?

----

in the context of a pointer to an integer, the next integer is not 1 byte away.

Since when did pointers get a concept of next?

----

Solar wrote:
Shark8 wrote:Like assignments allowable in conditional tests...
That's one of the problems that stems not so much from the language, but from people not realizing that compiler warnings are something helpful, not a nuisance to be > /dev/null'ed.

It is my opinion that when you have these choices:
A. Make a language where it's hard to make a common mistake
B. Make a language where it's easy to make a common mistake, and instead give a warning about it, so that the user has to go back and correct it

it's obvious that you'd want the first one. B gives a lot of extra work to both the compiler writer and the programmer. It's usually a trade-off. In rare cases you might find a win-win situation, good for both the compiler writer and the programmer. Only C has repeatedly chosen solutions that are time-consuming for both.

----

Ok, everyone, have a look at this:

Code: Select all

int num = 0xFE-1;

Apparently a perfectly legal piece of C code. While it is "correct" that it does not compile (language authors decide what is correct), it is not intuitive that otherwise legal code fails due to a brainless implementation of the preprocessor.
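
For reference, a short sketch of which spellings go through. The function names are invented for this example, and the explanation in the comments is my reading of C's lexical rules for preprocessing numbers rather than anything stated in the post above.

```c
#include <assert.h>

/* With spaces the lexer sees three tokens: 0xFE, -, 1. */
static int with_spaces(void) { return 0xFE - 1; }

/* Parentheses also end the literal before the minus sign. */
static int with_parens(void) { return (0xFE)-1; }

/* A hex literal that does not end in E is unaffected: 0xFD, -, 1. */
static int other_digit(void) { return 0xFD-1; }

/*
 * Without a separator, 0xFE-1 is scanned as ONE preprocessing number,
 * because such a token may contain 'e' or 'E' followed by a sign.
 * That single token is not a valid integer constant, so this is rejected:
 *
 *     int num = 0xFE-1;
 */

/* Small self-check of the three accepted spellings. */
static void self_check(void)
{
    assert(with_spaces() == 253);
    assert(with_parens() == 253);
    assert(other_digit() == 252);
}
```

So the practical workaround is a space or a pair of parentheses after any hex literal ending in E that is followed by + or -.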

----

Colonel Kernel wrote: I read somewhere that K&R never intended for C to be used without lint. I wonder if there's any truth to that.

If it isn't true, we just delete the last two words, right?

Posted: **Sat May 10, 2008 11:22 am**

Hi,

I'm not going to reply to every point in your post - I'll leave that to someone else.

I must say that it is indeed not breakage, but that does not make it less horrible.

100% subjective. I wouldn't pick up on it except for the fact that you keep parading your opinions as fact. A more correct statement would be, for example, "but that does not make it less horrible for me" or "in my opinion". That way people will take your opinion as an opinion and not misinterpret it as a great big hairy fallacy and flame you back (unless flaming is what you want, of course).

"The operand of prefix ++ is modified by adding 1, or set to true if it is bool (this use is deprecated)."

Which means that "x++;" expands to "x = x + 1;" - that is, a call to "x::operator+(1)". If x is a pointer, this will increment by the size of the pointed-to type.

This may seem like a special case, however the postfix (and prefix) ++ operator is overloadable, like most operators in C++. So defining "adding one" to mean "the value must be incremented numerically by one" only has meaning for numeric types. What happens in the case of a custom class?

This is why the "add by one" description is so vague (as with a lot of that particular standard), and how it is implemented differently for different types without (much) special-casing.

Since when did pointers get a concept of next?

They have been defined to have one since the conception of C.

Ok, everyone, have a look at this:
Code: Select all
int num = 0xFE-1;
Apparently a perfectly legal piece of C code. While it is "correct" that it does not compile (language authors decide what is correct), it is not intuitive that otherwise legal code fails due to a brainless implementation of the preprocessor.

Ambiguity in that particular statement is due to the grammar, and it should be noted that other languages suffer this same problem. If you allow exponent extensions to constants, and you allow hexadecimal numbers, then your compiler will have a problem parsing that statement, in pretty much any language that uses infix operators.

Cheers,

James

Posted: **Sat May 10, 2008 11:23 am**

I am a fervent supporter of C and I like it, but the language did not bring anything novel. Actually Algol is in fact more revolutionary than C, but it was too good for that period of time. Some machines even had an instruction set that supported Algol (Burroughs - something like that, I think!)

Similarly, although I find C# good, I find it unoriginal. Even Java is slightly unoriginal. The more original approach was the p-code approach taken by Wirth. But Java was implemented well. It's amazing to note that people do not give much respect to true originality.

Posted: **Sat May 10, 2008 11:54 am**

JamesM wrote:Ambiguity in that particular statement is due to the grammar, and it should be noted that other languages suffer this same problem. If you allow exponent extensions to constants, and you allow hexadecimal numbers, then your compiler will have a problem parsing that statement, in pretty much any language that uses infix operators.

Sorry, but that is simply not true. I have a production compiler right in front of me that allows exponent extensions to constants, and it doesn't have this problem.

What supernatural magic does it use to escape from this seemingly unsolvable problem? It simply does not look for exponent extensions after a hex constant (because it makes no sense).

Of course the problem is due to the grammar. The point is: Making a grammar without the problem would be trivial.

Edit: I just tested it with Pascal as well. It does not have this problem either. It's only C.
Edit: And I tested with Haskell. No such problem.

Code: Select all

Hugs> 0xfe-1
253

Edit: And specially written for this board, an EBNF grammar to parse numbers correctly, including hex numbers, integer literals, float literals, integers with exponents and floats with exponents.

Code: Select all

letter      ::=   A..Z | a..z | _
digit       ::=   0..9
hexdigit    ::=   0..9 | A..F
sign        ::=   + | -
infixop     ::=   + | - | * | / 
sint        ::=   <digit> {<digit>}
sfloat      ::=   <sint> [. <sint>]
decimal     ::=   <sfloat> [e [<sign>] <sfloat>]
hex         ::=   $<hexdigit> {<hexdigit>}

number      ::=   <hex> | <decimal>
name        ::=   <letter> [<name> | <digit> [<name>]]

term        ::=   <number> | <name>
expression  ::=   <term> {<infixop> <term>}

Valid example inputs, which should be sanely parsed:
5+2
$FE-2
5.4e-0.1-876-76
8

I didn't test it, so it could have some small errors, but the main concept should be obvious.
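
The main concept, that a hex literal simply never looks for an exponent, can be sketched as a hand-written scanner in C. This is not the poster's compiler; `scan_sfloat` and `scan_number` are hypothetical names, the `$` prefix and the fractional exponent follow the grammar above, and accepting lowercase hex digits is a liberty of the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* sfloat ::= sint [. sint]; returns the matched length, 0 if none. */
static size_t scan_sfloat(const char *s)
{
    size_t i = 0;
    while (isdigit((unsigned char)s[i])) i++;
    if (i == 0) return 0;
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        i++;                                        /* consume the dot */
        while (isdigit((unsigned char)s[i])) i++;   /* fraction digits */
    }
    return i;
}

/* Length of the number literal at the start of s, or 0 if there is none. */
static size_t scan_number(const char *s)
{
    assert(s != NULL);

    if (s[0] == '$') {                  /* hex: digits only, no exponent */
        size_t i = 1;
        while (isxdigit((unsigned char)s[i])) i++;
        return i > 1 ? i : 0;           /* a lone "$" is not a number */
    }

    size_t i = scan_sfloat(s);          /* decimal mantissa */
    if (i == 0) return 0;

    if (s[i] == 'e') {                  /* optional exponent */
        size_t j = i + 1;
        if (s[j] == '+' || s[j] == '-') j++;
        size_t k = scan_sfloat(s + j);
        if (k > 0) i = j + k;           /* commit only if digits follow */
    }
    return i;
}
```

With this scanner `$FE-2` yields a three-character hex literal followed by a separate minus, while `5.4e-0.1-876-76` yields the eight-character literal `5.4e-0.1`.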

Posted: **Sat May 10, 2008 12:23 pm**

Sorry, but that is simply not true.

Because you have an example of the opposite does not invalidate my statement. I said "other languages" not "all other languages".

You'll also notice that to "escape from this seemingly unsolvable problem" your production compiler invalidates one of the two expressions in my "if" clause.

Cheers,

James

EDIT: in response to your edit, I didn't say that it wasn't possible to easily change the grammar. The point is that C allows hex numbers with exponents. C is not the only language that does this (I've only tested Perl myself and that proved negative). I'm not defending this at all; it's a design decision which probably serves no use. However, there are many other ambiguities in the C and C++ languages which are far more important to rectify - I'm thinking of the nested template / operator>> ambiguity:

Code: Select all

A<B<C>> D;

Posted: **Sat May 10, 2008 12:46 pm**

JamesM wrote:
Sorry, but that is simply not true.
Because you have an example of the opposite does not invalidate my statement. I said "other languages" not "all other languages".

The fact is that I have tested 3 other languages, and none of them has this problem. All of them allow hex constants and e notation. Which other languages are you talking about?

You'll also notice that to "escape from this seemingly unsolvable problem" your production compiler invalidates one of the two expressions in my "if" clause.

Let's read it again:

If you allow exponent extensions to constants, and you allow hexadecimal numbers, then your compiler will have a problem parsing that statement

You want two things:
1. E notation allowed on constants.
2. Hex numbers allowed.
The compiler allows these two things, just not at the same time, which was not a stated requirement.

Example of expression with e notation and hex numbers in Haskell:

Code: Select all

1e-3-0xfe-1e3


Who here doesn't use C and why?

hi all ,