To be fair, indirection - which is what this confusion is all about - is damn hard to get without the hinting C and C++ use with the indirection ('*' and '->') and reference ('&') operators. Take a look at P. J. Plauger's 'Programming on Purpose' article 'The Knob on the Back of the Set' for examples of far worse confusions with indirection in Algol-68 (I'll go into more detail on that travesty in a separate thread on the General Ramblings board, as this post is way too long even without it).
But in this case, the problem is that ggodw000 keeps expecting direct operations where they should be expecting indirected ones, because that is all that Python has - and all most languages designed after 1990 have, for that matter.
Maybe it would be less confusing if Python didn't automatically create lists from list literals, or required a new keyword in front of them as in Java or Ruby. I dunno. But it doesn't, and that seems to be what ggodw000 is getting tripped up over.
What is happening here is that the basic semantics of Python objects - regardless of type, and regardless of the implementation - are those of variables as references or pointers to objects, and that objects themselves are independent and anonymous. This is, as I've said before, the same semantics used in Lisp, in Ruby, in Java for objects (but not primitive types, damn their eyes for that piece of confusion), and in C# (which did the same thing as Java but then added POD structures).
While this set of semantics implies a corresponding implementation, a good compiler will snap the pointers silently for some things to improve efficiency. That's a separate issue, though, and I've already discussed that in this thread.
The point is, the value of the variable itself is hidden, and handled directly by the automatic memory management system. At no point does the variable hold a non-pointer value, at least as far as the visible semantics are concerned.
For a C programmer, you have to imagine all of the variables are pointers, and objects are something which was allocated with malloc() somewhere else (even if it is a literal). Try this: every time you use a variable in a reference semantics language for anything other than assigning its reference to another variable, mentally put a * in front of it. That might help, though I can't promise it will.
So, what happened is this:
Variable a now points to a number object with the literal value of one - whether or not the interpreter actually stores that 1 directly in the memory space of the variable doesn't change the semantics of this, since tagging and the like are done under the hood, and the only way to know the details is with some sort of introspection library, or through an abstraction leak.
The reference in variable a is copied to variable b.
In C, this would be (semantically, at least) equivalent to:
Code:
int *a, *b;
a = malloc(sizeof(int));
*a = 1;
b = a;
So now both of them point to the same allocated memory.
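The same thing can be checked from the Python side. This is only a minimal sketch, assuming the two statements under discussion were a = 1 followed by b = a (that is my reading of the C version above, not a quote of the original snippet). The is operator compares references rather than values, and the extra rebinding of b at the end is added purely for illustration:

```python
a = 1
b = a                    # copies the reference held by a, not the object

same_object = a is b     # both names refer to the one int object

b = 2                    # rebinds b only; nothing happens to the object a refers to
a_after_rebind = a

print(same_object, a_after_rebind, b)
```

Rebinding b changes which object b refers to; it never writes through to the object that a is still pointing at.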
Variable a1 is assigned a reference to a list made up of two object references.
The reference in variable a1 is copied to variable b1.
Variable a1 is assigned a reference to a new list made up of three object references. The original object (which b1 still points to) is unchanged, and if b1 hadn't been assigned to it, that list would now have zero references and would get garbage collected eventually.
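The garbage collection remark can be watched happening, more or less. The sketch below is an illustration under assumptions: TrackedList and demo are hypothetical names made up for this example, the subclass exists only because built-in lists cannot be weakly referenced directly, and how promptly the object goes away depends on the implementation (CPython frees it as soon as the last reference is dropped; other interpreters may need the explicit gc.collect()).

```python
import gc
import weakref


class TrackedList(list):
    """Hypothetical list subclass; plain lists do not support weak references."""


def demo():
    a1 = TrackedList([100, 200])
    watcher = weakref.ref(a1)    # a weak reference does not keep the list alive
    a1 = [100, 200, 300]         # rebinding a1 drops the only strong reference
    gc.collect()                 # not needed on CPython, harmless elsewhere
    return watcher() is None     # True once the first list has been reclaimed


first_list_collected = demo()
print(first_list_collected)
```

Had a second variable been assigned the first list before a1 was rebound, the watcher would still have returned the list, which is exactly the b1 case described above.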
For the C code to illustrate the a1 and b1 example, I've needed some handwaving, in the form of some typedefs and utility functions for handling the specifics of Python allocation and typing. Sample typedefs and implementations for some of the utility functions are at the bottom of the post if you care to look. No promises that they would work, as they are just something I threw together for this, and the details really aren't relevant. Anyway, note that the earlier C code should have had the same, but I omitted it entirely because the details of the integer type handling would have just cluttered things.
Code:
PyListNode* append(PyListNode* head, PyObject* data);
size_t lookupTypeSize(PyObjectType *registry, char typeName[]); // returns the allocation size for the named type
PyObject* makePyInt(int value); // allocates a Python big-int
PyObjectType *typeReg; // a lookup table of type sizes, details don't matter
PyListNode *a1, *b1;
a1 = malloc(lookupTypeSize(typeReg, "list"));
append(a1, makePyInt(100));
append(a1, makePyInt(200));
// note that the append() function is operating
// on the list allocated earlier, not on 'a1' -
// in this case, a1 is just being used
// to pass the pointer to the head of the list
b1 = a1; // both now point to the same list
a1 = malloc(lookupTypeSize(typeReg, "list"));
// creates a *new* list, and assigns a1 a pointer to that list
// - b1 is still pointing to the existing list
append(a1, makePyInt(100));
append(a1, makePyInt(200));
append(a1, makePyInt(300));
// these operations are all on the second list,
// not the first one
The important thing to see here is that while it might look as if the second assignment to a1 is appending a value to the existing list, it is in fact creating a completely different list.
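In Python terms, and going by the values used in the C version above (the exact original statements are an assumption on my part), the sequence can be sketched like this, with is used to check whether the two variables still refer to the same list:

```python
a1 = [100, 200]
b1 = a1                     # b1 and a1 now refer to the same list object
shared_before = a1 is b1

a1 = [100, 200, 300]        # builds a brand-new list and rebinds a1 to it
shared_after = a1 is b1

print(shared_before, shared_after)
print(a1, b1)
```

The list literal on the right-hand side always constructs a fresh list, so b1 keeps referring to the original two-element list no matter how similar the new one looks.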
So, for the second example:
First, a new list is created, but not appended to, and the variable list1 is assigned a reference to it.
The next statement appends to the list pointed to by list1. Note that this is not changing list1 in any way, shape or form - the method dispatch uses the variable as a handle to access the object, and it is the list object which is actually getting changed.
The assignment after that is, here again, going variable to variable, not object to object. Both variables now point to the same list.
The last statement is once again appending to the list, which both variables are still pointing to. Neither list1 nor list2 has been modified, only the list object itself.
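A rough Python rendering of those four steps follows. The values 100 and 101 are taken from the C version in this post; the empty-list starting point and the choice of list2 for the final append are assumptions, since either variable would reach the same object:

```python
list1 = []               # a new list is created, nothing appended yet
list1.append(100)        # changes the list object; list1 itself is untouched
list2 = list1            # variable to variable: both refer to the same list
list2.append(101)        # again changes the one shared list object

still_shared = list1 is list2
print(still_shared, list1, list2)
```

Because append() works on the object rather than on the variable, the new element is visible through both names.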
The matching C code is:
Code:
PyListNode *list1, *list2;
list1 = malloc(lookupTypeSize(typeReg, "list"));
append(list1, makePyInt(100));
list2 = list1; // both now point to the same list
append(list2, makePyInt(101));
// there is only one list here, so this append is visible through both list1 and list2
The point is, what seems to be confusing you is not the second set of operations but the first set: it looks to you like an append when it isn't.
Anyway, here's that example implementation I mentioned. I can pretty much guess that the actual CPython implementation is very different.
Code:
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    char typename[MaxIdentifierSize];
    uint64_t minimum_size;
} PyObjectType;

// a tagged structure for an untyped pointer,
// as typing is handled at run time by
// the interpreter
typedef struct {
    PyObjectType* type_tag;
    uint64_t size;
    void* data;
} PyObject;

// the struct needs a tag so that it can hold a pointer to its own type
typedef struct PyListNode {
    PyObject *data;
    struct PyListNode *next;
} PyListNode;

PyListNode *a1, *b1;

// somewhere we have this function, which takes a pointer to a list and a piece of data to append to the end of the list:
PyListNode* append(PyListNode* head, PyObject* data)
{
    PyListNode *curr;
    // walk to the last node; assumes head is non-NULL and the last node's next is NULL
    // (a bare malloc() does not guarantee that, calloc() would)
    for (curr = head; curr->next != NULL; curr = curr->next)
    { }
    curr->next = malloc(lookupTypeSize(typeReg, "list"));
    curr = curr->next;
    curr->data = data;
    curr->next = NULL; // terminate the list at the new node
    return head;
}