Ninja'ed by Brendan. My post contains a worked example, so I'll still post it.
Schol-R-LEA wrote:OK, that explains that, then. If it's an important idiom of the language, changing it would not be desirable, so my suggestion is not a workable solution. It may be possible to make it work - say, by having timestamp metadata on each word - but it probably wouldn't be worth the trouble.
Well, what you could do is keep the linked list but just have the hash table serve as a shortcut into it.
Something like this:
We start by defining a word GREET:
To enter it into the dictionary, we first hash it. For purposes of illustration, let's use a 26-place hash table where the hash of a word is the position of it's first letter in the alphabet (it wouldn't be a good choice of hash algorithm in an actual implementation, but I hope it will make it easier to understand what I'm doing). So GREET hashes to 7. A hash table entry consists of a pointer to a list of all words whose hash value points to that position in the table. So our table looks like this:
Code: Select all
...
6) NULL
7) WORDLIST7
8) NULL
...
At address "WORDLIST7", we have a structure of the following form:
Code: Select all
(Word name, Pointer to most recent word with this name, pointer to next entry in wordlist)
So memory looks something like this :
Code: Select all
DICTIONARY:
NULL (A pointer to the most recent word in the dictionary will go here)
HASHTABLE:
...
6) NULL
7) WORDLIST7
8) NULL
...
WORDLIST7:
("GREET", FIRSTGREET, NULL)
At the address "FIRSTGREET", we establish a dictionary entry structure of the following form.
Code: Select all
(Word name, pointer to previous word in dictionary, pointer to previous word with same name, other data needed by the forth implementation)
We then point DICTIONARY at FIRSTGREET, so memory now looks like this:
Code: Select all
DICTIONARY:
FIRSTGREET
HASHTABLE:
...
6) NULL
7) WORDLIST_G
8) NULL
...
WORDLIST_G:
("GREET", FIRSTGREET, NULL)
FIRSTGREET:
("GREET", NULL, NULL, Data)
We then define a word "FOO":
We add a hashtable entry and wordlist for "F" (hash table entry 6), add "FOO" too the new wordlist, and add foo to the dictionary. Memory now looks like this:
Code: Select all
DICTIONARY:
FIRSTFOO
HASHTABLE:
...
6) WORDLIST_F
7) WORDLIST_G
8) NULL
...
WORDLIST_G:
("GREET", FIRSTGREET, NULL)
FIRSTGREET:
("GREET", NULL, NULL, Data)
WORDLIST_F:
("FOO", FIRSTFOO, NULL)
FIRSTFOO:
("FOO", FIRSTGREET, NULL, Data)
We now redefine GREET:
Memory now looks like this:
Code: Select all
DICTIONARY:
SECONDGREET
HASHTABLE:
...
6) WORDLIST_F
7) WORDLIST_G
8) NULL
...
WORDLIST_G:
("GREET", SECONDGREET, NULL)
FIRSTGREET:
("GREET", NULL, NULL, Data)
WORDLIST_F:
("FOO", FIRSTFOO, NULL)
FIRSTFOO:
("FOO", FIRSTGREET, NULL, Data)
SECONDGREET:
("GREET", FIRSTFOO, FIRSTGREET, Data)
To see what happens with a hash collision, let's define a word "FLOB", memory then looks like this:
Code: Select all
DICTIONARY:
FIRSTFLOB
HASHTABLE:
...
6) WORDLIST_F
7) WORDLIST_G
8) NULL
...
WORDLIST_G:
("GREET", SECONDGREET, NULL)
FIRSTGREET:
("GREET", NULL, NULL, Data)
WORDLIST_F:
("FOO", FIRSTFOO, WORDLIST_F_NEXT)
FIRSTFOO:
("FOO", FIRSTGREET, NULL, Data)
SECONDGREET:
("GREET", FIRSTFOO, FIRSTGREET, Data)
WORDLIST_F_NEXT
("FLOB", FIRSTFLOB, NULL)
FIRSTFLOB:
("FLOB", SECONDGREET, NULL, Data)
When looking up a word, we hash it and use the hash as an index into HASHTABLE. We follow the pointer in the hashtable entry and see if the word name there matches the word we're looking for. If so, we follow the first pointer in the wordlist entry to find the most recent definition of that word. If not, we follow the second pointer in the wordlist entry to find the next word with that hash. If the second pointer is NULL, then the word we're looking for does not exist.
When forgetting a word, we follow DICTIONARY to the most recent word in the dictionary and forget it. If the words name matches that of the word we are trying to forget, we stop after we've forgotten this word. If not, we follow the first pointer in the dictionary entry to the previous word in the dictionary, and continue on, forgetting words as we go, until we get one whose name matches the word we're trying to forget. This is just like regular Forth.
The difference is that, as part of the procedure for forgetting a word, we need to maintain the wordlists referenced by the hashtable. We do this by using the second pointer in the dictionary entry. If it is NULL, there are no previous definitions of the word, we just hash the word to get to the wordlist it's in, find the word in the wordlist, and remove it from the wordlist. If it isn't NULL, then there's a previous definition of the word. We leave it in the wordlist and use the second pointer in the dictionary entry to find the previous definition. We then fill in the pointer to the word in the wordlist entry with the pointer to the previous definition.