Page 1 of 2

How do B-tree-based file systems work?

Posted: Tue Mar 01, 2016 5:10 pm
by physecfed
So I've been making an attempt to study file systems for later implementations, and of course this involves the obligate brewing of a FAT driver (in the works). However, I began also studying the HFS+ file system as it a) is less constrained than FAT, and b) has an open implementation as opposed to NTFS. However, the architecture of HFS and HFS+ (as well as many other file systems) are based upon B-trees as opposed to the linked list methodology used in FAT implementations.

From a top-level architectural description, how do B-tree-based file systems store their data? I've tried looking at the Wikipedia page for this but it has much too theoretical of a lean for my eyes at this point. What's the difference between a B-tree structure and a linked list structure?

Re: How do B-tree-based file systems work?

Posted: Wed Mar 02, 2016 1:00 am
by alexfru
physecfed wrote:What's the difference between a B-tree structure and a linked list structure?
The page says that every node of a B tree has up to M (> 2) children. And children may have further children.
Can you see a difference between a tree and a list and a tree with sort-of embedded lists in its nodes?
Perhaps, first take a look at 2-3 and 2-(3-)4 trees if too many child nodes are too many.

Re: How do B-tree-based file systems work?

Posted: Wed Mar 02, 2016 1:52 am
by iansjack
In simple terms, a b-tree is a more efficient data structure with respect to searching than a linked list (logarithmic search times rather than linear). The name is fairly descriptive - it stores its data as a balanced tree rather than a linear list.

You may find http://www.nobius.org/~dbg/practical-fi ... design.pdf of interest.

Re: How do B-tree-based file systems work?

Posted: Sun Mar 06, 2016 4:02 pm
by physecfed
iansjack wrote:In simple terms, a b-tree is a more efficient data structure with respect to searching than a linked list (logarithmic search times rather than linear). The name is fairly descriptive - it stores its data as a balanced tree rather than a linear list.

You may find http://www.nobius.org/~dbg/practical-fi ... design.pdf of interest.
That's a wonderfully-written resource. I've spent the past few days looking at it.

Would this be a correct way of implementing a B-tree node for M=2?

Code: Select all

typedef struct _btnode2 {
	uint32_t* data;
	struct _btnode2* child1;
	struct _btnode2* child2;
} btnode2_t;

Re: How do B-tree-based file systems work?

Posted: Sun Mar 06, 2016 4:45 pm
by iansjack
That's the basic data structure. You might find it useful to add another field which points to the parent of the node, so that you could traverse the tree from the bottom up as well as the top down.

In addition, you want to make the tree as balanced as possible, so that each node at a given level has roughly the same number of subnodes. Otherwise, in the worst scenario, you just end up with a linked list. This is a slightly more difficult exercise than a simple tree and you have to weigh up the efficiencies gained with the extra processing required to balance the tree.

Re: How do B-tree-based file systems work?

Posted: Sun Mar 06, 2016 11:56 pm
by physecfed
iansjack wrote:That's the basic data structure. You might find it useful to add another field which points to the parent of the node, so that you could traverse the tree from the bottom up as well as the top down.

In addition, you want to make the tree as balanced as possible, so that each node at a given level has roughly the same number of subnodes. Otherwise, in the worst scenario, you just end up with a linked list. This is a slightly more difficult exercise than a simple tree and you have to weigh up the efficiencies gained with the extra processing required to balance the tree.
Ah, so something like the following:

Code: Select all

typedef struct _btnode4 {
	uint32_t* data;
	struct _btnode2* parent;
	struct _btnode2* children[4];
} btnode4_t;
What I'm having trouble understanding is how the tree can be said to be self-balancing; does this mean that if the root has 3 subnodes, for example (3 pointers pointing to other nodes in the tree), that each of those nodes also has three child nodes associated with it?

The other thing that is tripping me up is the notion of keys - to what does this refer?

EDIT: Brief moment of enlightenment - do "keys" hold the same meaning here as they would, in say a dictionary or map? That is, a key maps to a value? I took another look at that document - for example, in a POSIX-like system, could a ".." key be used to refer to the address of the parent directory node?

Re: How do B-tree-based file systems work?

Posted: Mon Mar 07, 2016 2:41 am
by alexfru
physecfed wrote:The other thing that is tripping me up is the notion of keys - to what does this refer?

EDIT: Brief moment of enlightenment - do "keys" hold the same meaning here as they would, in say a dictionary or map? That is, a key maps to a value?
Guess how dictionaries and maps are implemented. There's either a hash table or a tree under the hood.

Re: How do B-tree-based file systems work?

Posted: Mon Mar 07, 2016 4:25 am
by physecfed
alexfru wrote:
physecfed wrote:The other thing that is tripping me up is the notion of keys - to what does this refer?

EDIT: Brief moment of enlightenment - do "keys" hold the same meaning here as they would, in say a dictionary or map? That is, a key maps to a value?
Guess how dictionaries and maps are implemented. There's either a hash table or a tree under the hood.
That makes sense; the logarithmic time helps prevent bogging-down of the entire structure for large sizes.

What does an individual node look like in an average implementation? (let's suppose we're dealing with Java's HashMap, say HashMap<String, String>) Do the individual nodes hold data themselves; that is, would it be safe to describe a node like the following?

Code: Select all

typedef struct _bnode3 {
	char				     key[255];
	char*				    value;
	struct _bnode3* 	  parent;
	struct _bnode3*		children[3];
} bnode3_t;

Re: How do B-tree-based file systems work?

Posted: Mon Mar 07, 2016 5:39 am
by alexfru
Think. In a binary sorted tree every node contains a key and at most 2 child nodes. Those child nodes' keys are less and greater than their parent's key respectively (let's not worry about duplicate keys for now as it's not needed to understand the general picture). So, every node(key) potentially subdivides the remaining range of keys in two, each in a subtree. Right?

Now your node has at most M keys (you chose 255). How many child nodes can it have? Well, every two adjacent keys in a node represent a range of keys (between those two). So, there're at most M-1 subranges between the minimum and the maximum keys in a node. Right? And then there are 2 side subranges for keys smaller than the minimum key and larger than the maximum key (just like in a binary tree, where every node has only side subranges). That's M-1+2=M+1 subranges and therefore M+1 child nodes. So, you should have

Code: Select all

struct _bnode*      children[256];
And, logically, for every key there's a value, so you should also have

Code: Select all

char*                value[255];
This would now make a bit more sense than 255 keys, 1 value and 3 children in a node. :)

Re: How do B-tree-based file systems work?

Posted: Mon Mar 07, 2016 1:15 pm
by physecfed
alexfru wrote:Think. In a binary sorted tree every node contains a key and at most 2 child nodes. Those child nodes' keys are less and greater than their parent's key respectively (let's not worry about duplicate keys for now as it's not needed to understand the general picture). So, every node(key) potentially subdivides the remaining range of keys in two, each in a subtree. Right?

Now your node has at most M keys (you chose 255). How many child nodes can it have? Well, every two adjacent keys in a node represent a range of keys (between those two). So, there're at most M-1 subranges between the minimum and the maximum keys in a node. Right? And then there are 2 side subranges for keys smaller than the minimum key and larger than the maximum key (just like in a binary tree, where every node has only side subranges). That's M-1+2=M+1 subranges and therefore M+1 child nodes. So, you should have

Code: Select all

struct _bnode*      children[256];
And, logically, for every key there's a value, so you should also have

Code: Select all

char*                value[255];
This would now make a bit more sense than 255 keys, 1 value and 3 children in a node. :)
I understand the basic topology of the tree, and I get that there are restrictions on both the minimum and maximum numbers of downstream links each node can support, but I'm not getting two things how the keys and values work; I don't understand how they aren't equal.

In a standard hashmap or dictionary, keys map to values in a 1:1 ratio; always 1:1. I'm looking at this site's tutorials on the subject and it appears to show, like you said, N-1 keys for N links to downstream nodes.

What I'm not understanding is how does one get the Nth key for the other node? I'm trying to think of this from an implementation perspective, where ideally one would be attempting to scale the tree with indirection using pointers. the notion of the key and the method used to logically link the tree (i.e. pointers) two different issues?

Put it this way. If we have three nodes arranged so that two children are linked to the parent (or root), then the root, by the logic we've talked about, should have one key for two children. But that root (in a structural implementation) would have two pointers to the children. So, even if you can verify that a key K exists in the tree T, how do you actually go about getting addresses from keys in order to scale the tree, when there are K+1 addresses for those K keys, and the memory addresses might not correspond in any way to the system of key ordering?

(My apologies, by the way - I'm not usually this completely dumbfounded or stupified by a subject)

Re: How do B-tree-based file systems work?

Posted: Mon Mar 07, 2016 1:30 pm
by SpyderTL
I've never really implemented a B-Tree structure, but after reading through this thread, is it accurate to say that a B-Tree is similar to a linked list, with the difference being that each entry has two links instead of one.

So, normally, to add a node to the middle of a linked list, and maintain sorting, you would have to start at the beginning, and check each entry until you found the right insert position, and then modify two entries to "insert" the new entry in the linked list.

With a B-Tree, you start with the root node, and compare the inserted value (or hash) with the current node, and follow the "less than" link if the value is less than the current node, or follow the "greater than" link if the value is greater than the current node, until you eventually find a "less than" or "greater than" link that is NULL, and then you just set that link to the inserted entry. The search should be much faster, and the "insert" is twice as fast.

Am I understanding this correctly, or completely off target?

EDIT: After some reading, this sounds more like a binary tree than a B-Tree...

Re: How do B-tree-based file systems work?

Posted: Mon Mar 07, 2016 1:41 pm
by iansjack
That's a fairly good description of the generalities involved. As the root node may have to change to balance the tree it's difficult (and not efficient) to try to ensure balance when you create a new node. I look on it a bit like garbage collection - have a background process that rebalances the tree every now and then. You have to make sure that the tree is not in use whilst it is being balanced or else you have all sorts of potential for corruption. Or I guess you could create a balanced copy and then switch to that as the main structure - but you would still have to ensure that any changes made to the original whilst the tree was being balanced were reflected in the copy.

The other alternative is just to rely on chance to keep the tree reasonably balanced.

Edit - reading around a bit I see that B-trees are self-balancing. The Wikipedia article is probably worth a read: https://en.wikipedia.org/wiki/B-tree

Re: How do B-tree-based file systems work?

Posted: Mon Mar 07, 2016 3:48 pm
by gerryg400
B-trees are more often used 'on-disk' because they require fewer disk reads than binary trees. The 'B' in B-tree does not stand for 'binary' and each node in a B-tree is usually big enough to fill a block on disk or some other size that helps performance.

Re: How do B-tree-based file systems work?

Posted: Tue Mar 08, 2016 7:13 am
by alexfru
physecfed wrote: In a standard hashmap or dictionary, keys map to values in a 1:1 ratio; always 1:1. I'm looking at this site's tutorials on the subject and it appears to show, like you said, N-1 keys for N links to downstream nodes.

What I'm not understanding is how does one get the Nth key for the other node?
What?
physecfed wrote: I'm trying to think of this from an implementation perspective, where ideally one would be attempting to scale the tree with indirection using pointers. the notion of the key and the method used to logically link the tree (i.e. pointers) two different issues?
What? The tree "scales" by growing extra levels. That happens when nodes start filling up and have to be split because there's no room left for another key-value pair. So you get extra nodes and eventually extra levels (not talking about the details of re-balancing).
physecfed wrote: Put it this way. If we have three nodes arranged so that two children are linked to the parent (or root), then the root, by the logic we've talked about, should have one key for two children. But that root (in a structural implementation) would have two pointers to the children.
Right. I see no reason for your "But", though. I'd say "And".
physecfed wrote: So, even if you can verify that a key K exists in the tree T, how do you actually go about getting addresses from keys
Addresses of what?
physecfed wrote: in order to scale the tree, when there are K+1 addresses for those K keys, and the memory addresses might not correspond in any way to the system of key ordering?
Within a node, keys and their respective values are logically tied/paired. You can have an array of k-v pairs in a node. Or you can have it split into two distinct arrays, an array of keys and an array of values, still directly corresponding to one another.
Obviously, if you want quick key search (which is the hole point of the data structure at hand), you keep the array of keys ordered/sorted. So, when you insert a new key between, let's say, key3 and key4 (because key3 < new key < key 4), you also insert its corresponding value between value3 and value4 (irrespective of how values relate to one another). The new key and its value both have the same index in separate arrays or in the same array, the index that used to belong to key4-value4 pair. key4-value4 moves one place.

The same ordering logic applies to links to children as those links are stuck between adjacent keys and there are two extra links on the sides.

I'm sorry, I'm having trouble understanding what it is that you don't understand.

Re: How do B-tree-based file systems work?

Posted: Tue Mar 08, 2016 8:09 am
by Combuster
Considering a tree is a way of sorting items, what you really see in all the textbook examples is that the tree contains the items in their entirety - usually a set of numbers: 1,2,6,8,12,13

Any entry in a dictionary consists of two parts a "key" and a "value". What really happens when you sort a dictionary using a tree is that the full dictionary entry becomes the element in a tree - so what in tree terminology is a key is actually a (key,value) pair:

Code: Select all

Map classify_living_objects<String, String> ==

                                                   +------------+--------------+
                                                   | crow-avian | horse-mammal |
                                                   O------------O--------------O
                                                   |            |              |
                        +--------------------------+            |              +------------------+
                        |                                       |                                 |
+-----------------+------------+---------------+   +------------+------------+   +------------------+---------------+
| blackbird-avian | cat-mammal | cobra-reptile |   | dodo-avian | dog-mammal |   | snapdragon-plant | spider-insect |
+-----------------+------------+---------------+   +------------+------------+   +------------------+---------------+
After that, you can decide if what you want is to store the dictionary entry in it's entirety, or that your dictionary values are actually fixed-size pointers instead.