I sectioned my project into two configurable variables: small data and huge data.
small data - for testing the algorithm's correctness.
huge data - for stress-testing with large inputs to see whether any anomalies appear.
Either way, I fed the values 0-100 (small data) and data sets of varying size (1000, 2000, ..., 14000, 15000) into the tree.
With the huge data, feeding thousands of values in incrementing order is the worst case: the tree ends up horribly, completely skewed to one side and performs just like a list. Approaching the 15000-element set, it took quite a few seconds to insert all 15000 values.
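For reference, here is a minimal sketch of the kind of plain BST insert I mean; the names (node, insert) are mine, not necessarily the real program's:

Code:
#include <stdio.h>
#include <stdlib.h>

typedef struct node {
    int key;
    struct node *left, *right;
} node;

/* Plain BST insert: walk down until an empty slot is found. */
node *insert(node *root, int key) {
    if (root == NULL) {
        node *n = malloc(sizeof *n);
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key)
        root->left = insert(root->left, key);
    else
        root->right = insert(root->right, key);
    return root;
}

int main(void) {
    node *root = NULL;
    /* Ascending feed: every insert goes right, so the "tree" is really
     * a 15000-node list and the last insert walks past all the others. */
    for (int key = 1; key <= 15000; key++)
        root = insert(root, key);
    printf("root key: %d\n", root->key);  /* 1: everything hangs right */
    return 0;
}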
Now, to make it most efficient, I insert recursively by bisection: find the median of the range, insert it, then recurse on each half. Doing so maximized the insertion efficiency (in other words, minimized the insert time), as sketched below.
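The bisection fill itself can be as small as this, reusing the node type and insert() from the sketch above (bisectInsert is my name for it, not necessarily the original's):

Code:
/* Insert the median of [lo, hi] first, then recurse on each half.
 * The halves' medians become the root's children, and so on, so a
 * sorted range builds a balanced tree of depth about log2(n). */
node *bisectInsert(node *root, int lo, int hi) {
    if (lo > hi)
        return root;
    int mid = lo + (hi - lo) / 2;
    root = insert(root, mid);                /* median first    */
    root = bisectInsert(root, lo, mid - 1);  /* then lower half */
    root = bisectInsert(root, mid + 1, hi);  /* then upper half */
    return root;
}

/* Usage: root = bisectInsert(NULL, 1, 15000); the root key is 7500. */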
But man, I was surprised by how big a difference there is between these two insertion methods. I dedicated one variable, insertIterationCounter, which increments each time the function recursively walks through a tree node while inserting a single value.
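The instrumentation is just a counter reset before each insertion and bumped once per existing node the walk passes through. This is my guess at the shape, assuming the insert() above; with this convention a value placed at depth d logs d, which is what the powers of two in the log further down suggest (counting the final attachment step too would shift everything by one):

Code:
int insertIterationCounter;    /* reset to 0 before every insertion */

/* Same walk as insert() above, counting each existing node visited. */
node *insertCounted(node *root, int key) {
    if (root == NULL) {        /* empty slot found: attach, no count */
        node *n = malloc(sizeof *n);
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    insertIterationCounter++;  /* walked through one more node */
    if (key < root->key)
        root->left = insertCounted(root->left, key);
    else
        root->right = insertCounted(root->right, key);
    return root;
}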
When a data set of, say, 15000 is fed incrementally one by one, the last and largest value, 15000, takes 15000 iterations to insert. Picking the median and then simultaneously incrementing and decrementing outward improves things only by a factor of 2: the tree just becomes two huge branches of 7500 nodes each.
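For comparison, that two-ended feed would look roughly like this (again my sketch, reusing insert() from above). Each half still arrives in sorted order, so each subtree degenerates into a spine:

Code:
/* Hypothetical two-ended fill: median first, then alternately one
 * step up and one step down. The right subtree gets 7501, 7502, ...
 * in ascending order and the left subtree 7499, 7498, ... in
 * descending order, so both are ~7500-node spines: only a
 * factor-of-2 win over the plain sorted feed. */
node *twoEndedFill(int n) {               /* n = 15000 in the test */
    int mid = (1 + n) / 2;
    node *root = insert(NULL, mid);
    for (int d = 1; mid + d <= n; d++) {
        root = insert(root, mid + d);     /* ascending side  */
        if (mid - d >= 1)
            root = insert(root, mid - d); /* descending side */
    }
    return root;
}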
But with the median approach, the deepest insertion walk took only 13 iterations once all 15000 nodes were filled. With a set this large and a maximum of only 13 iterations, I thought something was wrong with my algorithm, so I checked with my treePrint function and the log file output, and indeed the tree had all 15000 nodes inserted.
So I counted the number of values that went through 13 iterations before being inserted: 2048 (as the log below shows)! Which means the bottom level of the binary tree is about 2048 nodes wide!! Then the 13 iterations made sense: the node count doubles at each level, so 13 levels below the root are enough for well over 15000 nodes.
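Two quick sanity checks along the lines of what treePrint confirmed for me; countNodes and maxDepth are hypothetical helpers of mine, assuming the node type above:

Code:
/* Count every node, to confirm all 15000 made it into the tree. */
int countNodes(node *t) {
    return t ? 1 + countNodes(t->left) + countNodes(t->right) : 0;
}

/* Depth of the deepest node, with the root at depth 0. A perfect
 * tree of depth 13 holds 2^14 - 1 = 16383 nodes, so 15000 nodes
 * can indeed all fit within depth 13. */
int maxDepth(node *t) {
    if (t == NULL)
        return -1;
    int l = maxDepth(t->left);
    int r = maxDepth(t->right);
    return 1 + (l > r ? l : r);
}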
Performed this way, the insertion took less time than it takes to blink an eye.
Efficiency compared:
Incrementing data feed: max iterations for one insert: through 15000 nodes.
Recursive insertion by median: max iterations for one insert: through 13 nodes.
Grouping the inserted values by number of iterations, these are the statistics I got. The doubling is clean up through depth 10 (2, 4, 8, ..., 1024), but depth 11 jumps to 4096, so there might still be some problem or skew toward the end; overall, though, it seems to be working fairly well. (One caveat on reading the counts: grep ": 1" also substring-matches ": 10" through ": 13", so the 11266 figure is 1024 + 4096 + 4096 + 2048 = 11264 of those plus only 2 true one-iteration entries.)
Code:
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 13" | wc -l
2048
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 12" | wc -l
4096
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 11" | wc -l
4096
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 10" | wc -l
1024
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 9" | wc -l
512
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 8" | wc -l
256
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 7" | wc -l
128
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 6" | wc -l
64
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 5" | wc -l
32
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 4" | wc -l
16
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 3" | wc -l
8
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 2" | wc -l
4
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 1" | wc -l
11266
[root@localhost cs.dev]#