did lot of experiment with tree algorithm

Programming, for all ages and all languages.
Post Reply
ggodw000
Member
Member
Posts: 396
Joined: Wed Nov 18, 2015 3:04 pm
Location: San Jose San Francisco Bay Area
Contact:

did lot of experiment with tree algorithm

Post by ggodw000 »

Needed to soup up with algorithm practice so started doing binary tree in C.
Sectioned my project into two configurable var-s: small data, huge data
small data - for testing the algorithm.
huge data - stressing over large data to see any anomaly appears.

either way i fed the 0-100 (small data) and varying number of data sets (1000, 2000....14000, 15000) into tree.

With large data with thousands of members latter, if inserted into incrementing numbers, it will be worst case, as tree is of course horribly, completely skewed to one side, it was just performing like list. and once approaching the 15000 it tooks about quite a few seconds to insert 15000 units.

Now to make most efficient, recursively call insert bottom by finding median and calling insert bottom. Doing so maximized the insertion efficiency (in another words, minimized the insert time).

But man, I was surprised by how much different it makes between these two insertion methods. Dedicated one variable: insertIterationCounter which increments each time the function recursively walks through tree nodes for single data insertion into a node.

When data set size for example 15000 is fed incrementtedly 1 by 1, then the last and highest value of 15000 took 15000 iteration to insert. Picking the median and working simultenously incrementing and decrementing will improve only by factor of 2. The tree will have only two huge branches of 7500 data.

But for median approach, it only took 13 iteration for highest iteration walk through when 15000 nodes were filled. WIth this large set, and only max iteration of 13, I thought there is something wrong with my algorithm with treePrint function and output to log file and indeed tree had 15000 nodes inserted.

So I counted the number of data that are inserted to binary tree through 13 iteration before being inserted, that was 2000! Which means base of the binary tree become 2000 wide!! Then 13 iterations made sense.
When performed insertion took less than a time it takes to blink an eye.

Efficiency compared:
Incrementing data sets: max iteration: through 15000 nodes.
Recursive insertion with median: max iteration: through 13 nodes.


When grouping the data-s inserted by number of iterations, here is the following statistics I got, looks like there might still be some problem with algorithm or skewage toward the end, but seems to be working fairly good.

Code: Select all

[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 13" | wc -l
2048
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 12" | wc -l
4096
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 11" | wc -l
4096
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 10" | wc -l
1024
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 9" | wc -l
512
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 8" | wc -l
256
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 7" | wc -l
128
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 6" | wc -l
64
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 5" | wc -l
32
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 4" | wc -l
16
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 3" | wc -l
8
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 2" | wc -l
4
[root@localhost cs.dev]# cat bTree-big-oh-artificial-bisect-verbose.log | grep "00014000" -A 15000 | grep ": 1" | wc -l
11266
[root@localhost cs.dev]#
key takeaway after spending yrs on sw industry: big issue small because everyone jumps on it and fixes it. small issue is big since everyone ignores and it causes catastrophy later. #devilisinthedetails
User avatar
dchapiesky
Member
Member
Posts: 204
Joined: Sun Dec 25, 2016 1:54 am
Libera.chat IRC: dchapiesky

Re: did lot of experiment with tree algorithm

Post by dchapiesky »

So what kind of Binary Tree do you believe you have created?

There are so many....

Have you tried comparing to AVL? or Red/Black?

cheers
Plagiarize. Plagiarize. Let not one line escape thine eyes...
Icee
Member
Member
Posts: 100
Joined: Wed Jan 08, 2014 8:41 am
Location: Moscow, Russia

Re: did lot of experiment with tree algorithm

Post by Icee »

ggodw000 wrote:But for median approach, it only took 13 iteration for highest iteration walk through when 15000 nodes were filled.
This sounds wrong to me. If you have a binary tree of height 13, it can at most host 2^13 - 1 = 8191 nodes.
User avatar
dchapiesky
Member
Member
Posts: 204
Joined: Sun Dec 25, 2016 1:54 am
Libera.chat IRC: dchapiesky

Re: did lot of experiment with tree algorithm

Post by dchapiesky »

would you be willing to post your code.....?
Plagiarize. Plagiarize. Let not one line escape thine eyes...
ggodw000
Member
Member
Posts: 396
Joined: Wed Nov 18, 2015 3:04 pm
Location: San Jose San Francisco Bay Area
Contact:

Re: did lot of experiment with tree algorithm

Post by ggodw000 »

found defect in the code. the code worked perfectly with the 96 node inserted as a small test. But with 2000 node insertion or more, flaws observed: only ~1500 of the 2000 nodes were inserted and others were ignored. Will fix and post it.
key takeaway after spending yrs on sw industry: big issue small because everyone jumps on it and fixes it. small issue is big since everyone ignores and it causes catastrophy later. #devilisinthedetails
ggodw000
Member
Member
Posts: 396
Joined: Wed Nov 18, 2015 3:04 pm
Location: San Jose San Francisco Bay Area
Contact:

Re: did lot of experiment with tree algorithm

Post by ggodw000 »

dchapiesky wrote:So what kind of Binary Tree do you believe you have created?

There are so many....

Have you tried comparing to AVL? or Red/Black?

cheers
I am starting with the simpler ones. sorted such that left child is always smaller then parent and right is always bigger than parent, no duplicate values. Forgot the name of it.
key takeaway after spending yrs on sw industry: big issue small because everyone jumps on it and fixes it. small issue is big since everyone ignores and it causes catastrophy later. #devilisinthedetails
Post Reply