Page 1 of 2

Revisiting a school homework assignment.

Posted: Thu Nov 23, 2017 8:04 am
by Darkpony
This semester in my algorithms and data structures class we had a maze finder backtracking recursion assignment. I found all the paths from a random start point and end point with an obstacle on a simulated 2d array board. The prof recorded the times it took to compute the boards and I placed in the middle. I was wondering if I could improve my time by programming bare metal. How would i get started? I am in assembly now, and I have completed bare bones but thats pretty much all I have done. My guess would be to use only the libraries that gcc gives me. Any decent guides out there? I have found this and was wondering if this is a decent guide. https://arobenko.gitbooks.io/bare_metal ... n_mem.html

Re: Revisiting a school homework assignment.

Posted: Thu Nov 23, 2017 11:17 am
by Korona
This is one of the cases where assembly barely helps. If you don't know assembly it will take you weeks to beat the compiler.

Did you run a worst-case exponential algorithm that tries all paths. Use a better algorithm like BFS (which equals Dijkstra's algorithm for undirected graphs) or even A*.

Re: Revisiting a school homework assignment.

Posted: Thu Nov 23, 2017 11:39 am
by Schol-R-LEA
Darkpony wrote:This semester in my algorithms and data structures class we had the old backtracking recursion assignment.


Which one? There are several widely used examples of backtracking by recursion used in such courses; from that sentence alone, I would have probably guessed that you meant Eight-Queens, but from the description given after that:
Darkpony wrote:I found all the paths from a random start point and end point with an obstacle on a simulated 2d array board. The prof recorded the times it took to compute the boards and I placed in the middle.
I am guessing you mean some variant of a Maze Path problem (where on a simply-connected maze or not), or some general pathfinding problem.
Darkpony wrote:I was wondering if I could improve my time by programming bare metal.
Probably not, though without a lot more information about the previous implementation and you're assembly language skills, I couldn't hazard a guess.

The first thing to realize is that the speed of most of the problems which fit the description given are dominated by algorithmic time complexity, not implementation efficiency - no matter how tight your code is, an O(n log n) algorithm (I am assuming here that the course covered Big-O notation) is going to be faster than an O(n^2) algorithm for any n above the difference in the marginal implementation cost in time of the two algorithms (by Marginal Implementation Cost in Time, I mean the time usage for a single pass of a given implementation of an algorithm - that is to say, its unit operational time. Time performance of an implementations is roughly proportional to the product of the MICT and the algorithmic complexity applied to the number of elements operated on. This means that, as a rule, for n below some point, the simpler implementation would be faster, but above that point, the more efficient algorithm would perform better even if it had a significantly higher MICT).

For a given implementation, the MICT in turn depends on how good the running code is - both how good a programmer you are, and how good the source code translator's generated code is. With assembly, the latter depends almost entirely on the former, while for a compiled program, it depends on how good the compiler's optimization is. Now, while it is true that a capable assembly programmer can - in principle - always outperform a compiler, in practice, few assembly programmers are skilled enough to beat even a modestly-optimizing compiler (such as GCC or Visual C++), and the larger the program, the harder it is to sustain that level of excellence. A really highly optimizing compiler can work with a lot more data points than a programmer can keep in their head, and grind over it a lot faster than the programmer could; but again, that level of optimization is a black art, so most compilers don't go to the extremes of ones like, say the Intel C compiler, or the Stalingrad Scheme compiler. As a rule, the results mirror that described above - for a really short program, hand-coding often wins, but the longer and more complex the program gets, the more things favor the compiler, even with no real optimizations applied.

Re: Revisiting a school homework assignment.

Posted: Thu Nov 23, 2017 1:16 pm
by Darkpony
Damn you guys are right. I don't know why I thought I would see a significant performance increase, and I don't have enough skill to beat the compiler anyway. My best bet would be to use the Dijkstra's algorithm.

Re: Revisiting a school homework assignment.

Posted: Thu Nov 23, 2017 4:13 pm
by Brendan
Hi,
Darkpony wrote:This semester in my algorithms and data structures class we had a maze finder backtracking recursion assignment. I found all the paths from a random start point and end point with an obstacle on a simulated 2d array board. The prof recorded the times it took to compute the boards and I placed in the middle. I was wondering if I could improve my time by programming bare metal.
By removing features (having no device drivers and no IRQs, having no security and no CPL=3 -> CPL=0 -> CPL=3 switches for kernel API, having no multitasking and no task switches, having none of the virtual memory tricks and no TLB misses) you might be able to improve performance by about 1%. This is what you'd end up with if you have "bare metal" code booted directly by GRUB with no OS at all.

Of course (on modern computers) you can probably find a way to use multiple CPUs/threads to improve performance by about 200%. For example; maybe you have 2 threads working from entry to exit where one takes left-hand turns first and the other takes right hand turns first, and 2 more threads working from exit to entry doing similar; where you've found a solution when an "entry to exit thread" and an "exit to entry thread" collide in the middle.


Cheers,

Brendan

Re: Revisiting a school homework assignment.

Posted: Thu Nov 23, 2017 7:05 pm
by Schol-R-LEA
Well, at the very least you should look at how it performs, and maybe try a few others like A* to see how each performs for the specific task. All depends one how far you want to take this.

Maybe if you could post or link to the exact assignment, and the code you wrote, we could say more.

On a semi-unrelated note, I did some testing of a number of things, and in the process I came up with a little Scheme snippet that takes a function representing a function's time complexity, and a unit running time (in some unspecified units - it could be days or nanoseconds, what matters is that all the compared implementations are rated in the same way) and returns a function which computes a total time to expected for a given application of the implementation (modulo any start-up and tear down times) for a size b.

Code: Select all

(define order-of-n
  (lambda (fn marginal)
      (lambda (n)
        (inexact->exact (ceiling (* marginal (fn n))))))
I don't know if it would help, but I figured I'd post it for... well, ego satisfaction, I suppose. I'm not sure how easy it would be to implement in C++ even with the addition of things like lambdas in C++11, but it ought to be easy to translate it into any language which has first-class higher-order functions (e.g., Haskell, ML, Erlang, CaML, most of the other Lisps).

As a quick example of using it:

Code: Select all

;; define a simple identity function that just returns its argument
(define identity (lambda (n) n))

;; most Scheme implementations only include a natural log function
(define logB 
  (lambda (x B)
    (lambda (x)
      (/ (log x) (log B)))))

(define log2 (logB 2))
(define log10 (logB 10))

(define n-1000 (order-of-n identity 1000))

(define nlgn-500 (order-of-n (lambda (n) (* n (log2 n))) 500))
(define nlogn-500 (order-of-n (lambda (n) (* n (log10 n))) 500))

(define lgn-10k (order-of-n log2 10000))
(define logn-10k (order-of-n log10 10000))
Which can be applied at the REPL as so, to show how the differences in unit time and algorithmic performance change things (lower is better):

Code: Select all

>>> (n-1000 1000000)
1000000000
>>> (logn-10k 1000000)
60000
>>> (lgn-10k 1000000)
199316
>>> (nlogn-500 1000000)
3000000000
>>> (nlgn-500 1000000)
9965784285
I may have more to say on this later.

Re: Revisiting a school homework assignment.

Posted: Fri Nov 24, 2017 1:04 am
by Darkpony
Erlang actually would be a really good idea, however at this point I have no experience with it. In programming languages we go over it, and this might be a good project for it. I can also implement threads, but at the time I wasn't very comfortable with them. Here was my assignment description.

Consider a rectangle divided into squares, like a checkerboard or chessboard. One square is marked as the hole. Another square is marked start, and a third square is marked finish. We will call the result a board.

Coordinates of squares on the board are zero-based. So x
-coordinates of squares are in the range 0 to dim_x−1, where dim_x is the x dimension of the board. And similarly for y

-coordinates.

Place a spider on the start square. The spider can step to an adjacent square on the board: north, south, east, west, or any of the four diagonal directions. It does this repeatedly until it reaches the finish square. A way of doing this that does not step on the hole, and steps on each other square exactly once, is a holey spider run.

Here is how I implemented my code
Test base case
(squaresleft - 2 == 0) && on finish positions
I increment my count. Incrementing a count means i found a path.

1.Check to see if my vector is in range
2.Check to see if I visited the Cell
3. Decrement -- squares
3.Change my pos_x and pos_y values
4.Change the value in that cell to 1
5.Begin recursion
6.Now begin backtracking. Change the value to 0
7.Increment squares
8.Revert my position.

I can post the source code if you guys really really want. I could have improved my time by checking to see if the board was impossible before starting my algorithm. A simple move right looks something like this..

Code: Select all

if (isCell(pos_x + 1, pos_y, dim_x, dim_y)) // checks to see if the vector is in range.
	{
		if (ary[(pos_x + 1)*(dim_y) + pos_y] == 0) // means we havent been there
		{
			--_squaresLeft; // decrement the squares left
			(pos_x) += 1; // move our position
			ary[(pos_x)*(dim_y) + (pos_y)] = 1; // set that position to visited
			countHSR_recurse(ary, dim_x, dim_y, finish_x, finish_y, pos_x, pos_y, _squaresLeft); // recursion
			ary[(pos_x)*(dim_y) + (pos_y)] = 0; // backtracking set our position to 0
			++_squaresLeft; // increment our squares left
			pos_x -= 1; // move back
		}
	}
I choose to use a 1d vector instead of a 2d array because I was having trouble dynamically allocating and passing it. However, you can simulate a 2d array with posx*dimy + posy. The order of this algorithm to find all the paths when run through the master theorem comes out as quadratic. This is probably why i came in somewhere in the middle of my class. So yea my algorithm isn't very good.

Re: Revisiting a school homework assignment.

Posted: Fri Nov 24, 2017 2:34 am
by Brendan
Hi,
Darkpony wrote:Consider a rectangle divided into squares, like a checkerboard or chessboard. One square is marked as the hole. Another square is marked start, and a third square is marked finish. We will call the result a board.

Coordinates of squares on the board are zero-based. So x
-coordinates of squares are in the range 0 to dim_x−1, where dim_x is the x dimension of the board. And similarly for y

-coordinates.

Place a spider on the start square. The spider can step to an adjacent square on the board: north, south, east, west, or any of the four diagonal directions. It does this repeatedly until it reaches the finish square. A way of doing this that does not step on the hole, and steps on each other square exactly once, is a holey spider run.
This isn't a maze at all. For something like this, you don't need backtracking or recursion, and it should be so fast that more threads can't help.

Normally there would be 2 "shortest paths" (diagonal then lateral, or lateral and then diagonal), and at least one of these paths will miss the hole. For example:

Code: Select all

    ...........
    .S___O_....
    ..\....\...
    ...\____\E.
    ...........
In this case, to calculate how much to go diagonally, it'd be "diagonal_size = min( abs(x1 - x2), abs(y1 - y2) );" and the rest should be fairly obvious (some "if(x1 < x2)" and "if(y1 < y2)" tests to figure out the direction, some "is hole on this line" checks).

The abnormal case is where "diagonal_size == 0" (where there's only one path from start to end, and that path is either a vertical or horizontal line) and the hole is on the line from start to end. In this case you just need an initial diagonal move before (re)doing the same as above (using the temporary location after the initial diagonal move as a new "start").


Cheers,

Brendan

Re: Revisiting a school homework assignment.

Posted: Fri Nov 24, 2017 4:19 am
by Darkpony
I probably should have mentioned that I was required to use recursion. Also you have to hit all squares on the board while not passing through a visited square ending on the finish. This counts towards one completed path. I must find all the paths from start to end. So finding the shortest path doesn't help for this problem.

Re: Revisiting a school homework assignment.

Posted: Fri Nov 24, 2017 3:12 pm
by Schol-R-LEA
Darkpony wrote:I probably should have mentioned that I was required to use recursion. Also you have to hit all squares on the board while not passing through a visited square ending on the finish. This counts towards one completed path. I must find all the paths from start to end. So finding the shortest path doesn't help for this problem.
Ah, OK, I think that was the missing detail we needed. So this is what is called a 'tour' problem, similar to the Knight's Tour. This type of graph theory problem is related to the Traveling Salesman Problem, except that instead of the shortest-path though all the vertices of a fixed set of edges and vertices, the intent is to find a path touring a finite Moore neighborhood upright square lattice; and to the Seven Bridges of Königsberg, where the restriction is on passage through the edges, rather than through the vertices.

Which brings up the question, is this meant to find a single suitable path (e.g., prove that one exists for the given layout), or the set of all possible paths (possibly by an exhaustive search of the problem space)? The former would probably take... a very long time for some layouts, especially if done through brute-force. It sounds like that is what is intended, and that is the real heart of the assignment.

Also, I am assuming that it is a bounded (planar) checkerboard graph rather an unbounded (e.g., spherical, toroidal) or partially bounded (e.g., mobius) one. In other words, that there are 'edges' to the 'board' that cannot be crossed, rather than having them wrap around to some other side, meaning that the squares along those edges don't have as many neighbors as the ones in the mid-board.

I am not what you would call an expert at any of this, but the topic comes up when implementing Game of Life (which is usually given as toroidal) and its variants, so I had some idea of where to look all of these terms up. You don't need to understand of them - I certainly don't - but the important thing is that you have some reference points for looking further into this.

I recommend looking at the pages regarding data structures for representing graphs as an abstract data type, specifically adjacency lists and adjacency matrices, especially the latter given that this amounts to a dense graph; however, you need to give some thought to how you are representing the found paths, both to make it easier to represent how you found them, and to allow you to apply some kinds of heuristics for pruning the path trees. A tree representing the successful paths might help with the former (reflecting the backtracking itself - the recursion is in effect creating an implicit tree of this type, you just need to make it explicit and persistent), while a priority queue might be of use in the latter.

Re: Revisiting a school homework assignment.

Posted: Sat Nov 25, 2017 1:25 pm
by DavidCooper
It's an interesting challenge, but I wonder how much people learn from it if they aren't shown afterwards how they could have improved their code. There are a number of things you may not be doing that other people may be doing. Here are some ideas I had last night while thinking about it.

(1) If you're working with an 8x8 grid of squares, you could use a 10x10 grid and mark all the ones by the edge as "visited" so that you never need to make additional checks to see if you've left the board, and they'll always remain marked as "visited" because when you backtrack a path you will never visit them to unmark them. (The square with the hole should be marked as "visited" in the same way because that is all you need to stop the path running through it.) Note that you can simply make moves by adding or subtracting the values 1, 9, 10 or 11 (or rather, a multiple of those, the size of which depends on how big the set of variables is for a single square) without ever having to worry about where you are on the grid or where you're moving to. On each move, you would use one of the new square's variables to store the address of the variable set for the previous square, so backtracking is simply a matter of retrieving that address to point at the previous square.

(2) Unvisited squares could be marked using the value 3, while the finish square would have the value 2 instead. This allows a single subtraction of 2 to set the zero flag if you've reached the final square (which is perhaps the first thing you should check when you land on a square, and if the zero flag is set you would then check the count to see if all squares have been visited - if they haven't, then you should abandon that path immediately and backtrack). Without needing to do a second calculation, you can then check the carry flag to see if the square has already been visited. (Note that the count of how many squares you've visited need only ever be checked when you find you're on the finish square, so we can eliminate a lot of unnecessary processing.) [Maybe I've got that the wrong way round and the carry flag should be checked first, but I'll leave you to think about that for yourself.]

(3) When backtracking, it isn't necessary to check whether the count is zero if it's a backtrack from a failed advance, so make sure you aren't wasting processor time on that. You only need to check the count when backtracking from a square which has had all eight paths starting from it checked (meaning that there are no further potential paths from it left to explore).


If each square has a set of 8 variables labelling which paths have been taken from it (with the current path being explored), there may be ways of avoiding some moves by marking them as checked already. For example, a square horizontally or vertically adjacent to a corner square could mark the path to the other square that's horizontally or vertically adjacent to the same corner square as explored (unless the start or finish square is that corner square), but as soon as you start trying to use such a trick, you increase the amount of processing required when working out the next move from that square and when backtracking (where you wouldn't want to erase that marking), and that extra processing would slow down every move made, probably costing you much more than it saves. It's hard to think of any trick of this kind that won't increase the amount of processing for each step and cost more than the gains, but it could be worth experimenting with some ideas to see if they pay off.

For speed reasons, don't use calls and rets; avoid using the stack if everything can be done with registers; disable paging and interrupts; and it may be worth checking to see if 16-bit mode is faster than 32-bit mode - it may be quicker at storing and retrieving addresses, but that would depend on the 16-bit instructions not running slower than the 32-bit equivalents.

If you're doing multiprocessing, you could set each processor going from the start square with a different direction to explore, but you will then have to check on every backtrack that isn't from a failed advance to see if you've returned to the start square so that no processor goes on to duplicate work that's already been done by another (as that would provide a wrong answer to the total number of viable paths). If one processor completes its task early, it's hard to see how it could be allocated extra work by another processor without increasing the amount of work done at each step, so there are a lot of challenges to explore there.

If any of these ideas are new to you, discuss them with other members of your class - by sharing ideas with them you'll get ideas back from them too and you'll all advance faster.

Re: Revisiting a school homework assignment.

Posted: Sat Nov 25, 2017 10:01 pm
by Brendan
Hi,
Darkpony wrote:I probably should have mentioned that I was required to use recursion. Also you have to hit all squares on the board while not passing through a visited square ending on the finish. This counts towards one completed path. I must find all the paths from start to end. So finding the shortest path doesn't help for this problem.
That makes it harder, but there's still a solution to the specific problem that is faster/cheaper than a generic solution (like A* or similar). The difficult part is designing an algorithm for the specific problem.

I'd begin by splitting the board up into rectangular "sub-boards" by adding (at most) 3 vertical and 3 horizontal "cutting lines" that pass through the starting point, hole and the finishing point. This gives you 6 to 16 sub-boards; where the start, hole and exit are in the corner of a sub-board and most sub-boards (3 to 13 of them) don't contain the start, hole or exit. For each of these rectangular sub-boards there's only 3 "fill patterns" that matter (that result in a exit point that is on an edge); "lots of parallel lines" (exit in a corner), "lots of diagonal lines" (exit in the opposite corner to the entry) and "U shape" (exit in the middle of an edge). There's different orientations of these fill patterns (rotations, mirrors); and there's trivial restrictions - e.g. if the sub-board's entry point is in the middle of an edge then you must use "U shape" with a specific rotation, if the width is odd you can't use 2 of the orientations of "U shape" (because you wouldn't end on an edge), etc. More importantly; it's easy to use the entry point and width and height of the sub-board to determine valid orientations of fill patterns and determine the exit point for each valid orientation of each fill pattern (without actually filling any individual tiles).

The basic idea here would be to use that information to find a path through entire sub-boards that goes from "sub-board containing starting point" to "sub-board containing finishing point". This could/should be extremely quick, because there's a maximum of only 16 sub-boards (the actual size of the whole board is irrelevant - it could/should take the same amount of time to find a solution for a 500*500 board as it does for a 5*5 board because there's a maximum of 16 sub-boards in all cases). Because there's only 16 sub-boards a simple brute force search would be fine.

The other hard part would be trying to figure out how to make everything worse (slower, more complex) to fulfil the pointless "must have recursion" requirement.


Cheers,

Brendan

Re: Revisiting a school homework assignment.

Posted: Sat Nov 25, 2017 11:39 pm
by Schol-R-LEA
Brendan, while your advice is helpful, it isn't quite on the mark. I think you missed two salient points.

First, that that it isn't A-B pathfinding, it's a tour - that is, the path has to pass through every square, exactly once.

Second, that (by my interpretation, anyway) it isn't finding a tour, it is finding every tour - it's an exhaustive mapping of the tour space for the given board, starting point, and ending point.

My reading of this is that the goal is to create a tree that contains all of the possible tours of the board, with the branch nodes representing the differences between two paths - hence the backtracking.

Now, the divide-and-conquer approach you describe is still relevant and usable, but the way it would be used (if I am correct in my reading of the assignment) would be a bit different. What the OP would do is, in effect, build the first entries in the tree bottom up rather than top down, and memoize the layouts as they are constructed, so that when it finds a sub-board which matches a previous one (or a rotation of a previous one) it can simply copy the existing branch.

I'm interested enough to work on this some more myself, and will try to come up with a pseudo-code explanation of what I have in mind tomorrow.

Re: Revisiting a school homework assignment.

Posted: Sun Nov 26, 2017 11:00 am
by DavidCooper
(4) You can get rid of the additions too for finding the next square. Each square's set of variables should start with the eight addresses of the squares around it. A variable will count down how many of them have been tried already from the current square, counting down in multiples of the number of bytes used to hold an address. If using 32-bit addresses, you'd subtract 4 each time and check the zero flag to see if they've all been done, then you use the count value as an offset into the list of eight addresses to collect the address of the next square to go to.

The variables for a square start then with eight addresses, then a byte to label the square as visited/unvisited/finish, then three unused bytes, then a place to store the address of the previous square, then the direction variable that counts down in 4s the number of taken paths from the current square.

[Edit: move the visited/unvisited/finish variable and the three unused bytes ahead of the list of 8 addresses - when the direction variable reaches zero, you don't use an address from that offset, so I had the addresses out of position by four bytes.]

When moving from one square to the next, the program loads the address of the next square's set of variables into SI (with the current square's variables pointed to by DI), and it collects the visited/unvisited/finish variable for the next square, subtracts 2 from it (in a register), then checks the carry flag to see if it has already been visited; if not, it checks the zero flag to see if it's the finish square; if not, it saves DI in the new square's variable set, moves SI to DI (unless two versions of the routine are used on alternate steps to avoid the need to switch register), then the direction variable is collected, has 4 taken away (result must be posted back) and used as an offset into the list of addresses to find the address of the next square's set of variables. It's that quick - minimal processing for each square visited.

The other thing to pay attention to if you want maximum speed is the alignment of parts of the code which are jumped to - the addresses of these should be multiples of 4, or maybe 8 or 16 - try different spacings and time the difference in performance to see what happens on different hardware.

The really interesting question though is whether any kind of shortcuts can be taken to avoid crunching every step of every line (up to the point where it hits the finish square) without adding so much more time to the processing of a single step that it more than cancels out the gains. I've been wondering if a second processor could watch the action and tell the first to abandon lines that can't work (e.g. in cases where there are unvisited squares with no open path left between them), but this might be more costly than having that processor work independently using the same brute-force approach as the first instead of helping the first processor abandon a minority of lines early. To force the first processor to abandon a line early, the first processor would need to keep checking to see if it's been told to abandon a line, and that would slow it down, so I see a lot of losses and little gain in return.

Re: Revisiting a school homework assignment.

Posted: Sun Nov 26, 2017 12:46 pm
by Schol-R-LEA
Gaah, I just noticed a real problem in my previous post. I've fixed it now, and hopefully, that will be the last one (though it probably isn't).

Also, as for the recursion being 'pointless', I am pretty sure that using recursion is the point of the assignment - or rather, the point was for the students to demonstrate that they understood recursive data structures (specifically acyclic ones such as trees) - it is a data structures course, after all - and how recursive functions could be used to build and traverse them. The part Brendan is dismissing is actually the whole purpose of the exercise.

Regardless of whether you see this as something worth teaching to the students or not, it is what the professor chose to teach, and the OP was stuck with it, at least in the assignment itself. If Darkpony chooses to adhere to that restriction now that the assignment is over, for whatever reason, that's their choice.

We should probably wait until Darkpony shows up in the forum again so we can ask about it.

That having been said, comparing the recursive solutions to one or more iterative ones would probably be instructive, once Darkpony is satisfied with what could be learned from the recursive methods. I would recommend applying profilers such as gprof and memory checkers such as valgrind (or whichever ones are appropriate for the compiler and toolchain in use) to the programs as well.