Looking for volunteer to develop parallel Bochs architecture

bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Looking for volunteer to develop parallel Bochs architecture

Post by bewing »

Hi Brendan,

Your proposal is quite interesting ... but there are some practical drawbacks that I will have to think about.
Things like: spinlocks aren't so good if you happen to be on a single-processor system with 100ms timeslices (and you can't tell whether you are or not with normal programming techniques).

I agree that it's not possible to fully sync -- and anyway real hardware doesn't fully sync either, so there is no good reason to do it. You can sleep(0), of course.

Bruce
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Looking for volunteer to develop parallel Bochs architecture

Post by Brendan »

Hi,
bewing wrote:Your proposal is quite interesting ... but there are some practical drawbacks that I will have to think about.
Things like: spinlocks aren't so good if you happen to be on a single-processor system with 100ms timeslices (and you can't tell whether you are or not with normal programming techniques).
If you can't get the CPU affinity for the process, then you might have to assume it's single-CPU (but that's fine because you're probably running on DOS anyway). :)
bewing wrote:I agree that it's not possible to fully sync -- and anyway real hardware doesn't fully sync either, so there is no good reason to do it. You can sleep(0), of course.
Hehe. If you use "sleep(0)" then there won't be any other threads (in the same process) to run and the OS's scheduler will probably end up doing a full process switch (flushing your TLB entries and cache). In this case it's probably better to spin to keep caches warm.
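The trade-off between spinning and yielding can be sketched as a bounded spin that falls back to giving up the time slice. This is only an illustration, not code from Bochs: the function names and the spin budget are hypothetical, and `sched_yield()` assumes a POSIX host.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: try to take `lock`, spinning at most `max_spins`
 * times. Returns true once the lock is held, false if the budget ran out. */
bool spin_try_acquire(atomic_flag *lock, unsigned max_spins)
{
    assert(lock != NULL);
    for (unsigned i = 0; i < max_spins; i++) {
        /* test_and_set returns the previous value: false means we got it. */
        if (!atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
            return true;
    }
    return false;
}

/* Spin briefly to keep caches warm, then yield so that a single-CPU host
 * can run whichever thread currently holds the lock. */
void hybrid_acquire(atomic_flag *lock, unsigned max_spins)
{
    while (!spin_try_acquire(lock, max_spins))
        sched_yield();  /* POSIX call; other hosts need an equivalent */
}

void hybrid_release(atomic_flag *lock)
{
    assert(lock != NULL);
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

A small spin budget behaves much like yielding straight away, while a large one approaches a pure spinlock, so the budget is the knob this disagreement is really about.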

For my idea you should be able to provide a "maximum number of instructions executed by a thread before switching to another CPU" setting (very similar to what Bochs already does) and remain synchronized to within N emulated instructions.
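One way to read the "within N emulated instructions" idea is as a check that a CPU thread makes before each step. The sketch below is single-threaded and uses hypothetical names (`executed`, `max_lead`); a real multithreaded version would need atomic counters and a way to wait, neither of which is shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest instruction count among all emulated CPUs. */
uint64_t slowest_cpu_count(const uint64_t *executed, size_t ncpus)
{
    assert(executed != NULL && ncpus > 0);
    uint64_t min = executed[0];
    for (size_t i = 1; i < ncpus; i++) {
        if (executed[i] < min)
            min = executed[i];
    }
    return min;
}

/* True if emulated CPU `cpu` may execute one more instruction without
 * getting more than `max_lead` instructions ahead of the slowest CPU. */
bool cpu_may_step(const uint64_t *executed, size_t ncpus,
                  size_t cpu, uint64_t max_lead)
{
    assert(cpu < ncpus);
    return executed[cpu] - slowest_cpu_count(executed, ncpus) < max_lead;
}
```

With counts of 100, 150 and 200 and a `max_lead` of 100, the first two CPUs may continue while the third has to wait for the slowest one to catch up.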


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
stlw
Member
Posts: 357
Joined: Fri Apr 04, 2008 6:43 am

Re: Looking for volunteer to develop parallel Bochs architecture

Post by stlw »

bewing wrote:(And if my code doesn't run 5 times faster than yours, I'll eat my hat. High-efficiency code is my specialty, and I am even published once on the topic.)
Keep your hat for now, high-efficiency code is my speciality as well :)
And do not forget about Amdahl's law!
bewing wrote:The Siminterface has been deleted. Most of textconfig has been deleted. Access to variables through the param tree is done for user display only, and the param tree has been completely redefined and simplified.
Looks fine, but it will not change things that much.
The textconfig and siminterface code runs less than 0.5% of the time. Even the simplest profiling shows that 99.5% of the time you are running CPU emulation code.
I agree with you, siminterface is ugly and would be good to rewrite. But even throwing it away completely won't make a difference if you want to reach the "5 times faster" goal.
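The arithmetic behind the Amdahl's law remark can be checked with a few lines of C. The 0.5% and 99.5% figures are the ones quoted in the post; the function names are made up for this illustration.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction `p` of the run time
 * is made `s` times faster and the rest is left unchanged. */
double amdahl_speedup(double p, double s)
{
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}

/* Upper bound on the speedup when the fraction `p` is removed
 * entirely (the limit of the formula above as s grows without bound). */
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

Here `amdahl_limit(0.005)` comes out at roughly 1.005, so deleting code that runs 0.5% of the time buys about half a percent. Getting close to 5x overall requires making the CPU emulation itself about five times faster: `amdahl_speedup(0.995, 5.0)` is roughly 4.9.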
bewing wrote:All the CPU memory structures have been completely redefined for efficiency (as arrays).
What is the CPU memory structure? (I just don't understand what you mean.)
Could you send me an example of your redefined CPU code?
bewing wrote:The selected debugger now is completely in charge of driving the simulation. The interface between the init code and the debuggers has been standardized. All the multithreaded mods above have been implemented.
Cool! This is what I wanted to get in the Bochs 2.3.8 time frame, but I did not succeed because nobody responded when I asked about this interface and for help defining it.
bewing wrote:The instruction decoder has been made more efficient than the instruction cache, and therefore the instruction cache has been deleted.
Now it is my turn: I will eat my hat if you can show me an x86 instruction decoder that is faster than a blind copy of the decoded instruction from the cache.
The instruction cache/trace cache in Bochs could be improved 2-3 times at least, but there is no way to write an x86 decoder that is faster than caching.
I remember you described how you plan to do the decoder. I can guarantee you a pretty nice slowdown of 2-5x against the cache.
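For readers unfamiliar with the technique being argued about, a decoded-instruction cache can be sketched as a direct-mapped table keyed by instruction address. This is a toy model, not the Bochs implementation: the structure layout, the table size and `toy_decode` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ICACHE_ENTRIES 1024u  /* assumed size; must be a power of two */

typedef struct {
    uint32_t opcode;  /* placeholder for the real decoded fields */
    uint8_t  length;
} decoded_insn;

typedef struct {
    bool          valid[ICACHE_ENTRIES];
    uint64_t      tag[ICACHE_ENTRIES];
    decoded_insn  insn[ICACHE_ENTRIES];
    unsigned long decode_calls;  /* counts misses, for illustration */
} insn_cache;

/* Stand-in for a real (and much more expensive) x86 decoder. */
decoded_insn toy_decode(uint64_t addr)
{
    decoded_insn d = { (uint32_t)(addr & 0xFFu), 1 };
    return d;
}

/* On a hit, return the cached entry; on a miss, decode and fill the slot. */
const decoded_insn *icache_fetch(insn_cache *c, uint64_t addr)
{
    assert(c != NULL);
    uint32_t slot = (uint32_t)(addr & (ICACHE_ENTRIES - 1u));
    if (!c->valid[slot] || c->tag[slot] != addr) {
        c->insn[slot] = toy_decode(addr);
        c->tag[slot] = addr;
        c->valid[slot] = true;
        c->decode_calls++;
    }
    return &c->insn[slot];
}
```

The hit path is one index computation, one compare and a pointer return, which is the cost any decoder would have to beat. A real emulator also has to invalidate entries when the guest modifies its own code, which this sketch leaves out.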
bewing wrote:BX_EVENTS have been deleted.
What do you mean? Did they exist?
bewing wrote:Breakpoints have been simplified and are managed for efficiency.
I once asked you for this new breakpoint code for current Bochs. Could you send it? It is a pretty good idea and I want to integrate it into today's Bochs.
(Who knows when and if the rewrite will be finished.)
bewing wrote:All the code is written in pure C for portability and simplicity.
Neither portability nor simplicity is even close here.
Current Bochs code is written in "very basic" C++, i.e. we stay away from templates and the STL, and even try hard to stay away from virtual functions unless required.
Personally, I am strongly against virtual functions or any complex class hierarchy in Bochs.
Pure C can't be more portable than such a subset of C++ :)

About simplicity we can argue a lot. I still do not believe that pure C code is simpler to understand.
Look at your GUI debugger code: once you are doing something complicated, you end up with dozens of global variables (which you never understand) or catch yourself emulating C++ in C with structs that you must pass to every function by pointer (which does not add to simplicity either).
bewing wrote:All devices are always compiled into the code, and "enabled" at runtime. Plugins have been eliminated.
You missed the point here. The "plugins" were invented to allow a user to write a device module separately from Bochs and attach it to the already compiled Bochs model. You can argue this is not working yet, even now, but this is certainly the intention. Plugins are the only way I see to enable such a capability.
bewing wrote:Most conditional code has been eliminated. Almost all compile-time options have been eliminated, and turned into runtime options (bochsrc or commandline). One GUI and one textmode debugger are selected at compile time (only).
Do you want to say that checking for every instruction whether it is supported on the current CPU (which is configured at runtime) is faster than just conditionally compiling all such features? Add yourself another 1.5x slowdown vs current Bochs.
bewing wrote:Next, I am going to redefine and simplify the interface between the sim and the devices.
Bruce, of course I am being a bit cynical above, and I am sure some of your ideas are very good.
I will be very glad to replace the param tree in current Bochs already, or to take some of your ideas into the code.
You should understand: once every year or two a person comes along with ideas for a complete rewrite of the whole Bochs code to make it 10 times faster and 10 times simpler. You know, it has never succeeded yet. On paper everything looks perfect, but it becomes different when converted to algorithms and code.
But every such person always has brilliant ideas which give a breakthrough to Bochs. And I am already scheduling the next breakthrough, even if I can't tell whether your rewrite will succeed or even happen.

Stanislav
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Looking for volunteer to develop parallel Bochs architecture

Post by bewing »

stlw wrote:high efficiency code is my speciality as well
Then we will just have to see who is better at it, won't we? :wink:
stlw wrote:The "plugins" were invented to allow user to write a device module separately from Bochs and attach it to the already compiled Bochs model.
Which turns out not to be useful. Which is why it has been deleted.
stlw wrote:Looks fine but will not change things that much.
Not for speed, but it simplifies the code tremendously.
stlw wrote:You want to say that checking for every instruction if it is supported on current CPU (which is configured at runtime) is faster than just conditionally compiling all such features?
No, you simply choose the proper decoder function in advance to match the CPU. Then you don't need to check anything at all.
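Choosing the decoder in advance can be sketched with a function pointer that is selected once at configuration time. This is an illustration of the idea only, not bewing's code: the enum, the function names and the toy decoders are hypothetical, and each decoder recognizes just one or two groups of two-byte (0F-prefixed) opcodes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU levels configured at runtime. */
typedef enum { CPU_LEVEL_386, CPU_LEVEL_P6 } cpu_level;

/* Decoder for the second byte of a 0F-prefixed opcode.
 * Returns 1 if the opcode is recognized, 0 for an invalid opcode. */
typedef int (*decode_fn)(uint8_t op);

/* Toy 386 decoder: knows only the near Jcc group, 0F 80..8F. */
int decode_0f_386(uint8_t op)
{
    return op >= 0x80 && op <= 0x8F;
}

/* Toy P6 decoder: additionally knows CMOVcc, 0F 40..4F. */
int decode_0f_p6(uint8_t op)
{
    return (op >= 0x40 && op <= 0x4F) || decode_0f_386(op);
}

/* Called once when the emulated CPU is configured. */
decode_fn select_decoder(cpu_level level)
{
    assert(level == CPU_LEVEL_386 || level == CPU_LEVEL_P6);
    return level == CPU_LEVEL_P6 ? decode_0f_p6 : decode_0f_386;
}
```

The per-instruction path then contains no feature check, only an indirect call through the stored pointer. Whether that indirect call is cheaper than conditionally compiled code is the open question in this exchange.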
stlw wrote:You should understand, once in a year or two comes a person with ideas of complete rewrite of whole Bochs code to make it 10 times faster and 10 times simpler. You know, it is never succeed yet.
I'm not surprised, but that's because they don't have the guts to use their delete key enough on the current code. My delete key is currently smoking, and partially melted.
stlw wrote:Once you doing smth complicated you end up with dozens of global variables ...
Yup. Our ideas of aesthetic code are different. So what?
Brendan wrote: If you can't get the CPU affinity for the process, then you might have to assume it's single-CPU (but that's fine because you're probably running on DOS anyway).
I suspect that you are assuming (even more than I am) that the simulation host system is x86-based. :wink:
Brendan wrote:In this case it's probably better to spin to keep caches warm.
I'd rather let the other processes on the host system have the extra time. Bochs itself already runs at "below normal priority" (I think this is a mistake to do) to allow all the rest of the machine more time.
mkaushik
Posts: 2
Joined: Tue Apr 28, 2009 4:51 pm

Re: Looking for volunteer to develop parallel Bochs architec

Post by mkaushik »

Is this work still in progress? I can help out.
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Looking for volunteer to develop parallel Bochs architec

Post by bewing »

Yes, I assume that there are two concurrent projects doing this. I think Stanislav is still considering adding these features to Bochs. I can tell you from experience that adding multithreading to Bochs is not easy in the least. So you can contribute code to Bochs if you like.

Also, a year ago when we built this thread, I decided to create a complete rewrite of Bochs with these features. I have been working on it since. The parts that exist are running beautifully, the coding is readable and clear and fast. There are only 8 small items left on the todo list, before I release "BeBochs" for alpha testing and a request for code contributions. I am expecting to do exactly that within 2 weeks. If you want to contribute code now or when I release for alpha testing, I would be delighted with the help.
stlw
Member
Posts: 357
Joined: Fri Apr 04, 2008 6:43 am

Re: Looking for volunteer to develop parallel Bochs architec

Post by stlw »

bewing wrote:Yes, I assume that there are two concurrent projects doing this. I think Stanislav is still considering adding these features to Bochs. I can tell you from experience that adding multithreading to Bochs is not easy in the least. So you can contribute code to Bochs if you like.

Also, a year ago when we built this thread, I decided to create a complete rewrite of Bochs with these features. I have been working on it since. The parts that exist are running beautifully, the coding is readable and clear and fast. There are only 8 small items left on the todo list, before I release "BeBochs" for alpha testing and a request for code contributions. I am expecting to do exactly that within 2 weeks. If you want to contribute code now or when I release for alpha testing, I would be delighted with the help.

I have three vectors in mind while developing Bochs (sorted by importance to me):

1. Accuracy and rich content (new instructions, modes or functionality)
2. Speed
3. Simplicity

Because accuracy is most important, lately I have been dealing mostly with accuracy issues (more than 50 accuracy bug fixes in CVS since the last release, a large part of them in VMX emulation). So I have had no time to start on the multithreading story yet. Actually I think this will require a significant change in how timing is emulated in Bochs, and I still have no idea what the BKM (best known method) is here :(

But I am also open to changes/patches or even module replacements that could be done for simplicity.
I still hate how param_tree is implemented in Bochs and wonder if I could find a better replacement, and I still wonder if the device model structure could be simpler. Not sure about the CPU changes Bruce suggests, but only because I never saw them. It sounds to me like the code is going to be absolutely unmaintainable (like the GUI debugger code today), but I might be wrong :)

If you have some ideas from the thread above and you would like to discuss them, and probably even want to start some implementation, I will help as much as I can.

Stanislav

P.S. 2bewing: could you post the list of what you have already done, plus probably even a non-compiling pre-alpha of your BeBochs?
Even in private if you don't want to publish yet.
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Re: Looking for volunteer to develop parallel Bochs architec

Post by jal »

bewing wrote:Also, a year ago when we built this thread, I decided to create a complete rewrite of Bochs with these features. I have been working on it since.
Cool you're still working on that, as I've been wondering lately what the status would be.
bewing wrote:before I release "BeBochs"
That name will get you killed :).


JAL
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Looking for volunteer to develop parallel Bochs architec

Post by bewing »

You'd have to catch me first, and it's all quok's fault anyway -- send all death threats to him.