Hi Brendan,
Your proposal is quite interesting ... but there are some practical drawbacks that I will have to think about.
Things like: spinlocks aren't so good if you happen to be on a single-processor system with 100ms timeslices (and you can't tell whether you are or not with normal programming techniques).
I agree that it's not possible to fully sync -- and anyway real hardware doesn't fully sync either, so there is no good reason to do it. You can sleep(0), of course.
Bruce
Looking for volunteer to develop parallel Bochs architecture
Re: Looking for volunteer to develop parallel Bochs architecture
Hi,
bewing wrote: Your proposal is quite interesting ... but there are some practical drawbacks that I will have to think about. Things like: spinlocks aren't so good, if you happen to be on a singleprocessor system with 100ms timeslices (and you can't tell whether you are or not with normal programming techniques).
If you can't get the CPU affinity for the process, then you might have to assume it's single-CPU (but that's fine because you're probably running on DOS anyway).
bewing wrote: I agree that it's not possible to fully sync -- and anyway real hardware doesn't fully sync either, so there is no good reason to do it. You can sleep(0), of course.
Hehe. If you use "sleep(0)" then there won't be any other threads (in the same process) to run and the OS's scheduler will probably end up doing a full process switch (flushing your TLB entries and cache). In this case it's probably better to spin to keep caches warm.
For my idea you should be able to provide a "maximum number of instructions executed by a thread before switching to another CPU" setting (very similar to what Bochs already does) and remain synchronized to within N emulated instructions.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
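As a rough illustration of the "remain synchronized to within N emulated instructions" idea from the post above, here is a minimal single-threaded sketch in Python. It is not Bochs code: `EmulatedCpu`, `run_quantum`, `run_round_robin` and `max_skew` are hypothetical names, and a truly parallel version would run one host thread per emulated CPU and would need atomic counters instead of plain integers.

```python
from dataclasses import dataclass


@dataclass
class EmulatedCpu:
    """Hypothetical stand-in for one emulated CPU; it only counts instructions."""
    executed: int = 0

    def step(self) -> None:
        # A real emulator would fetch, decode and execute one instruction here.
        self.executed += 1


def run_quantum(cpus, index, max_skew):
    """Run cpus[index] until it is max_skew instructions ahead of the slowest CPU.

    Returns the number of instructions executed in this quantum.
    """
    slowest = min(c.executed for c in cpus)
    limit = slowest + max_skew
    cpu = cpus[index]
    ran = 0
    while cpu.executed < limit:
        cpu.step()
        ran += 1
    return ran


def run_round_robin(cpus, max_skew, rounds):
    """Give every CPU one quantum per round and return the worst skew observed."""
    worst = 0
    for _ in range(rounds):
        for i in range(len(cpus)):
            run_quantum(cpus, i, max_skew)
            counts = [c.executed for c in cpus]
            worst = max(worst, max(counts) - min(counts))
    return worst
```

Because a CPU stops as soon as it is `max_skew` instructions ahead of the slowest one, the instruction counters never drift further apart than that bound, whichever CPU is scheduled next.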
Re: Looking for volunteer to develop parallel Bochs architecture
Quote: (And if my code doesn't run 5 times faster than yours, I'll eat my hat. High-efficiency code is my specialty, and I am even published once on the topic.)
Keep your hat for now, high efficiency code is my speciality as well.
And do not forget about Amdahl's law!
Quote: The Siminterface has been deleted. Most of textconfig has been deleted. Access to variables through the param tree is done for user display only, and the param tree has been completely redefined and simplified.
Looks fine but will not change things that much.
The textconfig and siminterface run less than 0.5% of the time. The simplest profiling shows that 99.5% of the time you are running CPU emulation code.
I agree with you, siminterface is ugly and would be good to rewrite. But even throwing it away completely won't change much if you want to reach the "5 times faster" goal.
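The 0.5% / 99.5% split quoted above can be put through Amdahl's law directly. The sketch below is only arithmetic on those two numbers; the function names are hypothetical and the profile figures are taken from the post, not measured here.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def required_factor(fraction, target):
    """How much faster `fraction` of the run time must get to reach `target` overall.

    Returns None when the target cannot be reached by speeding up this part alone.
    """
    remaining = 1.0 / target - (1.0 - fraction)
    if remaining <= 0:
        return None
    return fraction / remaining


# Upper bound from deleting the 0.5% spent outside CPU emulation entirely.
config_only_limit = 1.0 / (1.0 - 0.005)

# Speedup needed in the 99.5% spent in CPU emulation to be 5 times faster overall.
cpu_factor_for_5x = required_factor(0.995, 5.0)
```

Under these assumptions, removing the configuration code altogether gains about half a percent, while the "5 times faster" goal needs the CPU emulation code itself to become a little more than 5 times faster.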
Quote: All the CPU memory structures have been completely redefined for efficiency (as arrays).
What is the CPU memory structure? (I just don't understand what you mean.)
Could you send me an example of your redefined CPU code?
Quote: The selected debugger now is completely in charge of driving the simulation. The interface between the init code and the debuggers has been standardized. All the multithreaded mods above have been implemented.
Cool! This is what I wanted to get in the Bochs 2.3.8 time frame but did not succeed because ... nobody responded when I asked about this interface and for help defining it.
Quote: The instruction decoder has been made more efficient than the instruction cache, and therefore the instruction cache has been deleted.
Now it is my turn: I will eat my hat if you can show an x86 instruction decoder which is faster than just a blind copy of the decoded instruction from the cache.
The instruction cache/trace cache in Bochs could be improved 2-3 times at least, but there is no way to write an x86 decoder which is faster than caching.
I remember you described how you plan to do the decoder. I can guarantee you a pretty nice slowdown of 2-5x against the cache.
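For readers unfamiliar with the term, the sketch below shows what caching decoded instructions means in its simplest form: decode once per address, then hand back the stored result. It is a toy in Python with a made-up 2-byte encoding, not the Bochs instruction cache, and it measures nothing, so it does not settle which side of this argument is right. `DecodeCache` and `toy_decode` are hypothetical names.

```python
class DecodeCache:
    """Maps a code address to its already decoded instruction."""

    def __init__(self, decode):
        self._decode = decode      # function (memory, address) -> decoded form
        self._entries = {}
        self.misses = 0            # how many times the real decoder had to run

    def fetch(self, memory, address):
        entry = self._entries.get(address)
        if entry is None:
            # Cache miss: pay the decoding cost once and remember the result.
            entry = self._decode(memory, address)
            self._entries[address] = entry
            self.misses += 1
        return entry

    def invalidate(self, address):
        """Drop an entry, for example after the guest writes to code at `address`."""
        self._entries.pop(address, None)


def toy_decode(memory, address):
    """Toy decoder for a made-up encoding: one opcode byte, one operand byte."""
    return (memory[address], memory[address + 1])
```

A loop that executes the same two instructions a thousand times runs `toy_decode` only twice; every other fetch is a dictionary lookup. Whether a hand-tuned decoder can beat that lookup on real x86 code is exactly what the two posters disagree about.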
Quote: BX_EVENTS have been deleted.
What do you mean? Did they exist?
Quote: Breakpoints have been simplified and are managed for efficiency.
I once asked you for this new breakpoint code for current Bochs; could you send it? It is a pretty good idea and I want to integrate it into today's Bochs.
(Who knows when and if the rewrite will be finished.)
Quote: All the code is written in pure C for portability and simplicity.
Portability and simplicity are not even close to being arguments here.
Current Bochs code is written in "very basic" C++, i.e. we stay away from templates and STL, and even try hard to stay away from virtual functions unless required.
Personally, I am strongly against virtual functions or any complex class hierarchy in Bochs.
Such a subset of C++ can't be more portable than C.
About simplicity we can argue a lot. I still do not believe that pure C code is simpler to understand.
Look at your GUI debugger code: once you do something complicated, you end up with dozens of global variables (which you never understand) or catch yourself emulating C++ in C with structs which you must pass to every function by pointer (which does not add to simplicity either).
Quote: All devices are always compiled into the code, and "enabled" at runtime. Plugins have been eliminated.
You missed the point here. The "plugins" were invented to allow a user to write a device module separately from Bochs and attach it to the already compiled Bochs model. You can argue this is not working yet, even now, but this is certainly the intention. Plugins are the only way I see to enable such a capability.
Quote: Most conditional code has been eliminated. Almost all compile-time options have been eliminated, and turned into runtime options (bochsrc or commandline). One GUI and one textmode debugger are selected at compile time (only).
You want to say that checking for every instruction whether it is supported on the current CPU (which is configured at runtime) is faster than just conditionally compiling all such features? Add yourself another 1.5x slowdown vs current Bochs.
Quote: Next, I am going to redefine and simplify the interface between the sim and the devices.
Bruce, of course I am being a bit cynical above, and I am sure some of your ideas are very good.
I will be very glad to replace the param tree in current Bochs right away or take some of your ideas into the code.
You should understand, once every year or two a person comes along with ideas for a complete rewrite of the whole Bochs code to make it 10 times faster and 10 times simpler. You know, it has never succeeded yet. On paper everything looks perfect, but it becomes different when converted to algorithms and code.
But every such person always has brilliant ideas which give a breakthrough to Bochs. And I am already scheduling the next breakthrough, even if I can't tell whether your rewrite will succeed or even happen.
Stanislav
Re: Looking for volunteer to develop parallel Bochs architecture
Quote: high efficiency code is my speciality as well
Then we will just have to see who is better at it, won't we?
Quote: The "plugins" were invented to allow user to write a device module separately from Bochs and attach it to the already compiled Bochs model.
Which turns out not to be useful. Which is why it has been deleted.
Quote: Looks fine but will not change things that much.
Not for speed, but it simplifies the code tremendously.
Quote: You want to say that checking for every instruction if it supported on current CPU (which is configured on runtime) is faster than just conditionally compile all such features ?
No, you simply choose the proper decoder function in advance to match the CPU. Then you don't need to check anything at all.
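The "choose the proper decoder function in advance" approach can be sketched as follows: the opcode table is built once from the configured CPU features, so the per-instruction path contains no feature test at all. This is an illustration in Python, not code from either project. `build_decoder`, `BASE_OPS` and `FEATURE_OPS` are hypothetical names, and the two extension opcodes are made-up values rather than real x86 encodings.

```python
# 0x90 (NOP) and 0xF4 (HLT) are real one-byte x86 opcodes; the rest is invented.
BASE_OPS = {0x90: "nop", 0xF4: "hlt"}

# Hypothetical extension opcodes keyed by feature name (not real encodings).
FEATURE_OPS = {
    "ext_a": {0xA0: "ext_a_op"},
    "ext_b": {0xA1: "ext_b_op"},
}


def build_decoder(features):
    """Build the opcode table once, at configuration time, and return a decoder."""
    table = dict(BASE_OPS)
    for name in features:
        table.update(FEATURE_OPS[name])

    def decode(opcode):
        # No per-instruction feature check: opcodes of disabled features are
        # simply absent from the table and fall through to "undefined".
        return table.get(opcode, "undefined")

    return decode
```

A CPU configured with only `ext_a` gets a decoder that knows `0xA0` and reports `0xA1` as undefined, without ever consulting the feature list while decoding. The cost that remains is the indirection through the selected table, which is the part the two posters would still need to measure.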
Quote: You should understand, once every year or two a person comes along with ideas for a complete rewrite of the whole Bochs code to make it 10 times faster and 10 times simpler. You know, it has never succeeded yet.
I'm not surprised, but that's because they don't have the guts to use their delete key enough on the current code. My delete key is currently smoking, and partially melted.
Quote: Once you do something complicated, you end up with dozens of global variables ...
Yup. Our ideas of aesthetic code are different. So what?
Brendan wrote: If you can't get the CPU affinity for the process, then you might have to assume it's single-CPU (but that's fine because you're probably running on DOS anyway).
I suspect that you are assuming (even more than I am) that the simulation host system is x86-based.
Quote: In this case it's probably better to spin to keep caches warm.
I'd rather let the other processes on the host system have the extra time. Bochs itself already runs at "below normal priority" (I think this is a mistake to do) to allow all the rest of the machine more time.
Re: Looking for volunteer to develop parallel Bochs architec
Is this work still in progress? I can help out.
Re: Looking for volunteer to develop parallel Bochs architec
Yes, I assume that there are two concurrent projects doing this. I think Stanislav is still considering adding these features to Bochs. I can tell you from experience that adding multithreading to Bochs is not easy in the least. So you can contribute code to Bochs if you like.
Also, a year ago when we built this thread, I decided to create a complete rewrite of Bochs with these features. I have been working on it since. The parts that exist are running beautifully, the coding is readable and clear and fast. There are only 8 small items left on the todo list, before I release "BeBochs" for alpha testing and a request for code contributions. I am expecting to do exactly that within 2 weeks. If you want to contribute code now or when I release for alpha testing, I would be delighted with the help.
Re: Looking for volunteer to develop parallel Bochs architec
bewing wrote: Yes, I assume that there are two concurrent projects doing this. I think Stanislav is still considering adding these features to Bochs. I can tell you from experience that adding multithreading to Bochs is not easy in the least. So you can contribute code to Bochs if you like.
Also, a year ago when we built this thread, I decided to create a complete rewrite of Bochs with these features. I have been working on it since. The parts that exist are running beautifully, the coding is readable and clear and fast. There are only 8 small items left on the todo list, before I release "BeBochs" for alpha testing and a request for code contributions. I am expecting to do exactly that within 2 weeks. If you want to contribute code now or when I release for alpha testing, I would be delighted with the help.
I have three vectors in mind while developing Bochs (sorted by importance to me):
1. Accuracy and rich content (new instructions, modes or functionality)
2. Speed
3. Simplicity
Because accuracy is most important, lately I have been dealing mostly with accuracy issues (more than 50 accuracy bug fixes in CVS since the last release, a large part in VMX emulation). So I have had no time to start on the multithreading story yet. Actually I think this will require a significant change in how timing is emulated in Bochs, and I still have no idea what the BKM is here.
But I am also open to changes/patches or even module replacements that could be done for simplicity.
I still hate how param_tree is implemented in Bochs and wonder if I could find a better replacement; I also still wonder if the device model structure could be simpler. Not sure about the CPU changes Bryce suggests, but only because I never saw them. It sounds to me like the code is going to be absolutely unmaintainable (like the GUI debugger code today), but I might be wrong.
If you have some ideas from the thread above and you would like to discuss them and probably even want to start some implementation, I will help as much as I can.
Stanislav
P.S. 2bewing: could you post the list of what you have already done, and probably even a non-compiling pre-alpha of your BeBochs?
Even in private if you don't want to publish yet.
Re: Looking for volunteer to develop parallel Bochs architec
bewing wrote: Also, a year ago when we built this thread, I decided to create a complete rewrite of Bochs with these features. I have been working on it since.
Cool you're still working on that, as I've been wondering lately what the status would be.
Quote: before I release "BeBochs"
That name will get you killed :).
JAL
Re: Looking for volunteer to develop parallel Bochs architec
You'd have to catch me first, and it's all quok's fault anyway -- send all death threats to him.