Owen wrote: The ARM7TDMI is a simple in-order, single-issue chip with no cache.
What does the compiler know that can make it better than me at that? In fact, it's often worse, because it invokes its multiplication helper routines when you could do it in a couple of instructions instead...
I hadn't looked that deeply into the GBA specifically, so I wasn't aware of those characteristics, and they do make my main argument moot. Thank you for informing me.
Owen wrote: Of course, you really need to avoid running ARM mode code from main memory. Massive performance killer.
(Now, if this were the NDS, that does indeed have caches. But also a bat **** insane memory architecture...)
This is what I was getting at with my "pay attention to the bus specs" remark. From some passing Nerd ADD research, I knew that the main memory bus in the GBA is 16-bit, and therefore well-suited to Thumb code, and that there is also a 32-bit RAM area where you can load some 32-bit ARM code.
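To make the bus point concrete, here is a rough sketch of how a hot routine might be pushed into the 32-bit RAM area with GCC. The `.iwram` section name and the `FAST_CODE` macro are assumptions on my part (devkitARM-style linker scripts use that section name, so check your own linker script and startup code), and `sum_words` is just a made-up example routine. On anything that isn't an ARM target the macro expands to nothing, so the same file still builds on a PC for testing.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the linker script defines an ".iwram" section mapped to the
 * 32-bit internal RAM, and the startup code copies it there before main().
 * long_call lets code running from ROM reach the far-away RAM address.
 * This file would be compiled with -marm so the routine is ARM, not Thumb. */
#if defined(__arm__)
#define FAST_CODE __attribute__((section(".iwram"), long_call))
#else
#define FAST_CODE /* host build: no special placement */
#endif

/* Hypothetical hot loop: sums n 32-bit words (wraps on overflow). */
FAST_CODE uint32_t sum_words(const uint32_t *src, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += src[i];
    }
    return total;
}
```

Everything else can stay as Thumb code in ROM or main memory and simply call `sum_words` like any other function.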
I do recognize the benefits of assembly in a limited environment, and I recognize it as always being a valid choice, for personal reasons, in any personal project. What I was trying to get at is that, if you're careful, a language as low-level as C can easily increase productivity and still fit the needs of the environment rather well, especially if you tell the compiler that you don't want any built-in code included and augment with assembly where needed.
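As a minimal sketch of what I mean by "C plus assembly where needed": the function below is a 16.16 fixed-point multiply that uses the ARM `smull` long-multiply instruction through GCC inline assembly when built as ARM code, and plain C otherwise. The name `fx_mul` and the 16.16 format are my own choices for illustration. I'm assuming a GCC-style toolchain, built with something like `-ffreestanding -nostdlib`; in a Thumb build the C path may still end up calling a compiler helper for the 64-bit multiply, which is the sort of thing Owen was complaining about.

```c
#include <stdint.h>

/* Hypothetical 16.16 fixed-point multiply: returns bits 16..47 of a * b. */
static int32_t fx_mul(int32_t a, int32_t b)
{
#if defined(__arm__) && !defined(__thumb__)
    /* ARM state: one smull gives the full 64-bit product in two registers.
     * The early-clobber outputs keep them distinct from the inputs. */
    uint32_t lo;
    int32_t hi;
    __asm__("smull %0, %1, %2, %3"
            : "=&r"(lo), "=&r"(hi)
            : "r"(a), "r"(b));
    return (int32_t)((lo >> 16) | ((uint32_t)hi << 16));
#else
    /* Portable path (host builds, Thumb builds): let the compiler do it. */
    int64_t p = (int64_t)a * b;
    return (int32_t)(uint32_t)((uint64_t)p >> 16);
#endif
}
```

For example, `fx_mul(2 << 16, 3 << 16)` multiplies 2.0 by 3.0 and gives `6 << 16`. The rest of the program never needs to know which path was taken, which is the productivity win I had in mind.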
Thank you for your rebuttal; it was both engaging and informative.