Comment 19 for bug 488354

Revision history for this message
In , Mh+mozilla (mh+mozilla) wrote :

(In reply to comment #13)
> That's a neat solution. The only worry I have is the fiddling of the stack
> pointer within C code; that is likely to break even in the same compiler with
> unrelated code changes causing it to prod the stack immediately before calling
> "func".

I think it should be safe, actually, because the compiler knows the stack pointer is being fiddled with by the assembly, and avoids using it. Without forcing the stack pointer with the assembly, the compiled code modifies sp to read the first 3 arguments. (iirc, once with add and once with ldmia)
While this would actually do the same thing, it wouldn't have the same guarantees.

> Actually, I'd be interested in seeing the assembly for that, as I'd like to see
> what it's doing differently.

Will attach.

> There are two issues here. I think you know these anyway, but for the sake of
> documentation & clarity in the bug:
> * Correctness on a system with mixed ARM & Thumb code: "blx" _must_ be used
> here.
> * Performance:
> - ARMv4T interworking branches aren't return-stack-predicted on newer cores,
> and this costs a huge amount of performance. (I've seen figures ranging from 5%
> to 50%, but I don't know what would happen in this particular case.)
> - ARMv4 long-range branches involve doing "mov pc, <dst_reg>". That's what
> the old code used to do. ("bl" can't branch to an address in a register and is
> restricted to PC±32MB.)

The undeniable advantage of the C code is that the compiler will just use the right construct depending on the target architecture. No need to #ifdef __thumb__ or whatever.

> I was planning to work on this (and 530087) further over the next few days
> because it's been in my "to do" list for ages, but you're welcome to take it if
> you want to.

Since I've gone this far already, I think I'll just take it ;)