Oliver points out that our patch differs slightly from the referenced one. If that patch only fixes alignment issues, the difference is not a big deal: I confirmed that the bug reported here doesn't trigger any alignment traps (echo 3 >/proc/cpu/alignment, then watching dmesg and cat /proc/cpu/alignment).

I researched this over the day and confirmed that USE_DOUBLE_MMAP is not the cause (a strace had already indicated that it was unlikely to be the issue). It might be related to recent libstdc++ changes in gcc-4.4, but that's not clear.

It looks like we need to either rebuild an old ooo with our latest toolchain or rebuild the latest ooo with older toolchains. I'm copying the log of today's conversations with caolan below.

11:56 < lool> I'm trying to debug an early startup issue of soffice.bin on Ubuntu armel
11:57 < lool> (gdb) run -norestore -writer
11:57 < lool> Starting program: /usr/lib/openoffice/program/soffice.bin -norestore -writer
11:57 < lool> terminate called after throwing an instance of 'com::sun::star::ucb::InteractiveAugmentedIOException'
11:57 < lool> During startup program exited with code 80.
11:57 < lool> I tried breaking on various functions but I dont seem to reach them
11:57 < lool> Oddly strace shows a SIGABRT which I dont see in gdb
11:59 < lool> I read through http://wiki.services.openoffice.org/wiki/Debugging and http://www.skynet.ie/~caolan/TechTexts/OpenOfficeHacking.html but it seems it's breaking earlier
12:07 <@caolan> lool: are you able to... gdb) break main, (gdb) run -norestore -writer, gdb) cont, gdb) catch throw
12:09 < lool> caolan: If I try to break main it just does the same output
12:10 <@caolan> lool: i.e. the terminate called... ?
12:10 < lool> Yes
12:10 < lool> caolan: Oh you're the author of the EABI patch?
12:10 <@caolan> does gdb offer any sort of a bt ?
12:10 < lool> No, I just drop back to gdb
12:11 <@caolan> wonder if its dying before main, or if gdb is just uselessly broken
12:11 < lool> Perhaps it's more fragile on armel and needs debug symbols
12:12 < lool> I tried breaking on things like __libc_start_main too and that didn't work either
12:12 < lool> So it might be gdb being broken
12:15 <@caolan> lool: btw, is this under a real arm, qemu-system-arm, or qemu-arm ?
12:17 < lool> caolan: Real arm
12:17 < lool> caolan: imx51 soc
12:17 < lool> v7 but v5/v6 userspace
12:18 < lool> We just changed our toolchain from v5 + soft vfp to v6 + softfp vfp
12:18 <@caolan> lool: version is 3.1.1 right ?
12:18 < lool> caolan: BTW( I'd be happy to offer you access if you like)
12:19 < lool> caolan: yes
12:19 <@caolan> lool: and did any earlier versions of OOo work previously ?
12:19 < lool> The jaunty one worked
12:19 < lool> that was 1:3.0.1-9ubuntu3
12:20 <@caolan> hmmm
12:20 < lool> we're not sure of whether a new v5 toolchain broke it or a new oo.o upstream release
12:20 < lool> Oliver tells me a new oo.o upstream release broke it
12:21 < lool> It could really be either toolchain or oo.o
12:21 <@caolan> does e.g. commenting out USE_DOUBLE_MMAP in bridges/inc/bridges/cpp_uno/shared/vtablefactory.hxx make any difference ?
12:22 < lool> It will take me some to build but I'll try that out, thanks
12:24 <@caolan> lool: a strace -f of startup might help indicate if the mmap thing is relevant, e.g. grepping the log for /.exec
12:25 < lool> http://people.canonical.com/~lool/soffice.strace
12:25 < lool> I do see mmaps
12:28 < lool> (I'm not experienced in building oo.o so I'm using the packaging and that takes a long while to build everything)
12:30 <@caolan> mmaps look fine anyway, doing what I'd expect them to do.
possibly still work commenting it out, but I wouldn't have an massive expectation that it'll make a difference
12:30 <@caolan> s/work/worth/
12:31 <@caolan> I sort of feel its the first uno exception getting thrown through the bridge-code, which would be something of a tricky area
13:16 < lool> Hmm a full build took 1 day and 15 hours
13:17 < lool> caolan: Would it be possible for me to build just a subset of oo.o to test various cases?
13:22 <@caolan> lool: sure, see http://wiki.services.openoffice.org/wiki/Documentation/Building_Guide/Building_on_Linux#setting_the_environment for some help. Might be complicated by an ooo-build build wrapperhttp://wiki.services.openoffice.org/wiki/Documentation/Building_Guide/Building_on_Linux#setting_the_environment . Might be complicated by an ooo-build wrapper, but in essence e.g., source LinuxArmEnvSet.sh, cd bridges, make changes, build, cp unxlngr.pro/
13:22 <@caolan> lib/libgcc3_uno.so /path/to/ure/lib/libgcc3_uno.so
13:22 <@caolan> ack
13:23 <@caolan> munged text, but you get the gist I guess
13:42 < lool> caolan: Thanks a lot
14:50 < lool> caolan: After commenting out the double mmap thing, an ubuntu build fails with:
14:50 < lool> ../../../inc/bridges/cpp_uno/shared/vtablefactory.hxx:188: error: candidate is: static unsigned char* bridges::cpp_uno::shared::VtableFactory::addLocalFunctions(bridges::cpp_uno::shared::VtableFactory::Slot**, unsigned char*, const typelib_InterfaceTypeDescription*, sal_Int32, sal_Int32, sal_Int32)
14:51 < lool> caolan: I have a full build log if you like
14:59 <@caolan> would have to hack it a bit I see, i.e.
remove sal_PtrDiff writetoexecdiff, from VtableFactory::addLocalFunctions and replace uses of writetoexecdiff with 0
15:01 < lool> caolan: odd, that's already protected in #ifdef USE_DOUBLE_MMAP
15:02 < lool> Oh nm you meant in cpp2uno.cxx
15:07 < lool> Ok that built
15:28 < lool> caolan: I have two libgcc3_uno.so in the build tree now (build still in progress) ./ooo-build/build/OOO310_m19/bridges/unxlngr.pro/lib/libgcc3_uno.so and ./ooo-build/build/OOO310_m19/solver/310/unxlngr.pro/lib/libgcc3_uno.so
15:28 < lool> caolan: I tried LD_LIBRARY_PATH=./ooo-build/build/OOO310_m19/solver/310/unxlngr.pro/lib/ /usr/lib/openoffice/program/soffice.bin -norestore -writer
15:28 < lool> and got the same error
15:28 < lool> but am not 100% sure it picks up the right flie
15:28 < lool> I guess I could strace
15:29 < lool> 1431 open("/usr/lib/ure/lib/libgcc3_uno.so", O_RDONLY) = 5
15:29 < lool> does not
15:30 <@caolan> lool: you could just take the brute force q-n-d approach and overwrite it
15:31 < lool> caolan: Eh I just did :)
15:31 < lool> caolan: It resulted in the same exception
15:31 < lool> lool@babbage25:~/ooo-startup/openoffice.org-3.1.1$ /usr/lib/openoffice/program/soffice.bin -norestore -writer
15:31 < lool> terminate called after throwing an instance of 'com::sun::star::ucb::InteractiveAugmentedIOException'
15:32 < lool> caolan: I copied ./ooo-build/build/OOO310_m19/solver/310/unxlngr.pro/lib/libgcc3_uno.so; was that the right one?
15:32 < lool> identical anyway
15:33 < lool> caolan: so by the look of it, if libgcc3_uno.so was the only affected binary for the double_mmap changes, it's another issue; you were suggesting it might be in the uno bridge code; how would I create a minimal test case?
15:34 < lool> Like throw + catch an exception over the uno/cpp bridges
15:34 < lool> (IIUC the other track you mentionned)
15:35 <@caolan> lool: yeah, not a massive surprise. I didn't really think it was the double mmap once seen the strace.
Minimal test cases for the bridge is a bit of a disaster at the moment. There was/is a test case in cppu/test but that's been broken for a while and I've never had a chance to fix it up.
15:37 < lool> Hmm 800 lines for ./ooo-build/build/OOO310_m19/cppu/test/test_cuno.c
15:38 <@caolan> would ideally be able to get gdb in there to catch throws, otherwise its hard to know where to start. If it was e.g. first uno exception then adding a fprintf(stderr to gcc3_linux_arm/except.cxx raiseException might help find that out
15:42 < lool> caolan: I'll start with the printfs (I see some are there already) and will resort to debugging gdb as preliminary to debugging oo.o afterwards; thanks
15:43 <@caolan> Those damn bridges are nearly at the limit of my abilities so they sort of fall under "Debugging is twice as hard as writing code in the first place. Therefore if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it"
15:46 < lool> caolan: So I was expecting to see: fprintf( stderr, "> uno exception occured: %s\n", cstr.getStr() );
15:46 < lool> caolan: but didnt see anything
15:47 < lool> caolan: what I did was #define OSL_DEBUG_LEVEL 2 in ./ooo-build/build/OOO310_m19/bridges/source/cpp_uno/gcc3_linux_arm/except.cxx + rebuild + system wide install
15:47 < lool> then soffice.bin -norestore -writer
15:47 < lool> Oh sorry I do see it
15:47 < lool> > uno exception occured: com.sun.star.ucb.InteractiveAugmentedIOException
15:47 < lool> terminate called after throwing an instance of 'com::sun::star::ucb::InteractiveAugmentedIOException'
15:47 < lool> caolan: ^ so it seems it is indeed the first exception
15:47 < lool> since I only see it once
15:48 <@caolan> lool: FWIW "build debug=true" would automatically give a OSL_DEBUG_LEVEL.
15:48 <@caolan> lool: indeed, that would seem to be the case.
Which rather sucks
15:48 < lool> _rene_: Is this something I can easily set ("build debug=true")
15:49 < lool> caolan: Would you be tempted to look into it with remote access? I could chase the gdb issue(s) in the mean time
15:49 <@caolan> getAdjustedPtr might be wrong, not sure
15:51 < lool> _rene_: from an unpacked tree I did find . -iname \*EnvSet.sh and didn't any env setting file; am I supposed to create one from scratch?
16:03 < lool> caolan: I wonder if there's a relation between gdb being broken here and the fact that the bridge seems to use gcc's unwinding functions
16:07 <@caolan> lool: yeah, it is suspicious isn't it. A thing to check is to compare the __cxa_exception in share.hxx in gcc3_linux_arm with the same definition in gcc's libstdc++'s internal header. Would probably need to get the matching src.deb of your gcc/libstdc++ in order to find it
16:09 < lool> caolan: I was actually looking at the one from gcc /usr/lib/gcc/arm-linux-gnueabi/4.3/include/unwind.h
16:09 < lool> found no interesting difference between the gcc 4.3 and 4.4 versions so it seems the API didnt change recently
16:12 < lool> ah /usr/include/c++/4.4/exception_ptr.h is a better hit
16:12 < lool> that's actually in libstdc++
16:16 <@caolan> lool: well, the canonical header is the libstdc++-v3/libsupc++/unwind-cxx.h header in your gcc source. Looking at my 4.4.1 one it seems unchanged. Not sure what else might be at play
16:28 <@caolan> hmm
16:29 < lool> ok got it now
16:31 <@caolan> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38732 would be the *sort* of thing that could be behind this. Not saying that it is, just that its like that
16:35 < lool> caolan: interesting
16:36 < lool> caolan: I can see we have this change indeed
16:39 < lool> caolan: Do you think #ifdef __ARM_EABI_UNWINDER__ versus #ifdef __ARM_EABI__ matters?
16:40 <@caolan> lool: no, shouldn't matter
16:40 < lool> It uses _Unwind_Ptr versus void* too
16:40 < lool> caolan: ./libstdc++-v3/testsuite/18_support/exception/38732.cc has an entirely similar definition as oo.o's
16:47 < lool> apparently the new test passes fine in our gcc-4.4 build
17:17 <@caolan> and what is the obsession with registering anyway
17:18 <@caolan> bah, I personally seriously doubt the utility of the feedback survey. Especially as the last time I checked it I was presented with a survey in German :-)
17:20 <@caolan> I seriously doubt the utility of those too :-), but what a window-esque horror it would be for every app to launch such a dialog on first-start
17:20 <@caolan> but anyway, that's beside the point I guess
17:24 < lool> caolan: Continuing with printf debugging, I can say that __cxa_throw is properly called
17:26 <@caolan> lool: depending on how long it would take to do, it might be worth finding out if say the last known OOo to work would still work if rebuilt with the current toolchain.
17:28 < lool> caolan: Ok
17:28 < lool> caolan: Would take days I guess
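
Since rebuilding OOo takes days, a quicker first check might be a standalone throw + catch test built with the same toolchain. The following is only a minimal sketch of that idea and does not go through the uno/cpp bridge at all. The types BaseError, Padding and AugmentedError are hypothetical stand-ins, not the real com::sun::star classes, and the function names are made up for illustration. It only tells us whether plain libstdc++ unwinding and catch-by-base pointer adjustment work outside OOo.

```cpp
#include <string>

// Hypothetical stand-ins for an exception hierarchy; these are NOT the
// real com::sun::star types and do not involve the UNO bridge.
struct BaseError {
    std::string message;
    explicit BaseError(const std::string& m) : message(m) {}
    virtual ~BaseError() {}
};

struct Padding {
    int filler;
    Padding() : filler(0) {}
    virtual ~Padding() {}
};

// BaseError is the second base class, so catching by BaseError& forces
// the C++ runtime to adjust the pointer to the thrown object.
struct AugmentedError : Padding, BaseError {
    int code;
    AugmentedError(const std::string& m, int c) : BaseError(m), code(c) {}
};

// Throws from a separate function so the unwinder has to cross a frame.
void raise_augmented() {
    throw AugmentedError("io failure", 80);
}

// Returns true if the derived exception is caught through its base
// class with the payload intact.
bool catches_derived_as_base() {
    try {
        raise_augmented();
    } catch (const BaseError& e) {
        return e.message == "io failure";
    } catch (...) {
        return false;
    }
    return false;
}

// Returns true if a rethrown exception keeps its dynamic type.
bool rethrow_preserves_type() {
    try {
        try {
            raise_augmented();
        } catch (const BaseError&) {
            throw;  // rethrows the original object, not a sliced copy
        }
    } catch (const AugmentedError& e) {
        return e.code == 80;
    } catch (...) {
        return false;
    }
    return false;
}
```

To use it, call both functions from a small main() and build with the same g++ and flags as the failing OOo build. If both return true, ordinary exception handling works with this toolchain and the bridge code stays the main suspect. If either one fails or the program calls terminate, the toolchain itself becomes the more likely culprit.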