HP-UX crash analysis

Overview

32-bit HP-UX (PA-RISC) aCC compiler bug

With the +hpxstd98 compiler option, the compiler generates incorrect assembly code for perfectly correct source code.

Compiler option description

  +hpxstd98      Turns on  a new, standard compliant compilation mode.
                 This option provides a rich set of diagnostics, better
                 support for templates (including support for template-
                 template parameters) and is independent of other
                 compilation options.  Objects created using this option
                 are completely binary compatible with those created
                 without using this option and also with the objects
                 created using earlier versions of the compiler.

This option is necessary for building the Boost C++ libraries (see HP aC++ FAQ).

Compiler version

If we check the compiler version:

$ aCC --version         
aCC: HP ANSI C++ B3910B A.03.80

... we can see that the compiler version is A.03.80, while two more recent versions are currently available: A.03.85 and A.03.90 (see HP aC++ release history for PA-RISC servers).

Steps to reproduce the problem

Prepare the program

The correct version

The incorrect version

Analysis

Debug the program

Analyse the disassembly

... was generated correctly and identically in both cases.

Conclusion

It seems that the two registers above contain:

There are two dereferences of the r1 register with an offset of 0 bytes; these probably correspond to the code:

this->m_state.m_noop

... then there is one more dereference with an offset of 16 bytes, which seemingly corresponds to retrieving the Noop vtable, and one final dereference corresponding to the vtable lookup.
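The chain of dereferences described above can be illustrated with a small sketch. This is not the original test program: only the member names m_state and m_noop and the class name Noop come from the text, while Owner, State, run() and the return value are hypothetical. The actual object layout and offsets (such as the 16 bytes mentioned above) depend on the compiler and are not modelled here.

```cpp
#include <cassert>

// Hypothetical class with one virtual method; calling it requires
// loading the vtable pointer and then the slot for run().
struct Noop {
    virtual ~Noop() {}
    virtual int run() { return 42; }   // hypothetical method and value
};

// Hypothetical holder for the pointer named m_noop in the text.
struct State {
    Noop* m_noop;
};

// Hypothetical owner whose member function performs the virtual call.
struct Owner {
    State m_state;

    explicit Owner(Noop* n) { m_state.m_noop = n; }

    // The expression from the text followed by a virtual call.
    int call() { return this->m_state.m_noop->run(); }

    // The same call split into the steps the disassembly should show.
    int call_stepwise() {
        Noop* noop = this->m_state.m_noop;  // load the object pointer
        return noop->run();                 // load vtable, look up slot, call
    }
};

// Both paths must reach the same virtual function.
inline void check_paths_agree(Owner& o) {
    assert(o.call() == o.call_stepwise());
}
```

If any of the loads in this chain is skipped, the final call goes through whatever value happens to be in the register, which is consistent with the crashes described in this section.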

The virtual call itself is implemented as a so-called long call: a BLE instruction followed by a COPY instruction (see section 2.5.5 of the 32-bit PA-RISC Run-time Architecture Document).

For some reason, when the +hpxstd98 compiler option is used, the generated code differs considerably: some of the dereferences are missing completely.

This causes a bus error in some cases and a segmentation fault in others. Which one occurs seems to depend on the address of the Noop object (in some cases the LDW instruction complains about the alignment of the address, which raises the SIGBUS signal before the segmentation fault can occur).
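The dependence on the address can be made concrete with a small check. This is a minimal sketch, assuming that LDW requires its address to be a multiple of 4 bytes (a 32-bit word); the helper name is_word_aligned is hypothetical and is not part of the original analysis.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true if the address is a multiple of 4,
// i.e. a word load from it would not fail on alignment grounds
// (assumption: the load needs 4-byte alignment).
inline bool is_word_aligned(const void* p) {
    return reinterpret_cast<std::size_t>(p) % 4 == 0;
}

// Self-check using an int array, which is 4-byte aligned on common platforms.
inline void check_alignment_helper() {
    int words[2] = {0, 0};
    const char* base = reinterpret_cast<const char*>(words);
    assert(is_word_aligned(base));       // start of an int
    assert(!is_word_aligned(base + 1));  // one byte into the int
}
```

Under this assumption, a bogus pointer that happens to be misaligned would fail at the load itself with SIGBUS, while an aligned but invalid pointer would get further and end in a segmentation fault.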
