memory layout C++ objects

I would like to understand how C++ lays out an object in memory. I have heard that dynamic_cast simply adjusts the object's pointer by an offset, and that reinterpret_cast lets us do more or less anything with the pointer. I don't really understand this. Details would be appreciated!

Answers


Each class lays out its data members in the order of declaration. The compiler is allowed to place padding between members to make access efficient (but it is not allowed to re-order).

How dynamic_cast<> is implemented is a compiler detail: the standard defines what the cast must do, not the mechanism. The mechanism depends on the ABI used by the compiler.

reinterpret_cast<> works by just changing the type through which the object is viewed; the pointer value itself is not adjusted. The only thing you can rely on is that casting a pointer to void* and back to the same pointer-to-class type gives you the original pointer.


Memory layout is mostly left to the implementation. The key exception is that member variables for a given access specifier will be in order of their declaration.

§ 9.2.14

Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified (11). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).

Besides member variables, a class or struct needs to provide space for subobjects of base classes, virtual function management (e.g. a pointer to a virtual table), and the padding and alignment of all this data. This is up to the implementation, but the Itanium ABI specification is a popular choice; gcc and clang adhere to it (at least to a degree).

http://mentorembedded.github.io/cxx-abi/abi.html#layout

The Itanium ABI is of course not part of the C++ standard and is not binding. To get more detailed you need to turn to your implementor's documentation and tools. clang provides a tool to view the memory layout of classes. As an example, the following:

class VBase {
    virtual void corge();
    int j;
};

class SBase1 {
    virtual void grault();
    int k;
};

class SBase2 {
    virtual void grault();
    int k;
};

class SBase3 {
    void grault();
    int k;
};

class Class : public SBase1, SBase2, SBase3, virtual VBase {
public:
    void bar();
    virtual void baz();
    // virtual member function templates not allowed, thinking about memory
    // layout and vtables will tell you why
    // template<typename T>
    // virtual void quux();
private:
    int i;
    char c;
public:
    float f;
private:
    double d;
public:
    short s;
};

class Derived : public Class {
    virtual void qux();
};

int main() {
    return sizeof(Derived);
}

The source file has to actually use the layout of the class (here through sizeof in main) for clang to dump it:

$ clang -cc1 -fdump-record-layouts layout.cpp

The layout for Class:

*** Dumping AST Record Layout
   0 | class Class
   0 |   class SBase1 (primary base)
   0 |     (SBase1 vtable pointer)
   8 |     int k
  16 |   class SBase2 (base)
  16 |     (SBase2 vtable pointer)
  24 |     int k
  28 |   class SBase3 (base)
  28 |     int k
  32 |   int i
  36 |   char c
  40 |   float f
  48 |   double d
  56 |   short s
  64 |   class VBase (virtual base)
  64 |     (VBase vtable pointer)
  72 |     int j
     | [sizeof=80, dsize=76, align=8
     |  nvsize=58, nvalign=8]

More on this clang feature can be found on Eli Bendersky's blog:

http://eli.thegreenplace.net/2012/12/17/dumping-a-c-objects-memory-layout-with-clang/

gcc provides a similar option, -fdump-class-hierarchy. For the class given above, it prints (among other things):

Class Class
   size=80 align=8
   base size=58 base align=8
Class (0x0x141f81280) 0
    vptridx=0u vptr=((& Class::_ZTV5Class) + 24u)
  SBase1 (0x0x141f78840) 0
      primary-for Class (0x0x141f81280)
  SBase2 (0x0x141f788a0) 16
      vptr=((& Class::_ZTV5Class) + 56u)
  SBase3 (0x0x141f78900) 28
  VBase (0x0x141f78960) 64 virtual
      vptridx=8u vbaseoffset=-24 vptr=((& Class::_ZTV5Class) + 88u)

It doesn't itemize the member variables (or at least I don't know how to get it to) but you can tell they would have to be between offset 28 and 64, just as in the clang layout.

You can see that one base class is singled out as primary. This removes the need for adjustment of the this pointer when Class is accessed as an SBase1.

The gcc command is the following (in GCC 8 and later the option was renamed -fdump-lang-class):

$ g++ -fdump-class-hierarchy -c layout.cpp

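The equivalent for Visual C++ is the following, where the name appended to /d1reportSingleClassLayout (here Test_A) is the class to report: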

cl main.cpp /c /d1reportSingleClassLayoutTest_A

See: https://blogs.msdn.microsoft.com/vcblog/2007/05/17/diagnosing-hidden-odr-violations-in-visual-c-and-fixing-lnk2022/


The answer is, "it's complicated". Dynamic cast does not simply adjust pointers with an offset; it may actually retrieve internal pointers inside the object in order to do its work. GCC follows an ABI designed for Itanium but implemented more broadly. The gory details are in the Itanium C++ ABI specification.


As stated previously, the full details are complicated, painful to read, really only useful to compiler developers, and vary between compilers. Basically, each object contains the following (usually laid out in this order):

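  1. Runtime type information (for polymorphic classes; in practice usually a pointer to the vtable, through which the type information is reached)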
  2. Non-Virtual base objects and their data (probably in order of declaration).
  3. Member variables
  4. Virtual base objects and their data (Probably in some DFS tree search order).

These pieces of data may or may not be padded to make memory alignment easier etc. Hidden in the runtime type information is stuff about the type, v-tables for virtual parent classes etc, all of which is compiler specific.

When it comes to casts, reinterpret_cast simply changes the C++ data type of the pointer and does nothing else, so you had better be sure you know what you're doing when you use it, otherwise you're liable to mess things up badly. dynamic_cast does very much the same thing as static_cast (in altering the pointer), except that it uses the runtime type information to figure out whether it can cast to the given type, and how to do so. Again, all of that is compiler specific. Note that you can't dynamic_cast from a void*, because the cast needs to know where to find the runtime type information so it can do all its wonderful runtime checks.


This question is already answered at http://dieharddeveloper.blogspot.in/2013/07/c-memory-layout-and-process-image.html. Here is an excerpt from there: In the middle of the process's address space, a region is reserved for shared objects. When a new process is created, the process manager first maps the two segments from the executable into memory. It then decodes the program's ELF header. If the program header indicates that the executable was linked against a shared library, the process manager (PM) will extract the name of the dynamic interpreter from the program header. The dynamic interpreter points to a shared library that contains the runtime linker code.

