Virtual functions under the hood

Although polymorphism is not limited to virtual functions, we will discuss them in more detail because dynamic polymorphism is the most popular form of polymorphism in C++. And again, the best way to understand a concept or technology is to implement it on your own. Whenever we declare a virtual member function in a class, or the class has a base class with virtual functions, the compiler augments the class with an additional pointer. The pointer points to a table that's usually referred to as a virtual function table, or simply a virtual table. We refer to the pointer itself as the virtual table pointer.
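One way to observe the additional pointer is to compare the size of a class with and without virtual functions. The following is a minimal sketch; PlainAccount, VirtualAccount, and vptr_overhead() are hypothetical names introduced only for this illustration, and the exact sizes are implementation-defined:

```cpp
#include <cstddef>

// Hypothetical type with no virtual functions: only the data member.
struct PlainAccount {
  double balance_;
};

// The same data plus virtual functions; mainstream compilers add a
// hidden virtual table pointer to objects of this type.
struct VirtualAccount {
  virtual void cash_out() {}
  virtual ~VirtualAccount() {}
  double balance_;
};

// Extra bytes occupied by the polymorphic version. The value is
// implementation-defined; on common 64-bit platforms it is usually
// the size of one pointer.
constexpr std::size_t vptr_overhead() {
  return sizeof(VirtualAccount) - sizeof(PlainAccount);
}
```

Printing vptr_overhead() on your own toolchain shows how much the hidden pointer (plus any padding) costs per object.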

Let's suppose we are implementing a class subsystem for managing bank customer accounts. Let's say that the bank asks us to implement cashing out based on the account type. For example, a savings account allows cashing out money once a year, while a checking account allows cashing out money whenever the customer wants. Without diving into any unnecessary details about the Account class, let's declare the bare minimum that will help us understand virtual member functions. Let's look at the Account class definition:

class Account
{
public:
  virtual void cash_out() {
    // the default implementation for cashing out 
  }

  virtual ~Account() {}
private:
  double balance_;
};

The compiler transforms the Account class into a structure that has a pointer to the virtual function table. The following pseudocode explains what happens when we declare virtual functions in the class. As always, note that we provide a general explanation rather than a compiler-specific implementation (the name mangling is also in a generic form; for example, we rename cash_out to Account_cash_out):

struct Account
{
  VTable* __vptr;
  double balance_;
};

void Account_constructor(Account* this) {
  this->__vptr = &Account_VTable;
}

void Account_cash_out(Account* this) {
  // the default implementation for cashing out
}

void Account_destructor(Account* this) {}

Take a good look at the preceding pseudocode. The Account struct has __vptr as its first member. Since the previously declared Account class has two virtual functions, we can imagine the virtual table as an array with two pointers to virtual member functions. See the following representation:

VTable Account_VTable[] = {
  &Account_cash_out,
  &Account_destructor
};

With our previous presumptions at hand, let's find out what code the compiler will generate when we call a virtual function on an object:

// consider the get_account() function as already implemented and returning an Account*
Account* ptr = get_account();
ptr->cash_out();

Here's what we can imagine the compiler's generated code to be like for the preceding code. Note that the object itself is passed explicitly as the this argument:

Account* ptr = get_account();
ptr->__vptr[0](ptr);
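The lookup described above can be written out by hand in ordinary C++. The following is a minimal sketch of the idea, not what any particular compiler emits: the Slot alias, the vptr member name, and call_cash_out() are assumptions made for this illustration, the destructor slot is omitted, and cash_out returns a string (instead of void) only so that the result can be observed:

```cpp
#include <cassert>
#include <string>

struct Account;  // forward declaration for the slot type

// One table slot: a pointer to a function that receives the object
// explicitly, playing the role of `this`.
using Slot = std::string (*)(Account*);

struct Account {
  const Slot* vptr;  // stands in for the compiler-generated __vptr
  double balance;
};

std::string Account_cash_out(Account*) { return "Account::cash_out"; }

// The hand-written table: slot 0 holds cash_out.
const Slot Account_VTable[] = { &Account_cash_out };

void Account_constructor(Account* self) {
  self->vptr = Account_VTable;  // point the object at its class's table
  self->balance = 0.0;
}

// What `ptr->cash_out()` turns into: load the table, pick slot 0,
// and pass the object as the first argument.
std::string call_cash_out(Account* ptr) {
  return ptr->vptr[0](ptr);
}

// Small self-check that exercises the indirect call.
void check_account_dispatch() {
  Account account;
  Account_constructor(&account);
  assert(call_cash_out(&account) == "Account::cash_out");
}
```

The call site never names Account_cash_out directly; it only knows the slot index, which is what makes it possible to swap the table later.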

Virtual functions show their power when they're used in hierarchies. SavingsAccount inherits from the Account class like so:

class SavingsAccount : public Account
{
public:
  void cash_out() override {
    // an implementation specific to SavingsAccount
  }
  virtual ~SavingsAccount() {}
};

When we call cash_out() via a pointer (or a reference), the function that gets invoked is chosen by the type of the object that the pointer actually points to, not by the type of the pointer. For example, suppose get_savings_account() returns a pointer to a SavingsAccount object typed as Account*. The following code will call the SavingsAccount implementation of cash_out():

Account* p = get_savings_account();
p->cash_out(); // calls SavingsAccount version of the cash_out
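To try this behavior out, here is a small self-contained sketch. It assumes that cash_out() returns a string so that the chosen implementation is visible, and make_savings_account() is a hypothetical stand-in for get_savings_account():

```cpp
#include <cassert>
#include <memory>
#include <string>

class Account {
public:
  virtual std::string cash_out() { return "Account"; }
  virtual ~Account() = default;
};

class SavingsAccount : public Account {
public:
  std::string cash_out() override { return "SavingsAccount"; }
};

// Hypothetical factory: creates a SavingsAccount but hands it back
// through a pointer to the base class.
std::unique_ptr<Account> make_savings_account() {
  return std::unique_ptr<Account>(new SavingsAccount());
}

// The static type here is Account&, yet the call is resolved at run
// time according to the object the reference is bound to.
std::string cash_out_via_base(Account& account) {
  return account.cash_out();
}

// Small self-check that exercises both dynamic types.
void check_virtual_dispatch() {
  Account plain;
  std::unique_ptr<Account> savings = make_savings_account();
  assert(cash_out_via_base(plain) == "Account");
  assert(cash_out_via_base(*savings) == "SavingsAccount");
}
```

The virtual destructor in Account matters here as well: it lets the unique_ptr<Account> destroy the SavingsAccount object correctly.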

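Here's what the compiler generates for SavingsAccount. Note that the derived object does not get a second virtual table pointer; it reuses the __vptr stored in its Account subobject, and its constructor re-points it after the base part has been constructed:

struct SavingsAccount
{
  Account _parent_subobject_; // holds __vptr and balance_
};

VTable SavingsAccount_VTable[] = {
  &SavingsAccount_cash_out,
  &SavingsAccount_destructor
};

void SavingsAccount_constructor(SavingsAccount* this) {
  Account_constructor(&this->_parent_subobject_);
  this->_parent_subobject_.__vptr = &SavingsAccount_VTable;
}

void SavingsAccount_cash_out(SavingsAccount* this) {
  // an implementation specific to SavingsAccount
}

void SavingsAccount_destructor(SavingsAccount* this) {}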

So, we have two different tables of virtual functions. When we create an object of the Account type, its __vptr points to Account_VTable, while the object of the SavingsAccount type has its __vptr pointing to SavingsAccount_VTable. Let's take a look at the following code:

p->cash_out();

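The preceding code translates into this:

p->__vptr[0](p);

Now it's clear why __vptr[0] resolves to the correct function: __vptr is read from the object that p points to, and it was set by the constructor of that object's actual type. The static type of p plays no role in the lookup.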
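The whole mechanism for the two-class hierarchy can be emulated by hand. The sketch below is a simplified model under stated assumptions: the Slot alias, the vptr and parent member names, and the free cash_out() helper are hypothetical, the destructor slot is left out, and both implementations take an Account* because a single table can hold only one function pointer type:

```cpp
#include <cassert>
#include <string>

struct Account;
using Slot = std::string (*)(Account*);

struct Account {
  const Slot* vptr;  // stands in for __vptr
  double balance;
};

// The derived layout starts with the base subobject, so the address
// of `parent` can be handed to code that expects an Account*.
struct SavingsAccount {
  Account parent;
};

std::string Account_cash_out(Account*) { return "Account"; }
std::string SavingsAccount_cash_out(Account*) { return "SavingsAccount"; }

// Two tables with the same slot layout: slot 0 is cash_out.
const Slot Account_VTable[] = { &Account_cash_out };
const Slot SavingsAccount_VTable[] = { &SavingsAccount_cash_out };

void Account_constructor(Account* self) {
  self->vptr = Account_VTable;
  self->balance = 0.0;
}

void SavingsAccount_constructor(SavingsAccount* self) {
  Account_constructor(&self->parent);         // construct the base part
  self->parent.vptr = SavingsAccount_VTable;  // then re-point the table
}

// The single call site: it never names a concrete implementation.
std::string cash_out(Account* p) { return p->vptr[0](p); }

// Small self-check: same call site, different tables.
void check_manual_dispatch() {
  Account account;
  Account_constructor(&account);
  SavingsAccount savings;
  SavingsAccount_constructor(&savings);
  assert(cash_out(&account) == "Account");
  assert(cash_out(&savings.parent) == "SavingsAccount");
}
```

Both calls go through the same cash_out() helper; only the table that the object's vptr refers to differs.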

What if SavingsAccount doesn't override the cash_out() function? In that case, the compiler just places the address of the base class implementation in the corresponding slot of SavingsAccount_VTable, as shown here:

VTable SavingsAccount_VTable[] = {
  // the slot contains the base class version 
  // if the derived class doesn't have an implementation
  &Account_cash_out,
  &SavingsAccount_destructor
};
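The effect of an inherited slot can be checked with real virtual functions. In this sketch, cash_out() returns a string so that the selected implementation is visible, and CheckingAccount is a hypothetical second class added only for contrast:

```cpp
#include <cassert>
#include <string>

class Account {
public:
  virtual std::string cash_out() { return "Account"; }
  virtual ~Account() = default;
};

// Does not override cash_out(): calls fall through to the base version.
class SavingsAccount : public Account {};

// Hypothetical class that does provide its own implementation.
class CheckingAccount : public Account {
public:
  std::string cash_out() override { return "CheckingAccount"; }
};

std::string cash_out_via_base(Account& account) {
  return account.cash_out();
}

// Small self-check comparing the inherited and the overridden case.
void check_inherited_slot() {
  SavingsAccount savings;
  CheckingAccount checking;
  assert(cash_out_via_base(savings) == "Account");
  assert(cash_out_via_base(checking) == "CheckingAccount");
}
```

The call through a SavingsAccount object ends up in Account::cash_out(), which matches the table shown above.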

Compilers implement the representation and management of virtual functions differently. Some implementations even use models that differ from the one we introduced earlier. We presented a popular approach in a generic way for the sake of simplicity. Now, we will take a look at what is going on under the hood of the code that incorporates dynamic polymorphism.