C++ Programming Language: Rule of Zero, or Six

The Rule of Zero, or Six is an important rule in modern C++.

January 23, 2023, 11:55 a.m.

Reading time: 7 min.

By Rainer Grimm

The Rule of Zero, or Six is a very important rule in modern C++. I present it in more detail in my current book "C++ Core Guidelines: Best Practices for Modern C++". In this article, I would like to quote the relevant passages from my English-language book.

Rainer Grimm has worked for many years as a software architect, team lead, and training manager. He enjoys writing articles about the programming languages C++, Python, and Haskell, and he also speaks frequently at technical conferences. On his blog Modernes C++, he writes in depth about his passion, C++.

By default, the compiler can generate the big six if needed. You can define the six special member functions yourself, but you can also explicitly ask the compiler to provide them with = default or to delete them with = delete.
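To make the = default and = delete syntax concrete, here is a minimal sketch. The class names Widget and MoveOnly are assumptions made for this illustration and do not come from the book.

```cpp
#include <string>
#include <type_traits>

// Hypothetical type: all six special member functions are requested
// explicitly from the compiler with = default.
class Widget {
 public:
    Widget() = default;                          // default constructor
    ~Widget() = default;                         // destructor
    Widget(const Widget&) = default;             // copy constructor
    Widget& operator=(const Widget&) = default;  // copy assignment
    Widget(Widget&&) = default;                  // move constructor
    Widget& operator=(Widget&&) = default;       // move assignment
 private:
    std::string name_;
};

// Hypothetical type: the copy operations are removed with = delete,
// the move operations are still provided by the compiler.
class MoveOnly {
 public:
    MoveOnly() = default;
    ~MoveOnly() = default;
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly& operator=(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&&) = default;
    MoveOnly& operator=(MoveOnly&&) = default;
};

// Compile-time checks of the resulting copy/move capabilities.
static_assert(std::is_copy_constructible<Widget>::value,
              "Widget can be copied");
static_assert(!std::is_copy_constructible<MoveOnly>::value,
              "MoveOnly cannot be copied");
static_assert(std::is_move_constructible<MoveOnly>::value,
              "MoveOnly can be moved");
```

If one of the static_assert conditions did not hold, the translation unit would fail to compile, so the checks double as documentation of the intended semantics.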

C.20: If you can avoid defining any default operations, do

This rule is also known as "the rule of zero". It means that you can avoid writing any custom constructor, copy/move constructor, assignment operator, or destructor by using types that support the appropriate copy/move semantics. This applies to regular types such as the built-in types bool or double, but also to the containers of the Standard Template Library (STL) such as std::vector or std::string.

class Named_map {
 public:
    // ... no default operations declared ...
 private:
    std::string name;
    std::map<int, int> rep;
};

Named_map nm;       // default construct
Named_map nm2{nm};  // copy construct

The default construction and the copy construction work because they are already defined for std::string and std::map. When the compiler auto-generates the copy constructor for a class, it invokes the copy constructor for all members and all bases of the class.
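The following sketch checks these claims for Named_map. The member functions set and size are hypothetical additions that only exist to make the copies observable; the class still declares none of the big six.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

class Named_map {
 public:
    // No default operations declared: rule of zero.
    // set() and size() are hypothetical helpers for this demo.
    void set(int key, int value) { rep[key] = value; }
    std::size_t size() const { return rep.size(); }
 private:
    std::string name;
    std::map<int, int> rep;
};

// The compiler provides all copy and move operations because
// std::string and std::map support them.
static_assert(std::is_default_constructible<Named_map>::value, "");
static_assert(std::is_copy_constructible<Named_map>::value, "");
static_assert(std::is_copy_assignable<Named_map>::value, "");
static_assert(std::is_move_constructible<Named_map>::value, "");
static_assert(std::is_move_assignable<Named_map>::value, "");

// Returns true if a copy does not share state with its source.
bool copy_is_independent() {
    Named_map nm;        // default construct
    nm.set(1, 10);
    Named_map nm2{nm};   // copy construct: the map is copied member-wise
    nm2.set(2, 20);      // only changes the copy
    assert(nm2.size() == 2);
    return nm.size() == 1;
}
```

Calling copy_is_independent() from main should return true, because the generated copy constructor delegates to the deep-copying copy constructor of std::map.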

C.21: If you define or =delete any default operation, define or =delete them all

The big six are closely related. Due to this relation, you have to define or =delete all six. Consequently, this rule is called "the rule of six". Sometimes, you hear "the rule of five", because the default constructor is special, and, therefore, sometimes excluded.
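As an illustration of the rule of six, here is a sketch of a class that owns a raw array and therefore defines all six special member functions. The class Buffer and its helpers swap, size, and operator[] are assumptions for this example; the copy-and-swap approach is one possible design, not the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer {
 public:
    Buffer() = default;                                     // (1) default constructor
    explicit Buffer(std::size_t len)
        : len_(len), data_(len ? new int[len]() : nullptr) {}
    ~Buffer() { delete[] data_; }                           // (2) destructor
    Buffer(const Buffer& other)                             // (3) copy constructor
        : len_(other.len_),
          data_(other.len_ ? new int[other.len_] : nullptr) {
        std::copy(other.data_, other.data_ + len_, data_);
    }
    Buffer& operator=(const Buffer& other) {                // (4) copy assignment
        Buffer tmp(other);  // deep copy first, then swap it in
        swap(tmp);
        return *this;
    }
    Buffer(Buffer&& other) noexcept                         // (5) move constructor
        : len_(other.len_), data_(other.data_) {
        other.len_ = 0;
        other.data_ = nullptr;  // the source no longer owns the array
    }
    Buffer& operator=(Buffer&& other) noexcept {            // (6) move assignment
        Buffer tmp(std::move(other));
        swap(tmp);
        return *this;
    }
    void swap(Buffer& other) noexcept {
        std::swap(len_, other.len_);
        std::swap(data_, other.data_);
    }
    std::size_t size() const { return len_; }
    int& operator[](std::size_t i) { return data_[i]; }
 private:
    std::size_t len_ = 0;
    int* data_ = nullptr;
};

// Exercises copy assignment and move construction.
bool copy_and_move_behave() {
    Buffer a(3);
    a[0] = 1;
    Buffer b;
    b = a;                    // b gets its own array
    b[0] = 2;                 // does not change a
    Buffer c(std::move(b));   // c takes over b's array
    assert(b.size() == 0);    // b was left empty by the move constructor
    return a[0] == 1 && c[0] == 2 && c.size() == 3;
}
```

Each array is deleted exactly once, because every operation either allocates a fresh array or transfers ownership and resets the source.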

Dependencies between the special member functions

In his talk at the ACCU 2014 conference, Howard Hinnant presented an overview of the automatically generated special member functions.

Howard's table needs a detailed explanation.

First of all, user-declared means for one of these six special member functions that you define it explicitly or request it from the compiler with =default. Deleting the special member function with =delete also counts as user-declared. Essentially, as soon as you use the name, such as the name of the default constructor, it counts as user-declared.
When you define any constructor, you get no compiler-generated default constructor. A default constructor is a constructor that can be invoked without an argument.
When you define or delete a default constructor with =default or =delete, none of the other five special member functions is affected.
When you define or delete a destructor, a copy constructor, or a copy-assignment operator with =default or =delete, you get no compiler-generated move constructor and no compiler-generated move-assignment operator. This means move operations such as move construction or move assignment fall back to copy operations such as copy construction or copy assignment. This automatic fallback is marked in red in the table.
When you define or delete a move constructor or a move-assignment operator with =default or =delete, you get only the move constructor or move-assignment operator you declared. Consequently, the copy constructor and the copy-assignment operator are set to =delete. Invoking a copy operation such as copy construction or copy assignment therefore causes a compilation error.
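Some of these dependencies can be checked directly with type traits. The following is a sketch; the struct names WithIntCtor, WithMoveCtor, and Tracked are hypothetical and only serve this illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Any user-declared constructor suppresses the default constructor.
struct WithIntCtor {
    explicit WithIntCtor(int) {}
};
static_assert(!std::is_default_constructible<WithIntCtor>::value,
              "no default constructor is generated");

// A user-declared move constructor: the copy operations are deleted.
struct WithMoveCtor {
    WithMoveCtor() = default;
    WithMoveCtor(WithMoveCtor&&) = default;
};
static_assert(!std::is_copy_constructible<WithMoveCtor>::value,
              "copy constructor is deleted");
static_assert(!std::is_copy_assignable<WithMoveCtor>::value,
              "copy-assignment operator is deleted");
static_assert(std::is_move_constructible<WithMoveCtor>::value,
              "the declared move constructor is available");

// A user-declared copy constructor: no move constructor is generated,
// so a move request falls back to the copy constructor.
struct Tracked {
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    static int copies;  // counts copy-constructor calls
};
int Tracked::copies = 0;

// Returns how many copies a single "move" construction caused.
int copies_caused_by_move() {
    Tracked::copies = 0;
    Tracked a;
    Tracked b{std::move(a)};  // looks like a move, runs the copy constructor
    (void)b;
    assert(Tracked::copies == 1);
    return Tracked::copies;
}
```

The Tracked example shows why the fallback is easy to overlook: the code compiles and runs, but std::move silently results in a copy.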

When you don't follow this rule, you get very unintuitive objects. Here is an unintuitive example from the guidelines.

// doubleFree.cpp

#include <cstddef>

class BigArray {

 public:
    BigArray(std::size_t len): len_(len), data_(new int[len]) {}

    ~BigArray(){
        delete[] data_;
    }

 private:
    std::size_t len_;
    int* data_;
};

int main(){
  
    BigArray bigArray1(1000);
    
    BigArray bigArray2(1000);
    
    bigArray2 = bigArray1;    // (1)

}                             // (2)

Why does this program have undefined behavior? The default copy-assignment operation bigArray2 = bigArray1 (1) copies all members of bigArray1 into bigArray2. In particular, the pointer data_ is copied, but not the array it points to. Afterward, both objects point to the same array, and the array originally owned by bigArray2 is leaked. When the destructors of bigArray1 and bigArray2 are called (2), the same array is deleted twice, and we get undefined behavior because of the double free.

The unintuitive behavior of the example is that the compiler-generated copy-assignment operator of BigArray makes a shallow copy of BigArray, but the explicitly implemented destructor of BigArray assumes ownership of the data.
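One possible way to repair BigArray is to apply the rule of zero and let std::vector own the memory. The following sketch is an assumption about how such a fix could look, not code from the guidelines; the accessors size and operator[] are hypothetical additions for the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rule-of-zero variant: std::vector manages allocation, copying,
// moving, and deallocation, so no special member function is declared.
class BigArray {
 public:
    explicit BigArray(std::size_t len) : data_(len) {}
    std::size_t size() const { return data_.size(); }
    int& operator[](std::size_t i) { return data_[i]; }
 private:
    std::vector<int> data_;
};

// Mirrors main() from doubleFree.cpp and checks that the copies
// no longer share storage.
bool assignment_makes_deep_copy() {
    BigArray bigArray1(1000);
    BigArray bigArray2(1000);
    bigArray1[0] = 42;
    bigArray2 = bigArray1;   // deep copy via std::vector's copy assignment
    bigArray2[0] = 7;        // does not touch bigArray1
    assert(bigArray2.size() == 1000);
    return bigArray1[0] == 42 && bigArray2[0] == 7;
}   // both vectors release their own memory exactly once
```

Note one difference from the original: std::vector<int>(len) zero-initializes its elements, while new int[len] leaves them uninitialized.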

AddressSanitizer makes the undefined behavior visible.

C.22: Make default operations consistent

This rule is related to the previous rule. If you implement the default operations with different semantics, the users of the class may become very confused. This strange behavior may also appear if you partially implement the member functions and partially request them via =default. You cannot assume that the compiler-generated special member functions have the same semantics as yours.

As an example of the odd behavior, here is the class Strange. Strange includes a pointer to int.

// strange.cpp

#include <iostream>

struct Strange { 
  
    Strange(): p(new int(2011)) {}
    
    // deep copy 
    Strange(const Strange& a) : p(new int(*a.p)) {}      // (1)
  
    // shallow copy
    // equivalent to Strange& operator 
    // = (const Strange&) = default;
    Strange& operator = (const Strange& a) {             // (2)
        p = a.p;
        return *this;
    }  
   
    int* p;
    
};

int main() {
  
    std::cout << '\n';
  
    std::cout << "Deep copy" << '\n';
  
    Strange s1;
    Strange s2(s1);                                      // (3)
  
    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
    std::cout << "s2.p: " << s2.p << "; *s2.p: " << *s2.p << '\n';
  
    std::cout <<  "*s2.p = 2017" << '\n';
    *s2.p = 2017;                                        // (4) 
  
    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
    std::cout << "s2.p: " << s2.p << "; *s2.p: " << *s2.p << '\n';
  
    std::cout << '\n';
  
    std::cout << "Shallow copy" << '\n';

    Strange s3;
    s3 = s1;                                             // (5)
  
    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
    std::cout << "s3.p: " << s3.p << "; *s3.p: " << *s3.p << '\n';
  
  
    std::cout <<  "*s3.p = 2017" << '\n';
    *s3.p = 2017;                                        // (6)
  
    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
    std::cout << "s3.p: " << s3.p << "; *s3.p: " << *s3.p << '\n';
  
    std::cout << '\n';
  
    std::cout << "delete s1.p" << '\n';
    delete s1.p;                                         // (7)
  
    std::cout << "s2.p: " << s2.p << "; *s2.p: " << *s2.p << '\n';
    std::cout << "s3.p: " << s3.p << "; *s3.p: " 
      << *s3.p << '\n';                                  // (8) 
  
    std::cout << '\n';
  
}

The class Strange has a copy constructor (1) and a copy-assignment operator (2). The copy constructor applies deep copy, and the assignment operator applies shallow copy. By the way, the compiler-generated copy constructor and copy-assignment operator also apply shallow copy. Most of the time, you want deep copy semantics (value semantics) for your types, but you probably never want to have different semantics for these two related operations. The difference is that deep copy semantics creates new, separate storage, p(new int(*a.p)), while shallow copy semantics just copies the pointer, p = a.p. Let’s play with the Strange type. The following figure shows the output of the program.

(3) uses the copy constructor to create s2. Displaying the addresses of the pointers and changing the value that s2.p points to (4) shows that s1 and s2 are two distinct objects. This is not the case for s1 and s3. The copy-assignment operation in (5) performs a shallow copy. As a result, changing the value through s3.p (6) also changes the value seen through s1.p because both pointers refer to the same int.

The fun starts if I delete the pointer s1.p (7). Thanks to the deep copy, nothing bad happens to s2.p, but s3.p becomes an invalid pointer. To be more precise: dereferencing an invalid pointer such as in *s3.p (8) is undefined behavior.
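For contrast, here is a sketch of a variant with consistent semantics: both copy operations perform a deep copy, and a destructor releases the int. The name Consistent is an assumption for this example. For brevity, the move operations are not declared, so, following the table above, move requests fall back to the deep-copying copy operations.

```cpp
#include <cassert>

struct Consistent {
    Consistent() : p(new int(2011)) {}
    ~Consistent() { delete p; }

    // deep copy
    Consistent(const Consistent& a) : p(new int(*a.p)) {}

    // deep copy as well: copy the value, each object keeps its own int
    Consistent& operator=(const Consistent& a) {
        *p = *a.p;
        return *this;
    }

    int* p;
};

// Returns true if neither copy shares its int with s1.
bool copies_are_independent() {
    Consistent s1;
    Consistent s2(s1);   // copy construction
    Consistent s3;
    s3 = s1;             // copy assignment
    *s2.p = 2017;
    *s3.p = 2018;
    assert(s1.p != s2.p && s1.p != s3.p);
    return *s1.p == 2011 && *s2.p == 2017 && *s3.p == 2018;
}
```

Because each object owns its own int, destroying s1 can no longer invalidate s2.p or s3.p.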

What's next?

The concept of a regular type is firmly anchored in the Standard Template Library (STL). The idea goes back to Alexander Stepanov, the creator of the STL. In my next post, I will write more about regular types. (rme)