C++ Copy Semantics

One of the most common operations in programming is assigning a value to a variable with the = operator, or passing values as arguments to functions. Whenever we do this, the compiler either creates a copy or passes a reference.

It is sometimes hard to tell whether we are getting a copy by value (a clone is created at a new memory location and used from there) or a copy by reference (no clone is created, just a reference or pointer to the same memory location).
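Before moving to classes, here is a minimal sketch of the difference using plain ints. The function names are made up for this illustration and are not part of the Circle example that follows.

```cpp
#include <cassert>

// Hypothetical helpers, not part of the Circle example.
void set_by_value(int n) { n = 99; }        // changes its own private copy
void set_by_reference(int& n) { n = 99; }   // changes the caller's variable

bool value_vs_reference_demo() {
    int a = 1;
    set_by_value(a);
    assert(a == 1);        // the caller's variable is untouched

    set_by_reference(a);
    assert(a == 99);       // the caller's variable was modified

    int copy = a;          // new storage holding the same value
    int& alias = a;        // no new storage, just another name for a
    alias = 5;
    return a == 5 && copy == 99 && &alias == &a;
}
```

Calling value_vs_reference_demo() returns true: the copy keeps its old value, while the alias and the original always change together.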

The concept is easier to understand with a code example. Suppose we have a Circle class with a float pointer to store its radius.

#include <iostream>

class Circle {
public:
    Circle(float val): radius(new float(val))
    {
        std::cout << "LOG: circle `created` with radius: " << val << std::endl;
    }

    ~Circle()
    {
        std::cout << "LOG: circle `deleted` with radius: " << *radius << std::endl;
        delete radius;
        radius = nullptr;
    }

    void set(float val) { *radius = val; }
    float get() { return *radius; }

private:
    float *radius;
};

int main() {
    // create a circle
    Circle c1 = Circle(10.5);

    // create a copy
    Circle c2 = c1;

    // change the radius of C2
    c2.set(15.5);

    // print the radius of both circles
    std::cout << "C1 radius = " << c1.get() << std::endl;
    std::cout << "C2 radius = " << c2.get() << std::endl;

    return 0;
}

I have intentionally allocated the float with new so that we can see the shallow copy in action.

The program is very simple. It creates a Circle C1 and then creates a duplicate C2 by initializing it from C1. We would expect the program to do the following:

  • Create a circle C1 with radius 10.5
  • Create a duplicate circle C2 with radius 10.5
  • Change the radius of C2 from 10.5 to 15.5

But if we execute this, we see some weird behavior. Here’s the output of the above program:

LOG: circle `created` with radius: 10.5
C1 radius = 15.5
C2 radius = 15.5
LOG: circle `deleted` with radius: 15.5
LOG: circle `deleted` with radius: 1.92719e-40
free(): double free detected in tcache 2
Aborted

Give it another try, and here’s the output:

LOG: circle `created` with radius: 10.5
C1 radius = 15.5
C2 radius = 15.5
LOG: circle `deleted` with radius: 15.5
LOG: circle `deleted` with radius: 3.28753e-40
free(): double free detected in tcache 2
Aborted

Shallow Copy

Oh hold on! There are 2 lines in the log which need our attention.

# why does C1 have a radius of 15.5?
C1 radius = 15.5

The reason for this behavior is the shallow copy. Since the class has a pointer member and we have not defined a copy constructor, the compiler generates one implicitly, and it copies each member as is. For a pointer member that means the address is copied, not the float it points to. So c1.radius and c2.radius point to the same float, and c2.set(15.5) changes the value seen through c1 as well.
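To see the sharing in isolation, here is a small sketch with a hypothetical RadiusView struct. It only points at a local variable and owns nothing, so copying it is harmless and we can compare the two addresses directly.

```cpp
#include <cassert>

// Hypothetical non-owning struct: it has a pointer member but no destructor,
// so copying it is safe here and only shows what gets copied.
struct RadiusView {
    float* radius;
};

bool shallow_copy_shares_pointee() {
    float value = 10.5f;
    RadiusView a{&value};
    RadiusView b = a;             // implicitly generated copy constructor

    assert(a.radius == b.radius); // same address: only the pointer was copied
    *b.radius = 15.5f;            // writing through b ...
    return *a.radius == 15.5f;    // ... is visible through a
}
```

The function returns true, which is the same effect we saw with c1 and c2, just without the ownership problem.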

Double Deletion

# from the first run
LOG: circle `deleted` with radius: 1.92719e-40

# from the second run
LOG: circle `deleted` with radius: 3.28753e-40

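We see garbage values for the radius because both objects hold the same pointer. The destructor of c2 runs first, prints 15.5 and deletes the float. Setting radius to nullptr there only resets c2’s own copy of the pointer; c1.radius still holds the old address. When the destructor of c1 runs next, it reads the already freed memory (hence the garbage value, which differs between runs) and then tries to delete the already deleted pointer. This is the double delete.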

Double delete corrupts the heap and leads to undefined behavior.
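The order of the two `deleted` lines in the log follows from a language rule: local objects are destroyed in reverse order of their declaration, so c2 goes before c1. Here is a minimal sketch of that rule. Tracer and destruction_order are hypothetical names, not part of the Circle example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type: records its name when it is destroyed.
struct Tracer {
    std::string name;
    std::vector<std::string>& log;
    ~Tracer() { log.push_back(name); }
};

std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        Tracer c1{"c1", log};
        Tracer c2{"c2", log};
    }   // scope ends here: c2 is destroyed first, then c1
    return log;
}
```

The returned vector holds "c2" followed by "c1", which is why the valid 15.5 is printed first and the garbage value second.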

~Circle()
{
    std::cout << "LOG: circle `deleted` with radius: " << *radius << std::endl;
    delete radius;
    radius = nullptr;
}

If we change the order of operations in the destructor, i.e. set the radius pointer to nullptr first and then call delete on it, the double free error message goes away. Does that mean we have solved the double delete issue?

~Circle()
{
    std::cout << "LOG: circle `deleted` with radius: " << *radius << std::endl;
    radius = nullptr; // the address of the float is lost here
    delete radius;    // deleting a nullptr does nothing
}

No! We just created a memory leak.

The above code makes the destructor useless, because delete on a nullptr does nothing. And since we assigned nullptr to radius, we lost the memory location we borrowed from the OS and there’s no way to return it. Hence this is a memory leak.
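We can make the leak visible with a small counter. This is only a sketch: Counted is a hypothetical type that tracks how many instances are alive, and the backup pointer exists purely so that the example can clean up after itself.

```cpp
#include <cassert>

// Hypothetical helper type that counts how many instances are alive.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Returns the number of live objects after "reset, then delete".
int alive_after_reset_then_delete() {
    Counted* owner = new Counted();   // alive == 1
    Counted* backup = owner;          // kept only so this sketch can clean up

    owner = nullptr;                  // the address is gone from owner
    delete owner;                     // deleting nullptr is a no-op
    int still_alive = Counted::alive; // still 1: the destructor never ran

    delete backup;                    // real cleanup, alive drops back to 0
    return still_alive;
}
```

The function returns 1: after the "reset, then delete" sequence the object is still alive, and without the backup pointer nobody could ever free it.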

Deep Copy

In scenarios like the one above, we need a deep copy, and we have to tell the compiler explicitly how our class should be copied. A deep copy creates a completely independent copy of all resources owned by the object. When copying a class with dynamic memory, a deep copy allocates new memory and copies the content from the original object into it. Each object then manages its own copy, so changes to one object do not affect the other, and destructors can safely free memory without risk of double deletion.
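The same idea applies to any owned resource, not just a single float. Below is a sketch with a hypothetical Buffer class that owns a heap array. Assignment is deliberately disabled to keep the example short, so treat it as an illustration of "allocate, then copy the content" rather than a complete class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class for illustration: owns a heap array of floats.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new float[n]()) {}

    // Deep copy: allocate a new array, then copy the elements across.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new float[other.size_])
    {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Assignment is disabled to keep the sketch short.
    Buffer& operator=(const Buffer&) = delete;

    ~Buffer() { delete[] data_; }

    float& at(std::size_t i) { return data_[i]; }
    const float* data() const { return data_; }

private:
    std::size_t size_;
    float* data_;
};

bool deep_copy_is_independent() {
    Buffer a(3);
    a.at(0) = 10.5f;
    Buffer b = a;            // deep copy into a separate array
    b.at(0) = 15.5f;         // does not touch a
    return a.data() != b.data() && a.at(0) == 10.5f && b.at(0) == 15.5f;
}
```

The two buffers end up at different addresses with different contents, and each destructor frees only its own array.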

Copy Constructor

Let’s add the copy constructor.

Circle(const Circle& copy)
    : radius(new float(*copy.radius))
{
    std::cout << "LOG: `copy` of circle created with radius: " << *radius << std::endl;
}

We are now explicitly telling the compiler how to copy our custom class. Now if we execute the code, the logs look like:

LOG: circle `created` with radius: 10.5
LOG: `copy` of circle created with radius: 10.5
C1 radius = 10.5
C2 radius = 15.5
LOG: circle `deleted` with radius: 15.5
LOG: circle `deleted` with radius: 10.5

Double Free Error Again!

So far we only copied into an object that did not exist yet: Circle c2 was created from the value of Circle c1. What if we try assigning the value of one circle to an existing circle object?

Let’s modify the main function to assign a new value to an existing Circle object:

int main() {
    // create a circle
    Circle c1 = Circle(10.5);
    // create a big circle
    Circle c2 = Circle(15.5);
    // create a small circle
    Circle c3 = Circle(5.5);

    // print the radius of all the circles
    std::cout << "C1 radius = " << c1.get() << std::endl;
    std::cout << "C2 radius = " << c2.get() << std::endl;
    std::cout << "C3 radius = " << c3.get() << std::endl;

    // let's assign the value of c3 to c1
    c1 = c3;
    std::cout << "updated C1 radius = " << c1.get() << std::endl;

    return 0;
}

The logs on execution of the updated code:

LOG: circle `created` with radius: 10.5
LOG: circle `created` with radius: 15.5
LOG: circle `created` with radius: 5.5
C1 radius = 10.5
C2 radius = 15.5
C3 radius = 5.5
updated C1 radius = 5.5
LOG: circle `deleted` with radius: 5.5
LOG: circle `deleted` with radius: 15.5
LOG: circle `deleted` with radius: 1.77397e-40
free(): double free detected in tcache 2
Aborted

Oh wait, something is not right. The line updated C1 radius = 5.5 is as expected, but why did the double free error pop up again?

Overloading = Operator

We are seeing the double free error again because c1 = c3 calls the copy assignment operator, and since we have not provided one, the compiler generated a shallow assignment which copies the pointer value of radius from c3 into c1.

After that both c1 and c3 point to the same heap float, so when the destructors run we get a double deletion error. The float that c1 originally owned (10.5) is never freed either, so it leaks. So we need to overload the = operator for our class.

Circle& operator=(Circle other) // taken by value: other is already a copy
{
    // copy and swap idiom [C++98]
    std::swap(radius, other.radius); // std::swap needs #include <utility>
    return *this;
}

Now after adding the code for = overloading the logs are:

LOG: circle `created` with radius: 10.5
LOG: circle `created` with radius: 15.5
LOG: circle `created` with radius: 5.5
C1 radius = 10.5
C2 radius = 15.5
C3 radius = 5.5
LOG: `copy` of circle created with radius: 5.5
LOG: circle `deleted` with radius: 10.5
updated C1 radius = 5.5
LOG: circle `deleted` with radius: 5.5
LOG: circle `deleted` with radius: 15.5
LOG: circle `deleted` with radius: 5.5

We need to pay special attention to these lines from the above log:

LOG: `copy` of circle created with radius: 5.5
LOG: circle `deleted` with radius: 10.5
updated C1 radius = 5.5

This is the magic of the copy and swap idiom. Here’s the sequence of operations performed (with the overloaded operator in place) when we executed c1 = c3:

  • Because the operator takes its parameter by value, the copy constructor first creates a copy of c3 -> LOG: `copy` of circle created with radius: 5.5.
  • std::swap exchanges the radius pointers, so c1 now owns the new float (5.5) and the temporary copy owns the old float of c1.
  • When the operator returns, the temporary copy is destroyed and takes the old float of c1 with it -> LOG: circle `deleted` with radius: 10.5.
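To check this sequence without reading logs, here is a sketch that counts events instead of printing them. TracedCircle and Events are hypothetical names. The class mirrors the article’s Circle but records into counters, and the snapshot is taken right after the assignment, before the circles themselves go out of scope.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, filled in by TracedCircle below.
struct Events {
    int copies = 0;
    int destroys = 0;
    float last_destroyed = 0.0f;
};

// Hypothetical variant of the article's Circle that records events in
// counters instead of printing them.
class TracedCircle {
public:
    TracedCircle(float val, Events& ev) : radius(new float(val)), events(&ev) {}

    TracedCircle(const TracedCircle& other)
        : radius(new float(*other.radius)), events(other.events)
    {
        ++events->copies;
    }

    // Parameter taken by value: the caller copy-constructs other.
    TracedCircle& operator=(TracedCircle other) noexcept
    {
        std::swap(radius, other.radius);  // other now holds our old float
        return *this;
    }                                     // other is destroyed after the call

    ~TracedCircle()
    {
        ++events->destroys;
        events->last_destroyed = *radius;
        delete radius;
    }

    float get() const { return *radius; }

private:
    float* radius;
    Events* events;
};

// Runs c1 = c3 and returns the counters as they stand right after it.
Events snapshot_after_assignment() {
    Events live;
    Events snapshot;
    {
        TracedCircle c1(10.5f, live);
        TracedCircle c3(5.5f, live);
        c1 = c3;
        assert(c1.get() == 5.5f && c3.get() == 5.5f);
        snapshot = live;   // taken before c3 and c1 go out of scope
    }
    return snapshot;
}
```

The snapshot shows exactly one copy and one destruction, and the destroyed value is 10.5, the old radius of c1.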

C++ Constructor Rules

These are the core principles professionals rely on when writing correct, safe, resource-owning C++ classes:

Rule of Three [C++98]

If a class manages a resource (raw pointer, file handle, socket, etc.) and you define any one of the destructor, the copy constructor or the copy assignment operator, then you should define all three. If they are not defined, the compiler-generated versions perform shallow copies, causing double frees or leaks.

Rule of Five [C++11]

C++11 brought move semantics, so resource-managing classes typically need five special member functions: destructor, copy constructor, copy assignment operator, move constructor and move assignment operator. Moving improves performance (stealing instead of copying) and prevents unnecessary allocations.
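As a sketch of what the two extra functions could look like for our Circle, here is a hypothetical MovableCircle with all five members. Logging is left out, and the empty() helper is an addition for this example only, so that we can inspect a moved-from object.

```cpp
#include <cassert>
#include <utility>

// Hypothetical variant of the article's Circle with all five special members.
class MovableCircle {
public:
    explicit MovableCircle(float val) : radius(new float(val)) {}

    ~MovableCircle() { delete radius; }   // deleting nullptr is fine

    // Copy constructor: deep copy (a moved-from source has no float).
    MovableCircle(const MovableCircle& other)
        : radius(other.radius ? new float(*other.radius) : nullptr) {}

    // Move constructor: steal the pointer and leave the source empty.
    MovableCircle(MovableCircle&& other) noexcept
        : radius(other.radius)
    {
        other.radius = nullptr;
    }

    // Copy assignment: copy first, then swap.
    MovableCircle& operator=(const MovableCircle& other)
    {
        MovableCircle tmp(other);         // may throw, *this is untouched
        std::swap(radius, tmp.radius);
        return *this;
    }

    // Move assignment: swap, the source's destructor frees our old float.
    MovableCircle& operator=(MovableCircle&& other) noexcept
    {
        std::swap(radius, other.radius);
        return *this;
    }

    bool empty() const { return radius == nullptr; }
    float get() const { return *radius; }

private:
    float* radius;
};

bool move_transfers_ownership() {
    MovableCircle a(10.5f);
    MovableCircle b(std::move(a));   // move constructor: no new allocation
    MovableCircle c(1.0f);
    c = std::move(b);                // move assignment: pointers are swapped
    return a.empty() && c.get() == 10.5f;
}
```

No float is allocated during the two moves; the pointer that started in a simply ends up in c.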

Rule of Zero [Modern C++]

If we design our class to avoid manual resource management by using RAII (Resource Acquisition Is Initialization) types like std::string, std::vector or std::unique_ptr as members, then we should define none of the special member functions. The compiler-generated ones already do the right thing.
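Here is a sketch of a Rule of Zero version of our class. OwnedCircle is a hypothetical name, and std::unique_ptr is one possible choice of wrapper. Note that it makes the class move-only, so the accidental shallow copy from the beginning of this post becomes a compile error instead of a runtime crash.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical Rule of Zero variant: std::unique_ptr owns the float, so the
// class declares no destructor, copy or move operations of its own.
class OwnedCircle {
public:
    explicit OwnedCircle(float val) : radius(new float(val)) {}

    void set(float val) { *radius = val; }
    float get() const { return *radius; }
    bool empty() const { return radius == nullptr; }

private:
    std::unique_ptr<float> radius;
};

// unique_ptr cannot be copied, so OwnedCircle cannot be copied either.
static_assert(!std::is_copy_constructible<OwnedCircle>::value, "copy is disabled");
// The compiler-generated move constructor is available and does not throw.
static_assert(std::is_nothrow_move_constructible<OwnedCircle>::value, "move is generated");

float move_then_read() {
    OwnedCircle a(10.5f);
    OwnedCircle b = std::move(a);   // ownership moves to b, a becomes empty
    b.set(15.5f);
    return a.empty() ? b.get() : -1.0f;
}
```

If the class should stay copyable, storing the radius as a plain float member would give correct copies for free, again without writing any special member function.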

Complete Code

#include <iostream>
#include <utility> // std::swap

class Circle {
public:
    Circle(float val): radius(new float(val))
    {
        std::cout << "LOG: circle `created` with radius: " << val << std::endl;
    }

    ~Circle() noexcept
    {
        std::cout << "LOG: circle `deleted` with radius: " << *radius << std::endl;
        delete radius;
    }

    Circle(const Circle& copy)
        : radius(new float(*copy.radius))
    {
        std::cout << "LOG: `copy` of circle created with radius: " << *radius << std::endl;
    }

    Circle& operator=(Circle copy) noexcept
    {
        // copy and swap idiom [C++98]
        std::swap(radius, copy.radius);
        return *this;
    }

    void set(float val) { *radius = val; }
    float get() const { return *radius; }

private:
    float *radius;
};

int main() {
    // create a circle
    Circle c1 = Circle(10.5);
    // create a big circle
    Circle c2 = Circle(15.5);
    // create a small circle
    Circle c3 = Circle(5.5);

    // print the radius of all the circles
    std::cout << "C1 radius = " << c1.get() << std::endl;
    std::cout << "C2 radius = " << c2.get() << std::endl;
    std::cout << "C3 radius = " << c3.get() << std::endl;

    // let's assign the value of c3 to c1
    c1 = c3;
    std::cout << "updated C1 radius = " << c1.get() << std::endl;

    return 0;
}

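The final code uses noexcept, a C++11 specifier which declares that a function does not throw exceptions. Destructors are noexcept by default, so on ~Circle() it only makes the intent explicit. The final code also marks get() as const, so it can be called on const Circle objects.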
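The noexcept keyword also works as an operator that can be checked at compile time. A small sketch follows; the functions and the Plain struct are hypothetical and exist only for this check.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical functions used only to show what noexcept changes.
void may_throw() {}
void never_throws() noexcept {}

// The noexcept operator is evaluated at compile time and calls nothing.
static_assert(!noexcept(may_throw()), "no promise was made");
static_assert(noexcept(never_throws()), "promises not to throw");

// Destructors are noexcept by default since C++11, even without the keyword.
struct Plain { ~Plain() {} };
static_assert(std::is_nothrow_destructible<Plain>::value, "implicitly noexcept");

bool checks_agree() {
    return noexcept(never_throws()) && !noexcept(may_throw());
}
```

If a function marked noexcept does throw at run time, the program calls std::terminate, so the specifier is a promise and not something the compiler enforces for us.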

Closing Notes

This is a long text and I tried my best to write things down as clearly as possible. I hope you enjoyed reading it and found it helpful. We just touched upon RAII, which is one of the most important and defining idioms in C++.

It is the foundation of safe memory management, exception safety, and deterministic cleanup. Let’s see if I can fully absorb it and then write about it. There are numerous C++11 changes which I am excited about, but at the same time a few are hard to swallow and really frustrating. Hopefully this will go away with more practice.

Happy Coding. Bye!