Using smart pointers in building trees in which child nodes need an access to the parent

Consider building a tree. Often it is enough if only a parent node is able to reference child nodes, but this is not always the case. If parent node creates a child node that will later need an access back to the parent, what is the best way for a parent node to pass a reference to itself to the child? The only proper solution I could come up with was to inherit from std::enable_shared_from_this class and pass on the shared_ptr of pointer this. Luckily, the C++11 creators were smart enough to include that class to the standard. Other alternatives I considered have more serious shortcomings than this one.

Wrong solutions

You could always avoid using smart pointers and simply pass raw pointer this, but then you get all the usual downsides of not using smart pointers. If you are prepared to live with them, then this is probably the simplest way. The following will of course segfault, but for simplicity I will omit mechanism for stopping the tree growing in all examples here.


class node {
private:
  node* m_parent;
  node* m_child1;
  node* m_child2;
public:
  node(node* parent) {
    m_parent = parent
    m_child1 = new node(this);
    m_child2 = new node(this);
    // Remember to delete nodes
    // Don't ever delete parent
  }
}

To help enforce at least some safety into such system, you might try to at least store pointer to parent node as a const, to prevent accidental deletion of that pointer as well.


class node {
private:
  const node* const m_parent;
  node* m_child1;
  node* m_child2;
public:
  node(const node* const parent) 
 : m_parent(parent) {
    m_child1 = new node(this);
    m_child2 = new node(this);
    // Remember to delete nodes
    // Safer: can't delete parent anyway
  }
}

But in this case you will only be able to call const functions of a parent node and after that, any child nodes function a parent might call, unless you const_cast it somewhere on the way. But by then you have already created a mess and you might as well do the whole thing dangerously. It is within a contained section of a code anyway, in this one class only.

If you are more of a conformist and don't like living on the edge, you might want to consider using smart pointers as a way to enforce safety, instead of merely documenting the code with capital letters.


class node {
private:
  shared_ptr<node> m_child1;
  shared_ptr<node> m_child2;
}

In that case, the parent class will keep references to the child nodes in form of smart pointers. That means the child nodes will be owned by code external to them and when they will find themselves in the role of a parent node, how to arrange passing a reference to themselves to their child nodes?

Again, one option is to pass raw pointer to this.


class node {
private:
  const node* const m_parent;
  shared_ptr<node> m_child1;
  shared_ptr<node> m_child2;
public:
  node(const node* const parent)
 : m_parent(parent) {
    m_child1.reset(new node(this));
    m_child2.reset(new node(this));
  }
}

Now, if pointer to the parent is not collected and stored as const, there is no enforced ownership. Someone could easily delete that pointer by mistake, for instance. If it is stored as const, well, see above. To summarise, this solution is not whole lot better than simply not using smart pointers. I would argue it is worse, because to a casual reader of your code this would signal that the code is safer than it really is. Imagine fixing a bug in the area and thinking it can't be deleted at this point, because it is a smart pointer...

And you certainly can't simply take this pointer and pass it to the child node to be stored as shared_ptr.


class node {
private:
  shared_ptr<node> m_parent;
  shared_ptr<node> m_child1;
  shared_ptr<node> m_child2;
public:
  node(node* parent) {
    m_parent.reset(parent);
    m_child1.reset(new node(this));
    m_child2.reset(new node(this));
  }
}

As shown before, if a node is created as a shared_ptr, it is owned by a code creating it. Creating new shared_ptr object out of the same pointer will result in two smart pointers sharing ownership of the same pointer. This is the most basic of errors you could make with smart pointers.

There are two ways out of this, both broken. You could have a smart pointer version of this stored along with other class members, just to be able to pass it on and not break the reference count. This smart pointer object could be given to node by the code that created it, shortly after the creation.


class node {
private:
  shared_ptr<node> m_parent;
  shared_ptr<node> m_child1;
  shared_ptr<node> m_child2;
  shared_ptr<node> m_this;
public:
  node(shared_ptr<node> parent) {
    m_parent = parent;
    m_child1.reset(new node(m_this));
    m_child2.reset(new node(m_this));
    m_child1.set_this(m_child1);
    m_child2.set_this(m_child2);
  }
  void set_this(shared_ptr<node> that) {
    m_this = that;
  }
}

This might seem like a not completely unreasonable idea at first, but it defeats the whole purpose of a smart pointer. It creates a circular reference. Smart pointer used this way will always have at least one reference, because the object references itself! It will never get destroyed because it will never go out of scope because it will never go out of scope.

The second, less wrong way out of the problem of storing smart pointer to this in the node itself, but by no means less useful, is to use unique_ptr instead. A code creating a node would pass a unique_ptr to the node after creation and lose ownership. The node would in turn pass it to the child node it is creating and lose ownership too. This will avoid circular reference problem, because node does not own itself.


class node {
private:
  unique_ptr<node> m_parent;
  unique_ptr<node> m_this;
  unique_ptr<node> m_child1;
  unique_ptr<node> m_child2;
public:
  node(unique_ptr<node> parent) {
    m_parent = parent;
    m_child1.reset(new node(m_this));
    // Lost m_this to m_child1
    m_child1.set_this(m_child1);
    // Hmm, lost m_child1 as well ...
  }
  void set_this(unique_ptr<node> that) {
    m_this = that;
  }
}

Yes, but the price is the node is now unable to create any additional child nodes the same way. And as we all know, trees where node only have one child are merely useless linked lists.

You could try to come up with some out-of-the-box idea and have for instance external accounting for all the objects in a tree. In a vector for instance. But how would you traverse? If you stored some kind of references of child and parent nodes in the nodes, then you have just solved the problem without external storage of nodes. You could also have a list of relations between nodes stored separately, but that sounds like a clumsy solution. Another solution would be to invent an ahnentafel-like arrangement, where you could find an element based on some formula. All I could think of were too much trouble at best.

What about getting rid of pointer types altogether? You could pass dereferenced this pointer of a parent node to a child class and store it there as a reference. You can only do that in constructor, which means construction of a root element will create the whole tree. Could this work as well?


class node {
public:
  node& m_parent;
  // Must use pointer type,
  // it is incomplete type at this point
  std::shared_ptr<node> m_child1;
  std::shared_ptr<node> m_child2;
  node(node& parent)
    : m_parent(parent) {
    m_child1.reset(new node(*this));
    m_child2.reset(new node(*this));
  }
};

Looks solid enough. But what to pass to that first root node when starting to create a tree? Reference is a stronger condition. It requires a fully constructed object. It can't be null just for the root, yet root by definition doesn't have nodes. The fact that default constructor is missing is not a problem, it is merely a correct representation of reality.

This calls for another class to handle root nodes. This class should not require a parent node. Now you already need 2 classes to solve this problem.


class node_root {
public:
  std::shared_ptr<node> m_child1;
  std::shared_ptr<node> m_child2;
  node_root() {
    // m_child1.reset(new node(*this)); 
    // But now this doesn't even compile; node expects 
    // a constructed object of type node to be provided.
    m_c1.reset(new node(*reinterpret_cast<node*>(this))); 
    m_c2.reset(new node(*reinterpret_cast<node*>(this))); 
  } 
};

You need to use the ugliest of casts just to make this compile. Which, by the way, will not work, because in node class a reference to the parent is in the first position, followed by two shared pointers to child nodes, which is incompatible to root_node memory alignment. You could provide correct ordering of members in common, but lets just stop here.

The correct way out of this mess is by having a common base class. Now you already need 3 classes to solve this problem! Not to mention one of them needs to be forward declared first, then fully defined exactly between one and the other.


class node;
class node_base
{
public:
  std::shared_ptr<node> m_child1;
  std::shared_ptr<node> m_child2;
  void some_common_function() {
    int i=0; // or whatever the node is for
  }
};
// class node must defined in the correct position, 
// between node_base and node_root
class node : public node_base
{
public:
  node_base& m_parent;
  node(node_base& parent)
    : m_parent(parent) {
    m_child1.reset(new node(*this));
    m_child2.reset(new node(*this));
  }
};
class node_root : public node_base
{
public:
  node_root() {
    // Can now get away with the second ugliest cast
    m_child1.reset(new node(*reinterpret_cast<node_base*>(this)));
    m_child2.reset(new node(*reinterpret_cast<node_base*>(this)));
  }
};

Only then can finally use it.


node_root root(3);
root.m_child1->m_child1->m_parent.some_common_function();
root.m_child1->m_child2->some_common_function();

It is not that it couldn't work, but let's rather admit this was only a perverted exerciseand implement proper solution instead of this travesty.

Proper solution

On the other hand, storing smart pointer to itself in the object is similar to what you get if you just inherit from std::enable_shared_from_this. Base class will communicate with a shared_ptr instance that owns this pointer and allow you to use shared_ptr version of itself by calling shared_from_this() function without breaking reference counting.


class node : std::enable_shared_from_this {
private:
  shared_ptr<node> m_parent;
  shared_ptr<node> m_child1;
  shared_ptr<node> m_child2;
public:
  node() {
    // Can't do much here ...
  }
  void create(shared_ptr<node> parent){
    m_parent = parent;
    m_child1.reset(new node());
    m_child2.reset(new node());
    m_child1.create(shared_from_this());
    m_child2.create(shared_from_this());
  }
}

Downside to this is that an object must be owned by shared_ptr prior to calling shared_from_this(), otherwise you risk an angry run-time error. The consequence of this is that child nodes cannot be created in a constructor of a parent node, because the construction of parent must be completed before the pointer returned by operator new can be assigned to shared_ptr. Unless you are willing to write your own new. But considering alternatives, this is just a minor inconvenience.


Previous: Error: unable to find numeric literal operator
Next: A way to compare floating point values