- ======================================================
- How to set up LLVM-style RTTI for your class hierarchy
- ======================================================
- .. contents::
- Background
- ==========
- LLVM avoids using C++'s built-in RTTI. Instead, it pervasively uses its
- own hand-rolled form of RTTI which is much more efficient and flexible,
- although it requires a bit more work from you as a class author.
- A description of how to use LLVM-style RTTI from a client's perspective is
- given in the `Programmer's Manual <ProgrammersManual.html#isa>`_. This
- document, in contrast, discusses the steps you need to take as a class
- hierarchy author to make LLVM-style RTTI available to your clients.
- Before diving in, make sure that you are familiar with the object-oriented
- programming concept of "`is-a`_".
- .. _is-a: http://en.wikipedia.org/wiki/Is-a
- Basic Setup
- ===========
- This section describes how to set up the most basic form of LLVM-style RTTI
- (which is sufficient for 99.9% of the cases). We will set up LLVM-style
- RTTI for this class hierarchy:
- .. code-block:: c++
-     class Shape {
-     public:
-       Shape() {}
-       virtual double computeArea() = 0;
-     };
-     class Square : public Shape {
-       double SideLength;
-     public:
-       Square(double S) : SideLength(S) {}
-       double computeArea() override;
-     };
-     class Circle : public Shape {
-       double Radius;
-     public:
-       Circle(double R) : Radius(R) {}
-       double computeArea() override;
-     };
- The most basic working setup for LLVM-style RTTI requires the following
- steps:
- #. In the header where you declare ``Shape``, you will want to ``#include
- "llvm/Support/Casting.h"``, which declares LLVM's RTTI templates. That
- way your clients don't even have to think about it.
- .. code-block:: c++
- #include "llvm/Support/Casting.h"
- #. In the base class, introduce an enum which discriminates all of the
- different concrete classes in the hierarchy, and stash the enum value
- somewhere in the base class.
- Here is the code after introducing this change:
- .. code-block:: c++
-     class Shape {
-     public:
-    +  /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.)
-    +  enum ShapeKind {
-    +    SK_Square,
-    +    SK_Circle
-    +  };
-    +private:
-    +  const ShapeKind Kind;
-    +public:
-    +  ShapeKind getKind() const { return Kind; }
-    +
-       Shape() {}
-       virtual double computeArea() = 0;
-     };
- You will usually want to keep the ``Kind`` member encapsulated and
- private, but let the enum ``ShapeKind`` be public along with providing a
- ``getKind()`` method. This is convenient for clients so that they can do
- a ``switch`` over the enum.
- A common naming convention is that these enums are "kind"s, to avoid
- ambiguity with the words "type" or "class" which have overloaded meanings
- in many contexts within LLVM. Sometimes there will be a natural name for
- it, like "opcode". Don't bikeshed over this; when in doubt use ``Kind``.
- You might wonder why the ``Kind`` enum doesn't have an entry for
- ``Shape``. The reason for this is that since ``Shape`` is abstract
- (``computeArea() = 0;``), you will never actually have non-derived
- instances of exactly that class (only subclasses). See `Concrete Bases
- and Deeper Hierarchies`_ for information on how to deal with
- non-abstract bases. It's worth mentioning here that unlike
- ``dynamic_cast<>``, LLVM-style RTTI can be used (and is often used) for
- classes that don't have v-tables.
- #. Next, you need to make sure that the ``Kind`` gets initialized to the
- value corresponding to the dynamic type of the class. Typically, you will
- want to have it be an argument to the constructor of the base class, and
- then pass in the respective ``XXXKind`` from subclass constructors.
- Here is the code after that change:
- .. code-block:: c++
-     class Shape {
-     public:
-       /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.)
-       enum ShapeKind {
-         SK_Square,
-         SK_Circle
-       };
-     private:
-       const ShapeKind Kind;
-     public:
-       ShapeKind getKind() const { return Kind; }
-    -  Shape() {}
-    +  Shape(ShapeKind K) : Kind(K) {}
-       virtual double computeArea() = 0;
-     };
-     class Square : public Shape {
-       double SideLength;
-     public:
-    -  Square(double S) : SideLength(S) {}
-    +  Square(double S) : Shape(SK_Square), SideLength(S) {}
-       double computeArea() override;
-     };
-     class Circle : public Shape {
-       double Radius;
-     public:
-    -  Circle(double R) : Radius(R) {}
-    +  Circle(double R) : Shape(SK_Circle), Radius(R) {}
-       double computeArea() override;
-     };
- #. Finally, you need to inform LLVM's RTTI templates how to dynamically
- determine the type of a class (i.e. whether the ``isa<>``/``dyn_cast<>``
- should succeed). The default "99.9% of use cases" way to accomplish this
- is through a small static member function ``classof``. In order to have
- proper context for an explanation, we will display this code first, and
- then below describe each part:
- .. code-block:: c++
-     class Shape {
-     public:
-       /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.)
-       enum ShapeKind {
-         SK_Square,
-         SK_Circle
-       };
-     private:
-       const ShapeKind Kind;
-     public:
-       ShapeKind getKind() const { return Kind; }
-       Shape(ShapeKind K) : Kind(K) {}
-       virtual double computeArea() = 0;
-     };
-     class Square : public Shape {
-       double SideLength;
-     public:
-       Square(double S) : Shape(SK_Square), SideLength(S) {}
-       double computeArea() override;
-    +
-    +  static bool classof(const Shape *S) {
-    +    return S->getKind() == SK_Square;
-    +  }
-     };
-     class Circle : public Shape {
-       double Radius;
-     public:
-       Circle(double R) : Shape(SK_Circle), Radius(R) {}
-       double computeArea() override;
-    +
-    +  static bool classof(const Shape *S) {
-    +    return S->getKind() == SK_Circle;
-    +  }
-     };
- The job of ``classof`` is to dynamically determine whether an object of
- a base class is in fact of a particular derived class. In order to
- downcast a type ``Base`` to a type ``Derived``, there needs to be a
- ``classof`` in ``Derived`` which will accept an object of type ``Base``.
- To be concrete, consider the following code:
- .. code-block:: c++
-     Shape *S = ...;
-     if (isa<Circle>(S)) {
-       /* do something ... */
-     }
- The ``isa<>`` test in this code will eventually boil
- down---after template instantiation and some other machinery---to a
- check roughly like ``Circle::classof(S)``. For more information, see
- :ref:`classof-contract`.
- The argument to ``classof`` should always be an *ancestor* class because
- the implementation has logic to allow and optimize away
- upcasts/up-``isa<>``'s automatically. It is as though every class
- ``Foo`` automatically has a ``classof`` like:
- .. code-block:: c++
-     class Foo {
-       [...]
-       template <class T>
-       static bool classof(const T *,
-                           typename ::std::enable_if<
-                             ::std::is_base_of<Foo, T>::value
-                           >::type* = nullptr) { return true; }
-       [...]
-     };
- Note that this is the reason that we did not need to introduce a
- ``classof`` into ``Shape``: all relevant classes derive from ``Shape``,
- and ``Shape`` itself is abstract (has no entry in the ``Kind`` enum),
- so this notional inferred ``classof`` is all we need. See `Concrete
- Bases and Deeper Hierarchies`_ for more information about how to extend
- this example to more general hierarchies.
- Although for this small example setting up LLVM-style RTTI seems like a lot
- of "boilerplate", if your classes are doing anything interesting then this
- will end up being a tiny fraction of the code.
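- The steps above can be pulled together into one self-contained sketch. So
- that it compiles without the LLVM headers, it uses simplified stand-ins,
- ``sketch::isa`` and ``sketch::dyn_cast``, which are assumptions of this
- example and not the real templates from ``llvm/Support/Casting.h``. The
- ``computeArea()`` bodies, the virtual destructor, and the ``describe()``
- and ``demo()`` helpers are likewise illustrative.

```cpp
#include <cassert>
#include <string>

class Shape {
public:
  /// Discriminator for LLVM-style RTTI.
  enum ShapeKind { SK_Square, SK_Circle };

private:
  const ShapeKind Kind;

public:
  ShapeKind getKind() const { return Kind; }
  Shape(ShapeKind K) : Kind(K) {}
  virtual ~Shape() = default;
  virtual double computeArea() = 0;
};

class Square : public Shape {
  double SideLength;

public:
  Square(double S) : Shape(SK_Square), SideLength(S) {}
  double computeArea() override { return SideLength * SideLength; }
  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
};

class Circle : public Shape {
  double Radius;

public:
  Circle(double R) : Shape(SK_Circle), Radius(R) {}
  double computeArea() override { return 3.14159265358979 * Radius * Radius; }
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};

// Hypothetical, heavily simplified stand-ins for llvm::isa<> and
// llvm::dyn_cast<>. The real templates also handle references, upcasts,
// const-correctness and more.
namespace sketch {
template <typename To, typename From> bool isa(const From *Val) {
  return To::classof(Val); // roughly what isa<To>(Val) boils down to
}
template <typename To, typename From> To *dyn_cast(From *Val) {
  return To::classof(Val) ? static_cast<To *>(Val) : nullptr;
}
} // namespace sketch

// Clients can also switch over the public enum directly.
std::string describe(const Shape &S) {
  switch (S.getKind()) {
  case Shape::SK_Square:
    return "square";
  case Shape::SK_Circle:
    return "circle";
  }
  return "unknown";
}

// Usage: a Circle viewed through a Shape pointer.
void demo() {
  Circle C(1.0);
  Shape *S = &C;
  assert(sketch::isa<Circle>(S));
  assert(!sketch::isa<Square>(S));
  assert(sketch::dyn_cast<Circle>(S) == &C);
  assert(sketch::dyn_cast<Square>(S) == nullptr);
}
```

- In real code you would include ``llvm/Support/Casting.h`` and call
- ``isa<>`` and ``dyn_cast<>`` directly; only the ``classof`` functions are
- the class author's responsibility.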
- Concrete Bases and Deeper Hierarchies
- =====================================
- For concrete bases (i.e. non-abstract interior nodes of the inheritance
- tree), the ``Kind`` check inside ``classof`` needs to be a bit more
- complicated. The situation differs from the example above in that:
- * Since the class is concrete, it must itself have an entry in the ``Kind``
- enum because it is possible to have objects with this class as a dynamic
- type.
- * Since the class has children, the check inside ``classof`` must take them
- into account.
- Say that ``SpecialSquare`` and ``OtherSpecialSquare`` derive
- from ``Square``, and so ``ShapeKind`` becomes:
- .. code-block:: c++
-     enum ShapeKind {
-       SK_Square,
-    +  SK_SpecialSquare,
-    +  SK_OtherSpecialSquare,
-       SK_Circle
-     };
- Then in ``Square``, we would need to modify the ``classof`` like so:
- .. code-block:: c++
-    -  static bool classof(const Shape *S) {
-    -    return S->getKind() == SK_Square;
-    -  }
-    +  static bool classof(const Shape *S) {
-    +    return S->getKind() >= SK_Square &&
-    +           S->getKind() <= SK_OtherSpecialSquare;
-    +  }
- The reason that we need to test a range like this instead of just equality
- is that a ``SpecialSquare`` and an ``OtherSpecialSquare`` each "is-a"
- ``Square``, and so ``classof`` needs to return ``true`` for them.
- This approach can be made to scale to arbitrarily deep hierarchies. The
- trick is that you arrange the enum values so that they correspond to a
- preorder traversal of the class hierarchy tree. With that arrangement, all
- subclass tests can be done with two comparisons as shown above. If you just
- list the class hierarchy like a list of bullet points, you'll get the
- ordering right::
- | Shape
- |   Square
- |     SpecialSquare
- |     OtherSpecialSquare
- |   Circle
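- The range check can be sketched end to end as follows. The text above does
- not say how ``SpecialSquare`` hands its own ``Kind`` up to ``Shape``; the
- protected ``Square(ShapeKind, double)`` constructor used here is one
- plausible way to do it and is an assumption of this sketch, as are the
- accessors and the ``demo()`` helper. ``computeArea()`` is left out to keep
- the example short.

```cpp
#include <cassert>

class Shape {
public:
  // Enumerators listed in preorder: Square, its subclasses, then Circle.
  enum ShapeKind {
    SK_Square,
    SK_SpecialSquare,
    SK_OtherSpecialSquare,
    SK_Circle
  };

private:
  const ShapeKind Kind;

public:
  ShapeKind getKind() const { return Kind; }
  Shape(ShapeKind K) : Kind(K) {}
};

class Square : public Shape {
  double SideLength;

protected:
  // Assumed helper: lets subclasses forward their own Kind to Shape.
  Square(ShapeKind K, double S) : Shape(K), SideLength(S) {}

public:
  Square(double S) : Shape(SK_Square), SideLength(S) {}
  double getSideLength() const { return SideLength; }

  // True for Square itself and for everything below it in the hierarchy.
  static bool classof(const Shape *S) {
    return S->getKind() >= SK_Square &&
           S->getKind() <= SK_OtherSpecialSquare;
  }
};

class SpecialSquare : public Square {
public:
  SpecialSquare(double S) : Square(SK_SpecialSquare, S) {}
  static bool classof(const Shape *S) {
    return S->getKind() == SK_SpecialSquare;
  }
};

class OtherSpecialSquare : public Square {
public:
  OtherSpecialSquare(double S) : Square(SK_OtherSpecialSquare, S) {}
  static bool classof(const Shape *S) {
    return S->getKind() == SK_OtherSpecialSquare;
  }
};

class Circle : public Shape {
  double Radius;

public:
  Circle(double R) : Shape(SK_Circle), Radius(R) {}
  double getRadius() const { return Radius; }
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};

// Usage: a SpecialSquare is-a Square, a Circle is not.
void demo() {
  SpecialSquare SS(1.0);
  Circle C(2.0);
  assert(Square::classof(&SS));
  assert(!Square::classof(&C));
}
```

- Leaf classes keep the plain equality test; only ``Square``, which has
- children, needs the two-comparison range.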
- A Bug to be Aware Of
- --------------------
- The example just given opens the door to bugs where the ``classof``\s are
- not updated to match the ``Kind`` enum when adding (or removing) classes to
- (from) the hierarchy.
- Continuing the example above, suppose we add a ``SomewhatSpecialSquare`` as
- a subclass of ``Square``, and update the ``ShapeKind`` enum like so:
- .. code-block:: c++
-     enum ShapeKind {
-       SK_Square,
-       SK_SpecialSquare,
-       SK_OtherSpecialSquare,
-    +  SK_SomewhatSpecialSquare,
-       SK_Circle
-     };
- Now, suppose that we forget to update ``Square::classof()``, so it still
- looks like:
- .. code-block:: c++
-     static bool classof(const Shape *S) {
-       // BUG: Returns false when S->getKind() == SK_SomewhatSpecialSquare,
-       // even though SomewhatSpecialSquare "is a" Square.
-       return S->getKind() >= SK_Square &&
-              S->getKind() <= SK_OtherSpecialSquare;
-     }
- As the comment indicates, this code contains a bug. A straightforward and
- non-clever way to avoid this is to introduce an explicit ``SK_LastSquare``
- entry in the enum when adding the first subclass(es). For example, we could
- rewrite the example at the beginning of `Concrete Bases and Deeper
- Hierarchies`_ as:
- .. code-block:: c++
-     enum ShapeKind {
-       SK_Square,
-    +  SK_SpecialSquare,
-    +  SK_OtherSpecialSquare,
-    +  SK_LastSquare,
-       SK_Circle
-     };
-     ...
-     // Square::classof()
-    -  static bool classof(const Shape *S) {
-    -    return S->getKind() == SK_Square;
-    -  }
-    +  static bool classof(const Shape *S) {
-    +    return S->getKind() >= SK_Square &&
-    +           S->getKind() <= SK_LastSquare;
-    +  }
- Then, adding new subclasses is easy:
- .. code-block:: c++
-     enum ShapeKind {
-       SK_Square,
-       SK_SpecialSquare,
-       SK_OtherSpecialSquare,
-    +  SK_SomewhatSpecialSquare,
-       SK_LastSquare,
-       SK_Circle
-     };
- Notice that ``Square::classof`` does not need to be changed.
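- One way to see that the marker does its job is to reduce the ``Kind`` test
- to a ``constexpr`` function and check it at compile time. The function name
- ``isSquareKind``, the ``static_assert``\s, and ``demo()`` are assumptions
- of this sketch and are not part of LLVM.

```cpp
#include <cassert>

enum ShapeKind {
  SK_Square,
  SK_SpecialSquare,
  SK_OtherSpecialSquare,
  SK_SomewhatSpecialSquare, // added later; the range test below is untouched
  SK_LastSquare,            // marker only, no class uses this Kind
  SK_Circle
};

// The Kind test from Square::classof(), pulled out as a free function
// (isSquareKind is a name invented for this sketch).
constexpr bool isSquareKind(ShapeKind K) {
  return K >= SK_Square && K <= SK_LastSquare;
}

// Compile-time checks that the new subclass is covered and Circle is not.
static_assert(isSquareKind(SK_SomewhatSpecialSquare),
              "new Square subclass must fall inside the Square range");
static_assert(!isSquareKind(SK_Circle),
              "Circle must fall outside the Square range");

// Runtime usage, equivalent to the static_asserts above.
void demo() {
  assert(isSquareKind(SK_Square));
  assert(isSquareKind(SK_OtherSpecialSquare));
  assert(!isSquareKind(SK_Circle));
}
```

- Note that the marker value itself also passes the range test, so no object
- should ever be constructed with ``SK_LastSquare`` as its ``Kind``.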
- .. _classof-contract:
- The Contract of ``classof``
- ---------------------------
- To be more precise, let ``classof`` be inside a class ``C``. Then the
- contract for ``classof`` is "return ``true`` if the dynamic type of the
- argument is-a ``C``". As long as your implementation fulfills this
- contract, you can tweak and optimize it as much as you want.
- For example, LLVM-style RTTI can work fine in the presence of
- multiple-inheritance by defining an appropriate ``classof``.
- An example of this in practice is
- `Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_ vs.
- `DeclContext <http://clang.llvm.org/doxygen/classclang_1_1DeclContext.html>`_
- inside Clang.
- The ``Decl`` hierarchy is done very similarly to the example setup
- demonstrated in this tutorial.
- The key part is how to then incorporate ``DeclContext``: all that is needed
- is in ``bool DeclContext::classof(const Decl *)``, which asks the question
- "Given a ``Decl``, how can I determine if it is-a ``DeclContext``?".
- It answers this with a simple switch over the set of ``Decl`` "kinds",
- returning ``true`` for the ones that are known to be ``DeclContext``\s.
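- The same idea can be sketched on a toy hierarchy. Every name below
- (``Node``, ``Scope``, ``Leaf``, ``Block``, ``Module``, ``asScope``) is
- invented for illustration and is not Clang's API. ``Node`` plays the role
- of ``Decl`` and ``Scope`` the role of ``DeclContext``; Clang's actual
- casting support for ``DeclContext`` is more involved than this.

```cpp
#include <cassert>

class Node {
public:
  enum NodeKind { NK_Leaf, NK_Block, NK_Module };

private:
  const NodeKind Kind;

public:
  NodeKind getKind() const { return Kind; }

protected:
  Node(NodeKind K) : Kind(K) {}
};

// A second base class that sits outside the Node hierarchy.
class Scope {
  unsigned NumMembers = 0;

public:
  void addMember() { ++NumMembers; }
  unsigned size() const { return NumMembers; }

  // "Given a Node, is it-a Scope?", answered by a switch over the kinds.
  static bool classof(const Node *N) {
    switch (N->getKind()) {
    case Node::NK_Block:
    case Node::NK_Module:
      return true;
    case Node::NK_Leaf:
      return false;
    }
    return false;
  }
};

class Leaf : public Node {
public:
  Leaf() : Node(NK_Leaf) {}
  static bool classof(const Node *N) { return N->getKind() == NK_Leaf; }
};

class Block : public Node, public Scope {
public:
  Block() : Node(NK_Block) {}
  static bool classof(const Node *N) { return N->getKind() == NK_Block; }
};

class Module : public Node, public Scope {
public:
  Module() : Node(NK_Module) {}
  static bool classof(const Node *N) { return N->getKind() == NK_Module; }
};

// Hypothetical cross-cast helper. Node and Scope are unrelated types, so the
// cast has to go through the concrete class that inherits from both.
Scope *asScope(Node *N) {
  switch (N->getKind()) {
  case Node::NK_Block:
    return static_cast<Block *>(N);
  case Node::NK_Module:
    return static_cast<Module *>(N);
  case Node::NK_Leaf:
    return nullptr;
  }
  return nullptr;
}

// Usage: a Block reached through a Node pointer is also a Scope.
void demo() {
  Block B;
  Node *N = &B;
  assert(Scope::classof(N));
  assert(asScope(N) == static_cast<Scope *>(&B));
}
```

- None of these classes has a virtual function, which also illustrates the
- earlier point that LLVM-style RTTI does not depend on v-tables.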
- .. TODO::
- Touch on some of the more advanced features, like ``isa_impl`` and
- ``simplify_type``. However, those two need reference documentation in
- the form of doxygen comments as well. We need the doxygen so that we can
- say "for full details, see http://llvm.org/doxygen/..."
- Rules of Thumb
- ==============
- #. The ``Kind`` enum should have one entry per concrete class, ordered
- according to a preorder traversal of the inheritance tree.
- #. The argument to ``classof`` should be a ``const Base *``, where ``Base``
- is some ancestor in the inheritance hierarchy. The argument should
- *never* be a derived class or the class itself: the template machinery
- for ``isa<>`` already handles this case and optimizes it.
- #. For each class in the hierarchy that has no children, implement a
- ``classof`` that checks only against its ``Kind``.
- #. For each class in the hierarchy that has children, implement a
- ``classof`` that checks a range of ``Kind`` values: from the class's own
- ``Kind`` (or, for an abstract class, its first child's ``Kind``) through
- the last descendant's ``Kind``.