EVL: A framework for multi-methods in C++


Science of Computer Programming 98 (2015) 531–550

Contents lists available at ScienceDirect

Science of Computer Programming www.elsevier.com/locate/scico

Yannick Le Goc (a), Alexandre Donzé (b,∗)

(a) Institut Laue-Langevin, 6 rue Jules Horowitz, BP 156, 38042 Grenoble Cedex 9, France
(b) University of California, Berkeley, EECS Department, Cory Hall 545S, 94720 Berkeley, CA, USA

Article info

Article history:
Received 3 April 2012
Received in revised form 9 August 2014
Accepted 11 August 2014
Available online 4 September 2014

Keywords: Multi-methods; Multiple dispatch; Runtime type information; Object-oriented programming; C++

Abstract

Multi-methods are functions whose calls at runtime are resolved depending on the dynamic types of more than one argument. They are useful for common programming problems. However, while many languages provide different mechanisms to implement them in one way or another, there is still, to the best of our knowledge, no library or language feature that handles them in a general and flexible way. In this paper, we present the EVL (Extended Virtual function Library) framework which provides a set of classes in C++ aiming at solving this problem. The EVL framework provides a generalization of virtual function dispatch through the number of dimensions and the selection of the function to invoke using a so-called Function Comparison Operator. Our library provides both symmetric and asymmetric dispatch algorithms that can be refined by the programmer to include criteria other than class inheritance. For instance, the EVL framework provides multi-methods with predicate dispatch by defining a dedicated FCO based not only on the dynamic types of the arguments but also on their values. This flexibility greatly helps to resolve ambiguities without having to define new functions. Our multi-methods also unify dispatch tables and caching by introducing cache strategies for which the implementation is a balance between memory and speed. To define multi-methods in C++, we implement a non-intrusive reflection library providing fast dynamic casting and supporting dynamic class loading. Our multi-methods are policy-based class templates that support virtual but not repeated inheritance. They check the type compatibility of functions at compile-time, preserve type-safety and resolve function calls at runtime by invoking the cache or updating it by computing the selected function for the requested tuple of types. By default, our multi-methods handle dispatch errors at runtime by throwing exceptions but an error-code strategy can be set up by defining a dedicated policy class. Performance of our multi-methods is comparable with that of standard virtual functions when configured with a fast cache.

© 2014 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail addresses: [email protected] (Y. Le Goc), [email protected], [email protected] (A. Donzé).

http://dx.doi.org/10.1016/j.scico.2014.08.003
0167-6423/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

In object-oriented programming (OOP), it is common to define some member functions as virtual to allow them to be overridden by differing concrete implementations in derived classes; a call to a virtual function is then resolved to one of its concrete implementations at runtime based on the dynamic type of the object on which the call is made. Multiple dispatch can be seen as an extension of this selection mechanism, where the concrete implementation to use at runtime depends on the dynamic type of more than one argument. It has been implemented in different languages with various terminologies. In CLOS [1] and Dylan [2], generic functions implement multiple dispatch. In Cecil [3] and MultiJava [4], such method families are called multi-methods. The most popular object-oriented languages, including C++ [5] and Java [6], provide a built-in single dispatch mechanism implemented by class member functions (virtual functions in C++, non-static methods in Java), but do not support the runtime resolution of functions that depend on the dynamic type of more than one argument. Potential applications of multiple dispatch are (see [7] for more examples):

• The traversal of a graph data structure where nodes are instances of different polymorphic classes without defining a process function in each node class;

• The definition of an event handler for a widget toolkit by defining functions depending on two dynamic arguments (event, widget);

• The definition of binary comparison operators.

Each example has its own specifics. The traversal example can be found in 3D image synthesis, where scenes are modeled with an OOP data structure called a scene graph and for which a number of different rendering algorithms (defined in the process functions) can be used. The well-known visitor pattern [8], which makes use of common features of object-oriented languages, is functional and relatively convenient, but it introduces additional dependencies between the node class to process and the visitor class. The event handler and the binary operator examples are typical problems requiring the use of double dispatch. As an example of a triple dispatcher, one can consider the event handler problem with an additional polymorphic argument (its parameter type is polymorphic) representing a context state.

Multiple dispatch is an advanced topic in OOP. As a consequence, training programmers to use such a tool is a problem by itself; this was addressed, e.g., by Chambers in [9] but is still an ongoing concern. Despite the fact that the concept has existed for years, there are still many remaining issues with multiple dispatch and many ways of defining it; it is also not clear how popular it is among practitioners. Muschevici et al. [10] present a cross-language comparison of how multiple dispatch is used in programs. By defining code metrics, they are able to provide statistics over applications coded in different languages. The results tend to show that when multi-methods are available in a language, they are used. We can thus infer that once multi-methods are implemented in C++, they will be used. Moreover, Muschevici et al. studied the potential use of multi-methods in the Java language. To do so, they gathered statistics on the use of explicit multiple dispatch by implementations of the visitor pattern and implicit multiple dispatch by cascaded use of the instanceof operator.
The results showed that cascaded instanceof operators are more often used than the visitor pattern. The first issue for multiple dispatch is the resolution of ambiguities when looking for the best matching function, i.e., when more than one function matches but it is not possible to decide which one to call. In C++, it is usual to encounter this situation. For instance, if we define two non-member overloaded functions foo with a single argument of type float for the first function and double for the second function and we try to apply foo to an int, we get an ambiguous call error at compile-time. When multiple dispatch is defined as an extension of single dispatch, the natural idea is to have a compiler that provides compilation errors when it detects ambiguities. However, detecting ambiguities with multi-methods is not as easy as for single dispatch functions, which are defined in the scope of a class hierarchy. It may not even be possible for the compiler to check for ambiguities at compile-time, since it has to check all possible tuples of types, but their exhaustive list is only available at link-time. In the presence of dynamic class loading, this list can even vary at runtime. Furthermore, it can be very large, increasing the probability of ambiguities. Also, it is legitimate and plausible to define multi-methods for which ambiguous tuples of types exist but are intentionally never used together. An a priori systematic detection of possible ambiguities would forbid this use case. A common approach to ambiguity resolution is the definition of additional functions that forward their calls to the already defined functions. The minimal set of additional functions to define can be computed [11] but the number of functions to define can remain large. The resolution can be either symmetric [9,7] or asymmetric [1]. 
Compared to asymmetric resolution (e.g., arguments are taken in lexicographic order), symmetric resolution (all arguments have the same “weight”) seems more natural in C++ because it conforms to its function overload resolution rules. However, symmetric resolution generates more ambiguities, and programmers may find it easier to reason about the resolution argument by argument.

Various approaches have been used to incorporate multiple dispatch in OOP languages. Some languages natively support it, like CLOS [1], Dylan [2] and Cecil [3]. For others, it comes as an extension, such as MultiJava [4] for Java. In C++, Open Multi-Methods [7] were proposed as an extension of virtual functions to more general multi-methods, but have not yet been submitted for inclusion in the standard. An alternative approach is to implement a library based on existing language features. In Java, JMMF [12] is such a library, based on Java’s reflection mechanism. In the Python language [13], several modules can be imported, and in C++, the Cmm library [14] uses pre-processor definitions and a pre-compiler to integrate multiple dispatch. By contrast, the library that we propose is based on the C++ standard only. Several other C++ libraries have been developed prior to ours, such as Loki [15] or DoubleCpp [16]. In Section 5.2, we provide a more detailed comparison of our work with existing approaches and implementations, in terms of features and performance.

One point to be discussed is where to define functions of multi-methods. In [7], they are non-member functions automatically handled by a special linker which constructs the family of compatible callable functions, hence their qualification as open. This provides easy extensibility and less verbosity. The problem with such global functions is that they can be found by the linker in different translation units even if those have nothing in common, leading to unexpected dispatches.
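The compile-time ambiguity mentioned earlier in this introduction (two overloads of foo taking float and double, applied to an int) can be reproduced with a few lines of standard C++. The sketch below is only an illustration: the helper call_foo_with_int is a hypothetical name introduced here, and the ambiguous call is left commented out so that the file still compiles.

```cpp
#include <cassert>
#include <string>

// Two non-member overloads, as in the example from the text.
std::string foo(float)  { return "float"; }
std::string foo(double) { return "double"; }

// foo(1);  // would not compile: int -> float and int -> double are both
//          // conversions of the same rank, so the call is ambiguous

// Hypothetical helper: the caller removes the ambiguity by choosing
// the target overload explicitly.
std::string call_foo_with_int(int i) {
    return foo(static_cast<double>(i));
}
```

The same situation arises at runtime with multi-methods, except that the candidate set is only known once the dynamic types of the arguments are known.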


Another issue concerns the definition of multi-methods for a contextual processing of objects. If a multi-method is defined outside of any class, the context must be passed as a parameter, which can be cumbersome. On the contrary, if this multi-method is encapsulated in a class, i.e., the set of functions participating in the resolution belongs to a class [17], the context can be encapsulated in this same class and made accessible to the function definitions as a private member. Moreover, it provides more control over the visibility of the functions, although it is more constrained and does not allow easy extensibility. Finally, Chambers [9] asks whether multi-methods defined outside the scope of a class are object-oriented.

In this paper, we propose a C++ multi-methods framework called EVL for “Extended Virtual functions Library”. We take advantage of the library approach as opposed to a language evolution approach to provide highly configurable multi-method classes. We first present a generalization of the function selection algorithm by using a redefinable function comparison operator (FCO). The operator is based on the distance (as measured by a configurable metric) between tuples of types in the subclass tree and optional additional information. The goal of this generalization is to propose a solution that can easily implement either the asymmetric or the symmetric resolution of the polymorphic arguments, or even a mix of both for multiple dispatch of more than two dimensions, or arbitrarily refined and specific resolution strategies depending on the programmer’s needs and choices. Then we present our cache strategy that involves no more than an associative container filled at runtime. The advantage is that the cache can be implemented as a standard dispatch table by a vector of functions as well as a bounded hash map to reduce the memory footprint.
In principle, more complex compressed dispatch tables [18] as well as caching algorithms such as page replacement algorithms [19] can also be implemented.

We provide a complete and functional implementation of multiple dispatch for C++ based on the C++ 2003 standard. Our multi-methods are runtime objects that record functions and dynamically resolve their application. They are defined by class templates that preserve type safety through strong typing. Only functions compatible with the defined prototype of the multi-method can be recorded without compilation errors. They also provide policies for FCO, cache and error handling.

To realize the library implementation of the function selection algorithms based on tuples of types, we need inheritance relations at runtime. As the C++ standard does not specify runtime reflection and no existing library met our criteria, we defined a practical reflection mechanism using rules that make it as simple as possible for the programmer to declare types. Although the reflection part of the EVL framework is not our primary focus, we believe that it constitutes a contribution by itself, as it leads to the definition and implementation of interesting features such as fast dynamic casting [20] and simple rich pointers [21] (a variant of smart pointers that do not provide memory management but type information facilities).

To summarize, the contributions of this work include:

1. A highly flexible multi-method framework including
   • the conceptualization of customizable function comparison operators to define any dispatch function selection algorithm, or refine existing ones (symmetric, asymmetric). FCOs can be used to resolve ambiguities and implement predicate dispatch,
   • a customizable cache strategy to improve the performance of a multi-method (speed, memory),
   • a customizable error strategy to support environments where exceptions cannot be thrown.
2. A fully-functional C++ library implementation which
   • is C++03 standard compliant, macro-free and performant,
   • provides dynamic class loading support,
   • implements and uses a partial runtime reflection library that is non-intrusive and provides fast dynamic casting.

The rest of the paper is organized as follows. In Section 2, we formalize the concepts of multi-methods and function comparison operators. In Section 3, we present our implementation of reflection in C++ to construct inheritance graphs, a prerequisite to multiple dispatch. In Section 4, we describe the implementation of multi-methods and how the notions introduced in the earlier sections are integrated into the EVL framework. Finally we discuss performance issues and related work, and present experimental results in Section 5 before concluding in Section 6.

2. Multi-methods formalization

In this section, we formalize the multi-method notion and function comparison operators (FCOs), for which we will use the mathematical notation $<^{\mathrm{fco}}$.
A multi-method $m$ is defined by the following components:

• $bp$ is a triplet $(r, t, p)$, called the base prototype, where $r$ is the return type, $t$ a tuple of virtual parameter types and $p$ a tuple of any types. The tuple $t$ has size $N$, called the number of dimensions of the multi-method. The tuple $p$ has size $M$. For a tuple of arguments $a = (a_1, \cdots, a_N, a_{N+1}, \cdots, a_{N+M})$, we let $\bar a$ denote the tuple of dynamic types of $(a_1, \cdots, a_N)$, corresponding to the virtual parameter types;
• $D$ is a set of functions $\{f\}$ such that every $f \in D$ has a prototype compatible with $bp$:
  – The first $N$ parameter types of $f$, called virtual parameter types and denoted $\bar f = (\bar f_1, \bar f_2, \ldots, \bar f_N)$, are polymorphic and covariant to the corresponding $N$ types in $t$, i.e., $\bar f_1 < t_1$, $\bar f_2 < t_2$, etc.¹;
  – The remaining parameter types of $f$, called nonvirtual parameter types, are equal to $p$;
  – The return type of $f$ is covariant to $r$.
• $d$ is a map which associates an additional attribute of type data² to each function of $D$;
• $<^{\mathrm{fco}}$ […]


$$ m(a) = f^{*}(\hat a), $$

where $f^{*}$ is the minimal element of $D_{\bar a}$ for the order $<^{\mathrm{fco}}$ […]
• Functions of D can have different names and be non-members or members of a class. In the latter case, they must be bound to a calling object making them equivalent to a non-member function. We call them overriders of m [7].

• Type-safety is preserved. The set $D_{\bar a}$ contains only functions $f$ for which $\mathrm{cast}_{\bar f}(a)$ succeeds.

• The invocation of $m$ fails if
  – there is no minimal element for $<^{\mathrm{fco}}$ […]

In the above definition of our multi-methods, the $<^{\mathrm{fco}}$ […]

$$ f <^{\mathrm{fco}} g = \begin{cases} \text{true} & \text{if } \bar f < \bar g, \\ \text{false} & \text{otherwise.} \end{cases} $$

Recall that […]

¹ The special type all can be used to relax the covariance constraint. E.g., $t = (A, B)$ requires $\bar f_1 \le A$ and $\bar f_2 \le B$; $t = (A, \mathrm{all})$ only requires $\bar f_1 \le A$.
² There is no attribute when data is void.


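2.3. Distance-based function comparison operators

We first introduce a distance between types. Let $u$ and $t$ be polymorphic types such that $u < t$, then the distance $d = \sigma(u, t)$ between $u$ and $t$ is defined as the length of the longest class derivation path

$$ u < t_1 < \cdots < t_{d-1} < t. $$

Using $\sigma$, we can compare $f \in D_{\bar a}$ and $g \in D_{\bar a}$ by comparing

$$ \sigma_{\bar a}(f) \triangleq \big(\sigma(\bar a_1, \bar f_1), \cdots, \sigma(\bar a_N, \bar f_N)\big) \in \mathbb{N}^N $$

with

$$ \sigma_{\bar a}(g) \triangleq \big(\sigma(\bar a_1, \bar g_1), \cdots, \sigma(\bar a_N, \bar g_N)\big) \in \mathbb{N}^N. $$

Note that $\sigma_{\bar a}(f)$ and $\sigma_{\bar a}(g)$ are well defined by construction of $D_{\bar a}$. We propose three FCOs, based on lexicographic order, product order and 1-norm of tuples. The lexicographic FCO is defined as:

$$ f <^{\mathrm{lex}} g \iff \sigma_{\bar a}(f) <_{\mathrm{lex}} \sigma_{\bar a}(g) \iff \sigma(\bar a_1, \bar f_1) < \sigma(\bar a_1, \bar g_1) \ \text{or}\ \exists k > 1,\ \forall i < k,\ \sigma(\bar a_i, \bar f_i) = \sigma(\bar a_i, \bar g_i) \text{ and } \sigma(\bar a_k, \bar f_k) < \sigma(\bar a_k, \bar g_k). $$

The product FCO is defined as:

$$ f <^{p} g \iff \sigma_{\bar a}(f) <_{p} \sigma_{\bar a}(g) \iff \forall i,\ \sigma(\bar a_i, \bar f_i) \le \sigma(\bar a_i, \bar g_i) \text{ and } \exists j,\ \sigma(\bar a_j, \bar f_j) < \sigma(\bar a_j, \bar g_j). $$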
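The lexicographic and product comparisons of distance tuples can be sketched in a few lines of C++. This is a minimal illustration, not the library's code: the names distances, less_lexicographic and less_product are hypothetical, and a distance tuple is simply modeled as a vector of unsigned integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A distance tuple sigma_a(f): one distance per virtual parameter.
// (Hypothetical type name, introduced for this sketch only.)
typedef std::vector<unsigned> distances;

// Lexicographic order: the first differing component decides.
bool less_lexicographic(const distances& f, const distances& g) {
    assert(f.size() == g.size());
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] != g[i]) return f[i] < g[i];
    }
    return false;  // equal tuples are not strictly ordered
}

// Product order: no component is larger and at least one is strictly smaller.
bool less_product(const distances& f, const distances& g) {
    assert(f.size() == g.size());
    bool strictly_smaller_somewhere = false;
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] > g[i]) return false;
        if (f[i] < g[i]) strictly_smaller_somewhere = true;
    }
    return strictly_smaller_somewhere;
}
```

For the tuples (0, 1) and (1, 0), the lexicographic comparison ranks the first tuple below the second, whereas the product comparison orders them in neither direction, which corresponds to an ambiguity.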

The lexicographic FCO does not produce ambiguities (except in case of multiple inheritance) but it gives more importance to the first elements than to the last elements of the tuple. On the other hand, the product FCO is more “democratic” since it will only consider a tuple to be “inferior” to another tuple if all its elements are “inferior”. To define the lexicographic and the product FCOs, we do not need the actual distance between types – the inheritance relation would be enough. However, this additional information can be used to define new relations between tuples of types. For example, we can define a 1-norm FCO as:

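$$ f <^{1} g \iff \sigma_{\bar a}(f) <_{1} \sigma_{\bar a}(g) \iff \sum_{i=1}^{N} \sigma(\bar a_i, \bar f_i) < \sum_{i=1}^{N} \sigma(\bar a_i, \bar g_i). $$

Since we have the implications

$$ \sigma_{\bar a}(f) <_{p} \sigma_{\bar a}(g) \implies \sigma_{\bar a}(f) <_{\mathrm{lex}} \sigma_{\bar a}(g), $$

$$ \sigma_{\bar a}(f) <_{p} \sigma_{\bar a}(g) \implies \sigma_{\bar a}(f) <_{1} \sigma_{\bar a}(g), $$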

the 1-norm and the lexicographic FCOs can be seen as a product FCO for which we provide an additional disambiguation rule (sum of distances for 1-norm and order of arguments for lexicographic). The 1-norm FCO reduces ambiguities (without eliminating them) but can be less intuitive in practice. Finally, note that the lexicographic FCO is an example of asymmetric resolution and the product FCO is an example of symmetric resolution.

2.4. Refining function comparison operators

A distance relation might not be enough to define a proper FCO, meaning that using only distance information can still yield ambiguities. They can be resolved either by adding the necessary overriders or by refining the FCO with additional rules. The refinement of an FCO $<^{\mathrm{fco}}$ […]

$$ f <^{\mathrm{refined}} g = \begin{cases} \dots & \text{if } f \dots, \\ \dots & \text{otherwise.} \end{cases} $$

The library makes it possible to attach data to each function or use extra type information that the implementation of the FCO can use (through the d component of the multi-method). Thus the natural way to define a proper FCO that resolves ambiguities is as follows:


Fig. 1. Illustration of the overriders multiplication problem. To resolve the ambiguity with ( B , A 1 ), one would have to define k overriders ( A 1 , B ), · · · , ( A k , B ).

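1. Define […]

$$ f <^{\mathrm{refined}} g \iff \begin{cases} f <^{p} g & \text{when defined (no ambiguity with } <_{p}\text{)}, \\ \mathrm{priority}(f) > \mathrm{priority}(g) & \text{otherwise.} \end{cases} $$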
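A priority-based refinement of the product order can be sketched as follows. This is only an illustration under stated assumptions: the struct candidate, the functions less_refined and select_minimal, and the representation of priorities as plain integers are all hypothetical and are not taken from the EVL API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one candidate overrider: its distance tuple
// and a user-supplied priority (the data attached to the function).
struct candidate {
    std::vector<unsigned> sigma;
    int priority;
};

// Product order on distance tuples of equal length.
bool less_product(const std::vector<unsigned>& f, const std::vector<unsigned>& g) {
    assert(f.size() == g.size());
    bool strict = false;
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] > g[i]) return false;
        if (f[i] < g[i]) strict = true;
    }
    return strict;
}

// Refined comparison: use the product order when it decides,
// otherwise fall back to the higher priority.
bool less_refined(const candidate& f, const candidate& g) {
    if (less_product(f.sigma, g.sigma)) return true;
    if (less_product(g.sigma, f.sigma)) return false;
    return f.priority > g.priority;
}

// Index of the candidate that is below all the others, or -1 when
// there is none (empty set or remaining ambiguity).
int select_minimal(const std::vector<candidate>& cs) {
    for (std::size_t i = 0; i < cs.size(); ++i) {
        bool minimal = true;
        for (std::size_t j = 0; j < cs.size() && minimal; ++j) {
            if (j != i && !less_refined(cs[i], cs[j])) minimal = false;
        }
        if (minimal) return static_cast<int>(i);
    }
    return -1;
}
```

With two candidates whose distance tuples are (0, 1) and (1, 0), the product order alone is ambiguous; giving the second candidate a higher priority makes select_minimal choose it, and equal priorities leave the ambiguity in place.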

This theoretical example shows how flexibility in the definition of the FCO can resolve ambiguities without defining additional overriders.

2.5. From dispatch tables to cache strategies

It is natural with multi-methods whose function resolution only depends on the dynamic types, such as virtual functions, to speed up their invocation by generating a dispatch table containing the entries of all the possible combinations of tuples of types. The resolution for a tuple of types is then pre-calculated once, at compile-time. However, for dimensions greater than one, the use of a simple dispatch table may be neither the most appropriate nor the most efficient. The size of the dispatch table of a multi-method having all its virtual parameter types belonging to the same hierarchy is $C^N$, where $C$ is the number of classes of the hierarchy. For large class hierarchies, the size can clearly become a memory issue. Now suppose $C = 10$, $N = 3$ and that the multi-method will only be called on a subset of the class hierarchy of size 5. Then at most $5^3/10^3 = 12.5\%$ of the dispatch table will be used. That means that almost 90% of the dispatch table is calculated for nothing. Moreover, for multi-method implementations raising a compilation error when an ambiguity is found for a tuple of types during the dispatch table generation, this becomes an issue because it is more likely that programmers will have to resolve an ambiguity for a tuple of types they will never use.

Our implementation proposes a flexible cache mechanism. In EVL, the cache c of a multi-method is a container that associates tuples of types to functions. It is filled at runtime and thus contains only the subset of all the possible tuples of types that are actually used in the program. Moreover, it is possible to fill the container with all the possible tuples of types by “resolving” the multi-method at the beginning of the execution of the program. In that case, the cache becomes a dispatch table. The choice of the cache implementation is left to the programmer and should be made in consideration of the application memory and performance requirements. By default, multi-methods have a standard map-based cache.
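The idea of a cache that is filled on demand, and that turns into a dispatch table once every tuple has been resolved, can be sketched with a standard map. The code below is a simplified illustration with hypothetical names (type_pair, overrider, resolve, dispatch_cache); types are reduced to integer IDs and the FCO-based selection is replaced by a trivial stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-ins: a type is an integer ID and an overrider is a
// plain function pointer taking no arguments.
typedef std::pair<int, int> type_pair;  // dynamic types of two arguments
typedef int (*overrider)();

int generic_case() { return 0; }
int special_case() { return 1; }

// Stand-in for the (possibly expensive) FCO-based selection.
overrider resolve(const type_pair& types, int& resolve_count) {
    ++resolve_count;
    return (types.first == types.second) ? special_case : generic_case;
}

// A map-based cache filled at runtime: only the tuples that are
// actually requested get an entry.
class dispatch_cache {
public:
    dispatch_cache() : resolve_count_(0) {}

    overrider lookup(const type_pair& types) {
        std::map<type_pair, overrider>::iterator it = table_.find(types);
        if (it == table_.end()) {
            it = table_.insert(
                std::make_pair(types, resolve(types, resolve_count_))).first;
        }
        return it->second;
    }

    // Pre-resolving every tuple turns the cache into a dispatch table.
    void resolve_all(int type_count) {
        for (int i = 0; i < type_count; ++i)
            for (int j = 0; j < type_count; ++j)
                lookup(type_pair(i, j));
    }

    std::size_t size() const { return table_.size(); }
    int resolve_count() const { return resolve_count_; }

private:
    std::map<type_pair, overrider> table_;
    int resolve_count_;
};
```

A second lookup of the same tuple does not trigger a new resolution, and calling resolve_all(3) on a hierarchy of three types leaves exactly 3 × 3 = 9 entries in the table.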


Thus our cache mechanism lets the developer choose between:

1. A caching technique using a map-based implementation filled at runtime;
2. A dispatch table using a vector-based implementation filled at the beginning of the execution of the program by resolving the multi-method.

Other caching techniques can be used, e.g., the caching technique of [23] can be implemented by using an unbounded map such as a simple hash map, although its size can become an issue. It can also be implemented using a bounded map, in which cache replacement algorithms [19] can be used, e.g., FIFO, LRU, etc. Dispatch tables are expected to be faster, but also use more memory. However, to deal with the size and sparsity of the dispatch table, various compression algorithms have been proposed in the literature (see e.g. [18]).

In EVL, types are identified with small contiguous integer IDs (see Section 3.1). For each $\bar a_i$, $\mathrm{ID}(\bar a_i)$ is the contiguous integer ID attributed to $\bar a_i$ when it was first declared in EVL. The covariance constraint implies that the set of possible types for $\bar a_i$ in the context where $\bar a$ is used is a subset of all the declared types. Let $\min_i$ (resp. $\max_i$) be the minimum (resp. maximum) ID in this subset. The interval $[\min_i, \max_i]$ defines an interval of possible IDs for $\bar a_i$ of length $s_i = \max_i - \min_i + 1$, which can be much smaller than M. Let $pos_i$ be the position of $\bar a_i$ in this interval, i.e., $pos_i = \mathrm{ID}(\bar a_i) - \min_i$. Then the ID of the tuple $\bar a$ is defined as:

$$ \mathrm{ID}(\bar a) = \sum_{i=1}^{N} pos_i \times \prod_{j=i+1}^{N} s_j. \qquad (1) $$
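Equation (1) can be evaluated with a single loop using Horner's scheme. The sketch below assumes a hypothetical id_range structure holding the minimum and maximum ID of each dimension; these names are introduced here for illustration and do not come from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one dimension: the smallest and largest
// type ID that can occur for that argument.
struct id_range {
    int min_id;
    int max_id;
    int size() const { return max_id - min_id + 1; }  // s_i
};

// Row-major index of a tuple of type IDs, following equation (1):
// ID = sum_i pos_i * prod_{j>i} s_j, evaluated with Horner's scheme.
std::size_t tuple_id(const std::vector<int>& ids,
                     const std::vector<id_range>& ranges) {
    assert(ids.size() == ranges.size());
    std::size_t id = 0;
    for (std::size_t i = 0; i < ids.size(); ++i) {
        assert(ids[i] >= ranges[i].min_id && ids[i] <= ranges[i].max_id);
        std::size_t pos = static_cast<std::size_t>(ids[i] - ranges[i].min_id);
        id = id * static_cast<std::size_t>(ranges[i].size()) + pos;
    }
    return id;
}
```

For two dimensions with ID intervals [3, 5] and [10, 11], the sizes are 3 and 2, so the tuple of IDs (4, 11) has positions (1, 1) and maps to 1 × 2 + 1 = 3, while (5, 11) maps to the last index, 5.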

This is the natural address computation in a multi-dimensional array of size $s_1 \times \cdots \times s_N$. This ensures that each tuple of types $\bar a$ has a unique ID such that $0 \le \mathrm{ID}(\bar a) < \prod_{j=1}^{N} s_j = S$. The cache maps have integer keys for the tuples of types. We can define a lookup table by an array of size $S$. The IDs are then the indices of the array, providing de facto the most performant cache strategy.

As a final remark, let us note that caches can only be used for standard multi-methods, because there are potentially too many entries for general multi-methods with predicate dispatch. As a consequence, these multi-methods are much less efficient and should not be used in contexts where the time spent in function calls is significant.

3. Implementing reflection in C++

To implement the FCOs of our multi-methods, we need runtime reflection to compare tuples of types. Since there is no current standard implementation of reflection in C++ (apart from the limited RTTI operators) and we found no satisfying existing approach, we designed and implemented our own reflection library.

3.1. A new reflection library

We designed a runtime partial reflection library fulfilling the following features and requirements:

1. It provides automatic inheritance information;
2. It supports class templates;
3. It provides fast dynamic cast;
4. It provides contiguous integer IDs for reflected types;
5. It is non-intrusive;
6. It has an API as clean and simple as possible, in particular avoiding the use of macros or external parsing tools;
7. It offers the possibility to add user-definable information;
8. It is portable;
9. It supports dynamic class loading.

Different approaches to implement C++ reflection have been proposed: by letting the programmer provide information [24,25], by parsing sources with a pre-processor, generating type descriptors [26,27] or by parsing debugging information provided by a compiler. Knizhnik compared and implemented the first and last approaches [24]. The first approach is usually intrusive and requires the duplication of class information (inheritance, members, etc.), which is error-prone. The second approach does not require duplicating information but the compilation process must be done in two stages, making it tedious. Moreover, creating a C++ parser is a difficult task, especially because of class templates. Finally, the last approach is compiler-dependent.

Another approach is to automatically provide reflection information by using available information at runtime. This was done, e.g., for the implementation of a type switch library in C++ [28]. A type switch library shares some of our requirements above: objects must be compared using their dynamic types and fast dynamic cast must be provided to get an efficient implementation. The mechanism should be transparent to the programmer, because reflection, in this case as in ours, is a means, not the goal of the library. The technique is based on the introspection of objects with polymorphic types, in the scope of compilers implementing the “common vendor C++ ABI”.³ By comparing the addresses of the virtual table pointers of two objects, inheritance information and subobject offsets used for casts can be deduced. However, this solution is compiler-dependent, and it is not clear whether it is implementable for every compiler, thus violating our requirement for portability.

Our approach is based on the same idea, i.e., to automatically provide reflection information for inheritance relations and subobject offsets, but does not rely on compiler-specific features. Also, additional reflection information can be provided by the programmer. Our library is portable, but has the following limitations:

• Fast dynamic cast supports virtual inheritance but not repeated inheritance. A cast is only possible when the source type and the target type exist only once in the base classes of the dynamic type of the source object;

• Reflected types are declared when they appear in a template argument of an EVL class or function. Some base classes of a class may be missing;

• When a type is declared, a unique instance of the type is created and lives until the termination of the program.

Note that it is possible to cast objects containing repeated base classes to their dynamic type. But it will not be possible to cast to or from these repeated classes. Indeed, we do not take into account the derivation path of the source subobject. If base classes of a class are missing, it obviously does not change their inheritance relations, but reduces the type distances (see Section 2.3). A unique instance of a declared type is used to compare types in order to build an inheritance graph. When all the types have been declared, it could be possible to delete these objects. However, in case of dynamic class loading, they must be kept alive, as a new type can be declared at any time.

3.2. Class definitions, instantiations and fast dynamic cast

The requirements described in the previous section lead us to the definition of the main classes of the framework, object_class and object, which are similar to the Java classes Class and Object [6]. For each declared type in EVL, a unique instance of object_class is created, containing the base and derived class list represented by an object_class set. The programmer can manually add information relating to static const members, setters, getters and functions. A class A can be managed by the EVL framework if

• it is polymorphic;
• it is constructible or copyable if it is not abstract;
• a direct subclass that is default constructible is provided if it is abstract.

Our reflection library is not intrusive, in the sense that it is not necessary to make A inherit from the base class object. However, this is usually advantageous. To perform fast dynamic casts, subobject offset information given by the object_class instances is pre-calculated so that only lookup table reads are needed. In [20], Gibbs and Stroustrup propose a fast dynamic cast scheme also based on pre-calculated offsets. For classes inheriting from object, the object_class instance is immediately accessed, as the object class carries a pointer to the object_class. For other classes, a rich pointer represented by the object_ptr<> class template, which itself carries a pointer to the object_class, has to be used. Fast dynamic casts are performed by calling the dynamic_pointer_cast<> function template. This is illustrated in the following examples.

Example 2 (Fast dynamic cast without/with rich pointers). Suppose we have the default constructible classes A and B such that B inherits A and A inherits object. The following code shows how to instantiate and cast such objects:

    // create a pointer B converted into A
    A * a = object::make<B>();
    // fast dynamic cast from A to B
    B * b = dynamic_pointer_cast<B>(a);

Suppose now that we have the default constructible classes X and Y such that Y inherits X and X does not inherit object. To benefit from fast dynamic cast, we therefore need to instantiate objects using rich pointers of type object_ptr<>:

    // create a rich pointer Y converted into X
    object_ptr<X> x = object::make_ptr<Y>();
    // fast dynamic cast from X to Y
    object_ptr<Y> y = dynamic_pointer_cast<Y>(x);

³ http://mentorembedded.github.io/cxx-abi/.


Unlike the dynamic_cast<> operator, the dynamic_pointer_cast<> function template performs a dynamic cast in constant time, regardless of what the distance between the source and the target type in the class hierarchy is. The object_ptr<> class template is inspired by the Rich Pointers [21] proposal for standardization. With the object_class instance of the object dynamic type, we can immediately get a pre-computed offset to move the source pointer to the pointer of target type. See Section 3.4 for details on how offsets are computed. By instantiating the EVL class template object_ptr<> and function templates object::make<>, object::make_ptr<> and dynamic_pointer_cast<> with parameters A, B, X and Y , the object_class instances representing A, B, X and Y are automatically instantiated in the static initialization phase of the program. In Section 4.1 we will see that the insertion of a function into a multi-method also causes the creation of object_class. For dynamically loaded classes, the instantiation of the object_class is made in the static initialization phase of the library, i.e., during the execution of the main function of the program. Note, in our example, that if the class B also inherits another class C that is not declared here, then the list of base classes of B provided by its object_class only contains A, but for all the declared classes, we are able to reconstruct the partial inheritance graph at runtime. 3.3. Inheritance graph reconstruction In this section, we describe how the library performs the reconstruction of the object_class inheritance graph using the RTTI operators. It is mostly based on the definition of a member function is_base_of in object_class such that for two object_class instances a and b representing respectively the polymorphic types A and B we have, ideally:

\[
a.\mathtt{is\_base\_of}(b) \iff A \text{ is a base of } B. \tag{2}
\]

To show how the function is_base_of is implemented in object_class, we present the following simplified code to highlight the technique⁴:

struct object_class {
  virtual bool is_base_of(object_class *) = 0;
  object * instance;
};

template <typename T>
struct object_class_typed : object_class {
  object_class_typed() { instance = make_object(); }
  object * make_object();
  virtual bool is_base_of(object_class * ocU) {
    return this != ocU && dynamic_cast<T *>(ocU->instance) != 0;
  }
};

The instance member of object_class is the unique instance of the represented type. Let us first consider the case of two declared types A and B that have object as base class and are default constructible. The make_object function implementation is then trivial:

template <typename T>
object * object_class_typed<T>::make_object() {
  return new T();
}

We can compare A and B at runtime:

object_class * ocA = new object_class_typed<A>(); // calls ocA->instance = new A()
object_class * ocB = new object_class_typed<B>(); // calls ocB->instance = new B()
ocA->is_base_of(ocB); // tests dynamic_cast<A *>(ocB->instance) != 0
ocB->is_base_of(ocA); // tests dynamic_cast<B *>(ocA->instance) != 0

The implementation of is_base_of uses the type represented by the calling object_class and an instance of the type represented by the object_class argument to call dynamic_cast<>, the only C++ operator that provides subclass information at runtime. However, the dynamic_cast<A *> cast will succeed only if the B instance contains a unique base of type A. Indeed, in the case of repeated multiple inheritance, the test can fail. Thus our implementation of is_base_of does not implement exactly (2), but the following:

\[
a.\mathtt{is\_base\_of}(b) \iff A \text{ is a unique base of } B.
\]

However this is not a problem for the reconstruction of our inheritance graph, as explained below.
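The comparison technique described above can be tried in isolation. The following is a minimal sketch, not EVL's actual code: the names class_info and class_info_typed are hypothetical stand-ins for object_class and object_class_typed<>, and ownership is handled with std::unique_ptr for brevity.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the library's common base class.
struct object { virtual ~object() {} };

// Type-erased class descriptor holding one instance of the represented type.
struct class_info {
    virtual ~class_info() {}
    virtual bool is_base_of(const class_info& other) const = 0;
    std::unique_ptr<object> instance;
};

template <typename T>
struct class_info_typed : class_info {
    class_info_typed() { instance.reset(new T()); }
    bool is_base_of(const class_info& other) const override {
        // T is a (unique) base of the other type if and only if the other
        // descriptor's instance can be converted to T* at runtime.
        return this != &other &&
               dynamic_cast<T*>(other.instance.get()) != nullptr;
    }
};

struct A : object {};
struct B : A {};
struct C : object {};

// Small self-check of the relation on the three classes above.
inline void is_base_of_example() {
    class_info_typed<A> a;
    class_info_typed<B> b;
    class_info_typed<C> c;
    assert(a.is_base_of(b));   // A is a base of B
    assert(!b.is_base_of(a));  // B is not a base of A
    assert(!a.is_base_of(c));  // A and C are unrelated
}
```

The static type T is known only inside the class template, while callers manipulate descriptors through the non-template base; this is what allows two descriptors created independently to be compared at runtime.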

4 This is actually a type-erasure-based technique.


Fig. 2. Example of a class inheritance graph (a) and the resulting reduced object class graph (b).

In the case of a class T that does not inherit from object (recall that the EVL library is not intrusive and does not require reflected types to inherit from object), we introduce a class template external<> defined as:

template <typename T>
struct external : object, T { };

The make_object function implementation is then:

template <typename T>
object * object_class_typed<T>::make_object() {
  return new external<T>();
}
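As a rough illustration of why the wrapper is sufficient, the sketch below checks that a dynamic_cast<> starting from the object subobject of an external<T> instance can reach the bases of T, even though T knows nothing about object. The classes X, Y and Z and the helper is_base_of_external are hypothetical names introduced for this example only.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the library's common base class.
struct object { virtual ~object() {} };

// Classes that do not inherit from object.
struct X { virtual ~X() {} };
struct Y : X {};
struct Z { virtual ~Z() {} };

// The wrapper "adds" an object subobject next to a T subobject.
template <typename T>
struct external : object, T {};

// True when Base can be reached from an external<T> instance, i.e. when
// the cross-cast from the object subobject to Base succeeds.
template <typename Base, typename T>
bool is_base_of_external() {
    std::unique_ptr<object> instance(new external<T>());
    return dynamic_cast<Base*>(instance.get()) != nullptr;
}

inline void external_example() {
    assert((is_base_of_external<X, Y>()));   // X is a base of Y
    assert((!is_base_of_external<Y, X>()));  // Y is not a base of X
    assert((!is_base_of_external<Z, Y>()));  // Z and Y are unrelated
}
```

Unlike the is_base_of member shown earlier, this helper does not exclude the case where both types are the same; a descriptor-based version would add the identity test.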

Note that the separation between the two implementations of object_class_typed<> is done by class template specializations. The comparison of the object_class instances representing A and B still works. The external<> class is necessary because if we want to pass an instance of T to another object_class, we need the base class object. With external<T>, we “add” an object to a T.

Until now we assumed that we had default constructible classes. The EVL library provides ways to declare classes that do not have default constructors, are abstract or are only copyable, but it is a bit more verbose (we refer interested readers to the tutorials of the library available online and in the distribution).

Finally, we define the inheritance graph as the transitive reduction of the DAG representing the object_class set, with the is_base_of function defining the edges of the graph. Note that the fact that our implementation of is_base_of returns true only when we have a unique base class is not a problem, because the transitive reduction only keeps direct class inheritance relations and it is not possible in C++ to have repeated direct base classes. The reconstruction of the inheritance graph is done concretely at runtime when the object_class instances are created, i.e., in the static initialization phase or during the loading of a dynamic library.

From the inheritance relations, we can also deduce distance information. For each object_class, we can request the list of all the base classes or derived classes with their relative distances. The distance between two object_class objects is defined as the length of the longest path between them. This distance information is usually the main information used for the dispatch strategy.

Example 3. Consider the class hierarchy depicted in Fig. 2(a), where multiple inheritance for classes E and G is repeated. The constructed inheritance graph is depicted in Fig. 2(b). For the class F, the edges F → A and F → B are eliminated after transitive reduction. For the class E, the edge E → A does not exist even before the transitive reduction because A is a base class but not a unique one for E. It would have been eliminated in any case. In Fig. 2(b), the distance between G and A is 4, and the distance between G and C is 2.

3.4. Fast dynamic cast and subobject offset computation

In C++, the dynamic_cast<> operator converts a source pointer s with static type S and dynamic type R to a target pointer t of type T.

T * t = dynamic_cast<T *>(s);

However, the operator performs the conversion slowly by iterating through the list of base classes of R. The result of the dynamic cast is the computation of a pointer offset that we denote $\mathrm{offset}_R(S, T)$, as the conversion is made in the context of an R object. In the case where R inherits from object, we have


Algorithm 1 add.
Require: t ∈ Type
Let T be the current object_class set
if t ∉ types(T) then
  oc ← object_class(t)
  T ← T ∪ {oc}
  update_base_classes(oc)
  update_derived_classes(oc)
  update_offset(oc)
  reduce_graph()
end if

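\begin{align*}
\mathrm{offset}_R(S, T) &= \mathrm{offset}_R(S, \mathtt{object}) + \mathrm{offset}_R(\mathtt{object}, T) \tag{3} \\
&= \mathrm{offset}_R(S, \mathtt{object}) - \mathrm{offset}_R(T, \mathtt{object}) \tag{4}
\end{align*}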

If R does not inherit from object, we just have:

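\begin{align*}
\mathrm{offset}_R(S, T) &= \mathrm{offset}_{\mathtt{external<R>}}(S, T) \tag{5} \\
&= \mathrm{offset}_{\mathtt{external<R>}}(S, \mathtt{object}) + \mathrm{offset}_{\mathtt{external<R>}}(\mathtt{object}, T) \tag{6} \\
&= \mathrm{offset}_{\mathtt{external<R>}}(S, \mathtt{object}) - \mathrm{offset}_{\mathtt{external<R>}}(T, \mathtt{object}) \tag{7}
\end{align*}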

This is true as external<R> has the nonvirtual base classes object and R; thus, for (5), the memory layout of the subobject R in an object of type external<R> is the same as that of an object of type R. Then simple pointer arithmetic operations give us (3), (4), (6) and (7) by introducing the pointer to the subobject object in R or external<R>. We then have to compute the two-dimensional lookup table offset_table such that:

\[
\mathrm{offset\_table}(U, V) =
\begin{cases}
\mathrm{offset}_U(V, \mathtt{object}) & \text{if } U \text{ inherits from } \mathtt{object},\\
\mathrm{offset}_{\mathtt{external<U>}}(V, \mathtt{object}) & \text{otherwise.}
\end{cases}
\]

And we get:

\[
\mathrm{offset}_R(S, T) = \mathrm{offset\_table}(R, S) - \mathrm{offset\_table}(R, T).
\]

When classes R and T are unrelated, the dynamic_cast<> operator returns the null pointer and we set the offset to std::numeric_limits<>::max(). We use the contiguous integer object_class ID to implement an array-based lookup table that is updated when an object_class is created. This provides us with a fast, constant-time dynamic cast implemented in the dynamic_pointer_cast<> function template. Note that these offsets are pre-computed in the static initialization phase.

3.5. Reflection information creation summary

For clarification, we describe the simple algorithms used to create the reflection information:

• Algorithm 1 add implements the insertion of a new object_class with the update of the current inheritance graph and other information. The algorithm add is called sequentially for all the declared types, in arbitrary order;

• Algorithm 2 update_base_classes adds a new object_class object if it does not exist yet. Then it searches for the most derived classes that are base classes using the is_base_of comparison function;

• Algorithm update_derived_classes is similar to update_base_classes;

• Algorithm 3 update_offset simply iterates through every existing object_class and updates the offset table in accordance with Section 3.4;

• Algorithm reduce_graph performs the transitive reduction of the graph.

4. Multi-methods implementation

Our library is based on multi-method classes with template parameter policies defining the prototype of the dispatched functions and other configuration options. Functions are inserted into the multi-method objects and, taking advantage of the object_class framework presented in the previous sections, we can compare them through the runtime type information of their polymorphic arguments.


Algorithm 2 update_base_classes.
Require: oc ∈ object_class
Let T be the current set of object_class
S ← ∅
for all c ∈ T do
  if c.is_base_of(oc) then
    S ← S ∪ {c}
  end if
end for
for all s ∈ S do
  found ← false
  for all ss ∈ S do
    if s.is_base_of(ss) then
      found ← true
      break
    end if
  end for
  if !found then
    oc.base ← s
    s.derived ← oc
  end if
end for
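To make the search for the most derived base classes and the longest-path distance of Section 3.3 more concrete, here is a small sketch on a hypothetical model: classes are plain indices, and base_of[a][b] is true when class a is a (possibly indirect) base of class b, following the reading of relation (2). The functions direct_bases and longest_distance and the five-class hierarchy are illustrative assumptions, not part of EVL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// base_of[a][b] == true when class a is a (possibly indirect) base of b.
typedef std::vector<std::vector<bool> > relation;

// Direct bases of c: the bases of c that are not a base of another base.
std::vector<std::size_t> direct_bases(const relation& base_of, std::size_t c) {
    std::vector<std::size_t> bases, result;
    for (std::size_t s = 0; s < base_of.size(); ++s)
        if (base_of[s][c]) bases.push_back(s);
    for (std::size_t s : bases) {
        bool found = false;
        for (std::size_t ss : bases)
            if (base_of[s][ss]) { found = true; break; }
        if (!found) result.push_back(s);  // s is a most derived base of c
    }
    return result;
}

// Length of the longest chain of direct-base edges from c up to b,
// or -1 when b is not a base of c.
int longest_distance(const relation& base_of, std::size_t c, std::size_t b) {
    if (c == b) return 0;
    if (!base_of[b][c]) return -1;
    int best = 0;
    for (std::size_t d : direct_bases(base_of, c)) {
        const int sub = longest_distance(base_of, d, b);
        if (sub >= 0) best = std::max(best, sub + 1);
    }
    return best;
}

// Hypothetical hierarchy: A=0; B=1 and C=2 derive from A; D=3 derives
// from B; E=4 derives from D and C (A being a virtual base).
relation example_hierarchy() {
    relation base_of(5, std::vector<bool>(5, false));
    const int pairs[][2] = {{0, 1}, {0, 2}, {0, 3}, {0, 4},
                            {1, 3}, {1, 4}, {2, 4}, {3, 4}};
    for (const auto& p : pairs) base_of[p[0]][p[1]] = true;
    return base_of;
}
```

On this hierarchy the direct bases of E are C and D, and the distance between E and A is 3 (through D and B) rather than 2 (through C), since the longest path is retained.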

Algorithm 3 update_offset.
Require: oc ∈ object_class
Let T be the current set of object_class
for all c ∈ T do
  update_offset_table(oc, c)
  update_offset_table(c, oc)
end for
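The offset arithmetic of Section 3.4 can be checked on a toy hierarchy. The sketch below is an assumption-laden illustration rather than EVL's implementation: it uses a virtual base object so that R contains a single object subobject, it measures offsets on a live instance, and it relies on byte-level pointer arithmetic that works on common ABIs but is not guaranteed by the C++ standard. The names offset_to_object and fast_cast_S_to_T are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct object { virtual ~object() {} };
struct S : virtual object { int s; S() : s(1) {} };
struct T : virtual object { int t; T() : t(2) {} };
struct R : S, T { int r; R() : r(3) {} };

// offset_R(V, object): byte distance from the V subobject to the object
// subobject inside a complete object of type R.
template <typename V>
std::ptrdiff_t offset_to_object(R& r) {
    V* v = &r;       // upcast to the V subobject
    object* o = v;   // upcast to the object subobject
    return reinterpret_cast<char*>(o) - reinterpret_cast<char*>(v);
}

// Constant-time cast using
// offset_R(S, T) = offset_R(S, object) - offset_R(T, object).
// Only valid when the dynamic type of *s is R.
T* fast_cast_S_to_T(S* s, std::ptrdiff_t s_to_object,
                    std::ptrdiff_t t_to_object) {
    char* raw = reinterpret_cast<char*>(s) + (s_to_object - t_to_object);
    return reinterpret_cast<T*>(raw);
}

inline void offset_example() {
    R r;
    const std::ptrdiff_t so = offset_to_object<S>(r);
    const std::ptrdiff_t to = offset_to_object<T>(r);
    S* s = &r;
    // The precomputed offsets give the same pointer as dynamic_cast<>.
    assert(fast_cast_S_to_T(s, so, to) == dynamic_cast<T*>(s));
}
```

In a table-based version, the two offsets would be stored once per pair (dynamic type, static type) and indexed by integer class IDs, so that each cast costs two lookups and one subtraction.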

4.1. The multi-method class template

The main class template is multi_function<>. We favored the term function over method in the name of the class because the latter was never part of the C++ terminology. However, the notion of a multi-method exists independently of any language, which is why, in our formal presentation, we did not use a different term. Moreover, it makes a clear connection with the Open Multi-Methods of [7]. The class template multi_function<> has the following parameters, corresponding to their formal definition in Section 2:

• FunctionType is the base prototype and an instantiation of the type<> class template. For example, we can define a base prototype FooType as:

typedef type<..., objects<...>, parameters<...> > FooType;

The objects<> class template represents the virtual parameter types and the parameters<> class template represents the nonvirtual parameter types.

• FCO is a policy class that defines the function comparison operator. For example:

struct FooFCO {
  ...
  bool operator()(class_tuple const & a, fco::item<...> const & f, fco::item<...> const & g) {
    ... // compares f and g based on their distance to a and additional data
  }
};

A more detailed example is given in Section 4.3. The default value is the lexicographic FCO. Note that here, FooFCO defines an FCO for a standard multi-method where dispatch depends only on the dynamic types. To implement a general FCO, the policy class must inherit from an instantiation of the context<> class template, which provides access to the values of all the arguments. An example is provided in the set of tutorials for the library.


• CacheType is a policy class that defines the cache strategy c. It is a class template that must define a const_iterator nested type and implement the container functions clear(), find(), begin(), etc. The default value is a wrapper to the std::map<> class template.

• ErrorHandlerType is a policy class that defines the error handling strategy e. It is a class that must implement the static functions reset(), handle_no_function(), handle_no_minimum_function(), handle_ambiguous_function() and handle_function_not_defined(). The default value is a class whose functions throw exceptions.

The following functions are part of the class template multi_function<>. They provide the management of the set of functions (insert and erase), the invocation of the multi-method with or without a possible update of the cache (invoke and invoke_cache), the request for a tuple of objects without multi-method invocation (request), the calculation of the full dispatch table (resolve) and the information on whether the multi-method is exception-safe or not (is_safe).

• insert: The dispatched functions are explicitly registered by calling the insert() function and passing a pointer to the function along with its associated object if it is a member function, and the associated data_type value (defined as a nested type in the FCO class) if data_type is not void. Multi-methods are strongly typed and only functions with a prototype compatible with the base prototype FunctionType can be inserted without a compilation error. Class compatibility tests are performed using type traits functions. Calling the insert() function resets the cache. Note that the object_class instances representing the virtual parameters of the inserted function are automatically instantiated in the static initialization phase of the program, in addition to the instantiation locations presented in Section 3.2.

Example 4 (Defining a multi-method). Suppose we have the default constructible classes A, B and C such that B inherits A. The following code shows how to define a simple multi-method of dimension two:

// base prototype
typedef type<..., objects<...>, parameters<...> > Type;
// functions fooA and fooBC
int fooA(objects<A, ...>::tuple t, double p, bool r) {return 1;}
int fooBC(objects<B, C>::tuple t, double p, bool r) {return 2;}
// multi-method m with base prototype specified
multi_function<Type> m;
// insert the functions fooA and fooBC compatible with the base prototype of m
m.insert(&fooA);
m.insert(&fooBC);

• erase: erases an already inserted function (and also resets the cache). This function is typically used when unloading a dynamically loaded class to remove the functions that become unavailable from the multi-method (see Section 4.4).

• invoke: The function takes a tuple of objects (constructed with the factory function on()) and possibly additional parameters as arguments. Only arguments compatible with the base prototype can be passed without a compilation error, i.e.:
  – Virtual arguments must be covariant to the virtual parameter types of the base prototype;
  – Nonvirtual arguments must be compatible with nonvirtual parameter types following usual C++ rules;
  – All arguments must be const-compatible with the base prototype parameters.

If the tuple of virtual parameter types of the arguments is not in the cache, then EVL looks for the appropriate function using Algorithm 4 find_min_function, implementing the formalism described in Section 2. Otherwise, the function returned by the cache is called. The algorithm first computes the set of functions Da compatible with a and searches for minimum functions using the FCO described in Section 4.3. The function find_compatible_functions implements a simple algorithm that iterates over the dimensions of the multi-method and adds to Da the functions in D inserted in the multi-method that can be applied to a (see Section 2). Next, the algorithm uses the FCO to find the set Dm of minimum functions. This has to be done carefully, in case a user-defined FCO that does not define a strict weak ordering is used (hence the verification phase at the end of the algorithm). Finally, if Dm is empty, or of a size greater than 1, the algorithm returns an error, reporting the absence of appropriate functions in the first case and an ambiguity in the second case. In those cases the error handler is called.

In Section 2, we describe how the FCO uses the distance between tuples of types. The distance information is not present in the object_class objects, so it must be computed here. This is done in the find_compatible_functions algorithm. Moreover, for predicate dispatch that depends on the value of the arguments, the arguments are copied in the context part of the FCO (they are not copied when the FCO class does not inherit from an instantiation of context<>) before calling the find_min_function algorithm.

Once the function to call is found in the cache or by computation (in which case it is inserted in the cache), the tuple of objects and arguments passed to invoke() are applied to it by performing a cast to the virtual types of the function. Internally, we use the fast dynamic cast presented in Example 2. Next is a minimal example illustrating a call to a multi-method.


Algorithm 4 find_min_function(D, a).
Require: D, the set of inserted functions; a, a tuple of objects
Let Da be a set of functions
Let Dm be the set of minimum functions
Let fm be a function
Let FCO be the function comparison operator
// Look for compatible functions
Da = find_compatible_functions(D, a)
if |Da| == 0 then
  error “no function”
end if
// Iterate on Da to determine Dm using fm as current minimum function
Pick fm ∈ Da, Dm ← {fm}
for all f ∈ Da − {fm} do
  if FCO(a, f, fm) then
    fm ← f, Dm ← {f}
  else if ¬FCO(a, fm, f) then
    // in case f and fm are both minimal for a
    Dm ← Dm ∪ {f}
  end if
end for
// Verify minimum functions found (in case FCO not a strict weak ordering for a)
for all f ∈ Da, f′ ∈ Dm do
  if FCO(a, f, f′) then
    error “no minimum function”
  end if
end for
if |Dm| > 1 then
  error “ambiguous function”
end if
return fm
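The selection loop of Algorithm 4 can be sketched in standalone C++ as follows. This is a simplified illustration under several assumptions: candidates are already known to be compatible with the requested tuple a, each carries a precomputed distance tuple to a (so the comparison operator takes only two arguments), and the types candidate, result and status as well as the two comparison functions are hypothetical names, not EVL's API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// A compatible function: an identifier and its distance tuple to a.
struct candidate {
    int id;
    std::vector<int> distance;
};

// Comparison operator with the requested tuple a already bound.
typedef std::function<bool(const candidate&, const candidate&)> fco;

enum status { ok, no_function, no_minimum_function, ambiguous_function };
struct result { status st; int id; };

result find_min_function(const std::vector<candidate>& da, const fco& less) {
    if (da.empty()) return {no_function, -1};
    // Sweep over da, keeping the current minimum fm and the set dm of
    // functions that are not comparable with it.
    std::size_t fm = 0;
    std::vector<std::size_t> dm(1, fm);
    for (std::size_t i = 1; i < da.size(); ++i) {
        if (less(da[i], da[fm])) { fm = i; dm.assign(1, i); }
        else if (!less(da[fm], da[i])) dm.push_back(i);
    }
    // Verification, needed when less is not a strict weak ordering.
    for (const candidate& f : da)
        for (std::size_t m : dm)
            if (less(f, da[m])) return {no_minimum_function, -1};
    if (dm.size() > 1) return {ambiguous_function, -1};
    return {ok, da[fm].id};
}

// Asymmetric dispatch: lexicographic comparison of the distance tuples.
bool lexicographic(const candidate& f, const candidate& g) {
    return f.distance < g.distance;
}

// Symmetric dispatch: f is no farther than g in every dimension and
// strictly closer in at least one.
bool product(const candidate& f, const candidate& g) {
    bool strictly = false;
    for (std::size_t i = 0; i < f.distance.size(); ++i) {
        if (f.distance[i] > g.distance[i]) return false;
        if (f.distance[i] < g.distance[i]) strictly = true;
    }
    return strictly;
}
```

With two candidates at distances (0, 1) and (1, 0), the lexicographic order selects the first one, whereas the product order finds them incomparable and reports an ambiguity, which is the situation a customized FCO is meant to resolve.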

Example 5 (Calling a multi-method). We reuse the multi-method defined in Example 4. The following code shows how to call a simple multi-method of dimension two:

// create a pointer B converted into A and a pointer C
A * b = object::make<B>();
C * c = object::make<C>();
// call invoke on the tuple (b, c) created by the function on()
int result = m.invoke(on(b, c), 3.1, true);

The call will be resolved to calling fooBC, hence result=2. The invoke() function is not const, as the cache can be updated when it does not contain an entry for the requested tuple of types.

• invoke_cache: This function is used to look up the function to apply in the cache. If the requested tuple of types is not in the cache, an error will be reported by the error handler (the cache implementation must do it). This is the function to call when performance is required. For classes that do not inherit the base class object, it is necessary to invoke the multi-method on a rich pointer object_ptr<> to have the fastest call (Section 3.2). The invoke_cache() function is const, as the cache is unchanged during the call.

• request: The function takes a tuple of objects and possibly additional parameters as arguments. The function request() behaves like invoke() (possible update of the cache) except that the function found is not applied. Moreover, in case of error, it is reported in the result and the error handler is not called.

• resolve: When calling this function, all the possible tuples of types are requested using the request() function, inserting the resulting functions in the cache. After the call, the multi-method is said to be resolved. Then, provided the cache implementation allows us to store all the possible entries, invoke_cache() can be safely called for the multi-method.

• is_safe: This function returns true if the error handler is never called for any tuple of types present in the cache. If the error handler throws exceptions and the multi-method is resolved, then it ensures that the multi-method will never throw an exception.
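The difference between a call that may update the cache and a read-only cache lookup can be illustrated with a deliberately simplified dispatcher. In the sketch below, everything is hypothetical (the Shape hierarchy, cached_dispatcher, and the use of std::type_index keys), and the slow path is a plain exact-match search standing in for the real function selection; only the split between a non-const invoke() and a const invoke_cache() mirrors the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

typedef std::pair<std::type_index, std::type_index> key;
typedef int (*function)(Shape&, Shape&);

class cached_dispatcher {
public:
    cached_dispatcher() : slow_lookups_(0) {}

    // Registering a function invalidates previously cached decisions.
    void insert(const key& k, function f) {
        functions_.push_back(std::make_pair(k, f));
        cache_.clear();
    }
    // May update the cache, hence not const.
    int invoke(Shape& a, Shape& b) {
        const key k(typeid(a), typeid(b));
        std::map<key, function>::const_iterator it = cache_.find(k);
        if (it == cache_.end())
            it = cache_.insert(std::make_pair(k, resolve(k))).first;
        return it->second(a, b);
    }
    // Read-only lookup: fails when the tuple of types is not cached.
    int invoke_cache(Shape& a, Shape& b) const {
        std::map<key, function>::const_iterator it =
            cache_.find(key(typeid(a), typeid(b)));
        if (it == cache_.end())
            throw std::runtime_error("tuple of types not in cache");
        return it->second(a, b);
    }
    int slow_lookups() const { return slow_lookups_; }

private:
    // Stand-in for the real selection: exact match on the dynamic types.
    function resolve(const key& k) {
        ++slow_lookups_;
        for (std::size_t i = 0; i < functions_.size(); ++i)
            if (functions_[i].first == k) return functions_[i].second;
        throw std::runtime_error("no function");
    }
    std::vector<std::pair<key, function> > functions_;
    std::map<key, function> cache_;
    int slow_lookups_;
};

int circle_square(Shape&, Shape&) { return 1; }
```

Because invoke_cache() never writes to the cache, it is the variant that can be called on a const dispatcher, at the price of failing for tuples of types that have not been requested beforehand.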


4.2. Multi-methods framework

Using the notions presented so far, the EVL library provides a clear and flexible methodology distinguishing a set of processing functions (which may or may not be associated with an object) from the dispatch mechanism. Indeed, by introducing the notion of customizable FCOs, a variety of dispatch algorithms can be defined. Thus, separating process from dispatch allows us to define the same processing functions for many dispatchers. We illustrate this with the following example.

Example 6 (Define multi-methods with different FCOs). Consider the multi-method defined in Example 4 and insert a member function.

class Agent {
public:
  int bar(objects<A, ...>::tuple t, double a, bool b) {return 2;}
};

// create an Agent object
Agent ag;
// insert the function bar bound to the object ag
m.insert(&Agent::bar, &ag);
// define a multi-method with symmetric dispatch
multi_function<Type, fco::symmetric> ms;
// insert the functions fooA and bar
ms.insert(&fooA);
ms.insert(&Agent::bar, &ag);

By default, m uses fco::asymmetric (an alias for fco::lexicographic). Two other FCOs are implemented in the library by the classes fco::symmetric (an alias for fco::product) and fco::norm1.

So far we have presented only the multi_function<> class template, but EVL also provides the static_multi_function<> class template, which provides faster invoke() and invoke_cache() calls, as only non-member functions can be inserted. EVL also provides the self_method<> class template that operates on this objects like standard virtual functions. Note that we implemented our own delegate classes to unify member and non-member functions. We did not use the Boost.Function and Boost.Bind libraries⁵ because they do not provide the performance of simple function pointers in the case of non-member functions for static_multi_function<>.

Although we have shown the advantage of separating the process from the dispatch, our multi-method classes can be used as base classes to simplify their extensibility. In that case it is recommended to insert the dispatched functions in the constructor of the new class and to insert a new function by extending the class. As our multi-methods are instantiations of normal C++ classes, they can be class templates themselves, providing a “semi-virtual” function mechanism. This can be realized by inserting the functions in the constructor of a class template that has, for instance, a parameter T for the first nonvirtual parameter type. However, the list of dispatched functions has to be known and cannot be extended later without instantiating a new multi-method object of a new class template. An example is provided in the tutorials for the library.

In conclusion, as our multi-methods are objects, the programmer can decide on their scope. They can be private members of a class as well as global objects. The latter case is equivalent to the Open Multi-Methods of [7].

4.3. Resolving ambiguities

Recall that the library deals with ambiguity resolution by using a function comparison operator, a so-called FCO (see Section 4.3). It is defined as a functor which contains a nested type data_type. It compares two objects of type fco::item<> representing functions that are candidates for being applied to a tuple of types a, which is a class_tuple (alias for std::vector<>). The default FCO uses a simple lexicographic order. The library offers the possibility to customize it. For this, note that the compared objects of type fco::item<> have the following member functions:

• tuple: returns a class_tuple, the tuple of types of the associated function;

• distance: returns a tuple of integers representing the tuple of distances from the requested type tuple a;

• data: returns the associated data of type data_type.

5 http://www.boost.org.


Suppose that we want to assign a priority integer to functions to resolve ambiguities. To do this, we specify data_type as fco::data<int>, an instantiation of the class template fco::data<>. The functor is defined as follows:

struct FooFCO {
  typedef fco::data<int> data_type;
  bool operator()(class_tuple const & a, fco::item<fco::data<int> > const & f, fco::item<fco::data<int> > const & g) {
    // compare items using product functor
    bool less = fco::product()(a, f, g);
    bool greater = fco::product()(a, g, f);
    // disambiguate in case of equality
    if (!less && !greater) {
      // compare the value associated with the functions
      return f.data().getValue() > g.data().getValue();
    }
    return less;
  }
};
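The tie-breaking logic of such a functor can be exercised without the library. The sketch below is a hypothetical standalone analogue: item stands for a candidate function with its distance tuple and an integer priority, product_less plays the role of the product comparison, and priority_fco applies the same fallback to priorities when two candidates are incomparable. None of these names belong to EVL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate: distance tuple to the requested types and a
// user-assigned priority (larger means preferred).
struct item {
    std::vector<int> distance;
    int priority;
};

// Product order: f is no farther than g in every dimension and strictly
// closer in at least one.
bool product_less(const item& f, const item& g) {
    bool strictly = false;
    for (std::size_t i = 0; i < f.distance.size(); ++i) {
        if (f.distance[i] > g.distance[i]) return false;
        if (f.distance[i] < g.distance[i]) strictly = true;
    }
    return strictly;
}

// Same structure as the functor above: fall back on the priority only
// when the product order cannot separate the two candidates.
bool priority_fco(const item& f, const item& g) {
    const bool less = product_less(f, g);
    const bool greater = product_less(g, f);
    if (!less && !greater) return f.priority > g.priority;
    return less;
}

inline void priority_example() {
    const item f = {{0, 1}, 5};
    const item g = {{1, 0}, 3};
    const item h = {{0, 0}, 0};
    assert(priority_fco(f, g));   // incomparable: higher priority wins
    assert(!priority_fco(g, f));
    assert(priority_fco(h, f));   // closer in the product order wins,
    assert(!priority_fco(f, h));  // whatever the priorities are
}
```

Two incomparable candidates with equal priorities are still left unordered by this comparison, so an ambiguity can remain and must then be reported by the dispatcher.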

As mentioned in Section 4.3, another way to resolve ambiguities is to use additional reflection information to compare two tuples of types (e.g., with the static constant class members). Note that we did not implement the ambiguity resolution of compatible functions with different covariant return types by selecting the most specialized return type, as the Open Multi-Methods of [7] do.

4.4. Dynamic class loading

We saw in Section 3.1 that our reflection library was able to manage dynamically loaded classes. When loading a dynamic library, its static initialization phase is executed and new types are added to the object_class set with the update of the inheritance graph and other extra information. When a new type is added at runtime, it is assigned a new ID that may change the min or max values of the possible IDs for some dimensions of a multi-method (see Section 2.5). For performance reasons, we decided to simply invalidate the cache of all existing multi-methods even if it does not concern all the multi-methods. Thus every multi-method resets its cache at the next invoke() call, but not invoke_cache(), which does not modify the cache. A dynamic library may also add new processing functions to an existing multi-method. The call to insert() resets the cache. When closing a dynamically loaded library, for safety reasons we are required to:

• Erase from the multi-methods all the functions defined in the library;

• Erase from the object_class set all the types declared in the library.

4.5. Concurrent calls to multi-methods

The use in multi-methods of a cache that can be updated at runtime raises the issue of thread safety. Basic EVL multi-methods are not thread-safe. Indeed, although invoke_cache() is always read-only (the cache is never updated), insert() and erase() are not, since they clear the cache and update the list of functions. To ensure that invoke() and request() can be safely called concurrently, along with invoke_cache(), a first solution is to call resolve(), which fills the cache with all possible entries. However, this is not always possible (e.g., because of unpredictable dynamic insertions of functions) or desirable (because the cache would be too big). For this case, we implemented a lock policy in EVL, which requires the definition of mutex and lock types in order to obtain a thread-safe cache in the context of EVL multi-methods. The implementation of the lock policy allows the use of standard containers like std::vector<>. For an example with the Boost.Thread library, we refer the reader to the tutorials. Note that at the time of writing, our implementation complies with the C++03 standard; later versions might take advantage of the concurrent programming model defined by the C++11 standard.

5. Experimental results

5.1. Performance test

As mentioned in Section 3.1, there is a computational overhead at the static initialization phase of the program due to the reconstruction of the inheritance graph, but this is likely negligible, as the number of classes managed by the framework should be limited. To evaluate the performance of the library, we compared the cost of calling a multi-method with the cost of calling a standard virtual function. To do this, we defined a base class Base with a pure virtual function foo().


Table 1
Computational ratio overhead with respect to virtual functions, with different compilers. The results with gcc 4.7 were obtained on a Core i7 laptop running Linux Ubuntu 13.04 64 bit. The results with Visual Studio 10 were obtained on a Core 2 Duo desktop PC running Windows 7. The results with Darwin were obtained on a Macbook Pro with a Core i7 processor running Mac OSX 10.8.4.

Multi-method    Cache type      gcc 4.7           VS10              Darwin (gcc 4.4)
                                Invoke   Cache    Invoke   Cache    Invoke   Cache
Static          vector          1.55     1.35     2.00     1.35     1.55     1.30
Non-static      vector          1.70     1.45     2.10     1.40     1.65     1.40
Static          map             2.50     2.10     3.60     2.55     2.65     2.40
Static dim. 2   vector          2.00     1.50     2.10     1.80     1.70     1.40
Static dim. 4   vector          2.25     2.05     3.60     2.90     2.20     1.85
Static dim. 8   map             4.85     4.45     7.10     5.30     4.35     4.15
Static          vector sync.    2.25     1.80     5.25     4.55     4.80     4.40
Static          no cache        165      –        450      –        365      –

Other libraries     gcc 4.7   VS10    Darwin (gcc 4.4)
doublecpp dim. 2    1.05      1.30    1.15
Loki dim. 2         9.70      20.0    4.60
Yomm11 dim. 2       1.40      –       –
Yomm11 dim. 4       1.60      –       –
Yomm11 dim. 8       1.70      –       –

The class is derived into eight subclasses A1, A2, ..., A8, all implementing foo(), which simply returns the index of the class. We defined the equivalent functions in the Foo multi-method. The test program creates an instance of each of the A1, A2, ..., A8 classes and randomly calls the foo() function through the class Base and through the multi-method Foo. We compared different types of multi-methods (static, i.e., dispatch to only non-member functions; non-static, i.e., dispatch to member functions bound to an object) with different cache strategies (map, vector, vector synchronized, no cache) to the equivalent virtual function. We also tested static multi-methods of dimension two, four, and eight. In all cases, we compared the time taken by the invoke() and invoke_cache() (Cache column) functions and report the ratio with the time taken by virtual functions in Table 1.

For most multi-methods with vector cache, and for both functions used (invoke_cache() or invoke()), the call duration is not more than twice the virtual function call duration. In the best case, i.e., by calling invoke_cache() with a static multi-method, the performance is close to that of virtual functions. For non-static multi-methods, the overhead is due to our delegate implementation, which uses a virtual function. For multi-methods with map strategy, the cache implementation is based on the std::map<> class template. Note that better performance could be obtained with the boost::unordered_map<> class template. Using a multi-method of dimension two is slower than using a multi-method of dimension one, but the overhead is limited. When increasing the dimensions to four and eight, the call duration follows a relatively linear progression. Note that in dimension eight, the dispatch table was too big to be used with a vector cache. Our vector synchronized cache implementation is defined with the boost::mutex class and is slower than the vector cache implementation, although the overhead is system-dependent and was found to be smaller with the most recent compiler we used. When no cache is used, the overhead is more significant. Notice that in this case, however, it is highly dependent on the program context, in particular on the number of compatible functions. Overall, the results show that the library provides good performance with optimized multi-methods.

5.2. Comparison with other implementations

We ran our performance test on DoubleCpp [16] and the Loki library and reported the results in Table 1. The small overhead obtained with DoubleCpp (between 5% and 30%) is expected, as its implementation is done in such a way that a function call has the cost of a virtual function call plus a small overhead. However, DoubleCpp implements a hidden visitor pattern and does not provide a dispatch based on the selection of the best matching function. The Loki library [15] implements different kinds of dispatcher that are not based on the selection of the best matching function, except for the static dispatcher to which the ordering of the class hierarchy must be passed, as Loki does not reconstruct the inheritance graph. We ran our performance test on the FunctorDispatcher<> class template, which obtained a ratio of between 4.59 and 20 depending on the system. The result is mainly explained by the fact that Loki is calling the dynamic_cast<> operator to cast the virtual arguments. We made direct performance comparisons (not reported here) between our fast dynamic cast operator (see Section 3.2) and the standard dynamic_cast<> operator and obtained similar variability in the results.

In [29], a simplified multi-method library is described that has similar implementation ideas to EVL: a class hierarchy is walked to find the best matching function. However, the class hierarchy, defined with nodes referencing std::type_info objects, is explicitly built and no lookup table is used to speed up the calls. Also, the library did not appear to be available at the time of our writing and thus we could not make direct performance comparisons.


Table 2
Computational ratio overhead with respect to virtual functions for external classes in EVL and Yomm11, with gcc 4.7, obtained on a Core i7 laptop running Linux Ubuntu 13.04 64 bit.

Multi-method     EVL     Yomm11
static dim. 2    2.05    7.65
static dim. 4    3.15    13.15
static dim. 8    6.45    23.60

The Yomm11 project (http://www.yorel.be/mm/) is a library implementation of Open Multi-Methods. It implements the major features and presents very good performance, as shown in Table 2. In both the EVL and Yomm11 tests, the class Base inherited from a class of the framework (object for EVL and selector for Yomm11), which yields the best performance. However, we ran additional tests with external classes (i.e., classes for which Base has no base class from the multi-method framework), and in that case Yomm11's performance was significantly worse because of its use of the dynamic_cast<> operator to cast the virtual arguments, whereas EVL still performed well. Moreover, Yomm11 provides a compact API, but at the cost of requiring macros both to define class inheritance information and in the multi-method definitions.

We now focus on the comparison of Open Multi-Methods (OMM) with EVL, as it shows the contrast between a language extension implementation and a library implementation. Direct performance tests were not possible here either, although the authors report overheads of the order of 16%. At the conceptual level, we can compare different aspects.

• Verbosity: OMM has a very light syntax to define multi-methods. One needs only to add the virtual keyword to a parameter of a non-member function to make it a multi-method. Moreover, function registration is implicit, as it is realized using the function signature. In EVL, the definition of a multi-method is more verbose and functions have to be explicitly registered. In addition, an EVL multi-method must be explicitly resolved for its dispatch table to be filled;
• C++ conformance: OMM multi-methods are fully conformant with standard C++ and support virtual and repeated inheritance. EVL supports a class hierarchy defined with virtual and repeated inheritance, but not dispatch onto repeated base classes;
• Scope: An OMM multi-method is a family of functions implicitly linked together by the fact that they have compatible signatures. A multi-method defined to operate in a given portion of code may be disturbed by the definition of a supposedly different multi-method in another portion of code, because the two have compatible signatures. It is possible in EVL to define global multi-methods and encounter such “conflicts”; however, multi-methods are objects and it is recommended not to declare them global, following OOP principles;
• Dispatch onto object methods: OMM multi-methods can only be defined with non-member functions. In EVL, dispatched functions can be members of a class. In that case they are registered with a bound object;
• Dispatch strategy: OMM fixes the dispatch strategy to the symmetric product relation. EVL multi-methods can be defined with a highly customizable dispatch strategy, as discussed in the previous sections;
• Ambiguities: When more than one function is the best match for a given tuple of types, OMM generates an error at compile-time or link-time. The only way to resolve it is to define a new function override. Moreover, all the possible combinations of types are tested to fill the entire dispatch table, generating ambiguities even for tuples of types that are never called. In EVL, it is possible to refine the dispatch strategy to resolve the ambiguity;
• Scalability: OMM computes the full dispatch table that contains all the possible combinations of types and implements a compression algorithm. EVL implements a cache strategy that can be defined by the programmer. The implementation can be either a dispatch table or a cache algorithm. In the latter case, an empty cache is filled at runtime so that it only contains the tuples of types that are actually used;
• Errors: OMM detects errors at compile-time and link-time. EVL can only detect errors at runtime (except function compatibility errors, which are detected at compile-time), but the error strategy can be redefined. By default, exceptions are thrown, but it is possible to define error codes;
• Debugging: OMM function overriders are implicitly linked, making it difficult to obtain the list of overriders of a multi-method or to examine the contents of the dispatch table. EVL classes have functions that can be used to print the contents of a multi-method;


• Dynamic class loading: The current OMM implementation does not support dynamic class loading. EVL is mainly based on runtime execution, and dynamic class loading is supported;
• Templates: As virtual functions cannot be templates in the current C++ standard, OMM multi-methods cannot be templates. In EVL, multi-methods are instances of classes that can be templates. To define a multi-method template that will dispatch function templates, the list of inserted function templates must be fixed and registered in the constructor (function templates are instantiated with the template parameters of the multi-method). In dimension one, the multi-method has the behavior of a “semi-virtual” function template. It is called “semi-virtual” because it is not possible to register functions after the construction of the object;
• Covariant return types: OMM supports covariant return types and uses them to improve the ambiguity resolution at runtime and compile-time. EVL makes it possible to use covariant return types, but the multi-methods always return a result of the base return type, as their resolution is only dynamic.

To summarize, OMM offers better syntax, C++ conformance and performance than EVL, but EVL supports more object-oriented features, provides more general dispatch strategies and better ambiguity resolution possibilities. Moreover, EVL supports dynamic class loading.

6. Conclusions and future works

We presented the EVL library as a technical solution to the problem of implementing multi-methods in C++. Our approach, based on a library, offers flexibility in the parameterization of the dispatch that a language extension cannot provide. It is mainly based upon a strong object-oriented paradigm, with a clear separation between the processing classes and the dispatch classes, and a dispatch strategy which is not fixed and can be redefined.
We believe that it provides a tool for the programmer to build natural object-oriented designs and control the dispatch strategy (symmetric, asymmetric, resolution of ambiguities) with good performance. At the time of writing, our library is available at http://libevl.bitbucket.org. It comes with complete documentation and a comprehensive set of examples.

Multiple dispatch is a non-trivial notion requiring advanced OOP skills. Although tuning the dispatch strategy should be of interest only to the expert, the library is intended to be used by as many programmers as possible. Thus, our multi-methods come with simple default dispatch configurations. We think that providing a tool that is easy to use is the first step towards the acceptance of the multi-method principle. We look forward to feedback from the whole community of programmers to further improve our proposed multi-method interface.

For our needs, we designed a reflection library that can be used independently from the multi-method framework. In the future, we hope to use a dedicated reflection library or a language feature when such options are available. Our solution is based on the C++03 standard, and we plan to adapt its implementation when the new C++11 standard is implemented by the majority of compilers. For example, such an adaptation could have multi-methods insert std::function instances rather than raw function pointers for more flexibility and better integration (although with lower performance), because std::function implements the abstraction of non-member and member functions that the library requires.
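The following C++11 sketch shows, under stated assumptions, what storing std::function instances could look like: one table holds both a non-member function and a member function bound to an object. It is not the EVL API. MiniMultiMethod, Animal, Cat, Dog, Logger and the function names are hypothetical, and the table matches exact dynamic types only, with no best-match selection over the class hierarchy.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical hierarchy, used only for this illustration.
struct Animal { virtual ~Animal() {} };
struct Cat : Animal {};
struct Dog : Animal {};

// A non-member candidate function.
std::string cat_meets_dog(const Animal&, const Animal&) { return "hiss"; }

// A member candidate function: it needs an object to be invoked on.
struct Logger {
    std::string prefix;
    std::string dog_meets_cat(const Animal&, const Animal&) const {
        return prefix + "woof";
    }
};

// Toy two-argument dispatcher keyed by the exact dynamic types.
class MiniMultiMethod {
    typedef std::pair<std::type_index, std::type_index> Key;

public:
    // std::function hides whether the target is a free function,
    // a lambda, or a member function bound to an object.
    typedef std::function<std::string(const Animal&, const Animal&)> Fn;

    void add(const std::type_info& a, const std::type_info& b, Fn fn) {
        table_[std::make_pair(std::type_index(a), std::type_index(b))] = fn;
    }

    std::string operator()(const Animal& a, const Animal& b) const {
        const Key key = std::make_pair(std::type_index(typeid(a)),
                                       std::type_index(typeid(b)));
        std::map<Key, Fn>::const_iterator it = table_.find(key);
        return it == table_.end() ? std::string("no match") : it->second(a, b);
    }

private:
    std::map<Key, Fn> table_;
};
```

With this sketch, cat_meets_dog can be passed to add() directly, while Logger::dog_meets_cat can be registered through a lambda that captures a Logger object and forwards the two arguments. Both end up stored behind the same Fn type, at the price of the extra indirection that std::function introduces.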

References

[1] G.L. Steele Jr., Common Lisp the Language, second ed., Digital Press, 1990.
[2] A. Shalit, D. Moon, O. Starbuck, Common Lisp the Language, first ed., Addison Wesley Publishing Company, 1996.
[3] C. Chambers, The Cecil language, specification and rationale – version 2.0, Tech. rep., 1995.
[4] C. Clifton, G.T. Leavens, MultiJava: modular open classes and symmetric multiple dispatch for Java, in: OOPSLA 2000 Conference on Object-Oriented Programming, Systems, Languages, and Applications, 2000, pp. 130–145.
[5] B. Stroustrup, The C++ Programming Language, fourth ed., Addison-Wesley Professional, 2013.
[6] K. Arnold, J. Gosling, D. Holmes, The Java Programming Language, third ed., Addison-Wesley Professional, 2000.
[7] P. Pirkelbauer, Y. Solodkyy, B. Stroustrup, Design and evaluation of C++ open multi-methods, Sci. Comput. Program. 75 (2010) 638–667.
[8] E. Gamma, R. Helm, R. Johnson, J.M. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley Professional, 1995.
[9] C. Chambers, Object-oriented multi-methods in Cecil, in: ECOOP’92 Conference Proceedings, Springer-Verlag, New York, NY, USA, 1992, pp. 33–56.
[10] R. Muschevici, A. Potanin, E.D. Tempero, J. Noble, Multiple dispatch in practice, in: G.E. Harris (Ed.), OOPSLA, ACM, ISBN 978-1-60558-215-3, 2008, pp. 563–582.
[11] E. Amiel, E. Dujardin, Supporting explicit disambiguation of multi-methods, in: P. Cointe (Ed.), ECOOP’96 – Object-Oriented Programming, in: Lecture Notes in Computer Science, vol. 1098, Springer, Berlin, Heidelberg, ISBN 978-3-540-61439-5, 1996, pp. 167–188.
[12] R. Forax, E. Duris, G. Roussel, A reflective implementation of Java multi-methods, IEEE Trans. Softw. Eng. (2004) 1055–1071.
[13] G. Van Rossum, The Python Language Reference Manual, Network Theory Ltd., 2003.
[14] J. Smith, Draft proposal for adding multimethods to C++, 2003.
[15] A. Alexandrescu, Modern C++ Design: Generic Programming and Design Patterns Applied, Addison-Wesley Professional, 2001.
[16] L. Bettini, S. Capecchi, B. Venneri, Translating double-dispatch into single-dispatch, in: Proceedings of the Second Workshop on Object Oriented Developments (WOOD 2004), in: ENTCS, vol. 138, Elsevier, 2005, pp. 59–78, http://rap.dsi.unifi.it/phpbibliography/files/doubledisp.pdf.
[17] K. Bruce, G.T. Leavens, G. Castagna, L. Cardelli, B. Pierce, On binary methods, in: Symposium on Object-Oriented Programming: Systems, Languages, and Applications, ACM, Harvard University Press, 1995, pp. 227–256.
[18] E. Dujardin, E. Amiel, E. Simon, Fast algorithms for compressed multimethod dispatch table generation, ACM Trans. Program. Lang. Syst. (ISSN 0164-0925) 20 (1) (1998) 116–165.
[19] R. Karedla, J.S. Love, B.G. Wherry, Caching strategies to improve disk system performance, Computer (ISSN 0018-9162) 27 (3) (1994) 38–46, http://dx.doi.org/10.1109/2.268884.
[20] M. Gibbs, B. Stroustrup, Fast dynamic casting, Softw. Pract. Exp. (ISSN 0038-0644) 36 (2) (2006) 139–156, http://dx.doi.org/10.1002/spe.v36:2.
[21] D.M. Berris, M. Austern, L. Crowl, Rich pointers, Document number: ISO/IEC JTC1 SC22 WG21 N3340=120030, 2012.
[22] M.D. Ernst, C.S. Kaplan, C. Chambers, Predicate dispatching: a unified theory of dispatch, in: E. Jul (Ed.), ECOOP, in: Lecture Notes in Computer Science, vol. 1445, Springer, ISBN 3-540-64737-6, 1998, pp. 186–211.
[23] H.W. Schmidt, S.M. Omohundrof, CLOS Eiffel and Sather: a comparison, Tech. rep. in: A. Paepcke (Ed.), Object-Oriented Programming: The CLOS Perspective, 1991.
[24] K. Knizhnik, Reflection for C++, http://www.garret.ru/cppreflection/docs/reflect.html, 2001.
[25] A. Margaritis, AGM::LibReflection: a reflection library for C++, http://www.codeproject.com/Articles/8712/AGM-LibReflection-A-reflection-library-for-C, 2004.
[26] S. Roiser, Boost library proposal: reflection for C++ based on Reflex library, http://lists.boost.org/Archives/boost/2006/01/99686.php, 2006.
[27] T. Devadithya, K. Chiu, W. Lu, C++ reflection for high performance problem solving environments, in: Proceedings of the 2007 Spring Simulation Multiconference – Volume 2, SpringSim’07, Society for Computer Simulation International, San Diego, CA, USA, ISBN 1-56555-313-6, 2007, pp. 435–440, http://dl.acm.org/citation.cfm?id=1404680.1404749.
[28] Y. Solodkyy, G.D. Reis, B. Stroustrup, Open and efficient type switch for C++, in: G.T. Leavens, M.B. Dwyer (Eds.), OOPSLA, ACM, ISBN 978-1-4503-1561-6, 2012, pp. 963–982.
[29] P. Pirkelbauer, S. Parent, M. Marcus, B. Stroustrup, Dynamic algorithm selection for runtime concepts, Sci. Comput. Program. (ISSN 0167-6423) 75 (9) (2010) 773–786, special issue on Object-Oriented Programming Languages and Systems (OOPS 2008), a special track at the 23rd ACM Symposium on Applied Computing.