www.ravenbrook.com -

Ravenbrook / Projects / Memory Pool System / Master Product Sources / Design Documents
               THE DESIGN FOR PROTOCOL INHERITANCE IN MPS
                          design.mps.protocol
                             incomplete doc
                            tony 1998-10-12

INTRODUCTION

.intro: This document explains the design of the support for class inheritance 
in MPS.  It is not yet complete.  It describes support for single inheritance 
of classes.  Future extensions will describe multiple inheritance and the 
relationship between instances and classes.

.readership: This document is intended for any MM developer.

.hist.0: Written by Tony 1998-10-12


PURPOSE

.purpose.code-maintain: The purpose of the protocol inheritance design is to 
ensure that the MPS code base can make use of the benefits of OO class 
inheritance to maximize code reuse, minimize code maintainance and minimize the 
use of "boiler plate" code. 

.purpose.related: For related discussion, see mail.tony.1998-08-28.16-26(0), 
mail.tony.1998-09-01.11-38(0), mail.tony.1998-10-06.11-03(0) & other messages 
in the same threads.


REQUIREMENTS

.req.implicit: The object system should provide a means for classes to inherit 
the methods of their direct superclasses implicitly for all functions in the 
protocol without having to write any explicit code for each inherited function. 

.req.override: There must additionally be a way for classes to override the 
methods of their superclasses. 

.req.next-method: As a result of .req.implicit, classes cannot make static 
assumptions about methods used by direct superclasses.  The object system must 
provide a means for classes to extend (not just replace) the behaviour of 
protocol functions, such as a mechanism for invoking the "next-method".

.req.ideal.extend: The object system must provide a standard way for classes to 
implement the protocol supported by they superclass and additionally add new 
methods of their own which can be specialized by subclasses.

.req.ideal.multiple-inheritance: The object system should support multiple 
inheritance such that sub-protocols can be "mixed in" with several classes 
which do not themselves support identical protocols.


OVERVIEW

.overview.root: We start with the root of all conformant class hierarchies, 
which is called "ProtocolClass".  ProtocolClass is an "abstract" class (i.e. it 
has no direct instances, but it is intended to have subclasses).  To use Dylan 
terminology, instances of its subclasses are "general" instances of 
ProtocolClass.  They look as follows:-

     Instance Object                    Class Object

     --------------------              --------------------
     |     sig          |    |-------->|    sig           |
     --------------------    |         --------------------
     |     class        |----|         |    superclass    |
     --------------------              --------------------
     |     ...          |              |    coerceInst    |
     --------------------              --------------------
     |     ...          |              |    coerceClass   |
     --------------------              --------------------
     |                  |              |     ...          |

.overview.inherit: Classes inherit the protocols supported by their 
superclasses.  By default they have the same methods as the class(es) from 
which they inherit.  .overview.inherit.specialize: Classes may specialize the 
behaviour of their superclass.  They do this by by overriding methods or other 
fields in the class object. 

.overview.extend: Classes may extend the protocols supported by their 
superclasses by adding new fields for methods or other data.

.overview.sig.inherit: Classes will contain (possibly several) signatures.  
Classes must not specialize (i.e. override) the signature(s) they inherit from 
their superclass(es).

.overview.sig.extend: If a class definition extends a protocol, it is normal 
policy for the class definition to include a new signature as the last field in 
the class object.

.overview.coerce-class: Each class contains a coerceClass field.  This contains 
a method which can find the part of the class object which implements the 
protocols of a supplied superclass argument (if, indeed, the argument IS a 
superclass).  This function may be used for testing subclass/superclass 
relationships, and it also provides support for multiple inheritance.

.overview.coerce-inst: Each class contains a coerceInst field.  This contains a 
method which can find the part of an instance object which contains the 
instance slots of a supplied superclass argument (if, indeed, the argument IS a 
superclass). This function may be used for testing whether an object is an 
instance of a given class, and it also provides support for multiple 
inheritance.

.overview.superclass: Each class contains a superclass field.  This enables 
classes to call "next-method", as well as enabling the coercion functions.

.overview.next-method: A specialized method in a class can make use of an 
overridden method from a superclass by accessing the method from the 
appropriate field in the superclass object and calling it.  The superclass may 
be accessed indirectly from the class's "Ensure" function when it is statically 
known (see .overview.access).  This permits "next-method" calls, and is fully 
scalable in that it allows arbitrary length method chains.  The SUPERCLASS 
macro helps with this (see .int.static-superclass). 

.overview.next-method.naive: In some cases it is necessary to write a method 
which is designed to specialize an inherited method, needs to call the 
next-method, and yet the implementation doesn't have static knowledge of the 
superclass.  This might happen because the specialized method is designed to be 
reusable by many class definitions.  The specialized method can usually locate 
the class object from one of the parameters passed to the method.  It can then 
access the superclass through the "superclass" field of the class, and hence 
call the next method.  This technique has some limitations and doesn't support 
longer method chains.  It is also dependent on none of the class definitions 
which use the method having any subclasses.

.overview.access: Classes must be initialized by calls to functions, since it 
is these function calls which copy properties from superclasses.  Each class 
must provide an "Ensure" function, which returns the canonical copy of the 
class.  The canonical copy may reside in static storage, but no MPS code may 
refer to that static storage by name.

.overview.naming: There are some strict naming conventions which must be 
followed when defining and using classes.  The use is obligatory because it is 
assumed by the macros which support the definition and inheritance mechanism.  
For every class SomeClass, we insist upon the following naming conventions:-

  SomeClassStruct  - names the type of the structure for the protocol class.  
                     This might be a typedef which aliases the type to the 
                     type of the superclass, but if the class has extended 
                     the protocols of the superclass the it will be a type which
                     contains the new class fields.

  SomeClass        - names the type *SomeClassStruct.
                     This might be a typedef which aliases the type to the 
                     type of the superclass, but if the class has extended 
                     the protocols of the superclass then it will be a type 
which
                     contains the new class fields.

  EnsureSomeClass  - names the function that returns the initialized 
                     class object.



INTERFACE

Class Definition

.int.define-class: Class definition is performed by the macro 
DEFINE_CLASS(className, var).  A call to the macro must be followed by a body 
of initialization code in braces {}.  The parameter className is used to name 
the class being defined.  The parameter var is used to name a local variable of 
type className, which is defined by the macro; it refers to the canonical 
storage for the class being defined.  This variable may be used in the 
initialization code.  (The macro doesn't just pick a name implicitly because of 
the danger of a name clash with other names used by the programmer).  A call to 
DEFINE_CLASS(SomeClass, var) does the following:
Defines the EnsureSomeClass function.
Defines some static storage for the canonical class object
Defines some other things to ensure the class gets initialized exactly once

.int.define-alias-class: A convenience macro DEFINE_ALIAS_CLASS is provided 
which both performs the class definition and defines the types SomeClass and 
SomeClass struct as aliases for some other class types.  This is particularly 
useful for classes which simply inherit, and don't extend protocols.  The macro 
call DEFINE_ALIAS_CLASS(className, superName, var) is exactly equivalent to the 
following:
  typedef superName className;
  typedef superNameStruct classNameStruct;
  DEFINE_CLASS(className, var)

.int.define-special: If classes are particularly likely to be subclassed 
without extension, the class implementor may choose to provide a convenience 
macro which expands into DEFINE_ALIAS_CLASS with an appropriate name for the 
superclass.  For example, there might be a macro for defining pool classes such 
that the macro call DEFINE_POOL_CLASS(className, var) is exactly equivalent to 
the macro call DEFINE_ALIAS_CLASS(className, PoolClass, var).  It may also be 
convenient to define a static superclass accessor macro at the same time (see 
.int.static-superclass.special).


Single Inheritance

.int.inheritance: Class inheritance details must be provided in the class 
initialization code (see .int.define-class).  Inheritance is performed by the 
macro INHERIT_CLASS(thisClassCoerced, parentClassName).  A call to this macro 
will make the class being defined a direct subclass of ParentClassName by 
ensuring that all the fields of the parent class are copied into thisClass, and 
setting the superclass field of thisClass to be the parent class object.  The 
parameter thisClassCoerced must be of type parentClassName.  If the class 
definition defines an alias class (see .int.define-alias-class), then the 
variable named as the second parameter to DEFINE_CLASS will be appropriate to 
pass to INHERIT_CLASS.


Specialization

.int.specialize: Class specialization details must be given explicitly in the 
class initialization code (see .int.define-class).  This must happen AFTER the 
inheritance details are given (see .int.inheritance).


Extension

.int.extend: To extend the protocol when defining a new class, a new type must 
be defined for the class structure.  This must embed the structure for the 
primarily inherited class as the first field of the structure.  Class extension 
details must be given explicitly in the class initialization code (see 
.int.define-class).  This must happen AFTER the inheritance details are given 
(see .int.inheritance).


Introspection

.introspect.c-lang: The design includes a number of introspection functions for 
dynamically examining class relationships.  These functions are polymorphic and 
accept arbitrary subclasses of ProtocolClass.  C doesn't support such 
polymorphism.  So although these have the semantics of functions (and could be 
implemented as functions in another language with compatible calling 
conventions) they are actually implemented as macros.  The macros are named as 
method-style macros despite the fact that this arguably contravenes 
guide.impl.c.macro.method.  The justification for this is that this design is 
intended to promote the use of polymorphism, and it breaks the abstraction for 
the users to need to be aware of what can and can't be expressed directly in C 
function syntax.  These functions all end in "Poly" to identify them as 
polymorphic functions.

.int.superclass: ProtocolClassSuperclassPoly(class) is an introspection 
function which returns the direct superclass of class object class.

.int.static-superclass: SUPERCLASS(className) is an introspection macro which 
returns the direct superclass given a class name, which must (obviously) be 
statically known.  The macro expands into a call to the ensure function for the 
class name, so this must be in scope (which may require a forward 
declaration).  The macro is useful for next-method calls (see 
.overview.next-method).  The superclass is returned with type ProtocolClass so 
it may be necessary to cast it to the type for the appropriate subclass. 

.int.static-superclass.special: Implementors of classes which are designed to 
be subclassed without extension may choose to provide a convenience macro which 
expands into SUPERCLASS along with a type cast.  For example, there might be a 
macro for finding pool superclasses such that the macro call 
POOL_SUPERCLASS(className) is exactly equivalent to 
(PoolClass)SUPERCLASS(className).  It's convenient to define these macros 
alongside the convenience class definition macro (see .int.define-special).

.int.class: ClassOfPoly(inst) is an introspection function which returns the 
class of which inst is a direct instance.

.int.subclass: IsSubclassPoly(sub, super) is an introspection function which 
returns a boolean indicating whether sub is a subclass of super.  I.e., it is a 
predicate for testing subclass relationships.


Multiple inheritance

.int.mult-inherit: Multiple inheritance involves an extension of the protocol 
(see .int.extend) and also multiple uses of the single inheritance mechanism 
(see .ini.inheritance).  It also requires specialized methods for coerceClass 
and coerceInst to be written (see .overview.coerce-class & 
.overview.coerce-inst).  Documentation on support for multiple inheritance is 
under construction.  This facility is not currently used.  The basic idea is 
described in mail.tony.1998-10-06.11-03(0).


Protocol guidelines

.guide.fail: When designing an extensible function which might fail, the design 
must permit the correct implementation of the failure-case code.  Typically, a 
failure might occur in any method in the chain.  Each method is responsible for 
correctly propagating failure information supplied by superclass methods and 
for managing it's own failures. 

.guide.fail.before-next: Dealing with a failure which is detected before any 
next-method call is made is similar to a fail case in any non-extensible 
function.  See .example.fail below. 

.guide.fail.during-next: Dealing with a failure returned from a next-method 
call is also similar to a fail case in any non-extensible function.  See 
.example.fail below. 

.guide.fail.after-next: Dealing with a failure which is detected after the next 
methods have been successfully invoked is more complex.  If this scenario is 
possible, the design must include an "anti-function", and each class must 
ensure that it provides a method for the anti-method which will clean up any 
resources which are claimed after a successful invocation of the main method 
for that class.  Typically the anti-function would exist anyway for clients of 
the protocol (e.g. "finish" is an anti-function for "init").  The effect of the 
next-method call can then be cleaned up by calling the anti-method for the 
superclass.  See .example.fail below. 


Example

.example.inheritance: The following example class definition shows both 
inheritance and specialization.  It shows the definition of the class 
EPDRPoolClass, which inherits from EPDLPoolClass and has specialized values of 
the name, init & alloc fields.  The type EPDLPoolClass is an alias (typedef) 
for PoolClass.

typedef EPDLPoolClass EPDRPoolClass;
typedef EPDLPoolClassStruct EPDRPoolClassStruct;

DEFINE_CLASS(EPDRPoolClass, this)
{
  INHERIT_CLASS(this, EPDLPoolClass);
  this->name = "EPDR";
  this->init = EPDRInit;
  this->alloc = EPDRAlloc;
}


.example.extension: The following (hypothetical) example class definition shows 
inheritance, specialization and also extension.  It shows the definition of the 
class EPDLDebugPoolClass, which inherits from EPDLPoolClass, but also 
implements a method for checking properties of the pool.

typedef struct EPDLDebugPoolClassStruct {
  EPDLPoolClassStruct epdl;
  DebugPoolCheckMethod check;
  Sig sig;
} EPDLDebugPoolClassStruct;

typedef EPDLDebugPoolClassStruct *EPDLDebugPoolClass;

DEFINE_CLASS(EPDLDebugPoolClass, this)
{
  EPDLPoolClass epdl = &this->epdl;
  INHERIT_CLASS(epdl, EPDLPoolClass);
  epdl->name = "EPDLDBG";
  this->check = EPDLDebugCheck;
  this->sig = EPDLDebugSig;
}

.example.fail: The following example shows the implemenation of failure-case 
code for an "init" method, making use of the "finish" anti-method:-

static Res mySegInit(Seg seg, Pool pool, Addr base, Size size, 
                     Bool reservoirPermit, va_list args)
{
  SegClass super;
  MYSeg myseg;
  OBJ1 obj1;
  Res res;
  Arena arena;

  AVERT(Seg, seg);
  myseg = SegMYSeg(seg);
  AVERT(Pool, pool);
  arena = PoolArena(pool);

  /* Ensure the pool is ready for the segment */
  res = myNoteSeg(pool, seg);
  if(res != ResOK)
    goto failNoteSeg;

  /* Initialize the superclass fields first via next-method call */
  super = (SegClass)SUPERCLASS(MYSegClass);
  res = super->init(seg, pool, base, size, reservoirPermit, args);
  if(res != ResOK)
    goto failNextMethods;

  /* Create an object after the next-method call */
  res = ControlAlloc(&obj1, arena, sizeof(OBJ1Struct), reservoirPermit);
  if(res != ResOK)
    goto failObj1;

  myseg->obj1 = obj1
  return ResOK;

failObj1:
  /* call the anti-method for the superclass */
  super->finish(seg);
failNextMethods:
  /* reverse the effect of myNoteSeg */
  myUnnoteSeg(pool, seg);
failNoteSeg:
  return res;
}


IMPLEMENTATION

.impl.derived-names: The DEFINE_CLASS macro derives some additional names from 
the class name as part of it's implementation.  These should not appear in the 
source code - but it may be useful to know about this for debugging purposes.  
For each class definition for class SomeClass, the macro defines the following:

extern SomeClass EnsureSomeClass(void);  /* The class accessor function. 
See.overview.naming */
static Bool protocolSomeClassGuardian;  /* A boolean which indicates whether 
the class has been initialzed yet */
static void protocolEnsureSomeClass(SomeClass); /* A function called by 
EnsureSomeClass. All the class initialization code is actually in this function 
*/
static SomeClassStruct protocolSomeClassStruct; /* Static storage for the 
canonical class object */
  

.impl.init-once: Class objects only behave according to their definition after 
they have been initialized, and class protocols may not be used before 
initialization has happened.  The only code which is allowed to see a class 
object in a partially initialized state is the initialization code itself -- 
and this must take care not to pass the object to any other code which might 
assume it is initialized. Once a class has been initialized, the class might 
have a client.  The class must not be initialized again when this has happened, 
because the state is not necessarily consistent in the middle of an 
initialization function.  The initialization state for each class is stored in 
a boolean "guardian" variable whose name is derived from the class name (see 
.impl.derived-names).  This ensures the initialization hapens only once.  The 
path through the EnsureSomeClass function should be very fast for the common 
case when this variable is TRUE, and the class has already been initialized, as 
the canonical static storage can simply be returned in that case.  However, 
when the value of the guardian is FALSE, the class is not initialized.  In this 
case, a call to EnsureSomeClass must first execute the initialization code and 
then set the guardian to TRUE.  However, this must happen atomically (see 
.impl.init-lock).

.impl.init-lock: There would be the possibility of a race condition if 
EnsureSomeClass were called concurrently on separate threads before SomeClass 
has been initialized.  The class must not be initialized more than once, so the 
sequence test-guard, init-class, set-guard must be run as a critical region.  
It's not sufficient to use the arena lock to protect the critical region, 
because the class object might be shared between multiple arenas.  The 
DEFINE_CLASS macro uses a global recursive lock instead.  The lock is only 
claimed after an initial unlocked access of the guard variable shows that the 
class in not initialized.  This avoids any locking overhead for the common case 
where the class is already initialized.  This lock is provided by the lock 
module -- see design.mps.lock(0).
B. Document History

2002-06-07
Converted from MMInfo database design document.
C. Copyright and License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Redistributions in any form must be accompanied by information on how to obtain complete source code for the this software and any accompanying software that uses this software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions. For an executable file, complete source code means the source code for all modules it contains. It does not include source code for modules or files that typically accompany the major components of the operating system on which the executable file runs.
This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.
A. References

B. Document History

C. Copyright and License