Archive for April, 2007

KBattleship: almost ready

April 13, 2007

This week I’ve been hacking quite a lot on KBattleship, adding almost all required features: a decent AI, network play and sounds. While Riccardo is working on KWelcomeScreen, a library class inspired by this blog post by Johann, I’m going to make some minor adjustments, add small missing features like scoring, and if everything goes well I guess we can move the application to kdereview within next week.
This is a screenshot showing a game between the old KBattleship and my rewrite:
KBS4 screenshot

Advertisements

Dynamic types and virtual inner types

April 5, 2007

The concept based polymorphism I tried to explain in my previous post can be more conveniently expressed with another syntax which avoids concepts at all.

The key observation is that runtime polymorphism and F-bounded polymorphism are very similar in nature, and, to some extent, the latter can be implemented using the former. If I have understood correctly, that is exactly how the Java virtual machine implements generics.

For example, suppose you have an abstract class Shape with a virtual function draw, and several concrete classes like Square, Circle, Triangle, which override draw.
Client code using runtime polymorphism would look like:

void render(Shape* shape)
{
  shape->draw();
}

while using generic programming one could write:

template <typename S>
where Inherits<S, Shape>
void render(S* s)
{
  s->draw();
}

assuming that the concept Inherits<T, U> is verified when T inherits from U (possibly indirectly). An approximation for Inherits is

auto concept Inherits<typename T, typename U>
{
  // T can be implicitly converted to U
  T::operator U();
}

Using the generic code in such situations should probably be considered a mistake, because if one wants to use template based polymorphism, defining the abstract class Shape is pointless. It would be better to directly define a ShapeConcept having a member function draw.

However, the example suggests that a syntax resembling templates could be used to express runtime polymorphism:

template <Shape! S>
void render(S* s)
{
  s->draw();
}

the idea being that S is the runtime type of s. The compiler could instantiate the template immediately and implement the body just like a Shape pointer were used instead of S. Here S is not really a type, since for example expressions like

new S

should be rejected; let’s call things like S dynamic types. They behave much like incomplete types (cannot be instantiated directy, sizeof cannot be called on them), but are a different beast: for the purpose of typechecking they are just subtypes of their parent concept (Shape, in this example), but for code generation they are considered equivalent to it.

Why could all of this syntax be possibly useful? Well, when using ordinary abstract classes, it does not really add anything to the language, but it shows its power when another (more than syntactical) extension is introduced: virtual inner types.

Just like the name suggests, virtual inner types are inner types which behave polymorphically. The typical (and motivating) example here is still that of the abstract factory. So suppose you have the following abstract factory:

class AbstractAnimal;
class AbstractFood;

class AbstractFactory
{
public:
  virtual AbstractAnimal* createAnimal() = 0;
  virtual AbstractFood* createFood() = 0;
};

and the following abstract products:

class AbstractAnimal
{
public:
  virtual void eat(Food*) = 0;
};

class AbstractFood
{
public:
  virtual int quality() const = 0;
};

As explained in my previous posts [1] [2], there’s no way to explain to the compiler that the Food that an animal is able to eat is only that created by the same factory which created that very animal.
Virtual inner types could be used to solve the problem:

class AbstractAnimal;
class AbstractFood;

class AbstractFactory
{
public:
  virtual typedef AbstractAnimal Animal;
  virtual typedef AbstractFood Food;
  virtual Animal* createAnimal() = 0;
  virtual Food* createFood() = 0;
};

class AbstractAnimal
{
public:
  virtual typedef AbstractFactory Factory;
  virtual void eat(Factory::Food* food) = 0;
};

class AbstractFood
{
public:
  virtual typedef AbstractFactory Factory; // unused, only here for completeness
  virtual int quality() const = 0;
};

The idea is that virtual typedef Base X forces inherited class to define an inner type X inheriting from Base. Furthermore, this type behaves polymorphically (the type itself, not only instances!). That means that if A is a dynamic type with parent AbstractAnimal, then A::Factory is the correct dynamic type with parent AbstractFactory.

To explain this fact in more detail, let’s continue the example:

class Cow;
class Grass;

class CowFactory : public AbstractFactory
{
public:
  typedef Cow Animal;
  typedef Grass Food;
  Cow* createAnimal();
  Grass* createFood();
};

class Cow : public Animal
{
public:
  typedef CowFactory Factory;
  void eat(Grass* grass);
};

class Grass : public Food
{
public:
  typedef CowFactory Factory;
  int quality() const;
};

notice that we have defined inner types corresponding to the virtual typedefs of parent classes. Besides, the argument to eat is Grass and not AbstractFood.
Client code, in fact, cannot simply pass an AbstractFood object to the eat member function. The following function should not compile:

void feed(AbstractAnimal* animal, AbstractFood* food)
{
  animal->eat(food);
}

but can be fixed with the use of dynamic types

template <AbstractFactory! Factory>
void feed(Factory::Animal* animal, Factory::Food* food)
{
  animal->eat(food);
}

expressing the fact that both animal and food come from the same factory of dynamic type Factory. The food parameter (actually stored in an AbstractFood variable), is statically downcasted to Grass before being passed to the eat member function. The static typechecking via dynamic types ensures that such a downcast is always safe, so there is no need to perform a dynamic cast.

These ideas for a language extension are essentially equivalent to the ones expressed in the previous post, but seem to require less drastic changes in compiler implementations. Furthermore, they don’t use concepts, so it could be even implemented on top of the existing gcc.

Covariance versus substitutivity

April 3, 2007

The covariant argument problem which I tried to explain in a previous post has its roots in the incompatibility between covariance and substitutivity in a static typed world.

Substitutivity is considered the most important feature in the design of an object oriented language, though it is often underrated or misunderstood by many users of the language. Basically, it states that subtyping is a “is-a” relation, i.e: any object of a derived class can be used everywhere an object of the base class in required, expressed with C++ terminology. In languages that follow this rule, derived classes may only widen the interface of their parent, since narrowing it could cause conflicts which are unpredictable at compile time. Narrowing the interface could be done in many ways, so they are all prohibited in languages like C++, C# or Java:

  • removing a virtual function of the base class;
  • contravariantly changing the return type of a virtual function (i.e. moving it up in the inheritance hierarchy);
  • covariantly changing the type of any argument of a virtual function (i.e. moving it down in the inhritance hierarchy).

The problem is that any of these possibilities where allowed, a piece of code that is able to operate on an object of the base class could “use a feature” that the derived class has removed (or incompatibly modified) and cause runtime failures.

For example, suppose that class Cow inherits Animal, where the latter has a virtual function eat which takes a Food object as an argument. It seems natural to override eat in Cow so that it takes a Grass (cows don’t eat any food), but if we do so, the following function

void f(Animal* a)
{
  Food* f = new Food;
  a->eat(f);
}

which is perfectly valid when a is of class Animal, fails when called with an instance of Cow, and this violates substitutivity.

The wikipedia article about overriding is a perfect example of how much confusion the substitutivity rule can cause. The example in the article uses runtime exceptions to narrow the interface of the base class (note that the language used there is dynamically typed, so there would be no reason to do so). Using exception works (even in static typed languages, of course) but has several drawbacks:

  • moves a part of typechecking to runtime, and that goes against the idea of having static typing in the first place;
  • clutters the code with typechecking statements;
  • feels a bit like a hack: it is essentially working around a limitation of the programming language.

Using exceptions for narrowing the interface of derived classes, the previous example could be worked out as follows:

class Cow : public Animal
{
  virtual void eat(Food* food)
  {
    if (Grass* grass = dynamic_cast<Grass*>(food)) {
      // method body
    }
    else {
      throw typechecking_exception(/* ... */);
    }
  }
};

I guess at this point some of you may still feel a bit unsure about why would anyone need to narrow the interface of derived classes.
Indeed, many times, the urge of doing so turns out to be a direct consequence of a flawed object oriented design. Look at the Person / Baby example from the wikipedia article, for instance: the error there is that Baby should not be modelled as a subclass of Person. It is probably better to have a (likely abstract) class HumanBeing, with Person and Baby being both derived classes. If Baby needs to reuse part of Person’s implementation, it’s probably better to consider using incapsulation (after all inheritance is not for code reuse).

However, there are situations (like the one I tried to describe here) where the need to narrow interfaces while inheriting is justifiable. These situations can be distinguished by the fact that one is willing to sacrifice substitutivity, because sometimes it even gets in the way, and it is not a valuable feature anymore, but an obstacle for a clean design.
In fact, it was clear from the previous examples that interface narrowing (called CAT in Eiffel terminology) is not an evil thing per se; it’s just that it can cause runtime failures when the substitutivity principle is applied.
For an example of a situation where a CAT is necessary, take the Abstract Factory pattern. For those who don’t have familiarity with this technique, it basically consists of an abstract factory class, which is able to create abstract products. Each abstract product is represented by a different abstract class that is the return type of a specialized (abstract) factory method of the abstract factory class. Concrete products must then be defined by inheritance; they are returned by the overridden factory methods of a concrete factory class. Several factory classes associated with different concrete products can be defined. The client only uses the abstract products and the abstract factory, and this allows for great generality in that the client is able to deal with a number of different product families, just by changing the particular concrete factory it uses (and this can be even done at runtime).

The fact that the client only accesses products through the particular concrete factory it happens to be using implies that it won’t be mixing concrete products belonging to different families (i.e. created with different concrete factories). There are some facts worth noticing here:

  • Substitutivity does not apply. Or better, it should not be applied. Applying it would mean that incompatible products could be mixed, probably resulting in undefined behaviour.
  • It is client code’s responsibility to use the correct factory. If more than one factory is used, respective products should not be intermixed.
  • If the abstract products (hence the concrete ones) are interdependent, an interface narrowing could be needed. To stick with the previous example, suppose that both Animal and Food are abstract products, Cow and Grass being corresponding concrete implementations. If an Animal takes a Food object as an argument to one of its methods, how can this method be overridden in the concrete class? The dilemma here is that (assuming a well behaved client) we know that the method is going to be called with a Grass argument, but we are forced to declare it as a general Food because we need to preserve the parent signature.

Mainstream static typed languages do not offer a clean solution to the problems outlined here. The template wrapper workaround used in KBoard is nice, but has several drawbacks as well (I’ll probably talk about it in more detail in another post). Obviously, such a solution could be simplified by direct language support, but the fact that typechecking has been partially moved to runtime makes it suboptimal anyway, so that it is not probably worth including it in the language, if it claims to be statically typed.
The main question here is: “it is possible to give up substitutivity in a hierarchy to achieve greater power on the customization of derived classes, while having all type checking done at compile time?”.

To see why the answer has to be yes, let us find a solution to the problem using generic polymorphism instead of virtual functions.
It turns out that, using generic programming facilities, the Abstract Factory pattern can be implemented without using an abstract factory class at all! Here is a sample realization of the previous scenario using C++ templates:

class Cow;
class Grass;

class CowFactory
{
public:
  typedef Cow Animal;
  typedef Grass Food;

  // factory methods are not needed! one can simply use "new CowFactory::Food"
}

class Cow
{
public:
  virtual void eat(Grass* g) { /* ... */ }
};

class Food
{
  // ...
};

As you can see, there are no abstract classes at all. We directly implement concrete products following an implicit interface. Client code could be structured like this:

template <typename Factory>
class Client
{
  void f()
  {
    typename Factory::Animal animal;
    typename Factory::Food food;

    animal.eat(food);
  };
};

This may look like a perfect solution: elegant, clean, efficient (notice that there are no virtual functions, and even dynamic allocations are avoided) and just as general and powerful as that based on runtime polymorphism, but unfortunately it has a fundamental drawback which makes it unfeasible in all practical circumstances: it requires the client code to have a Factory template argument.
This has a series of unfortunate consequences:

  • Client code has to be completely recompiled whenever you change the factory or the products.
  • There is a copy of the client code for each product family. If there are a lot of families, this leads to code bloat.
  • There’s no way to dynamically load a product family at runtime. A family cannot reside in a dynamic library. This is because genericity here depends on template instantiation, which can only happen at compile time.

Nevertheless, it is possible to elaborate on that idea to define a language extension which almost perfectly solves the problem. My idea is based on concepts, a new C++ feature which will be introduced with the new language standard C++0x.
To begin with, I’ll show you how the ConceptC++ language can be extended so that abstract classes can be modelled by concepts. Let’s stick to the Java terminology of calling interfaces those abstract classes with no data members and all pure virtual functions.
There is an obvious natural map from interfaces to concept, mapping for example a class like

class A
{
  virtual void f(int x) = 0;
  virtual Foo* g() const = 0;
};

to the concept

concept A<typename T>
{
  void T::f(int x);
  Foo* T::g() const;
};

Code using the interface A could be rewritten using a template parameter T constrained by the concept A, and the semantics would be preserved, as well as the typechecking, with all pros and cons of the generic abstract factory implementation outlined above. Indeed, that was a particular case of this transformation, where we have simply omitted the concepts involved.

Let’s now consider an implementation of the A interface

class X : public A
{
public:
  void f(int x) { cout << x << endl; }
  Foo* g() const { return new Foo; }
};

Is it possible to extend the language so that the above class compiles even when A is a concept? If the concept A has a single argument, no constructor or free function constraints, and no associated types, the answer is yes. The compiler needs to create an abstract class corresponding to the concept and have X inherit from it.

In this case we did not gain anything, but let’s broaden a little bit the set of admissible concepts. Define a concept dynamic if it has a single argument, no constructor or free function constraints, and all its associated types are constrained by dynamic concepts and never used by value (probably other restrictions are needed, but you get an idea). For example:

concept AbstractFoo<typename T>
{
  int T::foo() const;
};

concept A<typename T>
{
  AbstractFoo Foo = T::Foo;
  void T::test(Foo* x) const;
  Foo* T::create() const;
};

class MyFoo : public AbstractFoo
{
public:
  int value;
  MyFoo(int value) : value(value) { }
  int foo() const { return value; }
};

class X : public A
{
public:
  typedef MyFoo Foo;
  void test(MyFoo* f) const { f->value = 37; }
  MyFoo* create() const { return new MyFoo(42); }
};

Since AbstractFoo and A are dynamic concepts by the above definition, the compiler lets you treat them as abstract classes, and inherit from them.
Under the hood, the compiler creates the corresponding abstract classes and virtual tables, and behaves just like the concepts where abstract classes at all. If you are wondering what all this complication is for, just look at the argument of the test member function: it is an associated type, and as such, covariantly modified in the derived class. This could not be possible with normal inheritance.

To see how all of this can possibly work (and be statically checked), let’s write down some usage examples:

void evil()
{
  A* a = new X;
  AbstractFoo* foo = new MyFoo(15);
  a->test(foo);
}

void safe()
{
  A* a = new X;
  a->test(a->createFoo());
}

The compiler has no way to understand that the evil function doesn’t actually do anything evil, so it has to give a type checking error on the test call there.
However, the safe function does something interesting: it calls test with a parameter which is of the Foo inner type of a’s actual type. Therefore, the compiler can safely assume that no type mismatch is taking place, and accept the call.

But what if one wants to store the result of createFoo in some member of temporary variable? What is the type of this variable? Here we need to invent a bit of new syntax. I like the idea of being able to write something like the following:

A* a = new X;
conceptof(a)::Foo* foo = a->createFoo();
a->test(foo);

Everything can be checked at compile time, since we know from the concept specification that createFoo is going to return the Foo inner type of the actual type of a, and we can typecheck the second and third line without knowing this type exactly. The conceptof keywords extracts the dynamic concept associated to an object and returns a dynamic type.
Dynamic types can be used just like ordinary types: they can be typedef’d like this:

A* a = new X;
typedef A! conceptof(a) T;
T::Foo* foo = a->createFoo();
a->test(foo);

(note the exclamation mark), and used as template parameters:

template <A! T>
void f(T* a)
{
  T::Foo* foo = a->createFoo();

}

in which case the template is immediately instantiated.

For completeness’ sake, I’ll translate the previous example about cows and grass to dynamic concepts:

concept GenericFactory<typename Factory>
{
  GenericFood Animal = Factory::Animal;
  GenericAnimal Food = Factory::Food;

  Animal* Factory::createAnimal();
  Food* Factory::createFood();
};

concept GenericFood<typename Food>
{
  GenericFactory Factory = Food::Factory;
  int Food::quality();
};

concept GenericAnimal<typename Animal>
{
  GenericFactory Factory = Animal::Factory;
  void Animal::eat(GenericFactory::Food* f);
};

class CowFactory;

class Grass : public GenericFood!
{
public:
  typedef CowFactory Factory;
  int quality() { return 0; }
};

class Cow : public GenericAnimal!
{
public:
  typedef CowFactory Factory;
  void eat(Grass* grass) { }
};

class CowFactory : public GenericFactory!
{
public:
  typedef Cow Animal;
  typedef Grass Food;

  Cow* createAnimal() { return new Cow; }
  Grass* createFood() { return new Grass; }
};

class Client
{
public:
  void f(GenericFactory* factory) {
    typedef GenericFactory! conceptof(factory) Factory;
    Factory::Animal* animal = factory->createAnimal();
    Factory::Food* food = factory->createFood();
    animal->eat(food);
  }

  template <GenericFactory! Factory>
  void feed(Factory::Animal* animal,
            Factory::Food* food)
  {
    animal->eat(food);
  }
};

Of course, these are only a bunch of ideas, and I cannot be sure that this approach is feasible. I would love to hear the opinion of some C++ expert about this stuff.