Wednesday, August 18, 2010

Why Inheritance Sucks (in Ruby, at least)

[Update: Some readers were up in arms over this post, probably because the original title didn't specify which language I'm talking about. To clarify, I'm talking about inheritance in Ruby, compared to using modules, also in Ruby. My point here is that inheritance is an essential feature in C++/Java/C#, but not as much in Ruby. No, I'm not saying that Java should drop inheritance anytime soon.]

I came to Ruby from a static language background (C++, Java), and I had a hard time leaving my old habits behind. In particular, as a Ruby beginner I tended to overuse inheritance. These days, I rarely use inheritance at all. Instead, I use modules. Let's look at the difference.

When you use inheritance, the superclass becomes an ancestor of the subclass. When you call a method, Ruby walks up the chain of ancestors until it finds the method. So, objects of the subclass also get the methods defined in the superclass.
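A minimal sketch of this, using the Bird and Duck classes from the post's example:

```ruby
# Classic inheritance: Duck gets Bird's methods through the superclass.
class Bird
  def fly
    "I'm flying!"
  end
end

class Duck < Bird
end

Duck.new.fly    # the method is found on Bird, one step up the chain
Duck.ancestors  # => [Duck, Bird, Object, Kernel, BasicObject]
```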


When you use modules, the module also becomes an ancestor of the class, just like the superclass does.


When you call a method, Ruby still walks up the ancestors chain until it finds the method. The net effect is exactly the same as with inheritance, except that Bird is now a module instead of a class. So, having a method in a superclass or having the same method in a module doesn't make much difference in practice.
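Here is the same Bird/Duck example with a module standing in for the superclass:

```ruby
# Same effect with a module: Bird is now a module, not a superclass.
module Bird
  def fly
    "I'm flying!"
  end
end

class Duck
  include Bird
end

Duck.new.fly    # Ruby finds fly() on the Bird module in the ancestors chain
Duck.ancestors  # => [Duck, Bird, Object, Kernel, BasicObject] -- same chain as before
```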

However, modules are generally more flexible than superclasses. Modules can be managed at runtime, because include is just a regular method call, while superclasses are set in stone as you write your class definitions. Modules are much easier to use and test in isolation than a tightly coupled hierarchy of classes. You can include as many modules as you like, while you can only have one superclass per class. And finally, when you get into advanced Ruby, modules give you much more flexibility than classes, so you can use modules to cast magic metaprogramming spells like Singleton Methods and Class Extensions.
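For instance, because include is just a method call, you can mix a module into a class at runtime, long after the class was defined (the Swimming module here is a made-up example):

```ruby
module Swimming
  def swim; "paddling along"; end
end

class Duck; end

duck = Duck.new
duck.respond_to?(:swim)  # => false

# include is just a method call, so it can happen at runtime
# (send is used because Module#include was private before Ruby 2.1)
Duck.send(:include, Swimming)
duck.respond_to?(:swim)  # => true, even for objects created earlier
```

You can't do anything like this with a superclass, which is fixed once the class definition is parsed.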

If inheritance is so much worse than modules in Ruby, then why do languages like Java and C# rely on inheritance so much? There are two reasons why you use inheritance in these languages. The first reason is that you want to manage your methods - for example, re-use the same method in different subclasses. The second reason is that you want to upcast the type of a reference from a subclass to a superclass - that's the only way to get polymorphism in Java. The first reason is not as valid in Ruby, because you can just as well use modules to manage your methods. However, upcasting is more interesting.

Java is both compiled and statically typed, so the compiler can analyze your code and spot type-related mistakes. In particular, it can spot upcasting mistakes: if you have a method that takes Minerals, and you pass a Dog to the method, then the compiler will complain that a Dog is an Animal, not a Mineral, so you cannot upcast a Dog reference to a Mineral reference. In Ruby you don't declare your types, so you don't have upcasting at all. Even if you did have upcasting, you wouldn't have a compiler double-checking it. So you don't get the same advantages out of inheritance in Ruby compared to Java.
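In Ruby, the place of upcasting is taken by duck typing: any object that responds to the method can be passed, with no declared types and no compiler check. A sketch (the classes here are invented for illustration):

```ruby
# No type declarations: anything that responds to #fly can be passed.
class Airplane
  def fly; "jet engines roaring"; end
end

class Sparrow
  def fly; "flapping wings"; end
end

def launch(flyer)
  flyer.fly  # duck typing: no upcast needed, and no compiler to check one
end

launch(Airplane.new)  # => "jet engines roaring"
launch(Sparrow.new)   # => "flapping wings"
```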

"Wait a minute," I hear you say. "Some of the limitations of inheritance are actually a good thing! Including multiple modules in Ruby is just like having multiple inheritance in C++, and multiple inheritance is a big mess. That's why Java and C# force you to inherit from a single class." This is the "diamond" problem that my original C++ mentor used to warn me about: if your class has two superclasses, and they both inherit from yet another superclass, then you get a diamond-shaped inheritance chain that can potentially be confusing. Wouldn't modules be a throwback to this kind of headache?

In practice, however, Ruby modules tend to be more manageable than multiple superclasses. In Ruby, the chain of ancestors always follows a single path, where each module or class can only appear once - so you can't have diamond-shaped inheritance. If you understand how Ruby builds the chain of ancestors, you're never going to find yourself in an ambiguous situation where you don't know which method is called: simply enough, Ruby always calls the version of the method that's lower on the ancestors chain. (You can still get a clash if two separate modules reference instance variables with the same name, but that rarely happens in practice.) More crucially, the way you write code in Ruby is different from the way you write code in a static language. If you get used to crazy stuff like replacing methods with Monkeypatches, then there is no reason why you shouldn't get used to managing methods with modules.
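A sketch of the "lower on the ancestors chain wins" rule: when two modules define the same method, the one included last sits lower on the chain (closer to the class), so its version is the one that gets called.

```ruby
module Walking
  def move; "strolling"; end
end

module Flying
  def move; "flying"; end
end

class Bird
  include Walking
  include Flying  # included last, so it sits lower on the chain than Walking
end

Bird.new.move  # => "flying" -- no ambiguity: Ruby takes the lowest match
Bird.ancestors # => [Bird, Flying, Walking, Object, Kernel, BasicObject]
```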

It took me literally years to get rid of my tendency to think in inheritance. Now I finally understand why large Ruby projects such as Rails barely use inheritance at all, and rely almost exclusively on modules.

Use inheritance sparingly.

21 comments:

  1. Hello Paolo,
    I agree with you, I find modules the most flexible way to encapsulate behaviour. Nowadays I think that many so called OO languages such as Java and especially C++ should be called Class Oriented languages, their flawed object model based on the questionable axiom that inheritance is the preferred way to share behaviour has corrupted many programmers and produced too much complexity.
    That is in contrast to clean OO design such as Smalltalk and ..ruby.

    Thank you for your article.

    ReplyDelete
  2. From the sound of it, "modules" do the same thing as "inheritance", and then some. Clearly, you're still using a form of inheritance. Clearly, you don't actually think that inheritance sucks.

    If you did, you would have switched to functional programming by now.

    I, on the other hand, do think inheritance (including your usage of modules) sucks, for a simple reason: derived classes depend too much on their ancestors. It creates big and implicit interfaces. Big interfaces are bad because they mean high coupling and are difficult to keep in your head at once.

    Most of the time, bare closures are simpler.

    ReplyDelete
  3. Loup: one difference between using a module and using inheritance is that modules are bound dynamically, so they're not as strictly coupled as classes/subclasses. You can use or test both the including class and the included module in isolation, happily ignoring their relationship.
    On the other hand, if you want to go out with a subclass, then you'll have to take the entire family of superclasses aboard.

    One outstanding example of this is ActiveRecord. The entire library basically consists of a single class that is enriched by including a *lot* of modules. Both the base class and the modules can be tested, used and documented in isolation, so even if the resulting interface is huge, it doesn't get in your way nearly as much as a chain of superclasses does. The resulting code is surprisingly decoupled and maintainable.
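As a minimal sketch of this kind of isolated testing (the Validations module and the host here are invented for illustration, not taken from ActiveRecord): you can exercise a module on its own by mixing it into a throwaway host object.

```ruby
# A module with behavior that expects its host to provide #name.
module Validations
  def valid?
    !name.nil? && !name.empty?
  end
end

# Test the module in isolation with a minimal stand-in host,
# without dragging in any real class hierarchy.
host = Struct.new(:name) { include Validations }
host.new("Donald").valid?  # => true
host.new(nil).valid?       # => false
```

A superclass can't be exercised this way without instantiating the whole chain above it; a module only needs a host that honors its small implicit contract.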

    In a sense, this approach is more akin to composition than inheritance. Basically, ActiveRecord easily does with modules what large Java projects struggle to do with dependency injection frameworks.

    >Clearly, you don't actually think that
    >inheritance sucks. If you did, you would
    >have switched to functional programming
    >by now.

    In a statically typed OOP language, inheritance is necessary. In a dynamically typed language like Ruby, you have generally better options. I'll readily admit that functional programming cleans the slate of any such problem, but I still like OOP - especially when done right, as miky above suggests.

    ReplyDelete
  4. Paolo: Would you say that you could live happily with a programming language that has no inheritance at all—only mixins? I thought about that question a lot for myself lately and I think I wouldn't need inheritance.

    ReplyDelete
  5. Excellent post. I think you can use some of the same techniques in static languages: while you can't dynamically change C++ base classes, you can use multiple inheritance mostly the same way you use mixins in Ruby. Scala traits also use the same idea.

    ReplyDelete
  6. You're definitely overstating it when you say "This is the dreaded "diamond" problem that C++ programmers learn to fear."

    In C++ this is solved by making your inheritance virtual. This isn't something to dread, it is merely a single word of syntax!

    Also, your statement "if your class has two superclasses that both define an instance variable named x, then you can have a hard time specifying which x you're using" is essentially wrong. The issue with diamond inheritance is that the supermost base class gets inherited twice, giving your class two copies of this class (that we'll call A). Virtual inheritance tells the C++ compiler to check for this and ensure only a single A class is inserted into your inheriting D class.

    A simple note: "Simply enough, Ruby always calls the version of the method that's lower on the ancestors chain."

    So does C++.

    The real reason languages don't do multiple inheritance anymore has nothing to do with helping the programmer. In fact all the uses you're describing are handled perfectly in C++.

    The real reason is how hard they are to create. Writing the language specs and especially writing the compilers for multiple inheritance is really hard and really complicated. In the end the system is much better for the programmer but at a large cost to the language maintainers (both spec and compiler writers).

    The more people capable and interested in writing a compiler for your language, the more it will succeed. This creates a downward pressure on languages to be easier to create and maintain. C++ ignored this overhead and saw the cost of doing so. For an example of where C++ went too far, check out "extern templates." The elusive C++ feature that no compiler has ever really implemented.

    ReplyDelete
  7. aias: In a statically-typed OOP language, inheritance is a key feature because of upcasting. In a dynamic language, however, I think that inheritance isn't necessarily the best option. In Ruby, it's nice to have, but it makes the language a tad more complex: instead of having a uniform chain of ancestors, you have two special cases, superclasses and included modules. Is Ruby inheritance worth the added complexity? I don't have a very strong opinion on that, but I do think that it's easy to overuse inheritance and incur the opportunity cost of not using modules enough. So I'll stay on the safe side and challenge the need for inheritance in my code when I'm tempted to use it.

    ReplyDelete
  8. Dustin: You're right that I misrepresented the diamond problem. I edited the post to fix that.

    You're also right that C++ has ways around the diamond problem, but they're not very intuitive until you really wrap your head around the language. Until you do, you can easily get tripped.

    I tend to disagree with you that multiple inheritance fell out of fashion just because it's difficult to implement in compilers. While that's certainly true, there are other reasons. I think that the official stance of Java was that multiple inheritance made life more complex for programmers. However, I suspect that the core reason to drop multiple inheritance is that multiple inheritance doesn't fit very well with a singly-rooted class hierarchy, which is a useful feature of Java and C#.

    ReplyDelete
  9. I love the religious debates of what is better, inheritance or modules, Ruby or C++, functional or OOP... or whether the iPhone is better than Android or working from home is better than working from the office. Ridiculous.

    Fact: Every language has strengths and weaknesses and they all have pros and cons. I challenge anyone that knows their language in any kind of depth to look me in the eye and tell me with a straight face that their language is the best. You can't.

    Any developer worth their salt will tell you that you don't tap a keg with a jackhammer, you don't dig up a road with a chisel, you don't kill a fly with a demolition ball.

    These programming languages and methodologies you all harp on about being the best may certainly be the best in specific situations, but they quite probably suck in others. They're just tools in your toolbox, you should learn them all and learn to apply them in the correct situation.

    Most of the time, you will actually find that you can borrow ideas from one language and use them in another with a little ingenuity, and it may work really well in some situations.

    No language is the best language in all situations. You don't program queries on a SQL database or write high-level web interfaces in assembler (unless you're psychotic). Ruby, like Prolog, sucks ass for flashing my BIOS. You don't write low level hardware drivers in SQL or Java. You don't write anything in ASP.NET web forms...period (okay, I'm being facetious, even ASP.NET web forms have their place).

    My point is that every language has a purpose and an objective in mind when it was designed. It doesn't mean it can't do other things, but it's highly unlikely that it will perform as well at those things as another language designed specifically for that purpose.

    You should know enough about a language to know what its purpose is and when it's best to use or avoid it (or completely ignore it) for any given situation.

    So for the arguments "inheritance sucks", "modular programming is better", "functional programming is best" - the fact is, they all have places where they excel and places where they suck.

    End of rant...

    Thanks for listening.

    ReplyDelete
  10. Ben: I didn't write that inheritance in Java sucks compared to modules in Ruby. That'd be comparing apples and oranges. What I wrote is that inheritance in Ruby sucks compared to modules in Ruby. I suspect that you're commenting on the post title rather than the post text here, but it's too late to change the title now (and besides, I dig pulp titles ;) ).

    As much as I like Ruby, I still write Java code, and I wouldn't do without inheritance and upcasting in Java for sure. Ruby picked a different set of trade-offs. As you rightly say, different problems require different tools.

    ReplyDelete
  11. Hi Paolo, you didn't mention the *other* way to get polymorphism in Java, that is to use interfaces... that's a nice feature that makes Java suck less :-)

    ReplyDelete
  12. Matteo: I was focusing on Ruby, where interfaces wouldn't make sense. In a statically typed language, interfaces are a good way to get polymorphism without the clumsiness of class inheritance.

    ReplyDelete
  13. Hello, my name is Andrea Garbellini. I got your contact details from Claudio Bellentani. I'm urgently interested in involving you in an agile project, using the SCRUM method.
    Here's a direct contact:
    3334386543
    info(at)omnianova.com
    talk to you soon

    ReplyDelete
  14. Hmm... so modules in Ruby are like interfaces in Java?

    ReplyDelete
  15. Another thing: The creator of Java, James Gosling, once said that if he were to create Java from scratch again, he would leave out the inheritance feature, because it is evil. Always program to an interface, and don't inherit ;)

    ReplyDelete
  16. Hey, DreamerForever.

    Actually Ruby modules are emphatically *not* similar to Java's interfaces. Modules contain executable code - interfaces just describe the methods that you can potentially call on an implementer (and maybe global constants). In that respect, modules are much closer to traditional inheritance, but they don't have the side effect of strongly coupling classes (I guess that's the "evil" that Gosling refers to) - so they have some of the advantages of inheritance, together with some of the advantages of composition.

    One good way to see the difference between modules and interfaces: look at the example in the post above, and imagine implementing that with Java interfaces. You'll get stuck, because you don't have a place to put the implementation of the fly() method.

    Interfaces are a very useful feature for a static language such as Java, but they wouldn't be nearly as useful in a dynamic language such as Ruby, where you don't have upcasting.

    ReplyDelete
  17. To me, it doesn't make sense to add a Bird to a Duck. This is a conceptual problem. Also, please don't come up with "programming models should not adhere to conceptual models", as this is ultra counterproductive. So, while a Duck is a Bird, inheritance will work.

    ReplyDelete
  18. d1Bug,
    you seem to imply that there is a platonic perfect "conceptual model in the sky", and it's inherently hierarchical - so you should use inheritance in your programming model. I disagree on that. I believe that the Java inheritance-centered way of thinking is an acquired skill. Try to teach Java OOP to programmers with a procedural background, and see how hard it is for them to match that programming model to their own conceptual models.

    You seem to speak from the position of someone who natively thinks in hierarchies, but hierarchies come with plenty of conceptual mismatches of their own. In Java, for example, I often have to resort to complex patterns to work around the constraints of single inheritance (which doesn't fit my conceptual model of most problems), or the similarly constraining idea that an object's class never changes during the object's lifetime. There are worse, and more subtle, mismatches, but you catch my drift. I think you just learned to automatically think around those constraints when you ponder your models, so they became invisible to you.

    You can conceptualize a problem in many different ways, and the way you pick depends mostly on your mental toolbox. Try telling a LISP programmer that the OOP programming model is the ideal fit for conceptual models in general, and watch the sparks fly. The LISPer is as biased as you and me, but he has a point: there are different ways to think about problems. Empirically, after I wrote a lot of code in both Java and Ruby, I find that the Ruby programming model tends to fit many problems better than the Java programming model.

    ReplyDelete
  19. Following up on the previous comment, and to be concrete: when a Ruby programmer sees that Duck "includes" Bird, she doesn't think "I'm adding a Bird to a Duck", like you did. She thinks: "a Duck has the qualities of a Bird". That is the conceptual meaning of module inclusion in this language. Once you get used to it, you'll find it as natural as "a Duck is a Bird". It took me a while to get there - but then, I remember taking a while to fully internalize the "is_a" relationship as well.

    ReplyDelete
  20. Hey Paolo, thanks for your reply. The first thing that I should mention is how modules are an easy way to describe anything (like a class) but without defining a domain and purpose. It's just some piece floating in the sky, as irrelevant as it could be. What you (and many others) are proposing is to create a "conglomerate" of components without order. Nature doesn't work like that, and neither does the mental process, so neither should systems programming. This is module hell.

    The problem with the current generation of programmers is that they tend to simplify everything that they think is formal, and of course, old and slow. "Modern languages" seems to me a funny and terrible mantra for this generation.

    I like to think of a model that's evolutionary and based on archetypes, not only on hierarchies. For example, all humans share the same methods and properties, so a child's arm behaves the same way as an adult's arm, maybe with some evolutionary aspect. The arm is a module? Yes, it is. But it's part of a child. And it's also part of an adult. So this module is a pattern on the archetype, and this archetype should be common to human beings.

    Do you get my point?

    Regarding explaining OO concepts to a procedural programmer or functional programmer, it doesn't matter. OO is a simple concept, but it needs a sharp mind to be understood, like anything worthy. You will also fail to explain superstring theory to a lawyer, but that doesn't mean it's false or unnatural.

    ReplyDelete
  21. d1Bug: I think I get your point. I agree with you that singly-rooted hierarchies are a very useful abstraction (a form of induction). The point where we probably disagree is that I don't think that inheritance is intrinsically a better mapping of "the real world" than function composition or mixins, just like the English sentence structure is not a better match for the real world than, say, Japanese. There is no one kosher way of thinking about the world, although "my way" feels more natural to myself. Plato was wrong - there is no such thing as an "ideal horse".

    Sorry if I sound fluffy - what I mean is something very concrete. My current toy project is a simulation (https://github.com/nusco/narjillos/), and simulations are one of the sweet spots for inheritance-based OOP. Indeed, they are the use case that OOP was originally developed for. I picked Java to write this program. Compared to dynamic languages, I'm constantly reminded how clumsy Java abstractions are, and I constantly have to revise my mental model to accommodate things such as singly-rooted inheritance, static fields, and lock-based threading.

    When you look at other domains, things get even clumsier. I think that Java as a language is a terrible match for enterprise middleware programming, which happens to be its main use case in the market. Also look at the absurd contortions we have to go through to deal with the object-relational mismatch. That's one case where two legitimate abstractions of "the real world" happen to work fine in isolation, but just don't work well together.

    Thanks for commenting on this post. In case you care, I talked about related ideas here: https://www.youtube.com/watch?v=v9Gkq9-dnlU.

    ReplyDelete