
Saturday 17 November 2012

The Hibernate Inheritance Mechanism - Flashback

OK, this is a little late. I needed to use inheritance and was going through the earlier posts, as I barely remembered how I had done inheritance in the first place. As I looked at the four techniques and the mutated enhancement provided by Hibernate, I saw certain things that I had failed to note down the last time.
So here, in flashback, are some points to keep in mind when doing Hibernate inheritance.
The first technique, as we all saw, is not really a Hibernate inheritance technique. It is actually a 'non-inheritance' technique, and by far the simplest approach. It simply says "Forget inheritance". If you want any generic data, identify the related tables on your own, fire the queries and then aggregate the data.
In this case any inheritance information you have is limited to the Java world. Go ahead, make classes abstract, have other classes extend them, all fine, but Hibernate and the database are not aware of these hierarchies. The hbm files we use here look like any other non-inheritance-aware hbm files. Hence the name "Table per Concrete Class + Implicit Polymorphism".
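To make the "aggregate it yourself" step concrete, here is a minimal plain-Java sketch. It does not use Hibernate at all: the two loader methods are hypothetical stand-ins for one query per concrete table, and the class and field names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of technique 1: the hierarchy exists only in Java, so any
// "all sports persons" view has to be assembled by hand.
class ImplicitPolymorphismSketch {

    // Hypothetical domain classes; only the Java side knows they are related.
    static abstract class SportsPerson {
        final long id;
        final String name;

        SportsPerson(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static class Cricketer extends SportsPerson {
        Cricketer(long id, String name) { super(id, name); }
    }

    static class Footballer extends SportsPerson {
        Footballer(long id, String name) { super(id, name); }
    }

    // Stand-in for the result of a query against the cricketer table.
    static List<Cricketer> loadCricketers() {
        return Arrays.asList(new Cricketer(1, "Cricketer A"));
    }

    // Stand-in for the result of a query against the footballer table.
    static List<Footballer> loadFootballers() {
        return Arrays.asList(new Footballer(1, "Footballer B"));
    }

    // The manual aggregation step: fire every query, then merge the results.
    static List<SportsPerson> loadAllSportsPersons() {
        List<SportsPerson> all = new ArrayList<SportsPerson>();
        all.addAll(loadCricketers());
        all.addAll(loadFootballers());
        return all;
    }

    public static void main(String[] args) {
        for (SportsPerson p : loadAllSportsPersons()) {
            System.out.println(p.getClass().getSimpleName() + " " + p.id + " " + p.name);
        }
    }
}
```

Note that both sample rows carry id 1. With this technique that is harmless, because the two tables are never queried as one.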

The second technique is actually the first Hibernate inheritance technique. In this case the class hierarchy is something that Hibernate is aware of. The hbm files start differing from here. Hibernate is made aware of the base class in the hierarchy:
<class name="SportsPerson" abstract="true">
The sub-classes are also announced to hibernate:
<union-subclass name="Cricketer" table="CRICKETER">
</union-subclass>
<union-subclass name="Footballer" table="FOOTBALLER">
</union-subclass>
At the database level, however, the base class has no representation. In fact, the sub-class tables have no relation to each other at the database level either. The tables created for the sub-classes are exactly like the tables created in technique 1. Or, for that matter, in any simple example. Or are they?
There is a teeny-weeny difference: the type of the identifier. We cannot use the native identifier generator for the CRICKETER or FOOTBALLER tables. We need an identifier generator that guarantees that an id value which appears in the CRICKETER table will not appear in any of the other subclass tables. Why so?
Well, Hibernate is aware of SportsPerson. It is perfectly capable of associating the records of all the subclasses with the base class. If we were to execute the query below, what answer would you expect?
Query query = session.createQuery("from SportsPerson where id = 1");
How would Hibernate know which record to fetch if all the subclass tables had a record with id 1? For this reason it is imperative that the generator strategy used ensures a particular id value of a subclass table is unique across all subclass tables. This is why native is out. The reason I used hilo is that it is the easiest option here. You can go ahead and use other generators, as long as the above rule is satisfied. The query generated for the above HQL expression is:
select
        sportspers0_.ID as ID0_,
        sportspers0_.NAME as NAME0_,
        sportspers0_.RUNS as RUNS1_,
        sportspers0_.WICKETS as WICKETS1_,
        sportspers0_.T20_PLAYER as T3_1_,
        sportspers0_.GOALS as GOALS2_,
        sportspers0_.APPEARANCES as APPEARAN2_2_,
        sportspers0_.SEND_OFFS as SEND3_2_,
        sportspers0_.clazz_ as clazz_ 
    from
        ( select
            T20_PLAYER,
            ID,
            null as GOALS,
            NAME,
            null as APPEARANCES,
            RUNS,
            null as SEND_OFFS,
            WICKETS,
            1 as clazz_ 
        from
            CRICKETER 
        union
        select
            null as T20_PLAYER,
            ID,
            GOALS,
            NAME,
            APPEARANCES,
            null as RUNS,
            SEND_OFFS,
            null as WICKETS,
            2 as clazz_ 
        from
            FOOTBALLER 
    ) sportspers0_ 
where
    sportspers0_.ID=1
An interesting thing to note is the inner select query. Whenever we execute polymorphic queries, Hibernate has to select from all the subclass tables. It combines the results of these tables using union and finally applies the where clause criteria to the combined result. (This could be the explanation behind the naming of the subclass elements as <union-subclass>. The technique name is similarly suggestive: "Table Per Concrete Class + Unions".)
The above process is exactly what you would have to perform manually if you wanted this result using technique 1. Hibernate saves you that trouble with its awareness of the inheritance hierarchy. The database, however, is still not aware of the inheritance hierarchy.
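Going back to the identifier rule for this technique: the requirement is only that ids come from one shared source, so two subclass tables can never hand out the same value. The sketch below is a simplified hi/lo-style allocator in plain Java. It is an illustration of the idea, not Hibernate's own hilo implementation, and the names HiSource and HiLoAllocator are hypothetical.

```java
// Simplified hi/lo-style id allocation (an assumption-laden sketch,
// not the code Hibernate runs for its hilo generator).
class SharedIdAllocatorSketch {

    // Stand-in for the single shared "next hi" value kept in the database.
    static class HiSource {
        private long nextHi = 0;

        synchronized long reserveHi() {
            return nextHi++;
        }
    }

    // Hands out ids from a reserved block; asks the shared source for a new
    // block only when the current one is used up.
    static class HiLoAllocator {
        private final HiSource source;
        private final int blockSize;
        private long hi;
        private int lo;

        HiLoAllocator(HiSource source, int blockSize) {
            this.source = source;
            this.blockSize = blockSize;
            this.lo = blockSize; // forces a block reservation on first use
        }

        synchronized long nextId() {
            if (lo >= blockSize) {
                hi = source.reserveHi();
                lo = 0;
            }
            return hi * blockSize + lo++;
        }
    }

    public static void main(String[] args) {
        HiSource shared = new HiSource();
        // Two allocators, one per subclass table, both fed by the same source.
        HiLoAllocator cricketerIds = new HiLoAllocator(shared, 3);
        HiLoAllocator footballerIds = new HiLoAllocator(shared, 3);
        for (int i = 0; i < 4; i++) {
            System.out.println("CRICKETER id " + cricketerIds.nextId()
                    + ", FOOTBALLER id " + footballerIds.nextId());
        }
    }
}
```

Because every block is reserved from the same source, the blocks never overlap, which is the property the polymorphic "where id = 1" query depends on.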

Next up for analysis is Technique 3. From here onwards, details of the inheritance hierarchy reach beyond Java and Hibernate and touch the database. Or, more simply, "If your inheritance hierarchy is affected, the database tables are affected."
Technique 3 tries to get rid of all the different tables for subclasses and comes up with a "one table fits all" solution. Or, more specifically, "Table per Class Hierarchy". Thus you have everything in one table. Polymorphic queries will access just the single table. Your identifier generator can be native again. All is well, or is it......
Errr, with everything inside one table, how does Hibernate handle a subclass query? For example:
Query query = session.createQuery("from Footballer f");
In a single table, how does Hibernate know which record is a Footballer and which is a Cricketer? This is where discriminators come in.
<class name="SportsPerson" table="SPORTS_PERSON">
     <discriminator column="SPORT_TYPE" type="string" />
</class>
An interesting thing to note in the above hbm fragment is the absence of the abstract attribute. It is not needed in this technique, or in any technique other than technique 2.
The discriminator element includes a column name and a type. This actually maps to a column in the table.
create table SPORTS_PERSON (
        ID bigint not null auto_increment,
        SPORT_TYPE varchar(255) not null,
This column is used by Hibernate to distinguish between the different subclass records.
<subclass name="Cricketer" discriminator-value="cricket">
</subclass>
<subclass name="Footballer" discriminator-value="footballer">
</subclass>
As can be seen, for every record of Cricketer the value in the discriminator column (SPORT_TYPE) is "cricket" and for Footballer it is "footballer". While the number of queries may reduce with this technique, its disadvantages are documented in the earlier post.
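The role of the discriminator can be sketched in plain Java. The list below is a hypothetical in-memory stand-in for the single SPORTS_PERSON table, and the filter plays the part that a condition on SPORT_TYPE would play in the SQL for "from Footballer". The Row class and the sample data are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of technique 3: one table, with a discriminator value per row.
class DiscriminatorSketch {

    // One row of the hypothetical single SPORTS_PERSON table.
    static class Row {
        final long id;
        final String sportType; // the SPORT_TYPE discriminator column
        final String name;

        Row(long id, String sportType, String name) {
            this.id = id;
            this.sportType = sportType;
            this.name = name;
        }
    }

    // Sample data: cricketers and footballers share the same table.
    static final List<Row> SPORTS_PERSON = Arrays.asList(
            new Row(1, "cricket", "Player A"),
            new Row(2, "footballer", "Player B"),
            new Row(3, "cricket", "Player C"));

    // A subclass query boils down to filtering on the discriminator value.
    static List<Row> rowsFor(String discriminatorValue) {
        List<Row> result = new ArrayList<Row>();
        for (Row row : SPORTS_PERSON) {
            if (row.sportType.equals(discriminatorValue)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Row row : rowsFor("footballer")) {
            System.out.println(row.id + " " + row.name);
        }
    }
}
```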

The last technique not only affects the database but even results in a highly normalized schema. The schema generated by this technique is something that would make the database purists sentimental (with joy, that is). But it comes at a cost.
Here every class in the hierarchy gets its own table: the base class and the subclasses. So all common properties (such as the identifier) will be in the base class table. Properties specific to the subclasses land in their own tables. As the base class table manages the identifiers, unique ids for every sports person are assured. Everything looks so well encapsulated.
Quick question: if the identifier is in the base class table, how will the subclass table be aware of its id? This is where the key element comes in.
<class name="SportsPerson" table="SPORTS_PERSON">
        <joined-subclass name="Cricketer" table="CRICKETER">
            <key column = "CRICKETER_ID" />
        </joined-subclass>

        <joined-subclass name="Footballer" table="FOOTBALLER">
            <key column = "FOOTBALLER_ID" />
        </joined-subclass>
</class>
As can be seen, each of the subclass elements includes a <key> element. The column specified here is the primary key for the subclass table and it gets its values from the id column of the base class table. Hibernate also applies foreign keys to it.
create table SPORTS_PERSON (
        ID bigint not null auto_increment,
        primary key (ID),
...    
create table CRICKETER (
        CRICKETER_ID bigint not null,
        primary key (CRICKETER_ID),
    
alter table CRICKETER 
        add index FKA046883E7573CC6 (CRICKETER_ID), 
        add constraint FKA046883E7573CC6 
        foreign key (CRICKETER_ID) 
        references SPORTS_PERSON (ID)
In short the primary key of the sub class is a foreign key coming from the base class table.
Now the disadvantage. While everything is nicely placed in normalized tables, most queries on the classes here will result in joins. A generic query like the one below could in fact be a real performance headache:
Query query = session.createQuery("from SportsPerson");
The resultant SQL is :
select
        sportspers0_.ID as ID0_,
        sportspers0_.NAME as NAME0_,
        sportspers0_1_.RUNS as RUNS1_,
        sportspers0_1_.WICKETS as WICKETS1_,
        sportspers0_1_.T20_PLAYER as T4_1_,
        sportspers0_2_.GOALS as GOALS2_,
        sportspers0_2_.APPEARANCES as APPEARAN3_2_,
        sportspers0_2_.SEND_OFFS as SEND4_2_,
        case 
            when sportspers0_1_.CRICKETER_ID is not null then 1 
            when sportspers0_2_.FOOTBALLER_ID is not null then 2 
            when sportspers0_.ID is not null then 0 
        end as clazz_ 
    from
        SPORTS_PERSON sportspers0_ 
    left outer join
        CRICKETER sportspers0_1_ 
            on sportspers0_.ID=sportspers0_1_.CRICKETER_ID 
    left outer join
        FOOTBALLER sportspers0_2_ 
            on sportspers0_.ID=sportspers0_2_.FOOTBALLER_ID
As the number of subclasses grows, so do the joins (maybe that's why the name <joined-subclass>).
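The case expression in the SQL above can be mimicked in plain Java. In this sketch the three maps are hypothetical in-memory stand-ins for the SPORTS_PERSON, CRICKETER and FOOTBALLER tables, keyed by the shared primary key. The sample data and the -1 return value for a missing base row are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of technique 4: the concrete type of a row is decided by which
// subclass table holds a row with the same primary key.
class JoinedSubclassSketch {

    // Base table: ID -> NAME
    static final Map<Long, String> SPORTS_PERSON = new HashMap<Long, String>();
    // Subclass tables, keyed by the primary key borrowed from the base table.
    static final Map<Long, Integer> CRICKETER = new HashMap<Long, Integer>();  // CRICKETER_ID -> RUNS
    static final Map<Long, Integer> FOOTBALLER = new HashMap<Long, Integer>(); // FOOTBALLER_ID -> GOALS

    static {
        SPORTS_PERSON.put(1L, "Player A");
        CRICKETER.put(1L, 100);
        SPORTS_PERSON.put(2L, "Player B");
        FOOTBALLER.put(2L, 10);
        SPORTS_PERSON.put(3L, "Player C"); // base row with no subclass row
    }

    // Mirrors the CASE expression: 1 = Cricketer, 2 = Footballer,
    // 0 = plain SportsPerson. Returns -1 when there is no base row at all,
    // a case in which the SQL would simply return no row.
    static int clazzFor(long id) {
        if (!SPORTS_PERSON.containsKey(id)) {
            return -1;
        }
        if (CRICKETER.containsKey(id)) {
            return 1;
        }
        if (FOOTBALLER.containsKey(id)) {
            return 2;
        }
        return 0;
    }

    public static void main(String[] args) {
        for (Long id : new long[] {1L, 2L, 3L}) {
            System.out.println("id " + id + " -> clazz_ " + clazzFor(id));
        }
    }
}
```

Every extra subclass adds one more lookup here, just as it adds one more left outer join to the generated query.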

This finishes the four inheritance techniques provided by Hibernate. Now to look at the mutation.
This one is a combination of techniques 3 and 4. While it tries to take advantage of the normalization of technique 4, it also tries to bring in some of the query efficiency of technique 3.
<class name="SportsPerson" table="SPORTS_PERSON">
    <discriminator column="SPORT_TYPE" type="string" />
    <subclass name="Cricketer" discriminator-value="cricket">
    </subclass>
    <subclass name="Footballer" discriminator-value="footballer">
         <join table="FOOTBALLER">
              <key column="FOOTBALLER_ID" />
          </join>
    </subclass>
</class>
The main idea here is to take advantage of the base class table. The mutation smartly (?) combines one of the subclass tables with the base class table, thus removing one table from the last design. So if the application has a hierarchy where a particular subclass holds most of the records, that subclass could be combined with the base class table, gaining some performance benefits. The other subclasses are represented with the <join> element, which marks them as separate tables with a foreign key from the base class table. The earlier select query will now result in one less join.
Query query = session.createQuery("from SportsPerson");
SQL:
select
        sportspers0_.ID as ID0_,
        sportspers0_.NAME as NAME0_,
        sportspers0_.RUNS as RUNS0_,
        sportspers0_.WICKETS as WICKETS0_,
        sportspers0_.T20_PLAYER as T6_0_,
        sportspers0_1_.GOALS as GOALS1_,
        sportspers0_1_.APPEARANCES as APPEARAN3_1_,
        sportspers0_1_.SEND_OFFS as SEND4_1_,
        sportspers0_.SPORT_TYPE as SPORT2_0_ 
    from
        SPORTS_PERSON sportspers0_ 
    left outer join
        FOOTBALLER sportspers0_1_ 
            on sportspers0_.ID=sportspers0_1_.FOOTBALLER_ID
While we have managed to reduce the join, we have also reintroduced the cons of technique 3.

That concludes my flashback. Whew.
In hindsight I feel inheritance in Hibernate is a functionality that should be used very carefully. It brings its own set of pros and some very dangerous cons to the table. If your use case really does not demand it, don't use it. (I said use-case, not your ego :P )
Stick to simple models or Technique one and life will be easy (or as easy as possible). Leave Hibernate (and therefore the database) inheritance unaware. But if you must absolutely have it, then choose wisely my friend.

2 comments:

  1. Dude the day you wrote this post was the last day I was with my gf..hehehe..
    Sorry...the posting date reminded me of that day..your tutorials are amazing..These are very easy to learn and very focused...at the end of each tutorial i know that i have learnt something..thanks for these nice and simple tutorials..
    PS: 17 Nov is my B'day..

  2. Lol, nice flashback mate !
    Will try and post one article on Nov 17, Cheers
