1 August 2011

Axon Framework - DDD and EDA meet together

CQRS (Command Query Responsibility Segregation) is a new approach towards building scalable and distributed systems that is based on simple pattern know as Command Query Separation (CQS). In short, you should design your system in a way that it either processes a command or serves response to a query. CQRS in its core is quite simple. Just split your service interface into two parts: Query Service and Command Service and you are done. But what is important is that by making this simple separation, you make your system open to many opportunities for architecture that may otherwise not exist. In this article I will not show you what is possible when fully applying CQRS (check this CQRS Starting Page for details). Instead I will present you how some design patterns can effectively be applied to standard ORM-based application when you decide to split your interface into Commands and Queries and use Axon - CQRS framework for Java. But first we must talk about DDD as it plays crucial role in building scalable and maintainable systems and fits very well to CQRS-based architecture.

DDD shortly

Generally DDD is about domain model that, when expressed in Ubiquitous Language, can be shared by developers and domain experts allowing them to communicate effectively. From technical point of view DDD is about modeling your core domain using aggregate roots (AR), entities, value objects and some other artifacts like domain services and repositories. What is important, when applying DDD to medium or large domains (generally complex domains are good candidates for DDD), you should not be expecting one model to arise representing all areas covered by the system. Each area will likely be modeled separately (inside its own Bounded Context). This aspect is very important, especially if your application starts to grow covering more and more business activities. If you stick with one model aka EDM (Enterprise Domain Model) (that typically reassembles model of relations inside your sql database), you will end up with monolithic system not capable of adjusting to business requirements.
Therefore building application DDD-way should be seen as completely different approach comparing to standard approach (one model to rule them all approach). You will need additional means to express communication between your different models (Bounded Contexts).
Thats where EDA - event based communication comes in.

Let's see how to apply DDD and EDA to standard ORM-based application with use of Axon.

Axon introduction

Axon framework provides building blocks for CQRS applications. Some blocks, i.e. (event sourcing, asynchronous processing of commands (no response from Command Handler)) are optional and can be omitted. If you are not ready for (or just don't need) full CQRS, you can still benefit from other goodies such as synchronous command handling, events, sagas and others. Additionally Axon integrates deeply with Spring making configuration a trivial task (just use built in namespaces (xml part) and annotations). Unique feature of Axon is that it allows to integrate existing JPA-based application. To make your entities become Aggregate Roots extend AbstractAggregateRoot class, to load your ARs from database use GenericJpaRepository (or even better HybridJpaRepository if you want to use EventStore). For detailed instructions see User's Guide. Lets start working with code.

Get rid of Transaction Script

When applying DDD, we tend to build rich entities that encapsulate behavior.
In contrast to standard approach with anemic domain model and procedural code inside Application Services (see: Transaction Script), most of business logic should be handled inside Aggregate Roots. When a command comes in, it is dispatched to Command Handler whose only job should be to get AR from Repository and invoke single method on it.

Lets imagine a system that manages user accounts (Account AR). One of Account's properties is state. To activate an account, AccountActivateCommand must be sent by the client. In order to handle this command, we must register and implement specialized Command Handler:

That's it. Nothing special. Clean code so far.

Implementing business logic - uniformed approach

First, we need to extend our model in order to make it more interesting :) Our system must support handling user's payments related to Payment Period. The following requirements are added:
1) when Account is activated, Payment Period must be created that will be used to keep track of payments related to the Account
2) after Payment Period expires, subsequent Payment Period must be created, but expired Payment Period should still be accessible

We could model our Account like this:

In this model, Account is responsible for managing Payment Periods. It sounds good, we can implement account activation inside Account class:

Please notice three blocks of code in this method and theirs order. This separation is dictated by CQRS-based design. You should have all these blocks in every method that is invoked by Command Handler. The blocks are:

1) validation - checking if operation is allowed and will not break consistency of the Aggregate
2) raising event - creating a Domain Event containing information about Aggregate's change
3) handling event - updating state of the Aggregate (no exceptions, no logic here!)

Every change of Aggregate's state must be signaled with an Event (Domain Event). All Domain Events will be stored in Event Store (if configured) (just serialized as Blobs or Clobs to single table).

If our Aggregate Root is event-sourced (is reconstructed from Event Store instead or being created by EntityManager), we should move event handling code to separate method:

We will not discuss Event Sourcing in this article, as we don't want to get rid of our powerful ORM, or do we? (actually in CQRS world ORM is
persona non grata - you were warned ;)

Keep your ARs lously coupled

Going back to our model, after thinking a little bit longer, we realize that it doesn't fit well our needs... One of the requirements is to process commands related directly to Payment Periods. These commands should be dispatched directly to Payment Period entity rather than go through Account AR. Payment Period should be an AR on its own. When we think more about this (and talk with our Domain Expert (if we have one;)), the separation of Account and Payment Period will become even more obvious. We could easily imagine two services Account Service (responsible for accounts management) and Payment Service (responsible for payment registration) working independently.

So we can simplify Account AR and upgrade Payment Period to AR:

Now account activation is implemented partially (Payment Period is not being created). We could implement Application Service that would first call Account#activate and than create new Payment Period, but implementing business logic within Application Service layer leads to Transaction Script that we are trying to avoid (we want to avoid both ARs (Account and Payment Period) forcibly be invoked in the same transaction - it hurts system scalability).
Lets think of our ARs on higher level. They belong to different contexts/services (virtual Account Service and Payment Service). The way to communicate between different contexts (services) is to use Domain Events!

Events to the rescue

We already have implemented raising of the AccountActivatedEvent in activate() method of Account AR. Now we must create Event Listener that will listen for this event and will send PaymentPeriodCreate command.
With Axon it is as simple as creating new class:

Of course, we need to implement a command handler that will handle PaymentPeriodCreateCommand by creating Payment Period AR and adding it to Repository.
Thats all. Now we have independent ARs that communicate with events.
This design leads us in direction of autonomous components (services) communicating asynchronously in publish-subscribe model (aka: push integration model), possibly via some kind of EventBus or Broker (see: Avoid a Failed SOA: Business & Autonomous Components to the Rescue by Udi Dahan)
But we will not go so far.

There are many other benefits from applying EDA. One of them is ability to keep detailed history of Events:

History of Events (Audit)

By storing Events in database (Event Store) we keep log of changes. We can easily create additional table containing following data:
- aggregate root class
- aggregate root id
- event class
- user

The table like this can serve basic reporting purposes. If we want to create more sophisticated reports in the future, we can reply events stored in Event Store and populate any report table we need. All history of changes is kept in Event Store.

Now lets see how we can model long running process with SAGA! In case you forgot the requirements, short reminder: new Payment Period must be created after the current one expires.

Enter SAGA

Saga is a stateful component (its state is persisted across invocations) that is capable of receiving events (including timeout events) (similar to Event Listener). Saga represents business process instance, in other words business process associated with particular AR(s).

First method will be invoked on PaymentPeriodCreatedEvent and will result in creation of new Saga associated with Payment Period being created and related Account. Inside method body we schedule the PaymentPeriodExpiredEvent that will be triggered when validity interval of Payment Period ends (Axon provides Quartz-based implementation of Event Scheduler) .
The second method is called when payment expiration happens (when PaymentPeriodExpiredEvent is triggered by the Event Scheduler).
The only thing this method does is sending a command that will be processed by our Account Service (currentPaymentPeriod must be increased) and eventually by Payment Service (new Payment Period will be created).

Lets see how easy we can test our Saga with use of Test Fixture provided by Axon:

Finally, I want to discuss one more topic related to DDD.

Don't pollute your core domain model

What is common mistake DDD beginners make is that they try to apply DDD totally (put all application logic into ARs boundaries). Let's take an example and add new requirement to our application:
All entities (including Account) are separated by Sales Areas. Any operation on Account (creation, activation, etc.) can be performed only if the owning Sales Area is in status ACTIVE.

First, we will modify our model by adding the following JPA mapping to the Account AR:

Now lets think of the requirement. Where should we put the checking if the Sales Area is active? The activate() method of Account AR seems to be the perfect place. But if we think more, we realize that Sales Area does not belong to our core domain! Checking status of the Sales Area inside Account AR will pollute the code (checking must be done before any modification of Account's state). So our new model is broken! There should be no Account -> Sales Area mapping. But we can not remove it, because we reuse the same model for serving queries (we don't follow CQRS in this aspect) and we need to be able to filter Accounts by Sales Area easily.
Ok, so the better place to put the checking would be a Command Handler (AccountCommandHandler). But it may be necessary to reuse this logic across different commands. What we need is some kind of interceptor that will prevent particular commands (account related or other) reaching Command Handlers. Not surprisingly Axon provides CommandHandlerInterceptor interface that allows for customized command handler invocation chains. No example this time, as it is quite easy to imagine:)

22 February 2011

JPA - Stronicowanie wyników kwerendy

Interfejs kwerendy zdefiniowany w JPA (javax.persistence.Query) umożliwia stronicowanie listy wyników (paging). Służą do tego metody:

Wynikiem kwerendy ze stronicowaniem jest podzbiór obiektów czyli strona, której numer i rozmiar określają odpowiednio parametry startPosition i maxResult.

Stronicowanie przyspiesza działanie aplikacji (mniejsza ilość danych do przetwarzania) oraz ułatwia użytkownikowi nawigację i wyszukiwanie określonych rekordów.

Jednak jak to zwykle w świecie ORM bywa, każde rozwiązanie ma swoje "problemy", które w przypadku stronicowania objawiają się wraz z użyciem mechanizmu ładowania wyprzedzającego elementów kolekcji.

Ładowanie wyprzedzające ze stronicowaniem

Ładowanie wyprzedzające generalnie polega na wykonaniu kwerendy w taki sposób, aby razem z głównymi obiektami pobrane zostały obiekty powiązane (pojedyncze obiekty bądź kolekcje).

Najbardziej popularnym sposobem ładowania wyprzedzającego jest złączenie tabel w kwerendzie (join fetching). Niestety, metoda ta nie nadaje się do zapytań ze stronicowaniem. Zobaczmy dlaczego.


Mamy obiekt zamówienia (Order) zawierający pozycje (LineItem):

Załóżmy, że chcemy wyświetlić użytkownikowi stronę z listą zamówień. Jednocześnie dla każdego zamówienia chcemy załadować jego pozycje. W tym celu tworzymy kwerendę na obiekcie Order, ze stronicowaniem oraz ze złączeniem kolekcji lineItems:

Złączenie definiujemy w JPQL za pomocą klauzuli JOIN FETCH.

Oczekiwania wobec dostawcy JPA

Załóżmy, że w bazie mamy 3 zamówienia z różną ilością pozycji.
Oczekujemy, że zarządca utrwalania (EntityManager) wykona pojedyncze zapytanie sql (z klauzulą left outer join) i zwróci listę zawierającą 3 obiekty Order. Dodatkowo oczekujemy, że w każdym obiekcie Order, kolekcja lineItems będzie załadowana.


- Hibernate

Hibernate co prawda zwraca oczekiwany wynik, ale generuje podejrzanie brzmiący komunikat:
firstResult/maxResults specified with collection fetch; applying in memory!

- EclipseLink

EclipseLink zwraca 2 obiekty Order.


Aby zrozumieć co się stało, zobaczmy jak dokładnie wygląda wynik naszej kwerendy na poziomie rekordów bazy danych bez uwzględnienia stronicowania:

| order_id | cust_id | cust_name | line_id | product_sku |
| 1 | 1 | Jan Kowalski | 1 | 232342342 |
| 1 | 1 | Jan Kowalski | 2 | 345345443 |
| 2 | 2 | Paweł Kaczor | 3 | 655624323 |
| 3 | 3 | Jerzy Dudek | 4 | 673454345 |
| 3 | 3 | Jerzy Dudek | 5 | 563425676 |
| 3 | 3 | Jerzy Dudek | 6 | 234576854 |

Widzimy, że w wyniku złączenia, otrzymujemy zbiór rekordów liczebnością przekraczający ilość zamówień. Jeśli z takiego zbioru będziemy chcieli wyciągnąć stronę o rozmiarze 3, otrzymamy tylko pierwsze trzy rekordy zawierające zamówienia z id 1 i 2. Zamówienie z id 3 zostanie wykluczone. Otrzymamy zatem dwa zamówienia zamiast trzech! Wniosek: stronicowanie na poziomie rekordów w kwerendzie ze złączeniem outer join jest niedokładne.

Teraz już wiemy dlaczego EclipseLink zwrócił dwa zamówienia. Ale jakim sposobem Hibernate zwrócił trzy zamówienia? Odpowiedź jest prosta (ale bolesna). Hibernate omija problem poprzez załadowanie wszystkich rekordów z tabeli i wyselekcjonowanie strony w pamięci (stąd magiczne: "applying in memory"!). Rozwiązanie to zwiększa zużycie zasobów (procesora i pamięci), co przeczy głównemu celowi (zwiększeniu wydajności), dla którego stosujemy stronicowanie. Należy poszukać lepszego rozwiązania.

Ładowanie wsadowe ze stronicowaniem

Ładowanie wsadowe (batch fetching) to bardziej zaawansowany sposób pobierania wyprzedzającego. Obiekty powiązane nie są ładowane w kwerendzie głównej, ale w dodatkowej kwerendzie, której ostateczna postać zależy od wybranego (o ile dostawca na to pozwala) typu ładowania wsadowego.

Stronicowanie wykonywane jest bezproblemowo w kwerendzie głównej, która nie zawiera klauzuli outer join.

Konfiguracja - Hibernate

Dla kolekcji lineItems specyfikujemy ładowanie wsadowe używając adnotacji @org.hibernate.annotations.BatchSize:

Parametr size w adnotacji @BatchSize oznacza ilość elementów kolekcji, jaka zostanie załadowana w pojedynczej kwerendzie sql.

Konfiguracja - EclipseLink

Ten sam sposób ładowania wsadowego konfigurujemy w EclipseLink następująco:


Tworzymy kwerendę JPA tym razem bez klauzuli JOIN FETCH. W celu lepszego zobrazowania działania pobierania wsadowego, dodajemy warunek na pole orderDate.

Wygenerowane zostają dwa zapytania sql:

- kwerenda główna zamówień (ze stronicowaniem)

- kwerenda dodatkowa - załadowanie wsadowe pozycji zamówień

Jak widzimy, w celu załadowania pozycji tylko dla zamówień pobranych w kwerendzie głównej, w kwerendzie dodatkowej została użyta klauzula IN.

Optymalizacja - EclipseLink

EclipseLink pozwala skonfigurować trzy typy pobierania wsadowego: (IN, JOIN, EXISTS)
Typ IN już znamy. Mankamentem jest tutaj ograniczona ilość elementów kolekcji, które mogą być załadowane w jednej kwerendzie sql. Efektywniejszym rozwiązaniem jest użycie kryteriów selekcji z kwerendy głównej w kwerendzie dodatkowej (typ JOIN - "The original query's selection criteria is joined with the batch query").


Wygenerowana kwerenda dodatkowa wygląda następująco:

Widzimy, że problematyczna klauzula IN zastąpiona została kryterium wyboru identycznym jak w kwerendzie głównej.


W powyższym artykule przedstawiłem w jaki sposób stosować stronicowanie (paging) w kwerendach JPA. Szczegółowo omówiłem problem stronicowania, kiedy w kwerendzie używane jest ładowanie wyprzedzające elementów kolekcji.