DEXHA provides a horizontally scaling architecture that allows DEX-based applications to handle larger read-mostly workloads.
DEXHA is designed to minimize the work required to move from a single-node installation to a multi-node, HA-enabled installation. In fact, it requires no changes to the user application, because the switch is simply a matter of configuration.
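As a concrete illustration, enabling HA on a node could reduce to a handful of properties in the instance's configuration file. The property names and values below are illustrative assumptions, not the exact keys of any particular DEX release; check the configuration reference of your version.

```
# Illustrative configuration sketch (assumed property names, not a real reference)
dex.ha=true                                      # enable HA mode for this instance
dex.ha.ip=192.168.1.10:7777                      # address this instance announces to the cluster
dex.ha.coordinators=zk1:2181,zk2:2181,zk3:2181   # ZooKeeper ensemble acting as coordinator
dex.ha.sync=60s                                  # optional polling interval to force synchronization
```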
To achieve this, several DEX slave databases work as replicas of a single DEX master database, as seen in the figure below. Thus, read operations can be performed locally on each node and write operations are replicated and synchronized through the master.
Figure 1.1 shows all components in a basic DEXHA installation:
DEX master database

The master is responsible for receiving write requests from a slave and redirecting them to the other slave instances. At the same time, the master itself also plays the role of a slave.
Only a single node of the cluster can be configured to be the master. The election of the master is automatically done by the coordinator service when the system starts.
The master is in charge of the synchronization of write operations with the slaves. To do this task it manages a history log where all writes are serialized. The size of this log is limited, and it can be configured by the user.
DEX slave databases

Slaves are exact replicas of the master database; they can therefore perform read operations locally on their own, without requiring synchronization.
However, for write operations, synchronization with the master is a must in order to preserve data consistency. These writes are eventually propagated from the master to the other slaves, so the result of a write operation is not immediately visible in all slaves. By default, synchronization happens during a write operation; in addition, the user can configure optional polling to force periodic synchronization.
It is not mandatory to have a slave in the architecture, as the master can work as a standalone.
Coordinator service: Apache ZooKeeper
A ZooKeeper cluster is required to perform the coordination tasks, such as the election of the master when the system starts.
The size of the ZooKeeper cluster depends on the number of DEX instances. In any case, the size of the ZooKeeper cluster must be an odd number.
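For example, a minimal three-server ensemble (the smallest odd size that tolerates one failure) can be described with a standard zoo.cfg like the one below on each ZooKeeper server; hostnames and paths are placeholders.

```
# zoo.cfg for a three-server ZooKeeper v3.4 ensemble (placeholder hosts and paths)
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=10
syncLimit=5
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
```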
DEX v4.7 works with Apache ZooKeeper v3.4. All our tests have been performed using v3.4.3.
User application

As DEX is an embedded graph database, a user application is required for each instance. As already mentioned, moving to DEXHA mode does not require any changes to the user application.
Note that the user application can be developed for all the platforms and languages supported by DEX: the current version runs on Windows, Linux or MacOSX, using Java, .NET or C++.
Load balancer

The load balancer redirects the requests to each of the running applications (instances).
The load balancer is not part of the DEX technology; therefore, it must be provided by the user.
In order to achieve horizontal scalability, this redistribution of the application requests must be done efficiently. A round-robin approach is a good starting point, but depending on the application requirements, smarter strategies may be needed. In fact, using existing third-party solutions is advisable.
More information about load-balancing strategies and available solutions can be found in this article.
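As a minimal sketch of the round-robin starting point mentioned above (the class and names below are hypothetical, not part of DEX or of any particular load balancer):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal illustration of a round-robin dispatch strategy.
class RoundRobinBalancer {
    private final List<String> instances;            // e.g. "host:port" of each DEX application
    private final AtomicInteger counter = new AtomicInteger(0);

    RoundRobinBalancer(List<String> instances) {
        this.instances = List.copyOf(instances);
    }

    // Pick the next instance in circular order; thread-safe.
    String next() {
        int i = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(i);
    }
}
```

A real deployment would typically rely on an existing third-party balancer rather than code like this, but the selection logic is the same.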
Now that the pieces of the architecture are clear, let's see how DEXHA works in different scenarios and typical operations using these components. Below is an explanation of how the system acts in each situation.
Start-up

The first time a DEX instance goes up, it registers itself into the coordinator service. The first instance to register becomes the master. If a master already exists, the instance becomes a slave.
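The election rule can be pictured with the usual ZooKeeper recipe, in which each instance registers a sequentially numbered ephemeral node and the lowest number, i.e. the first to register, wins. The sketch below only simulates that selection in memory; the names are illustrative and it does not use the real ZooKeeper API.

```java
import java.util.Comparator;
import java.util.List;

// Simulation of a sequential-node election: the instance whose registered
// node carries the lowest sequence number (the first to register) is master.
class MasterElection {
    // Given registered node names such as "instance-0000000003",
    // return the one that becomes master.
    static String elect(List<String> registered) {
        return registered.stream()
                .min(Comparator.comparingInt(MasterElection::seq))
                .orElseThrow();
    }

    // Parse the sequence number appended to the node name.
    private static int seq(String node) {
        return Integer.parseInt(node.substring(node.lastIndexOf('-') + 1));
    }
}
```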
Read operations

As all DEX slave databases are replicas of the DEX master database, slaves can answer read operations by performing the operation locally. They do not need to synchronize with the master.
Write operations

In order to preserve data consistency, write operations require slaves to synchronize with the master: the write is sent to the master, serialized into its history log, and then redirected to the other slave instances.
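The write path just described can be sketched with an in-memory simulation: writes are serialized by the master into a history log, and each slave applies the entries it has not seen yet when it synchronizes. Everything below is a hypothetical illustration, not the DEX API, and the log is kept unbounded for brevity (the real history log has a limited, user-configurable size).

```java
import java.util.ArrayList;
import java.util.List;

// In-memory sketch of the HA write path (hypothetical names, not the DEX API).
class HaWriteSketch {
    static class Master {
        private final List<String> history = new ArrayList<>();

        // Serialize a write into the global order kept by the history log.
        synchronized void accept(String write) { history.add(write); }

        // Writes after a given position (the last one a slave applied).
        synchronized List<String> since(int applied) {
            return new ArrayList<>(history.subList(applied, history.size()));
        }
    }

    static class Slave {
        final List<String> applied = new ArrayList<>();

        // A write on this slave is first sent to the master; the slave then
        // catches up with everything serialized so far (its own write included).
        void write(Master m, String w) { m.accept(w); sync(m); }

        // Pull and apply all pending writes from the master.
        void sync(Master m) { applied.addAll(m.since(applied.size())); }
    }
}
```

Running two slaves against one master shows the behavior described above: a write performed on one slave is not visible on the other until that slave synchronizes (during its next write, or via polling).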
If two slaves perform a write operation on the same object at the same time, it may result in a lost update, just as it may in a single-instance DEX installation when two different sessions write the same object at the same time.
Slave goes down
A failure in a slave during normal operation does not affect the rest of the system. However, if a slave goes down in the middle of a write operation, the behavior of the rest of the system will depend on whether transactions are used:
Slave goes up
When a DEX instance goes up, it registers itself with the coordinator. The instance will become a slave if there is already a master in the cluster.
If polling is enabled for the slave, it will immediately synchronize with the master to receive all pending writes. On the other hand, if polling is disabled, the slave will synchronize when a write is requested (as explained previously).
This is the first version of DEXHA, so although it is fully operational, some important functionality required to assure complete high availability of the system is not yet available. Subsequent versions will focus on the following features:
Master goes down
A failure in the master leaves the system non-operational. In future versions, this scenario will be handled automatically by converting one of the slaves into the master.
Failure during a write synchronization

A failure during the synchronization of a write operation between the master and a slave leaves the system non-operational. For instance, a slave could fail while performing a write operation enclosed in a transaction, or there could be a general network error.
This scenario requires the master to be able to abort (roll back) a transaction. As DEX does not yet offer that functionality, these scenarios cannot currently be solved. DEXHA will be able to react once DEX implements the required functionality.