With this post we would like to show you some performance results that will explain why DEX is claimed to be a high-performance graph database with great scalability.
To go further into DEX scalability, let us look at the results of the R-MAT benchmark, which tests DEX performance on very large databases. In this benchmark we are interested in tracking the following indicators:
- How many nodes and edges could be created?
- What was the size of the database created?
- How long did the load of the database take?
- How many traversals could we make per unit of time?
To test scalability, we automatically generated graphs with 2^SF nodes (for a scale factor, SF, ranging from 25 to 28); the number of edges (shown in column 3 of the table) was determined by the R-MAT synthetic graph generator. The remaining columns show the size, load time, and execution measurements for a heavy traversal query.
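For readers unfamiliar with R-MAT, here is a minimal sketch of how such a generator picks edges: it recursively chooses one of the four quadrants of the adjacency matrix, one bit of the node identifiers per level. This is not the generator used for the benchmark (that code is in the DEX download section); the function names are hypothetical, and the quadrant probabilities 0.57/0.19/0.19/0.05 are commonly used defaults assumed here, not values taken from this post.

```python
import random


def rmat_edge(scale, a=0.57, b=0.19, c=0.19, rng=random):
    """Pick one directed edge in a graph with 2**scale nodes.

    a, b, c are the probabilities of the top-left, top-right and
    bottom-left quadrants (assumed defaults); the bottom-right
    quadrant gets the remaining probability 1 - a - b - c.
    """
    src = dst = 0
    for _ in range(scale):
        r = rng.random()
        src <<= 1
        dst <<= 1
        if r < a:
            pass              # top-left: both bits stay 0
        elif r < a + b:
            dst |= 1          # top-right: destination bit set
        elif r < a + b + c:
            src |= 1          # bottom-left: source bit set
        else:
            src |= 1          # bottom-right: both bits set
            dst |= 1
    return src, dst


def rmat_edges(scale, num_edges, seed=0):
    """Generate num_edges R-MAT edges with a reproducible seed."""
    rng = random.Random(seed)
    return [rmat_edge(scale, rng=rng) for _ in range(num_edges)]


# Tiny example: scale factor 4 gives node ids in the range 0..15.
sample = rmat_edges(4, 100, seed=1)
```

Because the quadrants are skewed, a few nodes receive many edges and some receive none, which is why the number of nodes actually present can be lower than 2^SF.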
Load time and size of a graph
For this benchmark we loaded up to 2.1 billion edges and 230 million nodes. For this very large graph, the answers to the questions above are quite impressive: the graph was loaded in only 15 hours, including the creation of all the indices for direct access to the nodes and edges. Edges were loaded at a rate of about 40K per second. This large database occupies 83 GB, which gives an average of only 36 bytes per node or edge.
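The derived figures can be checked from the raw numbers with a few lines of arithmetic. This is only a sanity-check sketch; it assumes decimal gigabytes (1 GB = 10^9 bytes), which the post does not state, and the variable names are illustrative.

```python
edges = 2.1e9              # edges loaded
nodes = 230e6              # nodes loaded
size_bytes = 83e9          # 83 GB, assuming 1 GB = 10**9 bytes
load_seconds = 15 * 3600   # 15 hours

# Average storage per object (node or edge).
bytes_per_object = size_bytes / (edges + nodes)

# Average edge load rate over the whole load.
edges_per_second = edges / load_seconds

print(f"{bytes_per_object:.1f} bytes per object")
print(f"{edges_per_second:,.0f} edges per second")
```

Both values land close to the rounded figures quoted above: just under 36 bytes per object and just under 40K edges per second.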
In addition, we ran a query against the database to test its response time (Q1 in the table). Query 1 finds the node with the maximum out-degree and then performs a BFS (breadth-first search) traversal starting from that node. We can see that 4.2 billion traversals are made, with an average of 295K nodes traversed per second. More information about the BFS algorithm can be found in the post about DEX graph algorithms.
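The logic of Query 1 can be sketched on a plain in-memory adjacency list. This does not use the DEX API; the helper names and the toy graph are hypothetical and only illustrate the two steps: pick the node with the highest out-degree, then run a BFS from it while counting traversals.

```python
from collections import deque


def max_out_degree_node(adj):
    """Return the node with the most outgoing edges."""
    return max(adj, key=lambda node: len(adj[node]))


def bfs_count(adj, start):
    """BFS from start; return (nodes visited, edges traversed)."""
    visited = {start}
    queue = deque([start])
    edges_traversed = 0
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            edges_traversed += 1          # every outgoing edge is followed once
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return len(visited), edges_traversed


# Toy directed graph: node -> list of out-neighbors.
adj = {0: [1, 2, 3], 1: [2], 2: [0], 3: [], 4: [0]}

start = max_out_degree_node(adj)
visited_count, traversals = bfs_count(adj, start)
print(start, visited_count, traversals)
```

In this toy graph node 4 is never reached, since BFS only follows outgoing edges from the start node.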
We obtained the remarkable results shown in the third column from the right, ranging from a minimum of 24 minutes for the SF=25 graph to 4 hours for the SF=28 graph, with a degradation of less than 5% between scale factors.
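As a rough consistency check, the 4-hour figure for the largest graph follows from the traversal count and rate quoted for Query 1. The sketch below assumes the 295K per second rate applies to the 4.2 billion traversals, which is how we read the figures.

```python
traversals = 4.2e9        # traversals made by Query 1 on the largest graph
rate_per_second = 295e3   # average traversal rate (assumed to be in the same unit)

query_hours = traversals / rate_per_second / 3600
print(f"{query_hours:.2f} hours")
```

The result is slightly under four hours, in line with the time reported for SF=28.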
Considering its fast loading and querying, DEX is the go-to graph database when dealing with large datasets, because it performs well on graphs with billions of objects. Furthermore, it not only gives quick answers but also keeps storage compact, at only 36 bytes per object in the database.
Stay tuned for more benchmarks to come, including an interesting analytical use case with Wikipedia.
Note: The experiments were performed on a computer with two quad-core Intel(R) Xeon(R) E5440 processors at 2.83 GHz. The memory hierarchy is organized as follows: a 6144 KB second-level cache, 64 GB of main memory, and a 1.7 TB disk. The operating system is Debian Linux 4.0 (etch).
We have uploaded the code of the R-MAT synthetic graph generator we implemented for this benchmark to the download section of DEX. With this code, the same benchmark can be repeated for other graph databases. We also include the DEX code for Query 1 as tested in the benchmark; to run this query on another graph database, it must be adapted first. Please feel free to download it and share your results with us!