Maintenance and monitoring

This chapter describes the functionality that Sparksee offers the database administrator for maintaining and monitoring Sparksee databases.

We would like to place particular emphasis on the Recovery functionality that will help the administrator to always keep an automatic copy of the database stored and safe.

Backup

Sparksee provides functionality for performing a cold backup and restoring a database which has been previously backed up.

During a cold backup, the database is closed or locked and not available to users. The data files do not change during the backup process so the database is in a consistent state when it is returned to normal operation.

The method Graph#backup performs a full backup by writing all the content of the database into a given file path and Sparksee#restore creates a new Database instance from a backup file.

The following code blocks provide an example of this functionality:

[Java]
// perform backup

Graph graph = sess.getGraph();
...
graph.backup("database.gdb.back");
...
sess.close();

// restore backup

Sparksee sparksee = new Sparksee(new SparkseeConfig());
Database db = sparksee.restore("database.gdb", "database.gdb.back");
Session sess = db.newSession();
Graph graph = sess.getGraph();
...
sess.close();
db.close();
sparksee.close();
[C#]
// perform backup

Graph graph = sess.GetGraph();
...
graph.Backup("database.gdb.back");
...
sess.Close();

// restore backup

Sparksee sparksee = new Sparksee(new SparkseeConfig());
Database db = sparksee.Restore("database.gdb", "database.gdb.back");
Session sess = db.NewSession();
Graph graph = sess.GetGraph();
...
sess.Close();
db.Close();
sparksee.Close();
[C++]
// perform backup

Graph * graph = sess->GetGraph();
...
graph->Backup(L"database.gdb.back");
...
delete sess;

// restore backup

SparkseeConfig cfg;
Sparksee * sparksee = new Sparksee(cfg);
Database * db = sparksee->Restore(L"database.gdb", L"database.gdb.back");
Session * sess = db->NewSession();
Graph * graph = sess->GetGraph();
...
// release the session before the database that created it
delete sess;
delete db;
delete sparksee;
[Python]
# perform backup

graph = sess.get_graph()
...
graph.backup("database.gdb.back")
...
sess.close()

# restore backup

sparks = sparksee.Sparksee(sparksee.SparkseeConfig())
db = sparks.restore("database.gdb", "database.gdb.back")
sess = db.new_session()
graph = sess.get_graph()
...
# close the session before the database that created it
sess.close()
db.close()
sparks.close()
[Objective-C]
// perform backup
STSGraph * graph = [sess getGraph];
...
[graph backup: @"database.gdb.back"];
...
[sess close];
[db close];
[sparksee close];
//[sparksee release];

// restore backup
STSSparkseeConfig * cfg = [[STSSparkseeConfig alloc] init];
STSSparksee * sparksee = [[STSSparksee alloc] initWithConfig: cfg];
//[cfg release];
STSDatabase * db = [sparksee restore: @"database.gdb" backupFile: @"database.gdb.back"];
STSSession * sess = [db createSession];
STSGraph * graph = [sess getGraph];
...
[sess close];
[db close];
[sparksee close];
//[sparksee release];

Note that the OIDs (object identifiers) of both node and edge objects are the same after the database is restored. Type and attribute identifiers, however, may differ.

Bear in mind that, although a backup does not update the database, it behaves as a write operation. Because Sparksee's concurrency model accepts only one writer transaction at a time (see the 'Processing' section of the 'Graph database' chapter for details), a backup blocks every other transaction while it runs.
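Because a backup writes the whole database to the given file path, an administrator who wants to keep more than one copy needs a distinct path for each run. The following is a minimal sketch of one way to build such a path. The class BackupFileNames and its method are hypothetical helpers and not part of the Sparksee API; only the graph.backup call mentioned in the comment comes from the examples above.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper (not part of the Sparksee API) that names backup files. */
final class BackupFileNames {
    // Sortable timestamp, for example 20240131-120000.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    private BackupFileNames() {}

    /** Builds "<databaseFile>.<timestamp>.back" for the given moment. */
    static String timestamped(String databaseFile, LocalDateTime when) {
        return databaseFile + "." + STAMP.format(when) + ".back";
    }

    public static void main(String[] args) {
        String target = timestamped("database.gdb", LocalDateTime.now());
        // With an open session, the backup itself would be the call shown above:
        //   graph.backup(target);
        System.out.println(target);
    }
}
```

Since the backup blocks other transactions, it may be preferable to run it when the database is least busy.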

Recovery

Sparksee includes an automatic recovery manager that protects the database against application or system failures: after a failure, the recovery manager brings the database back to a consistent state on the next restart.

By default, the recovery functionality is disabled, so the user must enable and configure the manager in order to use it. The recovery manager introduces a small performance penalty, so there is always a trade-off between the protection it provides and a minor decrease in performance.

The configuration includes:

This configuration can be performed with the SparkseeConfig class or by setting the values in a Sparksee configuration file. This is explained in detail in the 'Recovery' section of the 'Configuration' chapter.

Runtime information

Logging

It is possible to enable the logging of Sparksee activity. The log configuration requires both the level and the log file path.

This configuration can be performed with the SparkseeConfig class or by setting the values in a Sparksee configuration file. This is explained in detail in the 'Log' section of the 'Configuration' chapter.

The valid Sparksee log levels are defined in the LogLevel enum class. They are listed below from the least verbose to the most verbose:

  1. Off

    Log is disabled.

  2. Severe

    The log only stores errors.

  3. Warning

    Errors and situations which may require special attention are logged.

  4. Info

    Errors, warnings and information messages are logged.

  5. Config

    Log includes configuration details of the different components.

  6. Fine

    This is the most complete log level available in a release version of the library; it includes the previous levels of logging plus additional platform details.

  7. Debug

    Log debug information. It only works for a debug version of the library, so it can only be used by developers.
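The ordering above can be read as a threshold: a message is kept when its level is no more verbose than the configured one. The following sketch illustrates that reading only. SketchLogLevel is a hypothetical stand-in, not the Sparksee LogLevel class, and it assumes that every level includes all of the less verbose ones, which the descriptions above state explicitly only for some levels.

```java
/** Illustrative stand-in for the level ordering; this is not the Sparksee LogLevel enum. */
enum SketchLogLevel {
    // Declared from least to most verbose, matching the list above.
    OFF, SEVERE, WARNING, INFO, CONFIG, FINE, DEBUG;

    /** True if a message of the given level would be kept when this level is configured. */
    boolean records(SketchLogLevel message) {
        // Nothing is recorded when logging is OFF, and OFF is not a message level.
        return this != OFF && message != OFF && message.ordinal() <= this.ordinal();
    }

    public static void main(String[] args) {
        // List which message levels a WARNING configuration would keep.
        for (SketchLogLevel message : values()) {
            System.out.println(message + " kept at WARNING: " + WARNING.records(message));
        }
    }
}
```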

Dumps

There are two methods to dump a summary of the content from a Sparksee database.

Both files are written using YAML, a human-readable data serialization format.

Statistics

Sparksee offers a set of runtime statistics for its different components. Before using any of the statistics methods, check the corresponding class in the reference manual of the chosen programming language.

Database statistics

The class DatabaseStatistics provides general information about the database:

Use the Database#getStatistics method to retrieve this information.

Platform statistics

The class PlatformStatistics provides general information about the platform where Sparksee is running:

Use the Platform#getStatistics method to retrieve this information.

Attribute statistics

The class AttributeStatistics provides information about a certain attribute:

For numerical attributes (integer, long and double) it also includes:

For string attributes it also includes:

Use the Graph#getAttributeStatistics method to retrieve this information. The method takes a boolean argument that specifies whether basic statistics (TRUE) or complete statistics (FALSE) are retrieved for that datatype. Check the reference manual to see which statistics are considered basic.

The administrator may also want to check which attributes have a value in a certain range, in which case the method Graph#getAttributeIntervalCount would be the most appropriate.

Note that neither method works for Basic attributes: statistics can only be retrieved for Indexed or Unique attributes. See the 'API' chapter for more details on the attribute types.

Cache statistics

Finally, it is also possible to enable logging of the cache in order to monitor its activity. By default, cache logging is disabled, so it must be enabled and configured first. This configuration can be performed with the SparkseeConfig class or by setting the values in a Sparksee configuration file. This is explained in detail in the 'Log' section of the 'Configuration' chapter.

The configuration of the cache statistics includes:

The cache statistics log includes:
