Sparksee allows some configuration variables to be set in order to update or monitor the behavior of the system, as well as to make the deployment of Sparksee-based applications easier.
There are two alternatives for setting up the configuration: the SparkseeConfig class and a properties file. Both can even be used at the same time. In case of conflict, the settings from the SparkseeConfig class have a higher priority.
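The priority rule can be pictured as a layered lookup. The following is only an illustrative sketch in plain Python, not Sparksee code: the function name `effective_settings` and the dictionaries are hypothetical, and the defaults are copied from the variable list in this chapter.

```python
from collections import ChainMap

# Defaults copied from the variable list in this chapter.
DEFAULTS = {
    "sparksee.io.cache.maxsize": "0",
    "sparksee.log.level": "info",
    "sparksee.log.file": "sparksee.log",
}

def effective_settings(from_config_class, from_properties_file):
    """Hypothetical helper: values set through SparkseeConfig win over the
    properties file, which in turn wins over the defaults."""
    return dict(ChainMap(from_config_class, from_properties_file, DEFAULTS))

# The file asks for 1024 MB, but a setter call asks for 2048 MB.
file_values = {"sparksee.io.cache.maxsize": "1024", "sparksee.log.level": "fine"}
setter_values = {"sparksee.io.cache.maxsize": "2048"}
settings = effective_settings(setter_values, file_values)
```

Here the cache size comes from the setter, the log level from the file, and the log file name from the defaults.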
SparkseeConfig class
The SparkseeConfig class defines a setter and a getter for each of the variables that can be specified for Sparksee configuration. If not changed, all the variables are loaded with default values. All these values can be overwritten by the user calling the corresponding setter or creating a properties file.
In order to set up a configuration the user must create an instance of the SparkseeConfig class, which will have the default values. All the variables may be set if the user needs to change them. We strongly recommend understanding each of the variables prior to changing their default values:
Java:
SparkseeConfig cfg = new SparkseeConfig();
cfg.setCacheMaxSize(2048); // 2 GB
cfg.setLogFile("HelloSparksee.log");
...
Sparksee sparksee = new Sparksee(cfg);
Database db = sparksee.create("HelloSparksee.gdb", "HelloSparksee");
C#:
SparkseeConfig cfg = new SparkseeConfig();
cfg.SetCacheMaxSize(2048); // 2 GB
cfg.SetLogFile("HelloSparksee.log");
...
Sparksee sparksee = new Sparksee(cfg);
Database db = sparksee.Create("HelloSparksee.gdb", "HelloSparksee");
C++:
SparkseeConfig cfg;
cfg.SetCacheMaxSize(2048); // 2 GB
cfg.SetLogFile(L"HelloSparksee.log");
...
Sparksee * sparksee = new Sparksee(cfg);
Database * db = sparksee->Create(L"HelloSparksee.gdb", L"HelloSparksee");
Python:
cfg = sparksee.SparkseeConfig()
cfg.set_cache_max_size(2048) # 2 GB
cfg.set_log_file("Hellosparksee.log")
...
sparks = sparksee.Sparksee(cfg)
db = sparks.create(u"Hellosparksee.gdb", u"HelloSparksee")
Objective-C:
STSSparkseeConfig *cfg = [[STSSparkseeConfig alloc] init];
[cfg setCacheMaxSize: 2048]; // 2 GB
[cfg setLogFile: @"HelloSparksee.log"];
...
STSSparksee *sparksee = [[STSSparksee alloc] initWithConfig: cfg];
//[cfg release];
STSDatabase *db = [sparksee create: @"HelloSparksee.gdb" alias: @"HelloSparksee"];
As previously explained, as an alternative to the SparkseeConfig methods, the user can load Sparksee configuration variables from a properties file, which makes developing applications with Sparksee easier.
A properties file is a plain-text file with one line per property. Each property is defined by a key and a value as follows: key=value.
This is an example of a Sparksee configuration properties file:
sparksee.license="XXXXX-YYYYY-ZZZZZ-PPPPP"
sparksee.io.cache.maxsize=2048
sparksee.log.file="HelloSparksee.log"
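To make the key=value format concrete, here is a minimal sketch of how such a file could be read. It is not the parser Sparksee uses: `parse_properties` is a hypothetical helper, and skipping blank or malformed lines is an assumption of this sketch.

```python
def parse_properties(text):
    """Parse key=value lines into a dict, dropping optional double quotes."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # assumption: blank or malformed lines are skipped
        key, value = line.split("=", 1)
        value = value.strip()
        # Strings may optionally be quoted, so strip one pair of quotes.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        props[key.strip()] = value
    return props

EXAMPLE = '''sparksee.license="XXXXX-YYYYY-ZZZZZ-PPPPP"
sparksee.io.cache.maxsize=2048
sparksee.log.file="HelloSparksee.log"
'''
props = parse_properties(EXAMPLE)
```

All values come back as strings; converting `sparksee.io.cache.maxsize` to a number would be a separate step.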
By default, SparkseeConfig tries to load all the variables defined in the ./sparksee.cfg file (in the execution directory). If the user has not created this file, the default values will be assumed.
The SparkseeProperties class also contains a method for specifying a different file name and path to be loaded.
To load Sparksee configuration variables from a file other than the default, the load method must be called before the SparkseeConfig class is instantiated, as shown in the following examples, where the mysparksee.cfg file is used to load Sparksee configuration variables:
Java:
SparkseeProperties.load("sparksee/config/dir/mysparksee.cfg");
SparkseeConfig cfg = new SparkseeConfig();
Sparksee sparksee = new Sparksee(cfg);
Database db = sparksee.create("HelloSparksee.gdb", "HelloSparksee");
C#:
SparkseeProperties.Load("sparksee/config/dir/mysparksee.cfg");
SparkseeConfig cfg = new SparkseeConfig();
Sparksee sparksee = new Sparksee(cfg);
Database db = sparksee.Create("HelloSparksee.gdb", "HelloSparksee");
C++:
SparkseeProperties::Load(L"sparksee/config/dir/mysparksee.cfg");
SparkseeConfig cfg;
Sparksee * sparksee = new Sparksee(cfg);
Database * db = sparksee->Create(L"HelloSparksee.gdb", L"HelloSparksee");
Python:
sparksee.SparkseeProperties.load("sparksee/config/dir/mysparksee.cfg")
cfg = sparksee.SparkseeConfig()
sparks = sparksee.Sparksee(cfg)
db = sparks.create(u"Hellosparksee.gdb", u"HelloSparksee")
Objective-C:
[STSSparkseeProperties load: @"sparksee/config/dir/mysparksee.cfg"];
STSSparkseeConfig *cfg = [[STSSparkseeConfig alloc] init];
STSSparksee *sparksee = [[STSSparksee alloc] initWithConfig: cfg];
//[cfg release];
STSDatabase *db = [sparksee create: @"HelloSparksee.gdb" alias: @"HelloSparksee"];
Here is the list of the existing Sparksee configuration variables divided by categories, with an identifier name, a description, the valid format for the value, the default value and the corresponding method to set each variable in the SparkseeConfig class. Please note that the identifier name is the key to be used in the properties file.
Valid formats for the variable values are:
A string, optionally quoted.
A number.
A time unit: <X>[D|H|M|S|s|m|u] where <X> is a number, optionally followed by a unit character: D for days, H for hours, M for minutes, S or s for seconds, m for milliseconds and u for microseconds. If no unit character is given, seconds are assumed.
sparksee.license
: License user code. If no license code is provided, by default the evaluation restrictions are in place.
Value format: String. Default value: ""
This variable can be set by calling SparkseeConfig#setLicense
Details about logging capabilities & value information are given in the 'Logging' section of the 'Maintenance and monitoring' chapter.
sparksee.log.level
: logging level. Allowed values are (case insensitive): off, severe, warning, info, config, fine.
Value format: String. Default value: info.
This variable can be set by calling SparkseeConfig#setLogLevel
sparksee.log.file
: log file. This is ignored when the log level is 'off'.
Value format: String. Default value: "sparksee.log"
This variable can be set by calling SparkseeConfig#setLogFile
Details about the use of transactions in Sparksee are given in the 'Transactions' section of the 'Graph Database' chapter.
sparksee.io.rollback
: enable ("true") or disable ("false") the rollback functionality.
Value format: String. Default value: "true"
This variable can be set by calling SparkseeConfig#setRollbackEnabled
Details about the recovery module are given in the 'Recovery' section of the 'Maintenance and monitoring' chapter.
sparksee.io.recovery
: enable ("true") or disable ("false") the recovery functionality. If disabled, all the other related variables are ignored.
Value format: String. Default value: "false"
This variable can be set by calling SparkseeConfig#setRecoveryEnabled
sparksee.io.recovery.logfile
: recovery log file. An empty string means that the recovery log file will be the database path with the suffix ".log" appended. For example, if the database is at "database.gdb" then the recovery log file will be "database.gdb.log".
Value format: String. Default value: ""
This variable can be set by calling SparkseeConfig#setRecoveryLogFile
sparksee.io.recovery.cachesize
: maximum size for the recovery cache in extents.
Value format: number. Default value: 256 (for the default extent size)
This variable can be set by calling SparkseeConfig#setRecoveryCacheMaxSize
sparksee.io.recovery.checkpointTime
: checkpoint frequency for the recovery cache.
Value format: time unit. Values must be greater than or equal to 1 second. Default value: 60
This variable can be set by calling SparkseeConfig#setRecoveryCheckpointTime
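The time unit format used by this variable (and described in the list of valid formats) can be sketched as a small parser. This is an illustration only, not Sparksee code: `parse_time_unit` and `is_valid_checkpoint_time` are hypothetical names, and the sketch assumes that the number is a non-negative integer.

```python
import re

# Microseconds per unit character, following the format description.
_UNIT_MICROS = {
    "D": 86_400_000_000,  # days
    "H": 3_600_000_000,   # hours
    "M": 60_000_000,      # minutes
    "S": 1_000_000,       # seconds
    "s": 1_000_000,       # seconds
    "m": 1_000,           # milliseconds
    "u": 1,               # microseconds
}
_TIME_RE = re.compile(r"(\d+)([DHMSsmu]?)")

def parse_time_unit(value):
    """Return the duration in microseconds for a string such as '12H' or '60'."""
    match = _TIME_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError("not a valid time unit: %r" % value)
    number, unit = match.groups()
    # With no unit character, seconds are assumed.
    return int(number) * _UNIT_MICROS[unit or "s"]

def is_valid_checkpoint_time(value):
    """The checkpoint time must be greater than or equal to 1 second."""
    return parse_time_unit(value) >= 1_000_000
```

Under these assumptions the default value 60 is read as 60 seconds, while a value such as 500m (half a second) would be rejected for the checkpoint time.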
sparksee.storage.extentsize
: extent size in KB. An extent is the unit for I/O operations. It must be the same for the whole life of the database. Therefore, it can be set when the database is created and it cannot be updated subsequently. Recommended sizes are: 4, 8, 16, 32 or 64. Note that large extent sizes will incur a performance penalty if recovery is enabled.
Value format: number. Default value: 4
This variable can be set by calling SparkseeConfig#setExtentSize
sparksee.storage.extentpages
: number of pages per extent. A page is the logical unit of allocation size for the internal data structures of the system. Like the extent size, it can be set when the database is created and it cannot be updated subsequently.
Value format: number. Default value: 1
This variable can be set by calling SparkseeConfig#setExtentPages
The cache is split into different areas, the most important being for the persistent data whilst the rest are for the running user's sessions. The size of these areas is automatic, dynamic and adaptive depending on the requirements of each of the parts. Thus, there is constant negotiation between them. Although default values might be the best configuration, the user can adapt them for very specific requirements, taking into account the fact that negotiation between areas only happens when the pool is using memory in the range [minimum, maximum]. Therefore pools are always guaranteed to have at least the minimum memory of the configuration and never be larger than the maximum.
sparksee.io.cache.maxsize
: maximum size for all the cache pools in MBs (persistent pools as well as all session & other temporary pools). A value for this variable of zero means unlimited, which is in fact all the available memory (when the system starts) minus 512 MB (which is a naive estimate of the minimum memory to be used by the OS).
Value format: number. Default value: 0
This variable can be set by calling SparkseeConfig#setCacheMaxSize
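The meaning of a zero value can be written out as simple arithmetic. The sketch below only restates the rule given above; `effective_cache_max_mb` is a hypothetical name, and clamping the result at zero on machines with less than 512 MB available is an assumption of this sketch.

```python
OS_RESERVE_MB = 512  # the estimate of memory left to the OS, as stated above

def effective_cache_max_mb(configured_mb, available_mb_at_start):
    """Return the cache limit in MB implied by sparksee.io.cache.maxsize."""
    if configured_mb == 0:
        # Zero means "unlimited": all memory available at start minus the reserve.
        return max(available_mb_at_start - OS_RESERVE_MB, 0)
    return configured_mb

unlimited = effective_cache_max_mb(0, 8192)    # 7680
explicit = effective_cache_max_mb(2048, 8192)  # 2048
```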
sparksee.io.pool.frame.size
: number of extents per frame. Whereas the extent size determines the I/O unit, the frame size is the allocation/deallocation unit in number of extents. Thus, values higher than 1 allow the pre-allocation of extents in bulk which might make sense when loading large data sets, especially if working with a small extent size.
Value format: number. Default value: 1
This variable can be set by calling SparkseeConfig#setPoolFrameSize
sparksee.io.pool.persistent.minsize
: minimum size in number of frames for the persistent pool where zero means unlimited.
Value format: number. Default value: 64
This variable can be set by calling SparkseeConfig#setPoolPersistentMinSize
sparksee.io.pool.persistent.maxsize
: maximum size in number of frames for the persistent pool where zero means unlimited.
Value format: number. Default value: 0
This variable can be set by calling SparkseeConfig#setPoolPersistentMaxSize
sparksee.io.pool.temporal.minsize
: minimum size in number of frames for the session/temporal pools where zero means unlimited. It makes sense to have a larger minimum size in case of highly memory-consuming sessions.
Value format: number. Default value: 16
This variable can be set by calling SparkseeConfig#setPoolTemporaryMinSize
sparksee.io.pool.temporal.maxsize
: maximum size in number of frames for the session/temporal pool where zero means unlimited.
Value format: number. Default value: 0
This variable can be set by calling SparkseeConfig#setPoolTemporaryMaxSize
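Because pool sizes are expressed in frames, a frame is a number of extents, and the extent size is given in KB, the memory behind a pool setting can be estimated by multiplying the three values. The helper below is a hypothetical sketch that assumes a frame occupies exactly frame size times extent size, ignoring any internal overhead.

```python
def frames_to_kb(frames, extent_size_kb=4, frame_size_extents=1):
    """Estimate the memory in KB covered by a pool size given in frames.

    The defaults mirror sparksee.storage.extentsize (4) and
    sparksee.io.pool.frame.size (1).
    """
    return frames * frame_size_extents * extent_size_kb

persistent_min_kb = frames_to_kb(64)  # default persistent minimum: 256 KB
temporal_min_kb = frames_to_kb(16)    # default session/temporal minimum: 64 KB
```

With the default values, the guaranteed minimums are therefore small; raising the extent size or the frame size raises them proportionally.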
For development and debugging purposes it may be useful to enable the cache statistics. More details about this can be found in the 'Statistics' section of the 'Maintenance and monitoring' chapter.
sparksee.cache.statistics
: enable ("true") or disable ("false") storing of cache statistics.
Value format: String. Default value: false
This variable can be set by calling SparkseeConfig#setCacheStatisticsEnabled
sparksee.cache.statisticsFile
: file where cache statistics are stored. This is ignored when the cache statistics are disabled.
Value format: String. Default value: statistics.log
This variable can be set by calling SparkseeConfig#setCacheStatisticsFile
sparksee.cache.statisticsSnapshotTime
: frequency for storing the cache statistics. This is ignored when the cache statistics are disabled.
Value format: time unit. Default value: 1000
This variable can be set by calling SparkseeConfig#setCacheStatisticsSnapshotTime
High availability configuration is explained in detail in the 'High availability' chapter.
sparksee.ha
: enables or disables HA mode. If disabled, all other HA variables are ignored.
Value format: String. Default value: false
This variable can be set by calling SparkseeConfig#setHighAvailabilityEnabled
sparksee.ha.ip
: IP address and port for the instance. It must be given as ip:port
Value format: String. Default value: localhost:7777
This variable can be set by calling SparkseeConfig#setHighAvailabilityIP
sparksee.ha.coordinators
: comma-separated list of the ZooKeeper instances. For each instance, the IP address and the port must be given as ip:port. Moreover, the port must correspond to the clientPort given in the ZooKeeper configuration file.
Value format: String. Default value: ""
This variable can be set by calling SparkseeConfig#setHighAvailabilityCoordinators
sparksee.ha.sync
: synchronization polling time. If 0, polling is disabled and synchronization is only performed when the slave receives a write request. Otherwise, the value sets how often the slaves poll the master asking for writes. The polling timer is reset whenever a slave receives a write request, because the slave synchronizes at that moment.
Value format: time unit. Default value: 0.
This variable can be set by calling SparkseeConfig#setHighAvailabilitySynchronization
sparksee.ha.master.history
: master's history log. The history log is limited to a certain period of time, so writes older than that period are removed from it, and the master will not accept requests from slaves that still depend on the removed writes. For example, with 12H the master stores all write operations from the last 12 hours in the history log, and it rejects requests from a slave which has not been updated in the last 12 hours.
Value format: time unit. Default value: 1D
This variable can be set by calling SparkseeConfig#setHighAvailabilityMasterHistory
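The history window rule can be illustrated with a few lines of date arithmetic. This is a sketch of the rule as described, not of the actual HA protocol: `master_accepts` is a hypothetical name, and treating a slave exactly at the boundary as still accepted is an assumption.

```python
from datetime import datetime, timedelta

def master_accepts(slave_last_update, now, history=timedelta(days=1)):
    """Return True if the slave was updated within the master's history window.

    The default window mirrors the default value 1D.
    """
    return now - slave_last_update <= history

now = datetime(2024, 1, 2, 12, 0, 0)
window = timedelta(hours=12)  # corresponds to the value 12H
recent = master_accepts(now - timedelta(hours=3), now, window)   # True
stale = master_accepts(now - timedelta(hours=13), now, window)   # False
```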
In most cases default values are the best option. In fact, non-advanced users should consider setting only the following variables for a basic configuration:
License (sparksee.license).
Maximum cache size (sparksee.io.cache.maxsize).
Recovery (sparksee.io.recovery, sparksee.io.recovery.logfile).
Logging (sparksee.log.level and sparksee.log.file).