This document is the reference for the Sparksee openCypher Language (SCL). SCL is inspired by the openCypher query language. "Inspired" means that, although the goal is to keep SCL as close as possible to openCypher, differences between Sparksee’s property graph model and the openCypher model, together with other technical issues, cause the two languages to diverge in some aspects.
In this section, we detail the main differences between SCL and openCypher.
For nodes, if the type of the node is known: “type-specific” → NODES → GLOBAL.
For example, in the following query, the type of n is known to be an “actor”. Thus, Sparksee will first look for an “actor” “type-specific” property “name”. If it exists, it will use that property in the query. If it does not exist, it will look for a NODES property named “name”. If it exists, it will use that property in the query. If not, it will look for a GLOBAL property. If it exists, it will use that property in the query. If not, the query will throw an error.
MATCH (n:actor {name : 'Brad'})
RETURN *
Similarly, in the following query, the type of n is known to be “actor” because of the right-hand path of the pattern, which guarantees that the nodes bound to n are of type “actor”. Thus, Sparksee will first look for an “actor” “type-specific” property “name”. If it exists, it will use that property in the query. If it does not exist, it will look for a NODES property named “name”. If it exists, it will use that property in the query. If not, it will look for a GLOBAL property. If it exists, it will use that property in the query. If not, the query will throw an error.
MATCH (n), (n:actor)-[]-()
RETURN n.name
For nodes, if the type of the node is NOT known: NODES → GLOBAL.
For example, in the following query, the type of n is not known. Thus, Sparksee will first look for a NODES property named “name”. If it exists, it will use that property in the query. If not, it will look for a GLOBAL property. If it exists, it will use that property in the query. If not, the query will throw an error.
MATCH (n {name : 'Brad'})
RETURN *
For edges, if the type of the edge is known: “type-specific” → EDGES → GLOBAL.
For example, in the following query, the type of r is known to be “role”. Thus, Sparksee will first look for a “role” “type-specific” property “type”. If it exists, it will use that property in the query. If it does not exist, it will look for an EDGES property named “type”. If it exists, it will use that property in the query. If not, it will look for a GLOBAL property “type”. If it exists, it will use that property in the query. If not, the query will throw an error.
MATCH ()-[r:role {type : 'actor'}]->()
RETURN *
For edges, if the type of the edge is NOT known: EDGES → GLOBAL.
For example, in the following query, the type of r is unknown. Thus, Sparksee will look for an EDGES property named “type”. If it exists, it will use that property in the query. If not, it will look for a GLOBAL property “type”. If it exists, it will use that property in the query. If not, the query will throw an error.
MATCH ()-[r {type : 'actor'}]->()
RETURN *
Similarly, in the following query, the type of r is unknown because r is the union of two edge collections of different types. Thus, Sparksee will look for an EDGES property named “timestamp”. If it exists, it will use that property in the query. If not, it will look for a GLOBAL property “timestamp”. If it exists, it will use that property in the query. If not, the query will throw an error.
MATCH ()-[r:role]->()
UNION
MATCH ()-[r:is_located]->()
RETURN r.timestamp
Finally, if nodes and edges are mixed in a query: GLOBAL.
For example, the following query takes the union of nodes of type “actor” and edges of type “role”. In this case, the type of the column is “unknown”, since it contains objects of different kinds. When looking for the “timestamp” property, Sparksee will always look for a GLOBAL property. If it does not exist, the query will return an error.
MATCH (n:actor)
UNION
MATCH ()-[n:role]->()
RETURN n.timestamp
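The lookup order described above can be summarised in a small Python sketch. This is only an illustrative model, not Sparksee's implementation: the `Schema` dictionary, the `lookup_scopes` and `resolve_property` helpers, and the `kind` values ("node", "edge", "mixed") are all hypothetical names introduced for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical schema model: (scope, property name) -> property handle.
# A scope is either a concrete type name such as "actor", or one of the
# special scopes "NODES", "EDGES" and "GLOBAL".
Schema = Dict[Tuple[str, str], str]


def lookup_scopes(kind: str, type_name: Optional[str]) -> List[str]:
    """Return the scopes to search, in order.

    kind is "node", "edge", or "mixed" (a column holding nodes and edges).
    type_name is the inferred type, or None when the type is not known.
    """
    if kind == "mixed":
        return ["GLOBAL"]
    scopes: List[str] = []
    if type_name is not None:
        scopes.append(type_name)  # "type-specific" scope comes first
    scopes.append("NODES" if kind == "node" else "EDGES")
    scopes.append("GLOBAL")
    return scopes


def resolve_property(schema: Schema, kind: str,
                     type_name: Optional[str], prop: str) -> str:
    """Return the first matching property, or raise if none exists."""
    for scope in lookup_scopes(kind, type_name):
        handle = schema.get((scope, prop))
        if handle is not None:
            return handle
    raise LookupError(f"property {prop!r} not found")


# Example schema (made up for illustration).
example_schema: Schema = {
    ("actor", "name"): "actor.name",
    ("NODES", "name"): "NODES.name",
    ("GLOBAL", "timestamp"): "GLOBAL.timestamp",
}

# MATCH (n:actor {name : 'Brad'}): the type-specific property wins.
known = resolve_property(example_schema, "node", "actor", "name")
# MATCH (n {name : 'Brad'}): type unknown, so the NODES property is used.
unknown = resolve_property(example_schema, "node", None, "name")
```

In this model, a query that mixes nodes and edges in one column would call `resolve_property(example_schema, "mixed", None, "timestamp")`, which only ever consults the GLOBAL scope.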
When adding properties to a node or edge from a map, the properties must exist in the schema and the values must be of the correct DataType. The same rules for inferring the property apply to the values that are inserted.
When using CREATE, one and only one type must always be provided for nodes and edges. Creating typeless or multi-typed nodes and edges is not allowed.
The DETACH DELETE clause is not supported. By default, when deleting a node, all its relationships are also deleted.
Setting a property to NULL with SET does not remove the property from the node; it sets its value to NULL. The Sparksee property-graph model does not allow nodes or edges to lack a property that their type defines. Instead, the property is NULL when they have no value for it.
The REMOVE clause is not supported, given that it is meant to remove properties from nodes. This does not make sense in Sparksee, since neither a property nor a label can be removed from a node. Instead, properties are set to NULL, which can already be done with the SET clause.
In the current version, the following features are not yet supported:
The following are the datatypes supported in expressions:
All expressions in SCL have a resulting datatype. Multiple expressions (typically one or two) can be combined with an operator to form another expression whose datatype will be that of the subexpressions. In general, only subexpressions of the same datatype can be combined. However, if the subexpressions have different datatypes and these are numeric (Integer, Long, Double or Timestamp), the subexpression with the more restrictive datatype is promoted to match the less restrictive one. The promotion order is established as follows: Integer → Long → Timestamp → Double.
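As a rough illustration of the promotion rule, the following Python sketch computes the datatype under which two subexpressions would be combined. The `PROMOTION_ORDER` list and the `promoted_datatype` function are hypothetical names used only for this example, and the datatype names are plain strings rather than Sparksee API values.

```python
# Promotion order as stated in the text, from most to least restrictive.
PROMOTION_ORDER = ["Integer", "Long", "Timestamp", "Double"]


def promoted_datatype(left: str, right: str) -> str:
    """Return the datatype two subexpressions are combined under.

    Equal datatypes combine as they are. Different numeric datatypes are
    promoted to the less restrictive of the two. Any other mix is rejected.
    """
    if left == right:
        return left
    if left in PROMOTION_ORDER and right in PROMOTION_ORDER:
        # The datatype that appears later in the order is less restrictive.
        return max(left, right, key=PROMOTION_ORDER.index)
    raise TypeError(f"cannot combine {left} with {right}")


int_and_long = promoted_datatype("Integer", "Long")            # "Long"
timestamp_and_double = promoted_datatype("Double", "Timestamp")  # "Double"
```

Under this model, combining an Integer with a Timestamp yields a Timestamp, while combining a String with an Integer is rejected because String is not a numeric datatype.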
SCL supports the following unary operators:
Operator | Description |
---|---|
NOT X | Negates X |
X IS NULL | Checks if X is null |
X IS NOT NULL | Checks if X is not null |
+ X | Unary positive arithmetic operator |
- X | Unary negative arithmetic operator |
SCL supports the following binary operators:
Operator | Description |
---|---|
X AND Y | Binary AND |
X OR Y | Binary OR |
X = Y | X equals Y |
X <> Y | X different from Y |
X > Y | X greater than Y |
X >= Y | X greater than or equal to Y |
X < Y | X less than Y |
X <= Y | X less than or equal to Y |
X + Y | X plus Y |
X - Y | X minus Y |
X * Y | X times Y |
X / Y | X divided by Y |
X % Y | X modulo Y |
Additionally, the following functions can be used in expressions:
Function | Description |
---|---|
toBoolean(X) | Casts X into a boolean |
toInteger(X) | Casts X into an Integer |
toLong(X) | Casts X into a Long |
toTimestamp(X) | Casts X into a Timestamp |
toDouble(X) | Casts X into a Double |
toString(X) | Casts X into a String |
type(X) | Returns the type of object X as a String |
date(X) or datetime(X) | Converts the string X (e.g. 2006-01-01) to a timestamp |
timestamp() | Returns the current timestamp |
SCL supports case expressions in two forms: simple and generic.
In simple case expressions, an expression is compared against different values until a match is found. If no match is found, the default value is returned, or NULL if no default has been specified. The syntax is as follows:
CASE test
WHEN value THEN result
[WHEN ...]
[ELSE default]
END
where “test” is a valid expression, “value” is a value the expression is compared to, and “result” is the value the column is set to when the comparison matches. If no match is found, the “default” value is used, or NULL if it is not specified. All results must be of the same datatype. For example, a valid simple case expression is:
CASE n.color
WHEN "Blue" THEN 1
WHEN "Red" THEN 2
ELSE 3
END AS COLOR_CATEGORY
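The evaluation of a simple case expression can be pictured with the Python sketch below. It is an analogy rather than SCL's implementation: `simple_case` is a hypothetical helper, Python's `None` stands in for NULL, and the sketch does not attempt to model how SCL compares NULL values.

```python
from typing import Any, List, Optional, Tuple


def simple_case(test: Any,
                branches: List[Tuple[Any, Any]],
                default: Optional[Any] = None) -> Any:
    """Compare test against each WHEN value in order.

    Returns the result of the first branch whose value equals test,
    otherwise the default (None when no ELSE branch is given).
    """
    for value, result in branches:
        if test == value:
            return result
    return default


def color_category(color: str) -> int:
    # Mirrors: CASE n.color WHEN "Blue" THEN 1 WHEN "Red" THEN 2 ELSE 3 END
    return simple_case(color, [("Blue", 1), ("Red", 2)], default=3)


red_category = color_category("Red")      # second branch matches
green_category = color_category("Green")  # no branch matches, ELSE applies
```

Note that every result in the example is an integer, in line with the requirement that all results share one datatype.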
Generic case expressions are similar to simple ones, with the difference that each WHEN branch takes a predicate, which can be an arbitrary expression, instead of a value to compare against. As in the simple case expression, all results must be of the same datatype. The syntax is as follows:
CASE
WHEN predicate THEN result1
[WHEN ...]
[ELSE default]
END
For instance, we could write:
CASE
WHEN n.age < 13 THEN "kid"
WHEN n.age < 18 THEN "teen"
ELSE "adult"
END
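A generic case expression can be read as a chain of tests tried in order. The Python sketch below illustrates that reading under the assumption, carried over from the simple form, that the first matching branch determines the result. `generic_case` and `age_group` are hypothetical helpers, and `None` stands in for NULL.

```python
from typing import Any, Callable, List, Optional, Tuple


def generic_case(branches: List[Tuple[Callable[[], bool], Any]],
                 default: Optional[Any] = None) -> Any:
    """Evaluate each WHEN predicate in order.

    Returns the result of the first branch whose predicate is true,
    otherwise the default (None when no ELSE branch is given).
    """
    for predicate, result in branches:
        if predicate():
            return result
    return default


def age_group(age: int) -> str:
    # Mirrors: CASE WHEN n.age < 13 THEN "kid"
    #               WHEN n.age < 18 THEN "teen" ELSE "adult" END
    return generic_case(
        [(lambda: age < 13, "kid"), (lambda: age < 18, "teen")],
        default="adult",
    )


child = age_group(10)
teenager = age_group(15)
grown_up = age_group(30)
```

Because the branches are tried in order, an age of 10 yields "kid" even though it also satisfies the second predicate.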