Learning high-performance graph database management with Sparksee at the NoSQL matters Training Session

On Friday, November 21st, from 9:00 to 13:00, Sparsity will host a Training Session as part of the NoSQL matters event.

Skilled trainers from the Sparksee team will show attendees how to take advantage of graphs by working through the most common queries that are best answered with a graph. The training will use a Twitter data model and dataset to build the graph, and will then cover queries on the resulting graph, such as discovering how two Twitter users are connected.
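
As a rough illustration of that last query, the sketch below finds a shortest chain of connections between two users with a breadth-first search. It is a plain C++ stand-in, not the Sparksee API or the training material: the `UserGraph` type, the `connect` and `shortest_connection` functions and the user names are all hypothetical, and connections are assumed to be undirected.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the Twitter graph: each user maps to
// the users they are connected to (treated as undirected for simplicity).
using UserGraph = std::map<std::string, std::vector<std::string>>;

// Adds an undirected connection between two users.
void connect(UserGraph& g, const std::string& a, const std::string& b) {
    g[a].push_back(b);
    g[b].push_back(a);
}

// Breadth-first search: returns one shortest chain of users from `from` to
// `to`, or an empty vector if the two users are not connected.
std::vector<std::string> shortest_connection(const UserGraph& g,
                                             const std::string& from,
                                             const std::string& to) {
    if (!g.count(from) || !g.count(to)) return {};
    std::map<std::string, std::string> parent;  // visited user -> predecessor
    std::queue<std::string> frontier;
    parent[from] = from;
    frontier.push(from);
    while (!frontier.empty()) {
        std::string user = frontier.front();
        frontier.pop();
        if (user == to) break;
        for (const std::string& next : g.at(user)) {
            // emplace only succeeds the first time a user is reached
            if (parent.emplace(next, user).second) frontier.push(next);
        }
    }
    if (!parent.count(to)) return {};
    // Walk the predecessor links back from `to`, then reverse the chain.
    std::vector<std::string> path;
    for (std::string u = to; u != from; u = parent[u]) path.push_back(u);
    path.push_back(from);
    std::reverse(path.begin(), path.end());
    assert(path.front() == from && path.back() == to);
    return path;
}
```

With the connections alice-bob and bob-carol, `shortest_connection(g, "alice", "carol")` returns the chain alice, bob, carol. A graph database would answer the same question directly on the stored graph instead of on an in-memory map.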

[Figure: Twitter data model]

Attendees will receive a NetBeans project with Sparksee Java and a complete set of fill-in-the-blank exercises. They will also receive a free development license, valid for 6 months, to build graphs of up to 1B objects with unlimited sessions.

Looking forward to meeting you at the NoSQL matters Training Session!

Posted in Events, News, Sparksee

Sparsity is attending NoSQL Matters 2014


We are glad to announce that Sparsity Technologies is attending the NoSQL Matters Conference 2014, which takes place in Barcelona on November 21st and 22nd. The conference will cover a broad spectrum of topics, including new products, use cases and field reports on the day-to-day operation of NoSQL infrastructures.

On November 21st, Arnau Prat and Joan Guisado will be giving a training session called Introduction to high performance graph data management with Sparksee. Attendees will learn how to model and load data as a graph, and will discover the full potential of graph databases and what they can offer compared to traditional database paradigms, using the Sparksee graph database. More details about this session will follow in a post next week.

On November 22nd, Sparsity’s CEO Josep Lluís Larriba-Pey will be giving the talk Graph databases go mobile, Sparksee 5 mobile use cases. He will present Sparksee mobile and explain a few use cases in the area of analytics for Social and Open Data where the use of graphs boosts job search, private recommendation, community search and personal tourist route planning.

You can still buy your tickets for both the Training Day and the Conference, but hurry up, there are just a few left!

We will also host a booth in the hall area on November 22nd; don’t forget to stop by and say hello.

The conference will take place at Casa Convalescència, C/ Sant Antoni Maria Claret 171, 08041, Barcelona. Looking forward to meeting you there!

Posted in Events, News, Sparksee, Sparksee mobile

Scalable Community Detection on the Cloud: SCD made product

In our post Graph Databases research: Social community search we introduced the Scalable Community Detection (SCD) algorithm, born from the research work of the Sparksee team together with DAMA-UPC. The basic idea behind SCD is to search for transitive relations (triangles) and understand how they are structured to form cohesive communities. This leads to a faster and more accurate community finder: faster than the Louvain algorithm (the fastest so far) and more accurate than the Oslom algorithm (which had the highest quality so far).

Now we can proudly announce that we are taking the first steps towards the commercialization of SCD thanks to a Technology Transfer Project (TTP) provided by the TETRACOM FP7 project. The mission of TETRACOM is to boost European academia-to-industry technology transfer (TT), and its main tool is the TTP, which provides partial funding for academia-industry collaborations that bring concrete R&D results into industrial use. In this case, the industrial partner, Sparsity Technologies, and the research partner, Universitat Politècnica de Catalunya, are working to make Scalable Community Detection on the Cloud (SCDC) a reality.

The general idea behind SCDC is to provide a scalable cloud service: users submit their network to our system and get back, in the shortest possible time, the most accurate communities inherent in it. For those curious about our technology choices, we are going to use a scalable architecture with Golang and Revel for the REST API, MongoDB to store the information and NSQ to process distributed queues.

[Figure: SCDC]

Stay tuned for more information about the project. Remember that you can get the SCD algorithm on GitHub.

Posted in News, Research, SNA

How to install & configure Sparksee iOS (Objective-C)

Sparksee is the first graph database available for iOS applications, offered since version 5.1 with both Objective-C and C++ interfaces. In this article we will guide you through a typical installation and configuration for Objective-C, so you can start working with Sparksee in your mobile development environment in a few minutes. You can take a look at our C++ tutorial published here as well.

 

Step 1) Downloading Sparksee

Download your Sparksee mobile library from our website here: http://www.sparsity-technologies.com/#download

We will send you an email explaining how to download your own copy. Downloads for mobile (like license requests) are moderated, so please wait until a support member contacts you.

Once you receive your download, uncompress the “.dmg” file to get the Sparksee.framework directory. The documentation is available in the framework’s Resources/Documentation directory.

Step 2) Creating a new project linking Sparksee mobile library

  • Add the Sparksee.framework to the Link Binary With Libraries build phase of your application project. You can just drag it there.
  • The next step depends on whether you are using C++ in your project or not:
    • For developers not using C++, you must explicitly add the right C++ standard library because the Sparksee library core depends on it. Click on the “+” sign of the same “Link Binary With Libraries” build phase of your application project, then select the appropriate C++ library (“libc++.dylib” for the LLVM C++11 version or “libstdc++.6.dylib” for the GNU C++ version) and finally click the “Add” button.
    • For developers already using C++ in their project, choose the most appropriate library: libstdc++ (GNU C++ standard library) or libc++ (LLVM C++ standard library with C++11 support) in the C++ Standard Library option from the build settings of the compiler. This version must match the one you downloaded in the first place.
  • Now import the header in your source code by adding:
    #import <Sparksee/Sparksee.h>
  • Take into account that after all these changes you may need to Clean your Project.

Run the empty application! That’s it! You now have a new project using Sparksee.

Step 3) Other configuration considerations

Setting an explicit memory limit to the Sparksee cache is highly recommended. For more information about Sparksee configuration variables check the Configuration chapter in the User Manual.

Step 4) Initial steps

Now create a new Sparksee database by following these steps in order:

  • Some configuration is needed before creating the database. First of all, create a new configuration object with STSSparkseeConfig *cfg = [[STSSparkseeConfig alloc] init];
  • Set the license code with [cfg setLicense: @"THE_LICENSE_KEY"]; Sparksee mobile only works with a valid key, which you will receive in the same email as your download.
  • Limit the Sparksee cache memory with [cfg setCacheMaxSize: smallsizeinMB]; by default Sparksee takes all the free memory available, which is something you will surely want to control on a mobile device.
  • Activate the recovery functionality with [cfg setRecoveryEnabled: TRUE]; recovery is a helpful feature that lets you recover all your data if an error occurs.
  • Set the log file with [cfg setLogFile: pathtothelogfile]; specify the full path and name of a log file in a location where you have write permission.
  • Create the main Sparksee object with STSSparksee *sparksee = [[STSSparksee alloc] initWithConfig: cfg]; note that the argument of this method is the configuration object created above.
  • Sparksee is now configured, so create your database with STSDatabase *db = [sparksee create: pathtothedbfile alias: @"nameofthedatabase"];
  • Create a new session with STSSession *sess = [db createSession];
  • Graph objects and operations are available at the Graph class level; get the graph from the session with STSGraph *g = [sess getGraph];

Done!

Posted in Documentation, Sparksee, Sparksee mobile

Sparksee 5.1 new release!


We are proud to announce today the official release of Sparksee 5.1.

Over the past half year we have been working on a set of exciting features for the new version that we hope you’ll find interesting:

 

  • A brand new Objective-C API for MacOS and iOS. Although we already offered a C++ API to work with Objective-C based projects, some of you pointed out that it would be much better to work directly with an Objective-C interface. In addition, this new API allows you to work with Swift projects.
  • Rollback functionality in our transactions. We have included rollback functionality, which brings full ACID compliance to our database.
  • Enhanced compatibility with Blueprints. Blueprints is analogous to JDBC, but for graph databases, providing a common set of interfaces that lets developers plug and play their graph database backend. We have implemented the following elements in order to be fully compatible with the complete Tinkerpop stack:
    • Implementation of the full TransactionalGraph. Thanks to our new rollback functionality we now provide this implementation and thus support Blueprints transactions.
    • Attribute scope for all nodes or all edges. We provide a new scope for our attributes that allows the same attribute type to be shared by all the nodes in the graph (or, analogously, by all the edges). Sparksee now has the following scopes for attributes:
        • Attribute for a certain node or edge type (the most common)
        • Attribute for all nodes or all edges
        • Global attributes
    • Transactions can begin directly as write transactions. By default (using a simple begin), Sparksee starts transactions as read transactions and does not change their state until an update is detected. If you need to avoid a “lost update”, you can now use “begin update”.
  • Dynamically adapting cache size. Especially on mobile devices, it can be important to change the maximum size dedicated to the cache dynamically: in certain situations the OS requires the application to release memory so it can be given to another application, and if your app fails to release it the process can be stopped. Sparksee now offers several configuration methods to handle this dynamically.
  • Compatibility with Visual Studio 2013 for .Net and C++ developers, so you can work with the latest programming environment.
  • Improved documentation for Sparksee mobile. The User Manual now includes specifics about the installation and configuration of Sparksee for the most common iOS and Android development frameworks.
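
The “lost update” mentioned in the transactions item above can be illustrated with a small, deterministic sketch. This is generic C++, not Sparksee code: the `Store` type and both functions are hypothetical, and the second function simply assumes that declaring write intent at the start keeps the second transaction from reading until the first one has finished.

```cpp
#include <cassert>

// Hypothetical single-value store standing in for a database attribute.
struct Store {
    int value;
};

// Two transactions both read before either of them writes, as can happen
// when each one starts as a reader and only later turns into a writer.
int interleaved_increments(Store s) {
    int read_a = s.value;   // transaction A reads the current value
    int read_b = s.value;   // transaction B reads the same, soon stale, value
    s.value = read_a + 1;   // A writes its increment
    s.value = read_b + 1;   // B overwrites it: A's update is lost
    return s.value;
}

// Assumed behaviour when each transaction declares write intent up front:
// B's read only happens after A's write, so both increments are kept.
int serialized_increments(Store s) {
    int read_a = s.value;
    s.value = read_a + 1;   // A reads and writes as one unit
    int read_b = s.value;
    s.value = read_b + 1;   // B starts from A's result
    return s.value;
}
```

Starting from a value of 10, the interleaved version ends at 11 while the serialized version ends at 12, which is the difference a write transaction started up front is meant to guarantee.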

You can start taking advantage of all the new features right now by downloading the latest version for free. Sparksee 5.1 is backward compatible, so if you are already developing with Sparksee 5.0 the only requirement is to switch one library for the other.

Don’t hesitate to contact us if you need more information on any of the new features, and please consider registering for one of our license programs: we offer free development programs throughout the whole process to selected companies!

Posted in News, Sparksee

Graph Databases research: Social community search

Today we would like to share an interesting article, recently published on DZone’s portal, about the research on social community search that the Sparksee team is carrying out together with DAMA-UPC.

Community search is a very important aspect of Social Network (SN) analysis. Communities are defined as tightly related groups of people, who, for instance, communicate or interact with the members of the community more intensely than with the rest of the population. SNs are complex representations of society and understanding their structure is key to be able to find those communities accurately.

Lately our research has focused on understanding the nature of social communities in order to use that knowledge to build a faster and more accurate community finder for very large graphs. As a result, in March 2014 at WWW’14 we presented the Scalable Community Detection (SCD) algorithm.

SCD exploits one of the basic properties of SNs: they have a larger number of transitive relations (triangles) than other types of networks. The number of triangles in a specific community will be larger than in the whole SN. The basic idea of SCD is to search for those triangles and understand how they are structured to form cohesive communities around them.
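
To make the notion of a triangle concrete, here is a minimal C++ sketch that counts the triangles each node takes part in. It only illustrates the quantity SCD builds on and is not the SCD algorithm itself; the `Graph` type and the function names are hypothetical, and nodes are assumed to be numbered from 0 to n-1.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Undirected graph over nodes 0..n-1, stored as one neighbour set per node.
using Graph = std::vector<std::set<int>>;

// Builds the graph from a list of undirected edges; self-loops are ignored.
Graph make_graph(int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 0);
    Graph g(static_cast<std::size_t>(n));
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < n && e.second >= 0 && e.second < n);
        if (e.first == e.second) continue;
        g[e.first].insert(e.second);
        g[e.second].insert(e.first);
    }
    return g;
}

// Number of triangles that contain node u: pairs of neighbours of u that
// are themselves connected (a transitive relation closed through u).
int triangles_of(const Graph& g, int u) {
    int count = 0;
    for (int v : g[u])
        for (int w : g[u])
            if (v < w && g[v].count(w)) ++count;  // v < w counts each pair once
    return count;
}

// Total number of triangles: every triangle is seen once from each of its
// three nodes, so the per-node sum is divided by three.
int total_triangles(const Graph& g) {
    int sum = 0;
    for (std::size_t u = 0; u < g.size(); ++u)
        sum += triangles_of(g, static_cast<int>(u));
    return sum / 3;
}
```

In a graph where nodes 0, 1 and 2 are all connected to each other and a chain 2-3-4 hangs off node 2, only the first three nodes take part in a triangle; a tightly knit group shows up as a region where these per-node counts are high.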

[Figure: Communities]

Comparing SCD results to other state-of-the-art algorithms, we can claim that it is faster than the Louvain algorithm (the fastest so far) and more accurate than the Oslom algorithm (which had the highest quality so far). Check the DZone article to read more details about this claim, and check the DAMA-UPC website to download the code of this algorithm.

Are you interested in using Sparksee for your research? Go ahead and request your free license under our Research program.

Posted in Research, Sparksee

How to install & configure Sparksee mobile for iOS (C++)

Sparksee is the first graph database available for iOS applications, available right now through its C++ interface. In this article we will guide you through a typical installation and configuration, so you can start working with Sparksee in your mobile development environment in a few minutes.

 

Step 1) Downloading Sparksee

Download your Sparksee mobile library from our website here: http://www.sparsity-technologies.com/#download

We will send you an email explaining how to download your own copy. Downloads for mobile (like license requests) are moderated, so please wait until a support member contacts you.

Once you receive your download, uncompress the “.dmg” file to get the Sparksee.framework directory. This directory contains the include files, the static library and the documentation.

Step 2) Creating a new project linking Sparksee mobile library

  • Add the Sparksee include files to the search path in your application project. The path to Sparksee.framework/Resources/sparksee/ must be added as non-recursive to the User Header Search Paths option on the build settings of your Xcode application project. This is required because Sparksee’s include files use a hierarchy of directories not usual in an Xcode framework. Therefore, they can’t be included in the regular Headers directory of the framework.
  • Add the Sparksee.framework to the Link Binary With Libraries build phase of your application project. You can just drag it there.
  • Choose the most appropriate library: libstdc++ (GNU C++ standard library) or libc++ (LLVM C++ standard library with C++11 support) in the C++ Standard Library option from the build settings of the compiler. This version must match the one you downloaded in the first place.
  • Take into account that after all these changes you may need to Clean your Project.

Run the empty application! That’s it! You now have a new project using Sparksee.

Step 3) Other configuration considerations

Remember that all the source files using C++ should have the extension “.mm” instead of “.m”.

Step 4) Initial steps

Now create a new Sparksee database by following these steps in order:

  • Add the following namespace: using namespace sparksee::gdb;
  • Some configuration is needed before creating the database. First of all, create a new configuration object with SparkseeConfig cfg;
  • Set the license code with SparkseeConfig::SetLicense(key). Sparksee mobile only works with a valid key, which you will receive in the same email as your download.
  • Limit the Sparksee cache memory with SparkseeConfig::SetCacheMaxSize(MB). By default Sparksee takes all the free memory available, which is something you will surely want to control on a phone.
  • Activate the recovery functionality with SparkseeConfig::SetRecoveryEnabled(true). Recovery is a helpful feature that lets you recover all your data if an error occurs.
  • Set the log file with SparkseeConfig::SetLogFile(filename). Specify the full path and name of a log file in a location where you have write permission.
  • Create the main Sparksee object with new Sparksee(cfg). Note that the argument is the SparkseeConfig created above.
  • Sparksee is now configured, so create your database with Sparksee::Create(filename, alias). As with the log file, you need to give the full path and name.
  • Create a new session with Database::NewSession().
  • Graph objects and operations are available at the Graph class level; get the graph from the session with Session::GetGraph().

Done!

We continue to improve our Sparksee mobile iOS support: an Objective-C API is in the works and will be released soon, so stay tuned!

Posted in Documentation, Sparksee

Graph Databases research: Using semijoins programs to solve traversal queries

One of the most important challenges for graph databases is how to express graph queries and how to solve them efficiently. There is an important gap between the current industrial approach, with libraries of very efficient APIs for procedural languages such as Java, and high-level languages like Gremlin and Cypher that combine pipes or data flows of results from the execution of graph APIs. Also, it is very difficult to optimize complex graph-oriented programs based on direct calls to low-level APIs. In recent years several proposals have emerged from the research community, but these solutions are still far from being adopted by graph database vendors.

In the paper presented by Sparsity Technologies at the GRADES workshop 2014 (ACM publication pending), we propose an algebra with a set of operations to solve graph queries in Sparksee, and we introduce extensions to SQL-like features to compute recursive programs and collection-oriented procedures typical of graph algorithms. The operations of this algebra are capable of reproducing the behavior of the current Sparksee APIs and, at the same time, are flexible enough to be combined into query plans that are suitable for optimization using the most common database query optimization techniques, as well as future optimizations more specific to graph patterns and graph traversals. Also, by adapting the operations to the graph data representation of the Sparksee engine, the query runtime will be able to take advantage of compressed bitmap processing and combination.

An important property of the operations is that most of them belong to, or have been adapted from, the relational algebra, and the data structures, in the form of key-value pairs, are close to the relational model with multivalued attributes (Non First Normal Form relations). In particular, our main operation is the semijoin, which has been widely used in database technology for distributed systems, and whose program optimization has been studied and formalized in detail. Thus, our proposal is a compromise between the relational model and NoSQL, more specifically key-value stores and graph querying systems. On the other hand, our procedural extensions for parallel subquery execution and recursion can be used to simulate graph analytical frameworks for the computation of complex graph algorithms over huge graphs.
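
As a rough illustration of why the semijoin fits graph traversals, the C++ sketch below expresses one traversal step as a semijoin between an edge relation and the current set of nodes. It is a simplified stand-in and not the algebra or the operators defined in the paper: the `Edge` type and the `semijoin_on_tail`, `expand` and `k_hop` functions are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// An edge relation stored as (tail, head) pairs, i.e. tail -> head.
using Edge = std::pair<int, int>;

// Semijoin of the edge relation with a set of node keys: keeps the edges
// whose tail appears in `keys`. The result contains only columns of the
// edge relation, which is what distinguishes a semijoin from a full join.
std::vector<Edge> semijoin_on_tail(const std::vector<Edge>& edges,
                                   const std::set<int>& keys) {
    std::vector<Edge> out;
    for (const Edge& e : edges)
        if (keys.count(e.first)) out.push_back(e);
    return out;
}

// One traversal step: semijoin the edges with the frontier, then project
// the head column to obtain the next frontier.
std::set<int> expand(const std::vector<Edge>& edges,
                     const std::set<int>& frontier) {
    std::set<int> next;
    for (const Edge& e : semijoin_on_tail(edges, frontier)) next.insert(e.second);
    return next;
}

// Nodes reached from `start` by following exactly `hops` edges, written as
// a repeated (recursive) application of the same semijoin step.
std::set<int> k_hop(const std::vector<Edge>& edges, int start, int hops) {
    assert(hops >= 0);
    std::set<int> frontier{start};
    for (int i = 0; i < hops; ++i) frontier = expand(edges, frontier);
    return frontier;
}
```

Because each step only filters the edge relation by a set of keys, a plan made of such steps can in principle be reordered and optimized with the same kind of techniques used for semijoin programs in relational systems.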

Our next steps are to implement the prototype of the query engine with full support for the algebra, and to test and validate the performance of the proposal. After that, we will focus on query plan optimization for semijoin programs combined with our non-relational extensions for the resolution of complex queries over large graphs.

What are your thoughts on our approach of using the semijoin as the main operation? Also, remember that we have a free license program for researchers who would like to use Sparksee.

Posted in Research, Sparksee

Graph Databases research: Synthetic graph data generators

Sparsity Technologies is very interested not only in the graph database technology itself, but also in all the applications that can take advantage of Sparksee. One of the most popular ones is the analysis of social media data represented as graphs. For this topic, our research team is collaborating with social data experts in DAMA of the Universitat Politècnica de Catalunya.

Our latest work, carried out in conjunction with DAMA-UPC, was presented at the Graph Data-management Experiences & Systems (GRADES) Workshop. There, we presented our experiences with synthetic graph data generators. Such generators build arbitrarily large graphs that aim to simulate the structure of social networks. We experimented with two generators: LFR (by Lancichinetti et al.) and the LDBC data generator. The former provides the network structure of a social network, while the latter builds a full social network with attributes and posts that will be used in the LDBC benchmarks.

[Figure: growing graph (acknowledgment: Mike Bergman)]

In our paper, we observe that LFR is able to mimic some of the properties of real networks, though the distribution of triangles in the network differs. On the other hand, the LDBC data generator provides a much more accurate network, which simulates the community aspect of real networks with high precision. Such data generators are very valuable for Sparksee’s development since they allow us to generate huge graphs to test our database technology thoroughly. Read more about synthetic graph data generators in our paper here.

 

Posted in Research, Sparksee

Twitter influence contest at GraphLab Conference 2014

To celebrate our participation in the GraphLab Conference 2014 we have decided to hold a contest and test our beta Social Networks analytics technology using the Sparksee graph database. If you want to prove your influence on social media and win one of our 10 exclusive polos, don’t hesitate to join the contest!

How can I participate?

All you have to do is retweet the tweets containing the #sparsitycontest hashtag that we will publish from Friday 18th to Monday 21st. Our system is able to simulate the propagation of the tweets in order to discover the people who are influential in the topics covered by each specific tweet. The 10 most influential participants in those propagation chains will be the winners.

What do I win?

If you are among the top 10 influencers retweeting our tweets you will win one of our 10 exclusive gray polos with Sparsity’s logo embroidered in blue. All polos are size L. See picture in this post.

Where and when can I pick up the prize?

If you are one of the winners, you will have to pick up your prize at the Sparsity Technologies stand in the GraphLab Conference 2014 venue (table 22). The winners will be announced on July 21st at 17:30 through our official Twitter account @Sparsitytech, and the prizes will be handed over in person before the conference is over.

[Photo of the prize polos]

Posted in Events, Sparksee