Graph database use case: Insurance fraud detection

According to a fact sheet released by the Southwest Insurance Information Service (SIIS), approximately 10% of all insurance claims are fraudulent, and the Coalition Against Insurance Fraud estimates that fraudulent claims amount to nearly $80 billion annually in the U.S. Insurance fraud is clearly an issue that must be addressed, given the benefits that both the insurer and the insured obtain from its prevention: the insurance buyer receives coverage at a lower price, which in turn gives the insurance company a competitive advantage.

Insurance fraud can be perpetrated by the seller or the buyer. Seller fraud occurs when the seller of a policy hijacks the usual process in a way that maximizes his or her profit. Some examples are premium diversion, fee churning, ghost companies and workers' compensation fraud. Buyer fraud occurs when the buyer deliberately invents or exaggerates a loss in order to obtain more coverage or receive payment for damages. Some examples are false medical history, murder for proceeds, post-dated life insurance and faked accidents.

Traditional methods to detect and prevent this form of fraud include duplicate testing, date validation, calculating statistical parameters to identify outliers, using stratification or other types of analysis to identify unusual entries, and identifying gaps in sequential data. These methods are a great way to catch most casual, lone fraudsters, but sophisticated fraud rings are usually well organized and informed enough to avoid being spotted by traditional means. They use layered "false" collusions in much the same way as money laundering rings.
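To make three of these traditional checks concrete, here is a minimal Python sketch of duplicate testing, outlier detection and gap detection. The claim records, field layout and function names are all hypothetical and only serve as an illustration; a real system would be considerably more careful, for instance with a more robust outlier test than a simple z-score.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical claim records: (claim_id, policy_id, amount).
CLAIMS = [
    (1001, "P-1", 1200.0),
    (1002, "P-2", 950.0),
    (1003, "P-1", 1200.0),   # same policy and amount as claim 1001
    (1005, "P-3", 1100.0),   # claim 1004 is missing from the sequence
    (1006, "P-4", 1050.0),
    (1007, "P-5", 25000.0),  # unusually large amount
    (1008, "P-6", 990.0),
]


def find_duplicates(claims):
    """Duplicate testing: claims sharing the same policy and amount."""
    counts = Counter((policy, amount) for _cid, policy, amount in claims)
    return sorted(cid for cid, policy, amount in claims
                  if counts[(policy, amount)] > 1)


def find_outliers(claims, threshold=2.0):
    """Outliers: amounts more than `threshold` sample standard deviations
    away from the mean (a plain z-score test, assumed for illustration)."""
    amounts = [amount for _cid, _policy, amount in claims]
    mu, sigma = mean(amounts), stdev(amounts)
    return [cid for cid, _policy, amount in claims
            if abs(amount - mu) > threshold * sigma]


def find_gaps(claims):
    """Gaps in sequential data: claim ids missing between the min and max."""
    ids = sorted(cid for cid, _policy, _amount in claims)
    present = set(ids)
    return [i for i in range(ids[0], ids[-1] + 1) if i not in present]


duplicates = find_duplicates(CLAIMS)
outliers = find_outliers(CLAIMS)
gaps = find_gaps(CLAIMS)
print(duplicates, outliers, gaps)
```

Each check looks at claims one record or one column at a time, which is exactly why a ring that spreads its activity across several people and claims can slip through.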

In this scenario, where implementing alternative fraud detection methods is crucial, graph database management systems play a significant role. In the case of buyer fraud, the only way to catch the complex layered collusion performed by criminal rings is to analyze the relationships between the elements involved in the claim, which is a tedious task to perform on a relational database. While an RDBMS has to join a large number of tables (accidents, companies, drivers, lawyers, witnesses…) in complex schemas, a GDBMS only has to traverse the graph following the relationships between the nodes, which is significantly more efficient, especially in cases that require querying large datasets. An example of buyer insurance fraud represented as a graph would be as follows:

[Figure: graph of the insurance fraud example]

Example 1: Simple buyer insurance fraud case represented as a graph.

In this example, Subjects 5 and 4 participate in both accidents directly. Subject 1 is also related to both accidents indirectly, given that the car he drives is owned by the driver of Car 3, who is involved in Accident 2, which means Subjects 1 and 5 must know each other. With the added value of social network analytics (another scenario where graph DBMSs usually outperform relational DBMSs) we have one more clue supporting our suspicion of fraud: Subject 3 and Subject 6 are friends on Facebook, meaning that 5 out of the 6 subjects in the graph are related to both Accident 1 and Accident 2 in some way, which should raise a red flag for possible fraud. In addition, a graph database makes it easy to add data from different sources under a changing schema and, for instance, to traverse from Subject 1 to Subject 6 to see that they are in fact also related. To keep the example clear and easy to visualize we have used a fraud ring that files only two false accident claims, but real cases of large fraud rings usually involve a greater number of claims, where the relationships between the people involved can hardly be explained by coincidence.
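The reasoning above can be sketched as a small graph traversal in plain Python. This is only an illustration and does not use the Sparksee API: the edge list is an assumption chosen to match the description (the original figure may contain relationships not listed here), and the function names are hypothetical. A subject is flagged when both accidents can be reached from it by following relationships, without passing through another accident on the way.

```python
from collections import defaultdict, deque

# Hypothetical edge list (node, node, relationship label). These edges are
# assumptions reconstructed from the text, not the original figure.
EDGES = [
    ("Subject 1", "Car 1", "drives"),
    ("Subject 5", "Car 1", "owns"),
    ("Subject 5", "Car 3", "drives"),
    ("Car 1", "Accident 1", "involved_in"),
    ("Car 3", "Accident 2", "involved_in"),
    ("Subject 5", "Accident 1", "involved_in"),
    ("Subject 4", "Accident 1", "involved_in"),
    ("Subject 4", "Accident 2", "involved_in"),
    ("Subject 3", "Accident 1", "involved_in"),
    ("Subject 6", "Accident 2", "involved_in"),
    ("Subject 3", "Subject 6", "facebook_friend"),
    ("Subject 2", "Accident 1", "involved_in"),
]
ACCIDENTS = {"Accident 1", "Accident 2"}


def build_graph(edges):
    """Undirected adjacency sets; relationship labels are ignored here."""
    graph = defaultdict(set)
    for a, b, _label in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def reachable_accidents(graph, start, accidents):
    """Breadth-first search from `start`. Accidents are recorded when reached
    but never expanded, so two people who merely share one accident are not
    treated as related to each other."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor in accidents:
                found.add(neighbor)
            else:
                queue.append(neighbor)
    return found


def suspicious_subjects(edges, accidents):
    """Subjects linked, directly or indirectly, to every accident."""
    graph = build_graph(edges)
    subjects = sorted(n for n in graph if n.startswith("Subject"))
    return [s for s in subjects
            if reachable_accidents(graph, s, accidents) == accidents]


suspects = suspicious_subjects(EDGES, ACCIDENTS)
print(suspects)
```

Under these assumed edges the traversal flags every subject except Subject 2, in line with the 5 out of 6 mentioned above. In a relational schema the same question would typically require joining the people, cars, accidents and friendship tables several times over.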

A graph database could also be used to search for fraud across different insurance companies, finding similarities in patterns and behaviors that could add value to an analysis like the one shown in the example above.

As we have said, a high-performance graph database like Sparksee is a perfect match for dealing with large amounts of data in situations where deep relationship analysis is required. Remember that you can download it for free under an evaluation or research license and use it for your own project. You can find more graph database use cases, scenarios and success stories by searching for the "use case" tag on the blog or visiting the "scenarios" section of our website.


