Graph databases are becoming the next big thing in data and analytics technology. According to Gartner, the application of graph processing and graph database management systems will grow at 100% annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science.
Driving this growth is the belief that relationships between data should be cherished and treated as first-class assets, on a par with the data itself.
While it’s still early days, the advantages of graph databases can be clearly seen in areas such as social networking, recommendation engines, and fraud detection.
But what is graph database technology and why do businesses want to use it? Well first, let’s understand why the old way of doing things doesn’t cut it anymore.
Relational databases don’t get relationships
Considering the proliferation of things like Fitbits, smartwatches and smartphones, it’s hard to deny that the world is becoming more connected. And if data is supposed to mimic the real world, then, naturally, data must become more connected too.
Unfortunately, most businesses run on relational database management systems, which, ironically, are very bad at relationships. Their tabular data models and rigid schemas make it difficult to add new or different kinds of connections.
Of course, relational databases can connect entities through joins and join tables, a design pattern that follows from data normalization. However, when you are joining very large and complex sets of relations, the queries can be difficult to write and slow to run.
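To make the join-table pattern concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `person` and `friendship` tables, the sample names and the helper functions are all hypothetical, invented purely for illustration. It shows that even a simple "friends of friends" question already needs three joins in a relational model.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical person/friendship schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- The join table: one row per directed 'is a friend of' relationship.
        CREATE TABLE friendship (
            person_id INTEGER NOT NULL REFERENCES person(id),
            friend_id INTEGER NOT NULL REFERENCES person(id)
        );
        INSERT INTO person (id, name)
            VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cal'), (4, 'Dee');
        INSERT INTO friendship (person_id, friend_id)
            VALUES (1, 2), (2, 3), (2, 4), (3, 4);
        """
    )
    return conn


def friends_of_friends(conn, name):
    """Return the names exactly two friendship hops away from `name`."""
    rows = conn.execute(
        """
        SELECT DISTINCT p3.name
        FROM person AS p1
        JOIN friendship AS f1 ON f1.person_id = p1.id         -- first hop
        JOIN friendship AS f2 ON f2.person_id = f1.friend_id  -- second hop
        JOIN person AS p3 ON p3.id = f2.friend_id
        WHERE p1.name = ? AND p3.id <> p1.id
        ORDER BY p3.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(friends_of_friends(build_demo_db(), "Ana"))
```

Each extra hop adds another join on the `friendship` table, which is where queries of this kind tend to become harder to write and slower to run.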
How are graph databases different?
The relationships between your data can also be difficult to see and use because your data lives in all sorts of different databases across your business.
Graph databases are designed to join all of this together in one coherent view, so that you can make use of the power in those relationships.
With graph databases, traversing relationships within datasets is much faster because the relationships are stored in the database itself rather than being worked out at query time.
Unlike relational database systems, which represent data as elements in tables, graph database systems represent it as nodes that are related to each other. In a graph database, each node of the network represents some item of data (a name, an address, a number and so on) and the links, or ‘edges’, denote a meaningful connection between two nodes.
This is why the links between data can be accessed and analysed more easily.
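The node-and-edge model described above can be sketched in a few lines of Python. This is a toy in-memory structure rather than how any particular graph database is implemented, and the `Graph` class, its method names and the sample data are assumptions made for illustration. The point it shows is that each node keeps its own list of edges, so following a relationship is a direct lookup rather than a join.

```python
from collections import deque


class Graph:
    """A toy property graph: nodes carry properties, edges carry a label."""

    def __init__(self):
        self.nodes = {}  # node id -> dict of properties
        self.edges = {}  # node id -> list of (label, target node id)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.edges.setdefault(node_id, [])

    def add_edge(self, source, label, target):
        # Edges are stored alongside the source node itself.
        self.edges.setdefault(source, []).append((label, target))
        self.edges.setdefault(target, [])

    def reachable(self, start, max_hops):
        """Breadth-first traversal: nodes within `max_hops` edges of `start`."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for _label, target in self.edges[node]:
                if target not in seen:
                    seen.add(target)
                    found.append(target)
                    queue.append((target, hops + 1))
        return found


def build_demo_graph():
    """Hypothetical sample data: a person, an employer and a city."""
    g = Graph()
    g.add_node("alice", kind="person")
    g.add_node("acme", kind="company")
    g.add_node("london", kind="city")
    g.add_edge("alice", "WORKS_AT", "acme")
    g.add_edge("acme", "LOCATED_IN", "london")
    return g
```

Calling `build_demo_graph().reachable("alice", 2)` walks from the person to the company and on to the city by following stored edges, with no table scans or joins involved.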
Who’s using graph database systems?
To many, graph databases are most commonly associated with social networks. Twitter, for example, developed its own graph database software, FlockDB, which was used to represent the links between its members.
Other web giants who utilise graph databases include Facebook, whose ‘social graph’ maps the interconnections between users, and Google, whose ‘knowledge graph’ describes the semantic links between people, places and objects.
However, a major shift towards graph among more general enterprises seems to be underway. The graph database platform Neo4j, for example, claims that 76% of Fortune 100 companies have adopted or piloted its product, and it counts 20 of the top 25 financial firms and seven of the top 10 retailers as customers.
Furthermore, according to figures from MarketsandMarkets Research Private Ltd., the graph database market is expected to reach $2.4 billion in annual revenue by 2023, growing at a 24% annual rate.
This is because businesses across industries are understanding how graph databases can drive competitive advantage.
Online retailers, for example, are beginning to understand how knowing which products in a basket typically sell together can help them make money.
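As a rough illustration of the retail case, the sketch below treats products as nodes and counts how often each pair appears in the same basket, which gives a weighted "bought together" edge between them. The function names and sample baskets are hypothetical, and a real recommendation engine would use far richer signals than raw pair counts.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair has one canonical key per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def bought_with(counts, product):
    """Return (other product, shared baskets) pairs, most frequent first."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == product:
            related[b] = n
        elif b == product:
            related[a] = n
    return related.most_common()


if __name__ == "__main__":
    sample_baskets = [
        ["bread", "butter"],
        ["bread", "butter", "jam"],
        ["jam", "tea"],
    ]
    print(bought_with(co_purchase_counts(sample_baskets), "bread"))
```

The pair counts act as edge weights, so "what else do buyers of this product pick up?" becomes a lookup of a node's strongest edges.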
Banks and financial services are getting on board with graph technology because of its applications around areas such as fraud detection and identity and access management.
Global suppliers, meanwhile, are beginning to understand how graph databases can help with supply chain transparency, since graph technology can model the complex relationships inherent in modern supply chains.
Naturally, as the market expands, more vendors are popping up. Suppliers in the industry include Neo4j, Blazegraph, HyperGraphDB, OrientDB and JanusGraph.
Understanding the learning curve
For organisations interested in adopting graph databases, there are a number of challenges to consider around upskilling existing staff and finding new streams of talent.
The good news, according to Emil Eifrem, CEO and co-founder of Neo4j, Inc., is that the learning curve is quite gentle.
He explained: “If you’re an enterprise developer and you work with a relational database and its query language, SQL, then you should be able to handle our system. Our query language is called Cypher and it is very similar to SQL.
“Having said that we have a certification programme too. There are over a thousand certified developers out there also, so there’s a substantial existing skill set to bring in.
“And then, of course, businesses can look to developing a rich partner ecosystem; there are a lot of consulting companies and system integrators out there who can help in this area.”
The game has just begun
Whether or not you agree graph databases are all they are being hyped to be, now that the major database players are getting in on graph databases, the next phase of the market’s development will be all about maturation.
At the same time, one can’t help but feel that we are still merely scratching the surface of what graph databases can offer.