Plotting the progress of graph databases

Posted by at 21:15h

Graph database products are a relatively new category of data storage and query technology, and one that’s getting a lot of attention lately. Unlike most conventional databases (and even some newer “NoSQL” products and engines), which track entities and their attributes, graph databases track entities and their interrelationships.

Given the prominence of social media and influencer marketing, as well as the importance of relationships in detecting fraud, cybercrime and even terrorist activity, the popularity of such databases, and the interest around them, should not be surprising. Networks, be they social, business, physical or criminal, are an important foundational structure in society and technology, and so graphs are an especially important structure in data and analytics.

Dedicated functionality or added feature
Sometimes graph technology is available in standalone, specialized graph databases. Products like Neo4j typify this category. In many other cases, graph technology is available as a particular interface to a database that supports other storage models as well.

DataStax Enterprise and Microsoft’s Cosmos DB are two cases in point: these are non-traditional “NoSQL” databases that can operate as graph databases but support other models, such as wide column storage, as well. Even traditional databases are getting in on the action: SQL Server 2017, Microsoft’s latest release of its quarter-century-old flagship relational database, has added rudimentary graph capabilities, permitting individual tables in its databases to store nodes (the objects in a graph) and edges (the connections between the nodes).
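
To make the idea of nodes and edges living in ordinary tables concrete, here is a minimal sketch in Python using the built-in sqlite3 module. It is not SQL Server's graph syntax; the table names (person, follows), the sample rows and the helper functions are all hypothetical, chosen only to show one table holding nodes, another holding edges, and a traversal expressed as joins.

```python
import sqlite3

# In-memory database; all names and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);  -- node table
    CREATE TABLE follows (src INTEGER, dst INTEGER);          -- edge table
    INSERT INTO person VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO follows VALUES (1, 2), (2, 3);
""")


def followed_by(name):
    """One hop: names of the people that `name` follows."""
    rows = conn.execute(
        """
        SELECT p2.name
        FROM person p1
        JOIN follows f ON f.src = p1.id
        JOIN person p2 ON p2.id = f.dst
        WHERE p1.name = ?
        ORDER BY p2.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


def two_hops(name):
    """Two hops: people followed by the people that `name` follows."""
    rows = conn.execute(
        """
        SELECT DISTINCT p3.name
        FROM person p1
        JOIN follows f1 ON f1.src = p1.id
        JOIN follows f2 ON f2.src = f1.dst
        JOIN person p3 ON p3.id = f2.dst
        WHERE p1.name = ?
        ORDER BY p3.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Each extra hop costs another join in this plain relational form, which is the kind of query that dedicated graph syntax and engines aim to make shorter to write.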

Using the engine versus riding the vehicle
But is exposing graph technology as a raw capability the best way to deliver graph innovation to customers? Are graph database products immediately useful to enterprise organizations? To be fair, the answer really depends on the organization and its technology approach. Some enterprise organizations have sophisticated internal software teams, employing talented developers and even data scientists. But many others choose not to go that route, preferring to buy technology rather than build it.

For the latter type of organization, the one downstream of the raw technologies, graph databases and graph technology are not especially actionable. Even organizations that build their own applications tend to use conventional relational database technology and staff up around the widely available skillsets needed for that kind of work. These organizations will likely not want to hire and pay for the kind of specialists needed to work with graph technology. And while niche solutions, like SQL Server’s graph processing, are meant to give such organizations a graph database on-ramp, those solutions may be too rudimentary to deliver on the full graph innovation promise.

This really puts the ball in the court of independent software vendors (ISVs). If graph technology is to benefit the largest number of customers possible, graph engines must be embedded in other products, with their power brought to bear behind the scenes. In last month’s post, I said something very similar about Artificial Intelligence (AI) and Machine Learning (ML): as exciting as those technologies are, dumping them on customers in raw form achieves very little. Applied AI and ML, on the other hand, can bring great benefit to customers.

Graph and AI/ML have this in common: both become more valuable when they are embedded in other products. But beyond being embedded individually, they can also work amazingly well in combination, where their innate capabilities and their shared suitability for embedding can enhance each other immensely.

Graph technology can represent discovered patterns, while machine learning can observe patterns in existing data to predict values in data not yet ascertained. Graph technology looks at relationships and can help identify affinity groups, which is exactly what ML clustering algorithms do as well. Recommendation engines determine which people like the same set of products (or movies, articles, travel destinations and so on), something that is readily discoverable from graph data. Graph is not ML and ML is not graph. But graph technology finds connections and ML can help predict them.
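
As a rough illustration of how shared likes can be read straight off graph data, the following Python sketch treats people and products as nodes and each "likes" relationship as an edge. The data, the names and the scoring rule (count of shared likes) are assumptions made up for this example. It is plain graph traversal, not machine learning, and not how any particular product implements recommendations.

```python
from collections import Counter

# Hypothetical "likes" edges: person node -> set of product nodes.
likes = {
    "ana": {"film_a", "film_b"},
    "ben": {"film_a", "film_b", "film_c"},
    "cy": {"film_d"},
}


def recommend(person, likes):
    """Suggest items liked by people who share at least one like with `person`.

    Each candidate item is scored by how many likes its fans share with
    `person`; ties are broken alphabetically so the output is deterministic.
    """
    mine = likes[person]
    scores = Counter()
    for other, theirs in likes.items():
        if other == person:
            continue
        shared = len(mine & theirs)   # strength of the person-to-person link
        if shared == 0:
            continue                  # no path between the two people
        for item in theirs - mine:    # only items `person` has not liked yet
            scores[item] += shared
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked]
```

Here recommend("ana", likes) follows the path from ana through the films she shares with ben to the one film of his she has not liked. A machine learning model could then be layered on top to predict links that are not yet present in the graph.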

Embedding ML or graph technology, individually, helps greater numbers of customers benefit from them without needing to procure specialized technology or staff up around the corresponding skill sets. Combining the two brings even more benefit: the technologies complement and enhance each other, forming a virtuous cycle. And embedding that combined innovation harnesses its power for specific business or technological purposes, without requiring the customer to be aware of the underlying technology or have expertise in its application.