Unleashing the power of graph databases | 6point6

Home → Insights → Unleashing the power of graph databases to discover hidden data connections

In today’s data-driven world, your organisation collects an impressive volume of data. From inventory specifics to customer service interactions, data forms an intricate network of untapped potential. Discovering the hidden connections within your data can lead to substantial organisational benefits that can bring about significant improvements in your decision-making and business operations.

By understanding these concealed interdependencies, you can recognise how different aspects are interlinked, enabling them to make holistic decisions that are beneficial across different departments, branches or divisions.

Consequently, decoding this complex data network isn’t just an interesting idea, it’s a business necessity.

Graph databases are a ground-breaking technology designed to manage complex data relationships. Unlike traditional relational databases, graphs consider the connections between your data elements as significant as the data itself. This makes them ideal for analysing intricate, interrelated datasets and exploiting them in a more flexible, scalable and efficient way.

Choosing the right graph database model

There are two main categories of graph database models: Resource Description Framework (RDF) and Labelled Property Graph (LPG). Whilst both emphasise the importance of relationships, they diverge in terms of their underlying data models and the methods used to represent and store data. Ultimately, it will be your specific use case that will determine the right model for your requirements.

RDF and LPG database models – how they operate

RDF was created by the World Wide Web Consortium (W3C) and this approach provides a standard model for representing data in the form of subject-predicate-object triples. These are essentially two nodes connected by a single edge. For example, James knows Rosie where James is the subject, knows is the predicate, and Rosie is the object. Each element in an RDF triple has a Uniform Resource Identifier (URI), which allows for the unique identification of every subject, predicate and object. These RDF triples can be stored in various ways, including triplestores, relational databases, or even as plain text files. The data is represented using standardised RDF serialisation formats like RDF/XML, Turtle, N-Triples, or JSON-LD. Standardisation and interoperability are two key benefits of RDFs. This facilitates easy integration and information exchange with other RDFs.

In contrast, an LPG depicts data as nodes, edges/relationships and properties. Nodes represent entities, relationships represent connections between nodes, and properties store additional information about nodes and relationships. The relationships are typically directional and, unlike relationships in an RDF, can have properties associated with them. Taking our previous example, James and Rosie are both nodes. James knows Rosie (and presumably Rosie knows James), and there is therefore a bidirectional relationship between them. We can go further with an LPG and add properties James, Rosie and their relationship. For example, we could add their job titles, key skills, the length of time they have known each other, and the type of their relationship.

Data in an LPG is typically stored in a graph database that uses a native graph storage format, optimised for efficient traversal and graph querying.

Graph database usage and application

Graph databases have become a valuable tool in handling complex and highly interconnected data. These databases excel where intricate relationships and flexibility, such as social network analysis, recommendation systems, fraud detection, and knowledge graphs are required.

You can select the right data model for your organisation by carefully considering the nature of your specific use case.

Labelled Property Graph databases are most commonly used for dynamic applications including social network analysis, recommendation systems, fraud detection, and knowledge graphs. They are often the right choice for situations where flexibility is key, and there is little need for interoperability.
Resource Description Frameworks are generally used for representing and linking datasets which are slow moving but require some degree of interoperability or information exchange with other data stores.

The schema and flexibility of graph databases

The standardisation and interoperability offered by an RDF does come at the cost of flexibility. To support these features, RDF data conforms to pre-defined ontologies and controlled schema definitions. In contrast, LPGs are often schema-optional or flexible. This allows the data model to evolve over time. The schema can be defined on a per-node or per-relationship basis, providing flexibility in representing diverse data structures.

This makes LPGs better suited to use cases in your organisation that involve dynamic datasets and changing contexts.

Discovering your data’s hidden connections

Finding hidden connections within your data is an iterative process. It requires a combination of domain knowledge, data exploration and visualisation to uncover meaningful relationships. Following the process below is one way to help you get started:

Define your data model. Consider the nodes and relationships you would like to capture in your graph.
Using your data model as a guide, populate your graph database with the relevant data. Define nodes, relationships and any associated properties.
Analyse your data to learn more about the nodes and relationships that are more important to your use case. Formulate queries using the graph query language specific to your database (e.g. Cypher).
Write queries that traverse the graph and retrieve the desired connections. Queries can range from simple path traversals to more complex patterns or aggregations based on specific criteria.
Execute your queries and examine the results. Visualise the connections between nodes using graph visualisation tools or libraries. This will help you gain insights into the relationships, patterns, and hidden connections present in your data.
Finally, refine and iterate. Scrutinise the connections you’ve discovered and refine your data model as needed. Iterate through this process as needed to uncover further connections or delve deeper into specific areas of interest.

By harnessing the power of graph databases to uncover the hidden interconnections within your organisation’s data, you’re laying the foundations for success in this increasingly data-centric world.

Getting started

For further information on how we can support your organisation to leverage graph technology, contact us at [email protected]

Join us

Some of the greatest talent in our industry choose to work at 6point6. We are recruiting for Data Architects, Data Engineers and Data Scientists as part of our continued growth. To find out more visit our Careers page.