Designing graph database using Neo4j


This project is a continuation of a case study of Cater Me Cafe. As the business recently grows, there is a need to consider migrating its current relational database to a NoSQL database management platform that can provide better scalibility for a big environment. In addition, graph database is selected because of its flexibility and reliability for real-time sales data. 

1. Graph database modelling

The existing ER diagram will be converted into graph structure following the rules below. Full guidance on transforming a relational database to graph data model can be found in Neo4j website.
  • Each entity table in relational model becomes node label
  • Each row or record of a table becomes a node in the graph
  • Each column or field in a table becomes a node property
  • Join tables (or associative entities) in relational model becomes relationships (arrow)
  • ...
Example of transforming a record of table Customer and Orders

This image above illustrate the case of customer whose name is "McDonald Tom" with his order with ID "Ord0014". The arrow named "Purchased" represents the relationship between table Customers and Orders.

Example of transforming all records related to customer "McDonald" and order "OR0014"


2. Graph database implementation

Given the translated graph model above, Cypher Query Language is used to fully migrate the relational database into new Neo4j database. The relational database in this project is exported from the DBMS system Sqlite into csv files

Example of Cypher script to import table Customers


Example of Cypher script to map Customers and Orders


3. Neo4j graph data science library (GDSL)

The Neo4j GDS libraries support a wide ranges of data analytics, machine learning, graph algorithms etc. to improve prediction and discover insights.

In this project, the restaurant owner plans to prioritize and focus more on their registered Customers to maximize their personal experiences at the restaurant, then end up with engaging more. One strategy is to discover groups of customers who might have similar purchase references as it is beneficial for cross-selling opportunities to increase sales and customer loyalty. One technique is used to compute pair-wise similarity between customers based on Jaccard coefficient.

Example of Jaccard similarity by food items



The Jaccard coefficient for each pair of customers is fairly low around 25% or less indicating that these customers do not have very strong similarity in purchasing food item. This might be due to less transaction sales data in during last period. However, the graph still shows some interesting factor that that Rice combo special and Pho sliced beef and beef ball are the preferred food items among several customers.

Example of Jaccard similarity by sales channels



The Jaccard coefficient indicates the majority of customers prefered to dine-in or pick-up at the restaurant. There are two things need to be aware for the restaurant owner. First, the service at the counter need to be improved to serve more customers. Second, online order or food delivery are not really efficient? The restaurant owner probably needs to deep dive into the business to address these issues.



No comments:

Post a Comment