Turning Neo4j Into a Real-time Database

Anmol Agrawal on Oct 22, 2015

Put simply, a real-time database continuously pushes updated query results to applications as the data changes, so the applications do not have to poll for changes.

Real-time databases (like RethinkDB) are growing in popularity as demand for real-time updates from databases increases. Most of these databases are document-based (NoSQL) or SQL-based. However, most SQL and NoSQL databases either have a fixed schema or no schema at all, and even where the schema is flexible, they struggle to adapt to ever-changing data. As the old adage says, ‘Data is King!’, but we should change that to ‘Data Connectedness is King!’

Most NoSQL databases store sets of disconnected aggregates, which makes them difficult to use for connected data. That’s where graph databases come in. Neo4j is the world’s leading graph database.

While Neo4j isn’t a real-time database (for now), we can make it behave like one by adding a layer of PubNub on top, and that's exactly what we'll do in this blog post.

Project Overview

Full source code is available here.

In this tutorial, we'll start with a real-life use case: IRCTC, a railway reservation portal in India. A few things stand out:

  • When the emergency reservation slots (what we call Tatkal) open to everyone each morning, the trains fill up within a couple of minutes.
  • The server slows down and the website often hangs, which is especially bad for people with slow internet connections.
  • A user starts booking a ticket while seats still appear available, but by the time the payment completes, those emergency seats are gone. The user doesn’t get a seat, and the money is lost because the payment is non-refundable.

This is a matter of concern when you have 580,000 e-tickets booked per day.

Although Neo4j has very fast read and write performance and can handle this load, that alone does not let users know in real time whether seats are still available. We'll solve that aspect in this blog post using dummy data.

In the video below, I walk through some of the code, and show how our application works. After that, we'll get to coding:

Create the nodes and relationships of Stations and Trains

First, install and run Neo4j. Using Cypher, Neo4j's query language, I created the basic nodes and relationships. Start by creating the train and station nodes:
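The original snippet is not reproduced here, so this is only a minimal sketch of what the node creation might look like. The train name "Train-1" comes from this post; the station names, the `seats_left` property on the node, and the starting count of 50 are assumptions made for illustration. The Cypher is kept in a JavaScript string so it can be pasted into the Neo4j browser or sent through a driver.

```javascript
// Cypher for the dummy nodes, kept as a plain string.
// Assumptions: station names, the seats_left property, and its value are
// made up for illustration. The nodes carry no labels, as in this post.
const createNodesCypher = `
CREATE (t {name: 'Train-1', seats_left: 50})
CREATE (a {name: 'Station-A'})
CREATE (b {name: 'Station-B'})
`.trim();
```

Running the three CREATE clauses as one statement creates all three nodes in a single transaction.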

Then create the relationships:
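Again as a sketch rather than the original query: assuming nodes named "Train-1", "Station-A", and "Station-B" exist, the relationships could be created by matching the nodes on their name property. The station names and the `STOPS_AT` relationship type are hypothetical.

```javascript
// Cypher that links the train to its stations, kept as a plain string.
// Assumptions: the station names and the STOPS_AT relationship type are
// hypothetical; only "Train-1" is named in this post.
const createRelationshipsCypher = `
MATCH (t {name: 'Train-1'}), (a {name: 'Station-A'}), (b {name: 'Station-B'})
CREATE (t)-[:STOPS_AT]->(a)
CREATE (t)-[:STOPS_AT]->(b)
`.trim();
```

Matching without labels scans every node, which is fine for a handful of dummy nodes but would call for labels and indexes on real data.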

Create an HTML page

For our UI, create a basic HTML page that shows the number of available seats in real time. If a ticket is booked somewhere else, we should see the number of available seats decrease instantly.

The most important part of this HTML page is the JavaScript code. First, we initialize the pubnub variable with our unique publish and subscribe keys.
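As a rough sketch of that initialization, assuming a recent PubNub JavaScript SDK where the client is built with `new PubNub({...})` (older SDK versions used a different initializer), the setup might look like this. The key strings and the `userId` value are placeholders, and `createPubNubClient` is a hypothetical helper name.

```javascript
// Placeholder configuration: swap in the keys from your PubNub dashboard.
const pubnubConfig = {
  publishKey: 'your-publish-key',
  subscribeKey: 'your-subscribe-key',
  userId: 'seat-viewer', // hypothetical client identifier
};

// PubNubClass is the constructor provided by the SDK (assumed to be the
// global PubNub once the SDK script is loaded in the page). Passing it in
// keeps this helper independent of how the SDK is loaded.
function createPubNubClient(PubNubClass, config = pubnubConfig) {
  if (!config.subscribeKey) {
    throw new Error('A subscribe key is required');
  }
  return new PubNubClass(config);
}
```

In the page itself, this would be called as `const pubnub = createPubNubClient(PubNub);`.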

To get your unique publish/subscribe keys, you’ll first need to sign up for a PubNub account. Once you sign up, you can get your unique PubNub keys in the PubNub Developer Dashboard. The free Sandbox tier should give you all the bandwidth you need to build and test your app with the PubNub API.

Next, through the pubnub variable, we subscribe to the “neo4j” channel, read the “seats_left” value from the payload of each message that arrives on the channel, and put it into the HTML.
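A minimal sketch of that subscription, assuming the listener-based API of recent PubNub SDKs (`addListener` plus `subscribe`), could look like the following. The helper names `showSeats` and `watchSeats` are hypothetical, and `element` stands for whichever DOM node displays the count.

```javascript
// Writes the seat count from a message payload into a DOM-like element.
function showSeats(element, payload) {
  if (payload && typeof payload.seats_left === 'number') {
    element.textContent = String(payload.seats_left);
  }
}

// Subscribes to the "neo4j" channel and updates the element on each message.
// Assumes a PubNub client that exposes addListener() and subscribe().
function watchSeats(pubnub, element) {
  pubnub.addListener({
    message: (event) => {
      if (event.channel === 'neo4j') {
        showSeats(element, event.message);
      }
    },
  });
  pubnub.subscribe({ channels: ['neo4j'] });
}
```

Checking the payload type before touching the page keeps a malformed message from blanking the displayed count.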

Create the Server

We installed two npm packages for this project: PubNub and Seraph. Seraph is a Node.js driver for the Neo4j database.

We initialize a Seraph instance connected to localhost, where the Neo4j server is running. In the real world, we'd write a query to find the train; here I am just hard-coding it as the variable ‘tra’. ‘tra’ is the node with the name “Train-1” (we could have added labels or indexing, but that is not the focus for now).
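The post hard-codes the node, but as an illustration of the lookup mentioned above, here is a small sketch that finds the train by its name through Seraph's `find` call. The `findTrain` helper is hypothetical, and the Seraph client is passed in as `db` so the function does not depend on how the connection is created.

```javascript
// In index.js the client would come from the Seraph package, for example:
//   const db = require('seraph')('http://localhost:7474');
// (7474 is Neo4j's default HTTP port; adjust for your setup.)

// Looks up a node by its name property and hands the first match to the
// callback. Without labels or indexes this is a scan, as noted in the post.
function findTrain(db, name, callback) {
  db.find({ name: name }, (err, nodes) => {
    if (err) return callback(err);
    if (!nodes || nodes.length === 0) {
      return callback(new Error('No node named ' + name));
    }
    callback(null, nodes[0]);
  });
}
```

Calling `findTrain(db, 'Train-1', (err, tra) => { ... })` would then yield the same node that the hard-coded ‘tra’ variable stands for.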

Then we write a function update_node that has two tasks:

  • Find the train node, decrease the number of seats by one, and save it to the database.
  • Publish the updated number of available seats to the “neo4j” channel.

Following the Seraph documentation, we perform these two tasks in sequence: the new seat count is published only after the node has been saved.
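The two tasks can be sketched as follows. This is not the original code: it assumes the seat count is stored on the node as `seats_left`, that `db` is a Seraph client whose `save` takes a node and a callback, and that `pubnub.publish` accepts a `(status, response)` callback as in PubNub SDK v4. Both clients are passed in as arguments.

```javascript
// Decrements the seat count, saves the node, then publishes the new count.
// Assumptions: seats_left lives on the node; db.save(node, cb) and
// pubnub.publish(params, cb) follow the Seraph and PubNub v4 callback styles.
function update_node(db, pubnub, train, callback) {
  if (train.seats_left <= 0) {
    return callback(new Error('No seats left'));
  }
  // Copy the node so the caller's object is not mutated before the save.
  const updated = Object.assign({}, train, {
    seats_left: train.seats_left - 1,
  });
  db.save(updated, (err, saved) => {
    if (err) return callback(err);
    // Publish only after the save succeeded, so clients never see a
    // count that was not written to the database.
    pubnub.publish(
      { channel: 'neo4j', message: { seats_left: saved.seats_left } },
      (status) => {
        if (status && status.error) {
          return callback(new Error('Publish failed'));
        }
        callback(null, saved);
      }
    );
  });
}
```

Nesting the publish inside the save callback is what keeps the two steps in order. Note that a read-decrement-save cycle like this can lose updates under concurrent bookings, so a real system would do the decrement atomically in the database.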

Test Run

Now open the HTML page in a web browser and fire up the Neo4j server. When we run ‘node index.js’ in the terminal, the update_node function is executed and we can see the number of seats change in real time.

Though this is not a complete real-time database, we have built one aspect of it, connecting the server side to the client side.