QConSF 2012: Graph Database Overview from Emil Eifrem

@emileifrem gives a great introduction to NoSQL databases, especially graph databases. He also gives a quick overview of use cases for graph databases and highlights some of the benefits of using Neo4j for them. Here are my notes from this session:

Trends in BigData & NoSQL:

  • Increasing data size
  • Increasing connected data
  • Semi-structured data
  • Architecture – a facade over multiple services

Categories of NoSQL

Key/Value store (heritage from Amazon Dynamo)

  • Riak, Redis, and Voldemort are implementation examples
  • Strength is the simple data model, which is also its weakness

Columnar Stores

  • BigTable – every row can potentially have its own columns
  • Examples are HBase, Cassandra, and Hypertable
  • Supports semi-structured data, but is weak at nested or connected data

Document Database

  • Collections of documents, JSON that could potentially be nested
  • About 90% of the NoSQL uptick is in MongoDB, but CouchDB is another example
  • Strength is simplicity of database, but hard to do connected data

Graph Database

  • Nodes and Relationships
  • Examples are Neo4j, InfiniteGraph, OrientDB, etc.
  • Great for connectedness and complexity of data, but harder to scale to size

Graph databases can provide statistical data on how likely a node is to be related to other nodes. Relationships are first-class citizens in graph databases, which gives color to how nodes are related. Nodes and relationships carry simple key/value properties. Indexes are also available. Types are being considered for post-production control of data.
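To make the model above a bit more concrete, here is a minimal Python sketch of nodes and relationships that both carry key/value properties, plus a simple index over node properties. This is only an illustration of the idea from my notes: the names PropertyGraph, add_node, relate and find are made up for this example and are not Neo4j's API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    # A relationship is a first-class object: it has a type and its own properties.
    start: int
    end: int
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph (illustration only, not Neo4j)."""

    def __init__(self):
        self.nodes = {}          # node_id -> Node
        self.relationships = []  # list of Relationship
        self.index = {}          # (key, value) -> set of node ids

    def add_node(self, node_id, **properties):
        node = Node(node_id, dict(properties))
        self.nodes[node_id] = node
        # Index every property; values must be hashable in this sketch.
        for key, value in properties.items():
            self.index.setdefault((key, value), set()).add(node_id)
        return node

    def relate(self, start, end, rel_type, **properties):
        rel = Relationship(start, end, rel_type, dict(properties))
        self.relationships.append(rel)
        return rel

    def find(self, key, value):
        # Index lookup instead of scanning all nodes.
        return [self.nodes[i] for i in sorted(self.index.get((key, value), set()))]


graph = PropertyGraph()
graph.add_node(1, name="A")
graph.add_node(2, name="B")
graph.relate(1, 2, "LOVES", since=2012)
```

Here the "since" property on the LOVES relationship is the kind of detail that gives color to how two nodes are related.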

The speed comparison for the social graph example in the presentation shows about 2000ms for MySQL and 2ms for Neo4j. Growing the data set from 2k to 1 million people, checking whether two people are connected still takes about 2ms in Neo4j. In SQL, the same query suffers from an explosion of JOIN clauses, which is a combinatorial problem. Neo4j can visit about 1-2 million nodes per second.
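As a rough illustration of the "are these two people connected" query, here is a small breadth-first search over an adjacency list in Python. This is my own sketch, not how Neo4j is implemented, and the sample data and the function name connected_within are made up. The point it tries to show is that a traversal only touches the neighborhood it walks through, which is one plausible reading of why the timing did not grow with the total number of people.

```python
from collections import deque


def connected_within(friends, start, target, max_depth):
    """Return True if target is reachable from start in at most max_depth hops."""
    if start == target:
        return True
    visited = {start}
    frontier = deque([(start, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the allowed depth
        for friend in friends.get(person, ()):
            if friend == target:
                return True
            if friend not in visited:
                visited.add(friend)
                frontier.append((friend, depth + 1))
    return False


# Hypothetical sample data: each person maps to a list of direct friends.
friends = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
    "erin": [],
}

print(connected_within(friends, "alice", "dave", 3))  # True
print(connected_within(friends, "alice", "dave", 2))  # False
```

Adding more unrelated people to the dictionary does not change the work done for this query, whereas the equivalent SQL needs one self-JOIN per hop.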

Graph queries:

Cypher gives you a higher level of abstraction and ease of use, at the cost of slower performance. It uses graph patterns to define nodes and relationships, represented like:

START A=node:person(name="A")
MATCH (A)-[:LOVES]->(B)
RETURN B AS lover

Cypher is a pattern-matching query language with a declarative grammar. It supports aggregation, ordering, and limits, and returns tabular results.
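To show what "pattern in, table out" means, here is a tiny Python sketch that mimics the LOVES query by hand over a list of (start, type, end) triples. It is not Cypher and not a Neo4j API; the data and the function name match_outgoing are invented for illustration, and ordering and limit are done with plain Python.

```python
# Hypothetical relationship triples: (start node name, relationship type, end node name).
relationships = [
    ("A", "LOVES", "B"),
    ("A", "KNOWS", "C"),
    ("A", "LOVES", "D"),
    ("C", "LOVES", "A"),
]


def match_outgoing(relationships, start_name, rel_type, column):
    """Return one row (a dict) per relationship matching (start)-[:rel_type]->(x)."""
    return [
        {column: end}
        for (start, rtype, end) in relationships
        if start == start_name and rtype == rel_type
    ]


# Roughly: start at A, match (A)-[:LOVES]->(B), return B as "lover".
rows = match_outgoing(relationships, "A", "LOVES", "lover")

# Ordering and a limit applied to the tabular result.
top = sorted(rows, key=lambda row: row["lover"], reverse=True)[:1]

# A simple aggregation: how many nodes does A love?
lover_count = len(rows)
```

The declarative version leaves the "how" to the database, while this hand-written loop spells it out.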

Neo4j is fully ACID compliant, as opposed to the eventual consistency of most other NoSQL systems.

