Graph Databases for Beginners: Why Graph Technology Is the Future
The world of graph technology has changed (and is still changing), so we’re rebooting our “Graph Databases for Beginners” series to reflect what’s new in the world of graph tech – while also helping newcomers get up to speed with the graph paradigm.
So you’ve heard about graph database technology and you want to know what all the buzz is about.
It’s easy to take the perspective of a cynic: They’re just another passing trend – here today, gone tomorrow – right? Isn’t that the way of all tech buzzwords?
Feel free to be suspicious – skeptical even – but leave your cynicism at home. Instead, I’m inviting you on an adventure into a new way of seeing the world.
The graph paradigm goes well beyond databases and application development; it’s a reimagining of what’s possible around the idea of connections. And just like any new problem-solving framework, approaching a challenge from a different dimension often produces an orders-of-magnitude change in possible solutions.
All that to say: Graph technology is a rising tide your development team – and your business – can’t afford to pass up. Graph databases are the future, and even if you’re just a beginner, it’s never too late to get started. Let’s dive in.
In this Graph Databases for Beginners blog series, I’ll take you through the basics of graph technology assuming you have little (or no) background in the space. This week, we’ll walk through basic definitions and why those distinctions matter.
Why You Should Care about Graph Database Technology
When you’re on your own, new tech might be fun to play around with or to use on a personal side project, but when you’re at work, it’s a whole different story.
Professionally, you have to operate in a world of budgets, timelines, corporate standards and competitors. And in that world, the only test for new tech is that it better work damn well (and way better than anything else you already have on hand). Otherwise, the suits will be asking questions.
Graph databases fit that bill, and here’s why:
Your data volume will definitely increase in the future, but what’s going to increase at an even faster clip is the connections (or relationships) between your data. Big data will definitely get bigger, but connected data will grow exponentially.
With traditional databases, relationship queries come to a grinding halt as the number and depth of relationships increase. In contrast, graph database performance tends to stay constant even as your data grows year over year, because a query only touches the part of the graph it traverses.
With graph databases, your IT and data architecture teams move at the speed of business because the structure and schema of a graph data model flex as your solutions and industry change. Your team doesn’t have to exhaustively model your domain ahead of time (and then exhaustively remodel and migrate the DB after some exec asks for a change); instead, you can add to the existing structure without endangering current functionality.
With the graph database model, you are the one dictating changes and taking charge, whereas the RDBMS data model dictates its requirements to you, forcing you to adapt to its tabular way of seeing the world.
Developing with graph technology aligns perfectly with today’s agile, test-driven development practices, allowing your graph-database-backed application to evolve with your changing business requirements.
Your agile team now has a database that keeps up with your daily demands.
What Is a Graph Database? (a Non-Technical Definition)
You don’t need to understand the arcane mathematical wizardry of graph theory in order to understand graph database technology. On the contrary, graph databases are more intuitive to understand than relational databases (RDBMS).
A graph is composed of two elements: a node and a relationship.
Each node represents an entity (a person, place, thing, category or other piece of data), and each relationship represents how two nodes are associated. For example, a node for a specific dessert and the node dessert would have the relationship is a type of pointing from the specific node to dessert.
Consider another example: Twitter is a graph connecting 330 million monthly active users.
In the illustration below, we have a small slice of Twitter users represented in a graph database. Each node (labeled User) belongs to a single person and is connected with relationships describing how each user is connected. As we see below, Peter and Emil follow each other, as do Emil and Johan, but although Johan follows Peter, Peter hasn’t (yet) reciprocated.
Twitter users represented in a graph database model.
If this example makes sense to you, then you’ve already grasped the basics of what makes up a graph database.
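To make the Twitter example concrete, here is a minimal sketch in plain Python of the same three users and their connections. It is only an illustration of the node-and-relationship idea, not how any particular graph database stores data; the class names, the relationship type FOLLOWS and the helper functions are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """An entity in the graph, here a Twitter user."""
    label: str
    name: str


@dataclass(frozen=True)
class Relationship:
    """A directed, named connection from one node to another."""
    start: Node
    type: str
    end: Node


# One node per person, each labeled User.
peter = Node("User", "Peter")
emil = Node("User", "Emil")
johan = Node("User", "Johan")

# Who follows whom, as described in the text above.
# The type name FOLLOWS is an assumption for this sketch.
relationships = [
    Relationship(peter, "FOLLOWS", emil),
    Relationship(emil, "FOLLOWS", peter),
    Relationship(emil, "FOLLOWS", johan),
    Relationship(johan, "FOLLOWS", emil),
    Relationship(johan, "FOLLOWS", peter),
]


def follows(a: Node, b: Node) -> bool:
    """True if there is a FOLLOWS relationship pointing from a to b."""
    return Relationship(a, "FOLLOWS", b) in relationships


def follow_each_other(a: Node, b: Node) -> bool:
    """True if a follows b and b follows a."""
    return follows(a, b) and follows(b, a)
```

Note that direction matters: the relationship from Johan to Peter exists on its own, without a matching one pointing back.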
How Graph Databases Work (Explained in a Way You Actually Understand)
This connections-first approach to data means relationships and connections are persisted (and not just temporarily calculated) through every part of the data lifecycle: from idea, to design in a logical model, to implementation in a physical model, to operation using a query language and to persistence within a scalable, reliable database system.
Unlike with other database systems, this approach means your application doesn’t have to infer data connections using things like foreign keys or out-of-band processing, like MapReduce.
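The difference between inferring connections and persisting them can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the table layout, the IDs and the function names are hypothetical, and the dictionaries stand in for real storage engines.

```python
# Relational style: connections live in a separate join table of foreign keys.
users = {1: "Peter", 2: "Emil", 3: "Johan"}
follows_table = [(1, 2), (2, 1), (2, 3), (3, 2), (3, 1)]  # (follower_id, followed_id)


def followed_by_relational(user_id):
    """Infer connections at query time: scan the join table, then resolve each key."""
    return sorted(
        users[followed_id]
        for follower_id, followed_id in follows_table
        if follower_id == user_id
    )


# Graph style: each node record carries its own relationships.
graph = {
    "Peter": {"FOLLOWS": ["Emil"]},
    "Emil": {"FOLLOWS": ["Peter", "Johan"]},
    "Johan": {"FOLLOWS": ["Emil", "Peter"]},
}


def followed_by_graph(name):
    """Read the stored relationships directly; nothing has to be inferred."""
    return sorted(graph[name]["FOLLOWS"])
```

Both functions return the same answer for the same user. The relational version has to reconstruct the connection from keys on every query, while the graph version reads a connection that was stored as data in the first place.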
The result: Your data models are simpler and yet more expressive than the ones you’d produce with relational databases or NoSQL (Not only SQL) stores.
What Makes Graph Databases Unique
A lot of databases have similar characteristics, but graph databases have a few things that make them unique. Here are the two most important properties of graph database technologies that you need to understand:
- Graph storage
Some graph databases use native graph storage that is specifically designed to store and manage graphs – from bare metal on up. Other graph technologies use relational, columnar or object-oriented databases as their storage layer. Non-native storage is often slower than a native approach because all of the graph connections have to be translated into a different data model.
- Graph processing
Native graph processing (a.k.a. index-free adjacency) is the most efficient means of processing data in a graph because connected nodes physically point to each other in the database. Non-native graph processing engines use other means to process Create, Read, Update or Delete (CRUD) operations that aren’t optimized for handling connected data.
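The idea behind index-free adjacency can be sketched roughly as follows. In this Python illustration, in-memory object references stand in for the physical pointers a native graph engine would use; the Node class, the outgoing list and the reachable_within function are assumptions made for this example, not the internals of any real product.

```python
class Node:
    """A node that holds direct references to the nodes it points to."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []  # direct references to neighboring Node objects

    def connect(self, other):
        """Add a directed connection from this node to another node."""
        self.outgoing.append(other)


def reachable_within(start, depth):
    """Return the sorted names of nodes reachable from start in at most depth hops."""
    seen = {start}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            # Each hop follows a reference held by the node itself;
            # no global index or join table is consulted.
            for neighbor in node.outgoing:
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    seen.discard(start)
    return sorted(node.name for node in seen)


# A small chain a -> b -> c -> d, plus a node that nothing points to.
a, b, c, d, unrelated = (Node(name) for name in ["a", "b", "c", "d", "unrelated"])
a.connect(b)
b.connect(c)
c.connect(d)
```

The work done by reachable_within depends on how many nodes and connections it actually visits, not on how many nodes exist overall: adding more unrelated nodes would not change what the traversal from a has to touch.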
When it comes to current graph database technologies, Neo4j leads the space as the most native in both graph storage and processing. If you’re interested in learning more about what makes a native graph database different from non-native graph technology (and why it matters), then read the Native vs. Non-Native Graph Technology post later in this Beginners series.
Conclusion: Graphs Are in More Places than You Think (They’re Everywhere)
The real world is richly interconnected, and graph databases aim to mimic those sometimes-consistent, sometimes-erratic relationships in an intuitive way. That’s what makes the graph paradigm different from other database models: It maps more realistically to how the human brain maps and processes the world around it.
And once you start seeing graphs of interconnected data in one place (your recommendation engine, for example), you start seeing them in other places too (like your fraud detection efforts or your master data management). Pretty soon, you’ll have the epiphany: graphs are everywhere.
It comes as no surprise then that graph technology is on the rise (but you don’t have to take my word for it).
There’s a good chance your competitors are at least evaluating or exploring the deployment of a graph database, so this is your opportunity to step up your game and join the leading companies already using graph technology.
That said, it’s a narrow window.
Learn to leverage graph databases today and your business retains the competitive advantage well past tomorrow.
Ready to dive deeper into the world of graph databases? Learn how to apply graph technologies to real-world problems with O’Reilly’s Graph Databases book. Get your free copy of the definitive book on graph databases and your introduction to Neo4j.
Catch up with the rest of the Graph Databases for Beginners series:
- Why Data Relationships Matter
- The Basics of Data Modeling
- Data Modeling Pitfalls to Avoid
- Why a Database Query Language Matters
- Imperative vs. Declarative Query Languages: What’s the Difference?
- Graph Theory & Predictive Modeling
- Graph Search Algorithm Basics
- Why We Need NoSQL Databases
- ACID vs. BASE Explained
- A Tour of Aggregate Stores
- Other Graph Data Technologies
- Native vs. Non-Native Graph Technology
About the Author
Bryce Merkl Sasaki, Editor-in-Chief, Neo4j
Bryce Merkl Sasaki is the Editor-in-Chief at Neo4j. He studied professional and creative writing for undergrad and has been freelancing for 7 years. Recently, he worked at an inbound marketing agency in Philadelphia as a copywriter before moving to California. When not working, he likes to spend his time working on his novel, looking for pickup soccer games and reading voraciously.