Graph-Native Machine Learning, Until Now the Domain of Big Tech, is Available with Neo4j for Graph Data Science 1.4
SAN MATEO, Calif., Oct. 20, 2020 / — Neo4jⓇ, the leader in graph technology, announced the latest version of Neo4j for Graph Data Science™, a breakthrough that democratizes advanced graph-based machine learning (ML) techniques by leveraging deep learning and graph convolutional neural networks.
Until now, few companies outside of Google and Facebook have had the AI foresight and resources to leverage graph embeddings. This powerful and innovative technique calculates the shape of the surrounding network for each piece of data inside of a graph, enabling far better machine learning predictions. Neo4j for Graph Data Science version 1.4 democratizes these innovations to upend the way enterprises make predictions in diverse scenarios from fraud detection to tracking customer or patient journey, to drug discovery and knowledge graph completion.
Neo4j for Graph Data Science version 1.4 is the first and only graph-native machine learning functionality commercially available for enterprises. The ability to learn generalized, predictive features from data is significant because organizations don’t always know how to represent connected data for use in machine learning models. The latest Neo4j version includes graph embedding algorithms that learn the structure of a user’s graph, rather than relying on predetermined formulas to calculate specific features like centrality scores.
Alicia Frame, Neo4j’s Lead Product Manager and Data Scientist, shared what Neo4j for Graph Data Science version 1.4 means for data scientists and analytics teams.
“We are thrilled to bring cutting-edge graph embedding techniques into easy-to-use enterprise software,” Dr. Frame said. “The latest version of Neo4j for Graph Data Science democratizes state-of-the-science techniques and makes it possible for anyone to use graph machine learning. This is a game changer in terms of what can be achieved with predictive analysis.“
Graph Embeddings at UK.GOV
In GOV.UK’s recent blog post titled “One Graph to rule them all“, data scientists Felisia Loukou, and Dr. Matthew Gregorywrite about deploying their first machine learning model with the help of graph data science and a Neo4j knowledge graph. Their model automatically recommends content to GOV.UK users based upon the page they are visiting. In their August 2020 post, they explain:
“node2vec, given any graph, learns continuous feature representations (a vector of numbers) for the nodes, which can then be used for various machine learning tasks, such as recommending content. Through this process, we learned that creating the necessary data infrastructure which underpins the training and deployment of a model is the most time-consuming part.”
With Neo4j for Graph Data Science, organizations now have a completely new way to learn from their data, get more value out of existing datasets and continually improve predictive accuracy:
- Uncover revelations in their data that they didn’t even know to look for: Graph embedding (algorithms) learn what’s structurally significant in data assembling a generalized superset of information that all the traditional graph algorithms gather. Graph embeddings accomplish this by sampling the topology and properties of the graph and then reducing its complexity to just those significant features for further machine learning.
- Eliminate plateaus when traditional algorithms aren’t enough. Graph algorithms and embeddings can abstract the structure of a graph using its topology and properties, making it possible to predict outcomes based on the connections between data points – rather than raw data alone.
- Perform faster feature engineering on data using generalized learning to avoid testing a multitude of targeted algorithms when predictive features are ambiguous and using high-performance methods like FastRP.
- Continually incorporate new data and predictions by storing the learned functions of GraphSage in the new machine-learning model catalog and applying them to new data for new embeddings and predictions – without having to retrain your model.
- Enhance the value of their graph database by adding ongoing scoring and classification results, as well as predicting missing information for continually better insights.
Neo4j for Graph Data Science version 1.4 includes three new graph embedding options that learn graph topology to calculate more accurate representations:
- node2Vec is a well-known graph embedding algorithm which uses neural networks
- FastRP is a graph embedding up to 75,000 times faster than node2Vec, while providing equivalent accuracy and scaling well even for very large graphs
- GraphSAGE is an embedding algorithm and process for inductive representation learning on graphs that uses graph convolutional neural networks and can be applied continuously as the graph updates.
In addition to graph embeddings that provide complex vector representations, the new version of Neo4j for Graph Data Science adds general machine learning algorithms such as the k-nearest neighbors algorithm (k-NN), commonly used for pattern-based classification, to make it easier to gain insights from graph embeddings.
Knowledge Graph Completion Example
Knowledge graph completion provides value across domains, including identifying new associations between genes and diseases, the discovery of new drugs and predicting links between customers and products for better recommendations. From queries to support domain experts in uncovering what they know, to patterns to understand trends, to calculating high-value features to train ML models, knowledge graph completion isn’t possible without graph technology.
In a drug discovery scenario, this means not only identifying possible new associations between genes and diseases or drugs and proteins but also providing immediate context to assess the relevance or validity of these discoveries. For customer recommendations, it means learning from user journeys to predict accurate recommendations for future purchases, while presenting options within their buying history to build confidence in suggestions.
Find Out More
Learn more about Neo4j for Graph Data Science or download the latest version here.
Version 1.4 will be covered in detail during Neo4j Online Developer Expo and Summit (NODES) 2020, Neo4j’s global developer conference. You can also watch the talks and demos from the Neo4j Connections for Graph Data Science online event.
Neo4j is the leader in graph database technology. As the world’s most widely deployed graph database, we help global brands – including Comcast, NASA, UBS and Volvo – to reveal and predict how people, processes and systems are interrelated. Using this relationships-first approach, applications built with Neo4j tackle connected data challenges such as analytics and artificial intelligence, fraud detection, real-time recommendations and knowledge graphs. Find out more at neo4j.com.
- Neo4j for Graph Data Science
- Neo4j Graph Data Science Library
- Get your Free Copy of O’Reilly’s Graph Algorithms Book
- Get your Free Copy of the Graph Data Science For Dummies Book
- Neo4j on Twitter
- Neo4j on LinkedIn
- Neo4j on YouTube
- Neo4j is hiring
© 2020 Neo4j, Inc., Neo Technology®, Neo4j®, Cypher®, Neo4j® Bloom™, Neo4j® Aura™ and Neo4j for Graph Data Science™ are registered trademarks or a trademark of Neo4j, Inc. All other marks are owned by their respective companies.