By Darryl Salas, Enterprise Account Manager
Aug 31 2 mins read
Money laundering is among the hardest activities to detect in the world of financial crime. Funds move in plain sight through standard financial instruments, transactions, intermediaries, legal entities and institutions – avoiding detection by banks and law enforcement. The costs in regulatory fines and damaged reputation for financial institutions are all too real. Neo4j provides an advanced, extensible foundation for fighting money laundering, reducing compliance costs and protecting brand value.
In this fourth blog in our six-part series on using graph technology to fight money laundering, we discuss how to get started quickly using the Neo4j reference implementation.
This section describes how to perform sprints when developing Neo4j applications. A sprint starts with identifying a very small number of “money queries.” The sprint continues with creating a graph data model, and then building out a full-stack solution (from ingestion to visualization) for your first set of money queries.
Building a simple solution with a handful of carefully chosen money queries allows a business to quickly reap the benefits of identifying money laundering.
The following queries can be used as building blocks or combined with other queries to arrive at a money query.
Localized Pattern Match: Suspicious Behavior Money Queries
Small Deposits: Concentration
- Cash transactions are deposited to an account, transferred to a central account, then wired to a bank outside of the U.S.
- The goal is to identify parties and largest aggregate amounts.
- Money queries show the accounts involved and the largest aggregate amounts (n hops, aggregating only when pattern matches).
- Suspicious accounts receive a high number of incoming deposits and then send a few large transactions to one or more high-risk parties.
Small Deposits: Accumulation
- This pattern is characterized by consecutive days of deposits with minimal withdrawals in the same period.
- Money queries identify parties and accounts involved and largest aggregate amounts (n hops, aggregating only when pattern matches).
Small Deposits: Velocity
- Customer makes several cash deposits just under $10,000 over an x-day period.
- Customer receives a large wire followed by withdrawal of most of it as cash via multiple ATM transaction within a short period of time.
- This pattern is signaled by a major behavior change. For example, a business customer whose cash deposit activity has gone from $50,000 a week to $250,000 a week over the course of a month.
- Party A sends to Party B and then B sends to Party C, where the transaction is greater than or equal to the amount between A and B and within Y% from B to C.
- Money queries in this scenario look for which nodes in the transactions have the highest incoming amount and few or no outgoing transactions.
- The pattern is revealed in a large aggregated set of deposits by a customer followed by a large ACH transaction or a transfer to another account such as a mortgage.
Localized Pattern Match: Suspicious Structure Money Queries
Structural: Entity Resolution
Shared attributes can be used for entity resolution
Structural: Payment Chain
Payment chain between two suspicious parties
Localized Pattern Match Scoring Money Queries
See Keymaker pipelines in the Sample Localized Pattern-Match Scoring section below.
Graph Algorithm Money Queries
PageRank, Closeness, Degree, etc.
Closeness scores detect central players, liaisons (betweenness) and the most relevant parties (PageRank) in a path between a customer and a high-risk endpoint.
Louvain Modularity, Label Propagation, Strongly and Weakly Connected Components, etc.
Label Propagation detects common entities and strongly connected components in high-risk rings. These are all based on relationships in the graph.
Common Neighbors, Preferential Attachment, Adamic Adar, etc.
Link prediction algorithms based on the money trail identify hidden COLLABORATOR relationships. These new relationships further inform the analysis of small deposit accounts that involve layering, velocity, concentration, etc.
Jaccard, Cosine, Overlap, etc.
Similarity algorithms are used for entity resolution. Also, if there is a path between a customer and a high-risk end point, similarity algorithms indicate how similar each path is to other paths from that specific customer to those high-risk end points. A company fighting money laundering could then create a subgraph of paths (e.g., paths A, B and C) and have weighted relationships representing similarities among the paths.
Pathfinding and Search
Breadth-First Search, All Pairs Shortest Path, etc.
These algorithms identify payment chains and third parties layered between customers or transactions and other end points. They are also used as a fundamental step in Weakly and Strongly Connected Components, Closeness and other graph algorithms.
Reference Graph Data Model
Neo4j’s AML graph data model pictured below is a whiteboard-style reference model for the queries described in this document. The graph model used in the framework demonstrates best practices when working with Neo4j to support pattern matching for analysis..
Only twenty indexes were required to deliver millisecond response time at scale. In contrast to Neo4j’s graph model, a relational database approach would have an entity-relationship diagram with over 150 tables and 300 to 400 indexes (not including primary keys) for the same questions and dataset. The relational model would also require hundreds of indexes to deliver much slower query-response times measured in minutes or hours.
Nodes Versus Properties The reference model implements certain structural attributes as nodes instead of properties on a customer node. These nodes include shared attributes such as address, phone, email address and SSN, as well as behavioral attributes such as transactions, withdrawals and other events that have time stamps.
Temporality A time tree helps answer questions about changed behavior or other events over different points in time. This approach allows for fast access and uses date parts such as quarters, weeks and months, without additional properties and indexes. The time tree also supports pattern matching on date values (e.g., accounts opened within 30 days of a suspicious transaction).
Relationship Types The Neo4j labeled property graph model unlocks the insights gained from leveraging a rich set of relationship types that provide context in the graph. The model demonstrates several relationship types that connect transactions, such as deposits, withdrawals, debits, etc.
Sample Localized Pattern-Match Scoring
Neo4j includes an analytics pipeline development and execution framework called Keymaker that uses these queries to score high-risk legal entities, depending on the context of payment chains, and relationships with other people, places and things. Keymaker pipelines execute analytic steps in-memory for nearly instant insights.
Keymaker allows data scientists, developers and analysts to pass code into the steps as parameters. Parameterized plug-and-play steps can be turned on and off, and can be shared among your AML team. Keymaker comes with a development console and an administration console. The results of the pipelines can be visualized as shown below.
Neo4j Browser Construct Cypher queries and visualize results in the Neo4j Browser.
Money queries are a great way to get started with your anti-money laundering efforts. The Cypher code for these queries is included in the full solution guide, so download the guide and start looking for patterns that indicate suspicious behavior, small deposits, and layering.
Next week, in blog five of our six-part series on fighting money laundering with graph technology, we will give a comprehensive overview of graph algorithms.
Stop money laundering in its tracks. Click below to get your copy of How to Combat Money Laundering Using Graph Technology.
Email me blog updates!
Yes! Please email the latest blog posts. I can unsubscribe at any time.Subscribe Me
Darryl Salas, Enterprise Account Manager
Darryl has extensive experience in data management working with semantic triple stores, NoSQL databases, columnar databases, relational databases, and database and application integration. Prior to Neo Technology, Darryl worked as a Global Account Manager at SnapLogic, the leading enterprise integration PaaS. Prior to that Darryl was a Strategic Accounts Director … know more
Sponsored by Neo4j