Analyzing the FinCEN Files in Neo4j

by admin · Published September 28, 2020 · Updated September 29, 2020

Michael Hunger, William Lyon & Rik Van Bruggen, Neo4j

Sep 22 7 mins read

ICIJ used Neo4j to analyze the FinCEN Files data, uncovering a vast network of money laundering.

The FinCEN Files Investigation

The International Consortium of Investigative Journalists (ICIJ) has recently exposed a vast network of industrial-scale money laundering running through Western banks and generally ignored by U.S. regulators – and they used Neo4j to help crack the case wide open.

The global investigation, dubbed the FinCEN Files, reveals how money launderers move dirty money tied to drug cartels, corrupt regimes, arms trafficking and other international crimes. Global banks have turned a blind eye, or even outright refused to stop the transfers, as they earn huge profits from each transaction.

Together with BuzzFeed News and other media partners, the ICIJ spent 16 months organizing and analyzing the FinCEN Files. Using the Neo4j graph database and Linkurious graph visualization, along with many other tools, journalists built a knowledge graph to explore more than 400 spreadsheets containing data on 100,000 transactions and pieced together a nuanced picture of a broken system.

The results draw from more than 2,100 suspicious activity reports (SARs) dating from 1997 to 2017, which accounted for transactions of more than $2 trillion in dirty money. These reports were filed by banks and financial firms with the U.S. Department of the Treasury’s Financial Crimes Enforcement Network (FinCEN), but were largely ignored or overlooked.

The FinCEN Files follow other breakthrough reporting and Pulitzer Prize-winning investigations from the ICIJ such as the Panama Papers, Paradise Papers, Swiss Leaks, West Africa Leaks and Luanda Leaks.

In this post, we take a closer look at the graph data model used in the FinCEN Files and walk you through a demo of querying and visualizing the connected data. We also take a deeper dive into the data using graph data science for more nuanced insights.

Some Context on Neo4j and Data Journalism

In the spring of 2016, a big piece of investigative reporting hit the streets: The ICIJ published the Panama Papers. An unprecedented set of publications, events, political revolutions and corporate boardroom changes followed – the ICIJ had hit upon a very dark nerve of the financial establishment.

The offshore constructions used by the rich, famous and criminals alike scandalized many everyday citizens, and in the aftermath, a number of governmental and regulatory institutions initiated changes to end common tax evasion tactics. During the Panama Papers investigation, the ICIJ relied not only on a set of invaluable documents obtained from an anonymous source, but also on an impressive set of technological building blocks that made an impossible task come together. One of these building blocks was the Neo4j graph data platform, and both as a company and as a community we have been proud contributors to this task of data-driven investigative journalism.

This article is about a new, and perhaps more important, piece of reporting just released by the ICIJ – using a very similar methodology and technology architecture. In the FinCEN Files, they don’t uncover offshore tax dodging constructions but instead reveal banking schemes that would and should rock the financial services world.

These banking schemes enable crime, oppression and authoritarianism across the globe. Now they’ve been brought to light thanks to the combined efforts of a global team of journalists and the power of graph technology.

Let’s dive in.

The Raw FinCEN Files Data

The ICIJ published a small subset of the suspicious activity report (SAR) data that we can use to visualize and query some of the SAR filings.

Much detail has been removed from the published data. Each transaction only includes:

The involved banks (name, country and geolocation)
The filer, originator and beneficiary
The begin-date and end-date
The monetary volume
The number of filings

We can turn this tabular information into a graph dataset that represents the participants of the filing via relationships.

The Data Import Process

For each Filing, we create a node to store the attribute data. For each entity (bank) mentioned in either of the files for that filing, we create an Entity node, store its name and geolocation, and connect it to its Country. Then we connect the Filing to the Entity with the appropriate relationship:

originator => ORIGINATOR
beneficiary => BENEFITS
filer => FILED
entity_b => CONCERNS

Our FinCEN Files graph data model.

We are providing an import script for the data, and we’re also creating a demo database with the imported data and a Neo4j database dump that you can import into your own local or cloud instance.

For the demo server link, use the following login information:

username: fincen
password: fincen
database: fincen

The Neo4j database dump and import script can be found in this GitHub repository: https://github.com/jexp/fincen
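As a rough illustration of the mapping described above, one import step might look like the following Cypher. This is a sketch under stated assumptions, not the import script from the repository: the file name, the column names and the `id` properties are hypothetical, and only the labels, relationship types and the `amount` and `name` properties come from the queries in this post.

```cypher
// Sketch only: the file name and all row.* column names are hypothetical.
LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row
// MERGE fails on null keys, so skip incomplete rows.
WITH row
WHERE row.id IS NOT NULL
  AND row.originator_bank_id IS NOT NULL
  AND row.beneficiary_bank_id IS NOT NULL

// One node per filing, holding the attribute data.
MERGE (f:Filing {id: row.id})
  SET f.amount = toFloat(row.amount)

// One node per bank; the same bank is reused across filings.
MERGE (o:Entity {id: row.originator_bank_id})
  ON CREATE SET o.name = row.originator_bank
MERGE (b:Entity {id: row.beneficiary_bank_id})
  ON CREATE SET b.name = row.beneficiary_bank

// originator => ORIGINATOR, beneficiary => BENEFITS
MERGE (f)-[:ORIGINATOR]->(o)
MERGE (f)-[:BENEFITS]->(b);
```

Using MERGE rather than CREATE keeps the step idempotent, so rerunning it should not duplicate banks or relationships. The FILED and CONCERNS relationships and the Country links would follow the same pattern.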

FinCEN Files Data Visualization and Exploration

To explore the data after the import, one option is Neo4j Bloom™ – an interactive graph data visualization and exploration tool. With the provided “perspective,” each entity is rendered with a specific icon and caption, and you can further investigate the data by just entering the relevant search phrases in the search box.

Visual results for searching Neo4j Bloom with “Entity Deutsche Bank Filing”:

Bloom visualization results for “Filing Benefits Entity Russia”:
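A search phrase like the second one corresponds roughly to a Cypher pattern. The query below is only an approximation of what Bloom resolves: the `name` property on Country nodes is an assumption (the queries in this post only confirm `code`), and the COUNTRY relationship is matched without a direction to avoid guessing it.

```cypher
// Filings that benefit entities located in a given country.
// Assumption: Country nodes carry a `name` property.
MATCH (f:Filing)-[:BENEFITS]->(e:Entity)-[:COUNTRY]-(c:Country)
WHERE c.name = 'Russia'
RETURN f, e, c
LIMIT 50
```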

Querying the FinCEN Files Data

In the Neo4j Browser, a number of queries can give us some deeper insight into the data.

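Here are two queries and their resulting data visualizations. The first returns the 10 largest Filings with their participating Entities; the second lists the 10 Entities with the highest total filing amount, along with their countries: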

MATCH (f:Filing)
WITH f ORDER BY f.amount DESC LIMIT 10
MATCH p=(f)--(e:Entity)
RETURN *

MATCH (e:Entity)--(f:Filing)
WITH e, round(sum(f.amount)) as total
WITH e, total ORDER BY total DESC LIMIT 10
OPTIONAL MATCH (e)-[:COUNTRY]-(c:Country)
RETURN e.name, collect(c.code) as countries, total

Other Data Visualizations

The countries of the beneficiaries can be highlighted based on the aggregate transaction volumes, as you can see below.
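The aggregate behind such a map can be computed with a query along these lines. It is a sketch that reuses only the labels, relationship types and properties appearing in the queries above; note that an entity linked to more than one country would have its filings counted once per country.

```cypher
// Total filing amount per beneficiary country, largest first.
MATCH (f:Filing)-[:BENEFITS]->(e:Entity)-[:COUNTRY]-(c:Country)
RETURN c.code AS country,
       count(DISTINCT f) AS filings,
       round(sum(f.amount)) AS total
ORDER BY total DESC
```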

We can visualize the bank data by geolocation as a heat map, using the Neomap application for Neo4j Desktop.

Using Graph Data Science to Analyze the FinCEN Files Data

For the banks, we use the ORIGINATOR and BENEFITS relationships to create a virtual TRANSFERRED relationship from one bank to another that holds the total amount.

Then, on top of that projected graph, we run a clustering algorithm to identify clusters of banks exchanging money.

// Sum all filing amounts per (originating bank, beneficiary bank) pair
MATCH (from:Entity)<-[:ORIGINATOR]-(f:Filing)-[:BENEFITS]->(to:Entity)
WITH from, to, sum(f.amount) AS total
MERGE (from)-[t:TRANSFERRED]->(to)
SET t.amount = total

Then we can run the Louvain clustering algorithm.
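With the Neo4j Graph Data Science library installed, that step might look like the sketch below. The graph name `bankTransfers` is made up for illustration, and the procedure names follow GDS 2.x; in the 1.x releases current when this post was written, the projection procedure was called `gds.graph.create`.

```cypher
// Project banks and their TRANSFERRED relationships into memory.
// 'bankTransfers' is an illustrative name, not part of the dataset.
CALL gds.graph.project(
  'bankTransfers',
  'Entity',
  {TRANSFERRED: {orientation: 'UNDIRECTED', properties: 'amount'}}
);

// Stream Louvain communities, weighting edges by transferred amount.
CALL gds.louvain.stream('bankTransfers', {relationshipWeightProperty: 'amount'})
YIELD nodeId, communityId
RETURN communityId,
       count(*) AS size,
       collect(gds.util.asNode(nodeId).name)[..10] AS sampleBanks
ORDER BY size DESC
LIMIT 10;
```

The projection is treated as undirected here because the clustering is about which banks exchange money with each other, not about the direction of the flow.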

Or we can find the banks that received the most money transitively by using the PageRank algorithm. In this case, these banks include:
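A ranking of this kind can be sketched with the Graph Data Science library. The graph name `bankFlows` is hypothetical, and the calls assume GDS 2.x (the projection procedure is `gds.graph.create` in 1.x).

```cypher
// Directed projection: money flows from originator to beneficiary.
// 'bankFlows' is an illustrative name, not part of the dataset.
CALL gds.graph.project(
  'bankFlows',
  'Entity',
  {TRANSFERRED: {orientation: 'NATURAL', properties: 'amount'}}
);

// Banks ranked by weighted PageRank over incoming transfers.
CALL gds.pageRank.stream('bankFlows', {relationshipWeightProperty: 'amount'})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS bank, score
ORDER BY score DESC
LIMIT 10;
```

To reuse the scores in a visualization tool, the same configuration can be passed to `gds.pageRank.write` together with a `writeProperty`, which stores the score on each Entity node.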

Or the banks can be visualized by node size using Neo4j Bloom:

Conclusion

The FinCEN Files investigation is not the first project to reveal international criminal activities moving through the financial system, and it certainly won’t be the last.

Global investigations at this scale – whether conducted by journalism organizations, government bodies or self-policing enterprises – have shown time and again that they require the power of the entire graph technology stack: graph database, graph data visualization and graph data science.

When investigators can store, query, explore and analyze the connections in their data, no dark secrets are safe.

Ready to dig deeper into graphs and the FinCEN Files?
Check out the Neo4j FinCEN Files Sandbox.

Further Resources

News:

FinCEN Files investigation: https://www.icij.org/investigations/fincen-files/
FinCEN Files story on BuzzFeed News: https://www.buzzfeednews.com/article/jasonleopold/fincen-files-financial-scandal-criminal-networks

For Developers:

Explore the data: https://www.icij.org/investigations/fincen-files/explore-the-fincen-files-data/
Panama Papers Sandbox: https://sandbox.neo4j.com/?usecase=icij-panama-papers/
Developer guide for Neo4j Graph Algorithms: https://neo4j.com/developer/graph-data-science/graph-algorithms/
Anti-Money Laundering Solution Guide: https://neo4j.com/whitepapers/anti-money-laundering-framework-solution-guide/

Tools:

Neo4j Graph Database: https://neo4j.com/neo4j-graph-database/
Linkurious graph visualization: https://linkurio.us/
Neo4j Desktop (including Neo4j Browser): https://neo4j.com/download/
Neo4j Bloom: https://neo4j.com/bloom/
Neo4j Graph Data Science library: https://neo4j.com/graph-data-science-library/

Use Cases:

Knowledge graphs: https://neo4j.com/use-cases/knowledge-graph/
Fraud detection: https://neo4j.com/use-cases/fraud-detection/

Past Investigations:

Analyzing the Panama Papers with Neo4j: https://neo4j.com/blog/analyzing-panama-papers-neo4j/
Analyzing the Paradise Papers with Neo4j: https://neo4j.com/blog/analyzing-paradise-papers-neo4j/
Panama Papers Case Study: https://neo4j.com/case-studies/the-international-consortium-of-investigative-journalists-icij/
Swiss Leaks Case Study: https://neo4j.com/case-studies/icij/

Get Started:

Graph Databases for Beginners ebook: https://neo4j.com/whitepapers/graph-databases-beginners-ebook/
Graph Databases for Dummies: https://neo4j.com/graph-databases-for-dummies/
Graph Databases (O’Reilly Media): https://neo4j.com/graph-databases-book/

Sponsored by Neo4J
