High-performance Compliance Capture and Analytics Solution for Financial Institutions. Interview with Michael Hay and Oskar Mencer.
“New regulations such as MIFID II indeed aim at increasing transparency, which in turn requires more precise reporting. These reports require a lot of data to be stored and data capture to be ultra accurate.” – Michael Hay and Oskar Mencer.
Hitachi Data Systems and Maxeler Technologies announced a collaboration on a High-performance Compliance Capture and Analytics Solution for Financial Institutions. I have interviewed Michael Hay, VP & Chief Engineer at Hitachi Data Systems, and Oskar Mencer, CEO and CTO of Maxeler Technologies Inc.
Q1. What is Multiscale Dataflow Computing?
O. Mencer: Generally, Multiscale Dataflow Computing is a computing paradigm aimed at optimizing operational efficiency of computing by computing data as it is moving through a system. We use Dataflow to minimize the sum of all distances that the data has to travel. We overlay Dataflow with a Multiscale approach of vertically optimizing the algorithm, the architecture and arithmetic.
Q2. There is an emerging EU Financial Services directive called MIFID II. This EU directive, and its associated regulation, was designed to help the regulators better handle High Frequency Trading (HFT) and so-called Dark Pools, in other words, to increase transparency in the markets. What are the technological demands posed by this new financial legislation and these compliance regulations?
M. Hay, O. Mencer: New regulations such as MIFID II indeed aim at increasing transparency, which in turn requires more precise reporting. These reports require a lot of data to be stored and data capture to be ultra accurate. It is an ideal environment for Hitachi data solutions to be combined with Maxeler’s low latency capability.
Q3. To address these challenges, Maxeler Technologies Inc. announced a collaboration with Hitachi Data Systems to offer a high-performance compliance capture and analytics solution. Can you please explain what this solution is about?
M. Hay, O. Mencer: We are combining programmable low latency compute with high capacity “Dataflow-like storage” and modern analytics software. This allows us to attack even the toughest customer challenges and provide competitive advantage within modest development time.
Q4. How can this solution help financial institutions achieve high-frequency, transaction-related record keeping mandated in European Union MiFID II and US Dodd-Frank regulations?
M. Hay: Hitachi’s Data Lake solutions can help to unify the wide range of regulatory data challenges faced by today’s financial institutions. With high end filtering and analytics capability added to the system, we can address regulation but also integration and security issues all within a single system.
Q5. In this cooperation, you have accomplished an operational prototype through the use of Maxeler’s DFE (Data Flow Engine) network cards, Dataflow based capture/decode capability executing on Dataflow hardware, a hardware accelerated NFS client, Hitachi’s CB500, Pentaho, and Hitachi Unified Storage (HUS). Can you explain how this architecture works?
M. Hay, O. Mencer: Our architecture accomplishes tight integration between realtime on-the-wire compute and storage. The realtime computing ability and reliability of the storage ensure that no data is lost and reports can be generated on time and on budget.
Q6. With your Multiscale Dataflow technology data is streamed from memory onto a chip where the data moves directly from one functional unit to another, without being written to off-chip memory until the entire process is complete. What is the advantage of this solution with respect to a classical ETL process?
O. Mencer: In a classical ETL process the database is in the critical loop. With the Multiscale Dataflow approach we remove the database from the critical loop and utilize an in-memory copy of the data for ultrafast access and in-memory analytics.
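As a rough software analogy for the streaming idea discussed above, the sketch below passes each record through decode, filter, and aggregate stages without writing intermediate results to a database, and keeps only a running in-memory summary. It is plain Python on a CPU, not Maxeler's toolchain or hardware; the record format and the names `Trade`, `decode`, `large_only`, `notional_by_symbol`, and `run_pipeline` are assumptions made purely for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class Trade(NamedTuple):
    symbol: str
    price: float
    quantity: int


def decode(lines: Iterable[str]) -> Iterator[Trade]:
    """Stage 1: parse each raw 'symbol,price,quantity' record as it arrives."""
    for line in lines:
        symbol, price, quantity = line.strip().split(",")
        yield Trade(symbol, float(price), int(quantity))


def large_only(trades: Iterable[Trade], min_quantity: int) -> Iterator[Trade]:
    """Stage 2: pass on only trades at or above a size threshold."""
    for trade in trades:
        if trade.quantity >= min_quantity:
            yield trade


def notional_by_symbol(trades: Iterable[Trade]) -> dict[str, float]:
    """Stage 3: fold the stream into a running in-memory total per symbol."""
    totals: dict[str, float] = {}
    for trade in trades:
        notional = trade.price * trade.quantity
        totals[trade.symbol] = totals.get(trade.symbol, 0.0) + notional
    return totals


def run_pipeline(lines: Iterable[str], min_quantity: int = 100) -> dict[str, float]:
    """Chain the stages; each record flows stage to stage with no intermediate store."""
    return notional_by_symbol(large_only(decode(lines), min_quantity))


# Hypothetical feed: three records, one below the size threshold.
feed = ["ABC,10.0,200", "XYZ,5.0,50", "ABC,11.0,100"]
print(run_pipeline(feed))  # {'ABC': 3100.0}
```

Because the stages are generators, a record is decoded, filtered, and aggregated before the next one is read, which loosely mirrors data moving from one functional unit to the next rather than being staged in a database between steps.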
Q7. The overall system from packet capture to NFS write does not use a single server side CPU cycle. What does it mean in practice?
O. Mencer: We use a special substrate to create a dataflow computer by connecting vast numbers of arithmetic units, and implement networking state machines right down on the hardware level. This means that the packet flow through the system is in a tight hardware loop and only metadata travels through conventional CPUs. Additionally, on the storage side Hitachi’s Unified Storage also uses Dataflow-like structures to implement a full set of Network File Serving, a Filesystem and smart object caching for file system object I/O. In this way usage of general CPU cycles is further minimized.
The impact for customers is decreased space needed for the solution coupled with significant performance improvements.
Q8. You claim that dataflow computing can accelerate and run different applications orders of magnitude faster than conventional CPUs. Do you have any benchmarking results to share?
O. Mencer: Benchmarks are not applications and there is no claim that we can accelerate tiny benchmarks.
Our technology enables complete applications with a purpose in the real world to run orders of magnitude faster. For example, in 2011 a Tier 1 investment bank won the American Finance Technology Award for their installation of a machine from Maxeler, which reduced the time to calculate risk from 8 hours down to 2 minutes.
Q9. The Maxeler-Hitachi Data Systems solution leverages the new Amazon AWS F1 instance. Why? Can you please elaborate on this?
M. Hay, O. Mencer: Our joint hardware solution complements the F1 instance for on-premise activities in a hybrid cloud setting. It helps that the latest Maxeler generation (MAX5) is fully compatible with F1 and it is therefore easy to build a hybrid cloud solution with a single code base. If the reader would like to learn more we’re open and able to entertain discussions about finding relevant problems to engage on.
MICHAEL HAY | マイケル ヘイ
VP & CHIEF ENGINEER – HITACHI DATA SYSTEMS. GENERAL MGR, DIGITAL SOLUTIONS BUSINESS DEVELOPMENT – HITACHI, SPBD
As Vice President and Chief Engineer at Hitachi Data Systems and a General Manager of the Service Business Platform Division in Japan, Michael leads a global team that contemplates and enacts the future of Hitachi’s expanding ICT and Social Innovation portfolios. Michael engages a variety of R&D teams, using a clear understanding of market requirements, to guide direction and inspire innovation. Michael joined HDS in 2001 after serving as CEO and owner of a consultancy company focused on complex Enterprise and Systems management design and deployments. His professional background spans over 20 years and includes stints at IBM, IBM partners, and other IT start-up companies. These roles have helped Michael develop a capacity to define solutions for tomorrow’s problems. Michael holds a Master’s in Industrial Engineering with a focus in Human Factors from San Jose State and a Bachelor’s degree in Electrical Engineering from the University of New Mexico, in Albuquerque, NM.
Oskar Mencer. Prior to founding Maxeler, Oskar was Member of Technical Staff at the Computing Sciences Center at Bell Labs in Murray Hill, leading the effort in “Stream Computing”. He joined Bell Labs after receiving a PhD from Stanford University. Besides driving Maximum Performance Computing (MPC) at Maxeler, Oskar was Consulting Professor in Geophysics at Stanford University and he is also affiliated with the Computing Department at Imperial College London, having received two Best Paper Awards, an Imperial College Research Excellence Award in 2007 and a Special Award from Com.sult in 2012 for “revolutionising the world of computers”.
– Video: What is OpenSPL? Professor Michael J Flynn, Stanford University
OpenSPL is an open standard for a novel Spatial Programming Language. It is based on the core concept that a program executes in space, rather than in time sequence. All operations are assumed to be parallel unless specified to be sequential. This is similar to a factory floor where all operations execute in parallel, but each operation executes a different part of the overall process. Temporal Programming is a recipe for the execution of actions, whereas Spatial Programming builds a factory to execute the recipe.
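The factory-floor picture can be made concrete with a small tick-by-tick simulation. This is ordinary Python, not OpenSPL syntax, and the function `simulate_pipeline` and its three example stages are hypothetical. On every tick each stage works on a different item, so the stages are all busy at once once the line has filled.

```python
def simulate_pipeline(stages, inputs):
    """Simulate a linear pipeline one tick at a time.

    registers[i] holds the result produced by stage i on the previous tick.
    None marks an empty slot, so None cannot be used as a data value here.
    Returns (outputs, ticks).
    """
    registers = [None] * len(stages)
    pending = list(inputs)
    outputs = []
    ticks = 0
    while pending or any(r is not None for r in registers):
        # A finished item leaves the end of the line.
        if registers[-1] is not None:
            outputs.append(registers[-1])
        # Every stage fires in the same tick, each on its own item.
        # Updating from the last stage backwards reads the previous tick's values.
        for i in range(len(stages) - 1, 0, -1):
            upstream = registers[i - 1]
            registers[i] = stages[i](upstream) if upstream is not None else None
        registers[0] = stages[0](pending.pop(0)) if pending else None
        ticks += 1
    return outputs, ticks


# Three hypothetical stations on the factory floor.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
outputs, ticks = simulate_pipeline(stages, [1, 2, 3, 4])
print(outputs, ticks)  # [1, 3, 5, 7] 7
```

Run one item at a time through all three stages, four items would take 12 stage-steps; in this simulated line they finish in 7 ticks, because after the line fills one result emerges per tick.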