Big Data: Three questions to McObject.
“In a nutshell, pipelining is a programming technique that combines functions from the database system’s library of vector-based functions into an assembly line of processing for market data, with the output of one function becoming input for the next.”–Steven T. Graves.
The fourth interview in the “Big Data: three questions to” series is with Steven T. Graves, President and CEO of McObject.
Q1. What is your current product offering?
Steven T. Graves: McObject has two product lines. One is the eXtremeDB product family. eXtremeDB is a real-time embedded database system built on a core in-memory database system (IMDS) architecture, with the eXtremeDB IMDS edition representing the “standard” product. Other eXtremeDB editions offer special features and capabilities such as an optional SQL API, high availability, clustering, 64-bit support, optional and selective persistent storage, transaction logging and more.
In addition, our eXtremeDB Financial Edition database system targets real-time capital markets systems such as algorithmic trading and risk management (and has its own Web site). eXtremeDB Financial Edition comprises a super-set of the individual eXtremeDB editions (bundling together all specialized libraries such as clustering, 64-bit support, etc.) and offers features including columnar data handling and vector-based statistical processing for managing market data (or any other type of time series data).
Features shared across the eXtremeDB product family include: ACID-compliant transactions; multiple application programming interfaces (a native and type-safe C/C++ API; SQL/ODBC/JDBC; native Java, C# and Python interfaces); multi-user concurrency with an optional multi-version concurrency control (MVCC) transaction manager; event notifications; cache prioritization; and support for multiple database indexes (b-tree, r-tree, kd-tree, hash, Patricia trie, etc.). eXtremeDB’s footprint is small, with a code size of approximately 150K. eXtremeDB is available for a wide range of server, desktop and real-time operating systems (RTOS), and McObject provides eXtremeDB source code for porting.
McObject’s second product offering is the Perst open source, object-oriented embedded database system, available in all-Java and all-C# (.NET) versions. Perst is small (code size typically less than 500K) and very fast, with features including ACID-compliant transactions; specialized collection classes (such as a classic b-tree implementation; r-tree indexes for spatial data; database containers optimized for memory-only access, etc.); garbage collection; full-text search; schema evolution; a “wrapper” that provides a SQL-like interface (SubSQL); XML import/export; database replication, and more.
Perst also operates in specialized environments. Perst for .NET includes support for .NET Compact Framework, Windows Phone 8 (WP8) and Silverlight (check out our browser-based Silverlight CRM demo, which showcases Perst’s support for storage on users’ local file systems). The Java edition supports the Android smartphone platform, and includes the Perst Lite embedded database for Java ME.
Q2. Who are your current customers and how do they typically use your products?
Steven T. Graves: eXtremeDB initially targeted real-time embedded systems, often residing in non-PC devices such as set-top boxes, telecom switches or industrial controllers.
There are millions of eXtremeDB-based devices deployed by our customers; a few examples are set-top boxes from DIRECTV (eXtremeDB is the basis of an electronic programming guide); F5 Networks’ BIG-IP network infrastructure (eXtremeDB is built into the devices’ proprietary embedded operating system); and BAE Systems (avionics in the Panavia Tornado GR4 combat jet). A recent new customer in telecom/networking is Compass-EOS, which has released the first photonics-based core IP router, using eXtremeDB High Availability to manage the device’s control plane database.
The addition of “enterprise-friendly” features (support for SQL, Java, 64-bit, MVCC, etc.) drove eXtremeDB’s adoption for non-embedded systems that demand fast performance. Examples include software-as-a-service provider hetras GmbH (eXtremeDB handles the most performance-intensive queries in its Cloud-based hotel management system); Transaction Network Services (eXtremeDB is used in a highly scalable system for real-time phone number lookups/routing); and MeetMe.com (formerly MyYearbook.com – eXtremeDB manages data in social networking applications).
In the financial industry, eXtremeDB is used by a variety of trading organizations and technology providers. Examples include the broker-dealer TradeStation (McObject’s database technology is part of its next-generation order execution system); Financial Technologies of India, Ltd. (FTIL), which has deployed eXtremeDB in the order-matching application used across its network of financial exchanges in Asia and the Middle East; and NSE.IT (eXtremeDB supports risk management in algorithmic trading).
Users of Perst are many and varied, too. You can find Perst in many commercial software applications such as enterprise application management solutions from the Wily Division of CA. Perst has also been adopted for community-based open source projects, including the Frost client for the Freenet global peer-to-peer network. Some of the most interesting Perst-based applications are mobile. For example, 7City Learning, which provides training for financial professionals, gives students an Android tablet with study materials that are accessed using Perst. Several other McObject customers use Perst in mobile medical apps.
Q3. What are the main new technical features you are currently working on and why?
Steven T. Graves: One feature we’re very excited about is the ability to pipeline vector-based statistical functions in eXtremeDB Financial Edition – we’ve even released a short video and a 10-page white paper describing this capability. In a nutshell, pipelining is a programming technique that combines functions from the database system’s library of vector-based functions into an assembly line of processing for market data, with the output of one function becoming input for the next.
This may not sound unusual, since almost any algorithm or program can be viewed as a chain of operations acting on data.
But this pipelining has a unique purpose and a powerful result: it keeps market data inside CPU cache as the data is being worked.
Without pipelining, the results of each function would typically be materialized outside cache, in temporary tables residing in main memory. Handing interim results back and forth “across the transom” between CPU cache and main memory imposes significant latency, which is eliminated by pipelining. We’ve been improving this capability by adding new statistical functions to the library. (For an explanation of pipelining that’s more in-depth than the video but shorter than the white paper, check out this article on the financial technology site Low-Latency.com.)
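To make the contrast concrete, here is a minimal sketch in plain Python. It does not use eXtremeDB or its API; the function names (`tiles`, `mul`, `total`) and the tile size are hypothetical, chosen only for illustration. Python itself gives no control over CPU cache, so the sketch shows only the structure of the idea: the first version materializes a full-length interim result, while the second passes small tiles from one function to the next so that no full-length temporary is ever built. Both compute a volume-weighted average price (VWAP).

```python
TILE = 4  # hypothetical tile size; a real system would size tiles to fit CPU cache

def tiles(seq, size=TILE):
    """Yield consecutive fixed-size slices (tiles) of a sequence."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

def mul(a_tiles, b_tiles):
    """Element-wise product, produced one tile at a time."""
    for a, b in zip(a_tiles, b_tiles):
        yield [x * y for x, y in zip(a, b)]

def total(in_tiles):
    """Sum across all tiles, consuming each tile as it arrives."""
    acc = 0.0
    for t in in_tiles:
        acc += sum(t)
    return acc

def vwap_materialized(prices, volumes):
    # The interim result (price * volume) is built as a full-length list
    # before the next function touches it.
    notional = [p * v for p, v in zip(prices, volumes)]
    return sum(notional) / sum(volumes)

def vwap_pipelined(prices, volumes):
    # Each tile flows through mul() straight into total(); only one
    # tile-sized interim result exists at any moment.
    notional = total(mul(tiles(prices), tiles(volumes)))
    return notional / total(tiles(volumes))

prices = [10.0, 10.5, 11.0, 10.0, 9.5, 10.0]
volumes = [100, 200, 100, 300, 200, 100]

print(vwap_materialized(prices, volumes))
print(vwap_pipelined(prices, volumes))
```

The two functions return the same value; the difference lies in how much interim data exists at once, which is the property that pipelining exploits to keep working data in cache.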
We are also adding to the capabilities of eXtremeDB Cluster edition to make clustering faster and more flexible, and to further simplify cluster administration. Improvements include a local tables option, in which database tables can be made exempt from replication, but shareable through a scatter/gather mechanism. Dynamic clustering, added in our recent v. 5.0 upgrade, enables nodes to join and leave clusters without interrupting processing. This further simplifies administration for a clustering database technology that counts minimal run-time maintenance as a key benefit. On selected platforms, clustering now supports the InfiniBand switched fabric interconnect and the Message Passing Interface (MPI) standard. In our tests, these high performance networking options accelerated performance by more than 7.5x compared to “plain vanilla” gigabit networking (TCP/IP and Ethernet).