On Impedance Mismatch. Interview with Reinhold Thurner

by Roberto V. Zicari on August 27, 2012

“Many enterprises sidestep applications with “Shadow IT” to solve planning, reporting and analysis problems” — Reinhold Thurner.

I am coming back to the topic of “Impedance Mismatch”.
I have interviewed one of our experts, Dr. Reinhold Thurner founder of Metasafe GmbH in Switzerland.

RVZ

Q1. In a recent interview José A. Blakeley and Rowan Miller of Microsoft, said that “the impedance mismatch problem has been significantly reduced, but not entirely eliminated”? Do you agree?

Thurner: Yes I agree, with some reservations and only for the special case for the impedance mismatch between a conceptual model, a relational database and an oo-program. However even an advanced ORM is not really a solution for the more general case of complex data which affects any (also non oo) programmer and especially also an end user.

Q2. Could you please explain better what you mean here?

Thurner: My reservations concern the tools and the development process: Several standalone tools (model-designer, mapper, code generator, schema-loader) are connected by intermediate files. Is is difficult if not impossible to develop a transparent model transformation which relieves the developer from the necessity to “think” on both levels – the original model and the transformed model – at the same time. The conceptual models can be “painted” easily, but they cannot be “executed” and tested with test data.
They are practically disjoint from the instance data. It takes a lot of discipline to avoid that changes in the data structures are directly applied to the final database with the consequence that the conceptual model is lost.
I rephrase from a document about ADO.net: “Most significant applications involve a conceptual design phase early in the application development lifecycle. Unfortunately, however, the conceptual data model is captured inside a database design tool that has little or no connection with the code and the relational schema used to implement the application. The database design diagrams created in the early phases of the application life cycle usually stay “pinned to a wall” growing increasingly disjoint from the reality of the application implementation with time.”

Q3. You are criticizing the process and the tools – what is the alternative?

Thurner: I compare this tool-architecture with the idea of an “integrated view of conceptual modeling, databases and CASE” (actually the title of one of your books). The basic ideas did exist already in the early 90es but were not realized because the means to implement a “CASE database” were missing: modeling concepts (OMG), languages (java, c#), frameworks (Eclipse), big cheap memory, powerful cpus, big screens etc. Today we are in a much better position and it is now feasible to create a data platform (i.e. a database for CASE) for tool integration. As José A. Blakeley argues, ‘(…) modern applications and data services need to target a higher-level conceptual model based on entities and relationships (…)’. A modern data platform is a prerequisite to supports such a concept?

Q4. Could you give us some examples of typical (impedance mismatch) problems still existing in the enterprise? How are they practically handled in the enterprise?

Thurner: As a consequence of the problems with the impedance mismatch some applications don’t use database technology at all or develop a thick layer of proprietary services which are in fact a sort of private DBMS.
Many enterprises sidestep applications with “Shadow
IT” to solve planning, reporting and analysis problems– i.e. Spreadsheets instead of databases, mail for multi-user access and data exchange, security by obscurity and a lot of manual copy and paste.
Another important area is development tools: Development tools deal with a large number of highly interconnected artifacts which must be managed in configurations and versions. These artifacts are still stored in files, libraries and some in relational databases with a thick layer on top. A proper repository would provide better services for a tool developer and helps to create products which are more flexible and easier to use.
Data management and information dictionary: IT-Governance (COBIT) stipulates that a company should maintain an “information dictionary” which contains all “information assets”, their logical structure, the physical implementations and the responsibilities (RACI-Charts, data steward). The common warehouse model (OMG) describes the model of the most common types of data stores – which is a good start: but companies with several DMBSs, hundreds of databases, servers and programs accessing thousands of tables and IMS-segments need a proper database to store the instances to make the “information model real”. Users of such a dictionary (designers, programmers, testers, integration services, operations, problem management etc.) need an easy to use query-language to access these data in an explorative manner.

Q5. If ORM technology cannot solve this kind of problem? What are the alternatives?

Thurner: The essence of ORM-technology is to create a bridge between the “existing eco-system of databases based on the relational model and the conceptual model”. The “relational model” is not the one and only possible approach to persist data. Data storage technology has moved up the ladder from sequential files to index-sequential, to multi-index, to codasyl, to hierarchical (IMS) and today’s market leader RDBMS. This is certainly not the end and the future seems to become very colorful. As Michael Stonebraker explains “In summary, there may be a substantial number of domain-specific database engines with differing capabilities off into the future. See his paper “One Size fits all – an Idea whose time has come and gone“.
ADO.net has been described as “a part of the broader Microsoft Data Access vision” and covers a specific subset of applications. Is the “other part” – the “executable conceptual model” which was mentioned by Peter Chen in a discussion with José Blakely about “The future of Database Systems”?
I am convinced that an executable conceptual model will play an important role for the aforementioned problems: A DMBS with an entity-relationship model implements the conceptual model without an impedance mismatch. To succeed it needs however all the properties José mentioned like queries, transactions, access-rights and integrated tools.

Q6. You started a company which developed a system called Metasafe-Repository. What is it?

Thurner: It started long ago with a first version developed in C, which was e.g. used in a reengineering project to manage a few hundred types, one million instances and about five million bidirectional relationships. In 2006 we decided to redevelop the system from scratch in java and the necessary tools with the Eclipse framework. We started with the basic elements – multi-level architecture based on the entity-relationship-model, integration of models and instance-data, ACID transactions, versioning and user access rights. During development the system morphed from the initial idea of a repository-service to a complete DBMS. We developed a set of entirely model driven tools – modeler, browser, import-/export utilities, Excel-interface, ADO-Driver for BIRT etc.
Metasafe has a multilevel structure: an internal metamodel, the global data model, external models as subsets (views) of the global data model and the instance data – in OMG-Terms it stretches from M3 to M0. All types (M2, M1) are described by meta-attributes as defined in the metamodel. User access rights to models and instance data are tied to the external models. Entity instances (M0) may exist in several manifestations (Catalog, Version, Variant). An extension of the data model e.g. by additional data types, entity types, relationship types or submodels can be defined using the metaModeler tool (or via the API by a program). From the moment the model changes are committed, the database is ready to accept instance data for the new types without unload/reload of the database.

Q7. Is the Metasafe repository the solution to the impedance mismatch problem?

Thurner: It is certainly a substantial step forward because we made the conceptual model and the database definition one and the same. We take the conceptual model literally by its word: If an ‘Order has Orderdetails’, we tell the database to create two entity types ‘Order’ and ‘Orderdetails’ and the relation ‘has’ between them. This way Metasafe implements an executable conceptual model with all the required properties of a real database management system: an open API, versioning, “declarative, model related queries and transactions” etc. Our own set of tools and especially the integration of BIRT (the eclipse Business Intelligence and Reporting Tool) demonstrate how it can be done. Our graphical erSQL query builder is even integrated into the BIRT designer. The erSQL queries are translated on the fly and BIRT accesses our database without any intermediate files.

Q8: What is the API of the Metasafe repository?

Thurner: Metasafe provides an object-oriented Java-API for all methods required to search, create, read, update, delete the elements of the database – i.e. schemas, user groups /users, entities, relationships and their attributes – both on type- and on instance-level. All the tools of Metasafe (modeler, browser, import/export, query builder etc.) are built with this public API. This approach has led to an elaborate set of methods to support an application programmer. The erSQL query-builder and also the query-translator and processor were implemented with this API. An erSQL query can be embedded in a java-program to retrieve a result-set (including its metadata) or to export the result-set.
In early versions we had a C#-version in parallel but we discontinued this branch when we started with the development of the tools based on Eclipse RCP. The reimplementation of the core in C# would be relatively easy. I think that also the tools could be reimplemented because they are entirely model-driven.

Q9 How does Metasafeˈs query language differ from the Microsoft Entity Framework built-in query capabilities (i.e. Language Integrated Query (LINQ)?

Thurner: It is difficult to compare because Metasafe’s ersql query-language was designed with respect to the special nature of an Entity-Relationship model with heavily cross linked information. So the erSQL query language maps directly to the conceptual model. Also “end users” can create queries with the graphical query builder with point and click on the graphical representation of the conceptual model to identify the path through the model and to collect the information of interest.

The queries are translated on the fly and processed by the query processor. The validation and translation of a query into a command structure of the query processor is a matter or milliseconds. The query processor returns result sets of metadata and typed instance data. The query result can also be exported as Excel-Table or as XML-file. In “read-mode” the result of each retrieval step (instance objects and their attributes) is returned to the invoking program instead of building the complete result set. A query represents a sort of “user” model and is also documented graphically. “End users” can easily create queries and retrieve data from the database. erSQL and the graphical query builder is fully integrated in BIRT to create reports on the fly.
The present version supports only information retrieval. We plan to extend it by a ” … for update” feature which locks all selected entity instances for further operation.
E.g. an update query for {an order and all its order items and products} would lock “the order” until backout or commit.

Q10. There are concerns about the performance and the overhead generated by ORM technology. Is performance an issue for Metasafe?

Thurner: Performance is always an issue when the number of concurrent users and the size and complexity of the data grow. The system works quite well for medium size systems with a few hundred types, a few million instances and a few GBs. The performance depends on the translation of the logical requests into physical access commands and on the execution of the physical access to the persistence. Metasafe uses a very limited functionality of an RDBMS (currently SQLServer, Derby, Oracle) for persistence. Locking, transactions, multi-user management is handled by Metasafe; the locking tables are kept in memory. After a commit it writes all changes in one burst to the database. We could of course use an in-memory DBMS to gain performance. E.g. VoltDB with the direct transaction access could be integrated easily and would certainly lead to superior performance.
We have also another kind of performance in mind – the user performance. For many applications the number of milliseconds to execute a transaction are less important than the ability to quickly create or change a database and to create and launch queries in a matter of minutes. Metasafe is especially helpful for this kind of application.

Q11. What problems is Metasafe designed to solve?

Thurner: Metasafe is designed as a generic data platform for medium sized (XX GB) model-driven applications. The primary purpose is the support for applications with large, complex and volatile data structures as tools, models, catalogs or process managers etc. Metasafe could be used to replace some legacy repositories.
Metasafe is certainly the best data platform (repository) for the construction of an integrated development environment. Metasafe can also serve as DBMS for business applications.
We evaluate also the possibilities to use that Metasafe DBMS as data platform for portable devices as phones and tablet computers: This could be a real killer application for application developers.

Q12. How do you position Metasafe in the market?

Thurner: I had the vision of an entity relationship base database system as future data platform and decided to develop Metasafe to a really useful level without the pressure of the market (namely the first time users). Now we have our product on the necessary level of quality and we are planning the next steps. It could be the “open source approach” for a limited version or the integration into a larger organization.
We have a number of applications and POCs but we have no substantial customer base yet, which would require an adequate support and sales organization. But we have not the intension to convert a successful development setup into a mediocre service and sales organization. We are not under time pressure and are looking at a number of possibilities.

Q13. How can the developers community test your system?

Thurner: We provide an evaluation version upon request.

—————————-
Related Posts

– Do we still have an impedance mismatch problem? – Interview with José A. Blakeley and Rowan Miller. by Roberto V. Zicari on May 21, 2012

Resources

– “Implementing the Executable Conceptual Model (ECM)” (download as .pdf),
by Dr. Reinhold Thurner, Metasafe.

ODBMS.org Free Resources on:
– Entity Framework (EF) Resources
– ORM Technology
– Object-Relational Impedance Mismatch

From → Uncategorized

No comments yet

On Impedance Mismatch. Interview with Reinhold Thurner

Leave a Reply Cancel reply

About the author

Archives

Meta

About

Flickr

Search

On Impedance Mismatch. Interview with Reinhold Thurner

Leave a Reply Cancel reply

About the author

Tags

Archives

Meta

About

Flickr

Search