ODBMS Industry Watch
Trends and Information on Big Data, New Data Management Technologies, Data Science and Innovation.

Big Data from Space: the “Herschel” telescope.
Fri, 02 Aug 2013

” One of the biggest challenges with any project of such a long duration is coping with change. There are many aspects to coping with change, including changes in requirements, changes in technology, vendor stability, changes in staffing and so on”–Jon Brumfitt.

On May 14, 2009, the European Space Agency launched an Ariane 5 rocket carrying the largest telescope ever flown: the “Herschel” telescope, 3.5 meters in diameter.

I first did an interview with Dr. Jon Brumfitt, System Architect & System Engineer of Herschel Scientific Ground Segment, at the European Space Agency in March 2011. You can read that interview here.

Two years later, I wanted to know the status of the project. This is a follow up interview.

RVZ

Q1. What is the status of the mission?

Jon Brumfitt: The operational phase of the Herschel mission came to an end on 29th April 2013, when the super-fluid helium used to cool the instruments was finally exhausted. By operating in the far infra-red, Herschel has been able to see cold objects that are invisible to normal telescopes.
However, this requires that the detectors are cooled to an even lower temperature. The helium cools the instruments down to 1.7K (about -271 Celsius). Individual detectors are then cooled down further to about 0.3K. This is very close to absolute zero, which is the coldest possible temperature. The exhaustion of the helium marks the end of new observations, but it is by no means the end of the mission.
We still have a lot of work to do in getting the best results from the data processing to give astronomers a final legacy archive of high-quality data to work with for years to come.

The spacecraft has been in orbit around a point known as the second Lagrangian point “L2”, which is about 1.5 million kilometres from Earth (around four times as far away as the Moon). This location provided a good thermal environment and a relatively unrestricted view of the sky. The spacecraft cannot simply be left in this orbit, because regular correction manoeuvres would be needed. Consequently, it is being transferred into a “parking” orbit around the Sun.

Q2. What are the main results obtained so far by using the “Herschel” telescope?

Jon Brumfitt: That is a difficult one to answer in a few sentences. Just to take a few examples, Herschel has given us new insights into the way that stars form and the history of star formation and galaxy evolution since the Big Bang.
It has discovered large quantities of cold water vapour in the dusty disk surrounding a young star, which suggests the possibility of other water-covered planets. It has also given us new evidence for the origins of water on Earth.
The following are some links giving more detailed highlights from the mission:

– Press
– Results
– Press Releases
– Latest news

With its 3.5 metre diameter mirror, Herschel is the largest space telescope ever launched. The large mirror not only gives it a high sensitivity but also allows us to observe the sky with a high spatial resolution. So in a sense every observation we make is showing us something we have never seen before. We have performed around 35,000 science observations, which have already resulted in over 600 papers being published in scientific journals. There are many years of work ahead for astronomers in interpreting the results, which will undoubtedly lead to many new discoveries.

Q3. How much data have you received and processed so far? Could you give us some up-to-date information?

Jon Brumfitt: We have about 3 TB of data in the Versant database, most of which is raw data from the spacecraft. The data received each day is processed by our data processing pipeline and the resulting data products, such as images and spectra, are placed in an archive for access by astronomers.
Each time we make a major new release of the software (roughly every six months at this stage), with improvements to the data processing, we reprocess everything.
The data processing runs on a grid with around 35 nodes, each with typically 8 cores and between 16 and 256 GB of memory. This is able to process around 40 days’ worth of data per day, so it is possible to reprocess everything in a few weeks. The data in the archive is stored as FITS files (a standard format for astronomical data).
The archive uses a relational (PostgreSQL) database to catalogue the data and allow queries to find relevant data. This relational database is only about 60 GB, whereas the product files account for about 60 TB.
This may reduce somewhat for the final archive, once we have cleaned it up by removing the results of earlier processing runs.

Q4. What are the main technical challenges in the data management part of this mission and how did you solve them?

Jon Brumfitt: One of the biggest challenges with any project of such a long duration is coping with change. There are many aspects to coping with change, including changes in requirements, changes in technology, vendor stability, changes in staffing and so on.

The lifetime of Herschel will have been 18 years from the start of software development to the end of the post-operations phase.
We designed a single system to meet the needs of all mission phases, from early instrument development, through routine in-flight operations to the end of the post-operations phase. Although the spacecraft was not launched until 2009, the database was in regular use from 2002 for developing and testing the instruments in the laboratory. By using the same software to control the instruments in the laboratory as we used to control them in flight, we ended up with a very robust and well-tested system. We call this approach “smooth transition”.

The development approach we adopted is probably best classified as an Agile iterative and incremental one. Object orientation helps a lot because changes in the problem domain, resulting from changing requirements, tend to result in localised changes in the data model.
Other important factors in managing change are separation of concerns and minimization of dependencies, for example using component-based architectures.

When we decided to use an object database, it was a new technology and it would have been unwise to rely on any database vendor or product surviving for such a long time. Although work was under way on the ODMG and JDO standards, these were quite immature and the only suitable object databases used proprietary interfaces.
We therefore chose to implement our own abstraction layer around the database. This was similar in concept to JDO, with a factory providing a pluggable implementation of a persistence manager. This abstraction provided a route to change to a different object database, or even a relational database with an object-relational mapping layer, should it have proved necessary.
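To make the abstraction-layer idea concrete, here is a minimal sketch of such a factory-based persistence abstraction; all names below (PersistenceManager, PersistenceManagerFactory and the default implementation class) are hypothetical illustrations, not the actual Herschel interfaces.

    // Hypothetical abstraction layer: applications code against these
    // interfaces only, never against a vendor API directly.
    public interface PersistenceManager {
        void makePersistent(Object obj);     // store an object graph
        void deletePersistent(Object obj);   // remove an object
        <T> java.util.Collection<T> query(Class<T> type, String filter);
        void commit();
    }

    public final class PersistenceManagerFactory {
        // The concrete implementation class is read from configuration, so
        // switching database products is a configuration change, not a code change.
        public static PersistenceManager create() throws Exception {
            String impl = System.getProperty(
                    "persistence.impl", "example.DefaultPersistenceManager");
            return (PersistenceManager) Class.forName(impl)
                    .getDeclaredConstructor().newInstance();
        }
    }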

One aspect that is difficult to abstract is the use of queries, because query languages differ. In principle, an object database could be used without any queries, by navigating to everything from a global root object. However, in practice navigation and queries both have their role. For example, to find all the observation requests that have not yet been scheduled, it is much faster to perform a query than to iterate by navigation to find them. However, once an observation request is in memory it is much easier and faster to navigate to all the associated objects needed to process it. We have used a variety of techniques for encapsulating queries. One is to implement them as methods of an extent class that acts as a query factory.
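As an illustration of the “extent class as query factory” technique, the sketch below hides one query behind an ordinary method; the class name and query filter are invented for the example, and PersistenceManager is the interface sketched above.

    // Hypothetical extent class: callers ask for "unscheduled requests" and
    // never see the underlying query language. If the query language changes,
    // only this class changes. (ObservationRequest is sketched further below.)
    public class ObservationRequestExtent {
        private final PersistenceManager pm;

        public ObservationRequestExtent(PersistenceManager pm) {
            this.pm = pm;
        }

        public java.util.Collection<ObservationRequest> findUnscheduled() {
            return pm.query(ObservationRequest.class, "scheduled == false");
        }
    }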

Another challenge was designing a robust data model that would serve all phases of the mission from instrument development in the laboratory, through pre-flight tests and routine operations to the end of post-operations. We approached this by starting with a model of the problem domain and then analysing use-cases to see what data needed to be persistent and where we needed associations. It was important to avoid the temptation to store too much just because transitive persistence made it so easy.

One criticism that is sometimes raised against object databases is that the associations tend to encode business logic in the object schema, whereas relational databases just store data in a neutral form that can outlive the software that created it; if you subsequently decide that you need a new use-case, such as report generation, the associations may not be there to support it. This is true to some extent, but consideration of use cases for the entire project lifetime helped a lot. It is of course possible to use queries to work around missing associations.

Examples are sometimes given of how easy an object database is to use by directly persisting your business objects. This may be fine for a simple application with an embedded database, but for a complex system you still need to cleanly decouple your business logic from the data storage. This is true whether you are using a relational or an object database. With an object database, the persistent classes should only be responsible for persistence and referential integrity and so typically just have getter and setter methods.
We have encapsulated our persistent classes in a package called the Core Class Model (CCM) that has a factory to create instances. This complements the pluggable persistence manager. Hence, the application sees the persistence manager and CCM factories and interfaces, but the implementations are hidden.
Applications define their own business classes which can work like decorators for the persistent classes.
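A minimal sketch of the split described here, with invented class names rather than the real Core Class Model: the persistent class carries only state, while a business class decorates it with behaviour.

    // Persistent class: only state, getters/setters and referential
    // integrity -- no business logic.
    public class ObservationRequest {
        private String id;
        private boolean scheduled;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public boolean isScheduled() { return scheduled; }
        public void setScheduled(boolean scheduled) { this.scheduled = scheduled; }
    }

    // Business class: wraps (decorates) the persistent object and adds
    // behaviour without polluting the persistence layer.
    public class SchedulableRequest {
        private final ObservationRequest delegate;

        public SchedulableRequest(ObservationRequest delegate) {
            this.delegate = delegate;
        }

        public void schedule() {
            if (delegate.isScheduled())
                throw new IllegalStateException("Request already scheduled");
            delegate.setScheduled(true);
        }
    }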

Q5. What is your experience in having two separate database systems for Herschel? A relational database for storing and managing processed data products and an object database for storing and managing proposal data, mission planning data, telecommands and raw (unprocessed) telemetry?

Jon Brumfitt: There are essentially two parts to the ground segment for a space observatory.
One is the “uplink” which is used for controlling the spacecraft and instruments. This includes submission of observing proposals, observation planning, scheduling, flight dynamics and commanding.
The other is the “downlink”, which involves ingesting and processing the data received from the spacecraft.

On some missions the data processing is carried out by a data centre, which is separate from spacecraft operations. In that case there is a very clear separation.
On Herschel, the original concept was to build a completely integrated system around an object database that would hold all uplink and downlink data, including processed data products. However, after further analysis it became clear that it was better to integrate our product archive with those from other missions. This also means that the Herschel data will remain available long after the project has finished. The role of the object database is essentially for operating the spacecraft and storing the raw data.

The Herschel archive is part of a common infrastructure shared by many of our ESA science projects. This provides a uniform way of accessing data from multiple missions.
A nice example is the way data from Herschel and our XMM-Newton X-ray telescope have been combined to make a multi-spectral image of the Andromeda Galaxy.

Our archive, in turn, forms part of a larger international archive known as the “Virtual Observatory” (VO), which includes both space and ground-based observatories from all over the world.

I think that using separate databases for operations and product archiving has worked well. In fact, it is more the norm than the exception. The two databases serve very different roles.
The uplink database manages the day-to-day operations of the spacecraft and is constantly being updated. The uplink data forms a complex object graph which is accessed by navigation, so an object database is well suited.
The product archive is essentially a write-once-read-many repository. The data is not modified, but new versions of products may be added as a result of reprocessing. There are a large number of clients accessing it via the Internet. The archive database is a catalogue containing the product meta-data, which can be queried to find the relevant product files. This is better suited to a relational database.

The motivation for the original idea of using a single object database for everything was that it allowed direct association between uplink and downlink data. For example, processed products could be associated with their observation requests. However, using separate databases does not prevent one database being queried with an observation identifier obtained from the other.
One complication is that processing an observation requires both downlink data and the associated uplink data.
We solved this by creating “uplink products” from the relevant uplink data and placing them in the archive. This has the advantage that external users, who do not have access to the Versant database, have everything they need to process the data themselves.

Q6. What are the main lessons learned so far in using Versant object database for managing telemetry data and information on steering and calibrating scientific on-board instruments?

Jon Brumfitt: Object databases can be very effective for certain kinds of application, but may have less benefit for others. A complex system typically has a mixture of application types, so the advantages are not always clear cut. Object databases can give a high performance for applications that need to navigate through a complex object graph, particularly if used with fairly long transactions where a significant part of the object graph remains in memory. Web (JavaEE) applications lose some of the benefit because they typically perform many short transactions with each one performing a query. They also use additional access layers that result in a system which loses the simplicity of the transparent persistence of an object database.

In our case, the object database was best suited for the uplink. It simplified the uplink development by avoiding object-relational mapping and the complexity of a design based on JDBC or EJB 2. Nowadays with JPA, relational databases are much easier to use for object persistence, so the rationale for using an object database is largely determined by whether the application can benefit from fast navigational access and how much effort is saved in mapping. There are now at least two object database vendors that support both JDO and JPA, so the distinction is becoming somewhat blurred.
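The convergence he describes can be illustrated with standard JPA: the entity and access code below use only the javax.persistence API, so in principle the backing store could be a relational engine or an object database whose vendor implements JPA. The entity and the “herschel” persistence-unit name are invented for the example.

    import javax.persistence.*;

    @Entity
    public class TelemetryPacket {
        @Id
        private long id;
        private String instrument;
        private long timestamp;
        // getters and setters omitted for brevity
    }

    // Standard JPA access code -- identical whichever vendor sits behind the
    // "herschel" persistence unit configured in persistence.xml.
    class PacketDao {
        private final EntityManager em = Persistence
                .createEntityManagerFactory("herschel")
                .createEntityManager();

        TelemetryPacket find(long id) {
            return em.find(TelemetryPacket.class, id);
        }
    }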

For telemetry access we query the database instead of using navigation, as the packets don’t fit neatly into a single containment hierarchy. Queries allow packets to be accessed by many different criteria, such as time, instrument, type, source and so on.
Processing calibration observations does not introduce any special considerations as far as the database is concerned.

Q7. Did you have any scalability and or availability issues during the project? If yes, how did you solve them?

Jon Brumfitt: Scalability would have been an important issue if we had kept to the original concept of storing everything including products in a single database. However, using the object database for just uplink and telemetry meant that this was not a big issue.

The data processing grid retrieves the raw telemetry data from the object database server, which is a 16-core Linux machine with 64 GB of memory. The average load on the server is quite low, but occasionally there have been high peak loads from the grid that have saturated the server disk I/O and slowed down other users of the database. Interactive applications such as mission planning need a rapid response, whereas batch data processing is less critical. We solved this by implementing a mechanism to spread out the grid load by treating the database as a resource.
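One simple way to implement such load-spreading (a sketch of the general technique, not necessarily what Herschel did) is to treat database access as a counted resource, for example with a semaphore that caps concurrent grid retrievals:

    import java.util.concurrent.Semaphore;

    // Cap the number of grid jobs reading telemetry at once, so that batch
    // reprocessing cannot saturate the database server's disk I/O and starve
    // interactive users such as mission planning.
    public class ThrottledTelemetryStore {
        private static final Semaphore DB_SLOTS = new Semaphore(8); // tunable limit

        public byte[] fetchTelemetry(String observationId) throws InterruptedException {
            DB_SLOTS.acquire();          // block while too many readers are active
            try {
                return readFromDatabase(observationId);
            } finally {
                DB_SLOTS.release();
            }
        }

        private byte[] readFromDatabase(String observationId) {
            return new byte[0];          // placeholder for the actual database read
        }
    }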

Once a year, we have made an “Announcement of Opportunity” for astronomers to propose observations that they would like to perform with Herschel. It is only human nature that many people leave it until the last minute and we get a very high peak load on the server in the last hour or two before the deadline! We have used a separate server for this purpose, rather than ingesting proposals directly into our operational database. This has avoided any risk of interfering with routine operations. After the deadline, we have copied the objects into the operational database.

Q8. What about the overall performance of the two databases? What are the lessons learned?

Jon Brumfitt: The databases are good at different things.
As mentioned before, an object database can give a high performance for applications involving a complex object graph which you navigate around. An example is our mission planning system. Object persistence makes application design very simple, although in a real system you still need to introduce layers to decouple the business logic from the persistence.

For the archive, on the other hand, a relational database is more appropriate. We are querying the archive to find data that matches a set of criteria. The data is stored in files rather than as objects in the database.

Q9. What are the next steps planned for the project and the main technical challenges ahead?

Jon Brumfitt: As I mentioned earlier, the coming post-operations phase will concentrate on further improving the data processing software to generate a top-quality legacy archive, and on provision of high-quality support documentation and continued interactive support for the community of astronomers that forms our “customer base”. The system was designed from the outset to support all phases of the mission, from early instrument development tests in the laboratory, through routine operations to the end of the post-operations phase of the mission. The main difference moving into post-operations is that we will stop uplink activities and ingesting new telemetry. We will continue to reprocess all the data regularly as improvements are made to the data processing software.

We are currently in the process of upgrading from Versant 7 to Versant 8.
We have been using Versant 7 since launch and the system has been running well, so there has been little urgency to upgrade.
However, with routine operations coming to an end, we are doing some “technology refresh”, including upgrading to Java 7 and Versant 8.

Q10. Anything else you wish to add?

Jon Brumfitt: These are just some personal thoughts on the way the database market has evolved over the lifetime of Herschel. Thirteen years ago, when we started development of our system, there were expectations that object databases would really take off in line with the growing use of object orientation, but this did not happen. Object databases still represent rather a niche market. It is a pity there is no open-source object-database equivalent of MySQL. This would have encouraged more people to try object databases.

JDO has developed into a mature standard over the years. One of its key features is that it is “architecture neutral”, but in fact there are very few implementations for relational databases. However, it seems to be finding a new role for some NoSQL databases, such as the Google AppEngine datastore.
NoSQL appears to be taking off far quicker than object databases did, although it is an umbrella term that covers quite a few kinds of datastore. Horizontal scaling is likely to be an important feature for many systems in the future. The relational model is still dominant, but there is a growing appreciation of alternatives. There is even talk of “Polyglot Persistence” using different kinds of databases within a system; in a sense we are doing this with our object database and relational archive.

More recently, JPA has created considerable interest in object persistence for relational databases and appears to be rapidly overtaking JDO.
This is partly because it is being adopted by developers of enterprise applications who previously used EJB 2.
If you look at the APIs of JDO and JPA they are actually quite similar apart from the locking modes. However, there is an enormous difference in the way they are typically used in practice. This is more to do with the fact that JPA is often used for enterprise applications. The distinction is getting blurred by some object database vendors who now support JPA with an object database. This could expand the market for object databases by attracting some traditional relational type applications.

So, I wonder what the next 13 years will bring! I am certainly watching developments with interest.
——

Dr Jon Brumfitt, System Architect & System Engineer of Herschel Scientific Ground Segment, European Space Agency.

Jon Brumfitt has a background in Electronics with Physics and Mathematics and has worked on several of ESA’s astrophysics missions, including IUE, Hipparcos, ISO, XMM and currently Herschel. After completing his PhD and a post-doctoral fellowship in image processing, Jon worked on data reduction for the IUE satellite before joining Logica Space and Defence in 1980. In 1984 he moved to Logica’s research centre in Cambridge and then in 1993 to ESTEC in the Netherlands to work on the scientific ground segments for ISO and XMM. In January 2000, he joined the newly formed Herschel team as science ground segment System Architect. As Herschel approached launch, he moved down to the European Space Astronomy Centre in Madrid to become part of the Herschel Science Operations Team, where he is currently System Engineer and System Architect.

Related Posts

The Gaia mission, one year later. Interview with William O’Mullane. January 16, 2013

Objects in Space: “Herschel” the largest telescope ever flown. March 18, 2011

Resources

Introduction to ODBMS By Rick Grehan

ODBMS.org Resources on Object Database Vendors.

—————————————
You can follow ODBMS.org on Twitter : @odbmsorg

##

On Impedance Mismatch. Interview with Reinhold Thurner
Mon, 27 Aug 2012

“Many enterprises sidestep applications with ‘Shadow IT’ to solve planning, reporting and analysis problems” — Reinhold Thurner.

I am coming back to the topic of “Impedance Mismatch”.
I have interviewed one of our experts, Dr. Reinhold Thurner founder of Metasafe GmbH in Switzerland.

RVZ

Q1. In a recent interview, José A. Blakeley and Rowan Miller of Microsoft said that “the impedance mismatch problem has been significantly reduced, but not entirely eliminated”. Do you agree?

Thurner: Yes, I agree, with some reservations, and only for the special case of the impedance mismatch between a conceptual model, a relational database and an OO program. However, even an advanced ORM is not really a solution for the more general case of complex data, which affects every programmer (not only OO programmers) and especially also end users.

Q2. Could you please explain better what you mean here?

Thurner: My reservations concern the tools and the development process: several standalone tools (model designer, mapper, code generator, schema loader) are connected by intermediate files. It is difficult, if not impossible, to develop a transparent model transformation which relieves the developer of the need to “think” on both levels – the original model and the transformed model – at the same time. The conceptual models can be “painted” easily, but they cannot be “executed” and tested with test data.
They are practically disjoint from the instance data. It takes a lot of discipline to prevent changes in the data structures from being applied directly to the final database, with the consequence that the conceptual model is lost.
To paraphrase a document about ADO.NET: “Most significant applications involve a conceptual design phase early in the application development lifecycle. Unfortunately, however, the conceptual data model is captured inside a database design tool that has little or no connection with the code and the relational schema used to implement the application. The database design diagrams created in the early phases of the application life cycle usually stay “pinned to a wall” growing increasingly disjoint from the reality of the application implementation with time.”

Q3. You are criticizing the process and the tools – what is the alternative?

Thurner: I compare this tool architecture with the idea of an “integrated view of conceptual modeling, databases and CASE” (actually the title of one of your books). The basic ideas already existed in the early ’90s, but they were not realized because the means to implement a “CASE database” were missing: modeling concepts (OMG), languages (Java, C#), frameworks (Eclipse), large cheap memory, powerful CPUs, big screens etc. Today we are in a much better position, and it is now feasible to create a data platform (i.e. a database for CASE) for tool integration. As José A. Blakeley argues, ‘(…) modern applications and data services need to target a higher-level conceptual model based on entities and relationships (…)’. A modern data platform is a prerequisite to support such a concept.

Q4. Could you give us some examples of typical (impedance mismatch) problems still existing in the enterprise? How are they practically handled in the enterprise?

Thurner: As a consequence of the problems with the impedance mismatch some applications don’t use database technology at all or develop a thick layer of proprietary services which are in fact a sort of private DBMS.
Many enterprises sidestep applications with “Shadow IT” to solve planning, reporting and analysis problems – i.e. spreadsheets instead of databases, mail for multi-user access and data exchange, security by obscurity and a lot of manual copy and paste.
Another important area is development tools: Development tools deal with a large number of highly interconnected artifacts which must be managed in configurations and versions. These artifacts are still stored in files, libraries and some in relational databases with a thick layer on top. A proper repository would provide better services for a tool developer and helps to create products which are more flexible and easier to use.
Data management and information dictionary: IT governance (COBIT) stipulates that a company should maintain an “information dictionary” which contains all “information assets”, their logical structure, the physical implementations and the responsibilities (RACI charts, data stewards). The Common Warehouse Metamodel (OMG) describes the model of the most common types of data stores – which is a good start: but companies with several DBMSs, hundreds of databases, servers and programs accessing thousands of tables and IMS segments need a proper database to store the instances and make the “information model” real. Users of such a dictionary (designers, programmers, testers, integration services, operations, problem management etc.) need an easy-to-use query language to access these data in an explorative manner.

Q5. If ORM technology cannot solve this kind of problem? What are the alternatives?

Thurner: The essence of ORM technology is to create a bridge between the “existing eco-system of databases based on the relational model and the conceptual model”. The “relational model” is not the one and only possible approach to persisting data. Data storage technology has moved up the ladder from sequential files to index-sequential, to multi-index, to CODASYL, to hierarchical (IMS) and to today’s market leader, the RDBMS. This is certainly not the end, and the future seems set to become very colorful. As Michael Stonebraker explains: “In summary, there may be a substantial number of domain-specific database engines with differing capabilities off into the future.” See his paper “One Size Fits All – An Idea Whose Time Has Come and Gone”.
ADO.NET has been described as “a part of the broader Microsoft Data Access vision” and covers a specific subset of applications. Is the “other part” the “executable conceptual model” that was mentioned by Peter Chen in a discussion with José Blakeley about “The Future of Database Systems”?
I am convinced that an executable conceptual model will play an important role for the aforementioned problems: a DBMS with an entity-relationship model implements the conceptual model without an impedance mismatch. To succeed, however, it needs all the properties José mentioned, like queries, transactions, access rights and integrated tools.

Q6. You started a company which developed a system called Metasafe-Repository. What is it?

Thurner: It started long ago with a first version developed in C, which was used, for example, in a reengineering project to manage a few hundred types, one million instances and about five million bidirectional relationships. In 2006 we decided to redevelop the system from scratch in Java, and the necessary tools with the Eclipse framework. We started with the basic elements – a multi-level architecture based on the entity-relationship model, integration of models and instance data, ACID transactions, versioning and user access rights. During development the system morphed from the initial idea of a repository service into a complete DBMS. We developed a set of entirely model-driven tools – modeler, browser, import/export utilities, Excel interface, ADO driver for BIRT etc.
Metasafe has a multilevel structure: an internal metamodel, the global data model, external models as subsets (views) of the global data model, and the instance data – in OMG terms it stretches from M3 to M0. All types (M2, M1) are described by meta-attributes as defined in the metamodel. User access rights to models and instance data are tied to the external models. Entity instances (M0) may exist in several manifestations (catalog, version, variant). An extension of the data model, e.g. by additional data types, entity types, relationship types or submodels, can be defined using the metaModeler tool (or via the API by a program). From the moment the model changes are committed, the database is ready to accept instance data for the new types without unload/reload of the database.

Q7. Is the Metasafe repository the solution to the impedance mismatch problem?

Thurner: It is certainly a substantial step forward, because we made the conceptual model and the database definition one and the same. We take the conceptual model at its word: if an ‘Order has Orderdetails’, we tell the database to create two entity types ‘Order’ and ‘Orderdetails’ and the relation ‘has’ between them. This way Metasafe implements an executable conceptual model with all the required properties of a real database management system: an open API, versioning, “declarative, model-related queries and transactions” etc. Our own set of tools, and especially the integration of BIRT (the Eclipse Business Intelligence and Reporting Tool), demonstrate how it can be done. Our graphical erSQL query builder is even integrated into the BIRT designer. The erSQL queries are translated on the fly, and BIRT accesses our database without any intermediate files.
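In code, that idea might look like the sketch below; every type and method name here is invented to illustrate the principle, and does not claim to be Metasafe’s actual API.

    // Hypothetical API -- minimal declarations so the sketch is self-contained.
    interface EntityType {}
    interface Schema {
        EntityType defineEntityType(String name);
        void defineRelationshipType(String name, EntityType from, EntityType to);
        void commit();
    }

    class ConceptualModelDemo {
        // "Order has Orderdetails" becomes, literally, two entity types and one
        // relationship type in the database: the conceptual model *is* the
        // schema, so there is nothing left to map.
        static void defineModel(Schema schema) {
            EntityType order  = schema.defineEntityType("Order");
            EntityType detail = schema.defineEntityType("Orderdetails");
            schema.defineRelationshipType("has", order, detail);
            schema.commit();  // the new types accept instance data immediately
        }
    }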

Q8: What is the API of the Metasafe repository?

Thurner: Metasafe provides an object-oriented Java API for all methods required to search, create, read, update and delete the elements of the database – i.e. schemas, user groups/users, entities, relationships and their attributes – both on the type and on the instance level. All the tools of Metasafe (modeler, browser, import/export, query builder etc.) are built with this public API. This approach has led to an elaborate set of methods to support an application programmer. The erSQL query builder and also the query translator and processor were implemented with this API. An erSQL query can be embedded in a Java program to retrieve a result set (including its metadata) or to export the result set.
In early versions we had a C# version in parallel, but we discontinued this branch when we started development of the tools based on Eclipse RCP. Reimplementation of the core in C# would be relatively easy, and I think the tools could also be reimplemented, because they are entirely model-driven.
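A sketch of what embedding such a query might look like; the query text and the small interfaces are invented for illustration, since the actual erSQL syntax and API are not shown in the interview.

    // Hypothetical embedding of an erSQL query in Java: the query walks the
    // conceptual model from Order via "has" to Orderdetails.
    class QueryDemo {
        interface Row { Object get(String attribute); }
        interface ResultSet extends Iterable<Row> {}
        interface Session { ResultSet execute(String ersql); }

        static void printOrderDetails(Session session) {
            // Invented query text, purely illustrative of a declarative,
            // model-related retrieval.
            ResultSet rows = session.execute(
                    "select Order.id, Orderdetails.product from Order has Orderdetails");
            for (Row row : rows) {
                System.out.println(row.get("Order.id") + " -> "
                        + row.get("Orderdetails.product"));
            }
        }
    }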

Q9. How does Metasafe’s query language differ from the Microsoft Entity Framework built-in query capabilities (i.e. Language Integrated Query, LINQ)?

Thurner: It is difficult to compare, because Metasafe’s erSQL query language was designed with respect to the special nature of an entity-relationship model with heavily cross-linked information. So the erSQL query language maps directly to the conceptual model. Also, “end users” can create queries with the graphical query builder by pointing and clicking on the graphical representation of the conceptual model to identify the path through the model and to collect the information of interest.

The queries are translated on the fly and processed by the query processor. The validation and translation of a query into a command structure of the query processor is a matter of milliseconds. The query processor returns result sets of metadata and typed instance data. The query result can also be exported as an Excel table or as an XML file. In “read mode” the result of each retrieval step (instance objects and their attributes) is returned to the invoking program instead of building the complete result set. A query represents a sort of “user” model and is also documented graphically. “End users” can easily create queries and retrieve data from the database. erSQL and the graphical query builder are fully integrated into BIRT to create reports on the fly.
The present version supports only information retrieval. We plan to extend it with a “ … for update” feature which locks all selected entity instances for further operations.
E.g. an update query for {an order and all its order items and products} would lock “the order” until backout or commit.

Q10. There are concerns about the performance and the overhead generated by ORM technology. Is performance an issue for Metasafe?

Thurner: Performance is always an issue when the number of concurrent users and the size and complexity of the data grow. The system works quite well for medium-size systems with a few hundred types, a few million instances and a few GBs. The performance depends on the translation of the logical requests into physical access commands and on the execution of the physical access to the persistence layer. Metasafe uses a very limited functionality of an RDBMS (currently SQL Server, Derby, Oracle) for persistence. Locking, transactions and multi-user management are handled by Metasafe; the locking tables are kept in memory. After a commit it writes all changes in one burst to the database. We could of course use an in-memory DBMS to gain performance; e.g. VoltDB, with its direct transaction access, could be integrated easily and would certainly lead to superior performance.
We also have another kind of performance in mind – the user performance. For many applications the number of milliseconds to execute a transaction is less important than the ability to quickly create or change a database and to create and launch queries in a matter of minutes. Metasafe is especially helpful for this kind of application.

Q11. What problems is Metasafe designed to solve?

Thurner: Metasafe is designed as a generic data platform for medium-sized (XX GB) model-driven applications. The primary purpose is support for applications with large, complex and volatile data structures, such as tools, models, catalogs or process managers. Metasafe could be used to replace some legacy repositories.
Metasafe is certainly the best data platform (repository) for the construction of an integrated development environment. Metasafe can also serve as DBMS for business applications.
We are also evaluating the possibility of using the Metasafe DBMS as a data platform for portable devices such as phones and tablet computers: this could be a real killer application for application developers.

Q12. How do you position Metasafe in the market?

Thurner: I had the vision of an entity-relationship-based database system as a future data platform, and decided to develop Metasafe to a really useful level without the pressure of the market (namely the first-time users). Now we have our product at the necessary level of quality and we are planning the next steps. It could be the “open source approach” for a limited version, or integration into a larger organization.
We have a number of applications and POCs, but we have no substantial customer base yet, which would require an adequate support and sales organization. But we have no intention of converting a successful development setup into a mediocre service and sales organization. We are not under time pressure and are looking at a number of possibilities.

Q13. How can the developers community test your system?

Thurner: We provide an evaluation version upon request.

—————————-
Related Posts

Do we still have an impedance mismatch problem? – Interview with José A. Blakeley and Rowan Miller. by Roberto V. Zicari on May 21, 2012

Resources

“Implementing the Executable Conceptual Model (ECM)” (download as .pdf),
by Dr. Reinhold Thurner, Metasafe.

ODBMS.org Free Resources on:
Entity Framework (EF) Resources
ORM Technology
Object-Relational Impedance Mismatch

##

Several new resources published in ODBMS.ORG
Fri, 11 Sep 2009

I have published several new resources in ODBMS.ORG:

– A new User Report, (number 32/09), by Dr. Andreas Geppert at Credit Suisse, Switzerland.
Andreas Geppert is a Platform Architect. Geppert tells us that the strategy of his bank is to buy IT infrastructure whenever possible, and to avoid developing it in-house. When asked if they had an “impedance mismatch” problem in the bank, Geppert replied: “We certainly have an impedance mismatch problem, in particular as we are increasingly developing new applications in Java accessing relational databases such as Oracle and DB2.”
You can read the full report in the Object Databases – User Reports Section.

– A Link to download Databeans.
Databeans is an object oriented persistence framework for Java, available under GPL. The link is available here.

– A Link to download ConceptBase.
ConceptBase is a multi-user deductive and object-oriented database system for meta modeling and method engineering, developed by Tilburg University. It is freely available under a FreeBSD-style license. The link is available here.

– Databeans Tutorial for Java version 2.0.
You can download the tutorial (PDF) in the Object Databases – Tutorials Section.

– Slides of a course based on ConceptBase, developed by Tilburg University.
The slides are under a permissive Creative Commons license, and are available here.

I would also like to welcome a new expert, Manfred Jeusfeld, who has just joined ODBMS.ORG’s panel of experts.

Hope you’ll find the resources useful. And as always, all resources in ODBMS.ORG are freely accessible!

RVZ

O/R Impedance Mismatch? Users Speak Up! Fourth Series of User Reports published.
Tue, 13 Jan 2009

I have published the fourth series of user reports on using technologies for storing and handling persistent objects.

The fourth series includes 6 new user reports from the following users:

-Martin F. Kraft
-Serena Pizzi at Banca Fideuram
-Dan Schutzer at FSTC
-Peter Fallon at Castle Software Australia
-Benny Schaich-Lebek at SAP
-Stephan Kiemle at German Aerospace Center

The 6 new reports and the complete series of user reports are available for free download.

I have also published a new paper by ODBMS.ORG panel member William Cook on Interprocedural Query Extraction for Transparent Persistence.
Transparent Persistence promises to integrate programming languages and databases by allowing programs to access persistent data with the same ease as non-persistent data. The work is focused on programs written in the current version of Java, without language changes. However, the techniques developed by Cook and his colleagues may also be of value in conjunction with object-oriented languages extended with high-level query syntax.

O/R Impedance Mismatch? Users Speak Up! Third Series of User Reports published.
Thu, 23 Oct 2008

I have published the third series of user reports on using technologies for storing and handling persistent objects.
I have defined “users” in a very broad sense, including: CTOs, Technical Directors, Software Architects, Consultants, Developers, and Researchers.

The third series includes 7 new user reports from the following users:

– Peter Train, Architect, Standard Bank Group Limited, South Africa.
– Biren Gandhi, IT Architect and Technical Consultant, IBM Global Business Services, Germany.
– Sven Pecher, Senior Consultant, IBM Global Business Services, Germany.
– Frank Stuch, Managing Consultant, IBM Global Business Services, Germany.
– Hiroshi Miyazaki, Software Architect, Fujitsu, Japan.
– Robert Huber, Managing Director, 7r gmbh, Switzerland.
– Thomas Amberg, Software Engineer, Oberon microsystems, Switzerland.

I asked each user a number of identical questions, among them what experience they have in using the various options available for persistence for new projects, and what lessons they have learned in using such solutions.

“Some of our newer systems have been developed in-house using an object oriented paradigm. Most (if not all) of these use Relational Database systems to store data and the “impedance mismatch” problem does apply”, says Peter Train from Standard Bank.

The lessons learned using Object Relational mapping tools confirm the complexity of such technologies.

Peter Train explains: “The most common problems that we have experienced with object Relational mapping tools are:
i) The effort required to define mappings between the object and the relational models; ii) Difficulty in understanding how the mapping will be implemented at runtime and how this might impact performance and memory utilization. In some cases, a great deal of effort is spent tweaking configurations to achieve satisfactory performance.”

Frank Stuch from IBM Global Business Services has used Hibernate, EJB 2 and EJB 3 Entity Beans in several projects.
Talking about his experience with such tools he says: “EJB 2 is too heavy weight and outdated by EJB 3. EJB 3 is not supported well by development environments like Rational Application Developer and not mature enough. In general all of these solutions give the developer 90% of the comfort of an OODBMS with well established RDBMS.
The problem is that this comfort needs a good understanding of the impedance mismatch and the consequences on performance (e.g. “select n+1 problem”). Many junior developers don’t understand the impact and therefore the performance of the generated/created data queries are often very poor. Senior developers can work very efficient with e.g. Hibernate.”
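For readers unfamiliar with the “select n+1 problem” mentioned above, here is a minimal JPA-flavoured sketch (entity names invented): loading N orders and touching each one’s lazily fetched lines issues one query for the orders plus N further queries, where a single fetch join would do.

    import javax.persistence.*;
    import java.util.List;

    @Entity
    class PurchaseOrder {
        @Id long id;
        @OneToMany(fetch = FetchType.LAZY)   // lazy is the usual default
        List<OrderLine> lines;
    }

    @Entity
    class OrderLine { @Id long id; }

    class SelectNPlusOneDemo {
        static void demo(EntityManager em) {
            // 1 query for the orders...
            List<PurchaseOrder> orders = em.createQuery(
                    "select o from PurchaseOrder o", PurchaseOrder.class)
                    .getResultList();
            // ...plus N queries, one per order, as each lazy collection loads.
            for (PurchaseOrder o : orders) {
                o.lines.size();
            }
            // The fix: fetch orders and lines together in a single query.
            List<PurchaseOrder> fixed = em.createQuery(
                    "select distinct o from PurchaseOrder o join fetch o.lines",
                    PurchaseOrder.class).getResultList();
        }
    }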

In some special cases custom solutions have been built, like in the case of Thomas Amberg who works in mobile and embedded software and explains “We use a custom object persistence solution based on sequential serialized update operations appended to a binary file”.
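The approach Amberg describes can be sketched as a toy append-only log (assuming Java serialization and a length-prefixed record format; his actual format is not specified):

    import java.io.*;
    import java.util.function.Consumer;

    // Toy append-only persistence: each update operation is serialized and
    // appended to a binary log; the current state is rebuilt by replaying
    // the operations in order.
    public class UpdateLog {
        private final File file;

        public UpdateLog(File file) { this.file = file; }

        public synchronized void append(Serializable op) throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(op);
            }
            try (DataOutputStream out = new DataOutputStream(
                    new FileOutputStream(file, true))) {  // append mode
                byte[] record = buf.toByteArray();
                out.writeInt(record.length);              // length prefix
                out.write(record);
            }
        }

        public void replay(Consumer<Object> apply)
                throws IOException, ClassNotFoundException {
            try (DataInputStream in = new DataInputStream(
                    new FileInputStream(file))) {
                while (in.available() > 0) {
                    byte[] record = new byte[in.readInt()];
                    in.readFully(record);
                    apply.accept(new ObjectInputStream(
                            new ByteArrayInputStream(record)).readObject());
                }
            }
        }
    }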

The 7 new reports and the complete series of user reports are available for free download.

I plan to continue to publish user reports on a regular basis.

More Impedance mismatch: Cloud Computing
Sat, 04 Oct 2008

I noticed a news item on an additional source of impedance mismatch: Cloud Computing…

Geir Magnusson, vice president of engineering and co-founder of 10gen, presented a talk at the Web 2.0 Expo conference: “The Sequel to SQL: Why You Won’t Find Your RDBMS in the Clouds.”

Magnusson said “an RDBMS is what you need, but not in the cloud.”
Magnusson seems to support O/R mapping: “O/R mapping blends the power of an RDBMS with the programming simplicity of an ODBMS [object database management system],” Magnusson said, noting that there is support for O/R mapping in Java, Python, Ruby, .NET and Groovy. “O/R mapping is everywhere.”

However, the series of interviews with users indicates that O/R mapping is only one way (and not the simplest one) of getting around the impedance mismatch between object-oriented languages and data stored in a relational system.
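To make that mismatch concrete: the same “customer with orders” data is a navigable object graph in the program but a pair of flat tables in the database, and every read and write has to translate between the two. A minimal illustration (all names invented):

    import java.util.List;

    // In the program: a navigable object graph.
    class Customer {
        long id;
        String name;
        List<Order> orders;   // follow a reference to reach the orders
    }

    class Order {
        long id;
        double total;
    }

    // In the database: flat rows, joined by foreign key at query time.
    //   CREATE TABLE customer (id BIGINT PRIMARY KEY, name VARCHAR(100));
    //   CREATE TABLE orders   (id BIGINT PRIMARY KEY, total DECIMAL,
    //                          customer_id BIGINT REFERENCES customer(id));
    //
    // Every read/write must translate between rows and objects; that
    // translation layer is where the impedance mismatch lives.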

Do you have an impedance mismatch problem? Users speak up! Second series of user reports published.
Thu, 04 Sep 2008

I have started a new series of interviews with users of technologies for storing and handling persistent objects, around the globe.

6 additional user reports (12-17/08) have been published, from the following users:

  • Ajay Deshpande, Persistent
  • Horst Braeuner, City of Schwaebisch Hall
  • Tore Risch, Uppsala University
  • Michael Blaha, OMT Associates
  • Stefan Keller, HSR Rapperswil
  • Mohammed Zaki, Rensselaer

The complete initial series of user reports is available as always for free download.

Here I define “users” in a very broad sense, including: CTOs, Technical Directors, Software Architects, Consultants, Developers, Researchers.

I have asked 5 questions:

Q1. Please explain briefly what are your application domains and your role in the enterprise.

Q2. When the data models used to persistently store data (whether file systems or database management systems) and the data models used to write programs against the data (C++, Smalltalk, Visual Basic, Java, C#) are different, this is referred to as the “impedance mismatch” problem. Do you have an “impedance mismatch” problem?

Q3. What solution(s) do you use for storing and managing persistence objects? What experience do you have in using the various options available for persistence for new projects? What are the lessons learned in using such solution(s)?

Q4. Do you believe that Object Database systems are a suitable solution to the “object persistence” problem? If yes why? If not, why?

Q5. What would you wish as new research/development in the area of Object Persistence in the next 12-24 months?

More information here.

Do you have an impedance mismatch problem? Users speak up!
Tue, 01 Jul 2008

I have started a new series of interviews with users of technologies for storing and handling persistent objects, around the globe.

Here I define “users” in a very broad sense, including: CTOs, Technical Directors, Software Architects, Consultants, Developers, Researchers.

I have asked 5 questions:

Q1. Please explain briefly what are your application domains and your role in the enterprise.

Q2. When the data models used to persistently store data (whether file systems or database management systems) and the data models used to write programs against the data (C++, Smalltalk, Visual Basic, Java, C#) are different, this is referred to as the “impedance mismatch” problem. Do you have an “impedance mismatch” problem?

Q3. What solution(s) do you use for storing and managing persistence objects? What experience do you have in using the various options available for persistence for new projects? What are the lessons learned in using such solution(s)?

Q4. Do you believe that Object Database systems are a suitable solution to the “object persistence” problem? If yes why? If not, why?

Q5. What would you wish as new research/development in the area of Object Persistence in the next 12-24 months?

The first series of interviews I published in ODBMS.ORG include:

ODBMS.ORG User Report No. 1/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Industry
Domain: Automation System Solutions for Postal Processes.
User Name: Gerd Klevesaat
Title: Software Architect
Organization: – Siemens AG- Industry Sector, Germany

ODBMS.ORG User Report No.2/08
Editor Roberto V. Zicari- www.odbms.org
July 2008.
Category: Academia
Domain: Research/Education
User Name: Pieter van Zyl
Title: Researcher
Organization: Meraka Institute of South Africa’s Council for
Scientific and Industrial Research (CSIR) and University of
Pretoria, South Africa.

ODBMS.ORG User Report No.3/08
Editor Roberto V. Zicari- www.odbms.org
July 2008.
Category: Academia
Domain: Research/Education
User Name: Philippe Roose
Title: Associate Professor / Researcher
Organization: LIUPPA/IUT de Bayonne, France.

ODBMS.ORG User Report No.4/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Industry
Domain: Various
User Name: William W. Westlake
Title: Principal Systems Engineer
Organization: Science Applications International Corporation, USA

ODBMS.ORG User Report No.5/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Academia
Domain: Research/Education
User Name: Stefan Edlich
Title: Professor
Organization: TFH-Berlin, Germany

ODBMS.ORG User Report No. 6/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Industry
Domain: Various.
User Name: Udayan Banerjee
Title: CTO
Organization: NIIT Technologies, India.

ODBMS.ORG User Report No. 7/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Industry
Domain: Robotics.
User Name: NISHIO Shuichi
Title: Senior Researcher
Organization: JARA/ATR, Japan.

ODBMS.ORG User Report No.8/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Industry
Domain: Financial Services
User Name: John Davies
Title: Technical Director
Organization: Iona, UK

ODBMS.ORG User Report No.9/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Industry
Domain: Various
User Name: Scott W. Ambler
Title: Practice Leader Agile Development
Organization: IBM Rational, Canada

ODBMS.ORG User Report No. 10/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
June 2008.
Category: Industry
Domain: Defense/intelligence area.
User Name: Mike Card
Title: Principal engineer
Organization: Syracuse Research Corporation (SRC), USA

ODBMS.ORG User Report No. 11/08
Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
July 2008.
Category: Industry
Domain: Finance
User Name: Richard Ahrens
Title: Director
Organization: Merrill Lynch, US

All user reports are available for free download (PDF)

Hope you’ll find them interesting. More to come… I plan to publish user reports in ODBMS.ORG on a regular basis.

RVZ

O/R mismatch: What is the Problem?
August 28, 2007

There has been quite a discussion recently on the so called “O/R mismatch”.
This is a quite interesting discussion. The bottom line is that, after so many years, object persistence still does not seem to have a fully adequate solution.
This is awkward; bringing programming languages and databases together still seems a rather difficult task!

There are a number of interesting resources I have recently published on this subject on ODBMS.ORG.

In cooperation with FranklinsNet, ODBMS.ORG has published the transcript of the panel discussion “ORM Smackdown” between Ted Neward and Oren “Ayende” Eini on different viewpoints on Object-Relational Mapping (ORM) systems.
It is interesting reading. Pls check: ORM Smackdown

I have also published Ted Neward’s follow-on essay discussing solutions to the problems of Object/Relational Mapping, titled “Avoiding the Quagmire”.
This new essay is a follow-on to Neward’s “The Vietnam of Computer Science”, which compared the inherent problems of object/relational mapping to the quagmire of the Vietnam War. The initial “Vietnam” essay was first published in 2006 and was widely discussed in the industry.

“Avoiding the Quagmire” discusses the impact of choosing to integrate object concepts into the database as opposed to using relational concepts or object/relational mappers.
Neward states that while using an object oriented database management system (ODBMS) will not completely eliminate all of the problems described in the initial “Vietnam” essay, it does address some of the more egregious ones. An ODBMS thus frequently gives developers a better chance of avoiding the quagmire and allows them to focus more clearly on the problem at hand.

Pls check: Avoiding the Quagmire

I published a copy of Ted Neward’s “The Vietnam of Computer Science”.
Neward argues that the O/R mismatch is a quagmire where current approaches including object-relational mappers (ORMs) are subject to decreasing marginal returns. He lists the abandonment of objects (as a programming paradigm) or of relational data structures (as a database paradigm) as the only wholehearted solutions, while living with the pain or full integration of ORMs into languages or databases are other approaches.

I personally do not like the analogy with Vietnam… but the article makes a number of interesting points. As you may imagine, the article has received mixed feedback from readers.

Here is the reference: The Vietnam of Computer Science
