Monday, February 8, 2010

"document stores", "nosql databases", ODBMSs.

I was asked to compare and contrast ODBMSs with the new NoSQL datastores out there, so I have put the question to several people over the last few weeks.

Here are some resources I recently published on this:
- On NoSQL technologies - Part II. Roberto V. Zicari (Editor) - February 10, 2010.

- The Object Database Language "Galileo" - 25 years later. Renzo Orsini - February 10, 2010.

- Relational Databases, Object Databases, Key-Value Stores, Document Stores, and Extensible Record Stores: A Comparison. Rick Cattell - January 11, 2010.

- On NoSQL technologies - Part I. R. Zicari (Editor) et al. - December 14, 2009.

Below, you'll find a few more selected replies. I do not pretend that this list is complete, but hopefully it will help trigger a discussion. By the way, the list keeps growing, so check the blog regularly for new updates.

RVZ

Leon Guzenda (Objectivity):
"I've always been amused by the many relational database enthusiasts who continue to insist that the technology is adequate for all purposes. It is a great and flexible technology and it has many applications. However, you don't have to look far before you come across instances where developers have chosen not to use it. Why is it that Microsoft Word doesn't break its documents down into chapter, section, paragraph, sentence and letter tables or columns and store them in SQL Server? It clearly could, but manipulating the data structures would be tiresome. Storing them in an object database, on the other hand, is efficient and flexible. Reading a document from disk into memory would probably only involve a few method invocations and I/Os, rather than the dozens of index, join and read operations that SQL would need. You could apply the same argument to Microsoft Excel.

Wikipedia lists many dozens of Windows file types. I'm sure that one could easily find hundreds of them. I'm also sure that the developers of some of those file formats might have found it convenient to store their formatted data in a relational database, but they clearly didn't. I'd also hazard a guess that most of them could be more easily stored in an ODBMS, particularly as that's why ODBMSs were developed in the first place - to overcome the limitations of the relational model.

If you look at the various types of data storage and management that are being clumped under the NoSQL banner you soon find that some of them are using subsets of relational technology and others are simply distributed file systems. I don't think that there's anything wrong with that. If you don't need to do concurrent updates or ad hoc queries on content then why pay for the overheads of locking, transaction journals and indices? Almost all of the files that users store in online repositories such as Picasa, Flickr, Youtube and Facebook are written once, read many times and seldom, if ever, updated. There may be advantages to indexing the files, but you certainly wouldn't want to break them down into rows and columns and there aren't any significant advantages in storing them as BLOBs in an RDBMS. Files work just fine. However, you do need a highly scalable and efficient way to put them somewhere and find them again, which is where sharding can be useful. It's not a new technique, though some people seem to think that it is. Objectivity/DB, being a distributed, federated ODBMS, has used sharding since 1988 to split its federations into convenient logical and physical chunks, but still provide a single logical view of connected object graphs.

The people who don't like the NoSQL paradigm have some good points though. Throwing away a lot of the lessons learned from the building and refinement of relational database technology would be a bad thing to do. It's certainly not worth the time and effort of rediscovering the problems and reinventing solutions to them.

In 1976, International Computers Limited (ICL) released a hardware and software technology called the Content Addressed File System. It used special disks and microprocessors to drop records into any convenient location on the disks, with some minimal local indexing. It found them again by examining their contents. The microprocessors handled the predicate operations and they could often handle many concurrent queries in a single rotation of each disk. It wasn't a great marketing success, as ICL targeted the mainframe datastores rather than the workstation or emerging PC market. However, it did solve some pretty tricky problems at the time and the idea is still valid.

The XAM file system protocol makes it possible to store file metadata with each file and to have the storage infrastructure conduct searches for files that match predicates and values supplied by the application that needs them. Like CAFS, it would provide an ideal repository for the kinds of data that Youtube and the other sites I mentioned previously need to store. You could use SQL, SPARQL, LINQ or any other query language to find things, but you wouldn't be using relational storage structures for the actual data. Somebody at an XLDB Workshop mentioned that the bulk of their data processing consists of scanning emails and attached files for viruses and other malware. They also spend most of their RDBMS cycles on scanning tables for marketing reasons, rather than executing queries. You could argue that content addressable hardware would be a perfect solution for them.

So, I regard the NoSQL category as a mixed bag and I don't see it as being a threat to relational or object DBMSs. As we all know, each has its place in our technology toolkit. Much of the need for NoSQL variants may go away if content addressable hardware steps up to the challenge. However, I do wish that the people who are spending time developing these capabilities would look around a bit more before they start reinventing the wheel."
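To make the sharding idea in Guzenda's reply concrete, here is a minimal Python sketch of hash-based sharding for write-once files. It is only an illustration under stated assumptions: the `ShardedStore` class, the `shard_for` helper and the in-memory dictionaries standing in for servers are all hypothetical, and nothing here describes how Objectivity/DB or any other product actually partitions data.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Hypothetical write-once file store; each dict stands in for one server."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, blob: bytes) -> int:
        # The key alone decides where the file lives, so no central index is needed.
        idx = shard_for(key, len(self.shards))
        self.shards[idx][key] = blob
        return idx

    def get(self, key: str) -> bytes:
        # Recompute the same shard index to find the file again.
        return self.shards[shard_for(key, len(self.shards))][key]


store = ShardedStore(num_shards=4)
placed_on = store.put("photos/cat.jpg", b"...jpeg bytes...")
```

The caller sees a single logical store, while each file physically lands on exactly one shard chosen by its key.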

Hamid Pirahesh (IBM Fellow):
"There is heavy activity in the Bay Area in NoSQL. There is VC money going into those companies, and they do extensive blogging.
There are also XML DBs, which go beyond relational. Hybridization with relational turned out to be very useful. For example, DB2 has a huge investment in XML; it is extensively published and has also been made commercial. MonetDB did substantial work in that area early on as well."

Borislav Iordanov (HyperGraphDB):
"I think we've just realized that different representations are suitable for different domains and that it is possible to persist those representations natively without having to ultimately translate everything into a rigid schema. A very important driver of those NoSQL DBs is their dynamic nature. I think many people like them for the same reason they like dynamic languages (no static typing, no long compile times etc.): change is easier, prototyping is faster, and getting it into production is not that risky. The problem with them of course is lack of integrity & consistency guarantees, something which ODBMSs still try to provide at some level while remaining more flexible and offering richer structures than RDBMSs. Another problem with RDBMSs is SQL the language itself, which is very tightly coupled to the storage meta-model. Here again ODBMSs do better, through standardization but also more openness and flexibility, and come perhaps as a middle ground between anarchistic NoSQL and totalitarian SQL :)"

Michael Stonebraker (MIT):
"Greene's reply is perfectly reasonable. I think the "one size does not fit all" mantra -- which I have been espousing for some time -- is a good way to think about things. After all, the NoSQL folks are themselves pretty diverse."

Mårten Gustaf Mickos (previously CEO of MySQL AB):
"I think Kaj had a great response. (Link to Kaj's response). Generally, in the early days of a new term, it doesn't in my mind make sense to try to define it exactly or narrowly. It's only a term, and it will take years before we know how significant it is and whether it is justified as a category of its own. For instance, it took a long time for Web 2.0 to become an acknowledged term with a useful and reasonably well defined meaning."

Dirk Bartels (Versant):
"The "NoSQL" movement is a reflection of different and entirely new application requirements that are not orthogonal to SQL and relational databases. What makes this discussion really interesting in my opinion is that it has its roots with application developers.

The last thing an application developer wants to do is spend (or should I say waste) time implementing a database system. Application developers want to concentrate on their domain, get their apps out and compete in their markets. Today, pretty much all data persistence requirements are being implemented with SQL, in particular with several open source choices such as MySQL and PostgreSQL readily available at no cost. SQL is what developers are learning in college, and therefore it is today often the only database technology considered.

Now having these application developers stepping out of their own comfort zone, tossing the SQL databases and spending their precious resources inventing their own data management layer should tell us something.

From time to time, advances in compute infrastructure are disruptive. The PC, Client / Server, Internet, and lately Cloud Computing have been or will be catalysts for significant changes. Is it possible that "No SQL" means that certain new types of applications are simply not a good fit for the relational data model provided via SQL?

When taking a closer look, these applications still require typical database functionality also found in a SQL database, for example some query support, long term and reliable data storage, and scalability, just to name a few. Nevertheless, there must be issues with SQL to make the pain just too big to stick with SQL. I haven't done a lot of research on this subject, but suspect that the issues revolve around the data model (too rigid), the transaction processing model (too linear), the scalability model (horizontal scale out solutions too expensive), the programming model (too cumbersome) and probably more.

I remember when IT didn't get the PC, the Graphical User Interface, the Internet etc. I am not surprised that many traditional IT people are not getting it today. I expect the No SQL movement to gain momentum and to continue to evolve rapidly and likely in several directions. These days, applications and their data management requirements are significantly more complex. In my opinion it is just a matter of time before developers realize that the traditional relational model, invented for high volume transaction processing, may not be the best choice for application domains a SQL database is simply not designed for.

Is this a call for object databases, a long overlooked database technology that has matured over the past 20-some years? I think it's possible; the future will tell. No SQL application developers should at least give object databases a good look; it may save them a lot of time and headaches down the road."

Dwight Merriman (CEO of 10gen):
"A comparison of document-oriented and object-oriented databases is fascinating as they are philosophically more different than one might at first expect.
In both we have a somewhat standardized document/object representation -- typically JSON in currently popular document-oriented stores, perhaps ODL in ODBMS. The nice thing with JSON is that at least for web developers, JSON is already a technology they use and are familiar with. We are not adding something new for the web developer to learn.
In a document store, we really are thinking of "documents", not objects. Objects have methods, predefined schema, inheritance hierarchies. These are not present in a document database; code is not part of the database.
While some relationships between documents may exist, pointers between documents are deemphasized.
The document store does not persist "graphs" of objects -- it is not a graph database. (Graph databases/stores are another new NoSQL category - what is the difference between a graph database and an ODBMS? An interesting question.)
Schema design is important in document databases. One doesn't think in terms of "persist what I work with in RAM from my code". We still define a schema. This schema may vary from the internal "code schema" of the application.
For example in the document-oriented database MongoDB, we have collections (analogous to a table) of JSON documents, and explicit declaration of indexes on specific fields for the collection.
We think this approach has some merits -- a decoupling of data and code. Code tends to change fast. Embedding is an important concept in document stores. It is much more common to nest data within documents than have references between documents. Why the deemphasis of relationships? A couple of reasons.
First, with arbitrary graphs of objects, it is difficult to process the graph from a client without many client/server turnarounds. Thus, one might run code server-side. A goal with document databases is to maintain the client/server paradigm and keep code biased to the client (albeit with some exceptions such as map/reduce).
Second, a key goal in the "NoSQL" space is horizontal scalability. Arbitrary graphs of objects would be difficult to partition among servers in a guaranteed performant manner."
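Merriman's point about embedding versus references, and the client/server turnarounds that references cost, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: `CountingServer` and its `fetch` method are hypothetical stand-ins for a document server, not MongoDB's API, and the documents are plain dictionaries.

```python
class CountingServer:
    """Hypothetical document server that counts client/server round trips."""

    def __init__(self, collections):
        self.collections = collections  # collection name -> {doc id -> document}
        self.round_trips = 0

    def fetch(self, collection, doc_id):
        self.round_trips += 1  # every fetch is one request/response pair
        return self.collections[collection][doc_id]


def comments_embedded(server, post_id):
    # Embedded design: the comments are nested inside the post document.
    return server.fetch("posts_embedded", post_id)["comments"]


def comments_referenced(server, post_id):
    # Referenced design: the post holds ids, each comment is fetched separately.
    post = server.fetch("posts_referenced", post_id)
    return [server.fetch("comments", cid) for cid in post["comment_ids"]]


data = {
    "posts_embedded": {
        "post-1": {
            "title": "NoSQL and ODBMS",
            "comments": [
                {"author": "ann", "text": "Nice overview."},
                {"author": "bob", "text": "What about graphs?"},
            ],
        }
    },
    "posts_referenced": {
        "post-1": {"title": "NoSQL and ODBMS", "comment_ids": ["c1", "c2"]}
    },
    "comments": {
        "c1": {"author": "ann", "text": "Nice overview."},
        "c2": {"author": "bob", "text": "What about graphs?"},
    },
}

embedded_server = CountingServer(data)
embedded = comments_embedded(embedded_server, "post-1")

referenced_server = CountingServer(data)
referenced = comments_referenced(referenced_server, "post-1")

print(embedded_server.round_trips, referenced_server.round_trips)  # prints: 1 3
```

Both designs return the same comments, but the referenced one needs one round trip per followed reference, which is the cost that grows with deeper object graphs.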

Eric Falsken (db4o):
"NoSQL database (like Google's BigTable data behind their Gears API) is an awkward sort of "almost-sql" or "sql-like".
But it ends up being a columnar database. What you call a "document store" is a row-based database, where records are stored together in a single clump called the row. By eliminating strongly-typed columns, they can speed up I/O by many factors (data written to one place rather than many places) just in the insert/select operation. By intelligent use of indexes, they should be able to achieve some astounding benchmarks. The complexity of object relationships is their shared drawback. Being unable to handle things like inheritance and polymorphism is what stops them from becoming object databases. You can think of db4o as a "document-oriented database" which has been extended to support object-oriented principles (each level of inheritance is a "document" that all gets related together)."

Peter Neubauer (Neo Technology):
"We have not modeled an ODBMS on Neo4j yet, but if you look at e.g. the Ruby bindings, it fits very naturally into dynamic language paradigms, mapping the domain models almost directly onto nodes and relationships. We have written a couple of blogs on the topic, the latest being Emil's classification of the NoSQL space, and there are approaches to turn Neo4j into something that resembles an OODBMS:
1. JRuby bindings, hiding Neo4j as the backend largely from the code, but still exposing the Traverser APIs beside the normal Ruby collections etc. for deep graph traversals.
2. Jo4neo by Taylor Cowan, which persists objects via Java annotations.
3. Neo4j traversers, and Gremlin, for deep and fast graph traversals (really not OODBMS-like, but VERY useful in today's data sets).
It would be very interesting to have more conversation on these topics!"

Jan Lehnardt (CouchDB):
"For me, NoSQL is about choice. OODBs give users a choice. By that definition though, Excel is a NoSQL storage solution and I wouldn't support that idea :) I think, as usual, common sense needs to be applied. Prescriptive categorisation rarely helps anyone but those who are in the business of categorising."

Miguel-Angel Sicilia (University of Alcalá):
"The NoSQL movement, according to Wikipedia today, promotes "non-relational data stores that do not need a fixed schema". I do not believe ODBMSs really fit with that view on data management. Also, other aspects of the NoSQL "philosophy" place ODBMSs far from it. However, NoSQL focuses on several problems of traditional relational databases for which ODBMSs can be a good option, so that they can be considered to be "cousins" in some sense. I do not see NoSQL and ODBMS as overlapping, but as complementary solutions for non-traditional data management problems."

Manfred Jeusfeld (Tilburg University):
"I am a bit frightened by this development. Basically, people step back to the time prior to database systems. I heard similar ideas at a Dagstuhl seminar from IT experts of STATOIL (the Norwegian oil giant). They experienced that non-database data stores are much faster, and they appear to be willing to dump ACID for increased performance of some of their systems.
This is typical for a programmer's attitude. They want the data storage & access to be optimized for their application and do not care too much about interoperability and reliability. If every data item can be referenced by a 64-bit memory address, why would we need a query language to access the data? Following memory pointers is certainly much faster. OODBs can serve as a bridge between the pure programming view and the database view."

Peter Norvig (Director of Research at Google Inc.):
"You should probably use what works for you and not worry about what people call it."

Floyd Marinescu (Chief Editor, InfoQ):
"I think that web development has become so old and mature now that people are discovering that certain types of applications are better off with different solutions than your standard-doctrine 3-tier system with a SQL database. Plus the whole web services phenomenon has opened people's minds to the notion of infrastructure as a service and that is driving this as well. This is my anthropological view of things. ;)"

Matthew Barker (Versant Corporation):
"As Robert (Greene) mentioned, when everyone thought "one size fits all", SQL evolved from a "simple query language" to a "do-it-all" tool including creating, modifying, and reorganizing data, not just querying. Many object databases such as Versant have an "SQL-like" query language but the capabilities are limited to actually querying the database, not updates, creates, deletes, etc. When you use SQL to modify data, you break the OO paradigm of safety and encapsulation; in a large application, it very easily becomes a monster that is difficult if not impossible to control. If we rein it in and use SQL for its original purpose, querying data, then SQL can fit in nicely with object databases - but the monster it has become does not fit into object database technologies."

Martin F. Kraft (Software Architect):
"NoSQL is a very interesting topic, though a standardized API like the W3C proposal would be challenging to adopt, and even more challenging to make outperform native legacy OODBMS (non-SQL) queries.
I see the need to improve SQL performance in object-oriented use as with J2EE and understand that some of the SQL performance implementations for OO like GemFire are doing great, but they don't solve the underlying root cause: the SQL overhead in non-SQL data queries like object traversal and key/value lookup.
So far I would rather use RDBMSs and OODBMSs where they perform best; as Mr. Greene said, "one size does not fit all".
Some object databases provide (slow) SQL interfaces, and NoSQL should not mean no-QL."

Kenneth Rugg (Progress):
"I was at the "Boston Big Data Summit" a few weeks ago and the NoSQL movement came up as a topic. The moderator, Curt Monash, whose blog is on dbms2.com, had a pretty funny quote on the topic. He said "The NoSQL movement is a lot like the Ron Paul campaign - it consists of people who are dissatisfied with the status quo, whose dissatisfaction has a lot to do with insufficient liberty and/or excessive expenditure, and who otherwise don't have a whole lot in common with each other.""

Andy Riebs (Software Engineer at HP):
"Interesting blog! (Stonebraker sounds a bit too much like he's defending his baby.) Greene's comments are sensible. Following through on the "one size doesn't fit all" theme, just how many simple databases are best implemented as flat text files? The "NoSQL" discussions are reminiscent of the old "RISC vs. CISC" arguments. While people usually understood the notion of a simpler instruction set, no one noticed that pipelining and huge register sets were introduced in the same package. Now you can find all those elements in most architectures.
In the same sense, one presumes that many of the good ideas that survive the "NoSQL" debates will, in fact, end up in relational databases. Will that make the resulting products any more or less relational? Some problems will best be resolved with non-relational, non-SQL tools. Best bet for a NoSQL technology that will survive? Harvesting meaningful data from the log files of data centers with 20,000 servers! With a proper MapReduce implementation, it will be a thousand times more effective to distribute the processing to the source of the data, and return only pre-processed results to the consuming application, rather than hundreds of gigabytes of raw data. Of course, the real winner will be the one who can implement SQL on top of a MapReduce engine!"
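The log-harvesting scenario Riebs describes, where processing runs at the source and only pre-processed results travel, can be sketched in a few lines of Python. This is a single-process illustration, not a distributed MapReduce engine; the log format (`<path> <status>`), the server names and the function names are all assumptions made for the example.

```python
from collections import Counter
from functools import reduce


def map_server_log(lines):
    """Map step, meant to run where the log lives: count lines per status code."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:  # hypothetical format: "<path> <status>"
            counts[parts[1]] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two small partial results."""
    return left + right


# Stand-in for log files that stay on their own servers.
server_logs = {
    "server-1": ["/index 200", "/login 500", "/index 200"],
    "server-2": ["/index 200", "/cart 404"],
}

# Only the small per-server Counters would cross the network, not the raw lines.
partials = [map_server_log(lines) for lines in server_logs.values()]
totals = reduce(reduce_counts, partials, Counter())
```

With the sample data above, `totals` holds three hits for status 200 and one each for 404 and 500.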

Tobias Downer (MckoiDDB):
"Interesting read. I don't think you can really nail down exactly what a 'NoSQL' technology is. It's a rejection of the prevailing popular opinion that's been around for the last decade, which is that a data management system that doesn't support SQL is no way to manage 'real' data, and a forum for advocates of certain scalable data technologies to promote themselves.
It's been successful at getting people interested and excited about data systems outside the world of SQL, which I think is really what the aim was. I wouldn't count on the word being in our lexicon for very long though, or on any products seriously branding themselves using this label, because these systems are likely to eventually come back around and support SQL-like query languages in the future."

Warren Davidson (Objectivity):
"This is of interest to me, and should be to anyone interested in market change. Change is always difficult, but in the database world it seems especially so when you see how often an RDBMS is used even though it might be a technically poor decision. Since change is risky, in order for Objectivity, or Versant or Progress to get people to adopt its technology, you need momentum and corroborating support. The NoSQL movement lends credence to the first notion of change; one size does not fit all. So to have multiple technologies saying the same thing establishes credibility for change to begin, and from there people can start making the right technical choice, which may or may not be ODBMS. But this is OK, the industry needs the market to embrace a very simple concept of 'the right database for the right application'. They don't do that today, but the cloud movement is going this direction. And the NoSQL movement may help everyone.
As they say, a rising tide lifts all boats. :)"

Stefan Edlich (Beuth Hochschule):
"There are some databases which are document-oriented and NoSQL, such as CouchDB and SimpleDB. And there are document-oriented ones which are not NoSQL, such as Lotus or Jackrabbit (a really weird system I think). I think the interesting tool and user group is the NoSQL group, which excludes the latter group (hopefully). So the article you mentioned describes NoSQL with products storing documents as attribute data and not documents as pure byte / data documents (which Jackrabbit does)."

Daniel Weinreb (previously co-founder Object Design):
"They're being used not just as caches but as primary stores of data. There's one called Tokyo Tyrant (and Tokyo Cabinet) that I've heard is particularly good."

William Cook (University of Texas at Austin):
"I think it is important! Facebook is using this style of data store. I'm not sure about the performance implications, but it needs to be studied carefully."

Raphael Jolly (Databeans):
"That a database could be designed without SQL is not a surprise to me since Databeans has no query language and is meant to be queried in Java (native queries). In addition, I'll happily believe that "relational databases are tricky to scale".
However, the subject of extending Databeans with distributed computing capability has been on my mind for a long time and I presently have no idea how it could be done. What is interesting about NoSQL is how they mean to perform queries, i.e. through MapReduce. I don't know whether everything that can be expressed in SQL is amenable to MapReduce (this is probably not the case), but obviously a fair amount of what is done today on the internet is, the killer app being... search engines.
In summary, I tend to agree with this comment by alphadogg: "The main reason this [relegating to niche usage] will happen with the various key-value stores that are now in vogue is that they, like the predecessors, are built to solve a very, very specific issue on high-volume, low-complexity data, which is not the norm. The NoSQL moniker is indicative of a misplaced focus. NO structured query language? Really? We want to go back to chasing pointers and recreating wrappers for access?"
My perception is that even when they literally have no SQL-based queries, object databases are very different from NoSQL technologies as currently understood because, as is clearly explained in your references, there is more to ODBMS than just the query language. Specifically: ACID transaction constraints, which "NoSQL" seem to relax quite a bit.
These constraints are difficult to manage in a distributed setting. One has to consider advanced concurrency control techniques. But with a careful design, nothing seems to prevent a fully structured approach.
In this respect, DHTs are clearly limited compared to classical object databases. Recently, I was reading about such an attempt at a distributed design, yet with "strict" transactions: (download the book "XSTM: Object Replication for Java, .NET & GWT" as .pdf)."

Steve Graves (McObject):
"I thought alphadogg had good comments, although he has a relational/SQL bias."

Jonathan Gennick (Apress):
"It is an interesting discussion. I have heard the term "NoSQL". I did find the comment about relational databases not supporting key/value stores amusing: "...and index key/value based data, another key characteristic of "NoSQL" technology. The schema seems very simple but may be challenging to implement in a relational database because the value type is arbitrary."
In Oracle, one simply needs a table as follows:

CREATE TABLE key_value (
    the_key   NUMBER,
    the_value BLOB
);

There you go! Key/value. How much simpler can you get?"
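Gennick's two-column table can be tried out with a short Python sketch. Note the assumptions: it uses SQLite through Python's standard `sqlite3` module rather than Oracle, it declares the key as a primary key, and it serialises arbitrary values to JSON bytes before storing them in the BLOB column. The `put` and `get` helpers are hypothetical names introduced for the example.

```python
import json
import sqlite3

# In-memory SQLite database standing in for the relational server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE key_value (the_key INTEGER PRIMARY KEY, the_value BLOB)"
)


def put(key: int, value) -> None:
    """Serialise any JSON-compatible value and store it as an opaque BLOB."""
    blob = json.dumps(value).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO key_value (the_key, the_value) VALUES (?, ?)",
        (key, blob),
    )


def get(key: int):
    """Look a value up by key and decode it again on the client side."""
    row = conn.execute(
        "SELECT the_value FROM key_value WHERE the_key = ?", (key,)
    ).fetchone()
    if row is None:
        raise KeyError(key)
    return json.loads(row[0].decode("utf-8"))


put(1, {"name": "Ada", "tags": ["x", "y"]})
put(2, [1, 2, 3])
put(3, "plain string")
```

Storing the value works for any type, as Gennick says; what the table alone does not provide is a way for the database to index or query inside the opaque value, which is the part of the original comment that concerned arbitrary value types.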


Monday, January 4, 2010

"NoSQL technologies" interview with John Clapperton

Happy New Year!

We start the new year with a topic we already covered in 2009: "NoSQL technologies".

This time I asked John Clapperton for his opinion. John Clapperton BSc CEng MBCS CITP is proprietor and author of the 'VOSS' virtual object storage system, which extends Smalltalk with integrated database management, providing transparent access and transaction processing of persistent, versioned, Smalltalk objects. Previously, John has worked on database applications and research at Standard Telephones & Cables, Unilever Research, Acorn Computers and Deductive Systems.

RVZ: John, are object databases "NoSQL" technologies?

John Clapperton: The absence of SQL in "NoSQL" databases is less an a priori choice than a consequence of their simplified schema capability, imposed in the interests of higher performance, being unable to support the full set of SQL language constructs. It does not follow, therefore, that object databases from which SQL has been excluded for the opposite reason, as a language unable to address their more general representational capabilities, should be automatically included in the NoSQL classification.

A person thinking of adopting a NoSQL database will have certain capabilities in mind, so the question is really "Might an object database have 'NoSQL' capabilities?"

These include:

1) Data partition (which is application dependent).
2) Optimistic locking (which helps only if most accesses are read-only).
3) Relaxation of ACID transaction rules by:
   a) Data replication with eventual consistency, and/or
   b) Suppression of transaction logging and/or flushing, and/or
   c) Data storage in fast but volatile memory, sacrificing durability.
4) Fast navigational access to arbitrary data structures.

and in principle, an object database is capable of any or all of these.
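As an illustration of item 2 in the list above, here is a minimal Python sketch of optimistic locking based on version numbers. It is a generic textbook-style example under stated assumptions, not a description of VOSS or any other product; the `VersionedStore` class and `ConflictError` exception are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when the data changed between a client's read and its write."""


class VersionedStore:
    """Hypothetical in-memory store where every value carries a version number."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # No lock is taken; the reader just remembers the version it saw.
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            # Someone else committed first: reject instead of overwriting.
            raise ConflictError(f"{key}: expected v{expected_version}, "
                                f"found v{current_version}")
        self._data[key] = (current_version + 1, value)
        return current_version + 1


store = VersionedStore()
store.write("balance", 0, 100)

# Two clients read the same version without blocking each other.
version_a, _ = store.read("balance")
version_b, _ = store.read("balance")

store.write("balance", version_a, 150)  # first writer succeeds
try:
    store.write("balance", version_b, 70)  # second writer holds a stale version
    second_write_conflicted = False
except ConflictError:
    second_write_conflicted = True
```

Readers never wait, which is why the approach pays off when most accesses are read-only; a writer whose version is stale has to re-read and retry.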

The characteristics of an object database are its ability to manage arbitrarily complex object structures and to represent relationships by explicit named references. These have the potential for better performance by, respectively, reducing the required number of file writes for (de-normalised) data structures, and fast navigation of direct references instead of relational joins. However, against that must be set the cost of serialising arbitrary object structures for durable storage and instantiating the same on retrieval, compared with the simpler handling of pre-defined rows in a relational database.

Normalisation of behaviour, encapsulated in class definitions in language-persistence ODBMSs such as Logic Arts' VOSS for Smalltalk, reducing implicit replication in application programs and queries, may have an advantage in NoSQL applications, but it's not clear to me how significant that might be, given that its benefit is in managing complexity whereas NoSQL applications tend to be simpler.


Thursday, December 17, 2009

"Nonschematic" databases.

Carl Olofson, Research Vice President, Database Management and Data Integration Software Research, at IDC, sent me this note, in which he discusses the term "NoSQL" in relation to object databases.

RVZ

Carl Olofson:
I would shy away from this term. A number of analysts (including myself) consider it a somewhat sloppy term intended to convey a certain spirit of rebellion. It actually derives from the core idea that the so-called "No-SQL" databases do not require schemas, and since most DBMSs are relational, it is simpler to say "NoSQL" than the more obscure "NoDDL".
In fact, an OODBMS does require a schema, and the data structure, which is tied to the application object model, is key to how it operates, and especially to its transparent operational nature. The so-called "NoSQL" database, which I call a "nonschematic" database, is one that requires no schema to be defined before data is loaded. One usually does define a schema afterward, through a process of data discovery and definition. If you know of an OODBMS that can accept undefined data, and allow schema definition after the fact, that could qualify. Otherwise, I would shy away from the term altogether.
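The "load first, define the schema afterward" process Olofson describes can be pictured with a small Python sketch. It is only an illustration: the sample documents and the `discover_schema` function are hypothetical, and real data discovery tools do considerably more than collect field names and value types.

```python
def discover_schema(documents):
    """Infer a schema after the fact: field name -> set of observed type names."""
    schema = {}
    for doc in documents:
        for field, value in doc.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


# Data loaded with no schema declared up front; fields vary per document.
loaded = [
    {"name": "Ada", "age": 36},
    {"name": "Bob", "email": "bob@example.org"},
    {"name": "Cy", "age": "unknown"},
]

schema = discover_schema(loaded)
```

For the sample data, `name` and `email` are always strings, while `age` shows up both as an integer and as a string, the kind of finding a schema defined afterward would have to reconcile.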


Wednesday, December 2, 2009

Are object databases "NoSQL" technologies? Part II

I asked the opinion of another ODBMS vendor on the topic of "NoSQL databases": Luis Ramos, who is Principal Systems Engineer at Progress Software.

RVZ: Luis, how do you position yourself with respect to the so-called "NoSQL" databases?

    Luis Ramos: We view many of the characteristics of the growing "NoSQL" movement as a market reaction to the realities of present day cloud-based data requirements, where ACID properties are not as important as performance, the bulk of the data's schema is not as complex, and the corresponding queries are relatively simple. Gone could be the days of complex relational schemas and the DBAs that are needed to maintain and administer them. Similar phenomena have been seen in other areas. For example in programming languages, the reaction against the very complex and error prone C++ led to the popularity of Java.

    In many respects, object databases can be classified as "NoSQL" technology. They satisfy many of the pivotal characteristics of today's "NoSQL" data stores. Object databases have been around since the late 1980s, in response to the needs and requirements initially of the CAD market. At that time, the CAD practitioners needed an approach to data management that was fundamentally different from that provided by the relational databases. Consequently, a whole new breed of non-relational (object-oriented) databases emerged. Customers from other markets, whose requirements could not be met by SQL databases, followed. Call it the original "NoSQL" movement? We certainly agree with Robert Greene's stipulation that "one size does not fit all."
    An alternative way to put it is "You can put lipstick on a 'relational table' but it's still a 'relational table'".

    The schema-free characteristic that one finds in many "NoSQL" technologies is not entirely new. It was already a requirement of many eCommerce applications developed in the 90s. There are object databases that support this nicely, enabling applications to store, manage, and index key/value based data, another key characteristic of "NoSQL" technology. The schema seems very simple but may be challenging to implement in a relational database because the value type is arbitrary.
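
    To make the point about arbitrary value types concrete, here is a minimal Python sketch of a key/value store that also keeps a simple index. It is only an illustration under stated assumptions: the `KeyValueStore` class and its `put`, `get`, and `keys_of_type` methods are hypothetical names invented for this example and do not represent the API of any product discussed here.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy in-memory key/value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}
        # Secondary index: value type name -> set of keys holding that type.
        self._by_type = defaultdict(set)

    def put(self, key, value):
        # If the key is overwritten, drop its old index entry first.
        if key in self._data:
            self._by_type[type(self._data[key]).__name__].discard(key)
        self._data[key] = value
        self._by_type[type(value).__name__].add(key)

    def get(self, key):
        return self._data[key]

    def keys_of_type(self, type_name):
        return sorted(self._by_type[type_name])


store = KeyValueStore()
store.put("sku-1", {"name": "lamp", "price": 19.5})  # nested structure
store.put("sku-2", ["red", "green"])                 # list
store.put("visits", 42)                              # plain integer
store.put("sku-3", {"name": "desk"})                 # same type, different shape
```

    The store never needs to know the shape of a value in advance, whereas a single relational column would have to commit to one type or fall back to an opaque blob.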

    The horizontal scaling characteristic is another key requirement that object databases more easily support. Multi-terabyte databases have been successfully deployed. These object database systems have a client-centric (rather than a server-centric) architecture. Data is distributed to the client and queries are performed on the client instead of on monolithic servers. Consequently, the data can be partitioned, replicated, and scaled much more easily without being tied down to the hardware limitations of a single server computer.

    So indeed, object database systems could be considered "NoSQL" technologies. They can be utilized either as a persistent store for data or as a cache.

    Sunday, October 25, 2009

    "document stores", "nosql databases" vs. ODBMS.

    There is a growing interest in our community in having resources published in ODBMS.ORG, which compare & contrast ODBMS with other "data stores", such as "document stores", and "nosql databases".

    Systems such as CouchDB, MongoDB, SimpleDB, Voldemort, Scalaris, etc. provide less functionality than OODBs, but they offer a distributed "object" cache over multiple machines.

    I plan to add a number of new resources on this topic in the coming months.

    RVZ

     

    Sunday, August 16, 2009

    New updated version (2009) of the ETH Zurich ODBMS Lecture Series.

    I'd like to mention that I have published a completely new, updated version (2009) of the ETH Zurich ODBMS Lecture Series on ODBMS.ORG (PDF).

    This is by far the most up-to-date and comprehensive lecture series on object databases, developed by Michael Grossniklaus and Moira Norrie at the renowned Swiss Federal Institute of Technology (ETH) Zurich.

    For the 2009 version of ETH Zurich's lecture on Object-oriented databases a number of additions and updates have been made:
    - New lecture providing a Versant tutorial
    - New lecture discussing different OODBMS architectures
    - Updated lectures on db4objects incorporating new features such as transparent persistence and activation.
    - Updated lectures on the OM model of data, OML and OMS Avon
    - Many corrections of errata throughout the whole course.

    RVZ

    Tuesday, April 7, 2009

    ODBMS and RDBMS?

    I recently asked Alexander Jaehne, Application Infrastructure & Integration Team Lead at a major Swiss bank, what experience he has in using the various persistence options available for new projects.

    "For very large databases, you need to complement an ODBMS with some relational database. We prefer to have both..." replied Jaehne.

    You can read the interview with Jaehne: User Report 31/09.

    Of course, this is not true in general.

    For example, Richard Ahrens, Director at Merrill Lynch, explains: "Our order and quote management system combines an embedded object-based continuous event processor with an embedded object database. This allows us to rapidly add new derivative products to our environment and keeps developers focused on writing code that adds direct business value. With our design, we have strived to eliminate "nonproductive" development: keeping objects in sync with a relational data model adds no value to our business, so we rely on object database technology to make that problem go away.
    We have found this approach not only enables us to deliver incremental functionality faster, but also reduces our testing burden since there are fewer moving parts for us to maintain ourselves."

    The complete set of User Reports includes:

    User Report 1/08: Gerd Klevesaat at Siemens
    Segment: Industry - Automation
    User: Gerd Klevesaat - Software architect - Siemens, Germany

    User Report 2/08: Pieter van Zyl at CSIR
    Segment: Academia
    User: Pieter van Zyl - Researcher - CSIR, South Africa

    User Report 3/08: Philippe Roose at Liuppa
    Segment: Academia
    User: Philippe Roose - Ass. Professor / Researcher - LIUPPA, France

    User Report 4/08: William Westlake at SAIC
    Segment: Industry - Medical
    User: William Westlake - Principal Systems Engineer - SAIC, USA

    User Report 5/08: Stefan Edlich at TFH Berlin
    Segment: Academia
    User: Stefan Edlich - Professor - TFH Berlin, Germany

    User Report 6/08: Udayan Banerjee at NIIT
    Segment: Industry - Various
    User: Udayan Banerjee - CTO - NIIT, India

    User Report 7/08: Nishio Shuichi at ATR
    Segment: Industry - Robotics
    User: Nishio Shuichi - Senior Researcher - ATR Labs, Japan

    User Report 8/08: John Davies at Iona
    Segment: Industry - Finance
    User: John Davies - Technical Director - Iona, USA

    User Report 9/08: Scott Ambler at IBM
    Segment: Industry - Various
    User: Scott Ambler - Practice Leader - IBM Rational, Canada

    User Report 10/08: Mike Card at Syracuse
    Segment: Industry - Defense
    User: Mike Card - Researcher - Syracuse, USA

    User Report 11/08: Rich Ahrens at Merrill Lynch
    Segment: Industry - Finance
    User: Richard Ahrens - Director - Merrill Lynch, USA

    User Report 12/08: Ajay Deshpande at Persistent
    Segment: Industry - Various
    User: Ajay Deshpande - Senior Architect - Persistent, India

    User Report 13/08: Horst Braeuner at City of Schwaebisch Hall
    Segment: Public - Government
    User: Horst Braeuner - CTO, CIO - City of Schwaebisch Hall, Germany

    User Report 14/08: Tore Risch at University of Uppsala
    Segment: Academia
    User: Tore Risch - Professor - University of Uppsala, Sweden

    User Report 15/08: Michael Blaha at OMT
    Segment: Industry - Consulting
    User: Michael Blaha - Principal - OMT Associates, USA

    User Report 16/08: Stefan Keller at HSR Rapperswil
    Segment: Academia
    User: Stefan Keller - Professor - HSR Rapperswil, Switzerland

    User Report 17/08: Mohammed Zaki at Rensselaer Polytechnic Institute
    Segment: Academia
    User: Mohammed Zaki - Associate Professor - Rensselaer Polytechnic Institute, USA

    User Report 18/08: Peter Train at Standard Bank
    Segment: Industry - Finance
    User: Peter Train - Architect - Standard Bank, South Africa

    User Report 19/08: Biren Gandhi at IBM
    Segment: Industry - Consulting
    User: Biren Gandhi - Architect - IBM, Germany

    User Report 20/08: Sven Pecher at IBM
    Segment: Industry - Consulting
    User: Sven Pecher - Senior Consultant - IBM, Germany

    User Report 21/08: Frank Stuch at IBM
    Segment: Industry - Consulting
    User: Frank Stuch - Managing Consultant - IBM, Germany

    User Report 22/08: Hiroshi Miyazaki at Fujitsu
    Segment: Industry - Various
    User: Hiroshi Miyazaki - Methodology - Fujitsu, Japan

    User Report 23/08: Robert Huber at 7r
    Segment: Industry - Various
    User: Robert Huber - Managing Director - 7r, Switzerland

    User Report 24/08: Thomas Amberg at Oberon
    Segment: Industry - Various
    User: Thomas Amberg - Software Engineer, Oberon, Switzerland

    User Report 25/08: Martin F. Kraft
    Segment: Industry - Logistics
    User: Martin F. Kraft - Application Architect, Shipping Company (not disclosed), USA

    User Report 26/08: Serena Pizzi at Banca Fideuram
    Segment: Industry - Finance
    User: Serena Pizzi - Responsible Application Management Back End, Banca Fideuram SpA, Italy

    User Report 27/08: Dan Schutzer at FSTC
    Segment: Industry - Financial Services
    User: Dan Schutzer - Director, FSTC, USA

    User Report 28/08: Peter Fallon at Castle Software Australia
    Segment: Industry - Software development and consulting
    User: Peter Fallon - Director , Castle Software Australia, Australia

    User Report 29/08: Benny Schaich-Lebek at SAP
    Segment: Industry - ERP
    User: Benny Schaich-Lebek - Product Management, SAP, Germany

    User Report 30/08: Stephan Kiemle at German Aerospace Center
    Segment: Industry - Aerospace
    User: Stephan Kiemle - Chief software engineer, German Aerospace Center DLR, Germany

    User Report 31/09: Alexander Jaehne at Major Swiss Bank
    Segment: Industry - Finance
    User: Alexander Jaehne - Application Infrastructure & Integration Team Lead, Switzerland

    Monday, February 9, 2009

    ODBMS.ORG Useful Links

    Since we started up in September 2005, ODBMS.ORG has grown quite a bit. A lot of free resources have been added over the years.

    I thought it could be useful to give you a few links to ease your search for useful resources....

    Here we are:

    If you are interested in Lecture Notes:
    Object Databases - Lecture Notes

    OO Programming - Lecture Notes

    Database in General Lecture notes

    If you are interested in testing vendors' software and/or downloading free software:
    Object Databases - Free Software

    OO Programming - Free Software

    If you are interested in standards, and in the Object Data Management Group - Past Resources in particular:
    Object Data Management Group - Past Resources (ODMG Version 1-3)

    If you would like to read user reports on how persistent objects are handled in various domains.

    If you are interested in dedicated articles from ODBMS.ORG's Panel of Experts

    And plenty more of Articles and Papers on Object Databases

    If you are looking to know more about Commercial and Open Source Object Database Vendors

    Last but not least, if you are looking for books

    Hope it helps....

    RVZ

    Tuesday, January 13, 2009

    O/R Impedance Mismatch? Users Speak Up! Fourth Series of User Reports published.

    I have published the fourth series of user reports on using technologies for storing and handling persistent objects.

    The fourth series includes 6 new user reports from the following users:

    -Martin F. Kraft
    -Serena Pizzi at Banca Fideuram
    -Dan Schutzer at FSTC
    -Peter Fallon at Castle Software Australia
    -Benny Schaich-Lebek at SAP
    -Stephan Kiemle at German Aerospace Center


    The 6 new reports and the complete series of user reports are available for free download.

    I have also published a new paper by ODBMS.ORG panel member William Cook on Interprocedural Query Extraction for Transparent Persistence.
    Transparent Persistence promises to integrate programming languages and databases by allowing programs to access persistent data with the same ease as non-persistent data. The work is focused on programs written in the current version of Java, without language changes. However, the techniques developed by Cook and his colleagues may also be of value in conjunction with object-oriented languages extended with high-level query syntax.

    Monday, December 22, 2008

    Versant acquired the assets of the database software business of privately-held Servo Software, Inc. (formerly db4objects, Inc.).

    You have probably noticed some news in the object database market: On December 1, 2008 "Versant acquired the assets of the database software business of privately-held Servo Software, Inc. (formerly db4objects, Inc.)".

    What is the meaning of this acquisition? I put a few questions to Robert Greene, who is responsible for defining Versant's overall object database strategy ....

    Q1. What's the meaning of this acquisition for Versant? db4o is an open source object database, but Versant had no open source strategy until now.

    [RCG] This acquisition recognizes the value the db4objects team created by making the relevance of object database technology in the software development toolkit visible to software developers.

    Incidentally, this is not Versant’s first initiative in the open source space. In 2006, Versant open sourced a JDO/JPA based ORM driver and initiated an open source JPA project within Eclipse, at the time known as Eclipse JSR220-ORM. Eclipse had managed to use this project to get Oracle to commit a similar open source project. In the end, both projects merged into what is the Eclipse Dali project and Oracle became the project lead.

    This open source activity by Versant was aimed at making developers more aware of object based transparent persistence and fostering such an API approach in their development. We view this as a tremendous success, as now a substantial portion of the Java community uses Hibernate (or TopLink) and Eclipse Dali to develop applications.

    Those ORM APIs, which have flourished since the early 2002 timeframe, are in essence the Versant database APIs which have existed since the mid 90s in our object database technology. It was an ex-Versant product manager who went to Sun and drove those standards through the Java JSR process. Ultimately, it was open source Hibernate’s flavor which gained the most acceptance, but the similarity of the approach is undeniable.

    Due to the power of open source, anyone who knows ORM technology, has in essence, become an expert in the use of object databases. They can simply get rid of the mapping portion of the ORM work and then everything else is nearly the same as long as they point connections to an object database. In fact, Versant plans to release a compatibility version for Eclipse Dali.


    Q2. Will you keep db4o as a separate product or will you merge it into Versant Object Database?

    [RCG] Versant plans to continue to operate db4o in the same manner, continuing to foster the community and improve the technology in the traditional open source fashion. It will remain a separate product.

    Q3. How do you plan to manage/support the db4o open source community?

    [RCG] One of the nice things about db4o is the extended community of supporters it has developed over the years. Versant plans to simply join that community, following the same open form which has worked for db4o in the past. Of course, that being said, Versant has a long history and extended expertise in the OODB technology space. In that regard, we have opened our technology stack to the db4o core team and, where it makes good technology sense, we can contribute significant forms of functionality that otherwise take a long time to create.

    Q4. db4o is targeting the embedded device market. Is this a market for Versant as well?

    [RCG] Versant technology has many successes in the embedded space. However, our real commercial success comes from the many large scale systems developed using our technology to overcome limitations in traditional database systems. So, in this regard, db4o will dominate the embedded side of the Versant business and the Versant commercial object database will exist to help those who want the simplicity of the OODB programming model, but require greater scaling capabilities.

    Q5. Are there going to be any changes in the db4o business model?

    [RCG] No. The db4o brand will continue to offer the dual licensing model common to open source businesses, along with professional levels of subscription based support.

    Saturday, December 20, 2008

    TechView Product Reports

    Most of the time it is difficult to gather good technical information on products, without marketing or sales hype.

    I therefore decided to create a series of product reports on some of the leading Object Database Systems around.

    For that, I have prepared 23 questions which I sent to four vendors: db4objects, Objectivity, Inc., Progress Software and Versant Corporation.
    I asked them for detailed information on their products, such as: Support of Programming Languages, Queries, Data Modeling, Integration with relational data, Transactions, Persistence, Storage, Architecture, Applications, and Performance.

    The result is four TechView Product Reports, which contain detailed useful information on the respective products:
    - db4o
    - Objectivity/DB
    - ObjectStore
    - Versant Object Database

    I hope these will be useful resources for developers and architects alike.
    As always you can freely download the reports.

    Tuesday, December 16, 2008

    OMG ODBTWG next steps

    This is a short note related to the OMG ODBTWG meeting, on December 9, 2008.

    During the meeting there was a consensus that the OMG's Semantic Meta Object Facility (“semantic MOF” or “S-MOF”) would be a good place to start for the object model in the Object Database Standard RFP.

    Mike Card is planning to publish a rough draft of an OMG RFP for the new database standard in advance of the March 2009 OMG meeting in Washington DC.

    RFP stands for Request for Proposals; the OMG technology adoptions revolve around the RFP.
    More info on the OMG Technology Adoption Process.

    Friday, December 5, 2008

    OMG is hosting an Object Database Standard Definition Scope meeting in Santa Clara

    I have received a note from Mike Card that I would like to share with you.

    "The OMG is hosting an Object Database Standard Definition Scope meeting in Santa Clara, CA at the Hyatt Regency on Tuesday afternoon, December 9th.

    The purpose of this meeting will be to define what the scope of the new object database standard should be.

    We have already done some work in this area but more remains to be done.
    Our goal is to complete the definition of what will and will not be included in the scope of the new standard at this meeting. Once we have defined what will and will not be included, I can begin work on a draft OMG Request For Proposal (RFP).
    The RFP is important because this is the mechanism by which the OMG generates standards – an RFP is put out there and a group of vendors who intend to implement the final standard responds to the RFP with a standard.
    So, we cannot get the ball rolling until we get the RFP out there, and we are getting close. Once the RFP is put out by the OMG, then the “real work” begins where object database vendors intending to submit and other interested parties begin working together to develop a response to the RFP.
    It is this response that will become the successor to ODMG 3.0.

    The agenda for this meeting will be as follows:

    1300-1310 Welcome and introductory comments (Mike Card)
    1310-1330 Review of scoping consensus thus far and db4o comments from last meeting (Mike Card)
    1330-1630 Discussion of scope areas to be included or excluded (all participants)
    1630-1700 Wrap-up and discussion of next steps (Mike Card)


    We got some excellent feedback from db4o at our last meeting on these topics and we would like input from other vendors as well.

    We very much hope to see you there! There is a $150 registration fee for this event; to register, please visit the registration page.

    There should be a link there soon to register for this event. Thanks!

    Michael P. Card
    Syracuse Research Corporation "

    For a summary of the work done until now by the OMG on the definition of a new object database standard, please see my interview with Mike Card.

    Thursday, October 23, 2008

    O/R Impedance Mismatch? Users Speak Up! Third Series of User Reports published.

    I have published the third series of user reports on using technologies for storing and handling persistent objects.
    I have defined "users" in a very broad sense, including: CTOs, Technical Directors, Software Architects, Consultants, Developers, and Researchers.

    The third series includes 7 new user reports from the following users:

    - Peter Train, Architect, Standard Bank Group Limited, South Africa.
    - Biren Gandhi, IT Architect and Technical Consultant, IBM Global Business Services, Germany.
    - Sven Pecher, Senior Consultant, IBM Global Business Services, Germany.
    - Frank Stuch, Managing Consultant, IBM Global Business Services, Germany.
    - Hiroshi Miyazaki, Software Architect, Fujitsu, Japan.
    - Robert Huber, Managing Director, 7r gmbh, Switzerland.
    - Thomas Amberg, Software Engineer, Oberon microsystems, Switzerland.


    I asked each user the same set of questions, among them what experience they have in using the various options available for persistence in new projects and what lessons they have learned from using such solutions.

    “Some of our newer systems have been developed in-house using an object oriented paradigm. Most (if not all) of these use Relational Database systems to store data and the "impedance mismatch" problem does apply” says Peter Train from Standard Bank.

    The lessons learned using Object Relational mapping tools confirm the complexity of such technologies.

    Peter Train explains: “The most common problems that we have experienced with object Relational mapping tools are:
    i) The effort required to define mappings between the object and the relational models; ii) Difficulty in understanding how the mapping will be implemented at runtime and how this might impact performance and memory utilization. In some cases, a great deal of effort is spent tweaking configurations to achieve satisfactory performance.”

    Frank Stuch from IBM Global Business Services has used Hibernate, EJB 2 and EJB 3 Entity Beans in several projects.
    Talking about his experience with such tools he says: “EJB 2 is too heavy weight and outdated by EJB 3. EJB 3 is not supported well by development environments like Rational Application Developer and not mature enough. In general all of these solutions give the developer 90% of the comfort of an OODBMS with well established RDBMS.
    The problem is that this comfort needs a good understanding of the impedance mismatch and the consequences on performance (e.g. "select n+1 problem"). Many junior developers don't understand the impact and therefore the performance of the generated/created data queries is often very poor. Senior developers can work very efficiently with e.g. Hibernate.”
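
    For readers who have not met the "select n+1 problem" mentioned above, the following Python sketch reproduces it with the standard library's `sqlite3` module rather than with Hibernate. The `author` and `book` tables, the `run` helper, and the two function names are assumptions made up for this illustration; the sketch only shows the access pattern, not how any particular ORM generates its queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO book VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1'), (4, 3, 'C1');
""")

queries = []  # every SQL statement issued through run(), in order


def run(sql, params=()):
    """Execute a query and record it, so the number of round trips can be counted."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_n_plus_1():
    """One query for the parent rows, then one extra query per parent."""
    result = {}
    for author_id, name in run("SELECT id, name FROM author ORDER BY id"):
        rows = run("SELECT title FROM book WHERE author_id = ? ORDER BY id", (author_id,))
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join():
    """The same data fetched with a single joined query."""
    result = {}
    rows = run(
        "SELECT a.name, b.title FROM author a "
        "JOIN book b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


queries.clear()
lazy = titles_n_plus_1()
lazy_count = len(queries)      # 1 query for authors + 1 per author

queries.clear()
eager = titles_single_join()
eager_count = len(queries)     # a single query
```

    Both functions return the same dictionary, but the first issues one query per author on top of the initial one, so its cost grows with the number of parent rows.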

    In some special cases custom solutions have been built, like in the case of Thomas Amberg who works in mobile and embedded software and explains “We use a custom object persistence solution based on sequential serialized update operations appended to a binary file”.
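
    The user report does not describe the actual file format, so the following Python sketch is only one possible reading of "sequential serialized update operations appended to a binary file". The length-prefixed record layout, the use of `pickle`, the convention that `None` marks a deletion, and the names `append_update` and `replay` are all assumptions introduced for this example.

```python
import os
import pickle
import struct
import tempfile

_LENGTH = struct.Struct("<I")  # 4-byte little-endian length prefix per record


def append_update(path, key, value):
    """Append one serialized update operation to the end of the log file."""
    payload = pickle.dumps((key, value))
    with open(path, "ab") as log:
        log.write(_LENGTH.pack(len(payload)))
        log.write(payload)


def replay(path):
    """Rebuild the current state by applying every logged update in order."""
    state = {}
    with open(path, "rb") as log:
        while True:
            header = log.read(_LENGTH.size)
            if len(header) < _LENGTH.size:
                break  # end of file, or a truncated header at the tail
            (size,) = _LENGTH.unpack(header)
            payload = log.read(size)
            if len(payload) < size:
                break  # ignore a partially written last record
            key, value = pickle.loads(payload)
            if value is None:
                state.pop(key, None)  # None marks a deletion in this sketch
            else:
                state[key] = value
    return state


log_path = os.path.join(tempfile.mkdtemp(), "objects.log")
append_update(log_path, "robot-1", {"x": 0, "y": 0})
append_update(log_path, "robot-2", {"x": 5, "y": 1})
append_update(log_path, "robot-1", {"x": 3, "y": 4})  # later update wins
append_update(log_path, "robot-2", None)              # delete robot-2
restored = replay(log_path)
```

    Writes only ever append, which keeps them cheap on constrained devices; the price is that reading the current state means replaying the log from the start.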

    The 7 new reports and the complete series of user reports are available for free download.

    I plan to continue to publish user reports on a regular basis.

    Tuesday, October 7, 2008

    LINQ: the best option for a future Java query API?

    My interview with Mike Card has triggered an intense discussion (still ongoing) on the pros and cons of considering LINQ as the best option for a future Java query API.

    There is a consensus that a common query mechanism for ODBMSs is needed.

    However, there is quite a disagreement on how this should be done. In particular, some see LINQ as a solution, provided that LINQ is also available for Java. Others on the contrary do not like LINQ, but would rather prefer a vendor neutral solution, for example based on SBQL.

    You can follow the discussion here.

    I have listed here some useful resources I published in ODBMS.ORG - related to this discussion:

    Erik Meijer, José Blakeley
    The Microsoft perspective on ORM
    An interview in ACM Queue Magazine with Erik Meijer and José Blakeley. With LINQ (language-integrated query) and the Entity Framework, Microsoft divided its traditional ORM technology into two parts: one part that handles querying (LINQ) and one part that handles mapping (Entity Framework). September 2008.

    Panel Discussion "ODBMS: Quo Vadis?"
    Panel discussion with Mike Card, Jim Paterson, and Kazimierz Subieta, on their views on some critical questions related to Object Databases: Where are Object Database Systems going? Are Relational database systems becoming Object Databases?
    Do we need a standard for Object Databases? Why did ODMG not succeed?

    Java Object Persistence: State of the Union PART II
    Panel discussion with Jose Blakeley (Microsoft), Rick Cattell (Sun Microsystems), William Cook (University of Texas at Austin), Robert Greene (Versant), and Alan Santos (Progress). The panel addressed the ever open issue of the impedance mismatch.

    Java Object Persistence: State of the Union PART I
    Panel discussion with Mike Keith (EJB co-spec lead, main architect of Oracle TopLink ORM), Ted Neward (independent consultant, often blogging on ORM and persistence topics), Carl Rosenberger (lead architect of db4objects, open source embeddable object database), and Craig Russell (spec lead of the Java Data Objects (JDO) JSR, architect of the entity bean engine in Sun's appservers prior to Glassfish), on their views on the current State of the Union of object persistence with respect to Java.

    Stack-Based Approach (SBA) and Stack-Based Query Language (SBQL)
    Kazimierz Subieta, Polish-Japanese Institute of Information Technology
    Introduction to object-oriented concepts in programming languages and databases, SBA and SBQL

    The Object-Relational Impedance Mismatch
    Scott Ambler, IBM. Scott explores the technical and the cultural impedance mismatch between the relational and the object world.

    ORM Smackdown - Transcript
    Ted Neward, Oren "Ayende" Eini. Transcripts of the Panel discussion "ORM Smackdown" on different viewpoints on Object-Relational Mapping (ORM) systems, courtesy of FranklinsNet.

    OOPSLA Panel Objects and Databases
    William Cook et al. Transcript of a high-ranking panel on objects and databases at the OOPSLA conference 2006, with representatives from BEA, db4objects, GemStone, Microsoft, Progress, Sun, and Versant.

    Thursday, September 4, 2008

    Do you have an impedance mismatch problem? Users speak up! Second series of user reports published.

    I have started a new series of interviews with users of technologies for storing and handling persistent objects, around the globe.

    6 additional user reports (12-17/08) have been published, from the following users:

    • Ajay Deshpande, Persistent
    • Horst Braeuner, City of Schwaebisch Hall
    • Tore Risch, Uppsala University
    • Michael Blaha, OMT Associates
    • Stefan Keller, HSR Rapperswil
    • Mohammed Zaki, Rensselaer

    The complete initial series of user reports is available as always for free download.

    Here I define "users" in a very broad sense, including: CTOs, Technical Directors, Software Architects, Consultants, Developers, Researchers.

    I have asked 5 questions:

    Q1. Please explain briefly what are your application domains and your role in the enterprise.

    Q2. When the data models used to persistently store data (whether file systems or database management systems) and the data models used to write programs against the data (C++, Smalltalk, Visual Basic, Java, C#) are different, this is referred to as the “impedance mismatch” problem. Do you have an “impedance mismatch” problem?

    Q3. What solution(s) do you use for storing and managing persistent objects? What experience do you have in using the various options available for persistence for new projects? What are the lessons learned in using such solution(s)?

    Q4. Do you believe that Object Database systems are a suitable solution to the "object persistence" problem? If yes why? If not, why?

    Q5. What would you wish as new research/development in the area of Object Persistence in the next 12-24 months?

    More information here

    Wednesday, August 27, 2008

    LINQ is the best option for a future Java query API

    A conversation with Mike Card.

    I have interviewed Mike Card on the latest developments in the OMG working group which aims at defining a new standard for Object Database Systems.

    Mike works with Syracuse Research Corporation (SRC) and is involved in object databases and their application to challenging problems, including pattern recognition. He chairs the ODBT group in OMG to advance object database standardization.


    R. Zicari: Mike, you recently chaired an OMG ODBTWG meeting, on June 24, 2008. What kind of synergy do you see outside OMG in relation to your work?

    Mike Card: We think it is likely that the OMG would need to participate in the Java Community Process (JCP) in order to write a Java Specification Request (JSR) to add LINQ functionality to Java.

    R. Zicari: There has been a lot of discussion lately on the merit of SBQL vs. LINQ as a possible query API standard for object databases. Did you discuss this issue at the meeting?

    M. Card: I began the technical part of our meeting by reviewing Professor Subieta’s comparison of SBQL and LINQ. It was my understanding from this comparison that LINQ was technically capable of performing any query that could be performed by SBQL, and I wanted to know if the participants saw this the same way. They agreed in general, and believed that even if LINQ were only able to do 90% of what SBQL could do in terms of data retrieval that it would still be the way to go.

    R. Zicari: Could you please go a bit more in detail on this?

    M. Card: Sure. At the meeting it was pointed out that Prof. Subieta had noted in his comparison that he had not shown queries using features that are not a part of LINQ, such as fixed-point arithmetic, numeric ranges, etc.

    These are language features that would be familiar to users of Ada but which are not found in languages like C++, C#, and Java so they would likely not be missed and would be considered esoteric.

    It was also pointed out that the query examples chosen by Prof. Subieta in his comparison were all “projections” (relational term meaning a query or operation that produces as its output table a subset of the input table, usually containing only some of the input table’s columns).

    A query like this by definition will rely on iteration, and this will show the inherent expressive power of SBQL since the abstract machine contains a stack that can be used to do the iteration processing and thus avoid the loops, variables, etc. needed by SQL/LINQ.
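
    As a small illustration of what a projection involves, here is a Python sketch that keeps only some columns of each input row. The `employees` data and the `project` function are hypothetical and are not taken from Prof. Subieta's comparison; the sketch simply shows the explicit loop over rows that an imperative formulation needs, and makes no claim about how SBQL or LINQ evaluate such a query.

```python
employees = [
    {"name": "Eli", "dept": "R&D", "salary": 50},
    {"name": "Fay", "dept": "Ops", "salary": 60},
]


def project(rows, columns):
    """Return each row restricted to the requested columns."""
    projected = []
    for row in rows:  # the iteration that a projection inherently requires
        projected.append({col: row[col] for col in columns})
    return projected


names_and_depts = project(employees, ["name", "dept"])
```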

    R. Zicari: Did you agree on a common direction for your work in the group?

    M. Card: The consensus at this meeting and at ICOODB conference in Berlin was that LINQ was the best option for a future Java query API since it already had broad support in the .Net community. We will have to choose a new name for the OMG-Java effort, however, as LINQ is trademarked by Microsoft.

    It was also agreed that the query language need not include object update capability, as object updates were generally handled by object method invocations and not from within query expressions.

    Now, since LINQ allows method invocations as part of navigation (e.g. “my_object.getBoss().getName()”) it is entirely possible that these method calls could have side effects that update the target objects, perhaps in such a way that the changes would not get saved to the database.

    This was recognized as a problem; ideas kicked around for solving it included source code analysis tools.
    This is something we will need a good answer for as it is a potential “open manhole cover” if we intend the LINQ API to be read-only and not capable of updating the database (especially unintentionally!).
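
    The concern about side effects can be sketched in a few lines of Python. The `Employee` class, its `boss_lookups` counter, and the sample data are invented for this illustration; the example stands in for the “my_object.getBoss().getName()” style of navigation and does not use LINQ or any object database API.

```python
class Employee:
    def __init__(self, name, boss=None):
        self.name = name
        self._boss = boss
        self.boss_lookups = 0  # state that a "read-only" query can change

    def get_boss(self):
        self.boss_lookups += 1  # hidden side effect inside a getter
        return self._boss

    def get_name(self):
        return self.name


boss = Employee("Dana")
staff = [Employee("Eli", boss), Employee("Fay", boss), Employee("Gus", Employee("Hal"))]

# Looks like a pure query: names of employees whose boss is Dana.
reports_to_dana = [e.get_name() for e in staff if e.get_boss().get_name() == "Dana"]

# Yet every object the query navigated through has been modified in memory.
mutated = [e.get_name() for e in staff if e.boss_lookups > 0]
```

    Whether such in-memory changes would ever be written back depends on the persistence mechanism, which is exactly the ambiguity the group identified.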

    R. Zicari: What else did you address at the meeting?

    Mike Card: The discussion then moved on to a list of items included in Carl Rosenberger’s ICOODB presentation.
    Other items were also reviewed from an e-mail thread in the ODBMS.ORG forum that included comments from both Prof. Subieta and Prof. William Cook.

    The areas discussed were broken down into 3 groups:
    i) those things there was consensus on for standardization,
    ii) those things that needed more discussion/participation by a larger group, and
    iii) those things that there was consensus on for exclusion from standardization.

    R. Zicari: What are the areas you agree to standardize?

    Mike Card: The areas we agree to standardize are:

    1. object lifecycle (in memory): What happens at object creation/deletion, “attached” and “detached” objects, what happens during a database transaction (activation and de-activation), etc. It is desirable that we base our efforts in this area on what has already been done in existing standards for Java such as JDO, JPA, OMG, et al. This interacts with the concurrency control mechanism for the database engine, and may need to refer to Bernstein et al. for serialization theory / CC algorithms.

    2. object identification: A participant raised a concern here regarding re-use of OIDs where the OID is implemented as a physical pointer and memory is recycled, resulting in re-use of an OID, which can corrupt some applications. He favored a standard requiring all OIDs to be unique and never re-used.

    3. session: what are the definition and semantics of a session?
    a. Concurrency control: again, we should refer to Bernstein et al. for proven algorithms and mathematical definitions in lieu of ACID criteria (ACA: Avoidance of Cascading Aborts, ST: Strict, SR: Serializable, RC: Recoverable for characterizing transaction execution sequences)
    b. Transactions: semantics/behavior and span/scope

    4. Object model: what OM will we base our work upon?

    5. Native language APIs: how will we define these? Will they be based on the Java APIs in ODMG 3.0, or will they be different? Will they be interfaces?

    6. Conformance test suite: we will need one of these for each OO language we intend to define a standard for. The test suite, however, is not the definition of the standard; the definition must exist in the specification.

    7. Error behavior: exception definitions etc.
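The object identification point (item 2) can be sketched in Java as follows. This is an illustration under stated assumptions, not any vendor's implementation: `OidRegistry` and its methods are hypothetical names, and the OID here is a logical counter value rather than a physical pointer, which is what makes re-use avoidable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory registry; not any vendor's API.
final class OidRegistry {
    private long nextOid = 1; // only ever moves forward
    private final Map<Long, Object> live = new HashMap<>();

    // Assigns a fresh OID that has never been handed out before.
    synchronized long register(Object obj) {
        long oid = nextOid++;
        live.put(oid, obj);
        return oid;
    }

    // Deleting an object frees its slot but never returns its OID to the pool.
    synchronized void delete(long oid) {
        live.remove(oid);
    }

    // A stale OID resolves to null instead of to an unrelated, newer object.
    synchronized Object lookup(long oid) {
        return live.get(oid);
    }
}
```

With a physical-pointer OID, the stale reference in the usage below could silently resolve to the newer object. With a never-reused logical OID it resolves to nothing, which an application can detect.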

    R. Zicari: What are the areas where no agreement was (yet) found?

    Mike Card: Areas we need to find agreement on are:

    1. keys and indices: how do you sort objects? How do you define compound keys or spatial keys? Uniqueness constraints? Can this be handled by annotation, with the annotation being standardized but the implementation being vendor-specific? This interacts with the query mechanism, e.g. availability of an index could be checked for by the query optimizer.

    2. referential integrity: do we want to enforce this? Avoidance of dangling pointers, this interacts with object lifecycle/GC considerations.

    3. cascaded delete: when you delete an object, do you also delete all objects that it references? It was pointed out that this has issues for a client/server model ODBMS like Versant because it may have to “push” out to clients that objects on the server have been deleted, so you have a distributed cache consistency problem to solve.

    4. replication/synchronization: how much should we standardize the ability to keep a synchronized copy of part or all of an object database? Should the replication mechanism be interoperable with relational databases? Part or all of this capability could be included in an optional portion of the standard.

    a. Backup: this is a specialized form of replication, how much should this be standardized? Is the answer to this question dependent upon the kind of environment (DBA or DBA-less/embedded) that the ODBMS is operating in?

    5. events/triggers: do we want to standardize certain kinds of activity (callbacks et al.) when certain database operations occur?

    6. update within query facility: this is a recognition of the limitations of LINQ, which does not support object update; it is “read-only.” Generally, object updates and deletes are performed by method invocations in a program and not by query statements.
    The question is, since LINQ allows method invocations as part of navigation, e.g. “my_employee_obj.getBoss().getName(),” is it possible in cases like this that such method calls could have side effects which update the object(s) in the navigation statement? If so, what should be done?

    7. extents: do we expose APIs for extents to the user?

    8. support for C++: how will we support C++/legacy languages for which a LINQ-like facility is not available? We could investigate string-based QL like OQL and/or we could use a facility similar to Cook/db4o “native queries”
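The annotation idea raised under item 1 (keys and indices) might look roughly like this in Java. The `@Indexed` annotation, its `unique` attribute, and the `IndexScanner` helper are all hypothetical names invented for this sketch. The intent is only to show the split the interview describes: the annotation would be the standardized part, while what an engine does on discovering it would remain vendor-specific.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical annotation: the name and the "unique" attribute are assumptions.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Indexed {
    boolean unique() default false;
}

// Example persistent class carrying the (would-be standardized) annotation.
final class Customer {
    @Indexed(unique = true)
    String customerId;

    @Indexed
    String lastName;

    String notes; // not indexed
}

final class IndexScanner {
    // Vendor-specific side: discover which fields ask for an index.
    // The order of the returned names is not guaranteed by the reflection API.
    static List<String> indexedFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isAnnotationPresent(Indexed.class)) {
                names.add(f.getName());
            }
        }
        return names;
    }
}
```

A query optimizer could consult the same metadata to check whether an index is available for a given field, which is the interaction with the query mechanism mentioned in item 1.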

    R. Zicari: And what are the areas you definitely do not want to standardize?

    Mike Card: Areas we do not want to standardize are:

    1. garbage collection: issue here is behavioral differences between “embedded” (linked-in) OODBMS vs. client/server OODBMS

    2. stored procedures/functions/views: these are relational/SQL concepts that are not necessarily applicable to object-oriented programming languages which are the purview of object databases.

    R. Zicari: How will you ensure that the vendor community will support this proposal?

    Mike Card: We plan to discuss this list and verify that others not present agree with the grouping of these items. We should also figure out what we want to do with the items in the “middle” group and then begin prioritizing these things. It appears likely that a next-generation ODBMS standard will follow a “dual-track” model in that the query mechanism (at least for Java) will be developed as a JSR within the JCP, while all of the other items will be developed within the OMG process.

    For C# (assuming C# is a language we will want an ODBMS standard for, and I think it is), the query API will be built into the language via LINQ and we will need to address all of the “other” issues within our OMG effort just as with Java. In the case of C# and Java, most of these issues can probably be dealt with in the same manner.

    How much interest there is in a C++ standardization effort is unclear; this is an area we will need to discuss further.
    A LINQ-like facility for C++ is not an option since, unlike C# and Java, there is no central maintenance point for C++ compilers.

    There is an ISO WG that maintains the C++ standard, but C++ “culture” accepts non-conformant compilers so there are many C++ compilers out there that only conform to part of the ISO standard.

    The developers present who work with C++ mentioned that their C++ code base must be “tweaked” to work with various compilers as a given set of C++ code might compile fine with 7 compilers but fail with the compiler from vendor number 8.
    In general, the maintenance of C++ is more difficult than for Java and C# due to inconsistency in compiler implementation and this complicates anything we want to do with something as complex as object persistence.
    ##

    Some Useful Resources:
    - Panel Discussion "ODBMS: Quo Vadis?"

    - Java Object Persistence: State of the Union PART II

    - Java Object Persistence: State of the Union PART I


    Tuesday, July 1, 2008

    Do you have an impedance mismatch problem? Users speak up!

    I have started a new series of interviews with users of technologies for storing and handling persistent objects, around the globe.

    Here I define "users" in a very broad sense, including: CTOs, Technical Directors, Software Architects, Consultants, Developers, Researchers.

    I have asked 5 questions:

    Q1. Please explain briefly what are your application domains and your role in the enterprise.

    Q2. When the data models used to persistently store data (whether file systems or database management systems) and the data models used to write programs against the data (C++, Smalltalk, Visual Basic, Java, C#) are different, this is referred to as the “impedance mismatch” problem. Do you have an “impedance mismatch” problem?

    Q3. What solution(s) do you use for storing and managing persistence objects? What experience do you have in using the various options available for persistence for new projects? What are the lessons learned in using such solution(s)?

    Q4. Do you believe that Object Database systems are a suitable solution to the "object persistence" problem? If yes why? If not, why?

    Q5. What would you wish as new research/development in the area of Object Persistence in the next 12-24 months?

    The first series of interviews I published in ODBMS.ORG include:

    ODBMS.ORG User Report No. 1/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Industry
    Domain: Automation System Solutions for Postal Processes.
    User Name: Gerd Klevesaat
    Title: Software Architect
    Organization: Siemens AG, Industry Sector, Germany

    ODBMS.ORG User Report No.2/08
    Editor Roberto V. Zicari- www.odbms.org
    July 2008.
    Category: Academia
    Domain: Research/Education
    User Name: Pieter van Zyl
    Title: Researcher
    Organization: Meraka Institute of South Africa's Council for Scientific and Industrial Research (CSIR) and University of Pretoria, South Africa.


    ODBMS.ORG User Report No.3/08
    Editor Roberto V. Zicari- www.odbms.org
    July 2008.
    Category: Academia
    Domain: Research/Education
    User Name: Philippe Roose
    Title: Associate Professor / Researcher
    Organization: LIUPPA/IUT de Bayonne, France.

    ODBMS.ORG User Report No.4/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Industry
    Domain: Various
    User Name: William W. Westlake
    Title: Principal Systems Engineer
    Organization: Science Applications International Corporation, USA

    ODBMS.ORG User Report No.5/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Academia
    Domain: Research/Education
    User Name: Stefan Edlich
    Title: Professor
    Organization: TFH-Berlin, Germany

    ODBMS.ORG User Report No. 6/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Industry
    Domain: Various.
    User Name: Udayan Banerjee
    Title: CTO
    Organization: NIIT Technologies, India.

    ODBMS.ORG User Report No. 7/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Industry
    Domain: Robotics.
    User Name: NISHIO Shuichi
    Title: Senior Researcher
    Organization: JARA/ATR, Japan.

    ODBMS.ORG User Report No.8/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Industry
    Domain: Financial Services
    User Name: John Davies
    Title: Technical Director
    Organization: Iona, UK

    ODBMS.ORG User Report No.9/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Industry
    Domain: Various
    User Name: Scott W. Ambler
    Title: Practice Leader Agile Development
    Organization: IBM Rational, Canada

    ODBMS.ORG User Report No. 10/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    June 2008.
    Category: Industry
    Domain: Defense/intelligence area.
    User Name: Mike Card
    Title: Principal engineer
    Organization: Syracuse Research Corporation (SRC), USA

    ODBMS.ORG User Report No. 11/08
    Editor Roberto V. Zicari- ODBMS.ORG www.odbms.org
    July 2008.
    Category: Industry
    Domain: Finance
    User Name: Richard Ahrens
    Title: Director
    Organization: Merrill Lynch, US

    All user reports are available for free download (PDF)

    I hope you'll find them interesting. More to come: I plan to publish user reports in ODBMS.ORG on a regular basis.

    RVZ


    Tuesday, May 20, 2008

    Object Database Systems: Quo vadis?

    I wanted to have an opinion on some critical questions related to Object Databases:

    Where are Object Database Systems going? Are Relational database systems becoming Object Databases? Do we need a standard for Object Databases? Why did ODMG not succeed?


    I have therefore interviewed one of our Experts, Mike Card, on his view of the current State of the Union of object database systems.
    Mike works with Syracuse Research Corporation (SRC) and is involved in object databases and their application to challenging problems, including pattern recognition. He chairs the ODBT group in OMG to advance object database standardization.

    Question1:
    It has been said (See Java Panel II ) that an Object Database System in order to be a suitable solution to the object persistence problem needs to support not only a richer object model, but it also has to support set-oriented, scalable, cost-based-optimized query processing, and high-throughput transactions.
    Do current ODBMS offer these features?


    Mike Card:
    In my opinion, no, though the support for true transactional processing varies between vendors. Some products use “optimistic” concurrency control, which is suitable only for environments where there is very little concurrent access to the database, such as single-threaded embedded applications. In my opinion, a database engine is not “scalable” (at least in the enterprise sense of the word) if it is based on optimistic concurrency control. This is because most truly large-scale applications will require optimal performance with many concurrent transactions, and this cannot be achieved when updates have to be rolled back at transaction commit time and re-attempted due to access conflicts.
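For readers unfamiliar with the mechanism being criticized, here is a minimal Java sketch of optimistic concurrency control. It is a toy model under stated assumptions, not how any particular ODBMS works: `VersionedCell`, `tryCommit`, and `updateWithRetry` are hypothetical names, and a single versioned integer stands in for the database. A transaction does its work without locks, and at commit time the work is thrown away and repeated if another commit got in first.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical single-value store guarded by a version number.
final class VersionedCell {
    private int value;
    private long version;

    VersionedCell(int initial) {
        this.value = initial;
    }

    synchronized long version() {
        return version;
    }

    synchronized int value() {
        return value;
    }

    // Commit succeeds only if nobody else committed since readVersion was read.
    synchronized boolean tryCommit(long readVersion, int newValue) {
        if (version != readVersion) {
            return false; // conflict: caller must discard its work and retry
        }
        value = newValue;
        version++;
        return true;
    }
}

final class OptimisticSketch {
    // Repeats the update until a commit succeeds; returns the number of attempts.
    // The betweenReadAndCommit hook exists only to simulate a competing transaction.
    static int updateWithRetry(VersionedCell cell, IntUnaryOperator update,
                               Runnable betweenReadAndCommit) {
        int attempts = 0;
        while (true) {
            attempts++;
            long seen = cell.version();
            int proposed = update.applyAsInt(cell.value());
            betweenReadAndCommit.run();
            if (cell.tryCommit(seen, proposed)) {
                return attempts;
            }
        }
    }
}
```

Each failed attempt is wasted work. With little contention the loop almost always succeeds on the first pass, while under heavy contention the retry count grows, which is the scalability concern described above.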

    Question2:
    Relational systems are rapidly becoming object database systems (See Java Panel II ). Do you agree or disagree with this statement? Why?


    Mike Card:
    I would disagree, because relational databases still fundamentally access objects as rows of tables and do not offer seamless integration into a host programming language’s type system. It is true that there are some good ORMs out there, but these will never offer the performance or seamlessness that is available with a good ODBMS. I would agree that ORMs are getting better, but relational databases themselves are not becoming object databases.

    Question3:
    A lot of the world's systems are built on relational technology and those systems need to be extended and integrated.
    That job is always difficult. An ODBMS should be able to fully participate in the enterprise data ecosystem as well as any other DBMS, both for new development and for enhancing existing applications. How can this be achieved?
    What is your opinion on this issue?


    Mike Card:
    As many vendors have noted, this is to some extent a marketing problem in terms of making enterprise customers aware of what object databases can do. It is also a technology issue, however, as engines based on “small-scale” concepts like optimistic concurrency control are not suitable to many enterprise environments.

    Question4:
    Object databases vary greatly from vendor to vendor. Is a standard for object databases (still) needed? If yes, what needs to be standardized in your opinion?


    Mike Card:
    Yes, I believe it is. The APIs for creating, opening, deleting, and synchronizing/replicating databases as well as the native query APIs should be standardized to allow application portability. Any APIs needed to insert objects into the database, remove them from the database, or create an index on them should also be standardized, again for the sake of application portability. I would also like to see a standard XML format for exporting object database contents to allow for data portability. I am not sure our current OMG effort can achieve all of these standardization goals, but I would like to.

    Question5:
    How would this new standard differ from the previous effort in ODMG? And what relationship would this new standard have with standards such as SQL?


    Mike Card:
    Unlike the previous ODMG standard, the new standard should have a conformance test suite that anyone can download and run against a candidate product. The standard itself should also be unambiguous and use precise language as is done in ISO standards for things like programming languages, e.g. ISO/IEC 8652 (Ada programming language standard).

    The primary focus of an object database standard should be its support of a native programming language, so I would expect that an object database standard might be more closely tied to an ISO standard for an object programming language (Ada, C++, other ISO-standardized languages that may appear) than to SQL, though perhaps if a LINQ-like native query capability were included, the object database standard would also reference the SQL standard due to the use of SQL-like verbs and semantics in LINQ.

    Question6:
    LINQ is leading in database API innovation, providing native language data access. Is this a suitable standard for ODBMS? Why?


    Mike Card:
    LINQ looks like it has a lot of promise in this area. We (the Object Database Technology Working Group in OMG) are currently evaluating LINQ vs. the stack-based query language (SBQL) developed at the Polish-Japanese Institute for Information Technology to see how these technologies compare for handling complex queries. SBQL has proven to be very good for complex queries and is being deployed in several EU projects, though it is unknown to most American developers. We are doing this evaluation to ensure LINQ is a good foundation for developers of applications that require complex queries, and is not too “small-scale” in its current form. We also want to hear from the LINQ community on plans (if any) to include update capability in LINQ and we need to be sure there are no surprises for parallel transaction execution.

    Question7:
    When are object databases a suitable solution for an Enterprise, and when are they not?


    Mike Card:
    They are not suitable when the engine is intended primarily for use in single-threaded embedded systems (optimistic concurrency control is a good indicator of this as I mentioned earlier).

    An object database would be suitable for use in an enterprise system if it was really good at large-scale data management, i.e. the engine was designed to handle large volumes of data and many parallel transactions. Some object databases are not built like this; they are designed for use primarily in single-threaded embedded applications with fairly small data volumes and as such they would not be good candidates for enterprise applications.

    Besides the technology used in the database engine itself, a good enterprise object database would need database maintenance tools (e.g. taking database A offline and replacing it with database B, updating or fiddling with database A and then bringing it back on-line, scheduling backups of databases and replicating databases between sites etc.).

    Question 8:
    Future direction of object databases. Where do they go?


    Mike Card:
    The answer to this question depends on where object programming languages themselves go. Up to this point, programming languages have not included the concept of persistence, it is always included as a “foreign” thing to be dealt with using APIs for things like file I/O etc. This is a very 1960s view of persistence, where programs were things that lived in core memory and persistent things were data files written out to tape or disk.

    The closest thing to true integration of persistence I have seen is in Ruby with its “PStore” class. I would like to see persistence integrated even more fully, where objects can be declared persistent or made persistent a la

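    // Hypothetical syntax: neither the "persistent" modifier nor
    // makePersistent() exists in Java today.
    public class myClass {

        // declared persistent at the point of declaration
        persistent Integer[] myInts = new Integer[5];
        Integer[] myOtherInts = new Integer[2];

        public void aMethod() {
            // made persistent later, at run time
            myOtherInts.makePersistent();
        }

    }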

    and the programming language itself would take care of maintaining them in files and loading them in at program start-up etc. without any additional work from the programmer.

    Now there are obviously challenges with this as this small example shows. What does it mean to initialize a persistent object in a class declaration? Is the object re-initialized when the program starts up? Or is the persisted value retained, rendering the initialization clause meaningless on a subsequent run of the program? Should persistent objects be allowed to have initialization clauses like this? What are the rules about inter-object access? Must persistence by reachability be used to ensure referential integrity? Can a “stack” variable (i.e. a variable declared in a method) be declared or made persistent, or must persistent variables be at the class level or even “global” (static)? Are these questions different for interpreted languages like Ruby which do not have the same notions of class as languages like Java? These are computer science/discrete math questions that will be answered during the language design process which will in turn determine how much “database” functionality ends up in the language itself.
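One existing data point for the initialization question is standard Java serialization, sketched below. This is not the language-level persistence described above, only the closest mechanism Java ships with, and the `PersistentDemo` class and its `roundTrip` helper are hypothetical names. When a `Serializable` object is read back, its field initializers are not re-run: the stored value is retained, and a `transient` field comes back as `null` rather than as a freshly initialized array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class mirroring the two arrays in the example above.
final class PersistentDemo implements Serializable {
    private static final long serialVersionUID = 1L;

    // Persisted field: the initializer runs on "new", not on deserialization.
    Integer[] myInts = new Integer[5];

    // transient = excluded from the serialized form.
    transient Integer[] myOtherInts = new Integer[2];

    // Writes the object to an in-memory byte stream and reads it back,
    // standing in for "save at shutdown, load at start-up".
    static PersistentDemo roundTrip(PersistentDemo obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PersistentDemo) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

So Java serialization answers "is the object re-initialized?" with "no, the persisted value wins". A language with built-in persistence would have to make the same kind of choice explicitly.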

    If persistence were fully integrated into an object programming language in this way, then the role of an object database for that language might be to just provide an efficient way to organize and search the program’s persistent variables. This would reduce the scope of what an object database has to do, since today an object database not only has to provide efficient organization and search (index and query) capability, but it also has to make objects persistent as seamlessly as possible. Of course, this “reduction in scope” would only be possible if the default persistence mechanism for the programming language was implemented in a way that was efficient and fast for large numbers of objects.

    ##


    Tuesday, March 11, 2008

    Robert Greene, Leon Guzenda and Rick Cattell on Sun Microsystems acquisition of MySQL.

    On Wednesday Jan 16, 2008 Jonathan Schwartz, Chief Executive Officer and President, Sun Microsystems, Inc., announced in his blog that Sun is acquiring MySQL AB.

    On 26 February 2008 Sun Microsystems, Inc. announced it has completed the acquisition of MySQL AB, for approximately $1 billion in total consideration.

    Kevin Harvey, Chairman of the MySQL board of directors, told InfoQ that there were two main drivers behind Sun's purchase of MySQL: "it solidifies Sun's role in the Web 2.0 datacenter, and it also confirms Sun's position as a leading provider of open source software."

    I have asked three of our experts, Robert Greene, Leon Guzenda, and Rick Cattell, a few questions on this. Robert is responsible for defining Versant's overall object database strategy, Leon is responsible for the Objectivity object database strategy. Rick worked for several years at Sun Microsystems, and now he is an independent consultant.


    Q1. What does this announcement mean for the database market in general? And specifically, will it have any impact on the object database market in your opinion? If yes, how?

    RCG> I think this announcement means that companies who were concerned about putting MySQL into their enterprise environments will now rethink things. If Oracle was not concerned before about the MySQL threat, it ought to be now. It is interesting to watch as people are beginning to pay for these products, previously perceived as "free", in the form of services and value added capabilities(in MySQL's case, better tooling). I don't think there will be any direct impact to the OODB market other than the perpetuation of changing attitudes that what counts most is using the right tool for the job. It's that change in attitude that's having the greatest impact on the OODB market.

    LG> MySQL is a conventional RDBMS built and sold using the open source model. Sun has traditionally been vendor neutral in its approach to DBMS sales, partnering with whichever DBMS company a joint customer expressed interest in. They have always had a strong partnership with Oracle, for instance. As Oracle also sells its product on IBM hardware, competing directly with DB2, these partnerships are interesting in their complexity. Will Oracle shift more of its attention to sales on HP equipment rather than Sun's? I doubt it. However, there will undoubtedly be some pressure on Sun's sales people to work new deals that can be 100% handled and supported by Sun.
    I can't remember a situation where Objectivity/DB has been in competition with MySQL as we tackle completely different kinds of application, so this won't affect us directly. Likewise, our customers almost all find us without Sun's help, so I don't think it matters that much to us.

    RC> Sun has had a good adoption rate on its open source offerings: their application server, Open Office, Java, and so on. I believe that the MySQL acquisition was exactly the right move for Sun at this point, and also will be a big benefit to open source users.
    The acquisition will be good for open source users because Sun will push MySQL innovation in new directions, Sun will provide long-term stability for MySQL, which has been under attack from Oracle (who recently acquired both InnoDB and SleepyCat, the "engines" for MySQL), and there will be synergy and benefits between MySQL and Sun's current open source offerings, e.g. the application server and development tools.

    The acquisition was exactly the right move for Sun because unlike Microsoft, IBM, and Oracle, Sun did not have a strong database component in its software stack. Sun's software stack is open source (again, the right move I believe). Unlike Sun's current database offerings with PostgreSQL and Java DB, which are only strong in narrow markets, MySQL has a very large following in a wide variety of applications. MySQL thereby gives Sun a complete software stack with "best of breed" solutions pretty much across the board. It also allows Sun to tune that software stack for its platform (for example, optimizing MySQL for Sun Solaris, and utilizing innovative proprietary hardware features).

    As for the object database market, I don't see the acquisition having a big impact one way or the other. Object database systems are being used in different markets than relational database systems, for the most part. However, Sun's obvious support of open source is a "shot in the arm" for open source databases. Also, Sun's Java Persistence API and the adoption of object/relational mappings is a boost for object databases, because these allow object databases to be more easily and naturally substituted for relational databases in application servers and web servers. Sun will likely do some tuning of MySQL with Java Persistence.

    By the way, I recently left Sun to do independent consulting as Cattell.Net, so the opinions and speculations I express here are purely my own. But as I mentioned, I believe the MySQL acquisition was a great move, so I remain positive on Sun's future if they play their cards right with the MySQL technology and customers over the next couple of years.

    Q2. Schwartz in his blog says "....customers confirmed what we've known for years - that MySQL is by far the most popular platform on which modern developers are creating network services. From Facebook, Google and Sina.com to banks and telecommunications companies, architects looking for performance, productivity and innovation have turned to MySQL."

    Will it change anything in this respect?


    RCG> Well, I would hope that Schwartz believes in his message. The fact that Sun spent 1B for MySQL would suggest that he does not believe his perceptions will change for the worse, but I would hedge with a quote by Niels Bohr, "Prediction is very difficult, especially of the future". The key is to provide value; so far MySQL has done this well. Whether or not they have "peaked", time will tell.

    There are public declarations about where the technology succeeds and where it begins to break down; if they want to expand, there are known issues that must be addressed. The future is fraught with challenges due to unbounded data growth, coupled with concurrency and complexity. In the future, other drivers like "Green" abilities will outweigh the ability to simply get the job done. Almost anything can be forced to work, but when you have to decide between something that works on 1000 servers or 400, your decisions will be heavily influenced by these other factors.
    I would predict that, in the future, value will be driven by "Green" technologies, much as it has been in the semiconductor industry over the last decade.

    LG> It might for ODBMSs that are targeting traditional IT applications.

    Q3. Schwartz in his blog says: "The adoption of MySQL across the globe is nothing short of breathtaking. They are the root stock from which an enormous portion of the web economy springs"

    Is this specific to MySQL or else?


    RCG> I think the open source movement as a whole is the stock, MySQL is simply one of the branches, certainly big enough to hang a hammock without fear of breakage.

    LG> Apart from the usual pressure to use the vendor's technology, I can't see Oracle, DB2 or SQL Server shops suddenly switching everything to MySQL because Sun now owns it. I think it more likely that MySQL users will be pressured to switch to Sun's hardware offerings.

    RC> I believe Schwartz is right: the adoption of MySQL has been incredible, particularly among the fast-growing web companies. This is another reason that the MySQL acquisition was a smart move: it gives Sun an opening into these companies. Sun has suffered somewhat because these fast-growing companies have generally not bought Sun hardware, support, or software. Sun has only done well with the traditional and more conservative "enterprise" companies. Now Sun has a complete open source software stack, gives customers a choice of operating systems, and offers competitive hardware with both Intel and SPARC architectures.
    Sun is now well aligned with the fastest-growing sectors of the Internet market.

    You might question how "adoption" translates into dollars for Sun, since open source is free. But I believe Sun is in a good position to monetize widespread adoption of its software stack, through support revenue, upgrade revenue, and synergy between software and hardware sales.

    Q4. Schwartz in his blog says: " So what are we announcing today? That in addition to acquiring MySQL, Sun will be unveiling new global support offerings into the MySQL marketplace. We'll be investing in both the community, and the marketplace - to accelerate the industry's phase change away from proprietary technology to the new world of open web platform"

    What does this mean for the open source community in the database market?


    RCG> Maybe this is supposed to be a trick question. I think the meaning is the same as it is to other software markets: the sum of the constituents that subscribe to its use and adoption. Software must increasingly provide value to compete, even free and open software. If there is no appreciable value, then it will have no constituency. The key is to figure out where you are in that value curve and how best to drive adoption given your particular situation.

    LG> Let's not forget that Sun moved to the open source model as its own efforts started to lag the faster moving community. While this matters a lot in some highly dynamic and emerging markets it can even be a problem in enterprise applications. Red Hat was changing so rapidly at one point that equipment manufacturers and rigorous IT shops were having problems achieving a stable base, so Red Hat introduced the more pricey Enterprise Edition. Sun is probably aiming to make its money from services and bundled sales. Not all open source offerings have become commercially viable, but MySQL is a notable exception.

    Q5. Schwartz in his blog says: "The good news is Sun is already committed to the business model at the heart of MySQL's success -"
    Is MySQL's business model also usable/adaptable for ODBMSs? How?


    RCG> As stated above, it's identifiable value that is important. The MySQL business model for the sake of the MySQL business model is a non-starter. If you have a technology that has non-commoditized value, there are other equally viable business models. The ODBMS company I can think of that most closely matches the MySQL model is db4o, and they have a database value which is highly commoditized, so I guess that business model makes sense for them. Some of the other ODBMS companies have highly differentiated value, so they do not depend on a MySQL-like business model. So, it appears that many business models work.

    Which one works best is another question altogether. Ultimately, a business model has the goal of returning profits to its owners and shareholders in a competitive landscape. So, what is the best company/business model: one that has 50% of a 10B software market and only earns 60M/yr for shareholders (at a loss), or one that has .005% of the market and earns 25M/yr for their shareholders at a profit? Again, I think the important point is understanding where you are in the value curve to help establish the business model that makes the most sense. I think this is one of the places that Sun has fallen short in the past. They have been trying to do the MySQL business model, but have failed to really understand where their various offerings reside in the value curve. I could be wrong, or rather that could be changing - perhaps they understand it very well and it's just more complex than a first glance. Perhaps software is the commodity and hardware is the value add and they are looking for MySQL to be the catalyst for adoption much like Hibernate was for JBoss.

    LG> There are currently about a half dozen ODBMS products sold with conventional licensing and a similar number of open source ones. As in the early days of ODBMSs, where there were about three times more products than the market could sustain, I doubt that many of the open source ones will survive in a crowded, highly specialized market. RDBMSs need a lot more support, e.g. for database administration, than ODBMSs, so the split between license sales and services is dramatically different for the two technologies. The open source ODBMSs will need to spread out into applications to earn significant revenue from services. At that point, if they're in the wrong vertical they'll be competing head-on with the big players.

    RC> An excellent question. Many companies have admired MySQL's success and wondered how to emulate it. In my mind, MySQL is the only company in the last 20 years to successfully challenge the domination of Oracle, IBM, and Microsoft in the database market. Other database companies have failed, have been acquired, or have been relegated to a smaller niche market.

    Object database companies have generally been in the last category. Partly that was a matter of timing: object databases came out too early to "ride the wave" of the Internet, Java, and open source. Customers feared compromising the integrity of their databases using an "unsafe" C++ object database, and object databases met stiff competition from the big relational players, both in marketing dollars and in inertia behind existing relational database installations.

    I see that changing somewhat going forward. Although I think it's too late for a new database contender to ride the "open source" wave in the way that MySQL did, and I still don't see object databases challenging relational databases in mainstream markets, I do see that an open-source Java object database system could grow significantly, especially in applications where relational databases are not well suited.


    Q6. Specifically for your company, do you see MySQL as an example you wish to follow?

    RCG> Absolutely! The first company that comes along and offers 1B to acquire Versant, we will accept the deal on behalf of our shareholders ;-) Seriously, to remain viable, companies must constantly consider where they reside on the value curve and adjust business models accordingly. Again, the future is especially hard to predict. Should Versant decide it is prudent to switch business models, we would most certainly inform the public, as is required of any publicly traded corporation.

    LG> No, not for pure ODBMSs. It will be interesting to track the commercial progress of db4objects, objectdb and other open source ODBMSs.
    ##
