
LINQ is the best option for a future Java query API

by Roberto V. Zicari on August 27, 2008

A conversation with Mike Card.

I have interviewed Mike Card on the latest developments in the OMG working group, which aims to define new standards for object database systems.

Mike works with Syracuse Research Corporation (SRC) and is involved in object databases and their application to challenging problems, including pattern recognition. He chairs the ODBT group in OMG to advance object database standardization.

R. Zicari: Mike, you recently chaired an OMG ODBTWG meeting, on June 24, 2008. What kind of synergy do you see outside OMG in relation to your work?

Mike Card: We think it is likely that the OMG would need to participate in the Java Community Process (JCP) in order to write a Java Specification Request (JSR) to add LINQ functionality to Java.

R. Zicari: There has been a lot of discussion lately on the merits of SBQL vs. LINQ as a possible query API standard for object databases. Did you discuss this issue at the meeting?

M. Card: I began the technical part of our meeting by reviewing Professor Subieta’s comparison of SBQL and LINQ. It was my understanding from this comparison that LINQ was technically capable of performing any query that could be performed by SBQL, and I wanted to know if the participants saw this the same way. They agreed in general, and believed that even if LINQ were only able to do 90% of what SBQL could do in terms of data retrieval that it would still be the way to go.

R. Zicari: Could you please go a bit more in detail on this?

M. Card: Sure. At the meeting it was pointed out that Prof. Subieta had noted in his comparison that he had not shown queries using features that are not a part of LINQ, such as fixed-point arithmetic, numeric ranges, etc.

These are language features that would be familiar to users of Ada but which are not found in languages like C++, C#, and Java so they would likely not be missed and would be considered esoteric.

It was also pointed out that the query examples chosen by Prof. Subieta in his comparison were all “projections” (relational term meaning a query or operation that produces as its output table a subset of the input table, usually containing only some of the input table’s columns).

A query like this by definition will rely on iteration, and this will show the inherent expressive power of SBQL since the abstract machine contains a stack that can be used to do the iteration processing and thus avoid the loops, variables, etc. needed by SQL/LINQ.
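To make the term concrete, here is a minimal sketch of a projection in Java. It uses the java.util.stream API (which arrived in Java 8, after this interview) as a stand-in for a LINQ-style facility; the Clerk class and its fields are hypothetical. The lambda parameter is presumably the kind of explicit variable that a stack-based approach would avoid.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical domain class, used only for illustration.
class Clerk {
    final String name;
    final String department;
    final int salary;

    Clerk(String name, String department, int salary) {
        this.name = name;
        this.department = department;
        this.salary = salary;
    }
}

class ProjectionSketch {
    // Projection: keep only the "name" column of each input row.
    static List<String> names(List<Clerk> clerks) {
        return clerks.stream()
                .map(c -> c.name)   // explicit iteration variable "c"
                .collect(Collectors.toList());
    }
}
```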

R. Zicari: Did you agree on a common direction for your work in the group?

M. Card: The consensus at this meeting and at the ICOODB conference in Berlin was that LINQ was the best option for a future Java query API since it already had broad support in the .NET community. We will have to choose a new name for the OMG-Java effort, however, as LINQ is trademarked by Microsoft.

It was also agreed that the query language need not include object update capability, as object updates were generally handled by object method invocations and not from within query expressions.

Now, since LINQ allows method invocations as part of navigation (e.g. “my_object.getBoss().getName()”) it is entirely possible that these method calls could have side effects that update the target objects, perhaps in such a way that the changes would not get saved to the database.

This was recognized as a problem; the ideas kicked around for solving it included source code analysis tools.
This is something we will need a good answer for, as it is a potential “open manhole cover” if we intend the LINQ API to be read-only and not capable of updating the database (especially unintentionally!).
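The concern can be illustrated with a small sketch. The Worker class below is hypothetical and is not taken from any ODBMS API; its getBoss() method quietly changes the object's state, so a query that looks read-only ends up modifying the objects it navigates. Java streams stand in here for a LINQ-like facility.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical persistent class; not taken from any real ODBMS API.
class Worker {
    private final String name;
    private final Worker boss;
    private int accessCount = 0; // state that a "getter" silently changes

    Worker(String name, Worker boss) {
        this.name = name;
        this.boss = boss;
    }

    Worker getBoss() {
        accessCount++; // side effect hidden inside a navigation method
        return boss;
    }

    String getName() { return name; }

    int getAccessCount() { return accessCount; }
}

class SideEffectSketch {
    // Looks read-only: names of the bosses of all workers that have one.
    static List<String> bossNames(List<Worker> workers) {
        return workers.stream()
                .filter(w -> w.getBoss() != null)
                .map(w -> w.getBoss().getName())
                .collect(Collectors.toList());
    }
}
```

After bossNames runs, every worker it inspected has a changed accessCount, even though the caller only asked a question. Whether such a change would be persisted is exactly the open issue described above.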

R. Zicari: What else did you address at the meeting?

Mike Card: The discussion then moved on to a list of items included in Carl Rosenberger’s ICOODB presentation.
Other items were also reviewed from an e-mail thread in the ODBMS.ORG forum that included comments from both Prof. Subieta and Prof. William Cook.

The areas discussed were broken down into 3 groups:
i) those things there was consensus on for standardization,
ii) those things that needed more discussion/participation by a larger group, and
iii) those things that there was consensus on for exclusion from standardization.

R. Zicari: What are the areas you agree to standardize?

Mike Card: The areas we agree to standardize are:

1. object lifecycle (in memory): What happens at object creation/deletion, “attached” and “detached” objects, what happens during a database transaction (activation and de-activation), etc. It is desirable that we base our efforts in this area on what has already been done in existing standards for Java such as JDO, JPA, OMG, et al. This interacts with the concurrency control mechanism for the database engine, so we may need to refer to Bernstein et al. for serialization theory / CC algorithms.

2. object identification: A participant raised a concern here regarding re-use of OIDs where the OID is implemented as a physical pointer and memory is recycled, resulting in re-use of an OID, which can corrupt some applications. He favored a standard requiring all OIDs to be unique and not re-used.

3. session: what are the definition and semantics of a session?
a. Concurrency control: again, we should refer to Bernstein et al. for proven algorithms and mathematical definitions in lieu of ACID criteria (ACA: Avoidance of Cascading Aborts, ST: Strict, SR: Serializable, RC: Recoverable for characterizing transaction execution sequences)
b. Transactions: semantics/behavior and span/scope

4. Object model: what OM will we base our work upon?

5. Native language APIs: how will we define these? Will they be based on the Java APIs in ODMG 3.0, or will they be different? Will they be interfaces?

6. Conformance test suite: we will need one of these for each OO language we intend to define a standard for. The test suite, however, is not the definition of the standard; the definition must exist in the specification.

7. Error behavior: exception definitions etc.
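As an illustration of the object identification item above, here is a minimal sketch of an OID scheme in which identifiers are never re-used. It assumes OIDs are logical, monotonically increasing numbers rather than physical pointers; the OidRegistry class is hypothetical, and a real engine would also have to persist the counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: OIDs are logical numbers handed out once,
// so deleting an object cannot free its OID for re-use.
class OidRegistry {
    private long nextOid = 1;
    private final Map<Long, Object> live = new HashMap<>();

    // Stores an object under a fresh OID and returns that OID.
    long store(Object obj) {
        long oid = nextOid++;
        live.put(oid, obj);
        return oid;
    }

    // Removes the object; its OID is retired, not recycled.
    void delete(long oid) {
        live.remove(oid);
    }

    // Returns the object, or null if the OID was deleted or never issued.
    Object lookup(long oid) {
        return live.get(oid);
    }
}
```

With this scheme a stale reference to a deleted object resolves to nothing, instead of silently pointing at an unrelated object that happens to occupy the same memory.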

R. Zicari: What are the areas where no agreement was (yet) found?

Mike Card: Areas we need to find agreement on are:

1. keys and indices: how do you sort objects? How do you define compound keys or spatial keys? Uniqueness constraints? Can this be handled by annotation, with the annotation being standardized but the implementation being vendor-specific? This interacts with the query mechanism, e.g. availability of an index could be checked for by the query optimizer.

2. referential integrity: do we want to enforce this? Avoidance of dangling pointers, this interacts with object lifecycle/GC considerations.

3. cascaded delete: when you delete an object, do you also delete all objects that it references? It was pointed out that this has issues for a client/server model ODBMS like Versant because it may have to “push” out to clients that objects on the server have been deleted, so you have a distributed cache consistency problem to solve.

4. replication/synchronization: how much should we standardize the ability to keep a synchronized copy of part or all of an object database? Should the replication mechanism be interoperable with relational databases? Part or all of this capability could be included in an optional portion of the standard.

a. Backup: this is a specialized form of replication, how much should this be standardized? Is the answer to this question dependent upon the kind of environment (DBA or DBA-less/embedded) that the ODBMS is operating in?

5. events/triggers: do we want to standardize certain kinds of activity (callbacks et al.) when certain database operations occur?

6. update within query facility: this is a recognition of the limitations of LINQ, which does not support object update; it is “read-only.” Generally, object updates and deletes are performed by method invocations in a program and not by query statements.
The question is, since LINQ allows method invocations as part of navigation, e.g. “my_employee_obj.getBoss().getName(),” is it possible in cases like this that such method calls could have side effects which update the object(s) in the navigation statement? If so, what should be done?

7. extents: do we expose APIs for extents to the user?

8. support for C++: how will we support C++/legacy languages for which a LINQ-like facility is not available? We could investigate string-based QL like OQL and/or we could use a facility similar to Cook/db4o “native queries”
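The first item in this list asks whether keys and indices could be declared by annotation, with the annotation standardized and the implementation left to vendors. A minimal sketch of that idea follows. The @Indexed annotation, the Customer class, and the IndexCatalog helper are all hypothetical names invented for illustration; only the java.lang.annotation and java.lang.reflect calls are real.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical annotation: a standard could fix its name and meaning,
// while each vendor decides how the index is actually built.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Indexed {
    boolean unique() default false; // optional uniqueness constraint
}

// Hypothetical persistent class.
class Customer {
    @Indexed(unique = true) String customerId;
    @Indexed String lastName;
    String notes; // not indexed
}

class IndexCatalog {
    // What a query optimizer might do: discover which fields carry an index.
    static List<String> indexedFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isAnnotationPresent(Indexed.class)) {
                names.add(f.getName());
            }
        }
        return names;
    }
}
```

Here the annotation only records intent. How the index is stored, and whether the optimizer actually uses it, would remain vendor-specific.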

R. Zicari: And what are the areas you definitely do not want to standardize?

Mike Card: Areas we do not want to standardize are:

1. garbage collection: issue here is behavioral differences between “embedded” (linked-in) OODBMS vs. client/server OODBMS

2. stored procedures/functions/views: these are relational/SQL concepts that are not necessarily applicable to object-oriented programming languages which are the purview of object databases.

R. Zicari: How will you ensure that the vendor community will support this proposal?

Mike Card: We plan to discuss this list and verify that others not present agree with the grouping of these items. We should also figure out what we want to do with the items in the “middle” group and then begin prioritizing these things. It appears likely that a next-generation ODBMS standard will follow a “dual-track” model in that the query mechanism (at least for Java) will be developed as a JSR within the JCP, while all of the other items will be developed within the OMG process.

For C# (assuming C# is a language we will want an ODBMS standard for, and I think it is), the query API will be built into the language via LINQ and we will need to address all of the “other” issues within our OMG effort just as with Java. In the case of C# and Java, most of these issues can probably be dealt with in the same manner.

How much interest there is in a C++ standardization effort is unclear; this is an area we will need to discuss further.
A LINQ-like facility for C++ is not an option since unlike C# and Java there is no central maintenance point for C++ compilers.

There is an ISO WG that maintains the C++ standard, but C++ “culture” accepts non-conformant compilers so there are many C++ compilers out there that only conform to part of the ISO standard.

The developers present who work with C++ mentioned that their C++ code base must be “tweaked” to work with various compilers as a given set of C++ code might compile fine with 7 compilers but fail with the compiler from vendor number 8.
In general, the maintenance of C++ is more difficult than for Java and C# due to inconsistency in compiler implementation and this complicates anything we want to do with something as complex as object persistence.

Some Useful Resources:
- Panel Discussion “ODBMS: Quo Vadis?”

- Java Object Persistence: State of the Union PART II

- Java Object Persistence: State of the Union PART I


72 Comments
  1. Wow, there is something I don’t really understand. What are they actually going to do? Standardize LINQ? I think Microsoft has already done it pretty well.

  2. Nina
    the focus here is to standardize an API for Object databases. Mike Card is indicating that LINQ is a suitable candidate. But it is missing in Java.

    RVZ

  3. Thank you for the answer. Unfortunately, I still have some doubts regarding the review and the whole initiative.

    In my opinion LINQ is not just an API, it is rather a language extension. In order to implement an API it is enough to build a library. However, in order to provide a language extension, one needs to change the syntax/semantics/pragmatics of the programming language (in this case Java). The latter is much more difficult if you do not control the language. OMG should be aware that Java is a relatively open platform, while Microsoft controls everything in .NET and SQL Server. Assuming that Java is indeed extended with LINQ-like support (I guess it will be very hard to convince Sun), how would third-party, proprietary solutions be introduced in Java? Wouldn't the decision process (DBMS provider->OMG->SUN) take too much time? Wouldn't the whole Java platform become too complex and unstable?

    Another thing that concerns me is the following sentence:

    "Stored procedures/functions/views: these are relational/SQL concepts that are not necessarily applicable to object-oriented programming languages which are the purview of object databases."

    But they are applicable to databases! Those features are so essential for database programmers I cannot imagine any serious database management system that does not implement them.

    In my opinion the sentence quoted above represents the point of view which led the whole idea of object databases to failure. I believe that object databases by no means should be perceived as object-oriented language extensions! Since databases are much more complex than typical programming languages, OMG should take the database-centric approach, not the programming language one.

    Databases should be controlled by database programming languages, not application (traditional) programming languages. An example of such a language is Oracle PL/SQL. Although Oracle DBMS supports Java as a server-side programming language, almost nobody uses it. Why? Because PL/SQL is very well integrated with the database. It provides the opportunity to develop software at a much higher level of abstraction than what is currently offered by Java + Hibernate or Java + current object-oriented DBMS or .NET + LINQ.

    I think that instead of trying to copy Microsoft LINQ, OMG should rather concentrate on developing a new database programming language in the spirit of PL/SQL (but well designed and object-oriented). As a person who has had the opportunity to develop applications using various persistence solutions for Java, I believe that the LINQ-like Java extension wouldn't provide much more functionality than what is currently offered by Hibernate. Why reinvent the wheel? Isn't it better to invent something new?

    Just my 2 cents. Sorry for taking your time ;)

  4. Nina
    all valuable comments.
    I believe it should be of interest for you what Carl Rosenberger is trying to do.
    Pls check:
    http://developer.db4o.com/blogs/carl/archive/2008/05/02/linq-for-java.aspx

    When you say “OMG should rather concentrate on developing a new database programming language in the spirit of PL/SQL (but well designed and object-oriented)”.
    The problem with that is who is going to use yet another database programming language?
    This is not a technical issue though in my opinion.

    RVZ

  5. I don’t think that LINQ implies a ‘big’ language change. In my project JaQue (http://jaques.googlecode.com) I’m perfectly set with closures addition only, which will probably be introduced in Java 7.

    Regarding the language capabilities, I think, that Java should not strive to be able to express all the PL/SQL or other Database concepts. As it is mentioned in the interview, if Java will handle majority of use cases that will already bring a tremendous value.

    Kosta

  6. Roberto,

    I think you underestimate programmers and their capability to learn new things. If a language makes their life easier, they will start to use it. Please look at recent examples, like PHP and Ruby.

    The biggest difficulty in learning a new language is to familiarize oneself with the API of the standard environment. I don’t expect a database language similar to PL/SQL to have a huge standard library.

    Konstantin,

    ANY language change is a problem if you don’t control the language. Can you tell me what “tremendous value” will LINQ bring to Java? In my opinion – not much compared to Hibernate.

    I also don’t understand why an object database should give up powerful database mechanisms like stored procedures or views. Have you ever written a database application working in an OLTP environment? How would you make it work fast enough (or work at all) without stored procedures?

    The problem here is that people involved in this project seem to be Java programmers, not database specialists. You are going to add persistence to the Java programming language – that’s all. Unfortunately, it has nothing in common with defining the new standard of object-oriented databases (wasn’t it the goal of this initiative?).

    Please understand that JDBC is not the standard of relational databases, it’s a standard of accessing relational databases. Similar with Java + LINQ.

    It’s OK to define an API + Java support of a database middleware, but how would you design a remote controller without designing the TV first?

    This work may help you promote db4o as a tool working in small, embedded systems but it will not help define the standard. You need much more to do that.

  7. Nina,

    ANY language change is a problem if you don’t control the language.

    Agree, that’s why I want to bring LINQ capabilities without changing the language. Please see http://jaque.googlecode.com

    Can you tell me what “tremendous value” will LINQ bring to Java? In my opinion – not much compared to Hibernate.

    LINQ is a layer above Hibernate or any JPA. See it as JPQL embedded into Java.

    Kosta

  8. LINQ is a layer above Hibernate or any JPA. See it as JPQL embedded into Java.

    So another layer of complexity? Why does a simple thing like storing/searching data in the database have to be so complex? I don’t know many people who can REALLY understand how their Java EE applications work. Does providing new and new layers of complexity help design more stable/faster software in shorter time?

    LINQ is a layer above Hibernate or any JPA. See it as JPQL embedded into Java.

    I have spent many nights trying to optimize silly SQL queries generated by Hibernate. Assuming that Hibernate is just the lower layer, I guess the optimization path in your case would be even longer (LINQ to HQL and then from HQL to SQL?). Can your optimizer really do that?

  9. They are all levels of responsibility, consider:

    JDBC – accepts DB specific SQL statements. Most powerful.
    HQL – abstracts DB specific SQL. Less power, but DB neutral and Object Oriented.
    LINQ – provides Java language bindings to HQL.

    As a programmer you are free to choose which layer better suits your needs, based on application requirements, your skills etc.

    Regarding the optimizations: as the implementation will evolve it will optimize better and better. Thus over time you will get ‘free’ upgrades, bug fixes etc – as usual. There is always a chance that at some corner cases the hand-tuned query string will do better. For those cases we should log the generated queries, providing an opportunity for a programmer to review them and choose the right thing to do (rewrite her statement or even directly call JDBC).

    Kosta

  10. Nina says “I think you underestimate programmers and their capability to learn new things. If a language makes their life easier, they will start to use it. Please look at recent examples, like PHP and Ruby.”

    I take your point here. However, when it comes to enterprise computing and data, this “bottom up” approach of accepting new technologies may clash with company policies and internal rules.
    I am actually curious to see if the OMG is able to pull out a “standard API” for object databases that gets used.

  11. Nina, you raise some valid concerns, but I don’t think any of them are insurmountable. We have to assume that Java can be changed, even if it is difficult. If not, then Java will fall behind in terms of innovation. It is easier to copy a design that has been worked out by somebody else; Microsoft has shown this many times, so it’s fair that Java should take things that are good and include them. (There are some other stick-in-the-mud aspects of Java where it could borrow from C#: property methods, type inference, etc., but these are less critical.)

    As for creating a new database programming language, I think this misses the point. The problem is how to specify queries and updates from within Java — it’s the connection between the PL and the DB that’s hard. There is no reason why LINQ can’t invoke stored procedures where they are needed. It would be useful if SQL was more uniform, for example, if a stored procedure call could be used in part of a join.

    The thing that I find interesting is whether Java could jump ahead of LINQ by fixing some of its limitations. Two things that come to mind are updates (which were mentioned) and better prefetch.

  12. William,

    I also like Java and I wish it all the best. However, I believe that the future of Java should be left to Sun. As far as I know the ODBT WG is working on standards for object databases, not for object-oriented programming languages (especially if it’s only Java).

    I agree with you that a seamless connection between the PL and DB is very hard to achieve. I don’t think it is the most important problem in the area of object databases, but let’s say it is.

    As you know, the area of persistent programming languages has been a subject of research for a few decades now. Starting from Pascal (e.g. Pascal/R), through Modula (e.g. Persistent Modula), ending with Java (e.g. PJama), all of the research projects failed. I personally know a professor who has been working on this problem for the past 25 (or even more) years.

    The lesson learnt from all those efforts is that one CANNOT take an existing, traditional programming language (whether it’s Java, C#, C++, Ruby, or whatever else), extend it with database constructs and get a satisfactory solution. There are too many differences between the worlds of databases and programming languages (so called impedance mismatch).
    The advent of LINQ doesn’t change anything here, so from my point of view any attempt to integrate it with Java is simply a waste of time.

    The only known way of achieving a seamless integration between procedural languages and database constructs is through designing:
    1) a database programming language (DBPL) in the spirit of PL/SQL,
    2) its runtime environment in the form of a full-fledged database management system.

    Once one has done that, one can try to shift such a DBMS to the client side. What one gets is a database management system at the client side (client applications, even GUI-based, written using DBPL), at the application-server side (client-server application logic written using DBPL), and at the database-server side (data-intensive logic written using DBPL).

    This approach gives one the full integration of procedural and declarative constructs in a distributed environment and the ability to build complex database applications using just a few lines of code.

  13. Nina,

    I am well aware of the history you cite. And I agree that the history of attempts to turn PL runtimes into databases has been rife with problems. But I think that the converse idea of taking a database and extending it so it can implement the entire system (including the client), is also doomed to fail. I think that this is what you are proposing, but I could be wrong.

    The only solution that I see as viable is to keep separate databases and clients: the database is a robust data engine and the client is written in a general purpose language. I believe this is a good approach for many reasons, including scalability, integrity of data, evolution, transactional behavior, and client programmers’ needs. If you believe this, then the key problem is to find a way for the PLs to reach out to the databases effectively, and for DBs to be designed to support PLs’ needs. LINQ is a better way for PLs to talk to databases. It’s not perfect, and there is still work to make databases easier to talk to. But that is what I think we should be doing. I don’t think that we should try to find an all-encompassing single language that does it all. I suppose that makes me a postmodernist.

    Sun has created a process for proposing language evolution. It has worked fairly well in the past, and that is the process that will be used to attempt to add LINQ to Java. It might not work, but it’s worth a try.

    I’m actually working on an essay on the gulf between PL and DB viewpoints on this problem. If you’d be interested in commenting on it, I could send you a copy. I’d love to get your feedback.

    William
    wcook@cs.utexas.edu

  14. Thank you to Roberto for drawing my attention to this blog. I’m travelling at the moment so will leave reading the comments for the flight home.

    Having read the title and conversation, though, I’d like to comment on what I see as needed in the Java world relating to OODBs and hierarchical persistence. LINQ is an attractive end-goal; however, it is so totally MS-centric that it would be impractical for the “real” world. I read the conversation with interest as many of the problems and issues ring true with my experiences. What we’re missing, however, is a nice simple Java API for storing hierarchical trees of data/objects, not a whole new realm of thinking and not yet another layer on top of an already inefficient ORM layer.

    ORM is perfect when the problem is simple but it doesn’t scale well due to the impedance mismatch complexity. ORM may well be an implementation to solve the problem but the programmer needs an API that abstracts the mapping.

    I have a number of clients (investment banks); yes, their numbers are reducing, I know, but those that are left are investing significant amounts of time and money in developing XML databases, or should I say APIs to store XML in DBs: some OO, some R and some specifically XML (a la MarkLogic). What they want, however, is not a single solution but a generic API to abstract the multiple solutions, one of which could well be LINQ.

    I’ve got to go and do some real work now, I’ll read the comments on the flight home and chat later, I just wanted to get this out.

    -John Davies-
    CTO Incept5

  15. Solutions to the OO/Database problems often tunnel focus on a few issues and miss others – mainly because each of us has a different problem to solve, as this discussion shows.

    1 – Pure object database solutions solve the problem of impedance mismatch and many related programming issues, but usually at the expense of the power offered by the large (relational) database products, database management tools or simple product maturity. They also frequently can't be used with complementary systems such as commercial reporting/analysis tools (which for the most part only offer ODBC or similar interfaces – and mapping an OODB to ODBC just brings back the same old Object/Relational issues again).

    2 – Object/relational mapping solutions allow you to work with your OO language, and still connect to the relational backend, with all the power, corporate standards and commercial tool availability implied. However there are always compromises with (server side) business logic, as O/R mapping usually makes it difficult to link complex or pre-existing business logic or security on the server to your front end.

    LINQ is a way of addressing data query tasks and retaining OO compile-time type checking (something SQL lacks), while allowing for both in-memory execution against local object collections, or deferred execution/translation to SQL on a backend. For someone using a pure OODB, LINQ and the standards implied will probably be a real benefit, as it's a step up from what has been offered previously.

    Providing LINQ method based query functionality to Java could be done as a straight API/function library – but going the whole hog syntax wise, as per the following C# 3.0 Query Expression example:

    IEnumerable<string> query = from s in names where s.Length == 5 orderby s select s.ToUpper();

    would not happen unless Sun made changes to the compiler and language specification for Java.

    For anyone still using a relational backend, it doesn't solve any of the key issues they already face – though it will make work simpler on data buffered at the client. The same problems of how good the generated SQL is, and whether it makes use of pre-defined procedures, views, etc. remain.
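For comparison, the C# query expression quoted in the comment above can be written in Java as plain library calls. This is only a sketch: it relies on lambdas and the java.util.stream API, which were added in Java 8 and did not exist when this discussion took place, and the class and method names are made up for illustration. It runs in memory against a local collection and says nothing about translation to a database query.

```java
import java.util.List;
import java.util.stream.Collectors;

class MethodQuerySketch {
    // Java counterpart of:
    //   from s in names where s.Length == 5 orderby s select s.ToUpper()
    static List<String> fiveLetterNamesUpper(List<String> names) {
        return names.stream()
                .filter(s -> s.length() == 5)   // where
                .sorted()                       // orderby (natural String order)
                .map(String::toUpperCase)       // select
                .collect(Collectors.toList());
    }
}
```

For the input ["Tom", "Harry", "Dick", "Alice", "Mary"] this returns ["ALICE", "HARRY"].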

  16. A few thoughts:
    1. I’m always interested in the development of better ways for application code to access databases.
    2. There’s a long history of new database access languages that are meant to supersede SQL. I have no doubt that jLINQ (or whatever it ends up being called) will enjoy the same level of success as these previous efforts.
    3. There are very good architectural reasons for implementing some functionality in the database and some in other locations. Every so often the extremes of “everything in the DB” or “everything outside the DB” make sense, but this is incredibly rare in practice. You need to find the sweet spot.
    4. Instead of spending all this effort trying to put together a slightly better technical approach to solving a problem which has been addressed many times over, it would be far more effective adopting practices and philosophies that enabled data professionals and developers to work together more effectively. This is something that I try to focus on at http://www.agiledata.org.

    - Scott
    Practice Leader Agile Development, IBM

  17. @Peter:

    1. I don't think that 'Java LINQ' must have exactly the same syntax MS LINQ has. Rather, we need it to be clear, typesafe, easy to use and to target the same domain of problems. With the addition of closures, which I believe will arrive in the next Java release, I can offer the following syntax:

    Iterable<String> query = from(names, where({String s => s.length() == 5}, orderBy({String s => s}, select({String s => s.toUpperCase()}))));

    I think it meets the requirements, and is sometimes even better than MS LINQ, since every closure is a regular method and is not subject to any restrictions. For example, there is no need for the 'let' clause that LINQ introduced to cope with some of them.

    That's what I do in my JaQue project at http://jaque.googlecode.com

    2. I think that MS slightly mistargeted LINQ with “LINQ to SQL”, and “LINQ to Entities” ‘fixes’ that. LINQ shines when it targets Objects or Object Models (i.e. ORM). When this is the architecture, the ORM handles the low level SQL generation, based on its annotations, configuration etc, delivering better quality in translation.

    This is the architecture of JaQue project http://jaque.googlecode.com since it targets JPA.

    K

  18. The discussion is very interesting as the problem itself is. But I’m not sure if the approach of LINQ is correct – in general I agree with Nina. Taking LINQ as a standard is rather (extremely?) strange. First, it is Microsoft proprietary and Java comes from Sun. Next, it defines nothing but a PL syntax extension (awkward and difficult to generalise and propagate to other PLs, IMHO). Nothing underlying can be controlled or accessed by a programmer, many database features are lost (transactions, stored procedures, etc). LINQ is not a query language as its expressions are not evaluated by a database engine – they need to be translated by some extra middleware (yet another transparent but not translucent layer) to the database specific QL. Resulting native queries are again out of programmer’s control (substantial optimisation issues). And, finally, what about DDL and DML constructs? OK, let’s assume the schema does not need to be changed, or it’s maintained from somewhere else, or DDL is not a part of QLs. Fine. But still I cannot imagine working with a database without updating data. I could not find anything like that in LINQ, while simple ‘read-only’ queries are insufficient.

    I have been a Java programmer for quite a few years and I was unable to find a decent middleware for accessing relational databases that complies with my requirements – recently I got seriously disappointed with Hibernate and its performance issues. Therefore, I’m still using JDBC with SQL strings, although the language has many drawbacks and flaws, and its expressions are very often implementation dependent. But it allows me to control what I do with my database and how. What we need is a new flexible and efficient query language, not masking a poor one with something else that is only expected to be better. As LINQ is supposed to wrap anything existing below (LINQ for X, LINQ for Y, …), it assumes our database technology (mostly relational) is what we actually need and are happy with. Are we? Don’t we need to develop new (object-oriented) technologies? There’s no chance for database evolution with such an approach. No progress, only stagnation. Personally, I do not like such a future.

  19. I’m sorry to say it jacenty, but just about everything you say about LINQ in your message is false.

    “First, it is Microsoft proprietary and Java comes from Sun. “
    * LINQ is a proprietary implementation of a general idea which originated in academic literature. There is no reason why Java cannot borrow from LINQ in the same way that C# borrowed from Java.

    “it defines nothing but a PL syntax extension (awkward and difficult to generalise and propagate to other PLs, IMHO)”
    * LINQ is a PL syntax and semantic extension. It is really two concepts. The first is a set of higher-order method calls for Where, Select, GroupBy, etc. The second is some syntactic “sugar” to make these calls look more like SQL. The first would easily generalize to other languages; the syntax is not important.
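    As a rough illustration of the first concept (higher-order method calls), here is a minimal sketch written against Java's `java.util.stream` library rather than LINQ itself. The `Employee` class, the sample data and the method names are hypothetical, invented only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: Employee and the sample data are invented for this sketch.
final class HigherOrderQuerySketch {

    static final class Employee {
        final String name;
        final String dept;
        final double salary;

        Employee(String name, String dept, double salary) {
            this.name = name;
            this.dept = dept;
            this.salary = salary;
        }
    }

    static List<Employee> sampleData() {
        return Arrays.asList(
                new Employee("Doe", "R&D", 1200.0),
                new Employee("Roe", "R&D", 900.0),
                new Employee("Lee", "Sales", 1500.0));
    }

    // Filter, then project: each step takes a function as its argument.
    static List<String> namesEarningMoreThan(List<Employee> employees, double min) {
        return employees.stream()
                .filter(e -> e.salary > min)   // plays the role of Where
                .map(e -> e.name)              // plays the role of Select
                .collect(Collectors.toList());
    }

    // Group by department and count the members; TreeMap keeps the keys sorted.
    static Map<String, Long> headcountByDept(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(e -> e.dept, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(namesEarningMoreThan(sampleData(), 1000.0));
        System.out.println(headcountByDept(sampleData()));
    }
}
```

    Nothing here needs new syntax: the query is ordinary method calls that receive functions, which is the part of the idea that carries over between languages.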

    “Nothing underlying can be controlled or accessed by a programmer”
    * LINQ is implemented by a library of method calls. I think you could modify or replace parts of it fairly easily. Or you could implement your own version.

    “many database features are lost (transactions, stored procedures, etc).”
    * Transactions are orthogonal. It is true that .NET seems to have lost some of the capabilities from Microsoft Transaction Server (which was copied to create EJB, by the way). But I’m pretty sure they will merge the two eventually.

    * LINQ can call stored procedures.

    “LINQ is not a query language as its expressions are not evaluated by a database engine – they need to be translated by some extra middleware (yet another transparent but not translucent layer) to the database specific QL.”
    * This comment seems very confusing. I think LINQ is a query language. As far as I’m concerned its queries are executed by the database engine because of the translation that you describe. Using your argument, SQL is not a query language, because it is translated into a lower level query plan for execution.

    “Resulting native queries are again out of programmer’s control (substantial optimisation issues).”
    * This is true. There is a big debate about whether automatically generated queries are good enough. I think they are, in most cases. If they aren’t then create a stored procedure for the 5% of complex queries where it matters.

    Finally, you are right that LINQ doesn’t support bulk update operations, as far as I know. But 95% of updates are to single objects, and those are easily handled by storing modified objects.

    I don’t have any comments on your second paragraph, which seems reasonable enough.

  20. Well I think JWislicki is right on the nose. While the academics and visual programmers who’ve never actually worked on real systems will always side with LINQ I just can’t see the real world accepting anything that originates from Microsoft.

    Don’t get me wrong, I think LINQ is pretty cool and I’d love to see it succeed but it just won’t. It’s another technology that works very nicely in the tight proprietary world of Microsoft but why would any other vendor have any interest in implementing it on their technology. Even if they do it will just be an academic exercise like MONO was/is. The only reason it was ported was so that MS could say .NET wasn’t locked into the MS OS. There are too many owners, Redhat’s Application server using Apache components with a Mule ESB running on Sun’s Java with Oracle’s database on SUSE Linux. Who’s going to own the LINQ part here?

    LINQ is a nice technology that’s worth stealing a few ideas from, assuming of course MS didn’t patent them. It will never work outside of the MS world in anything more than a few academics’ and enthusiasts’ prototypes. I wish it every success in the MS world.

    -John-

  21. William, thanks for your response and explanation. I still disagree with several things and I will never like eclectic constructs. However, I looked at several official LINQ examples and all I could find were select queries (hence what I wrote about updates and stored procedures).

    As for transactions, usually they can be to some extent managed by a programmer. Another issue (not related directly to LINQ) is whether transaction orthogonality to a QL is a correct approach.

  22. Personally I find it difficult to get too excited about LINQ one way or the other (a bit like my feelings about Vista!). I don’t see that it allows me to do anything that I couldn’t already do with a decent API, and I don’t see how it makes those things significantly easier. Integrating this stuff into the language itself appears to me to be entirely unnecessary, just adding complexity whilst removing choice.
    Perhaps LINQ is solving the wrong problem?

    In terms of the OMG, LINQ is a language extension NOT an API. It therefore falls completely outside of what Roberto said, “the focus here is to standardize an API for Object databases.”. The statement, “LINQ is the best option for a future Java query API”, appears to be a contradiction in terms.

  23. Jim:

    I have quoted Mike Card: “The consensus at this meeting and at ICOODB conference in Berlin was that LINQ was the best option for a future Java query API since it already had broad support in the .Net community.”

  25. Dear all,

    just my few cents…

    One important point is that ODBMS needs a new API and
    a fresh stipulation of a standard soon. At ICOODB 08 there
    were only 2 visible alternatives : LINQ and SBA/SBQL
    (in this context I really can not understand the
    mentioning of hibernate or hql in the context of
    object databases?! Gavin doesn’t like us anyway ;-).

    And although I think the capabilities of SBA/SBQL are awesome
    - and I appreciate the great amount of work that has been
    invested in this approach -,
    LINQ is a language that nearly everyone knows or is able
    to learn in a second when looking at this page
    http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx

    By the way: I can see at least 12 books on amazon on LINQ.

    So LINQ is already mainstream and would help ODBMS to get
    a huge mainstream momentum. And in my opinion the Java
    problems will be solved soon one way or another.

    I am not sure if we have time for a decision till ICOODB 2009
    at ETH Zürich but anyway I hope that OMG will take the
    best decision soon.

    Stefan Edlich

  26. Mike Card says:
    The consensus at this meeting and at ICOODB conference in Berlin was that LINQ was the best option for a future Java query API

    This is excellent news.

    What has been missing for a breakthrough paradigm shift towards a widely adopted use of object databases was a standard for querying.

    LINQ is the chance for such a standard.

    When we use Java, of course we would love to stay in the object-oriented world with our database queries. LINQ does that: It allows method calls on objects and it returns objects.

    LINQ could take language integration of queries to the next level and advance development productivity and quality:
    With LINQ all queries would be typesafe, compile-time checked and refactorable.

    LINQ could also make database backends truly interchangeable. A LINQ provider would be fully standardized by the language, with no ugly dialects that create incompatibilities, like we see them in SQL.

    If LINQ finds its way into the Java language we would of course see great implementations for in-memory use and on top of relational databases. Then database interfacing code
    would only have to be written once and it could be run on the best platform for the respective task:
    - In memory for testing
    - against relational databases, if corporate policies require their use
    - against object databases for maximum performance or for minimal resource consumption on embedded systems.

    Let’s go for LINQ!

    LINQ is making its way on .NET. There is no reason it should be less successful if it becomes available for Java.

  27. At ICOODB 08 there
    were only 2 visible alternatives : LINQ and SBA/SBQL. (…)And although I think the capabilities of SBA/SBQL are awesome
    - and I appreciate the great amount of work that has been
    invested in this approach -,
    LINQ is a language that nearly everyone knows

    If there are two good alternatives, why not use them both? Do they contradict each other, or can they coexist? I have the impression that LINQ and SBQL/SBA target different areas of object databases. The goal of the former is to provide the “remote controller”, while the latter concentrates on the actual “TV”.

    I think that the protagonists of LINQ forget (among other things) about one important (crucial?) aspect of every query language, i.e. optimization. Those “cool” features of LINQ like queries that are “typesafe, compile-time checked and refactorable” do not have much value if you have to wait years for a single query to be executed. In the case of LINQ there is SQL Server that does (better or worse) this job (assuming that you send it an SQL query). But how would you deal with optimization in the case of an object-oriented database? No, you can’t put it off (“as the implementation will evolve it will optimize better and better”), you need to know it NOW. In the case of SBA very powerful (some of them don’t exist even for SQL) optimization algorithms already exist. What about LINQ?

  28. Nina

    you have identified a *crucial* point: query optimization. In fact, lack of good query optimization was apparently one of the technical obstacles to a wider adoption of the first generation of ODBMSs back in the 90s.

    In this respect I would like to quote an interview of Professor David Maier by Marianne Winslett in 2002. To the question “Are there any other results from object-oriented database research that you would single out as having had long-term impact?”, David says “The other thing [that] I think will have impact is [that] I think we finally figured out after ten years how to optimize OQL, to do cost-based [query] optimization [for OQL] and solve some of the hard [optimization] problems. And I think that will [be] useful for XML query languages.”
    I do not know if all of this past research knowledge on optimizing queries for ODBMSs can be applied to LINQ now…

  29. David says “The other thing [that] I think will have impact is [that] I think we finally figured out after ten years how to optimize OQL

    With all due respect to Prof. David Maier, I doubt he has ever known that. In order to optimize OQL queries, one would have to define its precise semantics first. Unfortunately, to this day OQL’s semantics lack sufficient precision.

    This is not the case with SBQL. Because its semantics is clearly defined, optimization is possible. As far as I know, the optimization techniques designed for SBQL are not just some performance “tricks”, but very powerful and general algorithms.

    If people love the perspective of LINQ + Java + ODBMS so much, I can think of only one viable method of saving this project from a complete disaster: generate SBQL queries the same way MS LINQ generates SQL queries. Of course, if one takes this approach, one needs to design the architecture of the ODBMS according to the requirements set by SBA.

  30. What about this, nina?
    Formal semantics and analysis of object queries

    OQL is not that difficult a language to pin down. It’s not that different from SBQL either. I don’t see what the fuss is all about. But I do agree that optimization is the key requirement of queries. I think that LINQ gives the back end enough information to optimize properly, whether the back end is an ODBMS or an RDBMS.

  31. What about this, nina?

    Sorry, I can’t download it. I’m not an ACM member.

    OQL is not that difficult a language to pin down.

    If it’s not, then why hasn’t anybody done it so far?

  32. Hi.

    I’ve been following this thread and my opinion is that OMG was very clever at spotting LINQ as the possible foundation for a standard API for object databases (the fact that it’s missing in Java right now is not a blocker).

    IMHO LINQ must not be put aside because it was introduced by Microsoft; it has all the potential of escaping Microsoft’s stronghold.

    On the technical side I would like to say that yes, LINQ-like query integration in the language helps developers design more stable and faster software in less time, mainly because of the reasons that Carl mentioned in this thread. db4o and Prof. Cook were pioneers in introducing native queries, which helped real developers working with real applications to deliver in a shorter time.

    The current LINQ implementation by MS might have many flaws but it’s evolving and certainly the Java version can improve on it. There are many arguments in favor of LINQ (eg this one) and adoption is growing.

    The train is moving and it won’t stop!

  33. Because the discussion mentioned SBQL, I think it will be good to know what it is. SBQL web pages are http://www.sbql.pl. Recently we have prepared a programmer manual for our system ODRA where SBQL is fully implemented. See
    http://www.sbql.pl/various/ODRA/ODRA_manual.html
    The manual does not include the section on transactions, because I have decided to make a new version of this feature (old transactions do not support distributed databases). Recently one of my coworkers has implemented an interface from .NET (C#, …) to ODRA via SBQL. This part is also not included in the manual yet.

  34. I have more general doubts concerning Mike Card’s proposal and this discussion. Java is already standardized by ISO. I suppose that any extension to Java should be the business of a corresponding ISO committee rather than an OMG committee. Standardization of LINQ by OMG again raises doubts for me. Although Microsoft is a member of OMG, it is deeply in opposition, at least in the area of middleware (Roger Sessions severely criticised the CORBA standard, comparing it to COM/DCOM).

    I also would like to note that OMG already standardized a query language known as OCL. This was done together with the standardization of UML2 aka Executable UML. In this standard OCL is used as a constraint language (for specifying preconditions, postconditions and assertions), but in another standard, QVT, OCL is used as a regular query language. Is OMG prepared to standardize two query languages? Note that OCL is truly object-oriented, addressing the UML object model. It is in no way related to the relational model.

    I know that so far there are no OCL programmers, but this can quickly change. Several groups have already implemented OCL, in particular Martin Gogolla’s group and my group. In our case OCL is implemented as a database query language on top of SBQL, hence it inherits everything from SBQL, in particular, query optimization and access to external (distributed) resources.

  35. In addition to the above comment, I would like to note that the tradition of OMG is developing standards that are platform and vendor independent. All OMG standards that I know (CORBA, UML, UML2, OCL, QVT, MDA, …) are developed from scratch as a tradeoff between proposals of different industrial OMG members. If a new OMG database standard were based on Java and LINQ, it could be perceived as direct support for particular companies such as Microsoft and Sun. I doubt that other big OMG players (IBM, HP, SAP, Oracle, …) would be happy with such a solution. If my impression is right, then we can forget about such a new standard proposal ever being approved by OMG.

    In this context I suggest paying more attention to SBQL again. It is platform independent, not supported by any company, based on the powerful and abstract SBA theory, and its semantics is formally defined for a rich family of UML-like object models. The SBQL implementation supports almost everything that is important for making such a standard, including a powerful query language (much more powerful than LINQ), query optimization, all kinds of updates, stored procedures, classes and views, semi-strong typechecking, and more.

  36. Dr. Subieta,

    I think you are confusing things: OCL is an object-oriented notation for a predicate in mathematics. OCL does allow iteration of collections, to allow “for-all” and “exists” aggregations. That is, OCL is similar to a SQL where clause. OCL does not construct structured values, as in the OQL or SQL select clause, so it is not a full query language. (I happen to think that the original mathematical notation is nicer, and it was silly to create an object-oriented syntax for it. But it is just syntax, so I’m not going to worry about it.)

    I also think that SBQL and OQL are very similar, as I have said before. Nina said that OQL does not have a formal semantics, but then she admits she hasn’t read one of the papers that does give a semantics for OQL. SBQL is stack-based (like forth) so it avoids some explicit binding operators. That doesn’t seem like a big difference to me.

    LINQ is not a new database query language. It is a programming language interface that allows you to specify queries in type-safe way, and also to cleanly represent the queries as explicit values so that they can be optimized (or sent to a database for optimization).

    In other words, LINQ is a solution to David Maier’s original definition of impedance mismatch: “Whatever the database programming model, it must allow complex, data-intensive operations to be picked out of programs for execution by the storage manager, rather than forcing a record-at-a-time interface.” [1] You see, LINQ allows parts of a program (the queries) to be lifted out and sent to the database. Other techniques for doing this (notably query strings) are simply a bad way to partition a program. LINQ is better. It’s not perfect, but it’s better.
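    The idea of a query being an explicit value that can be lifted out of the program can be sketched in a few lines of Java. This is not how LINQ is implemented; the `Condition` interface, the helper names and the SQL rendering are all hypothetical, and the sketch ignores quoting and parameter binding. The same query value is either evaluated in memory or rendered as text for a database to optimize.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: this tiny query representation is invented for the sketch.
final class QueryAsValueSketch {

    // A condition is a value that can be inspected and translated, not just executed.
    interface Condition {
        boolean matches(Map<String, Object> row); // evaluate in memory
        String toSql();                           // hand over to a relational back end
    }

    static Condition greaterThan(final String field, final double bound) {
        return new Condition() {
            public boolean matches(Map<String, Object> row) {
                return ((Number) row.get(field)).doubleValue() > bound;
            }
            public String toSql() {
                return field + " > " + bound;
            }
        };
    }

    static Condition and(final Condition left, final Condition right) {
        return new Condition() {
            public boolean matches(Map<String, Object> row) {
                return left.matches(row) && right.matches(row);
            }
            public String toSql() {
                return "(" + left.toSql() + " AND " + right.toSql() + ")";
            }
        };
    }

    // Back end 1: run the query value over an in-memory collection.
    static List<Map<String, Object>> runInMemory(List<Map<String, Object>> rows, Condition c) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (c.matches(row)) result.add(row);
        }
        return result;
    }

    // Back end 2: translate the same query value into text for a database engine.
    static String toSelect(String table, Condition c) {
        return "SELECT * FROM " + table + " WHERE " + c.toSql();
    }

    static Map<String, Object> row(String name, double salary, double age) {
        Map<String, Object> r = new HashMap<>();
        r.put("name", name);
        r.put("salary", salary);
        r.put("age", age);
        return r;
    }

    public static void main(String[] args) {
        Condition query = and(greaterThan("salary", 1000), greaterThan("age", 30));
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("Doe", 1200, 45));
        rows.add(row("Lee", 1500, 25));
        System.out.println(runInMemory(rows, query).size());
        System.out.println(toSelect("Employee", query));
    }
}
```

    In contrast to a query string, the program never concatenates text itself; the back end decides how to execute or translate the query value.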

    Finally, I should say that I have very little faith in existing standardization processes. Standards have always been a weapon as much as anything else; they are not created by thinking from first principles. Your own comments demonstrate this, because you are taking a very political approach to this problem, in saying that you can’t use an idea because a particular company thought it up. I don’t believe that OMG’s standards were made “from scratch”: CORBA was influenced by proprietary offerings, and UML was unified from a number of competing approaches (which had significant consulting practices supporting them).

    [1] David Maier. Representing database programs as objects. In Advances in Database Programming Languages, Papers from DBPL-1, pages 377–386. ACM Press / Addison-Wesley, 1987.

  37. Hi,

    William Cook says:
    I think you are confusing things: OCL is an object-oriented notation for a predicate in mathematics. OCL does allow iteration of collections, to allow “for-all” and “exists” aggregations. That is, OCL is similar to a SQL where clause. OCL does not construct structured values, as in the OQL or SQL select clause, so it is not a full query language.

    I have to disagree. Apart from its primary purpose as a constraint language, OCL is quite a powerful expression language. Please note its "->collect(…)" iterator operation (in OCL’s terminology). This serves a role analogous to OQL’s select clause. It allows sub-queries to be nested in it and can include a Tuple type constructor, so you can construct structured results – also nested ones.
    Yes, the syntax is a bit odd, and there are also some ambiguities in its specification if someone considers its usage as a query language. However, in our current research project that deals with programming in UML, we chose OCL as a query language (because the project aims to maximize reuse of existing OMG specifications) and did not encounter significant limitations in its expressiveness. We were able to fit it in a relatively seamless way into UML’s Activities and Actions modules (for imperative constructs) to construct a query language with programming language capabilities. BTW: More results of this work, performed under the VIDE 6th Framework Program project, including part of the software produced, will be available soon.

  38. William Cook said…
    I also think that SBQL and OQL are very similar, as I have said before. Nina said that OQL does not have a formal semantics, but then she admits she hasn’t read one of the papers that does give a semantics for OQL. SBQL is stack-based (like forth) so it avoids some explicit binding operators. That doesn’t seem like a big difference to me.

    Sorry Prof. Cook, you seem to compare things surely without being familiar with one of them. OQL and SBQL are fundamentally different languages, because SBQL is a fully-fledged object-oriented programming language with one difference in comparison to the classical ones – expressions in SBQL are queries. For instance, 2+2 is a query, sin(x) is a query and Employee where salary > 1000 is a query. Such queries or expressions are used everywhere, in particular, as arguments of imperative statements and as parameters of procedures and methods.

    I also think you have misunderstood the term "stack-based". All programming languages, including C, Java, Pascal, C++, etc., are "stack based", because all of them involve an environment or call stack. The novelty of SBQL is that this stack, defined on the abstract level, is used to specify the semantics of query operators such as selections, navigations, joins, quantifiers, etc. This forms a new theory that is original in both database and programming language domains. I encourage you to understand it. Without this our discussion and comparisons are simply a waste of your and our time.

    I support Nina's thesis that OQL has no formal semantics. I make the thesis even stronger: it is impossible to define a formal semantics for OQL. There are two reasons. The first is that the ODMG object model (database state) is mathematically very imprecise; the standard does not even specify formally the concept of "object". The same concerns the formal model of query results. Formal semantics means that we have to define for each query the mapping State –> Result, and we should do that recursively according to the OQL abstract syntax. If the domains are not precisely defined, then formal semantics is impossible to define. The second reason is that the ODMG standard is full of technical flaws. Contradictions can be found even within half a page of each other (see some of my publications). I spent a lot of time trying to formalize the standard and OQL. Are you curious about the result?

    The result is SBQL and SBA. I don't believe in other formalizations, I saw too many fake formalisms.

  39. I think the main point is that object databases need a common query mechanism, so tooling can be implemented easily enabling access to any vendor implementation.

    Imagine something like, the BIRT LINQ extensions, so BIRT reporting tools can access any OODB or RDB.

    Simply bringing a common query mechanism to the object database vendors is not enough, it needs to be a query mechanism accepted by the software community at large, otherwise we’ve really just created a better form of OQL … big deal it does not help with adoption.

    The approach taken by LINQ lends itself well to the object database notion of “the memory model, is the data model” ala transparent persistence.

    The ideas expressed in LINQ are genuinely interesting and present true value add over alternatives in Java ( as articulated by many in this thread ).

    LINQ fits OODBs well and its implementation in Java would add value to that community. So, object database vendors should promote its implementation in Java and in the process achieve both interoperability and (arguably) by side effect, something widely accepted.

    It brings value to the Java community and brings value to object database technology interoperability.

    I think an important part of this is “it brings value to the Java community”, and as such we should be able to get cooperation from that community in its implementation. If that cannot be achieved, then the positive impact regarding odb interoperability would be significantly diminished.

    -Robert

  40. Prof. Kazimierz,
    We have gotten into this discussion before, and it is never resolved. There certainly may be some ambiguities in the ODMG OQL spec, but rather than working to clear them up, you just say that it's impossible. But at the same time you give correspondences between OQL and SBQL on your own web site. You mention a few fine points where they differ, although some of these are simply that things are not as easy in OQL as in SBQL; this is not an argument about expressive power, it's an argument about syntactic ease. I was comparing OQL and the query part of SBQL, without the updates. You make a big point that "2+2 is a query, sin(x) is a query and Employee where salary > 1000". These are also queries in OQL:
    2+2 is a query, sin(x) is a query and 'select e from Employee as e where e.salary > 1000' is a query.

    I stand by my assertion that SBQL is stack-based in the sense that Forth is stack-based. Your non-algebraic operators are defined so that the left side pushes items onto the stack, and they are implicitly referenced by the right side of the expression. This gives somewhat of an economy of expression (as in Forth) but at the cost of being less explicit in terms of references to values. The difference is a matter of taste.

    At the same time you refuse to consider that any other language could have a formal semantics. I agree that this discussion is not very productive. You have a competing technology to OQL and you are promoting it. I'm fine with that. You may very well have a much better implementation of SBQL than OQL implementations, because OQL was never adopted widely and, as previously mentioned here, the original OODB products didn't have very powerful query optimization. This was a terrible mistake on their part, which I hope will be rectified in future products. It would be better if you gave performance numbers than trying to argue about semantic foundations.

  41. William Cook said:
    You make a big point that "2+2 is a query, sin(x) is a query and Employee where salary > 1000". These are also queries in OQL:

    I never said that these queries are impossible in OQL. My thesis was different: OQL is a query language, while SBQL is a programming language that uses queries as expressions. In SBQL there are no expressions that are not queries and this feature is so far unique for both databases and programming languages. OQL queries can be loosely coupled with imperative statements or procedures, actually as strings of characters. This is not the case with SBQL: queries, similarly to programming expressions, can be arguments of imperative (updating) statements, and can be passed (not as strings!) as parameters of procedures and methods in both call-by-reference and call-by-value mode. Moreover, SBQL is strongly typed, including the use of queries as parameters of procedures/methods.
    For this reason SBQL is higher-level than C#/LINQ. C#/LINQ makes a distinction between expressions and queries, which is illogical, because the typing system is the same. Moreover, LINQ queries cannot be used for updating, which is illogical too. Such limitations do not exist in SBQL.

    William Cook said:
    I stand by my assertion that SBQL is stack-based in the sense that Forth is stack-based.

    I disagree. Although both Forth and SBQL are described as “stack based”, the reasons for this descriptor are different and the stacks that are used by these languages are different. Forth is a somewhat more advanced assembler and involves stacks known as the parameter (data) stack and the return stack. These stacks are EXPLICITLY used by the programmer. SBQL is designed as the most abstract database programming language (more abstract than SQL and LINQ) and no stack is explicitly used by the programmer. SBQL stacks are used for formal description of SBQL semantics. The SBQL run time introduces two stacks: a result stack and an environment (call) stack. Both stacks are used in some form in every programming environment, including Pascal, C, Java, Ruby, etc. The novelty of SBQL is that these stacks are described in an abstract mathematical form, which can be used for formal description of every language construct, including semantics of query operators. There is little in common with the reverse Polish notation (RPN) that is used by Forth and Hewlett-Packard calculators. SBQL does not deal with such a notion.
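    The role of the two run-time stacks can be illustrated with a toy evaluator for the query Employee where salary > 1000. This is only a sketch under simplifying assumptions, not ODRA's implementation or the formal SBA definition: the class, the `Query` interface, the map-based objects and the sample data are all hypothetical. The point it shows is that the programmer writes the query tree only; the `where` operator itself opens a new environment for each candidate object so that the name `salary` can be bound inside it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy evaluator with an environment stack and a result stack.
final class StackBasedEvalSketch {

    interface Query { void eval(StackBasedEvalSketch m); }

    final Deque<Map<String, Object>> envs = new ArrayDeque<>(); // environment stack
    final Deque<Object> qres = new ArrayDeque<>();              // result stack

    // Bind a name by searching the environment stack from the top frame downwards.
    Object bind(String name) {
        for (Map<String, Object> frame : envs) {
            if (frame.containsKey(name)) return frame.get(name);
        }
        throw new IllegalStateException("unbound name: " + name);
    }

    static Query name(String n) { return m -> m.qres.push(m.bind(n)); }

    static Query literal(double v) { return m -> m.qres.push(v); }

    static Query greater(Query left, Query right) {
        return m -> {
            left.eval(m);
            right.eval(m);
            double r = ((Number) m.qres.pop()).doubleValue();
            double l = ((Number) m.qres.pop()).doubleValue();
            m.qres.push(l > r);
        };
    }

    // For each object from the left query, push its attributes as a new environment
    // frame, evaluate the condition inside that environment, then pop the frame.
    @SuppressWarnings("unchecked")
    static Query where(Query left, Query condition) {
        return m -> {
            left.eval(m);
            List<Map<String, Object>> candidates = (List<Map<String, Object>>) m.qres.pop();
            List<Map<String, Object>> selected = new ArrayList<>();
            for (Map<String, Object> object : candidates) {
                m.envs.push(object);
                condition.eval(m);
                if ((Boolean) m.qres.pop()) selected.add(object);
                m.envs.pop();
            }
            m.qres.push(selected);
        };
    }

    static Map<String, Object> employee(String name, double salary) {
        Map<String, Object> e = new HashMap<>();
        e.put("name", name);
        e.put("salary", salary);
        return e;
    }

    @SuppressWarnings("unchecked")
    static List<String> highEarners() {
        StackBasedEvalSketch m = new StackBasedEvalSketch();
        Map<String, Object> root = new HashMap<>();
        root.put("Employee", Arrays.asList(
                employee("Doe", 900), employee("Lee", 1500), employee("Kim", 2000)));
        m.envs.push(root); // base frame: the database's root names
        where(name("Employee"), greater(name("salary"), literal(1000))).eval(m);
        List<String> names = new ArrayList<>();
        for (Map<String, Object> e : (List<Map<String, Object>>) m.qres.pop()) {
            names.add((String) e.get("name"));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(highEarners());
    }
}
```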

    William Cook said:
    At the same time you refuse to consider that any other language could have a formal semantics.

    Sorry, I never said that, this is your invention. I claimed that I do not believe in formalization of OQL and presented the reasons for such a claim. To formalize OQL one must make it consistent, because it is impossible to formalize (and implement) anything that is internally inconsistent. One of many places where OQL is inconsistent concerns name scoping rules. In one place the standard says that each name defined in a query is invalid outside it. But half a page later it gives an example which explicitly uses a name defined in a query outside it. BTW, explaining scoping rules without introducing an environment stack is impossible. Only SBA introduces this stack, hence my doubts concerning other formalizations.

    I would like to underline once again that I am not against ODMG and OQL. Nothing is perfect from the beginning and I think ODMG did a good job. Our role is to improve on their work. My group and I have done that, naming the result SBA and SBQL.

    William Cook said:
    Finally, you are right that LINQ doesn’t support bulk update operations, as far as I know. But 95% of updates are to single objects, and those are easily handled by storing modified objects.

    Even if updates concern single objects, they must be found somehow. How? There are three scenarios: (1) iterate over a collection; (2) use an index; (3) use a query that returns a reference to an object. (1) is a disaster for large collections having e.g. millions of elements. (2) is troublesome for database administration, because such an index must always exist and the DBA has no freedom to change indices; (3) is the only good option that is used both in SQL and SBQL. For instance, suppose Doe is to receive a salary equal to the average salary plus 100. In SBQL this is accomplished by two queries, one on the left side of the assignment and the second one on its right side:

    (Employee where name = “Doe”).salary := avg(Employee.salary) + 100;

    The first query returns a reference to Doe’s salary (provided there is only one Doe; otherwise an exception is raised) and the second query returns some value that is assigned according to the reference.
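    For comparison, the same update can be sketched in plain Java by first using a query to find the object and then modifying it. This is an in-memory illustration only, with a hypothetical `Employee` class and sample data, and it says nothing about how a database back end would locate the object. The check for exactly one match mirrors the exception mentioned above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: Employee and the sample data are invented for this sketch.
final class SingleObjectUpdateSketch {

    static final class Employee {
        final String name;
        double salary;

        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }
    }

    // Find the one employee with the given name and set the salary to average + bonus.
    static void setToAveragePlus(List<Employee> employees, String name, double bonus) {
        List<Employee> matches = employees.stream()
                .filter(e -> e.name.equals(name))
                .collect(Collectors.toList());
        if (matches.size() != 1) {
            throw new IllegalStateException("expected exactly one " + name + ", found " + matches.size());
        }
        // The right-hand side is computed before the assignment takes place.
        double average = employees.stream().mapToDouble(e -> e.salary).average().orElse(0.0);
        matches.get(0).salary = average + bonus;
    }

    static List<Employee> sampleData() {
        return Arrays.asList(
                new Employee("Doe", 1000.0),
                new Employee("Lee", 2000.0),
                new Employee("Kim", 3000.0));
    }

    public static void main(String[] args) {
        List<Employee> employees = sampleData();
        setToAveragePlus(employees, "Doe", 100.0);
        System.out.println(employees.get(0).salary);
    }
}
```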

  42. My point with OQL is that in formalizing it you would have to change some of the informal documentation to be consistent. You may even have to correct or modify the informal specification. That is why I say that OQL can be formalized, with appropriate modifications to the spec. This is normal in any formalization. Your point that a particular specification document cannot be formalized without changes is certainly true, but not very useful.

    If SBQL is a full programming language, then it should be compared to C# as a full programming language, not to LINQ, which is a way for an object-oriented procedural language to interface cleanly with a variety of pure functional query languages. The whole point of LINQ is that you don’t need a complete new programming language, you can keep using C# which has lot of tools and support behind it. Sure, it is not an academically perfect solution, but it is a practical one.

    I think that the SBQL use of an implicit environment stack is mostly a syntactic device. I know that you say it makes the language semantics more compositional. The only semantics I’ve seen for SBQL (e.g. chapter 6 of your book) is very operational. All languages manipulate environments of bindings, and these are explicit in the semantics of the languages. Saying that “Only SBA introduces this [environment] stack” makes no sense.

    As for updates, in LINQ you can use a query to find an object, then use the update operations on the object to make changes. This handles 95% of cases. The thing you cannot do in LINQ is bulk updates, which is unfortunate.

  43. William Cook said…
    If SBQL is a full programming language, then it should be compared to C# as a full programming language, not to LINQ

    It should be compared to PL/SQL. Apart from client-side programming, there is a whole world of server-side problems. I doubt anyone would like to develop stored procedures, triggers, views, etc. using Java (even with LINQ).

  44. William Cook said:
    Your point that a particular specification document cannot be formalized without changes is certainly true, but not very useful.

    In the case of OQL these changes are fundamental and concern the object model, the model of query results, the syntax (which is inconsistent), the idea of the semantics (which is absent), the idea of strong typing (which is inconsistent too), scoping and binding rules (which are absent), integration with object manipulation capabilities (which is poor), integration with database views (which is naïve), the metamodel (which is underspecified and wrong), etc. We have kept only the general idea from the standard, changing almost all the details and making the semantics specified, consistent, implementable and optimizable.

    William Cook said:
    I think that the SBQL use of an implicit environment stack is mostly a syntactic device.

    Totally disagree. The environment stack (together with an abstract object store and the result stack) is fundamental for description of semantics of SBQL. Obviously, it is related to syntax, because the semantics of some syntactic constructs (binding, non-algebraic operators, method calls, parameter passing, etc.) is specified through operations on the stack, but this is the property of any semantics that is driven by syntax. The stack is also fundamental for implementation, strong typing and optimization of queries.
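To make the role of the environment stack more concrete, here is a minimal Java sketch of name binding for a `where` operator. It is only an illustration under simplifying assumptions, not the SBQL or ODRA implementation: objects are modelled as plain maps, there is no result stack or abstract store, and the class `EnvStackSketch` with its methods is invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class EnvStackSketch {

    // Environment stack: each section maps names to the values they bind to.
    private final Deque<Map<String, Object>> env = new ArrayDeque<>();

    EnvStackSketch(Map<String, Object> globals) {
        env.push(globals); // base section, e.g. database roots and global names
    }

    // Bind a name by searching the sections from the top of the stack downwards.
    Object bind(String name) {
        for (Map<String, Object> section : env) { // iteration starts at the most recently pushed section
            if (section.containsKey(name)) {
                return section.get(name);
            }
        }
        throw new IllegalArgumentException("unbound name: " + name);
    }

    int intValue(String name) {
        return (Integer) bind(name);
    }

    // "where" as a non-algebraic operator: for each object, open a scope holding
    // its attributes, evaluate the predicate by name binding, then close the scope.
    List<Map<String, Object>> where(List<Map<String, Object>> objects,
                                    Predicate<EnvStackSketch> predicate) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> object : objects) {
            env.push(object);
            try {
                if (predicate.test(this)) {
                    result.add(object);
                }
            } finally {
                env.pop();
            }
        }
        return result;
    }

    // Hypothetical helper that builds an object with two attributes.
    static Map<String, Object> employee(String name, int salary) {
        Map<String, Object> attributes = new HashMap<>();
        attributes.put("name", name);
        attributes.put("salary", salary);
        return attributes;
    }

    public static void main(String[] args) {
        Map<String, Object> globals = new HashMap<>();
        globals.put("threshold", 1500);
        EnvStackSketch stack = new EnvStackSketch(globals);

        List<Map<String, Object>> employees = new ArrayList<>();
        employees.add(employee("Doe", 1000));
        employees.add(employee("Smith", 2000));

        // "salary" binds in the section pushed by where; "threshold" binds in the base section.
        List<Map<String, Object>> selected =
                stack.where(employees, s -> s.intValue("salary") > s.intValue("threshold"));
        System.out.println(selected.size());
    }
}
```

In this toy version, scoping and nesting fall out of the push and pop discipline: a nested `where` would simply push another section on top of the current one.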

    William Cook said:
    The only semantics I’ve seen for SBQL (e.g. chapter 6 of your book) is very operational.

    Indeed, it is operational, but I don’t understand your argument. What is wrong with operational semantics? In 1985 I started to formulate SBA in terms of denotational semantics (see my early papers), but quickly abandoned it. It was totally illegible for any audience. Hence I started to develop the operational semantics, but without changing the SBA idea.

    William Cook said:
    All languages manipulate environments of bindings, and these are explicit in the semantics of the languages. Saying that “Only SBA introduces this [environment] stack” makes no sense.

    I stated this explicitly: “Both stacks are used in some form in every programming environment, including Pascal, C, Java, Ruby, etc.” SBA is commonly used in all programming languages that involve any kind of sub-routines, procedures, functions, methods, classes, etc. My argument was in the context of OQL. For QUERY LANGUAGES only SBA introduces the environment stack. If any other formalism, e.g. object algebra or calculus, does not involve the environment stack, it means that the formalism is conceptually limited or invalid. It is unable to express precisely fundamental semantic properties such as naming, name scoping, name binding and query nesting. This is the case of other formalizations of OQL.

    William Cook said:
    As for updates, in LINQ you can use a query to find an object, then use the update operations on the object to make changes.

    Sorry, I didn’t find such an example. Can you specify the place? So far I understand that LINQ queries do not return references to objects, hence any updating operations are impossible. This is of course quite an easy option to implement and perhaps is or will be quickly introduced by the LINQ developers. But this concerns only C# objects. If we are talking about LINQ to SQL, this is much more difficult, because LINQ queries together with updating tokens must be mapped into SQL update, insert and delete statements. Such a feature is not easy to develop and implement, especially if the mapping between a relational database and a C# object model is not trivial. All that I read on the LINQ pages is that LINQ queries can call stored procedures on the side of a relational database and these procedures can perform updates. Of course, in this case it does not matter whether updates concern single objects or bulk objects.

  45. Well, I suppose we will just have to agree to disagree on many of these points.

    If you want to understand updates in LINQ, Google for ‘LINQ updates’. Here’s an example from the first hit:

    var product =
    (from p in dataContext.Products
    where p.ProductID == 1
    select p).Single();
    product.Name = "Toaster";
    dataContext.SubmitChanges();

    Just for the record, there are many papers on semantics, typing and optimization of OQL, if anyone cares to find out about them. They don’t have the dire problems that Professor Kazimierz describes. Trigoni’s thesis, Semantic Optimization of OQL Queries, has a good bibliography.

  46. Having found this discussion, here is why I think SBA/SBQL should not be left behind LINQ. LINQ is a product of a commercial company and does not really bring new solutions; it is a technology rather than a new idea. Why, then, try to make it a standard? A standard should not be defined by one model of car, as in “Toyota Camry is the standard…”; even though it might be a good car, that does not mean it should be a standard. When we say “standard car” we mean that a car should have four wheels, a steering wheel, an engine, etc. And this “etc.” should be the subject of the discussion about what the standard should look like. This is where SBA/SBQL comes in. SBA/SBQL is an approach, just as the “car standard” is. SBA and SBQL are a tandem that gives us a new quality of approach, not just a new language. I am not giving details here, as you can read them at the links given by Prof. Kazimierz Subieta and form your own impression, not influenced by my opinion. The more people know about it, the better its chances to prove its strength against present-day solutions and to spread (I believe most readers not devoted or attached to MS will appreciate the great idea behind it).

    It is as clear as crystal that Microsoft has the right to promote its proprietary LINQ for .NET as its technology, but accepting it as the “car” does not seem reasonable.

    What is more important here is that Java (developed by Sun) is trying to become 100% open, and here again SBA/SBQL has a tremendous advantage as an independent solution. Those are, in my opinion, the basic arguments for SBA/SBQL. The technical discussion above, with the details given by Prof. Subieta, shows only a small fraction of the possibilities of SBA/SBQL and its innovations (e.g. updatable views, optimisation).
    As history gives us the rather unpleasant experience of the ISO standardisation of MS OOXML, it is highly possible that some behavioral “patterns” could be copied. Why give MS a tool to influence Java? MS has already tried to get its own Java, but the attempts failed. Trying to standardise the proprietary LINQ technology for Java would simply mean that MS has influence on a language that competes with .NET. The only disadvantage of SBA/SBQL in this respect is that it has not been developed and supported by a large company with extensive financial and marketing support.

    I honestly believe SBA/SBQL is a great step forward not only relating it to the present day databases possibilities but also to the widely used technologies that soon will have to (because of the technological progress) be used for object DB.

    Thank you

  47. FYI, some interesting reading:

    Erik Meijer, José Blakeley
    The Microsoft perspective on ORM

    Interview in ACM Queue Magazine with Erik Meijer and José Blakeley. With LINQ (language-integrated query) and the Entity Framework, Microsoft divided its traditional ORM technology into two parts: one part that handles querying (LINQ) and one part that handles mapping (Entity Framework).

    Article | Basic | English | LINK | September 2008 |

    You can find the link at:
    http://www.odbms.org/downloads.html#odbms_ap

  48. Hello everyone-

    I apologize for being late into this discussion, Roberto has asked me a couple of times to contribute but I have been very busy working on some new business efforts at my company (which involve object databases and advanced data mining!)

    At the Object Database Technology Working Group, we had several presentations by Prof. Subieta of his SBA/SBQL work. I was and continue to be very impressed by them, and several of us who saw them thought that they represented a whole new thought on approaching the management of data from an object perspective and that they could be the ultimate “bridge” between the object and relational worlds. Others, such as Prof. Cook, were less impressed and did not see as much value in SBA/SBQL.

    After several meetings, some of which included demonstrations by Prof. Subieta, the consensus among vendors was that a “string-based” approach would not be accepted by their customers, many of whom are Java developers. Their position was that SBQL was a language separate from Java which would ultimately not integrate easily with Java itself (e.g. SBQL strings would have to be parsed or interpreted as opposed to compiled in-line like LINQ).
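The difference the vendors point to can be sketched in plain Java. The example below is not SBQL, LINQ or any product API: `Person`, `QueryStyleSketch` and the tiny one-operator query format are assumptions invented for illustration. It contrasts a query handed over as a string, which is checked only when it runs, with a query written as a typed lambda that the Java compiler checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical domain class used only for this illustration.
final class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

final class QueryStyleSketch {

    // String-based style: the query is opaque to the Java compiler, so it is
    // parsed and validated at run time. Only "age > <int>" is supported here.
    static List<Person> queryByString(List<Person> people, String query) {
        String[] parts = query.trim().split("\\s+");
        if (parts.length != 3 || !parts[1].equals(">")) {
            throw new IllegalArgumentException("unsupported query: " + query);
        }
        if (!parts[0].equals("age")) {
            throw new IllegalArgumentException("unknown field: " + parts[0]);
        }
        int bound = Integer.parseInt(parts[2]);
        return people.stream().filter(p -> p.age > bound).collect(Collectors.toList());
    }

    // Native style: the predicate is ordinary Java, so a misspelled field name
    // or a type error is rejected by the compiler before the program runs.
    static List<Person> queryNative(List<Person> people, Predicate<Person> predicate) {
        return people.stream().filter(predicate).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        people.add(new Person("Ann", 25));
        people.add(new Person("Bob", 40));

        System.out.println(queryByString(people, "age > 30").size());
        System.out.println(queryNative(people, p -> p.age > 30).size());
        // queryByString(people, "aeg > 30") compiles but fails at run time;
        // queryNative(people, p -> p.aeg > 30) would not compile at all.
    }
}
```

A real string-based interface would of course support a full query language; the sketch only shows where errors surface in each style.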

    As a developer myself, I concur with the view that what one wants is a native-language access mechanism for persistent objects. I have read some posts here where people say something like “well all you are doing is adding persistence to Java.” Indeed, you might say that is “all” but it is huge. Anyone who has used an ODBMS to manage data can immediately see the advantage vs. JDBC and similar string-based mechanisms.

    One also has to remember that OMG is not a true standards body; it is an industry consortium which is a very different animal. The OMG itself makes nothing; its groups put out Requests For Proposals (RFPs) and participating members answer the RFPs with draft standards. A “winner” is ultimately chosen that becomes the new standard, so OMG standards are the result of collaborations between sometimes competing companies. All of the vendors in our group are familiar with LINQ and many support it in C# product offerings. None of the vendors have built-in support for SBQL, it would have to be created anew and an effort to include it as a native language feature of Java (remember why – don’t want to be in the string-parsing business) would be a “zero-momentum” effort, i.e. if the vendors backed that option they would have to spend $ to market it and explain it in an effort to draw interest and support. With LINQ, that is not the case because Microsoft has already done that heavy lifting and has already put their money where their mouth is (so to speak) by building it into C#.

    So, while I and several others are impressed by SBA/SBQL and even like it, there are 2 realities that must be acknowledged:

    (1) The vendors are right when they say Java (and presumably also C#) developers would rather have native-language access to objects rather than get them via strings that they have to hand to a “magic box” that gives them back their objects. I have personally written software both ways and I agree with the ODBMS vendors on this point

    (2) None of the ODBMS vendors can see a “go-to-market” strategy based on SBQL even though many appreciate its technical soundness. They see LINQ as a “proven” approach that will deliver what their customers want, and they believe they can work together to respond to an OMG RFP based on LINQ.

    In the end, number (2) is what matters since my company (Syracuse Research Corporation) is not a vendor that would be submitting an RFP response. The same is true for some others who like me are impressed with SBQL, i.e. we like it but we are not vendors. The people who sell ODBMS products and their developers are the ones who will make any new standard “alive” and they have to be the driving force behind it. Time will tell how successful we will be, but the cohesion I have seen so far gives me hope that we can produce a good standard that will benefit the ODBMS community.

    -Mike

  49. they represented a whole new thought on approaching the management of data from an object perspective and that they could be the ultimate “bridge” between the object and relational worlds.

    Sorry, but it’s either my poor English or something is wrong here. SBA has nothing to do with providing the “bridge” between the object and relational worlds.

    Others, such as Prof. Cook, were less impressed and did not see as much value in SBA/SBQL.

    That’s because Prof. Cook is a programming-language specialist, not a database expert. His world seems to end at an API to a database middleware. He doesn’t care what happens inside the database.

    “string-based” approach would not be accepted by their customers, many of whom are Java developers. Their position was that SBQL was a language separate from Java which would ultimately not integrate easily with Java itself (e.g. SBQL strings would have to parsed or interpreted as opposed to compiled in-line like LINQ).

    Sorry if I’m being rude, but I disagree. The “string-based” approach is unavoidable. That’s because even LINQ has to output a string-based query that is sent to the database. As I have written, in the case of relational databases it’s an SQL query. What is it in the case of object databases? If it is OQL, then it’s not a war between LINQ and SBQL, but between OQL and SBQL. I wonder how many vendors have OQL fully implemented and how many of them can improve it so that it could, e.g., have proper support for updates.

    I still believe this initiative can’t produce anything reasonable if it is based on LINQ alone. IMHO LINQ AND SBQL is the only possible way if one wants both – to satisfy the industry and make object-oriented databases a real alternative to relational systems.

    Anyway, good luck. Fortunately the world of object-oriented databases doesn’t end on OMG.

  50. Nina-

    I don’t know about your English, but what I meant by a “bridge” was that the abstract store model described by Prof. Subieta is not limited to traditional persisted object stores. SBA is a higher-level abstraction that allows any kind of storage mechanism to be used for objects so long as the abstract store criteria are met. This includes relational tables, so that SBQL could be used to query objects stored in tables just like an ORM. It also can be used to access persisted objects (in Java, POJOs).

    Prof. Subieta’s work is the first I have seen that addresses database semantics for objects with an abstract machine model that goes all the way from storage to query. He has demonstrated it to us where a simple text or XML file was used as an M0 store that could be fully queried in SBQL.

    Because SBQL can access any object regardless of its storage mechanism, it could indeed be used for object queries of both object and relational databases if one so desired.

    -Mike

  51. By the way, if any of you would like to be added to the OMG’s mailing list for this standard work, please email me at mcard@syrres.com and I will get you added.

    Also, please feel free to join us at the next OMG Technical Meeting, which will be held in San Jose CA (see omg.org home page, follow links). We may set up a telecom link so interested parties can attend by voice using Skype et al.

    -Mike

  52. And Jim-

    You are right: properly speaking, LINQ is not an API, it is a language extension, so I should probably have titled this thread LINQ as a Java query *mechanism* rather than a Java query API.

    -Mike

  53. Thanks, Mike, for your very clear and helpful discussion of the issues at the ODTWG. I think you picked exactly the right goal: extending Java with a standard query interface that does not require application programmers to put queries into strings. This interface should not be tied to any particular back-end query language or database standard. Each vendor can implement their own internal query format, and we have no need for any “war” at all. I am working on ways to improve LINQ, and having an interface for bulk updates is one such topic. Are you interested in trying to push for this kind of innovation?

    I think that this would meet Nina’s goals too, since she says that LINQ+SBQL is her desired solution. A JavaLINQ can be designed to enable multiple back ends, just as C# LINQ does.

  54. Prof. Cook-

    Can you attend the December ODBTWG meeting in Santa Clara? (I mistakenly typed San Jose above). That would be an excellent forum for you to present this work. I have not read anything on update capability in LINQ. I think it would be appealing if it did not break the existing read syntax for LINQ.

    -Mike

  55. Tegiri Nenashi said…
    http://vadimtropashko.wordpress.com/object-relational-impedance-mismatch/

    IMO, the author presents totally wrong and misleading perception of the impedance mismatch problem. It is based on some mathematical divagations, fully irrelevant in this context. If you are interested in true explanation, see http://www.sbql.pl/Topics/ImpedanceMismatch.html

  56. Dear Tegiri,
    Do you think a functional/relational language can be cleanly embedded in or invoked from a procedural language? I think so. More and more people are realizing that programming is best done by a combination of specialized languages. Different languages are good for UIs, security models, data models, queries, makefiles, grammars, analytics, workflow, etc. Queries can be defined and optimized in a monoid setting, then converted to more rigid sequences for procedural processing. The goal is to make a clean embedding, not one based on command strings. I think we are finally getting closer to this goal.

    I think that Prof. Subieta’s page gives a much more realistic and useful discussion of impedance mismatch. I disagree with some of his conclusions, for example in the section “Impedance mismatch and native queries”, but his description of the problem is insightful.

    My reading is that his conclusion, like yours, is that we should have one integrated language for everything. This is a nice goal, but just as PL people don’t tend to appreciate DB issues, DB people don’t tend to appreciate all the different requirements on PLs. I think that PL and DB work should be strongly connected, but not require one global language.

  57. Mike, Thanks for the invite. I have been to a previous OMG meeting. Currently we are working on the problem of prefetching, or structuring query results, that can apply to LINQ. We haven’t finished an update model yet, but we are thinking about it. I’ll contact you off line.

  58. To: Mike- all
    in fact, it would be quite good if this discussion could be useful for your work within OMG.

    This would then mean that the community behind ODBMS.ORG can work together.

  59. I published two more papers on ODBMS.ORG that are relevant to this discussion:

    -Michael Blaha, Bill Huth and Peter Cheung, “Object-Oriented Design of Database Stored Procedures”
    Link: http://www.odbms.org/experts.html#article20

    and

    -Miguel Garcia and Rakesh Prithiviraj,
    “Rethinking the Architecture of OR Mapping for EMF in terms of LINQ”
    Link: http://www.odbms.org/downloads.html#oop_ap

  60. Kazimierz’s website is kind of remarkable, full with “executable UML” and “data model independence” nonsense. Quote of the day:

    “However, the theses that SQL is a syntactic variant of the relational algebra (or the mathematical logic) are worthless. Approximately, the relational algebra covers not more than 5% of the functionality of SQL. The rest is not founded on any theories. “

  61. Tegiri Nenashi said…
    Kazimierz’s website is kind of remarkable, full with “executable UML” and “data model independence” nonsense.

    “executable UML”: see Wikipedia:
    http://en.wikipedia.org/wiki/Executable_UML
    Google reports 92 900 pages that contain “executable UML”.
    In the European project VIDE (together with partners such as SAP, Fraunhofer Institute, Softeam) we have implemented executable UML together with another OMG standard known as OCL.

    I don’t want to comment on Tegiri Nenashi’s other aggressive statements. I am very sorry that he/she is disappointed by some of my theses. I see no nonsense in them; they are based on more than 30 years of experience in databases and software engineering.

  62. For those who do not know Japanese, Tegiri Nenashi is a joke name. So is Mikito Harakiri.

  63. Tegiri Nenashi said…
    Kazimierz’s website is kind of remarkable, full with “executable UML” and “data model independence” nonsense. Quote of the day:

    “However, the theses that SQL is a syntactic variant of the relational algebra (or the mathematical logic) are worthless. Approximately, the relational algebra covers not more than 5% of the functionality of SQL. The rest is not founded on any theories. ”

    The 5% concerns the SQL-89 standard, if you take all the syntactic constructs of SQL and try to work out which of them can be covered by the relational algebra. In the case of SQL-92 this is probably much less, because SQL-92 introduces a lot of features that are close to programming languages and obviously not covered by the relational algebra. In the case of SQL-99 this is 0%, because SQL-99 is a full programming language, and the data structures it addresses are no longer flat tables and contain a lot of options fully incompatible with the relational algebra.

  64. To Tegiri Nenashi :

    Out of courtesy to others it would be appropriate if you could

    i) identify yourselves (give us a little background of who you are)

    ii) keep the discussion to a level of courtesy, even if you may not agree on some technical points.

    There is no point in being unnecessarily rude.
    We are all trying to help finding a good solution…

  65. My apologies for inappropriate tone of the message. This kind of arrogance is typical for a relational zealot (who unfortunately I am:-), especially in discussion about “impedance mismatch”. Therefore, the right action is just not to be here.

    Few farewell comments. Nina mentioned that object query language optimization is nonexistent, and let me defend this position. First, there is strong algebraic foundation for any kind of optimization. In procedural programming, when optimizer moves a statement outside of the loop, it essentially rewrites an expression in Kleene algebra. When a subquery is unnested in SQL it is also an algebraic transformation. Likewise, System R style evaluation of the cost of different join orders leverages join associativity of the relational algebra. Take a look at http://en.wikipedia.org/wiki/Relational_algebra#Use_of_algebraic__properties_for_query_optimization

    Why object query language optimization is a myth? Because the foundation algebra is too complex. Sure some can write a PhD thesis finding few query transformations here and there, but the whole system would fall short of simplicity and clarity of System R method (which each and every database vendor copied ever since). Coming across a couple of such theses in the past, I would suggest that nobody except the author understands them, and this is why we don’t see any implementations.

    The same applies to SQL, which had grown to monstrous proportions. However, nobody really cares about all this junk (my apologies again) that accumulated there in the past decades. Most people rarely step beyond basic select-project-join query — and this one has firm foundation.

  66. No, this kind of tone is characteristic of an incompetent troll. I didn’t say that object query optimisation is nonexistent. You have no idea what we are talking about, sorry.

  67. nina said…
    No, this kind of tone is characteristic of an incompetent troll. I didn’t say that object query optimisation is nonexistent. You have no idea what we are talking about, sorry.

    I think we should stop this tone of polemics. We are all incompetent in a lot of matters. Let our discussion partners learn a bit from this discussion.

  68. Tegiri Nenashi said…
    … Why object query language optimization is a myth? Because the foundation algebra is too complex….

    I disagree that object query language optimization is a myth. In the SBA/SBQL research we have developed and implemented several optimization methods that are quite powerful:

    a) factoring independent subqueries out of loops implied by non-algebraic operators. See http://www.sbql.pl/phds/PhD%20Jacek%20Plodzien.pdf.
    This method is known from SQL in a less general variant. For instance, in the query:

    select * from Employee where salary > (select avg(salary) from Employee)

    the subquery

    select avg(salary) from Employee

    can be calculated in advance, to avoid recalculating it in each iteration of the where operator. The method that is used in the mentioned PhD cannot be expressed in any algebra; it is based on analysis of the scoping and binding of names.

    b) Exploiting the distributivity property of query operators. In SQL this method is known as pushing selections before joins. For example, the query

    select * from Employee, Department
    where Employee.D# = Department.D# and Department.dname = "Toys"

    can be rewritten to:

    select * from Employee, (Department where Department.dname = "Toys")
    where Employee.D# = Department.D#

    We generalized it considerably for OODBs, again not on the basis of some algebra but on the analysis of scoping and binding rules.

    c) Removing dead subqueries. They mostly appear when views are processed through the query modification technique. Usually a view delivers more than is required by a particular query, hence the unnecessary part can be cut off. This method is also known from SQL, but we generalized it considerably for object databases. The algorithm is rather complex. So far it is published only in my book (in Polish) http://www.sbql.pl/various/SBA_SBQL_book/Theory%20and%20Construction%20of%20OOQLs.html

    d) Optimization by indices. We can optimize queries by indices organized according to different techniques. This is the subject of a PhD that will be completed soon. Transparent indices are fully implemented in ODRA and work in a way similar to SQL.

    e) Optimization by query caching. This is the subject of another PhD, the result will be probably ready in a year.

    f) Optimization by pipelining. The method is known from SQL, but we have generalized it for OO databases. It is the subject of another PhD. The method is developed mostly in the context of distributed databases.

    g) Methods based on tuning of physical database structures and buffering. The best-known method from this group is pointer swizzling. It is implemented in Objectivity/DB. We implemented it as so-called memory-mapping files.

    h) One more PhD concerns the method of optimization in distributed object-oriented databases that is known from relational databases as a method based on semi-joins. We generalized it to a method based on so-called coloured query syntax trees, where "colours" denote different distributed servers.
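The first two rewrites in the list above, factoring an independent subquery out of a loop and pushing a selection before a join, can be sketched on in-memory Java collections. This is only a hand-written illustration of the effect of each rewrite, not the scoping-and-binding analysis used in SBQL; the classes `Emp`, `Dept` and `RewriteSketch` and the two counters are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class Dept {
    final int id;
    final String dname;

    Dept(int id, String dname) {
        this.id = id;
        this.dname = dname;
    }
}

final class Emp {
    final String name;
    final double salary;
    final int deptId;

    Emp(String name, double salary, int deptId) {
        this.name = name;
        this.salary = salary;
        this.deptId = deptId;
    }
}

final class RewriteSketch {
    static int avgEvaluations = 0;  // how often the avg subquery was evaluated
    static int joinComparisons = 0; // how many employee/department pairs were compared

    static double avgSalary(List<Emp> emps) {
        avgEvaluations++;
        return emps.stream().mapToDouble(e -> e.salary).average().orElse(0.0);
    }

    // Unoptimized: the subquery is re-evaluated for every employee.
    static List<String> aboveAverageNaive(List<Emp> emps) {
        return emps.stream().filter(e -> e.salary > avgSalary(emps))
                .map(e -> e.name).collect(Collectors.toList());
    }

    // Factored: the subquery does not depend on the current employee,
    // so it is evaluated once, before the loop.
    static List<String> aboveAverageFactored(List<Emp> emps) {
        double average = avgSalary(emps);
        return emps.stream().filter(e -> e.salary > average)
                .map(e -> e.name).collect(Collectors.toList());
    }

    // Unoptimized: join every employee with every department, then select by name.
    static List<String> joinThenSelect(List<Emp> emps, List<Dept> depts, String dname) {
        List<String> result = new ArrayList<>();
        for (Emp e : emps) {
            for (Dept d : depts) {
                joinComparisons++;
                if (e.deptId == d.id && d.dname.equals(dname)) {
                    result.add(e.name);
                }
            }
        }
        return result;
    }

    // Selection pushed before the join: only matching departments take part in it.
    static List<String> selectThenJoin(List<Emp> emps, List<Dept> depts, String dname) {
        List<Dept> selected = depts.stream()
                .filter(d -> d.dname.equals(dname)).collect(Collectors.toList());
        List<String> result = new ArrayList<>();
        for (Emp e : emps) {
            for (Dept d : selected) {
                joinComparisons++;
                if (e.deptId == d.id) {
                    result.add(e.name);
                }
            }
        }
        return result;
    }
}
```

Both variants of each query return the same result; the counters merely make visible how much repeated work the rewritten form avoids.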

    There are more methods, in particular ones based on choosing an optimal query execution plan. I have at least two more great ideas concerning query optimization in OODBs and am looking for talented people who want to investigate them.

    I agree with Tegiri Nenashi that algebraic optimization methods in OODBs are inefficient, and thus I am not following such ideas. SBA and SBQL have established their own theoretical school that is self-contained – it does not require object algebras, object calculi, monoid comprehension calculus, F-logic and other mathematical concepts that people have invented so far to cope with object-oriented queries.

    Sorry for this long post, I hope it helps…

  69. For an API approach to LINQ for Java consider using Querydsl : http://source.mysema.com/display/querydsl/Querydsl

  70. SBQL4J (http://code.google.com/p/sbql4j/) is an extension of the Java language similar to LINQ. It allows querying Java objects.
    But it has advantages over LINQ in many respects:
    1. It is type-safe at compile time, even more than LINQ, because the result is a proper Java type instead of the anonymous ‘var’ type returned by LINQ queries.
    2. Queried objects can be of ANY Java type, instead of IEnumerable as in LINQ.
    3. SBQL4J has the full expressive power of the SBQL language; many SBQL4J queries cannot be expressed in LINQ (see the executable examples on the project page).
    4. It is expressed with clear, precise semantics, without needless, obscure syntactic sugar.
    5. According to Wikipedia, “Some benchmark on simple use cases tend to show that LINQ to Objects performance has a large overhead compared to normal operation”. This problem doesn’t apply to SBQL4J, because its queries are ultimately translated to pure, fast Java code without any use of reflection.
    6. SBQL4J semantics is well-defined, which allows the use of many unique query optimization techniques (mentioned by Prof. Subieta) and gives better results than in any other query language.
    7. SBQL is not bound to any data model; it deals with data structures in a more abstract way, so it works perfectly both with the simple object data model in Java and with the more sophisticated model implemented in the ODRA system.

    I would like to encourage you to get acquainted with SBQL4J and to rethink promoting LINQ as the standard Java API to object databases.

    Emil
