Wednesday, August 27, 2008

LINQ is the best option for a future Java query API

A conversation with Mike Card.

I have interviewed Mike Card on the latest developments in the OMG working group that aims to define a new standard for object database systems.

Mike works with Syracuse Research Corporation (SRC) and is involved in object databases and their application to challenging problems, including pattern recognition. He chairs the ODBT group in OMG to advance object database standardization.


R. Zicari: Mike, you recently chaired an OMG ODBTWG meeting on June 24, 2008. What kind of synergy do you see outside OMG in relation to your work?

Mike Card: We think it is likely that the OMG would need to participate in the Java Community Process (JCP) in order to write a Java Specification Request (JSR) to add LINQ functionality to Java.

R. Zicari: There has been a lot of discussion lately on the merit of SBQL vs. LINQ as a possible query API standard for object databases. Did you discuss this issue at the meeting?

M. Card: I began the technical part of our meeting by reviewing Professor Subieta’s comparison of SBQL and LINQ. It was my understanding from this comparison that LINQ was technically capable of performing any query that could be performed by SBQL, and I wanted to know if the participants saw this the same way. They agreed in general, and believed that, even if LINQ were able to do only 90% of what SBQL could do in terms of data retrieval, it would still be the way to go.

R. Zicari: Could you please go a bit more in detail on this?

M. Card: Sure. At the meeting it was pointed out that Prof. Subieta had noted in his comparison that he had not shown queries using features that are not a part of LINQ, such as fixed-point arithmetic, numeric ranges, etc.

These are language features that would be familiar to users of Ada but which are not found in languages like C++, C#, and Java so they would likely not be missed and would be considered esoteric.

It was also pointed out that the query examples chosen by Prof. Subieta in his comparison were all “projections” (relational term meaning a query or operation that produces as its output table a subset of the input table, usually containing only some of the input table’s columns).

A query like this by definition will rely on iteration, and this will show the inherent expressive power of SBQL since the abstract machine contains a stack that can be used to do the iteration processing and thus avoid the loops, variables, etc. needed by SQL/LINQ.

R. Zicari: Did you agree on a common direction for your work in the group?

M. Card: The consensus at this meeting and at the ICOODB conference in Berlin was that LINQ was the best option for a future Java query API since it already had broad support in the .NET community. We will have to choose a new name for the OMG-Java effort, however, as LINQ is trademarked by Microsoft.

It was also agreed that the query language need not include object update capability, as object updates were generally handled by object method invocations and not from within query expressions.

Now, since LINQ allows method invocations as part of navigation (e.g. “my_object.getBoss().getName()”) it is entirely possible that these method calls could have side effects that update the target objects, perhaps in such a way that the changes would not get saved to the database.

This was recognized as a problem; ideas kicked around for how to solve it included source code analysis tools.
This is something we will need a good answer for, as it is a potential “open manhole cover” if we intend the LINQ API to be read-only and not capable of updating the database (especially unintentionally!).
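
To make the concern concrete, here is a minimal Java sketch. It assumes Java's stream API (which postdates this interview) as a stand-in for a LINQ-style query, and the `Employee` class with its lookup counter is hypothetical. The getter used for navigation quietly mutates the object, so a nominally read-only query changes in-memory state that the database may never be told about.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical domain class; not part of any standard or product API.
class Employee {
    private final String name;
    private final Employee boss;
    private int bossLookups = 0; // state changed as a side effect of navigation

    Employee(String name, Employee boss) {
        this.name = name;
        this.boss = boss;
    }

    String getName() { return name; }

    // Looks like a pure getter, but mutates the object each time it is called.
    Employee getBoss() {
        bossLookups++;
        return boss;
    }

    int getBossLookups() { return bossLookups; }
}

class SideEffectDemo {
    // A "read-only" query: names of employees whose boss is called bossName.
    static List<String> reportsTo(List<Employee> staff, String bossName) {
        return staff.stream()
                .filter(e -> e.getBoss() != null
                        && e.getBoss().getName().equals(bossName))
                .map(Employee::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Employee ann = new Employee("Ann", null);
        Employee bob = new Employee("Bob", ann);
        List<Employee> staff = Arrays.asList(ann, bob);

        System.out.println(reportsTo(staff, "Ann"));
        // The query changed bob even though nothing was updated explicitly.
        System.out.println(bob.getBossLookups());
    }
}
```

Nothing in the query expression itself reveals the mutation, which is why the group discussed external tools such as source code analysis rather than a purely syntactic restriction.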

R. Zicari: What else did you address at the meeting?

Mike Card: The discussion then moved on to a list of items included in Carl Rosenberger’s ICOODB presentation.
Other items were also reviewed from an e-mail thread in the ODBMS.ORG forum that included comments from both Prof. Subieta and Prof. William Cook.

The areas discussed were broken down into 3 groups:
i) those things there was consensus on for standardization,
ii) those things that needed more discussion/participation by a larger group, and
iii) those things that there was consensus on for exclusion from standardization.

R. Zicari: What are the areas you agree to standardize?

Mike Card: The areas we agree to standardize are:

1. object lifecycle (in memory): What happens at object creation/deletion, “attached” and “detached” objects, what happens during a database transaction (activation and de-activation), etc. It is desirable that we base our efforts in this area on what has already been done in existing standards for Java such as JDO, JPA, OMG, et al. This interacts with the concurrency control mechanism for the database engine, and we may need to refer to Bernstein et al. for serialization theory / CC algorithms.

2. object identification: A participant raised a concern here regarding re-use of OIDs where the OID is implemented as a physical pointer and memory is recycled, resulting in re-use of an OID, which can corrupt some applications. He favored a standard requiring all OIDs to be unique and never re-used.

3. session: what are the definition and semantics of a session?
a. Concurrency control: again, we should refer to Bernstein et al. for proven algorithms and mathematical definitions in lieu of ACID criteria (ACA: Avoidance of Cascading Aborts, ST: Strict, SR: Serializable, RC: Recoverable for characterizing transaction execution sequences).
b. Transactions: semantics/behavior and span/scope

4. Object model: what OM will we base our work upon?

5. Native language APIs: how will we define these? Will they be based on the Java APIs in ODMG 3.0, or will they be different? Will they be interfaces?

6. Conformance test suite: we will need one of these for each OO language we intend to define a standard for. The test suite, however, is not the definition of the standard; the definition must exist in the specification.

7. Error behavior: exception definitions etc.
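
The OID concern raised under item 2 can be sketched as follows. This is a minimal illustration, assuming logical OIDs handed out by a counter rather than physical pointers; the `OidAllocator` class is hypothetical and not taken from any ODBMS or from the working group's material.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical allocator, for illustration only. OIDs are logical numbers
// drawn from a counter that only moves forward, so freeing an object's
// storage can never cause its OID to be issued to a different object.
class OidAllocator {
    private long next = 1;
    private final Set<Long> live = new HashSet<>();

    synchronized long allocate() {
        long oid = next++;
        live.add(oid);
        return oid;
    }

    // Deleting an object retires its OID permanently.
    synchronized void release(long oid) {
        live.remove(oid);
    }

    synchronized boolean isLive(long oid) {
        return live.contains(oid);
    }

    public static void main(String[] args) {
        OidAllocator oids = new OidAllocator();
        long first = oids.allocate();
        oids.release(first);            // object deleted, storage may be recycled
        long second = oids.allocate();  // a new object gets a fresh OID
        System.out.println(first != second);    // prints true
        System.out.println(oids.isLive(first)); // prints false
    }
}
```

With this scheme a stale reference to a deleted object can be detected as dead, instead of silently resolving to whatever object later occupies the same memory.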

R. Zicari: What are the areas where no agreement was (yet) found?

Mike Card: Areas we need to find agreement on are:

1. keys and indices: how do you sort objects? How do you define compound keys or spatial keys? Uniqueness constraints? Can this be handled by annotation, with the annotation being standardized but the implementation being vendor-specific? This interacts with the query mechanism, e.g. availability of an index could be checked for by the query optimizer.

2. referential integrity: do we want to enforce this? Avoiding dangling pointers interacts with object lifecycle/GC considerations.

3. cascaded delete: when you delete an object, do you also delete all objects that it references? It was pointed out that this has issues for a client/server model ODBMS like Versant because it may have to “push” out to clients that objects on the server have been deleted, so you have a distributed cache consistency problem to solve.

4. replication/synchronization: how much should we standardize the ability to keep a synchronized copy of part or all of an object database? Should the replication mechanism be interoperable with relational databases? Part or all of this capability could be included in an optional portion of the standard.

a. Backup: this is a specialized form of replication; how much should this be standardized? Is the answer to this question dependent upon the kind of environment (DBA or DBA-less/embedded) that the ODBMS is operating in?

5. events/triggers: do we want to standardize certain kinds of activity (callbacks et al.) when certain database operations occur?

6. update within query facility: this is a recognition of the limitations of LINQ, which does not support object update; it is “read-only.” Generally, object updates and deletes are performed by method invocations in a program and not by query statements.
The question is, since LINQ allows method invocations as part of navigation, e.g. “my_employee_obj.getBoss().getName(),” is it possible in cases like this that such method calls could have side effects which update the object(s) in the navigation statement? If so, what should be done?

7. extents: do we expose APIs for extents to the user?

8. support for C++: how will we support C++/legacy languages for which a LINQ-like facility is not available? We could investigate a string-based QL like OQL and/or we could use a facility similar to Cook/db4o “native queries.”
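
Item 1 in the list above asks whether keys and indices could be declared through a standardized annotation while leaving the implementation to each vendor. A minimal Java sketch of that idea follows. Everything in it is an assumption made for illustration: the `@Indexed` annotation, the `Person` class, and the `IndexCatalog` helper are hypothetical and do not come from any existing standard or product.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical annotation: the standard would define it, vendors would act on it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Indexed {
    boolean unique() default false;
}

// Hypothetical persistent class declaring its keys declaratively.
class Person {
    @Indexed(unique = true)
    String emailAddress;

    @Indexed
    String lastName;

    String nickname; // not indexed
}

class IndexCatalog {
    // What a vendor's engine or query optimizer might do at startup:
    // discover which fields the application declared as indexed.
    static List<String> indexedFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isAnnotationPresent(Indexed.class)) {
                names.add(f.getName());
            }
        }
        Collections.sort(names); // reflection order is not guaranteed
        return names;
    }

    static boolean isUnique(Class<?> type, String fieldName) {
        try {
            Indexed idx = type.getDeclaredField(fieldName).getAnnotation(Indexed.class);
            return idx != null && idx.unique();
        } catch (NoSuchFieldException e) {
            return false; // unknown field: no index declared
        }
    }

    public static void main(String[] args) {
        System.out.println(indexedFields(Person.class));            // prints [emailAddress, lastName]
        System.out.println(isUnique(Person.class, "emailAddress")); // prints true
    }
}
```

The annotation only records intent. How the index is built, stored, and used by the optimizer would remain vendor-specific, which is the split the question in item 1 contemplates.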

R. Zicari: And what are the areas you definitely do not want to standardize?

Mike Card: Areas we do not want to standardize are:

1. garbage collection: issue here is behavioral differences between “embedded” (linked-in) OODBMS vs. client/server OODBMS

2. stored procedures/functions/views: these are relational/SQL concepts that are not necessarily applicable to object-oriented programming languages which are the purview of object databases.

R. Zicari: How will you ensure that the vendor community will support this proposal?

Mike Card: We plan to discuss this list and verify that others not present agree with the grouping of these items. We should also figure out what we want to do with the items in the “middle” group and then begin prioritizing these things. It appears likely that a next-generation ODBMS standard will follow a “dual-track” model in that the query mechanism (at least for Java) will be developed as a JSR within the JCP, while all of the other items will be developed within the OMG process.

For C# (assuming C# is a language we will want an ODBMS standard for, and I think it is), the query API will be built into the language via LINQ and we will need to address all of the “other” issues within our OMG effort just as with Java. In the case of C# and Java, most of these issues can probably be dealt with in the same manner.

How much interest there is in a C++ standardization effort is unclear; this is an area we will need to discuss further.
A LINQ-like facility for C++ is not an option since, unlike C# and Java, there is no central maintenance point for C++ compilers.

There is an ISO WG that maintains the C++ standard, but C++ “culture” accepts non-conformant compilers so there are many C++ compilers out there that only conform to part of the ISO standard.

The developers present who work with C++ mentioned that their C++ code base must be “tweaked” to work with various compilers as a given set of C++ code might compile fine with 7 compilers but fail with the compiler from vendor number 8.
In general, the maintenance of C++ is more difficult than for Java and C# due to inconsistency in compiler implementation and this complicates anything we want to do with something as complex as object persistence.
##

Some Useful Resources:
- Panel Discussion "ODBMS: Quo Vadis?"

- Java Object Persistence: State of the Union PART II

- Java Object Persistence: State of the Union PART I


72 Comments:

Blogger nina said...

Wow, there is something I don't really understand. What are they actually going to do? Standardize LINQ? I think Microsoft has already done it pretty well.

September 1, 2008 11:35 AM  
Blogger Roberto V. Zicari said...

Nina
the focus here is to standardize an API for Object databases. Mike Card is indicating that LINQ is a suitable candidate. But it is missing in Java.

RVZ

September 6, 2008 7:16 AM  
Blogger nina said...

Thank you for the answer. Unfortunately, I still have some doubts regarding the review and the whole initiative.

In my opinion LINQ is not just an API, it is rather a language extension. In order to implement an API it is enough to build a library. However, in order to provide a language extension, one needs to change the syntax/semantics/pragmatics of the programming language (in this case Java). The latter is much more difficult if you do not control the language. OMG should be aware that Java is a relatively open platform, while Microsoft controls everything in .NET and SQL Server. Assuming that Java is indeed extended with LINQ-like support (I guess it will be very hard to convince Sun), how would third-party, proprietary solutions be introduced in Java? Wouldn't the decision process (DBMS provider->OMG->SUN) take too much time? Wouldn't the whole Java platform become too complex and unstable?

Another thing that concerns me is the following sentence:

"Stored procedures/functions/views: these are relational/SQL concepts that are not necessarily applicable to object-oriented programming languages which are the purview of object databases."

But they are applicable to databases! Those features are so essential for database programmers I cannot imagine any serious database management system that does not implement them.

In my opinion the sentence quoted above represents the point of view which led the whole idea of object databases to failure. I believe that object databases by no means should be perceived as object-oriented language extensions! Since databases are much more complex than typical programming languages, OMG should take the database-centric approach, not the programming language one.

Databases should be controlled by database programming languages, not application (traditional) programming languages. An example of such a language is Oracle PL/SQL. Although Oracle DBMS supports Java as a server-side programming language, almost nobody uses it. Why? Because PL/SQL is very well integrated with the database. It provides the opportunity to develop software at a much higher level of abstraction than what is currently offered by Java + Hibernate or Java + current object-oriented DBMS or .NET + LINQ.

I think that instead of trying to copy Microsoft LINQ, OMG should rather concentrate on developing a new database programming language in the spirit of PL/SQL (but well designed and object-oriented). As a person who has had the opportunity to develop applications using various persistence solutions for Java, I believe that the LINQ-like Java extension wouldn't provide much more functionality than what is currently offered by Hibernate. Why reinvent the wheel? Isn't it better to invent something new?

Just my 2 cents. Sorry for taking your time ;)

September 6, 2008 10:32 AM  
Blogger Roberto V. Zicari said...

Nina
all valuable comments.
I believe it should be of interest for you what Carl Rosenberger is trying to do.
Pls check:
http://developer.db4o.com/blogs/carl/archive/2008/05/02/linq-for-java.aspx

When you say "OMG should rather concentrate on developing a new database programming language in the spirit of PL/SQL (but well designed and object-oriented)".
The problem with that is who is going to use yet another database programming language?
This is not a technical issue though in my opinion.

RVZ

September 9, 2008 5:39 AM  
Blogger Konstantin Triger said...

I don't think that LINQ implies a 'big' language change. In my project JaQue (http://jaques.googlecode.com) I'm perfectly set with closures addition only, which will probably be introduced in Java 7.

Regarding the language capabilities, I think that Java should not strive to be able to express all the PL/SQL or other database concepts. As mentioned in the interview, if Java handles the majority of use cases, that will already bring tremendous value.

Kosta

September 22, 2008 8:20 AM  
Blogger Konstantin Triger said...

Sorry, the project link is http://jaque.googlecode.com

September 22, 2008 8:22 AM  
Blogger nina said...

Roberto,

I think you underestimate programmers and their capability to learn new things. If a language makes their life easier, they will start to use it. Please look at recent examples, like PHP and Ruby.

The biggest difficulty in learning a new language is to familiarize oneself with the API of the standard environment. I don't expect a database language similar to PL/SQL to have a huge standard library.

Konstantin,

ANY language change is a problem if you don't control the language. Can you tell me what "tremendous value" will LINQ bring to Java? In my opinion - not much comparing to Hibernate.

I also don't understand why an object database should give up powerful database mechanisms like stored procedures or views. Have you ever written a database application working in an OLTP environment? How would you make it work fast enough (or work at all) without stored procedures?

The problem here is that people involved in this project seem to be Java programmers, not database specialists. You are going to add persistence to the Java programming language - that's all. Unfortunately, it has nothing in common with defining the new standard of object-oriented databases (wasn't it the goal of this initiative?).

Please understand that JDBC is not the standard of relational databases, it's a standard of accessing relational databases. Similar with Java + LINQ.

It's OK to define an API + Java support of a database middleware, but how would you design a remote controller without designing the TV first?

This work may help you promote db4o as a tool working in small, embedded systems but it will not help define the standard. You need much more to do that.

September 22, 2008 9:06 AM  
Blogger Konstantin Triger said...

Nina,

"ANY language change is a problem if you don't control the language."

Agree, that's why I want to bring LINQ capabilities without changing the language. Please see http://jaque.googlecode.com

"Can you tell me what 'tremendous value' will LINQ bring to Java? In my opinion - not much comparing to Hibernate."

LINQ is a layer above Hibernate or any JPA. See it as JPQL embedded into Java.

Kosta

September 22, 2008 11:07 AM  
Blogger nina said...

"LINQ is a layer above Hibernate or any JPA. See it as JPQL embedded into Java."

So another layer of complexity? Why does a simple thing like storing/searching data in the database have to be so complex? I don't know many people who can REALLY understand how their Java EE applications work. Does providing more and more layers of complexity help design more stable/faster software in shorter time?

"LINQ is a layer above Hibernate or any JPA. See it as JPQL embedded into Java."

I have spent many nights trying to optimize silly SQL queries generated by Hibernate. Assuming that Hibernate is just the lower layer, I guess the optimization path in your case would be even longer (LINQ to HQL and then from HQL to SQL?). Can your optimizer really do that?

September 22, 2008 11:40 AM  
Blogger Konstantin Triger said...

They are all levels of responsibility, consider:

JDBC - accepts DB specific SQL statements. Most powerful.
HQL - abstracts DB specific SQL. Less power, but DB neutral and Object Oriented.
LINQ - provides Java language bindings to HQL.

As a programmer you are free to choose which layer better suits your needs, based on application requirements, your skills etc.

Regarding the optimizations: as the implementation evolves, it will optimize better and better. Thus over time you will get 'free' upgrades, bug fixes etc - as usual. There is always a chance that in some corner cases the hand-tuned query string will do better. For those cases we should log the generated queries, providing an opportunity for a programmer to review them and choose the right thing to do (rewrite her statement or even directly call JDBC).

Kosta

September 22, 2008 12:15 PM  
Blogger Roberto V. Zicari said...

Nina says "I think you underestimate programmers and their capability to learn new things. If a language makes their life easier, they will start to use it. Please look at recent examples, like PHP and Ruby."

I take your point here. However, when it comes to enterprise computing and data, this "bottom up" approach of accepting new technologies may clash with company policies and internal rules.
I am actually curious to see if the OMG is able to pull out a "standard API" for object databases that gets used.

September 23, 2008 11:03 PM  
Blogger William Cook said...

Nina, you raise some valid concerns, but I don't think any of them are insurmountable. We have to assume that Java can be changed, even if it is difficult. If not, then Java will fall behind in terms of innovation. It is easier to copy a design that has been worked out by somebody else; Microsoft has shown this many times, so it's fair that Java should take things that are good and include them. (There are some other stick-in-the-mud aspects of Java that they could borrow from C#: property methods, type inference, etc., but these are less critical.)

As for creating a new database programming language, I think this misses the point. The problem is how to specify queries and updates from within Java -- it's the connection between the PL and the DB that's hard. There is no reason why LINQ can't invoke stored procedures where they are needed. It would be useful if SQL was more uniform, for example, if a stored procedure call could be used in part of a join.

The thing that I find interesting is whether Java could jump ahead of LINQ by fixing some of its limitations. Two things that come to mind are updates (which were mentioned) and better prefetch.

October 4, 2008 12:37 PM  
Blogger nina said...

William,

I also like Java and I wish it all the best. However, I believe that the future of Java should be left to Sun. As far as I know the ODBT WG is working on standards for object databases, not for object-oriented programming languages (especially if it's only Java).

I agree with you that a seamless connection between the PL and DB is very hard to achieve. I don't think it is the most important problem in the area of object databases, but let's say it is.

As you know, the area of persistent programming languages has been a subject of research for a few decades now. Starting from Pascal (e.g. Pascal/R), through Modula (e.g. Persistent Modula), ending on Java (e.g. PJama), all of the research projects failed. I personally know a professor who has been working on this problem for the past 25 (or even more) years.

The lesson learnt from all those efforts is that one CANNOT take an existing, traditional programming language (whether it's Java, C#, C++, Ruby, or whatever else), extend it with database constructs and get a satisfactory solution. There are too many differences between the worlds of databases and programming languages (the so-called impedance mismatch).
The advent of LINQ doesn't change anything here, so from my point of view any attempt to integrate it with Java is simply a waste of time.

The only known way of achieving a seamless integration between procedural languages and database constructs is through designing:
1) a database programming language (DBPL) in the spirit of PL/SQL,
2) its runtime environment in the form of a full-fledged database management system.

Once one has done that, one can try to shift such a DBMS to the client side. What one gets is a database management system at the client side (client applications, even GUI-based, written using DBPL), at the application-server side (client-server application logic written using DBPL), and at the database-server side (data-intensive logic written using DBPL).

This approach gives one the full integration of procedural and declarative constructs in a distributed environment and the ability to build complex database applications using just a few lines of code.

October 4, 2008 8:36 PM  
Blogger William Cook said...

Nina,

I am well aware of the history you cite. And I agree that the history of attempts to turn PL runtimes into databases has been rife with problems. But I think that the converse idea of taking a database and extending it so it can implement the entire system (including the client), is also doomed to fail. I think that this is what you are proposing, but I could be wrong.

The only solution that I see as viable is to keep separate databases and clients: the database is a robust data engine and the client is written in a general purpose language. I believe this is a good approach for many reasons, including scalability, integrity of data, evolution, transactional behavior, and client programmers' needs. If you believe this, then the key problem is to find a way for the PLs to reach out to the databases effectively, and for DBs to be designed to support PLs' needs. LINQ is a better way for PLs to talk to databases. It's not perfect, and there is still work to make databases easier to talk to. But that is what I think we should be doing. I don't think that we should try to find an all-encompassing single language that does it all. I suppose that makes me a postmodernist.

Sun has created a process for proposing language evolution. It has worked fairly well in the past, and that is the process that will be used to attempt to add LINQ to Java. It might not work, but it's worth a try.

I'm actually working on an essay on the gulf between PL and DB viewpoints on this problem. If you'd be interested in commenting on it, I could send you a copy. I'd love to get your feedback.

William
wcook@cs.utexas.edu

October 4, 2008 8:54 PM  
Blogger John Davies said...

Thank you to Roberto for drawing my attention to this blog. I'm travelling at the moment so will leave reading the comments for the flight home.

Having read the title and conversation though, I'd like to comment on what I see as needed in the Java world relating to OODBs and hierarchical persistence. LINQ is an attractive end-goal; however, it is so totally MS-centric that it would be impractical for the "real" world. I read the conversation with interest as many of the problems and issues ring true with my experiences. What we're missing, however, is a nice simple Java API for storing hierarchical trees of data/objects, not a whole new realm of thinking and not yet another layer on top of an already inefficient ORM layer.

ORM is perfect when the problem is simple but it doesn't scale well due to the impedance mismatch complexity. ORM may well be an implementation to solve the problem but the programmer needs an API that abstracts the mapping.

I have a number of clients (investment banks; yes, their numbers are shrinking, I know), and those that are left are investing significant amounts of time and money in developing XML databases, or should I say APIs to store XML in DBs: some OO, some R, and some specifically XML (a la MarkLogic). What they want, however, is not a single solution but a generic API to abstract the multiple solutions, one of which could well be LINQ.

I've got to go 'nd do some real work now, I'll read the comments on the flight home and chat later, I just wanted to get this out.

-John Davies-
CTO Incept5

October 6, 2008 6:35 AM  
Blogger Peter Fallon said...

Solutions to the OO/Database problems often tunnel focus on a few issues and miss others - mainly because each of us has a different problem to solve, as this discussion shows.

1 - Pure object database solutions solve the problem of impedance mismatch and many related programming issues, but usually at the expense of the power offered by the large (relational) database products, database management tools or simple product maturity. They also frequently can't be used with complementary systems such as commercial reporting/analysis tools (which for the most part only offer ODBC or similar interfaces, and mapping an OODB to ODBC just brings back the same old Object/Relational issues again).

2 - Object/relational mapping solutions allow you to work with your OO language, and still connect to the relational backend, with all the power, corporate standards and commercial tool availability implied. However there are always compromises with (server side) business logic, as O/R mapping usually makes it difficult to link complex or pre-existing business logic or security on the server to your front end.

LINQ is a way of addressing data query tasks and retaining OO compile-time type checking (something SQL lacks), while allowing for both in-memory execution against local object collections, or deferred execution/translation to SQL on a backend. For someone using a pure OODB, LINQ and the standards implied will probably be a real benefit, as it's a step up from what has been offered previously.

Providing LINQ method based query functionality to Java could be done as a straight API/function library - but going the whole hog syntax wise, as per the following C# 3.0 Query Expression example:

IEnumerable<string> query = from s in names where s.Length == 5 orderby s select s.ToUpper();

would not happen unless Sun made changes to the compiler and language specification for Java.
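
For comparison, the library-only route can be sketched with the lambda expressions and `java.util.stream` API that Java gained in version 8, well after this comment was written. This is an illustration of the method-based style for in-memory collections only; it is not the API under discussion in the OMG group, the `MethodBasedQuery` class and the sample `names` list are made up, and nothing here is translated to a database query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MethodBasedQuery {
    // Roughly equivalent to the C# query expression:
    //   from s in names where s.Length == 5 orderby s select s.ToUpper()
    static List<String> fiveLetterNamesUpper(List<String> names) {
        return names.stream()
                .filter(s -> s.length() == 5)   // where
                .sorted()                       // orderby (natural String order)
                .map(String::toUpperCase)       // select
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "Burke", "Connor", "Frank", "Everett",
                "Albert", "George", "Harris", "David");
        System.out.println(fiveLetterNamesUpper(names)); // prints [BURKE, DAVID, FRANK]
    }
}
```

The SQL-like keyword syntax is absent, but the filter, ordering, and projection steps are all type-checked by the compiler, which is the property the preceding paragraph singles out.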

For anyone still using a relational backend, it doesn't solve any of the key issues they already face, though it will make work simpler on data buffered at the client. The same problems of how good the generated SQL is, and whether it makes use of pre-defined procedures, views, etc. remain.

October 6, 2008 8:29 AM  
Blogger Scott W. Ambler said...

A few thoughts:
1. I'm always interested in the development of better ways for application code to access databases.
2. There's a long history of new database access languages that are meant to supersede SQL. I have no doubt that jLINQ (or whatever it ends up being called) will enjoy the same level of success as these previous efforts.
3. There are very good architectural reasons for implementing some functionality in the database and some in other locations. Every so often the extremes of "everything in the DB" or "everything outside the DB" make sense, but this is incredibly rare in practice. You need to find the sweet spot.
4. Instead of spending all this effort trying to put together a slightly better technical approach to solving a problem which has been addressed many times over, it would be far more effective to adopt practices and philosophies that enable data professionals and developers to work together more effectively. This is something that I try to focus on at www.agiledata.org.

- Scott
Practice Leader Agile Development, IBM

October 6, 2008 2:29 PM  
Blogger Konstantin Triger said...

@Peter:

1. I don't think that 'Java LINQ' must have exactly the same syntax MS LINQ has. Rather, we need something clear, typesafe, easy to use, and targeting the same domain of problems. With the addition of closures, which I believe will arrive in the next Java release, I can offer the following syntax:

Iterable<String> query = from(names, where( {String s => s.length() == 5}, orderBy( { String s => s }, select ( { String s => s.toUpperCase() } ))));

I think it meets the requirements and is sometimes even better than MS LINQ, since every closure is a regular method and is not subject to any restrictions. For example, there is no need for the 'let' statement introduced by LINQ to cope with some of them.

That's what I do in my JaQue project at http://jaque.googlecode.com

2. I think that MS slightly mistargeted LINQ with "LINQ to SQL", and "LINQ to Entities" 'fixes' that. LINQ shines when it targets Objects or Object Models (i.e. ORM). When this is the architecture, the ORM handles the low level SQL generation, based on its annotations, configuration etc, delivering better quality in translation.

This is the architecture of JaQue project http://jaque.googlecode.com since it targets JPA.

K

October 7, 2008 2:28 AM  
Blogger jacenty said...

The discussion is very interesting, as is the problem itself. But I'm not sure if the approach of LINQ is correct; in general I agree with Nina. Taking LINQ as a standard is rather (extremely?) strange. First, it is Microsoft proprietary and Java comes from Sun. Next, it defines nothing but a PL syntax extension (awkward and difficult to generalise and propagate to other PLs, IMHO). Nothing underlying can be controlled or accessed by a programmer, many database features are lost (transactions, stored procedures, etc). LINQ is not a query language as its expressions are not evaluated by a database engine; they need to be translated by some extra middleware (yet another transparent but not translucent layer) to the database-specific QL. The resulting native queries are again out of the programmer's control (substantial optimisation issues). And, finally, what about DDL and DML constructs? OK, let's assume the schema does not need to be changed, or it's maintained from somewhere else, or DDL is not a part of QLs. Fine. But still I cannot imagine working with a database without updating data. I could not find anything like that in LINQ, while simple 'read-only' queries are insufficient.

I've been a Java programmer for quite a few years and I have been unable to find decent middleware for accessing relational databases that meets my requirements - recently I got seriously disappointed with Hibernate and its performance issues. Therefore, I'm still using JDBC with SQL strings, although the language has many drawbacks and flaws, and its expressions are very often implementation dependent. But it allows me to control what I do with my database and how. What we need is a new flexible and efficient query language, rather than masking a poor one with something that is merely expected to be better. As LINQ is supposed to wrap anything existing below (LINQ for X, LINQ for Y, ...), it assumes our database technology (mostly relational) is what we actually need and are happy with. Are we? Don't we need to develop new (object-oriented) technologies? There's no chance for database evolution with such an approach. No progress, only stagnation. Personally, I do not like such a future.

October 7, 2008 2:32 PM  
Blogger William Cook said...

I'm sorry to say it jacenty, but just about everything you say about LINQ in your message is false.

"First, it is Microsoft proprietary and Java comes from Sun. "
* LINQ is a proprietary implementation of a general idea which originated in academic literature. There is no reason why Java cannot borrow from LINQ in the same way that C# borrowed from Java.

"it defines nothing but a PL syntax extension (awkward and difficult to generalise and propagate to other PLs, IMHO)"
* LINQ is a PL syntax and semantic extension. It is really two concepts. The first is a set of higher-order method calls for Where, Select, GroupBy, etc. The second is some syntactic "sugar" to make these calls look more like SQL. The first would easily generalize to other languages; the syntax is not important.

"Nothing underlying can be controlled or accessed by a programmer"
* LINQ is implemented by a library of method calls. I think you could modify or replace parts of it fairly easily. Or you could implement your own version.

"many database features are lost (transactions, stored procedures, etc)."
* Transactions are orthogonal. It is true that .NET seems to have lost some of the capabilities from Microsoft Transaction Server (which was copied to create EJB, by the way). But I'm pretty sure they will merge the two eventually.

* LINQ can call stored procedures.

"LINQ is not a query language as its expressions are not evaluated by a database engine – they need to be translated by some extra middleware (yet another transparent but not translucent layer) to the database specific QL."
* This comment seems very confusing. I think LINQ is a query language. As far as I'm concerned its queries are executed by the database engine because of the translation that you describe. Using your argument, SQL is not a query language, because it is translated into a lower level query plan for execution.

"Resulting native queries are again out of programmer's control (substantial optimisation issues)."
* This is true. There is a big debate about whether automatically generated queries are good enough. I think they are, in most cases. If they aren't then create a stored procedure for the 5% of complex queries where it matters.

Finally, you are right that LINQ doesn't support bulk update operations, as far as I know. But 95% of updates are to single objects, and those are easily handled by storing modified objects.

I don't have any comments on your second paragraph, which seems reasonable enough.

October 7, 2008 2:50 PM  
Blogger John Davies said...

Well I think JWislicki is right on the nose. While the academics and visual programmers who've never actually worked on real systems will always side with LINQ I just can't see the real world accepting anything that originates from Microsoft.

Don't get me wrong, I think LINQ is pretty cool and I'd love to see it succeed but it just won't. It's another technology that works very nicely in the tight proprietary world of Microsoft, but why would any other vendor have any interest in implementing it on their technology? Even if they do it will just be an academic exercise like MONO was/is. The only reason it was ported was so that MS could say .NET wasn't locked into the MS OS. There are too many owners: Redhat's Application server using Apache components with a Mule ESB running on Sun's Java with Oracle's database on SUSE Linux. Who's going to own the LINQ part here?

LINQ is a nice technology that's worth stealing a few ideas from, assuming of course MS didn't patent them. It will never work outside of the MS world in anything more than a few academics' and enthusiasts' prototypes. I wish it every success in the MS world.

-John-

October 7, 2008 3:40 PM  
Blogger jacenty said...

William, thanks for your response and explanation. I still disagree with several things and I will never like eclectic constructs. However, I looked at several official LINQ examples and all I could find were selection queries (hence what I wrote about updates and stored procedures).

As for transactions, usually they can be to some extent managed by a programmer. Another issue (not related directly to LINQ) is whether transaction orthogonality to a QL is a correct approach.

October 7, 2008 3:48 PM  
Blogger Jim said...

Personally I find it difficult to get too excited about LINQ one way or the other (a bit like my feelings about Vista!). I don't see that it allows me to do anything that I couldn't already do with a decent API, and I don't see how it makes those things significantly easier. Integrating this stuff into the language itself appears to me to be entirely unnecessary, just adding complexity whilst removing choice.
Perhaps LINQ is solving the wrong problem?

In terms of the OMG, LINQ is a language extension NOT an API. It therefore falls completely outside of what Roberto said, "the focus here is to standardize an API for Object databases.". The statement, "LINQ is the best option for a future Java query API", appears to be a contradiction in terms.

October 8, 2008 1:11 AM  
Blogger Roberto V. Zicari said...

Jim:

I have quoted Mike Card: "The consensus at this meeting and at ICOODB conference in Berlin was that LINQ was the best option for a future Java query API since it already had broad support in the .Net community."

October 8, 2008 3:56 AM  
Blogger Prof. Dr. Stefan Edlich said...

Dear all,

just my few cents...

One important point is that ODBMS needs a new API and
a fresh stipulation of a standard soon. At ICOODB 08 there
were only 2 visible alternatives : LINQ and SBA/SBQL
(in this context I really can not understand the
mentioning of hibernate or hql in the context of
object databases?! Gavin doesn't like us anyway ;-).

And although I think the capabilities of SBA/SBQL are awesome
- and I appreciate the great amount of work that has been
invested in this approach -,
LINQ is a language that nearly everyone knows or is able
to learn in a second when looking at this page
http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx

By the way: I can see at least 12 books on amazon on LINQ.

So LINQ is already mainstream and would help ODBMS to get
a huge mainstream momentum. And in my opinion the Java
problems will be solved soon one way or the other.

I am not sure if we have time for a decision till ICOODB 2009
at ETH Zürich but anyway I hope that OMG will take the
best decision soon.

Stefan Edlich

October 8, 2008 6:09 AM  
Blogger Carl Rosenberger said...

Mike Card says:
The consensus at this meeting and at ICOODB conference in Berlin was that LINQ was the best option for a future Java query API

This is excellent news.

What has been missing for a breakthrough paradigm shift towards a widely adopted use of object databases was a standard for querying.

LINQ is the chance for such a standard.

When we use Java, of course we would love to stay in the object-oriented world with our database queries. LINQ does that: It allows method calls on objects and it returns objects.

LINQ could take language integration of queries to the next level and advance development productivity and quality:
With LINQ all queries would be typesafe, compile-time checked and refactorable.

LINQ could also make database backends truly interchangeable. A LINQ provider would be fully standardized by the language, with no ugly dialects that create incompatibilities, like we see them in SQL.

If LINQ finds its way into the Java language we would of course see great implementations for in-memory use and on top of relational databases. Then database interfacing code would only have to be written once and it could be run on the best platform for the respective task:
- In memory for testing
- against relational databases, if corporate policies require their use
- against object databases for maximum performance or for minimal resource consumption on embedded systems.

Let's go for LINQ!

LINQ is making its way on .NET. There is no reason it should be less successful if it becomes available for Java.

October 8, 2008 6:28 AM  
Blogger nina said...

At ICOODB 08 there
were only 2 visible alternatives : LINQ and SBA/SBQL. (...)And although I think the capabilities of SBA/SBQL are awesome
- and I appreciate the great amount of work that has been
invested in this approach -,
LINQ is a language that nearly everyone knows


If there are two good alternatives, why not use them both? Do they contradict each other, or can they coexist? I have the impression that LINQ and SBQL/SBA target different areas of object databases. The goal of the former is to provide the "remote controller", while the latter concentrates on the actual "TV".

I think that the proponents of LINQ forget (among other things) about one important (crucial?) aspect of every query language, i.e. optimization. Those "cool" features of LINQ like queries that are "typesafe, compile-time checked and refactorable" do not have much value if you have to wait years for a single query to be executed. In the case of LINQ there is SQL Server, which does this job for better or worse (assuming that you send it an SQL query). But how would you deal with optimization in the case of an object-oriented database? No, you can't put it off ("as the implementation will evolve it will optimize better and better"), you need to know it NOW. In the case of SBA, very powerful optimization algorithms already exist (some of them don't exist even for SQL). What about LINQ?

October 8, 2008 7:16 AM  
Blogger Roberto V. Zicari said...

Nina

you have identified a *crucial* point: query optimization. In fact, lack of good query optimization was apparently one of the technical obstacles to a wider adoption of the first generation of ODBMSs back in the 90s.

In this respect I would like to quote an interview by Marianne Winslett with Professor David Maier in 2002. To the question "Are there any other results from object-oriented database research that you would single out as having had long-term impact? " , David says "The other thing [that] I think will have impact is [that] I think we finally figured out after ten years how to optimize OQL, to do cost-based [query] optimization [for OQL] and solve some of the hard [optimization] problems. And I think that will [be] useful for XML query languages. "
I do not know if all of this past research knowledge on optimizing queries for ODBMSs can be applied to LINQ now...

October 8, 2008 12:16 PM  
Blogger nina said...

David says "The other thing [that] I think will have impact is [that] I think we finally figured out after ten years how to optimize OQL

With all due respect to Prof. David Maier, I doubt he has ever known that. In order to optimize OQL queries, one would have to define its precise semantics first. Unfortunately, to this day OQL's semantics lack sufficient precision.

This is not the case with SBQL. Because its semantics is clearly defined, optimization is possible. As far as I know, the optimization techniques designed for SBQL are not just some performance "tricks", but very powerful and general algorithms.

If people love the perspective of LINQ + Java + ODBMS so much, I can think of only one viable method of saving this project from a complete disaster: generate SBQL queries the same way MS LINQ generates SQL queries. Of course, if one takes this approach, one needs to design the architecture of the ODBMS according to the requirements set by SBA.

October 8, 2008 3:11 PM  
Blogger William Cook said...

What about this, nina?
Formal semantics and analysis of object queries

OQL is not that difficult a language to pin down. It's not that different from SBQL either. I don't see what the fuss is all about. But I do agree that optimization is the key requirement of queries. I think that LINQ gives the back end enough information to optimize properly, whether the back end is an ODBMS or an RDBMS.

October 8, 2008 3:20 PM  
Blogger nina said...

What about this, nina?

Sorry, I can't download it. I'm not an ACM member.

OQL is not that difficult a language to pin down.

If it's not, then why hasn't anybody done it so far?

October 8, 2008 3:28 PM  
Blogger German Viscuso said...

Hi.

I've been following this thread and my opinion is that OMG was very clever at spotting LINQ as the possible foundation for a standard API for object databases (the fact that it's missing in Java right now is not a blocker).

IMHO LINQ must not be put aside because it was introduced by Microsoft, it has all the potential of escaping Microsoft's stronghold.

On the technical side I would like to say that yes, LINQ-like query integration in the language helps developers design more stable/faster software in a shorter time, mainly because of the reasons that Carl mentioned in this thread. db4o and Prof. Cook were pioneers by introducing native queries, which helped real developers working with real applications to deliver in a shorter time.

The current LINQ implementation by MS might have many flaws but it's evolving and certainly the Java version can improve on it. There are many arguments in favor of LINQ (eg this one) and adoption is growing.

The train is moving and it won't stop!

October 8, 2008 11:12 PM  
Blogger Kazimierz said...

Because the discussion mentioned SBQL, I think it will be good to know what it is. SBQL web pages are www.sbql.pl. Recently we have prepared a programmer manual for our system ODRA where SBQL is fully implemented. See
http://www.sbql.pl/various/ODRA/ODRA_manual.html
The manual does not include the section on transactions, because I have decided to make a new version of this feature (old transactions do not support distributed databases). Recently one of my coworkers has implemented an interface from .NET (C#, ...) to ODRA via SBQL. This part is also not included in the manual yet.

October 9, 2008 3:41 AM  
Blogger Kazimierz said...

I have more general doubts concerning Mike Card's proposal and this discussion. Java is already standardized by ISO. I suppose that any extension to Java should be the business of the corresponding ISO committee rather than an OMG committee. Standardization of LINQ by OMG again raises doubts for me. Although Microsoft is a member of OMG, it is deeply in opposition, at least on the ground of middleware (Roger Sessions severely criticised the CORBA standard, comparing it to COM/DCOM).

I also would like to note that OMG already standardized a query language known as OCL. This was done together with the standardization of UML2 aka Executable UML. In this standard OCL is used as a constraint language (for specifying preconditions, postconditions and assertions), but in another standard, QVT, OCL is used as a regular query language. Is OMG prepared to standardize two query languages? Note that OCL is truly object-oriented, addressing the UML object model. It is in no way related to the relational model.

I know that so far there are no OCL programmers, but this can quickly change. Several groups have already implemented OCL, in particular Martin Gogolla's group and my group. In our case OCL is implemented as a database query language on top of SBQL, hence it inherits everything from SBQL, in particular, query optimization and access to external (distributed) resources.

October 9, 2008 6:14 AM  
Blogger Kazimierz said...

In addition to the above comment, I would like to note that the tradition of OMG is developing standards that are platform and vendor independent. All OMG standards that I know (CORBA, UML, UML2, OCL, QVT, MDA, ...) are developed from scratch as a tradeoff between proposals of different industrial OMG members. If a new OMG database standard were based on Java and LINQ, it could be perceived as direct support for particular companies such as Microsoft and Sun. I doubt that other big OMG players (IBM, HP, SAP, Oracle, ...) would be happy with such a solution. If my impression is right, then we can forget about such a new standard proposal ever being approved by OMG.

In this context I suggest paying more attention to SBQL again. It is platform independent, not supported by any company, based on the powerful and abstract SBA theory, and its semantics is formally defined for a rich family of UML-like object models. The SBQL implementation supports almost everything that is important for making such a standard, including a powerful query language (much more powerful than LINQ), query optimization, all kinds of updates, stored procedures, classes and views, semi-strong typechecking, and more.

October 10, 2008 12:49 AM  
Blogger William Cook said...

Dr. Subieta,

I think you are confusing things: OCL is an object-oriented notation for a predicate in mathematics. OCL does allow iteration of collections, to allow "for-all" and "exists" aggregations. That is, OCL is similar to a SQL where clause. OCL does not construct structured values, as in the OQL or SQL select clause, so it is not a full query language. (I happen to think that the original mathematical notation is nicer, and it was silly to create an object-oriented syntax for it. But it is just syntax, so I'm not going to worry about it.)

I also think that SBQL and OQL are very similar, as I have said before. Nina said that OQL does not have a formal semantics, but then she admits she hasn't read one of the papers that does give a semantics for OQL. SBQL is stack-based (like forth) so it avoids some explicit binding operators. That doesn't seem like a big difference to me.

LINQ is not a new database query language. It is a programming language interface that allows you to specify queries in type-safe way, and also to cleanly represent the queries as explicit values so that they can be optimized (or sent to a database for optimization).

In other words, LINQ is a solution to impedance mismatch as David Maier originally defined it: “Whatever the database programming model, it must allow complex, data-intensive operations to be picked out of programs for execution by the storage manager, rather than forcing a record-at-a-time interface.” [1] You see, LINQ allows parts of a program (the queries) to be lifted out and sent to the database. Other techniques for doing this (notably query strings) are simply a bad way to partition a program. LINQ is better. It's not perfect, but it's better.
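
To make the idea of a query as an explicit value concrete, here is a small hypothetical Java sketch. It is not LINQ's expression-tree API; the types Condition, GreaterThan and And are invented for illustration. The same condition object is either rendered as SQL text for a database or evaluated in memory.

```java
import java.util.Map;

final class QueryAsValueDemo {
    // Render a whole query; the condition is passed in as a value, not a string.
    static String select(String table, Condition where) {
        return "SELECT * FROM " + table + " WHERE " + where.toSql();
    }

    public static void main(String[] args) {
        Condition c = new And(new GreaterThan("salary", 1000), new GreaterThan("age", 30));
        // prints SELECT * FROM Employee WHERE (salary > 1000 AND age > 30)
        System.out.println(select("Employee", c));
        // prints true
        System.out.println(c.matches(Map.of("salary", 1500, "age", 41)));
    }
}

// Hypothetical condition type that two different back ends can interpret.
interface Condition {
    String toSql();                             // back end 1: text for a SQL database
    boolean matches(Map<String, Integer> row);  // back end 2: in-memory evaluation
}

final class GreaterThan implements Condition {
    private final String field;
    private final int bound;

    GreaterThan(String field, int bound) {
        this.field = field;
        this.bound = bound;
    }

    public String toSql() {
        return field + " > " + bound;
    }

    public boolean matches(Map<String, Integer> row) {
        Integer value = row.get(field);
        return value != null && value > bound;
    }
}

final class And implements Condition {
    private final Condition left;
    private final Condition right;

    And(Condition left, Condition right) {
        this.left = left;
        this.right = right;
    }

    public String toSql() {
        return "(" + left.toSql() + " AND " + right.toSql() + ")";
    }

    public boolean matches(Map<String, Integer> row) {
        return left.matches(row) && right.matches(row);
    }
}
```

Because the condition is a structured value rather than a string, a back end is free to inspect, rewrite or optimize it before execution.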

Finally, I should say that I have very little faith in existing standardization processes. Standards have always been a weapon as much as anything else; they are not created by thinking from first principles. Your own comments demonstrate this, because you are taking a very political approach to this problem, in saying that you can't use an idea because a particular company thought it up. I don't believe that OMG's standards were made "from scratch": CORBA was influenced by proprietary offerings, and UML was unified from a number of competing approaches (which had significant consulting practices supporting them).

[1] David Maier. Representing database programs as objects. In Advances in Database Programming Languages, Papers from DBPL-1, pages 377–386. ACM Press / Addison-Wesley, 1987.

October 10, 2008 8:21 AM  
Blogger Piotr said...

Hi,

William Cook says:
I think you are confusing things: OCL is an object-oriented notation for a predicate in mathematics. OCL does allow iteration of collections, to allow "for-all" and "exists" aggregations. That is, OCL is similar to a SQL where clause. OCL does not construct structured values, as in the OQL or SQL select clause, so it is not a full query language.

I have to disagree. Apart from its primary purpose as a constraint language, OCL is a quite powerful expression language. Please note its "->collect(...)" iterator operation (in OCL's terminology). This serves a role analogous to OQL's select clause. It allows for nesting sub-queries in it and can include a Tuple type constructor, so you can construct structured results - also nested ones.
Yes, the syntax is a bit odd, and there are also some ambiguities in its specification if someone considers its usage as a query language. However, in our current research project that deals with programming in UML, we chose OCL as a query language (because the project aims to maximize reuse of existing OMG specifications) and did not encounter significant limitations in its expressiveness. We were able to fit it in a relatively seamless way into UML's Activities and Actions modules (for imperative constructs) to construct a query language with programming-language capabilities. BTW: More results of this work, performed under the VIDE 6th Framework Program project, including part of the software produced, will be available soon.

October 10, 2008 9:53 AM  
Blogger Kazimierz said...

William Cook said...
I also think that SBQL and OQL are very similar, as I have said before. Nina said that OQL does not have a formal semantics, but then she admits she hasn't read one of the papers that does give a semantics for OQL. SBQL is stack-based (like forth) so it avoids some explicit binding operators. That doesn't seem like a big difference to me.

Sorry Prof. Cook, you seem to compare things surely without being familiar with one of them. OQL and SBQL are fundamentally different languages, because SBQL is a fully-fledged object-oriented programming language with one difference in comparison to the classical ones - expressions in SBQL are queries. For instance, 2+2 is a query, sin(x) is a query and Employee where salary > 1000 is a query. Such queries or expressions are used everywhere, in particular, as arguments of imperative statements and as parameters of procedures and methods.

I also think you have misunderstood the term "stack-based". All programming languages, including C, Java, Pascal, C++, etc., are "stack based", because all of them involve an environment or call stack. The novelty of SBQL is that this stack, defined on the abstract level, is used to specify the semantics of query operators such as selections, navigations, joins, quantifiers, etc. This forms a new theory that is original in both database and programming language domains. I encourage you to understand it. Without this our discussion and comparisons are simply a waste of your and our time.

I support Nina's thesis that OQL has no formal semantics. I make the thesis even stronger: it is impossible to define a formal semantics for OQL. There are two reasons. The first is that the ODMG object model (database state) is mathematically very imprecise; the standard does not even specify formally the concept of "object". The same concerns the formal model of query results. Formal semantics means that we have to define for each query the mapping State --> Result, and we should do that recursively according to the OQL abstract syntax. If the domains are not precisely defined, then formal semantics is impossible to define. The second reason is that the ODMG standard is full of technical flaws. Flaws can even be found within half a page of each other (see some of my publications). I spent a lot of time trying to formalize the standard and OQL. Are you curious about the result?

The result is SBQL and SBA. I don't believe in other formalizations, I saw too many fake formalisms.

October 10, 2008 10:47 AM  
Blogger Robert said...

I think the main point is that object databases need a common query mechanism, so tooling can be implemented easily enabling access to any vendor implementation.

Imagine something like, the BIRT LINQ extensions, so BIRT reporting tools can access any OODB or RDB.

Simply bringing a common query mechanism to the object database vendors is not enough. It needs to be a query mechanism accepted by the software community at large, otherwise we've really just created a better form of OQL ... big deal, it does not help with adoption.

The approach taken by LINQ lends itself well to the object database notion of "the memory model, is the data model" ala transparent persistence.

The ideas expressed in LINQ are genuinely interesting and present true value add over alternatives in Java ( as articulated by many in this thread ).

LINQ fits OODBs well and its implementation in Java would add value to that community. So, object database vendors should promote its implementation in Java and in the process achieve both interoperability and (arguably) as a side effect, something widely accepted.

It brings value to the Java community and brings value to object database technology interoperability.

I think an important part of this is "it brings value to the Java community", and as such we should be able to get cooperation from that community in its implementation. If that cannot be achieved, then the positive impact regarding odb interoperability would be significantly diminished.

-Robert

October 10, 2008 11:06 AM  
Blogger William Cook said...

Prof. Kazimierz,
We have gotten into this discussion before, and it is never resolved. There certainly may be some ambiguities in the ODMG OQL spec, but rather than working to clear them up, you just say that it's impossible. But at the same time you give correspondences between OQL and SBQL on your own web site. You mention a few fine points where they differ, although some of these are simply that things are not as easy in OQL as in SBQL; this is not an argument about expressive power, it's an argument about syntactic ease. I was comparing OQL and the query part of SBQL, without the updates.

You make a big point that "2+2 is a query, sin(x) is a query and Employee where salary > 1000". These are also queries in OQL: 2+2 is a query, sin(x) is a query and 'select e from Employee as e where e.salary > 1000' are all OQL queries.

I stand by my assertion that SBQL is stack-based in the sense that Forth is stack-based. Your non-algebraic operators are defined so that the left side pushes items onto the stack, and they are implicitly referenced by the right side of the expression. This gives somewhat of an economy of expression (as in Forth) but at the cost of being less explicit in terms of references to values. The difference is a matter of taste.

At the same time you refuse to consider that any other language could have a formal semantics. I agree that this discussion is not very productive. You have a competing technology to OQL and you are promoting it. I'm fine with that. You may very well have a much better implementation of SBQL than OQL implementations, because OQL was never adopted widely and, as previously mentioned here, the original OODB products didn't have very powerful query optimization. This was a terrible mistake on their part, which I hope will be rectified in future products. It would be better if you gave performance numbers rather than trying to argue about semantic foundations.

October 10, 2008 8:07 PM  
Blogger Kazimierz said...

William Cook said:
You make a big point that "2+2 is a query, sin(x) is a query and Employee where salary > 1000". These are also queries in OQL:

I never said that these queries are impossible in OQL. My thesis was different: OQL is a query language, while SBQL is a programming language that uses queries as expressions. In SBQL there are no expressions that are not queries and this feature is so far unique for both databases and programming languages. OQL queries can be loosely coupled with imperative statements or procedures, actually as strings of characters. This is not the case of SBQL: queries, similarly to programming expressions, can be arguments of imperative (updating) statements, can be passed (not as strings!) as parameters of procedures and methods in both call-by-reference and call-by-value mode. Moreover, SBQL is strongly typed, including the use of queries as parameters of procedures/methods.
For this reason SBQL is higher-level than C#/LINQ. C#/LINQ makes a distinction between expressions and queries, which is illogical, because the typing system is the same. Moreover, LINQ queries cannot be used for updating, which is illogical too. Such limitations do not exist in SBQL.

William Cook said:
I stand by my assertion that SBQL is stack-based in the sense that Forth is stack-based.

I disagree. Although both Forth and SBQL are described as “stack based”, the reasons for this descriptor are different and the stacks that are used by these languages are different. Forth is a somewhat more advanced assembler and involves stacks known as the parameter (data) stack and the return stack. These stacks are EXPLICITLY used by the programmer. SBQL is designed as the most abstract database programming language (more abstract than SQL and LINQ) and no stack is explicitly used by the programmer. SBQL stacks are used for formal description of SBQL semantics. The SBQL run time introduces two stacks: a result stack and an environment (call) stack. Both stacks are used in some form in every programming environment, including Pascal, C, Java, Ruby, etc. The novelty of SBQL is that these stacks are described in an abstract mathematical form, which can be used for formal description of every language construct, including semantics of query operators. There is little in common with reverse Polish notation (RPN) that is used by Forth and Hewlett-Packard calculators. SBQL does not deal with such a notion.
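
A greatly simplified Java sketch of how an environment stack can give meaning to a where operator, along the lines described in this thread: for each element of the left-hand result a new frame is pushed, the right-hand side is evaluated by looking names up from the top of the stack, and the frame is popped. This is not ODRA code; all names here are hypothetical, and the result stack is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class EnvStackDemo {
    // Environment stack: each frame maps names to values.
    private final Deque<Map<String, Object>> env = new ArrayDeque<>();

    // Resolve a name by searching frames from the top of the stack downwards.
    Object bind(String name) {
        for (Map<String, Object> frame : env) { // iteration starts at the most recent push
            if (frame.containsKey(name)) {
                return frame.get(name);
            }
        }
        throw new IllegalStateException("unbound name: " + name);
    }

    // "left where right": push each element's fields, evaluate right, pop.
    List<Map<String, Object>> where(List<Map<String, Object>> left,
                                    Predicate<EnvStackDemo> right) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> element : left) {
            env.push(element);
            try {
                if (right.test(this)) {
                    result.add(element);
                }
            } finally {
                env.pop();
            }
        }
        return result;
    }

    // Hypothetical helper that models an object as a map of its fields.
    static Map<String, Object> employee(String name, int salary) {
        Map<String, Object> fields = new HashMap<>();
        fields.put("name", name);
        fields.put("salary", salary);
        return fields;
    }

    public static void main(String[] args) {
        EnvStackDemo sbq = new EnvStackDemo();
        List<Map<String, Object>> employees =
                List.of(employee("Doe", 1500), employee("Poe", 900));
        // Corresponds to: Employee where salary > 1000
        System.out.println(sbq.where(employees, e -> (Integer) e.bind("salary") > 1000));
    }
}
```

The programmer writing the predicate only mentions the name salary; the stack that resolves it stays inside the evaluator.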

William Cook said:
At the same time you refuse to consider that any other language could have a formal semantics.

Sorry, I never said that, this is your invention. I claimed that I do not believe in formalization of OQL and presented the reasons for such a claim. To formalize OQL one must make it consistent, because it is impossible to formalize (and implement) anything that is internally inconsistent. One of many places where OQL is inconsistent concerns name scoping rules. In one place the standard says that each name defined in a query is invalid outside it. But a half a page later it gives an example which explicitly uses a name defined in a query outside it. BTW, explaining scoping rules without introducing an environment stack is impossible. Only SBA introduces this stack, hence my doubts concerning other formalizations.

I would like to underline once again that I am not against ODMG and OQL. Nothing is perfect from the beginning and I think ODMG did a good job. Our role is to improve on their job. My group and I have done that, naming the result SBA and SBQL.

William Cook said:
Finally, you are right that LINQ doesn't support bulk update operations, as far as I know. But 95% of updates are to single objects, and those are easily handled by storing modified objects.

Even if updates concern single objects, they must be found somehow. How? There are three scenarios: (1) iterate over a collection; (2) use an index; (3) use a query that returns a reference to an object. (1) is a disaster for large collections having e.g. millions of elements. (2) is trouble for database administration, because such an index must always exist and the DBA has no freedom to change indices. (3) is the only good option, and it is used in both SQL and SBQL. For instance, suppose Doe is to receive the average salary plus 100. In SBQL this is accomplished by two queries, one on the left side of the assignment and the second one on its right side:

(Employee where name = "Doe").salary := avg(Employee.salary) + 100;

The first query returns a reference to Doe’s salary (provided there is only one Doe; otherwise an exception is raised) and the second query returns a value that is assigned through that reference.
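
For readers more at home in Java, the shape of that statement can be sketched roughly as follows. This is only an analogy under stated assumptions, not SBQL and not any vendor's API: `ReferenceUpdateSketch`, `Employee`, `single` and `averageSalary` are hypothetical names, and the in-memory scan inside `single` merely stands in for whatever a real query engine would do to locate the object. What it shows is the two roles: the left-hand query yields a reference to exactly one object (or fails), the right-hand query yields a value, and the assignment goes through the reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative only: a Java analogy for an update expressed as two queries.
class ReferenceUpdateSketch {
    static final class Employee {
        final String name;
        double salary;

        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }
    }

    // "Left-hand" query: returns a reference to the single matching object,
    // and raises an exception if there is no match or more than one.
    static Employee single(List<Employee> employees, Predicate<Employee> condition) {
        List<Employee> hits = new ArrayList<>();
        for (Employee e : employees) {
            if (condition.test(e)) {
                hits.add(e);
            }
        }
        if (hits.size() != 1) {
            throw new IllegalStateException("expected exactly one match, found " + hits.size());
        }
        return hits.get(0);
    }

    // "Right-hand" query: computes a plain value.
    static double averageSalary(List<Employee> employees) {
        return employees.stream()
                .mapToDouble(e -> e.salary)
                .average()
                .orElseThrow(() -> new IllegalStateException("no employees"));
    }

    // Hypothetical sample data used by the demo.
    static List<Employee> sample() {
        List<Employee> employees = new ArrayList<>();
        employees.add(new Employee("Doe", 1000.0));
        employees.add(new Employee("Smith", 2000.0));
        employees.add(new Employee("Jones", 3000.0));
        return employees;
    }

    public static void main(String[] args) {
        List<Employee> employees = sample();
        // The value is computed first, then assigned through the reference.
        double newSalary = averageSalary(employees) + 100;
        single(employees, e -> e.name.equals("Doe")).salary = newSalary;
        System.out.println(single(employees, e -> e.name.equals("Doe")).salary);
    }
}
```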

October 13, 2008 6:27 AM  
Blogger William Cook said...

My point with OQL is that in formalizing it you would have to change some of the informal documentation to be consistent. You may even have to correct or modify the informal specification. That is why I say that OQL can be formalized, with appropriate modifications to the spec. This is normal in any formalization. Your point that a particular specification document cannot be formalized without changes is certainly true, but not very useful.

If SBQL is a full programming language, then it should be compared to C# as a full programming language, not to LINQ, which is a way for an object-oriented procedural language to interface cleanly with a variety of pure functional query languages. The whole point of LINQ is that you don't need a complete new programming language; you can keep using C#, which has a lot of tools and support behind it. Sure, it is not an academically perfect solution, but it is a practical one.

I think that the SBQL use of an implicit environment stack is mostly a syntactic device. I know that you say it makes the language semantics more compositional. The only semantics I've seen for SBQL (e.g. chapter 6 of your book) is very operational. All languages manipulate environments of bindings, and these are explicit in the semantics of the languages. Saying that "Only SBA introduces this [environment] stack" makes no sense.

As for updates, in LINQ you can use a query to find an object, then use the update operations on the object to make changes. This handles 95% of cases. The thing you cannot do in LINQ is bulk updates, which is unfortunate.

October 14, 2008 7:59 AM  
Blogger nina said...

William Cook said...
If SBQL is a full programming language, then it should be compared to C# as a full programming language, not to LINQ

It should be compared to PL/SQL. Apart from client-side programming, there is a whole world of server-side problems. I doubt anyone would like to develop stored procedures, triggers, views, etc. using Java (even with LINQ).

October 14, 2008 2:17 PM  
Blogger Kazimierz said...

William Cook said:
Your point that a particular specification document cannot be formalized without changes is certainly true, but not very useful.

In the case of OQL these changes are fundamental and concern the object model, the model of query results, the syntax (which is inconsistent), the idea of the semantics (which is absent), the idea of strong typing (which is inconsistent too), scoping and binding rules (which are absent), integration with object manipulation capabilities (which is poor), integration with database views (which is naïve), the metamodel (which is underspecified and wrong), etc. We have kept only the general idea of the standard, changing almost all the details and making the semantics specified, consistent, implementable and optimizable.

William Cook said:
I think that the SBQL use of an implicit environment stack is mostly a syntactic device.

Totally disagree. The environment stack (together with an abstract object store and the result stack) is fundamental to the description of SBQL semantics. Obviously, it is related to syntax, because the semantics of some syntactic constructs (binding, non-algebraic operators, method calls, parameter passing, etc.) is specified through operations on the stack, but this is a property of any semantics that is driven by syntax. The stack is also fundamental to the implementation, strong typing and optimization of queries.

William Cook said:
The only semantics I've seen for SBQL (e.g. chapter 6 of your book) is very operational.

Indeed, it is operational, but I don’t understand your argument. What is wrong with operational semantics? In 1985 I started to formulate SBA in terms of denotational semantics (see my early papers), but quickly abandoned it. It was totally illegible for any audience. Hence I started to develop the operational semantics, but without changing the SBA idea.

William Cook said:
All languages manipulate environments of bindings, and these are explicit in the semantics of the languages. Saying that "Only SBA introduces this [environment] stack" makes no sense.

I stated this explicitly: “Both stacks are used in some form in every programming environment, including Pascal, C, Java, Ruby, etc.” SBA is commonly used in all programming languages that involve any kind of sub-routines, procedures, functions, methods, classes, etc. My argument was in the context of OQL. For QUERY LANGUAGES only SBA introduces the environment stack. If any other formalism, e.g. object algebra or calculus, does not involve the environment stack, it means that the formalism is conceptually limited or invalid. It is unable to express precisely fundamental semantic properties such as naming, name scoping, name binding and query nesting. This is the case of other formalizations of OQL.

William Cook said:
As for updates, in LINQ you can use a query to find an object, then use the update operations on the object to make changes.

Sorry, I didn’t find such an example. Can you specify the place? As far as I understand, LINQ queries do not return references to objects, hence any updating operations are impossible. This is of course quite an easy option to implement and perhaps is or will be quickly introduced by the LINQ developers. But this concerns only C# objects. If we are talking about LINQ to SQL, this is much more difficult, because LINQ queries together with updating tokens must be mapped into SQL update, insert and delete statements. Such a feature is not easy to develop and implement, especially if the mapping between a relational database and a C# object model is not trivial. All that I read on the LINQ pages is that LINQ queries can call stored procedures on the side of a relational database and these procedures can perform updates. Of course, in this case it does not matter whether the updates concern single objects or many objects at once.

October 14, 2008 2:45 PM  
Blogger William Cook said...

Well, I suppose we will just have to agree to disagree on many of these points.

If you want to understand updates in LINQ, Google for 'LINQ updates'. Here's an example from the first hit:

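// Query for the single product whose ProductID is 1
var product =
    (from p in dataContext.Products
     where p.ProductID == 1
     select p).Single();
// Modify the returned object, then persist the change
product.Name = "Toaster";
dataContext.SubmitChanges();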

Just for the record, there are many papers on semantics, typing and optimization of OQL, if anyone cares to find out about them. They don't have the dire problems that professor Kazimierz describes. Trigoni's thesis, Semantic Optimization of OQL Queries, has a good bibliography.

October 14, 2008 3:21 PM  
Blogger MiChro said...

Having found this discussion, here is why I think SBA/SBQL should not be left behind LINQ. LINQ is the product of a commercial company and does not really bring new solutions; it is a technology rather than a new idea. Why, then, try to make it a standard? A standard should not be one particular car. Saying "Toyota Camry is a standard..." makes little sense: it might be a good car, but that does not mean it should be a standard. When we say "standard car" we mean that a car should have 4 wheels, a steering wheel, an engine, etc., and this "etc." should be the subject of the discussion about what the standard should look like. This is where SBA/SBQL comes in. SBA/SBQL is an approach, just as the "car standard" is. SBA and SBQL are a tandem that gives us a new quality of approach, not just a new language. I am not giving details here, as you can read them at the links given by Prof. Kazimierz Subieta and form your own impression, not influenced by my opinion. The more people know about it, the better its chances to prove its strength against present-day solutions and to spread (I believe most readers who are not devoted or attached to MS will appreciate the idea).

It is crystal clear that Microsoft has every right to promote its proprietary LINQ for .NET as its technology, but accepting it as the "car" does not seem reasonable.

More importantly, Java (developed by Sun) is moving toward being 100% open, and here again SBA/SBQL has a tremendous advantage as an independent solution. Those are, in my opinion, the basic arguments for SBA/SBQL. The technical discussion above, with the details given by Prof. Subieta, shows only a small fraction of the possibilities of SBA/SBQL and its innovations (e.g. updatable views, optimisation).
History has given us the rather unpleasant experience of the ISO standardisation of MS OOXML, so it is quite possible that some behavioral patterns could be repeated. Why give MS a tool to influence Java? MS has already tried to get its own Java, but the attempts failed. Standardizing the proprietary LINQ technology for Java would simply mean that MS would have influence on a language that competes with .NET. The only disadvantage of SBA/SBQL in this respect is that it has not been developed and supported by a large company with extensive financial and marketing support.

I honestly believe SBA/SBQL is a great step forward, not only relative to the capabilities of present-day databases but also relative to the widely used technologies that, because of technological progress, will soon have to be used for object databases.

Thank you

October 16, 2008 2:44 AM  
Blogger Roberto V. Zicari said...

FYI, an interesting reading:

Erik Meijer, José Blakeley
The Microsoft perspective on ORM

Interview in ACM Queue Magazine with Erik Meijer and José Blakeley. With LINQ (language-integrated query) and the Entity Framework, Microsoft divided its traditional ORM technology into two parts: one part that handles querying (LINQ) and one part that handles mapping (Entity Framework).

Article | Basic | English | LINK | September 2008 |

You can find the link at:
www.odbms.org/downloads.html#odbms_ap

October 20, 2008 7:50 AM  
Blogger Michael said...

Hello everyone-

I apologize for being late to this discussion. Roberto has asked me a couple of times to contribute, but I have been very busy working on some new business efforts at my company (which involve object databases and advanced data mining!)

At the Object Database Technology Working Group, we had several presentations by Prof. Subieta of his SBA/SBQL work. I was and continue to be very impressed by them, and several of us who saw them thought that they represented a whole new thought on approaching the management of data from an object perspective and that they could be the ultimate "bridge" between the object and relational worlds. Others, such as Prof. Cook, were less impressed and did not see as much value in SBA/SBQL.

After several meetings, some of which included demonstrations by Prof. Subieta, the consensus among vendors was that a "string-based" approach would not be accepted by their customers, many of whom are Java developers. Their position was that SBQL was a language separate from Java which would ultimately not integrate easily with Java itself (e.g. SBQL strings would have to be parsed or interpreted as opposed to compiled in-line like LINQ).
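
The distinction the vendors were drawing can be illustrated with a small Java sketch. Everything in it is hypothetical: `QueryStyleSketch`, `Product`, `queryByString` and `queryByPredicate` are invented for this illustration, the one-condition string grammar is a toy and is not SBQL, and the lambda version is ordinary Java rather than LINQ. It only shows the practical difference: a query held in a string is opaque to the compiler, so a misspelled field name is discovered at run time, whereas a query written in the host language is checked when the program is compiled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative only: string-based versus native-language query styles.
class QueryStyleSketch {
    static final class Product {
        private final int id;
        private final String name;

        Product(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }

        String getName() { return name; }
    }

    // String-based style: the query is text interpreted at run time.
    // Toy grammar: "<field> = <value>", with fields "id" and "name" only.
    static List<Product> queryByString(List<Product> products, String query) {
        String[] parts = query.split("=", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("cannot parse query: " + query);
        }
        String field = parts[0].trim();
        String value = parts[1].trim();
        List<Product> result = new ArrayList<>();
        for (Product p : products) {
            String actual;
            switch (field) {
                case "id":
                    actual = String.valueOf(p.getId());
                    break;
                case "name":
                    actual = p.getName();
                    break;
                default:
                    // A typo in the field name only surfaces here, while the program runs.
                    throw new IllegalArgumentException("unknown field: " + field);
            }
            if (actual.equals(value)) {
                result.add(p);
            }
        }
        return result;
    }

    // Native style: the condition is host-language code, so a misspelled
    // accessor such as p.getNmae() would be rejected by the compiler.
    static List<Product> queryByPredicate(List<Product> products, Predicate<Product> condition) {
        return products.stream().filter(condition).collect(Collectors.toList());
    }

    // Hypothetical sample data used by the demo.
    static List<Product> sample() {
        List<Product> products = new ArrayList<>();
        products.add(new Product(1, "Toaster"));
        products.add(new Product(2, "Kettle"));
        return products;
    }

    public static void main(String[] args) {
        List<Product> products = sample();
        System.out.println(queryByPredicate(products, p -> p.getId() == 1).size());
        System.out.println(queryByString(products, "name = Toaster").size());
    }
}
```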

As a developer myself, I concur with the view that what one wants is a native-language access mechanism for persistent objects. I have read some posts here where people say something like "well all you are doing is adding persistence to Java." Indeed, you might say that is "all" but it is huge. Anyone who has used an ODBMS to manage data can immediately see the advantage vs. JDBC and similar string-based mechanisms.

One also has to remember that OMG is not a true standards body; it is an industry consortium which is a very different animal. The OMG itself makes nothing; its groups put out Requests For Proposals (RFPs) and participating members answer the RFPs with draft standards. A "winner" is ultimately chosen that becomes the new standard, so OMG standards are the result of collaborations between sometimes competing companies. All of the vendors in our group are familiar with LINQ and many support it in C# product offerings. None of the vendors have built-in support for SBQL, it would have to be created anew and an effort to include it as a native language feature of Java (remember why - don't want to be in the string-parsing business) would be a "zero-momentum" effort, i.e. if the vendors backed that option they would have to spend $ to market it and explain it in an effort to draw interest and support. With LINQ, that is not the case because Microsoft has already done that heavy lifting and has already put their money where their mouth is (so to speak) by building it