Why Patterns of Data Modeling?
I have published another chapter of the new book “Patterns of Data Modeling” by
Dr. Michael Blaha. Altogether, you can now download three chapters of the book:
Tree Template, Models, and Universal Antipatterns.
At the same time, I asked Dr. Blaha a few questions.
At the end of the interview you’ll find some more opinions on this topic.
Q1. What are Patterns of Data Modeling?
Michael Blaha: Experienced data modelers don’t limit their thinking to primitive constructs. Rather they leverage what they have seen before. Patterns of data modeling are ways of cataloging past superstructures that are profound and likely to recur.
There are different aspects of data modeling patterns. There are models of common data structures (mathematical templates), models to be avoided (antipatterns), core concepts that transcend application domains (archetypes), and models of common services (canonical models). Modelers should avail themselves of the full pattern toolkit and not focus on one technique to the exclusion of others.
The literature covers abstract programming patterns that exist apart from application concepts. For example, the Gang of Four book (“Design Patterns: Elements of Reusable Object-Oriented Software”) has excellent coverage of abstract programming patterns. There is no reason why databases should not have a comparable level of treatment. Until my recent book (“Patterns of Data Modeling”), the literature lacked an abstract treatment of data modeling patterns.
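As a rough illustration of what a “mathematical template” in the answer above might look like in code, here is a minimal sketch of a simple tree, assuming each node has at most one parent and any number of children. The names `Node`, `add_child`, and `depth` are hypothetical and are not taken from the book.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """One node of a simple tree: at most one parent, any number of children."""
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> "Node":
        # Enforce the tree constraints: a single parent and no cycles.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        ancestor: Optional["Node"] = self
        while ancestor is not None:
            if ancestor is child:
                raise ValueError("adding this child would create a cycle")
            ancestor = ancestor.parent
        child.parent = self
        self.children.append(child)
        return child

    def depth(self) -> int:
        # Number of edges between this node and the root.
        return 0 if self.parent is None else self.parent.depth() + 1


# The same abstract structure could hold an org chart, a parts list, and so on.
root = Node("Company")
division = root.add_child(Node("Division"))
team = division.add_child(Node("Team"))
```

Nothing in the sketch refers to a particular application domain, which is the sense in which such a structure stands apart from any one problem.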
Q2. Where and when are Patterns of Data Modeling useful?
Michael Blaha: All experienced modelers should use data modeling patterns. It is important to reuse ideas that have been tried and tested, rather than reinvent technology from scratch. I know that data modeling patterns are useful because this is the way that I think as I perform my work as an industrial consultant.
I use data modeling patterns for application data models, enterprise data models, data reverse engineering, and abstract conceptual thinking. Data modeling patterns are not a panacea to the troubles of development, but they are part of the solution. With patterns, developers can accelerate their thinking and reduce modeling errors.
Q3. Is there any difference in the applicability of Patterns of Data Modeling if the underlying Database System is a relational database as opposed to for example an Object Oriented or a NoSQL database?
Michael Blaha: No. That is the whole premise of software engineering: to quickly address the essential aspects of a problem and defer implementation details. A conceptual data model is focused on finding the important concepts for a problem, delineating scope, and determining the proper level of abstraction. All this deep, early thinking happens regardless of the eventual implementation target. Data modeling patterns mostly apply to the early stages of software development.
Bill Premerlani and I took this approach in our 1998 book (“Object-Oriented Modeling and Design for Database Applications”). We presented detailed mapping rules for how to implement conceptual models with relational databases, an object-oriented database (ObjectStore) and flat files. Our 1991 book (“Object-Oriented Modeling and Design”) and its 2005 sequel explained how to map OO models to several programming languages.
So patterns of data modeling (as well as programming patterns and other kinds of patterns) apply regardless of the eventual downstream implementation.
Q4. What’s the difference between a pattern and a seed model?
Michael Blaha: A seed model is specific to a problem domain. It is a tangible piece that you can extend to build an entire application. Several authors (such as Hay, Fowler, and Silverston) have published excellent books with seed models. In contrast, a pattern is abstract and stands apart from any particular application domain. Patterns are at the same level of abstraction as UML classes, associations, and generalizations. A pattern is a composite building block. Seed models and abstract patterns are both valuable techniques. They are complementary and are often used together.
Q5. What do you see as frontier areas of databases and data modeling?
Michael Blaha: I’m now working on a new topic — SOA and databases. SOA is an acronym for Service-Oriented Architecture, an approach for organizing business functionality into meaningful units of work. Instead of placing logic in application silos, SOA organizes functionality into services that transcend the various departments and fiefdoms of a business. A service is a meaningful unit of business processing. Services communicate by passing data back and forth. Such data is typically expressed in terms of XML. XML combines data with metadata that defines the data’s structure. A second language — XSD (XML Schema Definition) — is often used to specify valid XML data structure.
The promise of SOA is being held back by a lack of rigor with XSD files. Many developers focus on the design of individual services and pay little attention to how the services fit together and collectively evolve. Enterprise data modeling is the solution to this problem. A data model is essential for grasping the entirety of services and abstracting services properly. A data model also provides a guide for combining services in flexible ways.
I see evidence for a lack of data modeling in my consulting practice. I have studied several XSD standards and they all ignore data models. The literature in the area of SOA and data modeling is sparse. The current situation is untenable and SOA projects must pay more attention to data.
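To make the point about XML carrying both data and metadata concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The message, its element and attribute names, and the `structural_problems` helper are all invented for illustration. The helper is only a hand-written stand-in for the kind of structural rule an XSD would declare, not an XSD validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical service message; element and attribute names are invented.
message = """
<customerOrder id="1001">
  <customer>Acme Corp</customer>
  <item sku="A-17" quantity="3"/>
  <item sku="B-02" quantity="1"/>
</customerOrder>
"""

root = ET.fromstring(message)

# Tags and attribute names are the metadata; text and attribute values are the data.
order_id = root.get("id")
customer = root.findtext("customer")
items = [(item.get("sku"), int(item.get("quantity"))) for item in root.findall("item")]


def structural_problems(order: ET.Element) -> list:
    """Return a list of violations of a few assumed structural rules."""
    problems = []
    if order.tag != "customerOrder":
        problems.append("root element must be customerOrder")
    if order.find("customer") is None:
        problems.append("missing customer element")
    for item in order.findall("item"):
        if not (item.get("quantity") or "").isdigit():
            problems.append(f"item {item.get('sku')} needs a numeric quantity")
    return problems
```

Rules like these, written service by service, are where structures can drift apart. A shared data model would give each service the same definition of what an order or a customer is.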
#
“Patterns of data modeling are very important. They enable data modeling efforts to be both effective and efficient. Working without patterns is like wandering around in the data wilderness trying to find your way.
SOA and Data. This is another vital area that must be addressed. I am doing it in my practice. It brings together data, metadata, metacards, data registries, data catalogs, and service. Very important for scalability when the data network size grows (e.g., the government, nationwide health services, etc.).” — James Odell.
“I am mostly an object modeller, but I always recommend that my clients start with existing data model patterns rather than with a blank sheet of paper.
The data modelling patterns I most turn to are David C. Hay’s (Data Model Patterns: Conventions of Thought etc.).” — Jim Arlow.
“I agree with all that Dr. Blaha said advocating the use of patterns. This was very articulately worded, and I like to see those views spread around.
I also recognize that what he’s tried to do in this book is very different from what Len Silverston, Martin Fowler and I did.
It is true that we were focused on modeling the real world–“domains” as he described it. He, on the other hand, has abstracted modeling to the point that he describes modeling itself–“tree” structures, undirected graphs, directed graphs, and so forth.
It is true that Dr. Blaha’s book is abstract in the extreme.
In fact, in my new book, Enterprise Model Patterns: Describing the World, I take on the issue of level of abstraction directly. In this, I am presenting a semantic model that I claim describes the entire enterprise, but on multiple levels of abstraction.
The first (Level 1) is a generic model that any company or government agency can take on as a starting point. It is generic because most attributes are actually captured as data in CHARACTERISTIC entities. (This corresponds to Dr. Blaha’s discussion of soft-coded values.) Thus, they become the problem of the user community, not the data modeler. The data modeler can address the true structures of the business. Yes, this model is organized in terms of five fundamental domains: people and organizations (who), geographic locations (where), physical assets (what), and activities and events (how). It also addresses time (when), but that’s a different kind of model. (This model is based on some 20+ years of experience in the field, but I was inspired to write it from my experience over the last few years with the Federal Data Architecture Subcommittee. The committee hasn’t been very effective at creating patterns to distribute to Federal agencies, but it did inspire me to try to capture my views on the subject.)
I then address Level 0, which is a template for the first four categories above. (This is an enhanced version of the THING/THING TYPE model). In addition, at this level are two “meta” models: Document management and accounting. Each of these subject areas itself refers to the entire rest of the model.
At Level 2, I deal with functional specializations. These are more detailed than the Level 1 models and make use of the entities in Level 1 combined in specific ways. These subject areas address such things as addresses (both physical addresses–“facilities”–and virtual addresses–telephone numbers, e-mail addresses, etc.), human resources, contracts, and the like. While they are more specialized than Level 1, they are still generally applicable patterns. (And these areas address the “why” of the organization.)
At Level 3, I address specific industries. For “vertical” models, I take the position that the Level 1/2 models address 80-90% of any company’s requirements. For each industry, however, there are a few special areas that need special attention. These are the things that make that industry unique. I took on five of these, trying to get a cross-section from completely different worlds: criminal justice, microbiology, banking, oil production, and highway maintenance. If you don’t know anything about one of these industries, here is where you can learn something.
I agree that patterns are technology independent. I disagree that “object” models are technologically independent. That Dr. Blaha began with the Gang of Four book (“Design Patterns: Elements of Reusable Object-Oriented Software”) tells something about his orientation. As it happens, in my latest book, I did (as my colleagues would say) “move over to the dark side”, and use UML as the notation, even though that notation is specifically oriented towards object-oriented design, not business modeling. I had to tweak some of the terms to break out of its object-oriented design history. These are conceptual, business-oriented models, not design models.
In doing this, I may have managed to offend both my data modeling colleagues (“You really have gone over to the dark side, haven’t you?”) and my UML colleagues (“What have you done to my UML?”). Or perhaps I may have started building a bridge between the two groups? Only time will tell.” — Dave Hay.
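The idea mentioned in Dave Hay's comments, that attributes can be captured as data in CHARACTERISTIC entities (soft-coded values), might be sketched roughly as follows. This is only an assumption-laden illustration using Python's built-in `sqlite3` module. The table names, column names, and sample rows are hypothetical and are not taken from either author's models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (
    asset_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
-- Each row names an attribute, so attributes are data rather than columns.
CREATE TABLE characteristic (
    characteristic_id INTEGER PRIMARY KEY,
    name              TEXT NOT NULL UNIQUE
);
CREATE TABLE asset_characteristic_value (
    asset_id          INTEGER NOT NULL REFERENCES asset(asset_id),
    characteristic_id INTEGER NOT NULL REFERENCES characteristic(characteristic_id),
    value             TEXT NOT NULL,
    PRIMARY KEY (asset_id, characteristic_id)
);
""")

conn.execute("INSERT INTO asset VALUES (1, 'Pump 7')")
conn.executemany("INSERT INTO characteristic VALUES (?, ?)",
                 [(1, "color"), (2, "weight_kg")])
conn.executemany("INSERT INTO asset_characteristic_value VALUES (?, ?, ?)",
                 [(1, 1, "red"), (1, 2, "12.5")])

# A new attribute is added by the user community as a row; no schema change is needed.
conn.execute("INSERT INTO characteristic VALUES (3, 'manufacturer')")
conn.execute("INSERT INTO asset_characteristic_value VALUES (1, 3, 'Acme')")


def characteristics_of(asset_id: int) -> dict:
    """Return the soft-coded attribute names and values recorded for one asset."""
    rows = conn.execute(
        """SELECT c.name, v.value
           FROM asset_characteristic_value AS v
           JOIN characteristic AS c ON c.characteristic_id = v.characteristic_id
           WHERE v.asset_id = ?
           ORDER BY c.name""",
        (asset_id,),
    )
    return dict(rows.fetchall())
```

The trade-off in a design like this is that every value is stored as text, so typing and validation move out of the schema and into the application or the user community.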