In my previous Advisor (see "Enterprise Semantics: Speed-Reading Your Enterprise Data Architecture, Part I," 11 August 2010), I explained that the skills needed for enterprise data architecture differ from those needed for data modeling or data warehousing. I also pointed out that one of the most significant differences is scale. Coming up with an intelligible (and defensible) big picture out of thousands of tables in hundreds (or thousands) of databases is not easy -- but it is possible. In fact, it is straightforward once you get the hang of it.
The trick to speed-reading enterprise data is to focus on the big stuff. In this Advisor, I concentrate on the three major data categories and then, in the next Advisor, I'll talk about the other three. Once you understand data categories and how they relate to each other, you'll quickly be able to apply the model to all sorts of circumstances and environments.
I call this model AMO (pronounced "ammo," as in shells and rockets). AMO stands for "actors," "messages," and "objects," and it is basically a communication model. The actors are people, organizations, or systems. Actors "communicate" with one another via messages, and the messages are about something -- an object (or, in some cases, a subject, which is when the "object" of the discussion is another actor).
A good way to identify the actors, messages, and objects is to develop a context diagram (see Figure 1) and find the major players. In a standard sales/marketing situation, for example, the principal actor is the "customer"; the messages are "order," "shipment," "bill," and "payment" (my friend and mentor Jean-Dominique Warnier believed that these four messages were canonical and could be used to describe any organization, but more on that later); and the object is the "product."
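To make the sales/marketing example concrete, here is a minimal sketch of the three AMO categories as Python types. The class names, the `ActorKind` values, and the direction in which each message flows are my own assumptions for illustration; the model itself only says that actors exchange messages about objects.

```python
from dataclasses import dataclass
from enum import Enum


class ActorKind(Enum):
    PERSON = "person"
    ORGANIZATION = "organization"
    SYSTEM = "system"


@dataclass(frozen=True)
class Actor:
    name: str
    kind: ActorKind


@dataclass(frozen=True)
class BusinessObject:
    """The thing a message is about (hypothetical name for the 'object')."""
    name: str


@dataclass(frozen=True)
class Message:
    kind: str              # e.g. "order", "shipment", "bill", "payment"
    sender: Actor
    receiver: Actor
    about: BusinessObject


# The enterprise is an actor too, even when a diagram leaves it implicit.
enterprise = Actor("enterprise", ActorKind.ORGANIZATION)
# Assumption: the customer is an organization; it could equally be a person.
customer = Actor("customer", ActorKind.ORGANIZATION)
product = BusinessObject("product")

# Assumed flow: the customer orders and pays; the enterprise ships and bills.
messages = [
    Message("order", customer, enterprise, product),
    Message("shipment", enterprise, customer, product),
    Message("bill", enterprise, customer, product),
    Message("payment", customer, enterprise, product),
]

for m in messages:
    print(f"{m.sender.name} -> {m.receiver.name}: {m.kind} (about {m.about.name})")
```

Every message names two actors and one object, which is why messages end up tying the rest of the model together.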
That's it; that's what you look for: actors, messages, and objects. They're always there, and they're always the most important items. The funny thing is that people keep rediscovering them. When I first got into computing back in the days of mainframes and COBOL, systems designers (there really weren't any database designers yet) used to talk about "master files" and "transaction files." Well, master files were largely files about either actors or objects, and transaction files were about messages. In today's data jargon, "fact tables" are almost always messages, whereas actors and objects are often "dimension tables." And with MDM (master data management), the "master" is pretty much identical to the "master" in the "master files" of so many decades ago.
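The correspondences above can be written down as a small lookup. This is only a sketch: the `Category` enum, the `TYPICAL_CATEGORIES` table, and the `same_role` helper are hypothetical names, and the mapping records what each term usually holds, not a hard rule.

```python
from enum import Enum


class Category(Enum):
    ACTOR = "actor"
    MESSAGE = "message"
    OBJECT = "object"


# What each kind of file or table typically describes ("largely",
# "almost always", "often" in the text; none of these is absolute).
TYPICAL_CATEGORIES = {
    "master file": {Category.ACTOR, Category.OBJECT},
    "transaction file": {Category.MESSAGE},
    "dimension table": {Category.ACTOR, Category.OBJECT},
    "fact table": {Category.MESSAGE},
}


def same_role(term_a: str, term_b: str) -> bool:
    """True when two terms typically cover the same AMO categories."""
    return TYPICAL_CATEGORIES[term_a] == TYPICAL_CATEGORIES[term_b]


print(same_role("master file", "dimension table"))   # True
print(same_role("transaction file", "fact table"))   # True
print(same_role("master file", "fact table"))        # False
```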
As important as actors and objects are, it is the messages that are the glue of data models; they relate all the major things in the model to one another. If you examine messages closely, you'll find that they usually have two levels. The first level relates the "actors" to each other (in most cases one of the most important actors, namely the "enterprise," is left out but assumed), while the second level relates the message to its contents (i.e., the "objects" referenced by the message). An order transaction, for example, is ordinarily modeled as two levels: the "order-header" and the "order-line-item."
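The two-level structure of an order can be sketched as follows. The field names, identifiers, and prices are invented for illustration; the point is only that the header carries the actor (the customer, with the enterprise implied) while each line item carries a reference to an object (a product).

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class OrderLineItem:
    """Second level: relates the message to an object (the product)."""
    product_id: str
    quantity: int
    unit_price: Decimal

    def extended_price(self) -> Decimal:
        return self.unit_price * self.quantity


@dataclass
class OrderHeader:
    """First level: relates the message to the actors.

    Only the customer is recorded; the selling enterprise is assumed,
    as it usually is in real order tables.
    """
    order_id: str
    customer_id: str
    line_items: List[OrderLineItem] = field(default_factory=list)

    def total(self) -> Decimal:
        return sum((li.extended_price() for li in self.line_items), Decimal("0"))


# Hypothetical data for illustration only.
order = OrderHeader(order_id="O-1001", customer_id="C-42")
order.line_items.append(OrderLineItem("P-7", 2, Decimal("9.50")))
order.line_items.append(OrderLineItem("P-9", 1, Decimal("20.00")))

print(order.total())  # 39.00
```

In a relational database the same shape appears as a header table keyed by order, plus a line-item table whose rows carry both the order key and a product key.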
So when people talk about MDM as if "master data" meant modeling "customers" and "products" in the absence of "transactional" data, they are wrong. Master data often does include "relational" data that relates objects to other objects hierarchically (e.g., organizational structure or product structure), but it is transactional data (i.e., messages) that always relates actors and objects to each other in a business sense.
In my next Advisor, I'll carry the speed-reading process further to discuss the critical role that actors and messages play in business process modeling, and the three other categories one needs to remember when developing enterprise data architectures.
I welcome your comments about this Advisor and encourage you to send your insights about enterprise architecture to me at korr@cutter.com.
Sincerely,
Ken Orr, Fellow
Cutter Business Technology Council
E-mail: korr@cutter.com