What is a payment?

The video below (now with beautiful audio) describes how you can use Collibra’s Business Semantics Glossary to get an understanding of and agreement on payment. Starting from ISDA’s 1987 terms and definitions, the video gradually explains how you can add bits of meaning to different things, owned by different people, until at the end, the software produces a UML model (in XMI 2.1).

Posted in Collibra, Glossary, data governance, sbvr

Social Semantics, Hybrid Ontologies and the Tri-Sortal Internet

During the Semantic Data Management workshop at the European Semantic Web Conference 2010 in Crete, my former supervisor and lab director Robert Meersman gave a talk with a fresh vision on how we should tackle the mass of (meta)data about communities (enterprises, business webs), people, and systems (incl. documents and media) and the links in between them with real-world (business) semantics. This vision is also perfectly in line with the principles of IT democracy implemented by Collibra.

Here are the slides: MeersmanSemDataESWC.pdf

Business Semantics (a.k.a. ontologies) are indeed crucial to make sense out of this tri-sortal relationship. The Linked Open Data (LOD) initiative is an important first step to set free hidden data, and make access to it scalable. SPARQL endpoints, however, do not bring much human-driven sense to it. A first visual analysis of the linked data cloud reveals the same non-linear graph structure as found in social networks. Hence there are indeed tri-sortal dynamics.

Mash-ups and other kinds of services based on LOD should not happen ad hoc, but should serve the needs and goals of a community/business (Web). These include needs to interact socially between people (beyond Web 2.0), and computationally between information systems. Moreover, these services should not only be computationally consistent, trustworthy, and scalable (topics on which a great deal of the SemWeb is focusing), but also economically feasible and profitable (an underestimated topic uniquely covered in the e3value research programme in the Business, Web and Media group at VU University Amsterdam).

Posted in Collibra

Enabling your data governance

Setting the scene

The image below displays a simplified view of every large organization: an organizational border (represented by the outer circle), an HQ that needs consolidated information (represented in the inner circle), customers (represented by the “dollarized” houses), orders (represented by the trucks moving inside), order-processing units (represented by the factories that receive the trucks), and reporting units (represented by the file cabinets and the steel silos in the inner circle).

Information delivery*

As you can imagine, the HQ has its silos so it can get a good grip on what orders are being made, and act accordingly. All the arrows represent information flows (only about orders here; what was an order again, exactly?) that serve to realize this.

In any real organization, this kind of picture is real, only bigger and more complex.

*Images kindly borrowed from Iconshock (sample, sample) and tit0 (sample).

From data management …

Fortunately, orders are not sent via trucks, and data is not stored in file cabinets (anymore). Nowadays, your organization has an army of well-equipped personnel to manage all that information: database administrators that take care of the databases underlying the order systems and reporting systems, order takers that key in incoming orders, architects that take care of the overall system architecture (at least for their part of the business), analysts and developers that build and maintain systems to support all this, BI consultants that serve up data warehouses, marts, cubes, and ETLs…

Functional descriptions for these profiles are all over the web and public employment agencies; colleges and universities produce the competent future-of-the-above; software vendors have stacks-of-everything to help you out at a high price; and system integrators and consultants are more than happy to sharpen their data machetes to hack through your jungle.

With all this in place, how is it possible that people still find bad quality data in databases, have different systems to perform the same function, and can’t understand what the figures in report X mean or why they are inconsistent across business units? Why do we still need to dig through piles of ETL jobs and XSLT scripts to figure out what needs to change when the business needs to change? Why are we still unsure whether the data flowing by actually means what it should, and is valid according to a variety of dearly needed business rules? Why do the goals of compliance and transparency remain highly complex and seemingly very, very distant?

… to data governance

The answer to the questions above lies in data governance: making sure that the data management is run right. Organizationally speaking, you need more than just the roles & responsibilities we know from data management. Whether applied invasively or not, you need someone (or several people) with the authority to enforce certain rules (stewards). You need guiding principles and policies to properly direct and control efforts. You need processes to handle this, methodology to ensure high-quality results, and the right mindset in people. You need to be able to measure whether data that moves around (whether inside or outside) continues to be of high quality and compliant with your business rules.

No matter what the depth of your data governance programme is, practitioners out there agree that you need to be clear about what it is that you are managing and governing. Whether you call this a Common Business Language, a Shared Business Vocabulary, Business Semantics, One Language, an Enterprise Information Model, RDF triples (T- or A-Box), or anything else, it is clear that you need to be able to understand what your business is all about. The Object Management Group’s (OMG) Semantics of Business Vocabulary and Rules (SBVR) is the only standard out there that helps you with all this complexity: from term, fact type, rule, over vocabularies and rulesets, to subcommunities and communities with their own definitions. This is where it all starts, and from here on we can move down into the actual mud – step by step, to steadily clean out the lake, as well as the streams that fill it.

With Collibra’s help

Collibra can help you with your data governance:

  • the Business Semantics Glossary enables proper collaboration between different people in different business units, making sure that everyone can understand their own terms, definitions, facts and rules, in all their different versions with the appropriate stewards taking care;
  • the Business Semantics Studio helps you to link that layer of meaning to the data that’s out there, whether it is hidden in databases or flying over bus systems;
  • the Business Semantics Enabler automatically generates data services based on those links and their meaning to help you translate data between different systems, and to help you make sure that it is valid according to the definitions and rules in their own context, with appropriate feedback propagating back up where needed. Whatever the existing infrastructure, the Enabler’s data services can be integrated to serve as measuring points in your data governance.

To come back to the picture we began with, Collibra helps you to greatly reduce the complexity that is hidden beneath the surface, to make sure that you benefit from what is running now, as well as make sure that you are prepared for the change around the corner.

Posted in Collibra

Collibra and IBM Research join forces in European research on service-oriented architectures

Collibra has acquired considerable co-funding in ACSI, a European FP7 research project worth 5 million Euro. ACSI stands for “Artifact-Centric Service Interoperation”. The coordinator is IBM Research Haifa, and the kick-off of the project will be held in June at their premises in Israel. Details of the international consortium are below.

ACSI will serve to dramatically reduce the effort and lead-time of designing, deploying, maintaining, and joining into environments that support service collaborations. This will be achieved by developing a rich framework around the novel notions of dynamic artifacts and interoperation hubs, enabling a substantial simplification in the establishment and maintenance of service collaborations.

Motivation

Interoperation between electronic services, and more generally the business processes embodied by these services, is one of the most challenging and pressing issues in today’s increasingly globalized and decentralized economy. Outsourcing, globalization, and the automation of business processes continue to increase. However, today, there is no effective, flexible, scalable, and principled approach to enable the interoperation of services across enterprise boundaries in support of shared (business) goals. This is a major roadblock preventing the automation of these kinds of collaboration, and more broadly, the design, deployment, and operation of innovative value nets. The ACSI project is aimed directly at filling this vacuum.

Based on an innovative foundation, the ACSI research will develop scientific advances, techniques, and tools to dramatically simplify the design and deployment of infrastructure to support service collaborations, the ability of services to join such collaborations, and the evolution of such collaborations as the marketplace and competitive landscape change.

A Brand New Approach

Artifact-Centric Service Interoperation (ACSI) is based on two fundamental constructs: the interoperation hub and dynamic artifacts. Business-driven intelligent operation of these constructs will be grounded by business semantics.

An interoperation hub serves as a virtual rendezvous for multiple services that work together towards a common goal. Our research will develop a principled, easy-to-use framework for creating, deploying, maintaining, and joining into ACSI interoperation hubs in essentially any application domain. Similar to EasyChair or SalesForce.com, an ACSI interoperation hub will serve as the anchor for a collaborative IT environment that supports large numbers of service collaborations that operate independently, but focus on essentially common goals. These hubs are primarily reactive, serving as a kind of structured whiteboard to which participating services can refer. The hubs can be updated with information relevant to the group, assist the services by carrying out selected tasks, or notify services of key events.

Example of interoperation hub that supports collaboration around hiring

The interoperation hubs used in ACSI will be structured around dynamic artifacts, also known as “business artifacts” or “business entities”. These provide a holistic marriage of data and processes, serving as the basic building block for modeling, specifying, and implementing services and business processes. In the context of single enterprises, it has been shown that the use of artifacts can lead to substantial cost savings in the design and deployment of business operations and processes, and can dramatically improve communication between stakeholders. Artifacts can give an end-to-end view of how key conceptual business entities evolve as they move through the business operations, in many cases across two or more silos. As a result, artifacts can substantially simplify the management or “hand-off” of data and processing between services and organizations.
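
To make the idea of a dynamic artifact more concrete, here is a minimal Python sketch of a single artifact that carries both its data and its lifecycle as it is handed from one service to the next. It is only an illustration of the general notion: the class name `JobApplicationArtifact`, the stage names, and the service names are all hypothetical and are not taken from the ACSI project.

```python
from dataclasses import dataclass, field

# Hypothetical lifecycle for a hiring-related artifact (assumed for illustration).
ALLOWED_TRANSITIONS = {
    "opened": {"screening"},
    "screening": {"interview", "rejected"},
    "interview": {"offer", "rejected"},
    "offer": {"hired", "rejected"},
}


@dataclass
class JobApplicationArtifact:
    """One business entity that bundles its data with its lifecycle state."""
    candidate: str
    stage: str = "opened"
    data: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def advance(self, new_stage, service, **updates):
        """Move to new_stage on behalf of a service, recording the hand-off."""
        if new_stage not in ALLOWED_TRANSITIONS.get(self.stage, set()):
            raise ValueError(f"cannot move from {self.stage} to {new_stage}")
        self.history.append((self.stage, new_stage, service))
        self.stage = new_stage
        self.data.update(updates)


# Two different (hypothetical) services work on the same artifact.
application = JobApplicationArtifact(candidate="J. Doe")
application.advance("screening", service="recruiting agency", cv_received=True)
application.advance("interview", service="HR department")
```

The `history` list gives the end-to-end view of how the entity evolved, regardless of which service performed each step.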

A key pillar of the ACSI research is to generalize the advantages of dynamic artifacts to the broader context of interoperation hubs and service collaborations. While the interoperation hubs themselves will take advantage of the artifact paradigm, the services participating in such hubs are not required to be artifact-centric; they can be conventional SOA services, including legacy applications with SOA adapters.

Impact

ACSI provides an approach to populating the web with semantically rich building blocks, around which services can cluster to create a broad variety of service collaborations and value networks.

The ACSI interoperation hub framework, in conjunction with the underlying ACSI artifact paradigm, provides a rich structure around which many subsequent scientific and technology advances can be made. The ACSI research will substantially extend current verification and synthesis techniques to incorporate data along with process, and will develop the next generation of process mining research by generalizing it to handle data along with process.

The project aims to achieve dramatic savings over conventional approaches to service interoperation across several areas: design and deployment, on-boarding, day-to-day operation, maintenance, data transformation automation, and evolvability. This will be accomplished while enabling rich flexibility for the different service collaborations using a given interoperation hub.

The technology can be applicable in key challenge areas of societal importance, including government, energy, healthcare, supply chain logistics (especially in industries such as food or heavy manufacture with deep upstream supply chains), and heavy manufacture (e.g., airline industries). The mechanisms incorporated into the ACSI framework to support rich variation within a single hub can be especially advantageous in domains, such as human resources, where there are significant differences from country to country.

The ACSI interoperation hub framework will provide a paradigm shift in the way that services, and more generally enterprises, can work together.

Consortium

IBM Research – Haifa (coordinator)

Università degli Studi di Roma La Sapienza

Libera Università di Bolzano

Imperial College of Science, Technology and Medicine

Technische Universiteit Eindhoven

Tartu Ülikool

Indra Software Labs SLU

Collibra NV

Posted in Collibra

Mike Ferguson on Enterprise Compliance

I recently came across Mike Ferguson’s work a couple of times (once at Enterprise Data World in San Francisco, and once at the Data Governance conference in London). As someone who has been active in the data world for some time, he clearly recognizes the need to properly understand the meaning of your data through clear and understood business vocabularies (which he refers to as an SBV or Shared Business Vocabulary – coincidence that this acronym is so close to the Object Management Group’s SBVR specification…?). Without proper understanding, implementing data governance is like walking on one leg.

Importance of proper meaning

In a recent publication on the Data Quality Pro website, titled “Enterprise Compliance”, Mike explains how extremely important data quality is for companies today. It seems that not delivering the right data for the regulator might lead to very negative consequences (e.g., jail sentences, business not listed, stock decreasing, …).

Mike uses the metaphor of a “Data Quality Firewall”, which helps your organization to filter out low quality incoming data to help keep your systems clean. In a previous encounter, somebody referred to this as “why spend effort in cleaning out the lake, when you have not yet cleaned upstream”. Among the various pieces of advice that Mike describes, I extract a couple related to properly understanding your data:

  • managing metadata quality so that data carries consistent and unambiguous meaning wherever it goes;
  • data quality business rules should be able to be defined using common shared business vocabulary names or other application specific vocabularies or a combination of both;
  • provide tooling to allow the creation of a shared business vocabulary consisting of common names, definitions and integrity rules as well as tooling to discover disparate definitions for data and for mapping disparate data definitions to a shared business vocabulary;
  • integrate with software that can provide functionality to translate disparate data names into common ones as data moves between applications or between applications and presentation devices;
  • … (read the 16 page article – it is worth your time)

When you are struggling with your organization’s Data Quality Firewall, Collibra has solutions that can help, all using open standards (such as OMG’s SBVR):

  • Collibra’s Business Semantics Glossary allows you to bridge business and IT, and capture (and keep alive) a layer that defines the concepts, facts and rules (i.e., semantics) in your organization. Stewards and stakeholders within the appropriate business communities help to control this. From this tool you can then generate technical models, such as an XML schema.
  • Collibra’s Business Semantics Studio allows you to map those semantics to the actual fields (e.g., in an existing XML schema or database).
  • Collibra’s Business Semantics Enabler uses these semantic mappings to generate data services (POJO, EJB, REST, WSDL): data translation (translating between formats based on common definitions) and data validation (ensuring that the data actually complies with your business rules). As an operational engine, this software integrates well with existing infrastructure (e.g., an Enterprise Service Bus).
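
The combination of translation and validation described above can be pictured with a small Python sketch of a “Data Quality Firewall”. This is not how the Enabler is implemented; it is a toy illustration under stated assumptions. The field names (`cust_nm`, `ord_amt`), the shared vocabulary names, and the two rules are all hypothetical.

```python
# Hypothetical mapping from application-specific field names to shared
# business vocabulary names (assumed for illustration).
FIELD_TO_SHARED_TERM = {"cust_nm": "customer name", "ord_amt": "order amount"}

# Hypothetical data quality rules, expressed against the shared names.
RULES = {
    "customer name": lambda v: isinstance(v, str) and v.strip() != "",
    "order amount": lambda v: isinstance(v, (int, float)) and v > 0,
}


def translate(record):
    """Rename application fields to shared names; unmapped fields keep their name."""
    return {FIELD_TO_SHARED_TERM.get(key, key): value for key, value in record.items()}


def violations(record):
    """Return the shared terms whose rule is missing a value or fails."""
    return [term for term, rule in RULES.items()
            if term not in record or not rule(record[term])]


def firewall(records):
    """Split incoming records into accepted (translated) and rejected (raw, reasons)."""
    accepted, rejected = [], []
    for raw in records:
        translated = translate(raw)
        problems = violations(translated)
        if problems:
            rejected.append((raw, problems))
        else:
            accepted.append(translated)
    return accepted, rejected


incoming = [
    {"cust_nm": "ACME", "ord_amt": 120.0},
    {"cust_nm": "", "ord_amt": -5},
]
accepted, rejected = firewall(incoming)
```

Because the rules refer only to shared vocabulary names, a second application with different field names would need only its own mapping, not its own copy of the rules.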
Posted in Collibra

A metacircular approach for (custom) attributes in SBVR

Introduction

SBVR is an elegant standard that combines research from linguists and logicians with practical experience from consultants. SBVR is an abbreviation for Semantics of Business Vocabulary and Rules, and is one of the Object Management Group (OMG) standards. It is focused specifically at the level of business modeling for your organization’s (or even industry’s) business vocabulary and rules (e.g., next to a BPMN initiative focused on your business processes).

Using SBVR, vocabularies are defined that describe your domain in a user-friendly way, with enough logical underpinning to be able to generate data schemas or integrate legacy systems. The elegance partly stems from the fact that SBVR is defined in SBVR itself, which we call a metacircular approach in computer science.

You will see metacircularity everywhere. For example, MySQL database systems have meta-tables that contain the definitions of all tables and columns present in the system. In programming languages, this is a necessity to be able to perform introspection and have your code reason about itself.
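
Both flavours of metacircularity can be tried out in a few lines of Python. Note the assumptions: to keep the snippet self-contained it uses an in-memory SQLite database as a stand-in for MySQL, so the meta-table is SQLite’s `sqlite_master` rather than MySQL’s own meta-tables, and the `orders` table is made up for the example.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")

# sqlite_master is itself a table, and it describes the tables in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns the columns of a table as ordinary rows;
# the second field of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
conn.close()

# Python's object model is metacircular as well: the class of every class
# is `type`, and `type` is an instance of itself.
type_is_its_own_class = type(type) is type
```

In both cases the system describes itself using its own constructs: tables described by a table, classes described by a class.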

In industry, metacircularity is not always received positively. It takes a while to grasp this concept, since it involves recursive thinking, which for some reason is perceived as difficult. It is my firm belief, however, that a metacircularly defined language is a good sign of quality, since you’re at least eating your own dog food.

At Collibra we implement the SBVR standard in our Business Semantics Glossary. Sometimes additions or simplifications need to be made to make it a usable product. One of the features our customers need is the ability to create custom attributes that help explain their terms. For example, in the SBVR standard, you can make use of definitions, descriptions, notes, descriptive examples, etc. However, you can imagine that a customer quickly has the need for other kinds of representation.

Etymology as a Custom Attribute

Let’s take “Etymology” for example. Etymology is about the origin of words. This is currently not in the SBVR standard, and might never be part of it, for very good reason. We could customise our software and hardcode it, or we could modify SBVR to contain etymology as a representation. Neither of those options is a good choice, as it would be a one-off change for one customer.

We have solved this by adding Attribute to the SBVR vocabulary. We create a new “Attribute” vocabulary, add “Etymology” to this vocabulary, and set its concept type to AttributeType. Any vocabulary that wants to make use of this custom attribute to represent meaning just needs to incorporate the “Attribute” vocabulary.

Attribute Model

Above you can see a simplified model to store these attributes. The trick is that we subclass Attribute from Representation, which makes it the same as any other kind of representation, e.g. Definition or Note. The only difference is that it also has an association with a term, which we call the label for the attribute. This lets us neatly reuse everything in our Glossary to define vocabularies for attributes.

Let’s take a look at the object model for the etymology attribute:

Etymology Object Model

We have created our own attribute “Etymology” in the “Attribute” vocabulary. The “Auto” vocabulary incorporates this vocabulary, and automatically “Etymology” is available to any vocabulary entry we are defining, as if it were any other attribute such as definition, note, etc.

Car Term Window

In our example the etymology of “Car” is: “Late Middle English (in the general sense [wheeled vehicle] ): from Old Northern French carre, based on Latin carrum, carrus, of Celtic origin.” In the background an EtymologyAttribute is created in the “Auto” vocabulary that represents the Car Concept. The label for this EtymologyAttribute is the term “Etymology” in the “Attribute” vocabulary.
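
The model and the “Car” example above can be sketched in Python as follows. This is a simplified illustration, not Collibra’s actual implementation: the class and method names (`Vocabulary.add_term`, `available_attribute_types`, and so on) are hypothetical, and only the ideas come from the text above: Attribute subclasses Representation, carries a label term, and becomes available through vocabulary incorporation.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str


@dataclass(frozen=True)
class Term:
    text: str
    vocabulary: str              # name of the owning vocabulary
    concept_type: str = "Term"   # e.g. "AttributeType" for custom attributes


@dataclass
class Representation:
    expression: str
    represents: Concept


class Definition(Representation):
    pass


class Note(Representation):
    pass


@dataclass
class Attribute(Representation):
    # The only addition over other representations: a label term.
    label: Term


@dataclass
class Vocabulary:
    name: str
    terms: list = field(default_factory=list)
    incorporates: list = field(default_factory=list)

    def add_term(self, text, concept_type="Term"):
        term = Term(text, self.name, concept_type)
        self.terms.append(term)
        return term

    def available_attribute_types(self):
        """Attribute types defined here or in any directly incorporated vocabulary."""
        return [term for vocab in [self, *self.incorporates]
                for term in vocab.terms if term.concept_type == "AttributeType"]


attribute_vocabulary = Vocabulary("Attribute")
etymology = attribute_vocabulary.add_term("Etymology", concept_type="AttributeType")

auto = Vocabulary("Auto", incorporates=[attribute_vocabulary])
auto.add_term("Car")
car = Concept("Car")

car_etymology = Attribute(
    expression="from Old Northern French carre, based on Latin carrum, carrus, of Celtic origin",
    represents=car,
    label=etymology,
)
```

Since `Attribute` is just another `Representation`, any code that already handles definitions and notes can handle custom attributes without modification.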

Conclusions for the End-user

By making such a small change in the SBVR model we can let the end-user customize the Business Semantics Glossary in ways we might not have thought of before, and get everything for free that is already there. Such small additions leading to such great abilities are only possible thanks to the well-designed standard that SBVR is.

References

The following is a non-exhaustive list of references if you would like to know more about SBVR.

  • http://en.wikipedia.org/wiki/SBVR
  • http://www.omg.org/spec/SBVR/1.0/
  • http://www.omg.org/news/meetings/ThinkTank/past-events/2006/presentations/04-WS1-2_Hall.pdf
Posted in Collibra, Glossary, sbvr

The Semantic Web from a different perspective

Jan Henderyckx sent this brief humorous clip on the Semantic Web. It comes from a presentation given at the O’Reilly Open Source conference in 2009, and provides a frightening look ahead.

Posted in Collibra

IDC Conference in London

OMG’s Semantics of Business Vocabulary and Rules (SBVR) is a standard for semantics. In business terms, this translates into clear definitions of key business assets. In operational terms, this translates into technical models and making sure that the (meta)data is understood and aligned. We illustrate the approach with a customer case from SCA Packaging, which successfully combines a shared message-oriented middleware, a methodology and a network of experienced people to handle the issues of B2B and EAI.
Posted in Collibra

Andy Hayler features Collibra on Master Data Governance

After a short talk with Andy Hayler, whom we met at the last IRM UK event, it’s clear that Andy knows the Master Data Management market inside out. During our pleasant talk, we discussed Collibra’s product, market positioning, and unique approach. He agreed that Collibra provides a lot of value for what people are starting to call master data governance.

He wrote some of our discussion down in a short article at IT-Director. He makes some interesting and valid points. Be sure to check it out here!

Posted in Collibra, data governance