you may have heard of, or already know about, “real-time data warehousing”, “active data warehousing”, or some similar term. the operational data vault is something slightly different (or is it? – you decide and let me know!). the operational data vault has several facets that change its dynamics a bit from the traditional data warehouse:
- it is a system of record
- it talks directly to an operational application layer – be it web-services, or an actual application that edits data in the data vault
- it allows real-time updates to the information in the data vault… but wait – it is still a traditional data vault, because it keeps the historical copy / version of the old record and inserts the new changes as a new record.
- it must have a data access layer between the data vault model and the operational application in order to handle two-phase commit and data locking.
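to make the "keep the old version, insert the new changes" point concrete, here is a minimal sketch of a data access layer in python using the standard-library sqlite3 module. everything named here (`hub_customer`, `sat_customer`, `load_seq`, `apply_update`, `current_record`, `history`) is hypothetical and only for illustration. it also makes two simplifying assumptions: a sequence number stands in for the load timestamp, and a single-database write lock (`BEGIN IMMEDIATE`) stands in for locking. it does not implement two-phase commit across systems.

```python
import sqlite3


def open_vault():
    """Create an in-memory store with one hub and one satellite (hypothetical names)."""
    # isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE hub_customer (
            customer_key TEXT PRIMARY KEY            -- business key
        );
        CREATE TABLE sat_customer (
            customer_key TEXT NOT NULL REFERENCES hub_customer(customer_key),
            load_seq     INTEGER NOT NULL,           -- stands in for a load timestamp
            name         TEXT,
            email        TEXT,
            PRIMARY KEY (customer_key, load_seq)
        );
    """)
    return conn


def apply_update(conn, customer_key, name, email):
    """Record a change as a NEW satellite row; older rows are never modified."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the last version
    try:
        conn.execute("INSERT OR IGNORE INTO hub_customer VALUES (?)", (customer_key,))
        (last_seq,) = conn.execute(
            "SELECT COALESCE(MAX(load_seq), 0) FROM sat_customer WHERE customer_key = ?",
            (customer_key,),
        ).fetchone()
        conn.execute(
            "INSERT INTO sat_customer VALUES (?, ?, ?, ?)",
            (customer_key, last_seq + 1, name, email),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return last_seq + 1


def current_record(conn, customer_key):
    """Return the latest version, which is what an operational caller usually wants."""
    return conn.execute(
        "SELECT name, email FROM sat_customer "
        "WHERE customer_key = ? ORDER BY load_seq DESC LIMIT 1",
        (customer_key,),
    ).fetchone()


def history(conn, customer_key):
    """Return every stored version, oldest first."""
    return conn.execute(
        "SELECT load_seq, name, email FROM sat_customer "
        "WHERE customer_key = ? ORDER BY load_seq",
        (customer_key,),
    ).fetchall()
```

the operational application only ever calls `apply_update` and `current_record`; the history accumulates underneath without the caller having to manage it.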
the operational data vault (odv) has a data vault model under the covers, and many of the standard rules apply. where it excels, however, is in providing information in an operational fashion. the operational data vault may also double as a master system (no, i didn’t say it is a master data management system, and no, i didn’t say it houses master data). if the odv does house master data, it houses versions of the master data – so by definition, it cannot be a master data management system. why? because master data rules dictate that the data store houses only one copy of each record, and that its golden copy of that record is in fact cleansed, edited, and the most accurate one in the company.
no, the odv is not part of an mdm solution, but it can be deemed a master system – one that houses the historical records (warehousing, to be exact) and their corresponding updates to “master data” over time. this allows the odv to accept web-service transactions, store the history, align the data sets, and respond to web-service requests with the current record components.
if you want to create a master data operational data vault, then you must make a copy of the dv model (following most of the hub-link-sat rules), then remove the time basis from the satellites. this keeps the flexibility of the joins, but it makes the hubs + hub sats act like “single copy” operational stores for true master data.
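as a rough sketch of what "removing the time basis" could look like, the example below (python, standard-library sqlite3) keys the satellite on the hub key alone, so each business key can hold exactly one row and an update overwrites it. the table and function names (`sat_customer_master`, `save_master`, `master_record`) are hypothetical, and cleansing / golden-copy rules are assumed to happen before the save.

```python
import sqlite3


def open_master_store():
    """Hub plus a satellite with NO load date/sequence: one row per business key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hub_customer (
            customer_key TEXT PRIMARY KEY
        );
        CREATE TABLE sat_customer_master (
            -- the hub key alone is the primary key, so no history can accumulate
            customer_key TEXT PRIMARY KEY REFERENCES hub_customer(customer_key),
            name         TEXT,
            email        TEXT
        );
    """)
    return conn


def save_master(conn, customer_key, name, email):
    """Overwrite the single copy of the record (assumed already cleansed)."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT OR IGNORE INTO hub_customer VALUES (?)", (customer_key,))
        conn.execute(
            "INSERT OR REPLACE INTO sat_customer_master VALUES (?, ?, ?)",
            (customer_key, name, email),
        )


def master_record(conn, customer_key):
    """The hub-to-satellite join still works exactly as in the time-based model."""
    return conn.execute(
        "SELECT h.customer_key, s.name, s.email "
        "FROM hub_customer h "
        "JOIN sat_customer_master s ON s.customer_key = h.customer_key "
        "WHERE h.customer_key = ?",
        (customer_key,),
    ).fetchone()
```

the join path is unchanged from the historical model; the only structural difference is the satellite's primary key, which is what turns it into a single-copy store.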
ok, enough of that. the odv is a very powerful concept: it lets high-speed operational systems and enterprise bus backbones use “historical or current” copies of consolidated data sets that just happen to be stored in a data vault format. the odv architecture (usually) no longer relies on batch-related loads, but is part of the active nature of web-services. web-services + data access controls + acl (access control lists) = high security and monitoring capabilities for master data sets housed in an odv. now, in order to complete part of the master data picture here, you should be storing a list of business terms that define all the elements in the odv and their hierarchies.
then, when a web-service receives a request like “give me information about the kind of data i can get”, the code behind the web-service can use the ontology portion of the master data store to look up the answer.
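here is a small sketch of how the code behind such a web-service might answer that request from a stored list of business terms. it is plain python with no web framework; the `BusinessTerm` structure, the sample terms, and the `describe` / `available_terms` functions are all hypothetical, and the term hierarchy is assumed to contain no cycles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BusinessTerm:
    name: str
    definition: str
    parent: Optional[str] = None                         # parent term in the hierarchy
    elements: List[str] = field(default_factory=list)    # odv elements the term describes


# Sample ontology; in practice this would be read from the master data store.
TERMS: Dict[str, BusinessTerm] = {
    t.name: t
    for t in [
        BusinessTerm("party", "any person or organization the company deals with"),
        BusinessTerm(
            "customer",
            "a party that buys from the company",
            parent="party",
            elements=["hub_customer.customer_key", "sat_customer.name", "sat_customer.email"],
        ),
    ]
}


def available_terms(terms: Dict[str, BusinessTerm] = TERMS) -> List[str]:
    """Answer 'what kind of data can i get?' with the sorted list of known terms."""
    return sorted(terms)


def describe(term_name: str, terms: Dict[str, BusinessTerm] = TERMS) -> dict:
    """Return the definition, hierarchy path (root first), and elements for one term."""
    term = terms[term_name]
    path = [term.name]
    parent = term.parent
    while parent is not None:          # walk up to the root; assumes no cycles
        path.append(parent)
        parent = terms[parent].parent
    path.reverse()
    return {
        "term": term.name,
        "definition": term.definition,
        "hierarchy": path,
        "elements": list(term.elements),
    }
```

a web-service handler would serialize the result of `describe("customer")` (for example as json) and return it to the caller, which can then decide which elements to request.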
please reply with comments, and let me know what you’d like to hear about.