i’ve long held a belief that the data vault modeling style ushered in a new era, that being one where we have the power to dynamically accept and automatically absorb new data elements. of course, as always there are some rules and regulations (patterns) around when and how this can happen, and when the modeling concepts require modeling intervention. however – imagine a world for a minute where this can happen. what would it do for you? how helpful would it be? is it something you are interested in?
it is a very important step to making a data warehouse even easier to deal with – and it’s one that is long over-due. i’m of the belief that dynamic data models are the future. what i mean by that term is: the ability to change not only the etl/elt on the fly to accomodate the new elements, but the ability to automatically assign the new elements to the target model (be it: staging, data vault, or even raw star schema).
i have a working prototype of this technology in my labs right now, and it’s proving to be quite elegant. when it sees a new element it tries to determine where to attach it to the target model (through a custom process), and then re-generates the specific etl/elt load code. it finally sends me an email letting me know there’s a new element available.
granted, if you are dealing with xml and a new xsd shows up one night with the feed, then it makes it easier – the new element, it’s parent, and it’s relationship to the parent records have already been defined. but if you simply receive a new element on a flat-file, well, that’s a different story.
it’s somewhat an easy task to modify both the staging and the data vault models once the target structure is figured out. generating the new etl/elt loader code is also mostly easy. the harder task is actually getting those etl/elt processes imported to the correct place for running. of course this sort of activity in production has “long” been frowned upon, but i say this: for the data vault and for the staging area, capturing the information is of the utmost importance.
the fact that it arrived becomes a part of the audit trail. i also say this: business users who are used to changing business rules through dynamic interfaces on the business user side, already do this sort of thing in production. so if they do it in oltp, why shouldn’t they do this for data warehousing?
i believe dynamic data warehousing is on it’s way, and i will do everything i can to get it there and to prove the concept works, and works well. who knows, it might never be accepted by mainstream, but then again – it might be forcibly pushed by business user community for more rapid response to changes by the edw team.
do you have a dynamic data warehousing need? ask your questions… i’d love to hear what pros/cons and thoughts you have about the subject.