Man, talk about “wars”…. I’m not here to tell anyone (you included) that the Data Vault will solve all your problems… quite the contrary, the Data Vault is just an evolutionary step in the overall picture. I just got done posting a long-winded reply to the Kimball Forums. He’s made some claims that I feel are unjustified.
I don’t feel the pain you talk about….
Look, if you HAVE a Star Schema that WORKS for you then CONGRATULATIONS!
Don’t fix what isn’t broken!! If it works, leave it alone. But just because something works for you doesn’t mean it works for everyone else (the same can be said for Data Vault modeling and methodology).
Furthermore, if you can’t feel or see the pain I am referring to, then you would be hard-pressed to even try building a Data Vault Model, and you won’t see the benefits that the model brings to the table.
For example: if I’m not a big “auto racing fan”, and furthermore I don’t have the proper track, then I won’t understand the value of owning or driving a Formula 1 racing car. It may be fun to watch others race, but I might go to a “race” once a year, and only if the ticket price is right, and I can get a decent seat, and so on…. For some people, “racing” is their life: they not only own the car, they also race it. They understand the differences and subtleties of getting the engine tuned just so, and of making sure the aerodynamics of the new parts on the car are correct so as not to introduce 1/100th of a percent more drag. But you wouldn’t want to drive an F1 racing car around town every day. It’s too much hassle just getting the engine started, and just try stopping at a red light a block ahead.
Please don’t get me wrong, the Data Vault does NOT solve all the problems that the Data Warehousing industry has. It is built to solve only specific problems that I’ve seen, and to make the process of building, loading, and using Data Warehouses faster, easier, more consistent, measurable, repeatable, flexible, and scalable.
I’ve heard many arguments against the Data Vault over the years, and my simple answer is: if you don’t have a need, or don’t feel the pain, or don’t have a current “architecture/system” that’s in need of redesign, then there is no need to look for something new. Also, if you don’t like “change” or you don’t believe in something because “it wasn’t invented here”, then don’t learn the Data Vault.
CALL TO ACTION: Please post the arguments you’ve heard against using the Data Vault in the comments section below. I will try over the coming months to answer each one independently; we’ll see how that goes.
Issues with Data Vault Implementations?
Now to be fair, just as with every other “modeling and methodology”, I have heard about problem implementations and “failures” of the Data Vault. But when I researched the ones I know about, to be honest, the cause wasn’t the Data Vault Model or Methodology. It was the lack of proper certified training, and the inability of the people to follow the standards and rules put forward, that caused the issues. The Data Vault model and methodology is not something that you can “simply pick up and do” just by reading the 5 original white papers on TDAN.com.
Sure, it looks easy enough, but those short articles don’t begin to do justice to the 10 years of R&D that I put into the entire definition.
Us Versus Them – who’s right?
In the end, I don’t see this as INMON VS KIMBALL VS LINSTEDT…. No, this isn’t the right way to look at this.
In reality, Bill Inmon provides the foundational framework, telling us what a data warehouse and business intelligence system should be. Kimball provides the data delivery guidelines and expertise, including how to present to business users and make it successful on the front-end. I provide the middle tier, or back-end data warehouse workhorse, and tell you how to avoid pitfalls and mistakes, and how to leverage lessons learned by hundreds of people that came before us.
In the Forum Post…
In the forum entry, I discuss the nature of JOINS, how they tie in to MPP, and why the number of joins simply doesn’t matter. So if you are interested in just the issues around joins in the Data Vault, you might be interested in this article.
Let me know what you think. Respond here, or respond to the Kimball post on the forums.