this post is a response to others on the web. lately, some individuals have suggested breaking the standards. this post is a reminder of why you should not break the standards and of what went into the standards in the first place – and a request that, if you are going to suggest changes, you put your testing in place before suggesting those changes to others.
first things first…
apparently there are individuals out in the marketplace now who would have you believe one or more of the following statements:
- you do not need link satellites
- link satellites are bad
- link effectivity is not necessary
- some links become hubs
- some hubs are composite with link keys
- hub to hub relationships via foreign keys are ok
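for readers unfamiliar with the terms in the list above, here is a minimal sketch of the core structures involved – hubs, links, and link satellites. the field names are simplified and illustrative only; they are not the full standard and are my shorthand, not taken from the books.

```python
from dataclasses import dataclass
from datetime import datetime

# simplified, illustrative sketch of the core data vault structures.
# field names are invented shorthand -- consult the standards for the full spec.

@dataclass
class Hub:                    # one row per unique business key
    hub_key: str              # surrogate/hash of the business key
    business_key: str
    load_date: datetime
    record_source: str

@dataclass
class Link:                   # one row per unique relationship between hubs
    link_key: str             # surrogate/hash of the participating hub keys
    hub_key_1: str
    hub_key_2: str
    load_date: datetime
    record_source: str

@dataclass
class LinkSatellite:          # descriptive / effectivity context for a link
    link_key: str             # parent link
    load_date: datetime
    effective_from: datetime  # link effectivity tracking
    record_source: str
```

the statements above argue for dropping or merging these structures; the rest of this post explains why that separation exists.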
the problem is that none of these statements has been thoroughly vetted or tested. none of them stands the test of time. in fact, these statements are completely and utterly inaccurate. they can and will cause your data vault solution to fail.
the fact of the matter is: around 1996 i had already gone down this road. i tested these statements while i was designing and building the standards for data vault – in the model, methodology, and architecture, as well as in the implementation standards.
every single one of these statements failed the tests i put them through. some failed in agility, some failed in maintenance cycles, some broke the model (made it brittle and inflexible), some failed in big data, some failed in nosql, and some failed in a combination of these.
the point is this: following these statements leads you to construct something called “conditional design”.
conditional design is:
if <this condition>
then <apply this design>
else <apply some different design>
you will never have a successful, scalable solution – one that works for real-time and batch, for big data and small, for nosql and relational, and that is repeatable, consistent, and redundant – when you build with conditional design.
conditional design always leads to re-engineering, because sooner or later something in your environment breaks your condition!
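the if/then/else pattern above can be made concrete with a small sketch. everything here is hypothetical – the function names, thresholds, and design labels are invented for illustration and are not part of the standards – but it shows how branching the physical design on an environmental condition sets up the re-engineering trap:

```python
# hypothetical illustration of "conditional design": the physical design
# chosen for a relationship depends on a condition in the environment.
# names, thresholds, and labels are invented for illustration only.

def choose_conditional_design(row_volume: int, is_real_time: bool) -> str:
    """anti-pattern: the design branches on the environment."""
    if row_volume > 100_000_000:                # if <this condition>
        return "hub_with_composite_link_key"    # then <apply this design>
    elif is_real_time:                          # another condition appears...
        return "hub_to_hub_foreign_key"         # else <apply some different design>
    else:
        return "link_with_link_satellite"

# the moment the environment changes (a feed goes real-time, volume grows,
# a nosql target is added), existing tables no longer match their branch
# and the base design must be re-engineered.

def choose_standard_design() -> str:
    """non-conditional: one repeatable pattern, applied in every case."""
    return "link_with_link_satellite"
```

the second function is the point: a non-conditional standard returns the same pattern regardless of volume, latency, or platform, so nothing in the environment can invalidate it.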
the root standards of the data vault, as put forward in my supercharge your data warehouse book and in my building a scalable data warehouse with data vault 2.0 book, are vetted, tested, and non-conditional. that’s what makes them so robust. it’s why they’ve stood the test of time, and it’s why they can be applied to every solution built in data warehousing without failure.
background and previous history…
i discussed a lot of the background and history of the data vault standards here: http://danlinstedt.com/allposts/datavaultcat/datavault-standards-what-really-matters/
in that post you will find many of the questions i ask about how standards come to be, and why the standards in the data vault world are so important. i will add to that post with the following statements:
data vault 2.0 includes agility, IT cycle-time reduction, better-faster-cheaper, six sigma, and tqm components. that said, not only are the standards for the data modeling important, but so are the people and the processes they go through to build data warehouses. so, in addition to the questions in the original post, i would ask you to consider the following:
- does your newly proposed standard affect agility in a negative fashion?
- can your new standard be repeated for all cases?
- does your newly proposed standard increase maintenance costs?
- is your standard pattern-based?
- can your standard be applied in all parts of the data warehouse and bi lifecycle?
- does your newly proposed standard negatively impact the flexibility of the processes or queries?
- does your newly proposed standard need to be re-engineered based on volume? real-time? number of tables in the model?
i appreciate your zeal…
i really do. i challenge the data vault standards every year. it’s how and why data vault 2.0 came about, and how and why data vault 2.0 is subtly changed and expanded – to meet the needs of cross-platform mpp, big data, and nosql solution sets. if you don’t believe me, just ask my friends kent graziano, sanjay pande, michael olschimke, and roelant vos.
there is a need to challenge the standards, and there is always a need to innovate… this i don’t deny. it’s how and why we make changes to the standards that matters most to me. please don’t take any of this the wrong way: i welcome you to challenge the standards too, but don’t just be a lemming and follow someone else’s lead just because they said: “hey, do this – it’s ok to break the standards”.
no, do it right… bring your tests and your scientifically controlled experiments to the table; bring evidence; test it. if you can answer all the questions appropriately (both those above and those in the original post), then i’m happy to entertain the idea that the standards need to be changed. i’ve always felt this way.
i will tell you this: i have had many discussions about challenges to the standards along the way, and some of them have been very fruitful. that said, my number one concern when changing standards is conditional architecture / conditional design. i refuse to put forward a standard that requires re-engineering the base design just because condition x appeared on the radar.
i’m happy to entertain your thoughts and comments on the matter. that said, i can point out customer cases where the standards were broken, and a year later the data vault “broke” in the customer’s eyes – i am hearing about some of these right now.
thoughts? comments? feel free to reply to my post.
thank-you kindly,
(c) copyright 2016 dan linstedt, all rights reserved