hello everyone, i am bringing you some exciting news: if you want your data vault 2.0 to be 100% insert only compliant (as it was in the beginning, way back in 2001), then remove last seen dates and remove load end dates.
i’m here to tell you that if you aren’t focused on an insert only architecture for your data vault 2.0, then you aren’t focused in the right place. due to scalability (volume) and sheer arrival speed (velocity) of data, your data vault 2.0 model should be built on a 100% insert only architecture.
last seen dates were always a kludge, a crutch, to help developers get around “full table dumps” and detect data that had disappeared from the source. load end dates were always a kludge, a crutch, to help developers issue between clauses on query access. both have to be maintained with updates against rows that were already loaded.
well, i’m happy to say: there is a way forward without either of these two kludgy attributes.
- implement record source tracking as defined in my book: building a scalable data warehouse with data vault 2.0, or as defined in my cdvp2 class. what??? you’re not yet certified? why? what’s stopping you? i am offering two additional courses this year, you can sign up for them at: http://datavaultalliance.com
- implement point in time and bridge tables – and do it properly!!! again, as defined in my book: building a scalable data warehouse with data vault 2.0, or in my cdvp2 class.
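
to make the second point concrete, here is a minimal sketch of an insert only satellite with a point in time table, using python and an in-memory sqlite database. it is not the design from the book or the class: the table names (`sat_customer`, `pit_customer`), the columns, and the helpers `add_snapshot` and `as_of` are all assumptions made up for illustration. the idea it shows is that the point in time table records, per snapshot date, which satellite row was current, so queries use an equi-join instead of a between clause on a load end date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical satellite: no load end date column, rows are only inserted.
CREATE TABLE sat_customer (
    hub_key   TEXT,
    load_date TEXT,
    name      TEXT,
    PRIMARY KEY (hub_key, load_date)
);
-- Hypothetical point in time table: one row per snapshot date and key,
-- pointing at the satellite row that was current at that snapshot.
CREATE TABLE pit_customer (
    snapshot_date TEXT,
    hub_key       TEXT,
    sat_load_date TEXT,
    PRIMARY KEY (snapshot_date, hub_key)
);
""")

# Each change arrives as a new row; nothing is updated in place.
conn.executemany("INSERT INTO sat_customer VALUES (?, ?, ?)", [
    ("c1", "2016-01-01", "acme"),
    ("c1", "2016-01-10", "acme corp"),
    ("c2", "2016-01-05", "globex"),
])

def add_snapshot(conn, snapshot_date):
    """Insert PIT rows for one snapshot date (insert only, no updates)."""
    conn.execute("""
        INSERT INTO pit_customer (snapshot_date, hub_key, sat_load_date)
        SELECT ?, hub_key, MAX(load_date)
        FROM sat_customer
        WHERE load_date <= ?
        GROUP BY hub_key
    """, (snapshot_date, snapshot_date))

def as_of(conn, snapshot_date):
    """Return (hub_key, name) as of a snapshot, via equi-join only."""
    return conn.execute("""
        SELECT p.hub_key, s.name
        FROM pit_customer p
        JOIN sat_customer s
          ON s.hub_key = p.hub_key
         AND s.load_date = p.sat_load_date
        WHERE p.snapshot_date = ?
        ORDER BY p.hub_key
    """, (snapshot_date,)).fetchall()

add_snapshot(conn, "2016-01-07")
add_snapshot(conn, "2016-01-15")
```

calling `as_of(conn, "2016-01-07")` resolves each key to the row that was current on that date without any end date stored on the satellite. iso formatted date strings are used so that plain string comparison orders them correctly in sqlite.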
by the way, the benefits of properly implementing these two paradigms far outweigh the costs of “updates” against the data vault model. furthermore, these two structures move your entire edw dv2 into the 21st century, and make it 100% insert only compliant.
moving to a 100% insert only architecture is currently (and into the future) the best thing you can do for your edw dv2.
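
for the first paradigm, here is a rough sketch of how tracking can stand in for a last seen date without any updates. treat it as one possible reading, not the definition from the book or the class: the table `rts_customer` and the helpers `track_load` and `disappeared` are hypothetical names. the assumption is that every full table dump inserts one row per key it contains, so a key has "disappeared" when its newest tracking row is older than the newest load from that source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tracking table: one inserted row per key, per load, per source.
conn.execute("""
    CREATE TABLE rts_customer (
        hub_key       TEXT,
        load_date     TEXT,
        record_source TEXT,
        PRIMARY KEY (hub_key, load_date, record_source)
    )
""")

def track_load(conn, load_date, record_source, keys_in_dump):
    """Record which keys appeared in one full dump (insert only)."""
    conn.executemany(
        "INSERT INTO rts_customer VALUES (?, ?, ?)",
        [(key, load_date, record_source) for key in keys_in_dump],
    )

def disappeared(conn, record_source):
    """Keys seen before but absent from the latest load of this source."""
    rows = conn.execute("""
        SELECT hub_key
        FROM rts_customer
        WHERE record_source = ?
        GROUP BY hub_key
        HAVING MAX(load_date) < (
            SELECT MAX(load_date)
            FROM rts_customer
            WHERE record_source = ?
        )
        ORDER BY hub_key
    """, (record_source, record_source))
    return [row[0] for row in rows]

# Two full dumps from a hypothetical "crm" source; c2 is missing on day two.
track_load(conn, "2016-01-01", "crm", ["c1", "c2", "c3"])
track_load(conn, "2016-01-02", "crm", ["c1", "c3"])
```

here `disappeared(conn, "crm")` reports the key that dropped out of the second dump, and no existing row was touched to find that out. the trade-off in this sketch is more rows in exchange for zero updates.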
if you have questions or comments, feel free to post them below.
(c) dan linstedt, 2016 all rights reserved