A short intro to #datavault 2.0

In this post I will introduce some of the items covered in Data Vault 2.0. I am refining and polishing the specifications at this time and should be able to release them shortly. In the meantime, here are some reasons why DV2.0 is up-and-coming, what role it will play in the data warehousing landscape, why it’s important, and what DV2.0 certification is about.

What is Data Vault 2.0?

Data Vault 2.0 is the evolution of the standards for Data Vault. Before I go any further, I think it is wise to state that Wikipedia needs an update, as it covers neither all of the components nor the full definition of Data Vault. To that end, here is the proper definition of “Data Vault” itself:

Data Vault is a system of data warehousing and business intelligence that comprises three major components: the Data Vault model, the Data Vault methodology, and the Data Vault systems architecture.

Changes to the standard:

Data Vault 2.0 brings improvements in the following areas:

  1. Data Vault modeling – changes have been made to achieve better performance levels for both loading and querying. These changes are also aimed at ensuring the success of Data Vault modeling with NoSQL environments in a seamlessly integrated fashion.
  2. Data Vault methodology – while I’ve introduced bits and pieces of the methodology (such as implementation rules and procedures) during older certification classes, I’ve not really published the complete components of the methodology in full. Data Vault 2.0 brings the full methodology to bear on projects to ensure CMMI Level 5 compliance (which also relies on people’s ability to execute), TQM (overall for BI), Six Sigma (for project-based build-outs), and agile for rapid 2 to 3 week delivery cycles. Data Vault 2.0 certification will require knowledge of the methodology as well as knowledge of the modeling components.
  3. Data Vault architecture – in my book, Super Charge Your Data Warehouse, I outline some of the systems architecture components necessary to make the “Data Vault system” a success. In Data Vault 2.0 not much has changed here, with one exception: the inclusion of NoSQL environments for multiple purposes. Data Vault 2.0 certification will require knowledge of NoSQL environments from both an architectural and an implementation perspective.
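
This post does not spell out the modeling changes. One change widely associated with Data Vault 2.0 is computing hash keys from business keys in place of sequence-generated surrogate keys, so that tables can be loaded in parallel and a relational store and a NoSQL store can derive the same key independently. The following is only a minimal sketch of that idea: the function name `hash_key`, the choice of MD5, the `||` delimiter, the trim/uppercase normalization, and the sample key values are all illustrative assumptions, not part of any released specification.

```python
import hashlib


def hash_key(*business_keys: str, delimiter: str = "||") -> str:
    """Derive a deterministic key from one or more business key parts.

    Hypothetical helper: the normalization rule, delimiter, and hash
    function are assumptions made for illustration only.
    """
    # Normalize each part so the same business key always hashes the same way.
    normalized = [part.strip().upper() for part in business_keys]
    joined = delimiter.join(normalized)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Each key is computed directly from the business keys, with no lookup
# against a sequence generator, so loads do not have to wait on each other.
customer_hk = hash_key("cust-1001")                 # single business key
order_hk = hash_key("ORD-77")                       # single business key
customer_order_hk = hash_key("cust-1001", "ORD-77")  # composite of both keys
```

Because the key depends only on the business key values and the agreed rule, any platform that applies the same rule produces the same key, which is what makes joining across environments possible without a shared key generator.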

Why bother with Data Vault 2.0? Business value?

The world is shifting, and including big data is a necessary step forward. Beyond that, subtle changes to the data modeling architecture are necessary for seamless integration of big data and NoSQL environments. And for far too long the methodology has been “discounted”, forgotten, or left in the dust. Well, I’m here to say that this will no longer be the case.

Customers are demanding that IT do more with less (as they always have), and they are asking us to be agile about our implementation strategies. Long gone are the days of 6 month deployments, or even 3 month deployments. We must now step up to the plate and begin to acknowledge things like the following (these are parts of the business value):

  • Managed self-service BI
  • Automation and back-end self-healing systems
  • Dynamic structure adaptation of the data warehouse
  • Rapid time to deliver (2 or 3 week delivery cycles)
  • Lower cost of maintenance and management (TCO)

Businesses are clamoring for the data warehousing industry to “catch up”. Well, with Data Vault 2.0 they can, and you can help them get there rapidly. This year, as I release the standards, I will be releasing specifications for the following areas:

  • Data Vault Modeling 2.0 specification (specific to the data modeling and architecture components)
  • Data Vault Implementation 2.0 specification (specific to ETL and loading components, including big data and NoSQL)
  • Data Vault Methodology 2.0 specification (specific to delivery, project management, agile production, etc.)
  • Data Vault Architecture 2.0 specification (what’s required, best practices, and standards for systems architecture designs)

What does certification say to customers?

Certification in Data Vault 2.0 tells customers that you are well versed in the standards, the methods, the architecture, and the design of end-to-end data warehouses at a corporate level. It says that you have the capacity to deliver in an agile fashion; that you can build, automate, and deploy Data Vault systems from end to end with only weeks of design and implementation; that you can discuss the business benefits and uplift (or value) with business users; that you can manage a full Data Vault 2.0 project; and that you can include big data and NoSQL systems seamlessly and easily in your Data Vault 2.0 implementation plan.

What’s the difference from Data Vault 1.0 certification?

The original certification course and test were highly focused on just Data Vault modeling. They did not include any considerations for big data or NoSQL. A few bits introduced in the 1.0 certification class discussed implementation and architecture, but the certification test did not ensure the knowledge was in place. The original certification test does not discuss methodology at all.

Can I get Data Vault 2.0 certified without taking the class?

Yes. The DV2.0 certification test will be available for a fee from a licensed and authorized Data Vault 2.0 training partner.

Can I take a Data Vault 2.0 update class if I’m familiar with Data Vault 1.0?

Yes, there will be a number of on-line courses available at http://datavaultalliance.com/training over the course of this year, which will allow you to take different classes to learn each of the aspects of DV2.0.

After that, you can take the Data Vault 2.0 certification test to demonstrate your knowledge level to the customer base.

Will there be downloads / artifacts available with the on-line class?

Yes. I will be providing quite a few downloads for each section for use with your project, the implementation, and the different tool sets involved. All will have examples that you can follow or apply to your project.

What on-line classes can I expect to see this year?

The following courses are being developed for the Data Vault 2.0 release this year:

  • Informatica ETL Data Vault loading & data mart building best practices and patterns (due out end of February)
  • SSIS ETL Data Vault loading & data mart building best practices (date: TBD)
  • DataStage ETL Data Vault loading & data mart building best practices (date: TBD)
  • ANSI-SQL Data Vault loading & data mart building best practices (due out end of April 2013)
  • Data Vault bootcamp – end-to-end Data Vault 2.0 full coverage – prep course for DV2.0 certification (due out Q3 2013)
  • Data Vault 2.0 certification preparation – for those wanting to prepare for DV2.0 certification (due out early Q3 2013)
  • Data Vault testing strategies and best practices (due out early Q3 2013)
  • From Data Vault to data marts – specializing in loading data marts (due out Q4 2013)

The class that is available today is:

  • Data Vault fundamentals – implementation basics, ANSI-SQL based standards, designs, and goals, covering scalability, performance, parallelism, and fault tolerance. You can find this one already available at: http://datavaultalliance.com/training

Can I teach DV2.0 or use any of the materials?

Not without licensed written authorization from me. You must, as an individual (not an organization), sign up to become a licensed Data Vault 2.0 trainer. You must have completed Data Vault 2.0 certification yourself. You must have real-world experience with big data and NoSQL solutions. In other words, you will be asked to pass an instructor’s test, which is far more rigorous than the standard Data Vault 2.0 student certification.

Once authorized and accepted into the program, you will be licensed (as an individual, not an organization) to teach Data Vault 2.0 certification classes.

Note: organizations will be listed as authorized Data Vault 2.0 training centers, but individual trainers within those organizations are required to adhere to these requirements. Authorization to teach Data Vault 2.0 will be granted to individuals, not organizations.

Legal made me say this: anyone without authorization caught infringing on my copyrights, trademarks, and intellectual property will be punished to the full extent of either international or US law. Data Vault 2.0 and DV2.0 are registered trademarks and copyrights of Dan Linstedt. All intellectual property around Data Vault 2.0 is the sole property of Dan Linstedt, all rights reserved.

I want to know more about teaching or certification or on-line courses…

Use the Contact Us form on this site. I look forward to hearing from you.



5 Responses to “A short intro to #datavault 2.0”

  1. Ahmed Fayed 2014/05/13 at 11:58 pm #

    We’re planning to use the Data Vault architecture in the Dubai Customs data warehouse. It’s a very flexible technique that combines the features of centralized and bus architectures and avoids their drawbacks. The e-book provides adequate information about the structure and design, but we need information about the recommended development process, required team roles, artifacts, and so on. Where can I find such information? Is it published somewhere?


  2. Dan Linstedt 2014/05/15 at 5:43 pm #

    Hi Ahmed,

    Thank you for contacting me. Yes, the information you seek is currently offered in two places. The first is http://LearnDataVault.com, with its on-line, hands-on ETL training classes (one that is generic SQL based, and one that is Informatica ETL specific). The only other place to get information about the development process, team roles, artifacts, SCRUM, agile, etc. is my Data Vault 2.0 Boot Camp course. Currently I offer this in person only, although I hope to offer this class on LearnDataVault.com soon as well.

    Please feel free to contact me directly at: danLinstedt@gmail.com for further information or if you have additional questions.

    Thank you kindly,
    Dan Linstedt

  3. Stefano 2015/03/16 at 7:49 am #

    Hi Dan,
    I’m considering the use of Data Vault modeling for a new DWH project. I’ve seen that Data Vault 2.0 is going to be released. With the project starting in 1-2 months, what’s your opinion? Should I start with Data Vault 1.0?
    Additionally, I’ve found ‘The Official Data Vault Standards Document (Version 1.0)’ on Amazon as a reference, but I suppose it is not up to date, correct?
    I know that more info (particularly ebooks) is available on learndatavault.com, but the site is under maintenance.

  4. Dan Linstedt 2015/04/17 at 2:44 pm #

    Start with DV1.0 for now. DV2 is on its way shortly. There isn’t an official release of the DV2 standards yet. There is a book coming this summer, published by Morgan Kaufmann, which will cover DV2 in its entirety. In the meantime, you can learn about DV2 here on my blog, and on the LinkedIn.com “Data Vault Discussions” group.

    http://LearnDataVault.com is back on-line, and so is our training site: http://KeyLDV.com

    Hope this helps,


