2014 is the year of #DataVault 2.0

hello everyone, this post is about 2014 and data vault 2.0 – it’s been a long time in the making, and i am finally beginning to release information about the standards, best practices, what it contains (what it doesn’t), how to get certified as a data vault certified practitioner, and so on.

i’ve been extremely busy, so first i must apologize to my fans and readers for not posting or blogging for quite a long time.  i have a ton of announcements coming soon, so bear with me.  but for now, i am announcing the following:

dv2.0 – what it is…

data vault 2.0 is the evolution of data vault.  data vault 2.0 is a system of business intelligence and data warehousing that goes beyond just the data vault model.  data vault 2.0 comprises four major pillars:

  • architecture – systems and etl, along with big data, and nosql inclusion
  • methodology – project management, standards, best practices, agile build out, scrum and sei/cmmi level 5
  • implementation – rules and procedures, standards, bpel, bpm
  • modeling – of course the data vault model has evolved, ontologies, concept models, data models

a big part of the release is the inclusion of nosql platforms and the ability to seamlessly interconnect these various platforms (hybrid, relational, non-relational, columnar, text, etc.) with your existing data warehouse.  also, with the release come the following key concepts:

  • big data – changes to the model to support very large data sets (hyper data sizes / extreme data sizes) into the petabyte range
  • function point counting techniques, estimating rules and standards, risk mitigation strategies
  • ontology modeling concepts for business keys, fact based modeling inclusion and rules/standards/patterns for building those artifacts into the data vault model
  • automation – utilizing tooling (what, when, why, where, how) – and the best practices for auto-generating as much of the data warehousing processes as possible
  • business rules engines & real-time processing – turning your data vault warehouse into a self-sustaining operational data vault or enterprise asset that is kept up to date in real-time
  • write-back techniques – how to handle them, where to handle them, patterns – what to do with the data
  • managed self-service bi – how it fits, how to enable it for your business, what it means
  • virtualization of data marts – why it works, how to make it happen, automation components
  • security and data protection – with the big data come big risks… you’ll want to know how to manage those risks
  • agility – rapid 2 week delivery time frames and fixed-cost / fixed-time deliverables, leveraging automation and data vault 2.0.  i’m working with scott ambler on this one; he has agreed to review the materials in the dv2.0 boot camp and private certification course, and to assist me with edits where necessary.
  • mixed modeling aspects – what to do with logical data models, when and where to mix and match with data vault

and lots more.  as you can see, data vault 2.0 is a system of data warehousing and business intelligence.  it defines the “who, what, how” to make your team agile and successful.  it really pulls the whole package together.  i think most of you will be pleased with the outcome.

some of the finer points will be introduced in the upcoming world wide data vault consortium this year.  (there’s still time to register)

data vault 1.0 was highly focused on data modeling for your data warehouse; data vault 2.0 is truly an evolution.

dv2.0 training and certification

yes, there is training.  i am pleased to announce the following high powered courses for everyone:  (these classes will be available both on-line, and in-person)

  • data vault 2.0 boot camp and private certification – this class is a 3 day intensive course that covers a lot of ground (many of the topics listed above).  it really teaches you how to be a general practitioner, and enables you to implement data vault 2.0 successfully within your organization.  cdvp2 (certified data vault 2.0 practitioner) is available privately for those who complete this course.
  • advanced data vault 2.0 boot camp and private certification – for those of you with experience in data vault 1.0, or already certified in data vault 1.0, you can take this course.  this is an intensive 2 day class covering only the new pieces, and the new knowledge you need to bring your qualifications up to speed.  you do not have to be “certified” in data vault 1.0 to take the cdvp2 exam.
  • data vault 2.0 modeling (introductory, intermediate, advanced) – i will be offering three different flavors of classes around just the data vault modeling components.  these classes will be available only on-line; they will also be highly focused on hands-on exercises and workshops to give you practical application experience.

you do not need to be already certified in data vault 1.0 in order to take any of these courses.  nor do you have to be already certified to take the private dv2.0 practitioner certification exam.

in order to earn your cdvp2, you *must* contact us directly to set up a private certification.

you can find out more (we are currently changing and updating the site) about private certification by going to: http://datavaultalliance.com

will there be other training?

yes.  i have a host of other classes i am bringing to datavaultalliance.com.  the following classes will be brought on-line.  if you are interested in signing up early, and getting a discount off the retail price, please let me know.  if you are interested in detailed agendas of any of these courses, please contact me directly.

note: run-times subject to change, contact us for pricing details

class # of days
agile bi visions

about this class: this class is an introductory course.  it sets the tone for the students as well as the business users.  it defines the issues, the market pressures, and pains of current enterprise bi solutions.  this course also sets the tone for the recommended solutions, and helps to explain why data vault 2.0 meets those needs.

defining the bi platform

this class is an introductory course.  it focuses on the base definitions of the critical components of all b.i. solutions: dv2.0 model, dv2.0 methodology, dv2.0 architecture, and managed self service b.i.  these topics are handled at an overview level with some deep dives into the technical definitions.  the second section of this course dives into the kpas and kpis of a proper bi solution, discussing estimates & actuals, governance, roles and responsibilities, and concludes with an introduction to automating your data warehouse.

architecture in depth

this class is an in-depth look at the b.i. landscape of architecture.  it covers data architecture and systems architecture – both data vault and kimball style bi solutions.  it discusses positioning, definitions, and best practices, while comparing and contrasting the different architectural components.  the main topics include defining: staging areas, edw, data mart layers, nosql layers, metrics layers, and metadata layers.  each section dives into best practices, practical application, proper definition (where it fits), what it should store, and how it should be utilized.

methodology in depth

this class is an in-depth look at the dv2.0 methodology.  it focuses on agility, practical application and delivery, and 2 and 3 week sprint cycles.  in other words – it teaches how to make all of this work.  the topics it discusses include: technical numbering, agile requirements gathering, agile project plan (wbs), mapping requirements to data sources (dbs), mapping business processes to projects (pbs), organizational roles and responsibilities (obs), kpis and kpas of data warehousing, optimizing your processes, parallel teams, scope control, and automating your edw (overview).

1 ½
advanced dv2.0 boot camp

this class is an in-depth look at upgrading your dv1.0 system to dv2.0.  this course covers only the new components (what they are), and where they fit.  it also talks about the changes to your data vault 1.0 model that are necessary to bring it into compliance with dv2.0.  topics include: physical modeling pieces (hashing, etc.), ontologies, etl & sql processing, write-back, and some advanced dv2.0 modeling cases.  this course dives deep into the technical implementation of dv2.0; anyone with prior working knowledge of dv1.0 will find this course beneficial.  ** this is a prerequisite to dv2.0 private certification for those with dv1.0 certification, or dv1.0 experience **
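
to give a feel for the “hashing” piece mentioned above, here is a minimal sketch in python of deriving a hash key from a business key.  everything specific in it is an assumption for illustration only, not the course material: the function name hub_hash_key, the choice of md5, the trim-and-uppercase normalization, and the “||” delimiter between key parts.

```python
import hashlib


def hub_hash_key(*business_key_parts: str, delimiter: str = "||") -> str:
    """Derive a fixed-length hash key from one or more business key parts.

    Assumptions (illustrative only): parts are trimmed and upper-cased so
    that formatting differences do not change the key, joined with a
    delimiter so that ("a", "b") and ("ab",) stay distinct, then hashed
    with MD5 and returned as a 32-character hex string.
    """
    normalized = [part.strip().upper() for part in business_key_parts]
    joined = delimiter.join(normalized)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# hypothetical usage: a single-part key and a two-part (composite) key
customer_hk = hub_hash_key(" abc-123 ")
order_line_hk = hub_hash_key("ORD-9", "LINE-2")
```

the point of normalizing before hashing is that the same business key arriving from two source systems with different casing or padding still lands on the same key.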

defining the business vault

this class is an in-depth look at a component of dv2.0 called the business vault.  this course defines and introduces the concepts of business vault, and discusses the various components.  it also talks about how to build the business vault, and the benefits it carries with it.  the course dives into physical modeling components, business ontology application, and master data pieces.  the course finishes up with a discussion on managed self service bi & write-back.

creating a historical vault

this is an intermediate class.  it covers some out of the box ideas around how, when and why to load historical data.  it discusses interesting aspects of adjusting the dv2 model to meet the needs of the historical data, along with combining historical data in the data mart layers for business presentation purposes.

intro: building marts from vaults

this is an introductory class.  the goal of this class is to help students get their feet wet with building and loading two data mart models: flat-wide and dimensional.  the students will use sql; the examples used in class will be made available for download by the student for hands-on use.  the ddl and the data sets for the data vault will also be made available for hands-on use.  the database used will be sql server 2008 r2.  the students should be able to translate that to any database they wish, or download sql server express and try it there.
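
as a rough idea of what a flat-wide mart over a vault can look like, here is a small sketch using python’s built-in sqlite3 module rather than the sql server used in class.  the table names (hub_customer, sat_customer), the columns, the view name mart_customer_flat and the sample rows are all hypothetical, made up for this illustration; they are not the class ddl or data sets.

```python
import sqlite3


def build_vault(conn: sqlite3.Connection) -> None:
    """Create a hypothetical hub, satellite, and a flat-wide mart view."""
    conn.executescript(
        """
        CREATE TABLE hub_customer (
            customer_hk TEXT PRIMARY KEY,   -- hash key
            customer_bk TEXT NOT NULL,      -- business key
            load_dts    TEXT NOT NULL
        );
        CREATE TABLE sat_customer (
            customer_hk TEXT NOT NULL REFERENCES hub_customer (customer_hk),
            load_dts    TEXT NOT NULL,
            name        TEXT,
            city        TEXT,
            PRIMARY KEY (customer_hk, load_dts)
        );
        -- flat-wide mart: one row per business key with its latest attributes
        CREATE VIEW mart_customer_flat AS
        SELECT h.customer_bk, s.name, s.city
        FROM hub_customer AS h
        JOIN sat_customer AS s ON s.customer_hk = h.customer_hk
        WHERE s.load_dts = (
            SELECT MAX(s2.load_dts)
            FROM sat_customer AS s2
            WHERE s2.customer_hk = h.customer_hk
        );
        """
    )


def load_sample_rows(conn: sqlite3.Connection) -> None:
    """Insert made-up rows; customer C-1 has two satellite versions."""
    conn.executemany(
        "INSERT INTO hub_customer VALUES (?, ?, ?)",
        [("hk1", "C-1", "2014-01-01"), ("hk2", "C-2", "2014-01-01")],
    )
    conn.executemany(
        "INSERT INTO sat_customer VALUES (?, ?, ?, ?)",
        [
            ("hk1", "2014-01-01", "Ann", "Boston"),
            ("hk1", "2014-02-01", "Ann", "Denver"),
            ("hk2", "2014-01-01", "Bob", "Austin"),
        ],
    )


def current_customers(conn: sqlite3.Connection) -> list:
    """Read the mart view, ordered by business key."""
    query = "SELECT customer_bk, name, city FROM mart_customer_flat ORDER BY customer_bk"
    return conn.execute(query).fetchall()
```

because the mart here is a view, nothing is copied out of the vault; the satellite keeps the full history while the view shows only the most recent row per business key.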

intermediate: building marts from vaults

this is an intermediate class.  the goal of this class is to help students become familiar with gaap principles, reconciliation, business rules, and eventual consistency, along with handling error marts, report collections, and incremental change application to existing fact tables.  the students will use sql; the examples used in class will be made available for download by the student for hands-on use.  the ddl and the data sets for the data vault will also be made available for hands-on use.  the database used will be sql server 2008 r2.  the students should be able to translate that to any database they wish, or download sql server express and try it there.

2 ½
advanced: building marts from vaults

this is an advanced class.  the students will work with hands-on examples, and discuss security, cubes, redaction, business process management, reconciliation, fuzzy logic and master data.  the database used will be sql server 2008 r2.  the students should be able to translate that to any database they wish, or download sql server express and try it there.

transitioning to dv2.0

this class is an intermediate to advanced overview that focuses on what you need in order to transition into a dv2.0 solution.  it discusses the starting states of “federated star schemas as edws” and “normalized models as edws”.  the course goes through what you need to know to get going in dv2.0: how to kick-start, possible pitfalls, and things to consider before beginning your dv2.0 project.  this is a preparation course for those considering a dv2.0 solution.

dv2.0 performance and tuning

this class is an in-depth look into performance and tuning components for the dv2.0 model and architecture.  it covers mpp, smp, normalization, mathematics, data co-location, partitioning, etl/elt parallelism, clustering, load balancing, i/o storage, indexing, and structure layout.  this course is for those who want to maximize their data ingestion and retrieval rates.  even if you don’t have a data vault, you will find value in this class at a data architecture level.

applying test cases to data vault 2.0

this class is an in-depth look into testing in the dv2.0 model.  we discuss generic test cases, and offer specific how-to instructions on getting the most out of your testing paradigms in a dv2.0 modeling environment.  we discuss different test types, how to test, ensuring results, generating test data, and performance testing.

intro: data vault 2.0 modeling

this class is an introductory course, meant to help the team get started in data vault 2.0 data modeling.  it covers the basics of what a data vault 2.0 model is, how it functions, and contains a hands-on example case for practitioners to get familiar with the ideas and concepts.

dv2.0 boot camp & private certification

this class is an introductory and intermediate class that takes you through the why/what/how of data vault 2.0.  it covers the business justifications, then follows with the technical descriptions of the architecture, implementation, methodology, and modeling.  the course finishes with descriptions of etl design time paradigms, including templates, best practices and working sql.  this class is a prerequisite for anyone wishing to achieve dv2.0 certified practitioner status.  certification is available privately after the course is completed.

dv2.0 for business analyst and data architect

the purpose of this class is to provide a high level overview of the data vault 2.0 architecture, methodology, and model; along with comparing and contrasting specific points in each with “traditional” or “legacy” data warehousing constructs.  it will provide enough information for the business analyst to understand and guide data vault 2.0 projects.

tool based: automating dv2 etl with mapping manager

this class is an introductory look at using mapping manager to handle cross-reference specifications.  it then walks you through examples and teaches you how to generate data vault 2.0 code using the provided code-generation templates.

tool based: automating test data with mapping manager and rowgen

this class is an introductory look at using mapping manager to generate test data for specific data vault 2.0 cases.  it provides examples around custom business rule formulation in order to “automatically” generate specific test cases needed for your environment.  the generated code is actually a script that is then run through rowgen to produce the end-result test data.

tool based: how-to build code gen templates for mapping manager

this class is an introductory walk-through to get you started in writing code-generation templates for mapping manager.  you can think of this course as the “hello world” side of code generation.  you are given a working code-generation template that you can use in your own environment to experiment with.




2 Responses to “2014 is the year of #DataVault 2.0”

  1. Martin Ekeblad 2014/04/01 at 3:35 am #

    Hi Dan

    A question regarding Data Vault 2.0 and DataWarehouse Design/Modeling and implementation tools.

    Will Data Vault 2.0 change so much in design/modelling that these tools will lose their functionality when using them to design/model and implement a Data Vault 2.0 solution?

    As an example: I am thinking about purchasing an EDW modeling (design/implementation) tool like WhereScape. I know that WhereScape supports Data Vault, but will Data Vault 2.0 change so much that purchasing these tools today is a bad idea?

    Next question: Do you have any plan to make a list of tools that support the design, modelling and implementation of Data Vault 2.0, as well as to what grade they support the Data Vault methodology?


  2. Dan Linstedt 2014/04/03 at 8:25 am #

    Hi Martin,

    The existing tooling in the market will have to change to support Data Vault 2.0 features. However from a modeling perspective, there aren’t that many changes that would “invalidate” all the work or support they currently offer.

    But I will say this: WhereScape has chosen not to sign any partnerships with me over the past 3 years. They’ve had several opportunities to partner, and have chosen not to do so.
    That said, I don’t know what their plans are (if any) to support DV2 going forward.

    Regarding a list of tool vendors that support DV2.0 today:
    http://www.mid.de/en – business modeling + data modeling
    http://AnalytixDS.com – Mapping Manager & LightSpeed Conversion

    I will eventually produce a list once it “gets larger”. Today, no one except the two I mentioned above actually supports Data Vault 2.0.

    Thank-you kindly,
    Dan L
