hello everyone, i have been able to help teams achieve one day sprints for delivery with data vault 2.0! i am happy to say that by following disciplined agile delivery (dad, by scott ambler and mark lines) we can reduce sprints to a half day or one day. in this entry i will talk about the benefits of getting there.
in my recent travels i worked with quite a few teams. in particular, i was able to work with a highly qualified and specialized team at a particular customer. together, we were able to put together one day sprint cycles for moving data from source all the way through the data vault, and out to the bi tooling layer. this requires special focus from the team and from the scrum master / team leader. that said, there is no “magic fairy dust” involved!
how did we do it?
the number one issue for data vault teams is a lack of understanding of how to scope properly. by scoping down to the proper level, we were able to get the data processed all the way through to the bi layer. by the way, i have done this before with a number of other teams as well; this is not unusual for data vault 2.0 methodology practices!! this is the normal methodology implementation when done properly… generally these types of things are taught in workshop settings and in on-site kick start packages.
- split tasks into different sprints
- scope tasks down to single delivery points (see #6 below).
- scope should be limited to 3 to 4 source systems at most, and one reporting output
- critical!! have the data vault model built before starting (just the portion that will be utilized)
- have the right team (5 or 6 individuals, maybe 10 at most)
- involve a data vault 2.0 certified team coach / agile delivery master
- requirement up front: find a single output and build to that. the team must:
  - already understand the data (already have done the profiling if needed)
  - already have access to the source system
  - be able to stage the data in a matter of minutes (10 minutes at most)
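the scoping rules and prerequisites above can be turned into a quick pre-sprint check. the following is only a minimal python sketch: the `SprintScope` class, its field names, and the `scope_problems` function are hypothetical names made up for illustration, and the thresholds are simply the numbers from the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SprintScope:
    """Hypothetical description of one day-long sprint (illustration only)."""
    source_systems: int      # number of source systems in scope
    reporting_outputs: int   # number of reporting outputs in scope
    team_size: int           # people in the war room
    model_ready: bool        # data vault model (the needed portion) already built
    data_profiled: bool      # team already understands / has profiled the data
    source_access: bool      # team already has access to the source system
    staging_minutes: float   # estimated time to stage the data


def scope_problems(scope: SprintScope) -> List[str]:
    """Return the checklist items this scope violates (empty list means go)."""
    problems = []
    if scope.source_systems > 4:
        problems.append("more than 4 source systems")
    if scope.reporting_outputs != 1:
        problems.append("scope must target exactly one reporting output")
    if scope.team_size > 10:
        problems.append("team larger than 10")
    if not scope.model_ready:
        problems.append("data vault model not built for this scope")
    if not scope.data_profiled:
        problems.append("data not yet understood / profiled")
    if not scope.source_access:
        problems.append("no access to the source system")
    if scope.staging_minutes > 10:
        problems.append("staging takes longer than 10 minutes")
    return problems


if __name__ == "__main__":
    sprint = SprintScope(source_systems=3, reporting_outputs=1, team_size=6,
                         model_ready=True, data_profiled=True,
                         source_access=True, staging_minutes=5)
    print(scope_problems(sprint) or "scope looks small enough for a one day sprint")
```

if the function returns anything at all, the scope is probably too large or a prerequisite is missing, and the sprint should be re-scoped before the team gets in the room.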
steps to build:
schedule a war room – get the entire team in the room, get them focused, and build. use the whiteboards and the overhead projector to iterate.
note: automation tooling can dramatically increase productivity, reducing build time from an hour or two, down to minutes. i recommend either wherescape 3d & red or analytixds mapping manager as my preferred data vault automation solutions.
- load data to stage
- build hub, link, satellite loads (generate if possible)
- execute loads (unit test!!!)
- design pit and bridge tables in data modeling tool, generate to business vault
- either: generate pit & bridge load code, or write it by hand (unit test!! load data!!)
- design virtual mart views (virtual dimensions, virtual facts) unit test
- open bi query tool, ingest views, and build report (unit test)
- then, begin data testing (yes there are dad steps to testing as well)
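to make the "stage, load hub and satellite, unit test" steps above concrete, here is a minimal sketch in python using the standard library `sqlite3` and `hashlib` modules. everything specific in it is an assumption for illustration: the table and column names (`stg_customer`, `hub_customer`, `sat_customer`), the md5 hash key over the trimmed, upper-cased business key, and the hash diff used to detect changed attributes. it is not generated code from any automation tool, and link loads are left out to keep it short.

```python
import hashlib
import sqlite3

# Hypothetical stage, hub and satellite tables for a single "customer" concept.
DDL = """
CREATE TABLE stg_customer (customer_id TEXT, name TEXT, city TEXT,
                           load_dts TEXT, record_source TEXT);
CREATE TABLE hub_customer (customer_hk TEXT PRIMARY KEY, customer_id TEXT,
                           load_dts TEXT, record_source TEXT);
CREATE TABLE sat_customer (customer_hk TEXT, load_dts TEXT, hash_diff TEXT,
                           name TEXT, city TEXT, record_source TEXT,
                           PRIMARY KEY (customer_hk, load_dts));
"""


def _md5(parts):
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


def hash_key(*business_keys):
    """Hash key: assumed here to be md5 of the trimmed, upper-cased key parts."""
    return _md5(k.strip().upper() for k in business_keys)


def hash_diff(*attributes):
    """Hash diff: md5 of the trimmed descriptive attributes (case preserved)."""
    return _md5(a.strip() for a in attributes)


def stage(con, rows):
    """Truncate and reload the stage table with one batch."""
    con.execute("DELETE FROM stg_customer")
    con.executemany("INSERT INTO stg_customer VALUES (?, ?, ?, ?, ?)", rows)


def load_hub(con):
    """Insert business keys that the hub has not seen yet."""
    for r in con.execute("SELECT * FROM stg_customer").fetchall():
        con.execute(
            "INSERT OR IGNORE INTO hub_customer VALUES (?, ?, ?, ?)",
            (hash_key(r["customer_id"]), r["customer_id"],
             r["load_dts"], r["record_source"]))


def load_sat(con):
    """Insert a satellite row only when the attributes changed for that key."""
    for r in con.execute("SELECT * FROM stg_customer ORDER BY load_dts").fetchall():
        hk = hash_key(r["customer_id"])
        hd = hash_diff(r["name"], r["city"])
        latest = con.execute(
            "SELECT hash_diff FROM sat_customer WHERE customer_hk = ? "
            "ORDER BY load_dts DESC LIMIT 1", (hk,)).fetchone()
        if latest is None or latest["hash_diff"] != hd:
            con.execute(
                "INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?, ?)",
                (hk, r["load_dts"], hd, r["name"], r["city"], r["record_source"]))


def run_demo():
    """Load two daily batches and return (hub row count, satellite row count)."""
    con = sqlite3.connect(":memory:")
    con.row_factory = sqlite3.Row
    con.executescript(DDL)
    batch1 = [("C1", "Ann", "Oslo", "2017-01-01", "crm"),
              ("C2", "Bob", "Rome", "2017-01-01", "crm")]
    batch2 = [("C1", "Ann", "Bergen", "2017-01-02", "crm"),  # city changed
              ("C2", "Bob", "Rome", "2017-01-02", "crm")]    # unchanged
    for batch in (batch1, batch2):
        stage(con, batch)
        load_hub(con)
        load_sat(con)
    hubs = con.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
    sats = con.execute("SELECT COUNT(*) FROM sat_customer").fetchone()[0]
    return hubs, sats


if __name__ == "__main__":
    # Unit test: 2 business keys, 3 satellite versions (C1 changed once).
    assert run_demo() == (2, 3)
```

the assertion at the bottom is the kind of small unit test meant by "unit test!!!" above: rerunning a batch must not duplicate hub keys, and an unchanged row must not create a new satellite version.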
deliver! deliver! deliver!
by the end of the day, the team should have output to the bi query tool. if this doesn’t happen in a single day, then either the scope was too large, or the prerequisites i listed were not followed! that means, go back and do it again. it may be helpful to hire an authorized data vault 2.0 coach or trainer to get this process under way quickly. my authorized trainers are all trained in agile data vault delivery practices, and can guide your team to successful day-long sprints.
not only did we deliver, we also iterated during the day long sprint!!
separate the sprint cycles, folks, but always, always, always demonstrate data value in the bi layers as fast as possible!! in 1997, with a team of 3 people that i managed, we had 2 hour turn-around times for the bi delivery cycles, serving 5000 business users*** (qualified: this was the information delivery cycles only; all data was already identified and loaded in the data vault).
other suggested day long sprint cycles:
remember: these are all targeted at a small scope for rapid delivery…
- developing the data vault model (identifying business keys)
- gathering requirements (should be a 45 minute sprint!!)
- bi report layout & build (this should be done by the business users, not by i.t.) – this is self-service bi.
- development of business vault pieces
common misconceptions in data vault landscapes
there are several common misconceptions that need to be dispelled. they lead to bad practices that need to be stopped immediately. if you are doing these things, please readjust and optimize your flow, or call an authorized and certified individual (coach or trainer) to assist. these misconceptions drive long delivery cycles, blow budgets, and blow expectations, and the resulting projects never finish properly, on time, or within budget.
- thinking you have to build the entire data vault model before you build any output.
  - no no no… please don’t do this!! this can lead to catastrophic failure of the entire project. the dv model is designed to be built incrementally and in pieces. it is designed to be agile, and to be flexible to change going forward.
- waiting for business to sign off on requirements before you start building and loading the data vault model.
  - no, you don’t have to wait! if you have a few subject matter experts or a ba on staff who can attend the design session for the limited output scope, they should be able to give the team enough information about the business processes and business keys to get the data vault model built, and to identify the source systems where the data comes from. don’t wait for requirements!! get the dv model built and loaded as fast as possible.
- that a business data vault is required to get data out.
  - nope, not true!!! bdvs are sparsely built, and only for a few specific reasons (covered in the cdvp2 certification class). what needs to be built are the point-in-time and bridge table structures.
- that you can *never* show a development prototype to business users
  - wrong again. during the day long sprint, the team should have a bi dashboard on the virtual facts and virtual dimensions by the end of the day. there should be a business analyst in the room to help qualify / examine the output and make corrections. a demo should be scheduled as soon as possible (possibly the next day) with at least the stakeholder to show output!! even if it is a development data set.
- building virtual facts and virtual dimensions or views directly against the raw data vault without point-in-time or bridge tables.
  - please don’t do this!! please build point-in-time and bridge tables. if they are built properly, your queries will run very very fast and should handle extremely large data sets with ease.
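to illustrate the last point, here is a minimal sketch of a point-in-time (pit) table feeding a virtual dimension view, again in python with `sqlite3`. the table names (`sat_customer_name`, `sat_customer_address`, `pit_customer`), the view name `dim_customer`, and the snapshot dates are all hypothetical. the left joins are a simplification for this example, and bridge tables are not shown.

```python
import sqlite3

# Hypothetical hub, two satellites, a PIT table and a virtual dimension view.
DDL = """
CREATE TABLE hub_customer (customer_hk TEXT PRIMARY KEY, customer_id TEXT);
CREATE TABLE sat_customer_name (customer_hk TEXT, load_dts TEXT, name TEXT,
                                PRIMARY KEY (customer_hk, load_dts));
CREATE TABLE sat_customer_address (customer_hk TEXT, load_dts TEXT, city TEXT,
                                   PRIMARY KEY (customer_hk, load_dts));
CREATE TABLE pit_customer (customer_hk TEXT, snapshot_dts TEXT,
                           name_load_dts TEXT, address_load_dts TEXT,
                           PRIMARY KEY (customer_hk, snapshot_dts));

-- The view only does equi-joins on the load dates stored in the PIT row.
CREATE VIEW dim_customer AS
SELECT p.snapshot_dts, h.customer_id, n.name, a.city
FROM pit_customer p
JOIN hub_customer h ON h.customer_hk = p.customer_hk
LEFT JOIN sat_customer_name n
       ON n.customer_hk = p.customer_hk AND n.load_dts = p.name_load_dts
LEFT JOIN sat_customer_address a
       ON a.customer_hk = p.customer_hk AND a.load_dts = p.address_load_dts;
"""

# For one snapshot date, record the latest load date of each satellite
# at or before that date, once per hub key.
PIT_LOAD = """
INSERT INTO pit_customer
SELECT h.customer_hk,
       :snap,
       (SELECT MAX(n.load_dts) FROM sat_customer_name n
         WHERE n.customer_hk = h.customer_hk AND n.load_dts <= :snap),
       (SELECT MAX(a.load_dts) FROM sat_customer_address a
         WHERE a.customer_hk = h.customer_hk AND a.load_dts <= :snap)
FROM hub_customer h
"""


def build_pit(con, snapshot_dts):
    """Add PIT rows for every hub key as of the given snapshot date."""
    con.execute(PIT_LOAD, {"snap": snapshot_dts})


def run_pit_demo():
    """Build two snapshots and return the rows of the virtual dimension."""
    con = sqlite3.connect(":memory:")
    con.executescript(DDL)
    con.execute("INSERT INTO hub_customer VALUES ('hk1', 'C1')")
    con.execute("INSERT INTO sat_customer_name VALUES ('hk1', '2017-01-01', 'Ann')")
    con.executemany(
        "INSERT INTO sat_customer_address VALUES (?, ?, ?)",
        [("hk1", "2017-01-01", "Oslo"), ("hk1", "2017-01-03", "Bergen")])
    for snap in ("2017-01-02", "2017-01-04"):
        build_pit(con, snap)
    return con.execute(
        "SELECT snapshot_dts, customer_id, name, city "
        "FROM dim_customer ORDER BY snapshot_dts").fetchall()


if __name__ == "__main__":
    for row in run_pit_demo():
        print(row)
```

the "which satellite row was current on this date" work is done once, when the pit table is loaded. the view that the bi tool reads then needs only simple key-and-date joins, instead of repeating that lookup against the raw data vault on every query.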
day long sprints for different components are achievable. i have helped many teams in just the past year get to this point. better, faster, cheaper is built into the data vault methodology!! this is only available in data vault 2.0 training and coaching from authorized partners. utilizing automation tools from wherescape and analytixds decreases time to market dramatically.
if your “data vault effort” is stagnating, and not delivering, then something is seriously wrong. time to get it fixed!! time to change the way the team is working, time to hire an authorized data vault 2.0 trainer / coach.
have a success story from your data vault project? have a problem with your current efforts? please comment and share below.
hope this helps,
dan linstedt (c) 2017 all rights reserved.