this is a very short lesson on the architectural steps necessary to get from a “mostly” normalized model to a data vault model. it really isn’t that hard after all, and you should have a very good handle on the repeatable process.
step 1: identify the natural world keys or the business keys in one or more source models
step 2: figure out if these business keys actually mean the same thing, and represent the same grain of data, if they do, construct 1 hub for all these keys to land in, if they are different, or defined to represent different things – then they each deserve their own hub.
step 3: take the relationships (foreign keys) from the source model, and in relation to the pk – find out where to build a single link for each set of foriegn keys, keep these keys together as a unit of work. build your links in the target model
step 4: take the remaining data set, and place it in to satellites. try not to collide or override existing satellites unless the data really is meant to be combined across source systems. be wary of multi-active satellite rows, split the satellite data by classification (or type) and rate of change.
step 5: decide where to attach the satellites, if the pk of the source table is a composite fk, then the satellite attaches to the link you just built. if the pk is a single non-fk (composite or not – there’s no fk for the pk fields), then attach it to the corresponding hub table.
as always, feel free to reply with your questions.
hope this helps,