Cat Continuously Smacks Head Against Star Schemas

huh?  yes, i have a cat, and yes – unfortunately she’s not very bright.  we used to live in a house with highly polished hard wood floors.  we had a very long hallway, and when she was a kitten – she would continue running down the hallway at full-speed; knowing full well that the wall and a corner were coming up. did she slow down?   (warning: this post is tongue-in-cheek, and is not meant as insulting – however it may come across this way…) this is a controversial post, and somewhat lopsided.  please read it with an open mind and light-heart

no.  she kept on running, then at the last minute tried to change directions – resulting in a massive slide and her whole body (including her head) smacking in to the wall.  this happened nearly every day for the first 6 months when she was a kitten.

so what does this have to do with business intelligence?

you would have thought that: “after the 2nd or 3rd time she would have learned that this hurts, and she would have stopped or slowed down at least.”  but no, she kept on crashing in to the wall.

in the bi world, i see many of the same types of projects.  these projects use star schemas as an enterprise data warehouse, over and over and over again.  don’t get me wrong: star schemas aren’t the problem the problem is applying the star schema design as a use for a data warehouse.  something you may not know: in order for me to create the data vault, i had to open my mind and learn new things, think in a new way.  i too once built enterprise data warehouses by using type 2 federated star schemas, and i too – kept running at the wall full speed…

the star schema (dimensional modeling) design was never intended to be an enterprise data warehouse.  it was originally written, designed to be a data mart, providing fast, quick, and subject oriented answer sets to the business.  the original writings of dr. kimball (which can no longer be found on the web) never said anything about type 1, type 2, or type 3 styles of dimensions.  the original design was never built to house history.  it was originally designed and built to provide: olap and drill down answer sets to the business.  it was not designed to be a data warehouse, it was designed to be a data mart, then because it was the only architecture we had that “worked well” for business delivery, it was adapted to data warehousing needs.

again, i’m not saying star schemas are bad, quite the contrarythey are excellent when applied in the right place for the right purpose: as data marts for delivery to business users.  i’m also not saying that type 1, 2, or 3 is bad, i am saying that there are some things about the design that keep it from being the optimal design for your data warehouse.

so again, what does this mean for business intelligence?

i am saying that if you know of a project that has any of the following pain: can’t scale, can’t audit, disparate answer sets (data junkyard), inconsistent answers, inflexible to change, non-agile it teams, etc…  then that project might be a candidate for a new approach, a new way of doing things.

i am saying this: if these projects continue down the same road, using the same architecture over and over again, they will experience the same pain over and over again in the long run.  why not change the direction? try something new?

“if you do not change the direction in which you are going, you will end up where you are headed.” – confucious

“insanity: doing the same thing over and over again and expecting different results.” – albert einstein

why do projects in this situation continue to run down the hallway, see the corner coming – then bang in to the wall at the end over and over again?  don’t get me wrong and please don’t take offense…

  • perhaps it’s simply because they don’t know the data vault model and methodology exists…  ok, i’ll buy that.
  • perhaps they don’t understand the data vault model and methodology, or what it’s good at…  (ok, i can buy that too)..
  • perhaps, (and most importantly), they don’t run at the wall head-long at top speed…
  • perhaps they’ve constructed a good working system, and they don’t have changes, or pain..   cool!  congratulations!

that’s wonderful news, and as i’ve said before: if it ain’t broke, don’t fix it….

reasons for not using the data vault model and methodology:

  • if the project doesn’t have pain in it’s systems, flexibility, scalability, auditability, etc…
  • if you are a small to medium sized business with a single system
  • if you want / need a single throw-away point solution for a very narrow and specific data set with limited history
  • if you don’t have to deal with external data, unstructured or semi-structured data as inputs to your warehouse
  • if you are not looking for an enterprise solution, but rather a departmental solution
  • if you don’t have federal or world-wide regulatory requirements
  • if what you currently have works for you.

any of these reasons are good/great reasons to not make a change, and to not learn /use the data vault model and methodology.

concluding thoughts

but: if the pain is there, and the awareness of the data vault is there, then take the time to get the project’s staff to a training class – certified or otherwise.  because:

“you have to learn the rules of the game. and then you have to play better than anyone else.” – albert einstein

i hope this has been informative, and again, i’m sorry if i offended anyone.  if you are one of my readers, i welcome your feedback.  if you have funny stories about pain points and lessons learned, that would be wonderful. if on the other hand, you have success stories with star schemas as data warehouses – again, i’d love to hear from you, and congratulations in the most sincerest manner.

Tags: , , , , , ,


  1. Dimensional modeling and Data Vault – a happy marriage? : IN2BI - 2011/09/05

    […] lot of emphasis on the Internet lately has been placed on the difference between both methods. This blogpost from Dan Linstedt and this discussion on the Kimball forums are examples of this strange […]

Leave a Reply