In-Database Analytics and your EDW/BI

well, well, we’ve come full circle, haven’t we?  there’s an interesting (yet long, dry, and somewhat technical) explanation of in-database analytic technology here (sybase iq, forrester, fuzzy logix), but i have my own opinions.  i’ve blogged and written about these topics for years on and in teradata magazine.  in this entry i will explore the meaning, the consolidation, and the relationships to the next generation edw – which i have also dubbed: operational data warehousing.

full circle, what the heck are you talking about?

in the beginning there was oltp and applications, and there was a huge debate over where to put application logic: should it be in the database or in the application outside the database?  over the years, people have realized that a balance of functionality is necessary.

then came the history of data, and up popped data warehousing.  you know this story…  the edw split off from the oltp systems, and for what?  reporting purposes only.  then came the principles of “trending, analyzing, and grouping” the data – pivoting, etc… which led to application development (and logic / manipulation of data sets within applications).  that includes microsoft excel, which is still used today (along with macros, etc.).

i wrote articles years ago about the “future of data warehousing” in which i stated my belief that eventually the applications, the logic, and all the controls would be consolidated into a hardware/software platform that had some sort of query logic built in…  along came the appliance market. this market is still in flux, and hasn’t yet matured to the point i envisioned.  nonetheless, here’s sybase, fuzzy logix, and a few other vendors (not to pick on sybase specifically) saying that “in-database analytics” or “deep analytics” is the best thing since sliced bread…  they discovered that push-down optimization of sql and application logic is more powerful than moving the data out to a middle tier or an application, and then manipulating it.  go figure!
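to make the push-down point concrete, here is a minimal sketch using python's built-in sqlite3 module as a stand-in for a real warehouse engine.  the table, column, and function names are hypothetical, made up for illustration only.  the first function drags every row out to the "middle tier" and adds it up there; the second lets the database do the grouping and hands back only the answer.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table (hypothetical schema, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100.0), ("east", 50.0), ("west", 70.0)],
    )
    return conn


def totals_in_middle_tier(conn):
    """Move every row out of the database, then aggregate in the application."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_pushed_down(conn):
    """Push the aggregation into the database; only the summary rows come back."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    )
    return dict(rows.fetchall())
```

both functions return the same totals.  the difference is how much data crosses the wire: three rows versus two here, but on a real fact table that gap is the whole argument.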

anyhow, they’re right…  it is more powerful, and by far a much better approach (the etl vendors learned that lesson 8 years ago… why has it taken so long for the reporting and database vendors to reach this conclusion?)

by the way, that was a very unfair question.  technology had to advance to the point where it is feasible (as it is now) to control the temperature of data, and to store information sets in sizeable ram caches for manipulation.  that took compression algorithms, data mining algorithms, faster hardware, advances in ssd storage, and of course 64-bit os technology.  all these things make the underlying platform (along with mpp) very attractive for this type of proposition.

so, the bi market is all about: consolidation, consolidation, consolidation?

yes, yes, and yes.  in other words, the sooner these database vendors realize that moving more and more “data manipulation logic” down to the database layers is valuable, the faster they will move into providing what the users want.  the presentation i mentioned, while dry, does a great job (technically) of telling us what we already knew – application logic must begin to make it into the database layers at the core-engine level, along with operational data and historical data – all combined or consolidated into a single platform.

the result?  an operational data warehouse, not just operational bi (operational bi is the output of the operational data warehouse).

how do i get there?

ahh well, the easiest way is to first construct a raw data vault – then begin to feed real-time data to it (pushing the latency cycles of arrival timing ever lower).  from that point, build a business data vault which contains operational data, a vision of master data, and strategic data.  wait a minute….  hold the horses – (did you miss the discovery here?)

use in-database analytics routines to process the data from the raw data vault to the business data vault in real-time through data mining and statistical algorithms (deep analytics), resulting in a hot master data store…  here’s another discovery: use some advanced (new) bi tooling that empowers the business users to build the business logic that runs the analytics on the data being moved from the raw data vault to the business data vault.  don’t use etl for this layer!!
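here is a rough sketch of what that raw-to-business step could look like, again with python's sqlite3 standing in for the real platform.  everything in it is an assumption for illustration: the table names (raw_sat_order, biz_customer_value), the columns, and the rule itself.  a simple average with a threshold stands in for the real data mining and statistical routines, and the rule is just a sql statement with a parameter that a business user could adjust.  the point is only that the data never leaves the database on its way from the raw store to the business store.

```python
import sqlite3


def build_vaults():
    """Create hypothetical raw and business tables in one in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE raw_sat_order (
            customer_key TEXT, amount REAL, load_ts TEXT
        );
        CREATE TABLE biz_customer_value (
            customer_key TEXT PRIMARY KEY,
            order_count INTEGER, avg_amount REAL, segment TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO raw_sat_order VALUES (?, ?, ?)",
        [
            ("c1", 120.0, "2010-09-01T10:00:00"),
            ("c1", 180.0, "2010-09-01T10:05:00"),
            ("c2", 40.0, "2010-09-01T10:06:00"),
        ],
    )
    return conn


# A business rule expressed as SQL (hypothetical). The threshold is a
# parameter, so the rule can be tuned without touching any data movement code.
BUSINESS_RULE = """
INSERT OR REPLACE INTO biz_customer_value
    (customer_key, order_count, avg_amount, segment)
SELECT customer_key,
       COUNT(*),
       AVG(amount),
       CASE WHEN AVG(amount) >= :high_value THEN 'high' ELSE 'standard' END
FROM raw_sat_order
GROUP BY customer_key
"""


def apply_business_rule(conn, high_value=100.0):
    """Run the rule inside the database and return the business rows."""
    conn.execute(BUSINESS_RULE, {"high_value": high_value})
    conn.commit()
    return conn.execute(
        "SELECT customer_key, order_count, avg_amount, segment "
        "FROM biz_customer_value ORDER BY customer_key"
    ).fetchall()
```

calling apply_business_rule(build_vaults()) fills the business table in a single statement.  rerunning it with a different high_value re-derives the segments from the same raw rows, which is the kind of control i want the business users to have.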

i hope the gravity of these statements sinks in.  what i am proposing is back-end etl data movement, with sophisticated front-end analytics driving the rules that move data from a raw store to a business data vault.  throw in mpp, or column-based storage, or ssd, or other new technology and you’ve got yourself a wicked bi system (i think this is more in tune with the next generation of edw).

the it group (information technology) is no longer responsible for “interpretation layers”, and business users have “nearly” free rein on creating the logic that makes sense of the information.

yeah, yeah, i know – there are some engineering challenges and technical hurdles – but this is the fun part!

deep analytics, or in-database analytics, is one step toward moving more business logic into the database.  i hope that in the future, the database vendors continue to push these advances (absorbing more and more functionality).  this would make the job of bi applications what it should be: responsible for visualization and manipulation.

would love to hear your thoughts on the subject…  did i get it all wrong?  it’s possible….  register for free, and post a comment!

dan linstedt
