A fact-based look at hashing, sequences, and collision strategies in the Data Vault 2.0 standards.
Yet another dive into hash keys and Data Vault 2.0
A walk back through the joins, hashing, Teradata, key selection, and arguments for and against sequence numbers. While this post is specifically about Teradata, some of the general statements here apply to all MPP solutions.
Recently I taught an on-site class for a customer, all about Data Vault 2.0. When I got to the point where I shared the template / process for end-dating (updating end dates in place, using a characteristic function), I was told point-blank: “Teradata does not use or have indexes other than the primary index” and […]
A look at NoSQL, big data, and Data Vault with regard to transactions per second – the hype around ingestion rates, and questions about why and when you would or wouldn’t need to switch to NoSQL platforms to handle big data.
Hadoop, NoSQL, big data: the introduction continues, this time on a deeper technical level. I expose some of the issues and discuss some of the benefits of applying your data warehouse to a Hadoop platform. I would love to hear from you; please comment at the end of the entry.
An introductory discussion of data warehousing, Hadoop, and big data: how Hadoop works, how it differs from a relational system, and what some of the benefits and drawbacks are of plugging a Hadoop system into your architecture.
A short introduction to Data Vault 2.0 and a discussion of the old standards of Data Vault 1.0 – along with some information about where DV2.0 is going and the future of Data Vault.
In this entry I will explore the implementation of Data Vault on Teradata. However, this entry is applicable to everyone in the Data Vault community, as it covers definitions and descriptions of primary keys, indexing, surrogate keys, natural keys, and performance and tuning.
I’ve begun researching Data Vault models on Hadoop solutions, including HadoopDB and Hive. Recently I came across a number of articles that describe in detail how Hive and HadoopDB work on top of Hadoop. I had to take a minute to write this article to explain my viewpoints on using the data […]