A fact-based look at hashing, sequences, and collision strategies in the Data Vault 2.0 standards.
Yet another dive into Hash Keys and Data Vault 2.0
A walk back through the joins, hashing, Teradata, key selection, and arguments for and against sequence numbers. While this post is specifically related to Teradata, there are some generic statements here that apply to all MPP solutions.
A short entry explaining the thought process behind the switch in DV2 from sequences to hash keys. It also describes some of the issues that arise if Natural or Business keys are selected instead of hashes as PRIMARY KEYS in the Data Vault Model.
Recently I taught a class on-site for a customer all about Data Vault 2.0. When I got to the point where I shared the template / process for end-dating (updating end dates in place, using a characteristic function) I was point-blank told: “Teradata does not use or have indexes other than the Primary Index” And […]
In this entry I will explore the implementation of Data Vault on Teradata. However, this entry is applicable to everyone in the Data Vault community, as it covers definitions and descriptions of primary keys, indexing, surrogate keys, natural keys, and performance tuning.
For many years, I have built, authored and maintained the #Datavault standards. This includes Data Vault 1.0 and Data Vault 2.0. There are others in the community who believe that “these standards should evolve and be changed by consensus of the general public”.
Hadoop, NoSQL, Big Data: the introduction continues, on a deeper technical level this time. I expose some of the issues and discuss some of the benefits of applying your data warehouse TO a Hadoop platform. I would love to hear from you; please comment at the end of the entry.
In this entry I describe parts of MPP and parts of SMP, and how they relate to the Data Vault Architecture. I also discuss some of the finer points of the mathematical principles of MPP scalability that are applied within the Data Vault Model.
I’ve long maintained that the Data Vault architecture is backed by mathematics. Well, here’s a post that briefly explains the architectures and principles at work, including MPP, vertical and horizontal partitioning, and shared-nothing components. We discuss WHY the number of joins in the model is a MOOT POINT.