I’ve been asked recently whether any industry-based Data Vault models are available. Unfortunately, as of today, there are not. Because the Data Vault is generic and open-source, so to speak, I would like to think that any industry models built on it would be open-sourced as well. I understand the drive for monetary compensation and intellectual property, but I think it’s time (again) for another industry change: we, as a community, need to begin producing open-source industry data models based on Data Vault architecture for our data warehouses.
Where do we start?
I think ontologies are a great place to start. I will do some homework and research in this area and try to find suitable ontologies for us to begin with. I’ve already located at least one open-source ontology for finance: http://lsdis.cs.uga.edu/projects/meteor-s/downloads/index.php?page=2
Manufacturing ontology example: http://www.nist.gov/manufacturing-ontologies-portal.cfm
Formal shareable ontologies: http://ksl-web.stanford.edu/knowledge-sharing/ontologies/
Protégé ontologies: http://protege.cim3.net/cgi-bin/wiki.pl?protegeontologieslibrary
If you know of or can find other open ontologies that we can use, please post the links in your comments here.
There is a great discussion on using ontologies for data models here: http://virtuoso.openlinksw.com/whitepapers/open%20conceptual%20data%20models.html
You can also download Protégé, a free ontology editor from Stanford University, at: http://protege.stanford.edu/download/download.html
I will code an open-source tool that will try to turn ontologies into DDL data models, unless Protégé already has a plugin for this.
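To make the idea of turning an ontology into DDL a bit more concrete, here is a minimal sketch in Python of what such a tool might do. It is not the tool itself, and it does not read real ontology files. It assumes the ontology has already been reduced to two hypothetical structures, `OntologyClass` and `OntologyRelation`, and it assumes a simple mapping: each class becomes a hub, its attributes become a satellite, and each relation between two distinct classes becomes a link. The column names (`load_dts`, `record_source`), the `CHAR(32)` hash-key type, and the example classes are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """A concept taken from an ontology (hypothetical structure)."""
    name: str
    business_key: str
    attributes: dict = field(default_factory=dict)  # attribute name -> SQL type


@dataclass
class OntologyRelation:
    """A relationship between two distinct ontology classes (hypothetical)."""
    name: str
    source: str
    target: str


def hub_ddl(cls):
    """One hub per class, holding only the business key and load metadata."""
    n = cls.name.lower()
    return (
        f"CREATE TABLE hub_{n} (\n"
        f"    {n}_hk CHAR(32) NOT NULL PRIMARY KEY,\n"
        f"    {cls.business_key} VARCHAR(100) NOT NULL,\n"
        f"    load_dts TIMESTAMP NOT NULL,\n"
        f"    record_source VARCHAR(50) NOT NULL\n"
        f");"
    )


def satellite_ddl(cls):
    """One satellite per class, holding its descriptive attributes over time."""
    n = cls.name.lower()
    cols = "".join(f"    {col} {sql_type},\n" for col, sql_type in cls.attributes.items())
    return (
        f"CREATE TABLE sat_{n} (\n"
        f"    {n}_hk CHAR(32) NOT NULL REFERENCES hub_{n} ({n}_hk),\n"
        f"    load_dts TIMESTAMP NOT NULL,\n"
        f"{cols}"
        f"    record_source VARCHAR(50) NOT NULL,\n"
        f"    PRIMARY KEY ({n}_hk, load_dts)\n"
        f");"
    )


def link_ddl(rel):
    """One link per relation, referencing the hubs of both classes."""
    s, t = rel.source.lower(), rel.target.lower()
    name = f"{s}_{rel.name.lower()}_{t}"
    return (
        f"CREATE TABLE link_{name} (\n"
        f"    {name}_hk CHAR(32) NOT NULL PRIMARY KEY,\n"
        f"    {s}_hk CHAR(32) NOT NULL REFERENCES hub_{s} ({s}_hk),\n"
        f"    {t}_hk CHAR(32) NOT NULL REFERENCES hub_{t} ({t}_hk),\n"
        f"    load_dts TIMESTAMP NOT NULL,\n"
        f"    record_source VARCHAR(50) NOT NULL\n"
        f");"
    )


def generate_ddl(classes, relations):
    """Emit hubs first, then satellites, then links, so references resolve."""
    statements = [hub_ddl(c) for c in classes]
    statements += [satellite_ddl(c) for c in classes if c.attributes]
    statements += [link_ddl(r) for r in relations]
    return statements


if __name__ == "__main__":
    # Made-up finance example; not taken from any of the ontologies linked above.
    classes = [
        OntologyClass("Customer", "customer_number", {"full_name": "VARCHAR(200)"}),
        OntologyClass("Account", "account_number", {"currency": "CHAR(3)"}),
    ]
    relations = [OntologyRelation("holds", "Customer", "Account")]
    print("\n\n".join(generate_ddl(classes, relations)))
```

A real tool would need an actual ontology parser in front of this, plus rules for choosing business keys, which an ontology usually does not state outright. That is one reason to treat the generated DDL as a starting point only.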
Why should I participate if I can’t earn money from it?
That’s a good question, and there are many different answers to it, but my view is that the industry needs a change. It’s high time we got together as consultants and pooled our thinking to produce industry models that are open in definition. And you can still earn money from it: on the implementation side, by helping customers understand the industry DV models and implementing them properly. This is where our focus should be, I believe. It may also help with data sharing and cloud-based Data Vaults; who knows what new things could arise from this.
Where should we keep the models?
I would propose that they become SourceForge projects with GPL licenses (just like open-source software). We can also link to them from LinkedIn and all of our respective websites. Alternatively, the ontologies can (and perhaps should) be registered with Protégé, with the DDL generated directly from the ontologies. I do recommend, however, that the ontologies be used only as a starting point and that the DDL, once built, be maintained separately, but I’m open to your thoughts on the matter as well.
What are your thoughts on this matter? Please comment and let me know.