Teradata Data Modelling

Teradata Data Modelling

An excellent session covering Teradata data modelling was delivered by the main speaker at last week's Teradata CTO roadshow in London - perhaps unsurprisingly, the Teradata CTO - Stephen Brobst.

A strong 'second string' consisted of none other than our friends Martin Willcox and James Mesney of Teradata UK. It was good to catch up with James in particular, who we hadn't run into since the inaugural Netezza user conference in Cambridge, MA in 2005. Yes, that's Netezza mentioned in a piece about Teradata - get over it ;-)

Stephen's talk covered Teradata data modelling in general, and semantic data modelling in particular. The main difficulty was keeping up with him as we attempted to tweet the good bits as they happened - they came so thick and fast it was quite a challenge!

To summarise a few of Stephen's Teradata data modelling pearls of wisdom:

•    'single version of the truth' may not be possible or even desirable

•    business value comes from data integration not the DW platform per se

•    simple beats clever

•    'non-invasive' indexes (e.g. JI based) are preferred to 'invasive' indexes that alter the base table (e.g. PPI based)

•    direct access via SQL to base tables should not be allowed - always use a full-table view (FTV)A particular Teradata data modelling favourite, and which we advised to one of Teradata's biggest customers a few years ago:

"Your business users do not want to see parity in the logical data model"

Clearly this guy knows a lot about Teradata data modelling. No surprises there.

The key 'take-away' messages relate to the logical data model (LDM), the physical data model (PDM) and the semantic data model (SDM).

Logical Data Model (LDM)

- make sure you buy an LDM, even if it's not from Teradata (seriously)

- don't re-invent the wheel, you are almost certainly more like your industry peers than you realise- the LDM is only a start point and needs further customisation to meet your requirements (probably ~25%)

The last point about LDM customisation is the one that we see violated the most frequently. Any LDM is not the final blueprint - there's still work to do folks!

Physical Data Model (PDM)

- don't let the modelling tool use the logical primary key (PK) to generate the physical primary index (PI)

- don't just hit 'generate DDL' and roll with itSurely no-one is in that much of a hurry!?!?! Sad but true, despite the wealth of Teradata physical implementation guidance out there, this happens all to often.

Semantic Data Model (SDM)- use the VIP approach: start with Views then Indexes then Physical tables- views may not give the desired performance

Whilst we agree entirely with Stephen's guidance, for most folks it's our experience that they'll have to 'go physical' with the semantic model sooner rather than later. Anything else is just delaying the inevitable.

We've yet to see a case where the semantic layer is entirely logical (i.e. consists of views only) and delivers acceptable performance.As the saying goes: your mileage may vary (YMMV).

Happy Teradata data modelling :-)