Big Data - ebay @ Teradata Partners 2011

So, a few days ago, we were talking through various consulting engagements we've had over the years...the kind of marketing collateral discussion you just can't avoid, no matter how hard you might try! 

Some of the bigger systems we've encountered are at the likes of Vodafone, Verizon, BT and Nokia. All are big in their own right, but client confidentiality precludes saying any more than that, as you might expect.

Then talk turned to ebay. Not that we've had any dealing with that system. I just remembered that I'd scribbled some notes down from the various ebay sessions at the Teradata Partners 2011 conference in San Diego.

So, here are a few snippets relating to ebay from Teradata Partners:

  • ebay use a Teradata EDW, a Teradata high capacity appliance system and a Hadoop system (‘horses for courses’)

  • the Teradata EDW is 6PB and dual-active

  • the 'Singularity' high capacity system is 40PB and consists of 256 high capacity appliance nodes

  • the Hadoop system is 20PB

  • ETL is controlled by Ab Initio and metadata-driven

  • most feeds are daily with inputs landed on disk as fixed width files

  • the Teradata loading approach is FastLoad/BTEQ

  • Teradata TD13 compression delivered a 50% IO reduction

  • maximum loading throughput is 12TB/hour

  • 50TB/day of new data is received

  • 100 trillion name/value pairs are stored in a single table

  • 100PB/day is analysed, mainly for web site optimisation
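
For a sense of scale, here's a quick back-of-envelope pass over those numbers. It's only a sketch: the decimal units (1PB = 1000TB) and the idea that the quoted figures can be compared like-for-like are my assumptions, not anything stated in the sessions.

```python
# Back-of-envelope arithmetic on the figures noted above.
# Assumptions (mine, not from the sessions): decimal units (1PB = 1000TB)
# and that the quoted figures are directly comparable.

TB_PER_PB = 1000

daily_new_data_tb = 50       # 50TB/day of new data received
max_load_tb_per_hour = 12    # 12TB/hour maximum loading throughput
singularity_pb = 40          # 'Singularity' high capacity system
singularity_nodes = 256      # high capacity appliance nodes
edw_pb = 6                   # Teradata EDW
hadoop_pb = 20               # Hadoop system
analysed_pb_per_day = 100    # data analysed per day

# Hours needed to load one day's new data at the peak loading rate
load_hours = daily_new_data_tb / max_load_tb_per_hour

# Average capacity per Singularity node
tb_per_node = singularity_pb * TB_PER_PB / singularity_nodes

# Combined capacity of the three systems, and how many times over
# that volume is analysed each day
total_pb = edw_pb + singularity_pb + hadoop_pb
scan_ratio = analysed_pb_per_day / total_pb

print(f"Load window at peak rate: {load_hours:.1f} hours")  # 4.2 hours
print(f"Capacity per node: {tb_per_node:.2f}TB")            # 156.25TB
print(f"Total capacity: {total_pb}PB")                      # 66PB
print(f"Daily analysis vs capacity: {scan_ratio:.2f}x")     # 1.52x
```

In other words, if the peak rate could be sustained, a day's new data would load in a little over four hours, and the volume analysed each day is roughly one and a half times the combined capacity of all three systems.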

In addition to the system metrics above, some words of wisdom that I noted (and agree with):

  • “Keep atomic data, it supports deep insight”

  • “Data marts are expensive chaos, which cannot be cheap enough to justify, and lead to data drift”

Ebay seems to have overtaken Walmart as the 'grand fromage' of the Teradata world. They also like to share their story, which is nice.

We're big fans of Ab Initio and the FastLoad/BTEQ approach to Teradata ETL, so it's nice to know there are like-minded folk at ebay.

100 trillion rows in a single table - I'll bet there's no FALLBACK on that baby :-)