Wednesday’s Teradata User Group (TUG) in London felt a bit like a reunion for anyone who’s ever worked at Lloyds in Manchester. They were out in droves – you know who you are – and it was great to catch up with them all. No prizes for guessing who turned up in jeans, t-shirt and trainers… you can take the boy out of Brum etc…
As expected, the team at Teradata UK put on a great selection of speakers, at a great venue just off Trafalgar Square. So, on to the event…
After we had milled around with the Lloyds crowd over coffee, Trevor Jukes (WH Smith), the current TUG chairman, opened proceedings. After a brief update from Teradata UK head-honcho Chris Armstrong, it was on with the sessions.
First off, Tom Fastner (eBay) walked us through the three-system setup at eBay that seems to be the basis for Teradata’s vision of a ‘Unified Data Architecture’. This consists of a traditional Teradata enterprise-class EDW, a cheaper Teradata or Aster appliance for ‘discovery’, and a very cheap Hadoop stack for sorting the signal from the noise amongst less well structured data, such as web logs.
What looks like a significant deviation from the ‘grand central EDW’ theme that Teradata has been promoting for many years is entirely sensible, and justifiable… even if ‘unified’ is perhaps not the best way to describe lots of boxes, data flows, and inevitable data duplication. The main driver for the move away from the single EDW platform is the famous “Big Data three Vs” - the volume, variety and velocity of the data being produced by the ever-increasing digitisation of our lives in this interwebs-connected world. I wonder if Tim Berners-Lee was an early stage Teradata investor?
The user playpens/sandpits at eBay are still known as ‘virtual data marts’, which I would describe as ‘physical virtual data marts’… but that’s another story ;-)
Following on from Tom, Jeff Peckham (Wells Fargo) talked about the Aster POC that Wells have been running. Jeff has been in data warehousing a long time… in fact he was the DBA at Bank of America when I was on my very first freelance engagement way back in ’92 - anyone else remember accessing Teradata via ITEQ on an IBM machine running green screens on VM/CMS? Yuk!
It sounded like the biggest challenge Jeff faced was to gain access to the data centre to land the Aster box. Banks eh! The theme that emerged, and continued later in the day, was that the 50 or so SQL-MapReduce functions that ship with Aster enabled time series analysis/graph analysis etc. that simply wouldn’t be feasible using SQL, mainly due to the iterations and brute force processing that would be required. Anyway, it sounds like the Aster POC at Wells Fargo was a great success.
The Teradata CTO, Stephen Brobst, followed on from Jeff. I’ve sat and listened to Stephen more times than I can remember. He’s always got something interesting to say and can be relied on to keep the audience engaged, and not just due to his choice of shirts! The one he wore in London was disappointingly subdued, but never mind.
The main theme of Stephen’s talk was how the ‘old’ (SQL/database) and ‘new’ (NoSQL/Hadoop) worlds can and should co-exist, building on the earlier message along the same lines from eBay and Wells Fargo. The way Stephen describes the Silicon Valley stand-off between the ‘database old fogies’ and ‘Hadoop dotcommers’ is truly hilarious. No prizes for guessing that Stephen supports both camps, as we would have expected.
We’ve long held the view that the MapReduce paradigm can be summarised as ‘parallel processing for Java programmers’. That’s no bad thing given that the software is free and can be run on ‘cheap tin’. Of all the enterprise features MapReduce lacks, the key one for us is the SQL query optimiser we take for granted in systems like Teradata.
Stephen nailed it for us when he told the audience: “With MapReduce, you are the optimiser”.
That, folks, is why we think MapReduce/Hadoop will never go truly mainstream in its current form. If analytic applications have to be hand-coded and optimised to run on a cluster of servers using non-declarative languages like Java, Hadoop will remain the preserve of the dotcommers who created it in the first place. They're not wrong in any way. It's just *way* too hard for mere mortals.
The simple fact is, SQL is here to stay in the analytic mainstream. SQL developers on the whole simply don’t care what goes on under the covers. How the system runs the query is the optimiser’s job, and so it will remain. The power of a good SQL optimiser should become increasingly obvious to all over the next few years.
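To make the “you are the optimiser” point concrete, here is a minimal sketch in Python. It is not Hadoop code and nothing in it was shown at the event: the data, the table and the function names are all made up for illustration, and SQLite simply stands in for ‘a database with a query planner’. The first function says what result is wanted and leaves the how to the engine. The second spells out the map, shuffle and reduce steps by hand, which is the work a MapReduce developer takes on.

```python
import sqlite3
from collections import defaultdict

# Hypothetical page-view records: (visitor, page, bytes_sent)
rows = [
    ("alice", "/home", 120),
    ("bob", "/home", 80),
    ("alice", "/basket", 300),
    ("bob", "/checkout", 50),
    ("alice", "/home", 100),
]

def total_bytes_sql(rows):
    """Declarative: describe the result; the engine chooses the plan."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE views (visitor TEXT, page TEXT, bytes_sent INTEGER)"
    )
    con.executemany("INSERT INTO views VALUES (?, ?, ?)", rows)
    cursor = con.execute(
        "SELECT visitor, SUM(bytes_sent) FROM views GROUP BY visitor"
    )
    result = dict(cursor)  # each result row is a (visitor, total) pair
    con.close()
    return result

def total_bytes_mapreduce(rows):
    """Hand-coded: the developer decides every step of the execution."""
    # Map: emit a (key, value) pair for each input record.
    mapped = [(visitor, bytes_sent) for visitor, _page, bytes_sent in rows]
    # Shuffle: group values by key (on a cluster, this moves data between nodes).
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Reduce: collapse each group to a single value.
    return {key: sum(values) for key, values in groups.items()}

print(total_bytes_sql(rows))
print(total_bytes_mapreduce(rows))
```

Both functions give the same totals, but only the second one leaves join order, data distribution and every other tuning decision to whoever wrote it.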
Having recently followed Professor Marcus du Sautoy onto the stage at a speaking engagement, I didn’t envy Martin Willcox following on from Stephen. Poor chap. Martin is Teradata’s EMEA head of platform and solution marketing. His unofficial role appears to be Stephen’s wingman when he’s over this side of the pond. Nice work if you can get it etc… Martin has recently been discussing the ins and outs of ‘in memory databases’ with the folks over at SAP, on which we’ve commented.
Martin gave a pretty detailed insight into Teradata’s development roadmap, with no commitment to any timescales – more a tease than a schedule, if you will. Very interesting all the same. Until we know which bits are NDA-free we’ll keep it zipped for now.
The final session we attended was Mike Whelan’s live Aster demo. Everyone knows live demos are risky, don’t they? Well, sadly, due to technology ‘challenges’, all Mike managed to demonstrate was what all sales folks know already – it always works in PowerPoint!!! Better luck next time, Mike.
In a similar vein to the Wells Fargo POC, the main focus of Mike’s talk/demo was the Aster ‘nPath’ capability, which does the kind of ordered path analysis that is so hard to achieve in plain SQL. When asked which other functions are also useful, Mike quipped ‘all of them’. Priceless.
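For anyone wondering what path analysis actually involves, here is a rough Python sketch of the general idea: order each visitor’s clicks by time, then look for a given sequence of pages. To be clear, this is not Aster’s nPath syntax or implementation, and it is not what Mike demonstrated. The click data and the function name are invented for illustration, and the pattern must match consecutive pages, which is a simplifying assumption.

```python
from collections import defaultdict

# Hypothetical clickstream records: (visitor, timestamp, page), in arrival order
clicks = [
    ("alice", 1, "home"), ("alice", 2, "basket"), ("alice", 3, "checkout"),
    ("bob", 1, "home"), ("bob", 2, "search"), ("bob", 3, "basket"),
    ("carol", 2, "basket"), ("carol", 1, "home"),
    ("carol", 4, "checkout"), ("carol", 3, "help"),
]

def visitors_matching_path(clicks, pattern):
    """Return visitors whose time-ordered clicks contain `pattern` consecutively."""
    # Partition by visitor and order by timestamp within each partition.
    sessions = defaultdict(list)
    for visitor, _ts, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        sessions[visitor].append(page)

    # Slide a window of len(pattern) over each visitor's ordered pages.
    n = len(pattern)
    matches = []
    for visitor, pages in sessions.items():
        if any(pages[i:i + n] == pattern for i in range(len(pages) - n + 1)):
            matches.append(visitor)
    return matches

print(visitors_matching_path(clicks, ["home", "basket", "checkout"]))
```

The row-by-row, order-dependent matching is the awkward part to express in set-based SQL, which tends to end up as a pile of self-joins, one per step in the path.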
As this was a techy session, a show of hands revealed most of the audience to be SQL developers. Only a couple admitted to also knowing Java. No surprises there at all. What Mike hit upon is the other side of the Hadoop adoption issue - Java programmers are rarely developing analytic applications, and few have experience of developing on large-scale clusters. Java + analytics + clusters is a very small set of folks indeed. Analytic folks and tools know SQL, and mainly only SQL, plain and simple.
Anyway, it's beer o'clock and that’s more than enough for a blog article…I’ll finish off by thanking everyone at Teradata and all of the speakers for a great event. Well done to all.