Teradata on AWS and Azure



For those who missed the news, Teradata recently announced that the Teradata DBMS is to be made available via the public cloud, initially on Amazon Web Services (AWS), and subsequently on Microsoft Azure.

The Register described the news as 'moon-on-a-stick' and in competition with Amazon's own Redshift offering.

Teradata In The Cloud, Surely Not?

'What to make of this Teradata-in-the-cloud news?' we hear you cry (or not). Well, to put things in context, let's start with a little bit of Teradata history... Back in the late 1980s, yours truly ran his first Teradata SQL query via the Teradata ITEQ client running under IBM MVS on an IBM mainframe against a newfangled Teradata DBC/1012. In these earliest incarnations Teradata was very much a proprietary affair. The eponymous database ran on Teradata's own 'Teradata Operating System' (TOS). The early Teradata DBC/1012 systems needed re-booting...a lot.

Roll forward to the early 90s and Teradata became part of NCR, itself a part of AT&T. As a result the Teradata database was ported to NCR's SVR4-derived MP-RAS version of Unix. The Teradata VPROCs (AMPs and PEs) became virtual, rather than each running on its own dedicated Intel x86 CPU. As well as virtualising the AMPs, the MPP cluster consisted of several SMP nodes. The Ynet also became the Bynet. The virtualised systems running on clustered NCR MP-RAS nodes became known as Teradata V2, with the previous non-virtual DBC/1012 systems known as Teradata V1.

The release of a single-node Teradata demo version, latterly known as Teradata Express, and a full port to Windows NT also occurred during the 1990s.

In the early 2000s, largely in response to Netezza, Teradata started to ship systems known as 'Teradata data warehouse appliances', not to be confused with the 'Teradata active enterprise data warehouse' offerings. The differences between the 'appliance' and 'enterprise' offerings were largely around the disk IO sub-system, workload management and, you guessed it, price. While Teradata appliances were designed to mitigate the Netezza competitive threat, the enterprise platforms remained the premium Teradata offering.

During the 2000s Teradata was also ported from NCR's 32-bit Unix MP-RAS to 64-bit SUSE Linux. Teradata systems with a measly 2GB-4GB RAM per node were thankfully no longer the norm.

More recently, during the 2010s, Teradata has also broadened the data platforms offered with the addition of Teradata Aster, Teradata Hadoop and Teradata Cloud (private cloud, not public cloud).

The point of the Teradata history lesson is that Teradata has a long track record of change, covering almost 30 years, some or all of which is not necessarily obvious. The latest news about upcoming Teradata availability on the public cloud AWS and Azure platforms is another part of the onward Teradata journey.

For those that are interested, Teradata's version of their own history is here.

Teradata's Cloud Journey

Teradata's existing 'cloud' offerings consist of either the not-for-production 2-AMP Teradata Express on EC2 or Teradata's private cloud, which is essentially access to a remote, shared Teradata platform, i.e. not very 'cloudy' at all. These Teradata offerings scored more marketing points than anything else.

To really claim to have a cloud offering, any DBMS vendor must surely make their product(s) available via the shared public cloud?

Well, it looks like Teradata finally agrees. The Teradata DBMS being offered on AWS and Azure is a *big* departure from the current choice of on-prem appliance, on-prem enterprise or remote shared appliance.

Teradata's Public Cloud Motivation

Teradata Stock Price

Teradata's value-add is based on capabilities like resilience, performance, functionality, scalability and workload management. The stuff Teradata folks take for granted is not always apparent, until it's not there. Rather than any one particular 'killer' feature, Teradata's holistic ability to stand up to the real-world rigours of high end data warehousing set Teradata apart from the competition. Teradata's impressive customer list can't all be wrong!

Teradata has always been a company guided by customer demand and feedback. The Teradata PAC bears witness to this. So, part of the driver for Teradata in the cloud could be from the existing Teradata customer base.

Another driver is likely to be the general 'dash to the cloud'. There is simply no point in trying to swim against that tide.

There can also be little doubt that the Teradata execs are feeling the need to put a smile back on the faces of Teradata's investors. Following strong gains of over 200% from the IPO in 2007 to a peak in 2012, the Teradata share (stock) price is down by 25% in the last 5 years, against a 70% gain for the S&P500 over the same period. The peak-to-trough decline in the last 3 years is over 60% (graph from Google Finance).

Against the competitive backdrop of ever-increasing interest in Hadoop, Teradata has, by their own standards, been on an acquisition spree over the last few years. This has yet to arrest a downward trend in the share price. Maybe making Teradata available to the mass market via AWS and Azure will have a positive impact on Teradata's share price? Only time will tell.

Teradata's Public Cloud Challenges

At first blush Teradata on a public cloud platform such as Amazon's AWS seems like a very strange fit. Teradata has historically been a low volume, high margin, premium priced offering. Amazon operate at the opposite end of the scale - pile it high, razor thin margins, sell it cheap.

It may be the case that certain Teradata features are unavailable or less capable in order to deliver a price point that works within the AWS platform. Only time will tell.

Product positioning and price issues aside, who will provide support for the potentially large upswing in Teradata users? Teradata? Maybe. Amazon? Maybe.

Teradata's own support team currently enjoy the certainty of a tightly integrated stack within which Teradata provides all of the hardware, the SUSE Linux OS (and the occasional Windows system!) and the DBMS software. There is currently also a relatively small installed Teradata user base to support. All bets are off when it comes to being able to predict demand for Teradata support once Teradata on AWS is available.

One of the key enablers for Teradata's MPP architecture, as I'm sure we all know, is the inter-node Bynet interconnect. Like all MPP systems, and Hadoop clusters for that matter, there will *always* be a need to ship data between the nodes via a high-speed, fault tolerant interconnect. Teradata has lots of value-add (and patents) in this part of the MPP stack.
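For anyone wondering why the interconnect matters so much, here is a rough sketch of the underlying idea. Rows in an MPP system are placed on a node/AMP according to a hash of a distribution column, so a join on any other column means rows have to be shipped between AMPs to bring matching values together. Everything in the sketch is an assumption for illustration only: the 4-AMP cluster size, the made-up orders table, and the use of crc32 as a stand-in for Teradata's real row hash. It is not how Teradata actually implements redistribution.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 4  # hypothetical cluster size, for illustration only


def amp_for(key, num_amps=NUM_AMPS):
    """Map a key to an AMP with a stable hash (crc32 is a stand-in, not Teradata's row hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_amps


def distribute(rows, key_column, num_amps=NUM_AMPS):
    """Place each row on the AMP chosen by hashing its distribution column."""
    placement = defaultdict(list)
    for row in rows:
        placement[amp_for(row[key_column], num_amps)].append(row)
    return placement


def rows_to_ship(rows, stored_on, join_column, num_amps=NUM_AMPS):
    """Count rows that sit on a different AMP from the one the join column hashes to."""
    moved = 0
    for row in rows:
        if amp_for(row[stored_on], num_amps) != amp_for(row[join_column], num_amps):
            moved += 1
    return moved


# Made-up table: 1000 orders spread across 7 customers, distributed by order_id.
orders = [{"order_id": i, "customer_id": i % 7} for i in range(1000)]
placement = distribute(orders, "order_id")

same_key = rows_to_ship(orders, "order_id", "order_id")      # join on the distribution column
other_key = rows_to_ship(orders, "order_id", "customer_id")  # join on a different column

print("rows per AMP:", {amp: len(rows) for amp, rows in sorted(placement.items())})
print("rows shipped when joining on order_id:", same_key)
print("rows shipped when joining on customer_id:", other_key)
```

Joining on the distribution column ships nothing, while joining on another column moves a large share of the table between AMPs. That inter-node traffic is exactly what the interconnect has to carry, quickly and without dropping anything.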

How Teradata deploys the interconnect, or some form of it, on a public cloud infrastructure will be a key determinant of whether Teradata is able to support the kind of capability taken for granted with non-cloud deployments. It is worth pointing out that Amazon's own Redshift is not noted for its ability to ship data between the nodes in a cluster.

Perhaps more notable is that Teradata confirmed to us via Twitter that the initial AWS offering will be single node only, with MPP clusters available later in 2016. There is clearly work to do in this area before Teradata MPP via the public cloud can be fully unleashed.

In summary, this is a very interesting development. Challenges lie ahead, but the more Teradata systems out there needing expert assistance from bona fide Teradata experts, the better ;-)