
Mehdi Bada

Introduction to Oracle Big Data Services

By Mehdi Bada, September 29, 2017, in Big Data, Oracle

For a few years now, Oracle has been moving forward in the Big Data area, as have its main competitors. The goal of this blog post is to explain how the Oracle Big Data offering is structured.

As the Oracle Big Data offering is continuously improving, I’m always open to your feedback :-)

The Oracle Big Data offering is split into 2 parts:

  • On-Premise
  • Public Cloud

Note: It is important to know that the 2 main Big Data distributions on the market are Cloudera and Hortonworks. We will see later how Oracle positions itself with respect to these 2 distributions.

On-premise:

Oracle Big Data Appliance:

The main product of the Oracle Big Data offering is the Oracle Big Data Appliance (OBDA). OBDA is an engineered system based on the Cloudera distribution. It offers an easy-to-deploy solution, with Cloudera Manager for managing the Big Data cluster and a complete, ready-to-use Hadoop ecosystem.

Oracle Big Data Appliance starts with a “Starter” rack of 6 nodes, for a storage capacity of 96TB. Below is the detailed configuration per node.

Oracle X6-2 server:

  • 2 × 22-Core Intel ® Xeon ® E5 Processors
  • 64GB Memory
  • 96TB disk space

Oracle Big Data Appliance is a combination of open source software and proprietary software from Oracle (e.g. Oracle Big Data SQL). Below is a high-level overview of the Big Data Appliance software.

[Figure: high-level overview of the Big Data Appliance software]

Oracle Big Data Cloud Machine:

On the customer side, Oracle also offers the Oracle Big Data Cloud Machine (BDCM). It is a PaaS (Platform as a Service) offering, fully managed by Oracle, hosted on the customer's premises and designed to provide the Big Data Cloud Service there. In short, the BDCM is a Big Data Appliance managed and operated by Oracle in the customer's data center.

The Big Data Cloud Machine starts with a “Starter Pack” of 3 nodes. Below is the minimal configuration:

  • 3 nodes
  • 32 OCPUs per node
  • 256GB RAM per node
  • 48TB disk space per node
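To get a feel for what the Starter Pack adds up to, the per-node figures above can be multiplied out. The following is a minimal sketch, not an official sizing tool: the names `NodeSpec` and `cluster_totals` are made up for illustration, and the usable-capacity estimate assumes the default HDFS replication factor of 3 and that all disk space goes to HDFS, which overstates what you would get in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node resources, as listed in the Starter Pack above."""
    ocpus: int
    ram_gb: int
    disk_tb: int


def cluster_totals(node: NodeSpec, node_count: int, replication_factor: int = 3) -> dict:
    """Aggregate raw resources of a cluster of identical nodes.

    replication_factor=3 is the HDFS default; it is an assumption here,
    not a figure taken from the Oracle documentation.
    """
    raw_disk_tb = node.disk_tb * node_count
    return {
        "ocpus": node.ocpus * node_count,
        "ram_gb": node.ram_gb * node_count,
        "raw_disk_tb": raw_disk_tb,
        # Rough estimate: every block is stored replication_factor times.
        "approx_usable_tb": raw_disk_tb / replication_factor,
    }


starter_pack = cluster_totals(NodeSpec(ocpus=32, ram_gb=256, disk_tb=48), node_count=3)
print(starter_pack)
```

With the Starter Pack values this gives 96 OCPUs, 768GB of RAM and 144TB of raw disk, or roughly 48TB of usable space under the replication assumption.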

Oracle Big Data Cloud Machine pricing: https://cloud.oracle.com/en_US/big-data/cloudmachine/pricing

Oracle Public Cloud:

Oracle provides several deployment options and services for Big Data:

  • Oracle Big Data Cloud Services
  • Oracle Big Data Cloud Services – Compute Edition
  • Event Hub Cloud Services (Kafka as a Service)
  • Oracle Big Data SQL Cloud Service

Oracle public cloud services, including Big Data, are available with two payment methods, metered and non-metered.

  • Metered: you are charged for the actual usage of the service resources:
    • OCPU/hour
    • Environment/hour
    • Host/hour
    • For storage: GB or TB/month
  • Non-metered: a monthly or annual subscription for a service, which does not depend on resource usage. Charging is performed monthly.
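The choice between the two payment methods comes down to a break-even calculation. Here is a small sketch of that arithmetic for the OCPU/hour case only. The rate and subscription price are purely hypothetical placeholders, not Oracle prices, and the function names are invented for this example; check the pricing pages for real figures.

```python
def metered_cost(ocpus: int, hours: float, rate_per_ocpu_hour: float) -> float:
    """Monthly cost when paying for actual OCPU usage."""
    return ocpus * hours * rate_per_ocpu_hour


def break_even_hours(ocpus: int, rate_per_ocpu_hour: float, monthly_subscription: float) -> float:
    """Hours of use per month above which the subscription becomes cheaper."""
    return monthly_subscription / (ocpus * rate_per_ocpu_hour)


def cheaper_plan(ocpus: int, hours: float, rate_per_ocpu_hour: float, monthly_subscription: float) -> str:
    """Return which payment method costs less for the given monthly usage."""
    if metered_cost(ocpus, hours, rate_per_ocpu_hour) < monthly_subscription:
        return "metered"
    return "non-metered"


# Hypothetical figures, for illustration only (not Oracle list prices).
HYPOTHETICAL_RATE = 0.5             # currency units per OCPU per hour
HYPOTHETICAL_SUBSCRIPTION = 2400.0  # currency units per month

threshold = break_even_hours(8, HYPOTHETICAL_RATE, HYPOTHETICAL_SUBSCRIPTION)
print(f"Break-even at {threshold} hours per month")
print(cheaper_plan(8, 100, HYPOTHETICAL_RATE, HYPOTHETICAL_SUBSCRIPTION))
print(cheaper_plan(8, 720, HYPOTHETICAL_RATE, HYPOTHETICAL_SUBSCRIPTION))
```

The general pattern is that a cluster used occasionally favors metered billing, while one running around the clock favors a subscription.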

For more information, you can refer to the following link:

https://blogs.oracle.com/pshuff/metered-vs-un-metered-vs-dedicated-services

Oracle Big Data Cloud Services:

OBDCS is a dedicated Big Data Appliance in the public cloud: an engineered system managed and preconfigured by Oracle. OBDCS is a large system from the start, with terabytes of storage.

The offering starts with a “Starter pack” of 3 nodes, including:

  • Platform as a Service
  • 2 payment methods: metered and non-metered
  • SSH connection to cluster nodes
  • Cloudera’s Distribution including Apache Hadoop, Enterprise Data Hub Edition
  • Oracle Big Data Connectors
  • Oracle Copy to Hadoop
  • Oracle Big Data Spatial and Graph

The entry cost is very high, which is why this service is recommended for large and more mature business cases.

Pricing information: https://cloud.oracle.com/en_US/big-data/big-data/pricing

Oracle Big Data Cloud Services – Compute Edition:

OBDCS-CE provides a dedicated Hadoop cluster based on the Hortonworks distribution. The entry cost is lower than for Oracle Big Data Cloud Service, which is why this service is more suitable for small business use cases and proofs of concept.

OBDCS-CE offering details:

  • Platform as a Service
  • 2 payment methods: metered and non-metered
  • Apache Hadoop cluster based on Hortonworks distribution
  • Free choice of the number of nodes for the deployment – 3 nodes is the minimum for a High Availability cluster, which is recommended for production. You can actually have single-node clusters, but this is obviously not recommended.
  • Apache Zeppelin for Hive and Spark analytics
  • 3 access methods:
    • BDCS-CE console (GUI)
    • REST API
    • SSH

Pricing information: https://cloud.oracle.com/en_US/big-data-cloud/pricing

Summary

On-Premise (customer side):

  • Engineered systems: Big Data Appliance (BDA); Big Data Cloud Machine (BDA managed by Oracle)
  • PaaS: Oracle Cloud Machine (OCM) + BDCS – Compute edition

Oracle Public Cloud:

  • Engineered systems: Big Data Cloud Service (BDCS) – a BDA in Oracle public cloud – Cloudera distribution
  • PaaS: Big Data Cloud Service – Compute edition – Hortonworks distribution
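The same summary can be written down as a small lookup structure, which makes it easy to filter the offerings by location or distribution. This is only an illustrative sketch: the `Offering` type and `find_offerings` helper are invented for this post, and the distribution of each entry is taken from the descriptions earlier in this article.

```python
from typing import List, NamedTuple, Optional


class Offering(NamedTuple):
    name: str
    location: str      # "on-premise" or "public cloud"
    model: str         # "engineered system" or "PaaS"
    distribution: str  # Hadoop distribution the offering is based on


# Entries mirror the summary above.
OFFERINGS = [
    Offering("Big Data Appliance (BDA)", "on-premise", "engineered system", "Cloudera"),
    Offering("Big Data Cloud Machine (BDCM)", "on-premise", "engineered system", "Cloudera"),
    Offering("Oracle Cloud Machine + BDCS - Compute Edition", "on-premise", "PaaS", "Hortonworks"),
    Offering("Big Data Cloud Service (BDCS)", "public cloud", "engineered system", "Cloudera"),
    Offering("Big Data Cloud Service - Compute Edition", "public cloud", "PaaS", "Hortonworks"),
]


def find_offerings(location: Optional[str] = None,
                   distribution: Optional[str] = None) -> List[Offering]:
    """Return the offerings matching every criterion that is not None."""
    return [
        o for o in OFFERINGS
        if (location is None or o.location == location)
        and (distribution is None or o.distribution == distribution)
    ]


for offering in find_offerings(location="public cloud", distribution="Hortonworks"):
    print(offering.name)
```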

More details about Oracle PaaS offering:

http://www.oracle.com/us/corporate/contracts/paas-iaas-public-cloud-2140609.pdf

I hope this blog post helps you better understand the Oracle Big Data offering and its products.

 
