Tug’s Blog

My journey in Big Data, Hadoop, NoSQL and MapR

Getting Started With Kafka REST Proxy for MapR Streams

| Comments

Introduction

MapR Ecosystem Package 2.0 (MEP) is coming with some new features related to MapR Streams:

MapR Ecosystem Packs (MEPs) are a way to deliver ecosystem upgrades decoupled from core upgrades - allowing you to upgrade your tooling independently of your Converged Data Platform. You can lean more about MEP 2.0 in this article.

In this blog we describe how to use the REST Proxy to publish and consume messages to/from MapR Streams. The REST Proxy is a great addition to the MapR Converged Data Platform allowing any programming language to use MapR Streams.

The Kafka REST Proxy provided with the MapR Streams tools, can be used with MapR Streams (default), but also used in a hybrid mode with Apache Kafka. In this article we will focus on MapR Streams.

Getting Started With MQTT and Java

| Comments

MQTT (MQ Telemetry Transport) is a lightweight publish/subscribe messaging protocol. MQTT is used a lot in the Internet of Things applications, since it has been designed to run on remote locations with system with small footprint.

The MQTT 3.1 is an OASIS standard, and you can find all the information at http://mqtt.org/

This article will guide you into the various steps to run your first MQTT application:

  1. Install and Start a MQTT Broker
  2. Write an application that publishes messages
  3. Write an application that consumes messages

The source code of the sample application is available on GitHub.

Getting Started With Apache Flink and Mapr Streams

| Comments

Introduction

Apache Flink is an open source platform for distributed stream and batch data processing. Flink is a streaming data flow engine with several APIs to create data streams oriented application.

It is very common for Flink applications to use Apache Kafka for data input and output.

This article will guide you into the steps to use Apache Flink with MapR Streams. MapR Streams is a distributed messaging system for streaming event data at scale, and it’s integrated into the MapR Converged Data Platform, based on the Apache Kafka API (0.9.0), this article use the same code and approach than the Flink and Kafka Getting Started.

MapR Streams and Flink.

Getting Started With Apache Flink and Kafka

| Comments

Introduction

Apache Flink is an open source platform for distributed stream and batch data processing. Flink is a streaming data flow engine with several APIs to create data streams oriented application.

It is very common for Flink applications to use Apache Kafka for data input and output. This article will guide you into the steps to use Apache Flink with Kafka.

Streaming Analytics in a Digitally Industrialized World

| Comments

Get an introduction to streaming analytics, which allows you real-time insight from captured events and big data. There are applications across industries, from finance to wine making, though there are two primary challenges to be addressed.

Did you know that a plane flying from Texas to London can generate 30 million data points per flight? As Jim Daily of GE Aviation notes, that equals 10 billion data points in one year. And we’re talking about one plane alone. So you can understand why another top GE executive recently told Ericsson Business Review that “Cloud is the future of IT,” with a focus on supporting challenging applications in industries such as aviation and energy.

Setting Up Spark Dynamic Allocation on MapR

| Comments

Apache Spark can use various cluster manager to execute application (Stand Alone, YARN, Apache Mesos). When you install Apache Spark on MapR you can submit application in a Stand Alone mode or using YARN.

This article focuses on YARN and Dynamic Allocation, a feature that lets Spark add or remove executors dynamically based on the workload. You can find more information about this feature in this presentation from Databricks:

Let’s see how to configure Spark and YARN to use dynamic allocation (that is disabled by default).

Getting Started With MapR Streams

| Comments

You can find a new tutorial that explains how to deploy an Apache Kafka application to MapR Streams, the tutorial is available here:

MapR Streams is a new distributed messaging system for streaming event data at scale, and it’s integrated into the MapR converged platform. MapR Streams uses the Apache Kafka API, so if you’re already familiar with Kafka, you’ll find it particularly easy to get started with MapR Streams.