Blog | Tug's Site

Getting Started with MQTT and Java

January 2, 2017 · 4 min read

MQTT (MQ Telemetry Transport) is a lightweight publish/subscribe messaging protocol. It is widely used in Internet of Things applications because it was designed to run in remote locations on systems with a small footprint.

MQTT 3.1.1 is an OASIS standard, and you can find all the information at http://mqtt.org/

This article will guide you through the steps to run your first MQTT application:

Install and start an MQTT broker
Write an application that publishes messages
Write an application that consumes messages

The source code of the sample application is available on GitHub.
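To make the publish/subscribe idea concrete before involving a real broker, here is a minimal in-memory sketch. It is not the sample application mentioned above and it does not speak the MQTT protocol: `MiniBroker` is a hypothetical class invented for illustration, with no networking, no QoS levels, and exact topic matching only. A real application would use an MQTT client library against a running broker.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a broker; illustration only, not MQTT itself.
class MiniBroker {
    // Topic name -> handlers registered for that exact topic.
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // A consumer registers interest in a topic; it never talks to the publisher directly.
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // A publisher sends a payload to a topic; returns how many handlers received it.
    int publish(String topic, String payload) {
        List<Consumer<String>> handlers = subscribers.getOrDefault(topic, List.of());
        for (Consumer<String> handler : handlers) {
            handler.accept(payload);
        }
        return handlers.size();
    }

    public static void main(String[] args) {
        MiniBroker broker = new MiniBroker();
        broker.subscribe("sensors/temperature", p -> System.out.println("received: " + p));
        broker.publish("sensors/temperature", "21.5");
    }
}
```

The point of the pattern is the decoupling: the publisher only knows the topic name, and the broker decides who receives the message.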

Getting started with Apache Flink and MapR Streams

October 17, 2016 · 7 min read

Introduction

Apache Flink is an open source platform for distributed stream and batch data processing. Flink is a streaming dataflow engine with several APIs for building data-stream-oriented applications.

It is very common for Flink applications to use Apache Kafka for data input and output.

This article will guide you through the steps to use Apache Flink with MapR Streams. MapR Streams is a distributed messaging system for streaming event data at scale, and it is integrated into the MapR Converged Data Platform. Because MapR Streams is based on the Apache Kafka API (0.9.0), this article uses the same code and approach as the Flink and Kafka Getting Started article.

MapR Streams and Flink

Getting started with Apache Flink and Kafka

October 12, 2016 · 5 min read

Introduction

Apache Flink is an open source platform for distributed stream and batch data processing. Flink is a streaming dataflow engine with several APIs for building data-stream-oriented applications.

It is very common for Flink applications to use Apache Kafka for data input and output. This article will guide you through the steps to use Apache Flink with Kafka.

![Flink-Kafka](/images/posts/flink-kafka/flink-kafka.png)

Streaming Analytics in a Digitally Industrialized World

September 26, 2016 · 4 min read

Get an introduction to streaming analytics, which gives you real-time insight from captured events and big data. It has applications across industries, from finance to wine making, though two primary challenges must be addressed.

Did you know that a plane flying from Texas to London can generate 30 million data points per flight? As Jim Daily of GE Aviation notes, that equals 10 billion data points in one year. And we’re talking about one plane alone. So you can understand why another top GE executive recently told Ericsson Business Review that "Cloud is the future of IT," with a focus on supporting challenging applications in industries such as aviation and energy.

Setting up Spark Dynamic Allocation on MapR

September 1, 2016 · 3 min read

Apache Spark can use various cluster managers to execute applications (Standalone, YARN, Apache Mesos). When you install Apache Spark on MapR, you can submit applications in Standalone mode or using YARN.

This article focuses on YARN and Dynamic Allocation, a feature that lets Spark add or remove executors dynamically based on the workload. You can find more information about this feature in this presentation from Databricks:

Dynamic Allocation in Spark

Let’s see how to configure Spark and YARN to use dynamic allocation, which is disabled by default.
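As a rough sketch of the Spark side of that configuration, the snippet below assembles the properties typically involved and renders them in `spark-defaults.conf` form. The property names are standard Spark settings; the `DynamicAllocationConf` class, its method names, and the executor bounds are assumptions made for illustration, not values from the article. The YARN side (registering the external shuffle service on each NodeManager) is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper; the executor bounds passed in are illustrative values.
class DynamicAllocationConf {

    // Builds the Spark properties commonly set to turn on dynamic allocation.
    static Map<String, String> settings(int minExecutors, int maxExecutors) {
        if (minExecutors < 0 || maxExecutors < minExecutors) {
            throw new IllegalArgumentException("expected 0 <= minExecutors <= maxExecutors");
        }
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.dynamicAllocation.enabled", "true");
        // The external shuffle service lets executors be removed without losing shuffle files.
        conf.put("spark.shuffle.service.enabled", "true");
        conf.put("spark.dynamicAllocation.minExecutors", String.valueOf(minExecutors));
        conf.put("spark.dynamicAllocation.maxExecutors", String.valueOf(maxExecutors));
        return conf;
    }

    // Renders the properties as "key value" lines, the layout used by spark-defaults.conf.
    static String toSparkDefaults(Map<String, String> conf) {
        return conf.entrySet().stream()
                .map(e -> e.getKey() + " " + e.getValue())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(toSparkDefaults(settings(1, 10)));
    }
}
```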

Save MapR Streams messages into MapR DB JSON

March 30, 2016 · 4 min read

In this article you will learn how to create a MapR Streams Consumer that saves all the messages into a MapR-DB JSON Table.

Getting Started with MapR Streams

March 10, 2016 · One min read

A new tutorial explains how to deploy an Apache Kafka application to MapR Streams. The tutorial is available here:

Getting Started with MapR Streams

MapR Streams is a new distributed messaging system for streaming event data at scale, and it’s integrated into the MapR converged platform. MapR Streams uses the Apache Kafka API, so if you’re already familiar with Kafka, you’ll find it particularly easy to get started with MapR Streams.

Getting Started with Sample Programs for Apache Kafka 0.9

February 10, 2016 · One min read

Ted Dunning and I have worked on a tutorial that explains how to write your first Kafka application. In this tutorial you will learn how to:

Install and start Kafka
Create and run a producer and a consumer

You can find the tutorial on the MapR blog:

Getting Started with Sample Programs for Apache Kafka 0.9

Using Apache Drill REST API to build ASCII Dashboard with node

December 10, 2015 · 5 min read

Apache Drill has a hidden gem: an easy-to-use REST interface. This API can be used to query, profile, and configure the Drill engine.
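For a feel of the query part of this API, here is a minimal sketch that builds a request for Drill's `/query.json` endpoint using the JDK's built-in HTTP client (Java 11+). The host and port (`localhost:8047`, Drill's default web port), the sample query, and the `DrillRestQuery` class are assumptions for illustration. The JSON escaping is deliberately minimal and only handles quotes and backslashes.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class; assumes a Drillbit reachable at the given base URL.
class DrillRestQuery {

    // Wraps a SQL string in the JSON body expected by the query endpoint.
    static String toJsonBody(String sql) {
        String escaped = sql.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"queryType\":\"SQL\",\"query\":\"" + escaped + "\"}";
    }

    // Builds (but does not send) the POST request for <baseUrl>/query.json.
    static HttpRequest buildRequest(String baseUrl, String sql) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/query.json"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJsonBody(sql)))
                .build();
    }

    // Sends the request and returns the raw JSON response; needs a running Drill.
    static String send(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM sys.version";
        HttpRequest request = buildRequest("http://localhost:8047", sql);
        // Only print the request here so the sketch runs without a Drill instance.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(toJsonBody(sql));
    }
}
```

Calling `send(request)` against a live Drillbit returns the result set as JSON, which a dashboard can then parse and render.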

In this blog post I will explain how to use the Drill REST API to create ASCII dashboards using Blessed Contrib.

The ASCII dashboard looks like this:

Dashboard

Convert a CSV File to Apache Parquet with Drill

August 17, 2015 · 3 min read

A very common use case when working with Hadoop is to store and query simple files (CSV, TSV, ...), and then, to get better performance and more efficient storage, convert these files into a more efficient format such as Apache Parquet.

Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem. Apache Parquet has the following characteristics:

Self-describing
Columnar format
Language-independent

Let's take a concrete example. You can find many interesting Open Data sources that distribute data as CSV files (or an equivalent format). You can store them in your distributed file system and use them in your applications, jobs, and analytics queries. This is not the most efficient approach, especially when we know that the data will not change often. So instead of simply storing the CSV, let's copy this information into Parquet.

How to convert CSV files into Parquet files?

You can use code to achieve this, as you can see in the ConvertUtils sample/test class. A simpler way is to use Apache Drill, which allows you to save the result of a query as Parquet files.

The following steps will show you how to convert a simple CSV file into a Parquet file using Drill.
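The general shape of the Drill approach is a `CREATE TABLE ... AS SELECT` (CTAS) statement run with the session's `store.format` option set to `parquet`. The sketch below only generates those two SQL statements as strings. The `CsvToParquetSql` class, the target table, the file path, and the column names are hypothetical placeholders, and it assumes a headerless CSV that Drill exposes through its `columns[n]` array.

```java
// Hypothetical helper that only builds SQL text; it does not connect to Drill.
class CsvToParquetSql {

    // Tells Drill to write CTAS results as Parquet for the current session.
    static String setParquetFormat() {
        return "ALTER SESSION SET `store.format` = 'parquet'";
    }

    // Builds a CTAS statement that names each positional CSV column.
    static String ctas(String targetTable, String csvPath, String... columnNames) {
        if (columnNames.length == 0) {
            throw new IllegalArgumentException("at least one column name is required");
        }
        StringBuilder select = new StringBuilder();
        for (int i = 0; i < columnNames.length; i++) {
            if (i > 0) {
                select.append(", ");
            }
            // Drill exposes the fields of a headerless CSV row as columns[0], columns[1], ...
            select.append("columns[").append(i).append("] AS `")
                  .append(columnNames[i]).append("`");
        }
        return "CREATE TABLE " + targetTable + " AS SELECT " + select
                + " FROM dfs.`" + csvPath + "`";
    }

    public static void main(String[] args) {
        System.out.println(setParquetFormat() + ";");
        System.out.println(ctas("dfs.tmp.`sample_parquet`", "/data/sample.csv", "id", "name") + ";");
    }
}
```

The target table must live in a writable workspace; `dfs.tmp` is used here only as an example.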
