Welcome!

Get Cloud Ready!

Janakiram MSV

Subscribe to Janakiram MSV: eMailAlertsEmail Alerts
Get Janakiram MSV via: homepageHomepage mobileMobile rssRSS facebookFacebook twitterTwitter linkedinLinkedIn


Blog Feed Post

Apache Kafka: A Primer

Apache Kafka is fast becoming the preferred messaging infrastructure for dealing with contemporary, data-centric workloads such as Internet of Things, gaming, and online advertising. The ability to ingest data at a lightening speed makes it an ideal choice for building complex data processing pipelines. In a previous article, we discussed how Kafka acts as the gateway for IoT sensor data for processing hot path and cold path analytics.

In this article, we will introduce the core concepts and terminology of Apache Kafka along with the high-level architecture.

Why Kafka?

Before the introduction of Apache Kafka, Message Oriented Middleware (MOM) such as Apache QpidRabbitMQMicrosoft Message Queue, and IBM MQ Series were used for exchanging messages across various components. While these products are good at implementing the publisher/subscriber pattern (Pub/Sub), they are not specifically designed for dealing with large streams of data originating from thousands of publishers. Most of the MOM software have a broker that exposes Advanced Message Queuing Protocol (AMQP) protocol for asynchronous communication.

Kafka is designed from the ground up to deal with millions of firehose-style events generated in rapid succession. It guarantees low latency, “at-least-once”, delivery of messages to consumers. Kafka also supports retention of data for offline consumers, which means that the data can be processed either in real-time or in offline mode.

The fundamental difference between Message Oriented Middleware and Kafka is that the clients will never receive messages automatically. They have to explicitly ask for a message when they are ready to handle.

Expanding further on the persistence and retention, Kafka is designed to be a distributed commit log. Much like relational databases, it can provide a durable record of all transactions that can be played back to recover the state of a system. The key thing to understand is that the data is stored durably in an order that can be read deterministically. Due to the distributed design, Kafka provides redundancy, which ensures high availability of data even when one of the servers faces disruption.

Read the entire article at The New Stack

is an analyst, advisor, and architect. Follow him on Twitter,  Facebook and LinkedIn.

Read the original blog entry...

More Stories By Janakiram MSV

Janakiram MSV heads the Cloud Infrastructure Services at Aditi Technologies. He was the founder and CTO of Get Cloud Ready Consulting, a niche Cloud Migration and Cloud Operations firm that recently got acquired by Aditi Technologies. In his current role, he leads a highly talented engineering team that focuses on migrating and managing applications deployed on Amazon Web Services and Microsoft Windows Azure Infrastructure Services.
Janakiram is an industry analyst with deep understanding of Cloud services. Through his speaking, writing and analysis, he helps businesses take advantage of the emerging technologies. He leverages his experience of engaging with the industry in developing informative and practical research, analysis and authoritative content to inform, influence and guide decision makers. He analyzes market trends, new products / features, announcements, industry happenings and the impact of executive transitions.
Janakiram is one of the first few Microsoft Certified Professionals on Windows Azure in India. Demystifying The Cloud, an eBook authored by Janakiram is downloaded more than 100,000 times within the first few months. He is the Chief Editor of a popular portal on Cloud called www.CloudStory.in that covers the latest trends in Cloud Computing. Janakiram is an analyst with the GigaOM Pro analyst network where he analyzes the Cloud Services landscape. He is a guest faculty at the International Institute of Information Technology, Hyderabad (IIIT-H) where he teaches Big Data and Cloud Computing to students enrolled for the Masters course. As a passionate speaker, he has chaired the Cloud Computing track at premier events in India.
He has been the keynote speaker at many premier conferences, and his seminars are attended by thousands of architects, developers and IT professionals. His sessions are rated among the best in every conference he participates.
Janakiram has worked at the world-class product companies including Microsoft Corporation, Amazon Web Services and Alcatel-Lucent. Joining as the first employee of Amazon Web Services in India, he was the AWS Technology Evangelist. Prior to that, Janakiram spent 10 years at Microsoft Corporation where he was involved in selling, marketing and evangelizing the Microsoft Application Platform and Tools.