Connecting Kafka and PubNub

Darryn Campbell on Mar 6, 2024

Kafka is fast becoming the de facto event broker for event-driven architecture (EDA) development. EDA is helping accelerate software development, generate real-time functionality and insights, and decouple monolithic systems.

But Kafka is not designed to connect directly with edge devices, like smartphones, browsers, IoT devices, and laptops. That’s where PubNub comes in. PubNub is a complete platform for building and managing real-time apps that run on edge devices that Kafka can’t reach. Luckily, our customers can have the best of both worlds, powering their real-time apps with data that comes directly from their systems (where events are already streaming through Kafka) and connecting to their edge devices using PubNub.

Also, many of our customers will stream events from edge devices back into Kafka, with optional filtering, aggregation, and other pre-processing needed “at the edge” (i.e., within PubNub’s network) before those events are sent to Kafka.

What is the PubNub Kafka Bridge?

The PubNub Kafka Bridge lets you move events between your Kafka deployment and PubNub in both directions: our new Sink Connector sends Kafka events into PubNub, and our Kafka Action sends PubNub events out to Kafka. The Kafka Bridge supports Kafka deployed on-premises, on Amazon Managed Streaming for Apache Kafka (MSK), or on Confluent Cloud.

The ready-made Kafka Sink Connector integrates directly with your Kafka deployment, enabling real-time event distribution from Kafka to edge devices, including web browsers, mobile apps, and IoT sensors. 

The fully managed Kafka Action, accessible within our Events & Actions service, facilitates streaming events from edge devices back into Kafka, offering optional filtering, aggregation, and other pre-processing capabilities needed “at the edge.”

If you just want to know how to configure the PubNub Kafka Bridge, feel free to skip straight to the end of this article or check out our dedicated Kafka documentation.

Why use PubNub + Kafka for your Event Driven Architecture (EDA)?

Long story short: Kafka is the most popular solution to power event-driven architecture inside the data center. PubNub does not replace Kafka but extends it outside of the data center to end-user devices and other data centers. 

But what does that mean in practice?

Many large companies have moved to an event-driven architecture to improve the management and scalability of their systems and to deliver a more responsive experience to customers. EDA uses events to communicate between back-end services and/or end-user devices. Think of an event as a change in system state, like a new order being placed through an e-commerce website or a status update received from a field worker. These events are published onto an event bus and then routed to anyone who has subscribed to receive them.

Events, Event Bus, and Subscribers are generic terms, and you might see different terms used to describe functionally similar concepts. For example, the “Event Bus” is variously known as the “Message Bus,” “Event Broker,” or “Pub/Sub System,” depending on your source or industry.
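To make the publish/subscribe flow concrete, here is a minimal in-memory sketch in Python. It is a toy illustration only and is not how Kafka or PubNub are implemented; the EventBus class, the "orders" topic, and the event fields are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = dict
Handler = Callable[[Event], None]


class EventBus:
    """Toy in-memory event bus: routes each published event to its subscribers."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers subscribed to it.
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every event on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        """Deliver the event to every handler on the topic; return the delivery count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
received: List[Event] = []

# A subscriber that simply records each event it is given.
bus.subscribe("orders", received.append)

# A state change (a new order) is published as an event.
bus.publish("orders", {"type": "order_placed", "order_id": 1001})
```

The publisher never references the subscriber directly; it only knows the topic name. That decoupling is the property a real event bus provides at scale.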

Why Kafka?

Kafka is a reliable, open-source EDA solution that 80% of Fortune 100 companies use. 

So, why use PubNub with Kafka?

While Kafka is used and deployed within the customer’s data center, PubNub extends the event-driven concepts of EDA outside of the data center to the network edge, including to web browsers, mobile apps, and IoT devices. Both Kafka and PubNub are designed with scale in mind, so when used together they can connect millions of end-user devices outside the firewall with hundreds of backend servers within a secure network.

Additionally, using a single service such as PubNub to communicate with your edge devices can simplify your certification process. You no longer have to perform a vendor check for every service or device you exchange events with; you only need to do a single vendor check, with PubNub.

Think of PubNub as your “global event bus” or message bus, enabling you to expand your Kafka EDA deployment from centralized data centers to a broad range of edge devices.  PubNub already supports all event bus features, such as Pub/Sub messaging for real-time event streaming, enhanced security, persistence to retrieve past messages, and on-the-fly event processing.  

  • Are you generating Kafka events that you would like to see at your network edge?  You can achieve that with PubNub.

  • Are you gathering data at the network edge, for example, from users interacting with your mobile or web application?  You can send these events to your Kafka instance using PubNub.

PubNub + Kafka: EDA use cases

Let’s look at some use cases that can be achieved using a combination of PubNub and Kafka:

  • Any Kafka event you are already generating within your system can be streamed in real-time directly to applications, including JavaScript apps running in the browser.  You can filter and process the Kafka events before they are published, ensuring that only important and relevant events are sent.

  • Extend your Kafka events to the network edge in a secure way that doesn’t require specialized custom gateways to be running within your environment.

  • Connect heterogeneous systems with ease: PubNub has more than 50 SDKs and is built from the ground up with security in mind, so interoperability will not be a concern.

  • Build apps that stream real-time events directly to each other (like large chat rooms, device locations, sensor readings, etc.) while still ensuring all events stream simultaneously into your Kafka broker for analysis, archiving, and more.

  • Bridge multiple data centers or hybrid data center deployments without setting up complex VPC environments.

  • Syndicate data streams to business partners, exposing a secure streaming API based on PubNub, with a zero-trust security model that allows you to filter, configure, and disable event streams with ease.

Configuring the PubNub Kafka Bridge

The Kafka Bridge is PubNub’s umbrella term for all Kafka/PubNub integration and comprises two separate solutions.

  1. The “PubNub Kafka Sink Connector” is used to send Kafka events into PubNub.  We provide a custom Kafka Sink Connector that allows PubNub to receive Kafka events. 

  2. The “PubNub Kafka Action,” which is part of our Events & Actions product, enables you to send PubNub events into Kafka.

As a prerequisite to any configuration steps detailed here, you will need to sign up to PubNub and obtain your API keys (publish_key and subscribe_key).

Configuring the PubNub Kafka Sink Connector

To better understand how to install, configure, and test the PubNub Kafka Sink connector, I would strongly recommend this earlier blog post, ‘Installing the PubNub Kafka Sink Connector,’ written by the developer of the connector.

You can also find additional documentation on our PubNub Kafka Sink Connector docs page.

In summary: the PubNub Kafka Sink Connector is available on GitHub at https://github.com/pubnub/pubnub-kafka-sink-connector.

If you just want to test the connector to see how it works, then a Docker image is provided within the repository, which will spin up a Kafka instance, install dependencies, and emit test events consisting of a timestamp parameter.  After enabling the connector using a POST call, you will see PubNub events being received with low latency that match Kafka test events. 

To use the sink connector in production, a Maven command can be used to build the .jar file from the source, which can then be copied into the appropriate directory (depending on your Kafka Connect setup) and the connector configured accordingly.

Please see the documentation page for the complete connector configuration, but all deployments will need, at minimum:

  • The Kafka topic or topics that you want to send to PubNub

  • The PubNub keys and ID that Kafka should use when publishing to PubNub 

  • Additional options for serialization and error handling.
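As a rough illustration of the minimum configuration listed above, the Python sketch below builds a connector payload and shows how it could be submitted to Kafka Connect's REST interface (POST /connectors). The "topics", "tasks.max", and "errors.tolerance" keys are standard Kafka Connect properties. The connector name, the connector class, and the "pubnub.*" keys are assumptions made for illustration and should be checked against the connector documentation before use.

```python
import json
import urllib.request


def build_connector_payload(topics, publish_key, subscribe_key, user_id):
    """Build the JSON body for Kafka Connect's POST /connectors endpoint."""
    return {
        "name": "pubnub-sink-connector",  # hypothetical connector name
        "config": {
            # Standard Kafka Connect sink properties.
            "topics": ",".join(topics),
            "tasks.max": "1",
            "errors.tolerance": "all",
            # ASSUMPTION: the class name and pubnub.* keys below are
            # illustrative; confirm them in the connector documentation.
            "connector.class": "com.pubnub.kafka.connect.PubNubKafkaSinkConnector",
            "pubnub.publish_key": publish_key,
            "pubnub.subscribe_key": subscribe_key,
            "pubnub.user_id": user_id,
        },
    }


def register_connector(connect_url, payload):
    """POST the payload to a Kafka Connect worker (defined but not called here)."""
    request = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder keys and topic names; substitute your own values.
payload = build_connector_payload(
    ["orders", "shipments"], "pub-key", "sub-key", "kafka-bridge"
)
print(json.dumps(payload, indent=2))
```

With a Kafka Connect worker running, register_connector("http://localhost:8083", payload) would submit the configuration; the host and port here are placeholders for your own Connect endpoint.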

Configuring the PubNub Kafka Action

Events & Actions is a PubNub service that allows you to send real-time data to third-party systems for storage, processing, or analytics.  Events & Actions works by letting you specify PubNub events, for example, a message being published to a specific PubNub channel or a user with a particular ID publishing a message.  For each event, you can specify which Actions should take place whenever the event is triggered. For example, you might want to trigger a Webhook, send your data to Amazon Kinesis, or (as you might have guessed from the topic of this article) send your data to Apache Kafka.

As a quick reference to the documentation, the steps are:

  1. Define the PubNub event or events that represent the data you want to send to Kafka.  This is also the step that allows you to filter for events that match specific patterns.

  2. Configure the Kafka action with details of your Kafka endpoint.  This could be self-hosted, an Amazon MSK cluster, or Confluent Cloud.

Example configuring the PubNub Kafka Action

By selecting ‘Events & Actions’ from the PubNub admin portal, you can specify both an event listener and an action.

Select ‘+ Add Event Listener’ and choose an Event source.

Let’s assume we want to export all messages received on the channel named "send-to-kafka". The event configuration will look something like this:

Next, create an Apache Kafka action.

The Kafka action documentation explains each field in the action configuration. The following screenshot matches the configuration for:

  • Kafka topic: topic_0

  • Kafka topic key: stamford

  • Authentication: SCRAM-SHA-256

  • URL of the Kafka cluster to receive events: 574mf0rd-bridge.ldn-west:1905

  • Failed attempts will retry twice, with an interval calculated using a formula based around 450 seconds 
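To exercise a setup like the one above, you need a message published on the "send-to-kafka" channel. The Python sketch below builds a publish request against PubNub's REST API using only the standard library. The URL layout, the "uuid" query parameter, the keys, the user ID, and the message fields are assumptions or placeholders for illustration; confirm the details in the PubNub REST API documentation, or use one of the PubNub SDKs instead.

```python
import json
import urllib.parse
import urllib.request


def build_publish_url(publish_key, subscribe_key, channel, message, user_id):
    """Build a PubNub REST publish URL.

    ASSUMPTION: the path layout and the uuid parameter follow PubNub's REST
    publish endpoint; verify against the REST API documentation.
    """
    encoded_channel = urllib.parse.quote(channel, safe="")
    # The message is sent as URL-encoded JSON in the final path segment.
    encoded_message = urllib.parse.quote(json.dumps(message), safe="")
    query = urllib.parse.urlencode({"uuid": user_id})
    return (
        f"https://ps.pndsn.com/publish/{publish_key}/{subscribe_key}"
        f"/0/{encoded_channel}/0/{encoded_message}?{query}"
    )


def publish(url):
    """Send the publish request (defined but not called here)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Placeholder keys and a hypothetical sensor reading.
url = build_publish_url(
    "pub-key",
    "sub-key",
    "send-to-kafka",
    {"sensor": "s1", "temp": 21.5},
    "edge-device-1",
)
print(url)
```

If the event listener and Kafka action are configured as described, a message published this way should be forwarded to the configured Kafka topic (topic_0 in this example).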

In Summary

If you are already using Kafka in your organization, you can enhance the real-time capabilities of devices at your network edge using PubNub.  This means that events happening on your Kafka event bus can be distributed to your clients regardless of the platform, for example, a web browser using JavaScript or a mobile device running iOS or Android.  You can also ingest data from your client devices into Kafka in real-time over the PubNub network without any complex setup, scalability, or data security concerns.

To learn more, check out our dedicated Kafka Bridge developers page or our Kafka integration documentation.  If you have any questions about setup, then please contact our support team or our developer relations team by emailing devrel@pubnub.com.