I’ve had a fair number of conversations with our customers to develop PubNub case studies, and I’ve deduced a common theme embedded in each of their stories. Most of them have first experimented with building a real-time data stream network infrastructure in-house and have run into a series of common roadblocks. This blog post will categorize the issues you may run into when building a real-time data stream infrastructure.
Defining Build vs. Buy
Before we dive into this blog post, it’s important first to define what it means to build and what it means to buy.
Build: Building a data stream network on your own means taking a protocol, tool, or framework that already exists in an open source community (like Socket.IO, SignalR, etc.), then installing and orchestrating that open source service on physical or virtualized hardware while maintaining platform enhancements yourself.
Buy: Buying means installing an easy-to-set-up SDK (JavaScript, Python, Ruby, C#, etc.) that is pre-built to connect to a vendor’s globally scaled real-time data stream network. These vendors differ greatly in their services, but their goal is to take care of the heavy lifting for you, such as scaling, reliability, replication, maintenance, analytics, developer tooling, testing, security, etc.
Why you might want to build a WebSocket solution yourself
We love, respect, and embrace the spirit of DIY innovation. After all, that’s how PubNub was born. Installing and operating a socket server for the first time is fun and fulfilling. However, the DIY solution typically fails as scale increases due to orchestration challenges.
The purpose of this blog post is not to discourage building a real-time data stream network on your own. Rather, it’s to point out some of the issues you may face if you build it yourself and share how many developers have gotten past those issues using a real-time data stream network service provider.
One of the first things you’ll probably consider when deciding whether to build or buy is upfront cost: how much will it cost to build it yourself from scratch versus buying it from a real-time data stream service provider? But what matters is not just the initial cost; it’s the total cost down the road. Here are a few things to consider when estimating those initial and down-the-road costs.
Problems you may face when building it yourself
There’s not much code you have to write to set up a real-time system other than some ops code (real-time open-source protocols and frameworks are pretty streamlined these days). When you start with an open-source protocol or framework, you can perform basic real-time functionality on your socket-type broadcast server. These functions include publish/subscribe, unicast, and broadcast messaging.
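To make those three messaging functions concrete, here is a minimal in-memory sketch. It is not how Socket.IO, SignalR, or PubNub implement them; the `Broker` class and its method names are hypothetical, and plain callbacks stand in for real socket connections.

```python
from collections import defaultdict
from typing import Callable, Dict, Set

Handler = Callable[[str], None]


class Broker:
    """Toy in-memory broker (hypothetical; callbacks stand in for sockets)."""

    def __init__(self) -> None:
        self.clients: Dict[str, Handler] = {}  # client id -> delivery callback
        self.channels: Dict[str, Set[str]] = defaultdict(set)  # channel -> client ids

    def connect(self, client_id: str, handler: Handler) -> None:
        self.clients[client_id] = handler

    def subscribe(self, client_id: str, channel: str) -> None:
        if client_id not in self.clients:
            raise KeyError(f"unknown client: {client_id}")
        self.channels[channel].add(client_id)

    def publish(self, channel: str, message: str) -> int:
        """Publish/subscribe: deliver to every subscriber of one channel."""
        subscribers = self.channels.get(channel, set())
        for client_id in subscribers:
            self.clients[client_id](message)
        return len(subscribers)

    def unicast(self, client_id: str, message: str) -> bool:
        """Unicast: deliver to exactly one connected client."""
        handler = self.clients.get(client_id)
        if handler is None:
            return False
        handler(message)
        return True

    def broadcast(self, message: str) -> int:
        """Broadcast: deliver to every connected client, subscribed or not."""
        for handler in self.clients.values():
            handler(message)
        return len(self.clients)
```

A single process like this is the easy part. The sections that follow are about what it takes to keep the same behavior working across many servers and regions.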
But building out a fully functional data stream network on your own isn’t as easy or cheap as you may think. There are a number of considerations and costs that need to be taken into account when building it yourself. It’s important to weigh the total cost of ownership for your solution.
Engineering & Operations Setup Time:
Time is money, yet time is an often overlooked cost of building a real-time infrastructure on your own. The days you and your development team put into building infrastructure add up quickly. It’s hard labor; you’re lifting the shovel and scooping the dirt out.
Maintenance and Orchestration for WebSocket Servers:
Maintaining and orchestrating a data stream network is an entire job in itself. This includes ensuring software and versions are always current, compatible, and tested with the hardware you’re running on. Socket-based services require heavy orchestration behind the scenes at scale, and they require a dedicated individual or team to ensure your network is reliable, secure, and up to date.
WebSocket Server Costs:
Once you begin building a user base and are no longer running your app on your local network, you’ll need hosted servers from a provider such as Amazon, Softlayer, Joyent, Azure, or Rackspace. This means you’re spending money on servers early in your build process. As you scale out your infrastructure, both server costs and maintenance requirements grow with it.
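The setup time, maintenance, and server costs described so far can be combined into a rough total-cost-of-ownership estimate. The sketch below is only one simple way to do that arithmetic. The `BuildCosts` fields are hypothetical names, every number in the usage note is a placeholder rather than real pricing, and the model assumes flat monthly costs even though server spend usually rises with scale.

```python
from dataclasses import dataclass


@dataclass
class BuildCosts:
    """Inputs for a rough build-it-yourself estimate (all fields hypothetical)."""

    setup_engineer_days: float        # one-time engineering and ops setup
    engineer_day_rate: float          # cost of one engineer-day
    monthly_maintenance_hours: float  # ongoing maintenance and orchestration
    engineer_hourly_rate: float       # cost of one engineer-hour
    monthly_server_cost: float        # hosting bill, assumed flat here


def total_cost_of_ownership(costs: BuildCosts, months: int) -> float:
    """Upfront setup cost plus recurring costs over the given number of months."""
    upfront = costs.setup_engineer_days * costs.engineer_day_rate
    recurring = (
        costs.monthly_maintenance_hours * costs.engineer_hourly_rate
        + costs.monthly_server_cost
    )
    return upfront + recurring * months
```

For example, with placeholder inputs of 30 setup days at 800 per day, 40 maintenance hours per month at 100 per hour, and 1,500 per month in servers, `total_cost_of_ownership(BuildCosts(30, 800, 40, 100, 1500), 24)` comes to 24,000 upfront plus 5,500 per month over 24 months, or 156,000 in total. Plug in your own figures; the point is that the recurring term quickly dominates the upfront one.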
Security and WebSocket SSL:
Security is paramount no matter what type of real-time app you’re developing. Messages, transactions, and streaming data must be kept private end to end. To ensure this, your real-time infrastructure must include encryption and access control options for all stream data.
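As an illustration of the access control half of that requirement, here is a minimal sketch of signed, expiring channel tokens using Python’s standard `hmac` module. It is not any vendor’s actual scheme. The token layout, the `SECRET_KEY` value, and the function names are assumptions made for this example, and it assumes client and channel names contain no colons. Encrypting the traffic itself (for example, WebSockets over TLS) is a separate step not shown here.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical secret; in practice this would come from a secret store.
SECRET_KEY = b"replace-with-a-real-secret"


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(client_id: str, channel: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Grant one client access to one channel until the token expires."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + ttl_seconds)
    payload = f"{client_id}:{channel}:{expires}"
    return f"{payload}:{_sign(payload)}"


def is_authorized(token: str, client_id: str, channel: str,
                  now: Optional[float] = None) -> bool:
    """Check the signature against the claimed client and channel, then expiry."""
    parts = token.rsplit(":", 2)
    if len(parts) != 3:
        return False
    _, expires_text, signature = parts
    expected = _sign(f"{client_id}:{channel}:{expires_text}")
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return False
    try:
        expires = int(expires_text)
    except ValueError:
        return False
    current = time.time() if now is None else now
    return current < expires
```

The server would call `is_authorized` before letting a client subscribe or publish. Because the signature covers the client, channel, and expiry together, a token issued for one channel cannot be reused for another.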
Deep Technical Expertise:
Building, maintaining, and scaling a data stream network is a behemoth job that requires not just time and money but also a deep understanding of real-time architecture. And as you add more features and functionality, the list of technical requirements will keep growing.
Advantages of buying a WebSocket solution on a Global Data Stream Network
If your primary concern about buying a service is upfront cost, it shouldn’t be. Many real-time service providers offer a free sandbox developer tier where you can build and test your real-time app free of charge for as long as you need to.
When you decide to move your app from the lab and deploy it into production, scale and orchestration need to be considered. When you use a data stream network service provider, they handle this in all its complexity so you don’t have to. It’s their job to set up and maintain servers, install and update software and versions, upgrade features, add new service capabilities, and perform any necessary maintenance.
You may also run into the issue of picking the correct framework or protocol for your exact need. A data stream service provider will often offer SDKs for whatever platform or device you want your app to run on. These SDKs are open source under the MIT license and can be modified as needed. They are also monitored and updated continuously.
Reliability and scalability
Reliability and scalability are critical. A reliable data stream network is essential for a consistent, fast real-time user experience. When deploying and scaling apps to thousands of simultaneous users, you need a global presence, with data centers distributed across multiple regions. If you’re targeting a single region exclusively, one data center may be able to handle that user base. Otherwise, it’s important to make sure the network can scale easily once launched.
Redundancy
Redundancy is another key component. In the event that your data centers go down, your data stream network goes down. Data stream network service providers offer globally redundant data stream networks, so if one data center degrades, all real-time data has already been replicated to all other data centers in different regions, ensuring delivery of all messages.
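The replicate-then-fail-over idea can be sketched in a few lines. This is a toy model rather than a description of any provider’s architecture. The `Region` and `ReplicatedNetwork` classes are hypothetical, in-memory lists stand in for data centers, and the region names are placeholders.

```python
from typing import List


class Region:
    """Stand-in for one data center (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True
        self.messages: List[str] = []


class ReplicatedNetwork:
    def __init__(self, region_names: List[str]) -> None:
        self.regions = [Region(name) for name in region_names]

    def publish(self, message: str) -> int:
        """Copy the message to every healthy region; return how many stored it."""
        stored = 0
        for region in self.regions:
            if region.healthy:
                region.messages.append(message)
                stored += 1
        return stored

    def fetch(self, preferred: str) -> List[str]:
        """Read from the preferred region, else fall back to any healthy one."""
        ordered = sorted(self.regions, key=lambda r: r.name != preferred)
        for region in ordered:
            if region.healthy:
                return list(region.messages)
        raise RuntimeError("no healthy region available")


network = ReplicatedNetwork(["us-east", "eu-west", "ap-south"])
network.publish("order-created")
network.regions[0].healthy = False       # simulate us-east degrading
messages = network.fetch("us-east")      # served from another region instead
```

The sketch leaves out the hard parts: detecting that a region has degraded, and catching a recovered region up on the messages it missed. Those are exactly the pieces that take the most effort to build and operate yourself.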
We enable developers to build real-time interactivity for IoT, web, and mobile devices. The platform runs on our real-time edge messaging network, providing customers with the industry's largest and most scalable global infrastructure for interactive applications. With over 15 points of presence worldwide supporting 800 million monthly active users and 99.999% reliability, you’ll never have to worry about outages, concurrency limits, or any latency issues caused by traffic spikes. It’s perfect for any application that requires real-time data.
Sign up for a free trial and get up to 200 MAUs or 1M total transactions per month included.