Node.js is getting more and more popular in the development world, and so are WebSockets (real-time connections). Still, making WebSockets and a Node.js cluster work well together using socket.io isn’t well documented and is something of a taboo. Not anymore ☺.
In this article we will learn what WebSockets are, how the Node.js cluster works, what problems we face while setting up WebSockets with a Node.js cluster, and how we are going to solve them. This article assumes you have a little knowledge of Node.js and how it works. If you are a newbie, don’t worry, we have still got you covered. Go through the blog post Understanding node.js and come back right after.

What are WebSockets:

      “Websockets are an advanced technology that makes it possible to open an interactive communication session between the user’s browser and a server. With this API, you can send messages to a server and receive event-driven responses without having to poll the server for a reply.”

 

But why do we need WebSockets when we already have AJAX? WebSockets are a standard for bi-directional, real-time communication between servers and clients: first in web browsers, but ultimately between any server and any client. The standards-first approach means that, as developers, we can finally create functionality that works consistently across multiple platforms. Connection limitations are no longer a problem, since a WebSocket uses a single TCP socket connection. Cross-domain communication has been considered from day one and is dealt with within the connection handshake.

Now that we know what WebSockets are, let’s dive into setting up a Node.js cluster.

Node.js cluster API:

A Node.js program runs in a single process. While that is still very fast in most cases, it doesn’t take advantage of multiple processors if they’re available. If you have an 8-core CPU and run a Node.js program with the node command, it will run in a single process, leaving the other cores idle. Fortunately, Node.js offers the “cluster” module, which allows you to create a small network of separate processes that can share server ports; this gives your Node.js app access to the full power of your server.

Enough talking! Let’s see a real example in which:

  • the master process retrieves the number of CPUs and forks a worker process for each CPU, and
  • each child process prints a message to the console and exits.
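
The two steps above can be sketched roughly as follows. This is a minimal illustration using only the core cluster and os modules; the message wording and variable names are assumptions, not the article's original listing.

```javascript
const cluster = require('cluster');
const os = require('os');

// cluster.isPrimary exists in Node.js 16+; older versions only expose isMaster.
const isPrimary = cluster.isPrimary ?? cluster.isMaster;
const numCPUs = os.cpus().length;

if (isPrimary) {
  console.log(`Master ${process.pid} found ${numCPUs} CPUs`);
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork(); // start one worker process per CPU
  }
} else {
  // Each worker re-runs this same file and ends up in this branch.
  console.log(`Worker ${process.pid} says hello`);
  process.exit(0);
}
```

Because every worker is a separate process, the order of the worker messages is not guaranteed and may change from run to run.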

Save the code in a file and run it with the node command.

The output should show one message from each worker process.

Simple, isn’t it? Indeed it is.

Making socket.io work with Node.js cluster API:

So can we just start using WebSockets (with the socket.io library) with the Node.js cluster API? Ummm, not yet.
The problem comes from how socket connections are established. Before going further, let’s understand the basic steps in establishing a socket connection.

Handshake

When creating a WebSocket connection, the first step is a handshake over TCP in which the client and server agree to use the WebSocket Protocol.

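The client starts the handshake with an ordinary HTTP GET request that carries the headers Upgrade: websocket and Connection: Upgrade, along with a random Sec-WebSocket-Key.

The server accepts by replying with HTTP status 101 Switching Protocols and a Sec-WebSocket-Accept header derived from the client’s key.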
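
To make the exchange concrete, here is a small sketch that builds both sides of the handshake as text and computes the server's Sec-WebSocket-Accept value the way RFC 6455 specifies (SHA-1 of the client key plus a fixed GUID, base64-encoded). The key is the sample key from the RFC; the host, path, and function name are placeholders chosen for illustration.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the value the server must send back for a given client key.
function acceptKeyFor(clientKey) {
  return crypto
    .createHash('sha1')
    .update(clientKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key from RFC 6455; host and path are placeholders.
const clientKey = 'dGhlIHNhbXBsZSBub25jZQ==';

const clientHandshake = [
  'GET /chat HTTP/1.1',
  'Host: example.com',
  'Upgrade: websocket',
  'Connection: Upgrade',
  `Sec-WebSocket-Key: ${clientKey}`,
  'Sec-WebSocket-Version: 13',
].join('\r\n');

const serverHandshake = [
  'HTTP/1.1 101 Switching Protocols',
  'Upgrade: websocket',
  'Connection: Upgrade',
  `Sec-WebSocket-Accept: ${acceptKeyFor(clientKey)}`,
].join('\r\n');

console.log(clientHandshake);
console.log();
console.log(serverHandshake);
```

In practice a library such as socket.io performs this exchange for you; the sketch only shows what travels over the wire.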

Create a WebSocket Connection

A WebSocket connection is established by upgrading from the HTTP protocol to the WebSocket protocol during the initial handshake between the client and the server, over the same underlying TCP connection. The request includes an Upgrade header that informs the server that the client wishes to establish a WebSocket connection.

Hence, if we plan to distribute the load of connections among different processes (i.e. a cluster), we have to make sure that all requests associated with a particular session id reach the process that originated them.
This is because certain transports, such as XHR polling or JSONP polling, rely on firing several requests during the lifetime of the “socket”.
Failing to enable sticky balancing will result in the dreaded WebSocket handshake failure with an HTTP 400 response.

This means that the upgrade request was sent to a node (one among all the available cluster nodes) that did not know the given socket id.

This can be solved with a simple trick: route file descriptors (i.e. connections) based on the originating remoteAddress, so that requests from a particular address always go to the same node, rather than in a round-robin fashion.
Let’s do it 😀
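
Below is a minimal sketch of this routing, using only Node.js core modules. The helper names (workerIndex, startPrimary, startWorker), the 'sticky:connection' message name, the simple string hash, and port 3000 are all assumptions made for illustration, not code from the sample repo. socket.io itself would be attached to the worker's HTTP server where the comment indicates, and respawning dead workers is left out for brevity.

```javascript
const cluster = require('cluster');
const http = require('http');
const net = require('net');
const os = require('os');

// Map an IP address to a worker index. Any deterministic hash works;
// this simple string hash is only an illustrative choice.
function workerIndex(ip, workerCount) {
  let hash = 0;
  for (let i = 0; i < ip.length; i++) {
    hash = (hash * 31 + ip.charCodeAt(i)) >>> 0; // keep an unsigned 32-bit value
  }
  return hash % workerCount;
}

// Primary: accept raw TCP connections and pass each one to the worker
// selected by the client's remote address.
function startPrimary(port, workerCount) {
  const workers = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(cluster.fork());
  }
  // pauseOnConnect leaves the socket unread until a worker takes it over.
  net.createServer({ pauseOnConnect: true }, (connection) => {
    const index = workerIndex(connection.remoteAddress || '', workers.length);
    workers[index].send('sticky:connection', connection);
  }).listen(port);
}

// Worker: run an HTTP server that is fed connections by the primary.
function startWorker() {
  const server = http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  });
  // A socket.io server would be attached to `server` at this point.
  server.listen(0, 'localhost'); // internal port, not meant for clients

  process.on('message', (message, connection) => {
    if (message !== 'sticky:connection' || !connection) return;
    server.emit('connection', connection); // hand the socket to the HTTP server
    connection.resume();
  });
}

// Start only when asked to, e.g. `node <file> start`. Forked workers
// receive the same command-line arguments as the primary by default.
if (process.argv[2] === 'start') {
  if (cluster.isPrimary ?? cluster.isMaster) {
    startPrimary(3000, os.cpus().length);
  } else {
    startWorker();
  }
}
```

The important part is workerIndex: because it depends only on the remote address, every request from that address reaches the same worker.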

With this routing, requests originating from the same IP address always go to the same node in the cluster, which gives us sticky balancing.

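Please note that this might lead to unbalanced routing, depending on the hashing method we use. For example, many clients behind the same NAT share one address and will all land on the same node.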

There you go. This is how you make socket.io work with the Node.js cluster API :)

Here’s the repo of a sample chat application using socket.io and the Node.js cluster API:
https://github.com/ANURAGVASI/socket.io-multiserver-chatAp