I built an app on Node.js using Docker, and I'm not sure how to scale it on a Kubernetes cluster so that I get the most out of my cluster hardware.
From a performance perspective, which of the following is better:
cluster my Node app and run as many containers as needed, or
just run as many containers as needed, without clustering?
By clustering I mean the Node.js cluster module: <a href="https://nodejs.org/api/cluster.html" rel="nofollow">https://nodejs.org/api/cluster.html</a>
My app is a simple CRUD API backed by MongoDB. We estimate that it will have 1000 concurrent users. Our cluster has 3 nodes.
Answer 1:
The <a href="https://nodejs.org/api/cluster.html" rel="nofollow">Node.js cluster</a> mechanism lets Node.js use more than a single core, so it may benefit you, but that depends heavily on your code and on how well its dependencies work with clustering.
As a general practice, if you can break your workload down into nicely parallelized units that can run as pods within Kubernetes, then I'd recommend the following process to see what works for you:<ol><li>Set up a single pod with your code in it, and run a load test against it. Use the data that Kubernetes collects through cAdvisor to characterize how much CPU and memory your pod uses under load.</li> <li>Set a resource limit for CPU and memory based on what you see above.</li> <li>Run another load test to validate how much load your single pod handles with those limits in place.</li> </ol>
From there, you have a baseline, and you can use Kubernetes to scale horizontally and validate the target of 1000 concurrent users you want to reach. There's a good talk on this process from KubeCon 2017 called <a href="https://www.youtube.com/watch?v=_l8yIqMpWT0" rel="nofollow">Load Testing Kubernetes: How to optimize your cluster resource allocation in production</a>.
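As a rough illustration of turning the single-pod baseline into a starting replica count, here is a minimal sketch. The helper name `replicasNeeded`, the 20% headroom default, and the figure of 300 users per pod are all assumptions made up for the example; your own load test supplies the real per-pod number.

```javascript
// Hypothetical helper: estimate how many pod replicas cover a target load,
// given the per-pod capacity measured in the single-pod load test.
function replicasNeeded(targetConcurrentUsers, usersPerPod, headroom = 0.2) {
  if (usersPerPod <= 0) {
    throw new RangeError("usersPerPod must be positive");
  }
  if (headroom < 0 || headroom >= 1) {
    throw new RangeError("headroom must be in the range [0, 1)");
  }
  // Reserve a fraction of each pod's measured capacity so that pods
  // are not planned to run right at their limit.
  const usablePerPod = usersPerPod * (1 - headroom);
  return Math.ceil(targetConcurrentUsers / usablePerPod);
}

// Made-up numbers: suppose one pod handled 300 concurrent users in the test.
console.log(replicasNeeded(1000, 300));
```

The result is only a first guess for the replica count; the horizontal load test is what confirms whether it actually holds up.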
Once you have a baseline, you can build a prototype that uses clustering in your code and compare it against the non-clustered version. If you do this, double-check that any CPU limit you set is greater than 1 core. Otherwise the limit, which is enforced outside the Node.js runtime, will keep your workers from using multiple cores, which defeats the purpose of clustering.
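For the clustered prototype, a minimal sketch might look like the following. It assumes Node.js 16 or later (for `cluster.isPrimary`), and the helpers `resolveWorkerCount` and `startClustered` are hypothetical names introduced here. The worker count is passed in explicitly instead of being derived from `os.cpus().length`, since inside a container that value typically reflects the node's cores, not the pod's CPU limit.

```javascript
const cluster = require("node:cluster");
const http = require("node:http");

// Hypothetical helper: read the worker count from an explicit setting
// (for example an environment variable kept in sync with the pod's CPU limit).
// Falls back to a single worker when the value is missing or invalid.
function resolveWorkerCount(rawValue) {
  const parsed = Number.parseInt(rawValue, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 1;
}

// Hypothetical helper: the primary process forks `workerCount` workers,
// and each worker runs its own HTTP server on the shared port.
function startClustered(handler, port, workerCount) {
  if (cluster.isPrimary) {
    for (let i = 0; i < workerCount; i += 1) {
      cluster.fork();
    }
    // Log worker exits; a real service would decide here whether to re-fork.
    cluster.on("exit", (worker, code) => {
      console.error(`worker ${worker.process.pid} exited with code ${code}`);
    });
  } else {
    http.createServer(handler).listen(port);
  }
}
```

You could start it with something like `startClustered((req, res) => res.end("ok"), 3000, resolveWorkerCount(process.env.WORKER_COUNT))`, where `WORKER_COUNT` is an assumed variable name that you would set in the pod spec alongside the CPU limit.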
Depending on what your code does, enabling clustering may require significant rework, since the cluster module brings its own worker concept, and it's not clear which frameworks you're using or whether they'll fit reasonably into that structure.