NGINX Architecture – An Insight (Part 1)
Sandeep Khuperkar | CTO and Director, Ashnik
In a traditional web server architecture, each client connection is handled by a separate process or thread. As a website grows in popularity and the number of concurrent connections increases, the web server slows down and responses to users are delayed. From a technical standpoint, spawning a separate process or thread requires switching the CPU to a new task and creating a new runtime context, which consumes additional memory and CPU time and hurts performance.
NGINX was developed with the goal of achieving 10x more performance and optimized use of server resources, while being able to scale and support the dynamic growth of a website. As a result, NGINX became one of the most well-known modular, event-driven, asynchronous, single-threaded web servers and web proxies.
In NGINX, user connections are processed in highly efficient runloops inside a limited number of single-threaded processes called workers. Each worker can handle thousands of concurrent connections and requests per second.
Event-driven means that various tasks are handled as “events”. An incoming connection is an event, a disk read is an event, and so on. The idea is not to spend server resources unless there is an “event” to handle. A modern operating system can notify the web server about the initiation or completion of a task, which in turn lets NGINX workers use resources only when they are needed. Server resources can be allocated and released dynamically, on demand, resulting in optimized usage of network, memory and CPU.
Asynchronous means the runloop does not get stuck on particular events. It sets conditions for “alarms” from the operating system about particular “events” and continues to monitor the “event queue” for those “alarms”. The runloop triggers actions (e.g. a read from or write to the network interface) only when there is an alarm about an event. In turn, specific actions always try to use non-blocking interfaces to the OS, so that the worker does not stall while handling a particular event. This way NGINX workers can use available shared resources concurrently in the most efficient manner.
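The runloop idea can be illustrated with a minimal sketch in Python. This is not NGINX code (NGINX itself is written in C); it only assumes the standard `selectors` and `socket` modules, and the function name `run_event_loop` and the upper-casing "handler" are hypothetical choices made for the illustration. A single thread registers several non-blocking sockets, waits for the OS to report which ones are ready, and touches only those.

```python
import selectors
import socket


def run_event_loop(connections, timeout=1.0):
    """Serve all sockets in `connections` from one thread until every peer closes.

    Hypothetical helper for illustration: each chunk received is sent back
    upper-cased. Returns the number of chunks handled.
    """
    sel = selectors.DefaultSelector()  # picks epoll/kqueue/select as available
    for conn in connections:
        conn.setblocking(False)  # a read or write must never stall the loop
        sel.register(conn, selectors.EVENT_READ)

    handled = 0
    while sel.get_map():  # keep looping while any connection is registered
        events = sel.select(timeout)  # sleep until the OS reports readiness
        if not events:
            break  # nothing became ready within the timeout; end the sketch
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(4096)
            if data:
                # Assumption: payloads are small enough to fit the send buffer.
                conn.send(data.upper())
                handled += 1
            else:
                # Empty read means the peer finished: drop it from the runloop.
                sel.unregister(conn)
                conn.close()
    sel.close()
    return handled


if __name__ == "__main__":
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"request {i}".encode())
        client.shutdown(socket.SHUT_WR)  # tell the loop no more data is coming
    print(run_event_loop([server for _client, server in pairs]))
    print([client.recv(4096) for client, _server in pairs])
```

The loop never waits on one particular connection: it sleeps in `select()` until the kernel reports that some socket is ready, handles that socket, and removes a connection from the set once its peer has finished.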
Single-threaded means that many user connections can be handled by a single worker process which in turn helps to avoid excessive context switching—and leads to more efficient usage of memory and CPU.
A modular architecture helps developers extend the set of web server features without heavily modifying the NGINX core.
NGINX does not create a new process or thread for every connection. Each worker process accepts new requests from a shared listen queue and executes a highly efficient runloop across them, processing thousands of connections per worker. Workers get notifications about events from mechanisms in the OS kernel (for example, epoll on Linux or kqueue on BSD systems).
When NGINX is started, an initial set of listening sockets is created. Workers then accept connections on these sockets, and read from and write to connection sockets while processing HTTP requests and responses.
As NGINX does not fork a process or thread per connection, memory usage is very conservative and extremely efficient in most cases; it is essentially true on-demand handling of memory. NGINX also conserves CPU cycles, as there is no ongoing create-destroy pattern for processes or threads.
In a nutshell, what NGINX does can be described as orchestration of the underlying OS and hardware resources to serve web clients: it checks the state of network and storage events, initializes new connections, adds them to the runloop, and processes them asynchronously until completion, at which point the connection is deallocated and removed from the runloop. Consequently, NGINX helps to achieve moderate-to-low CPU usage even under the most extreme workloads.
NGINX spawns several workers, typically one per CPU core, which in turn helps it scale across multiple CPUs. This approach helps the OS schedule tasks across NGINX workers more evenly.
General recommendations for worker configuration are as follows:
• For a CPU-intensive workload, the number of NGINX workers should be equal to the number of CPU cores.
• For an I/O-intensive workload, the number of workers might be about two times the number of cores.
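The two recommendations above can be written down as a small helper. This is only a sketch of the rule of thumb, not an NGINX API: the function name `suggested_worker_count` and the workload labels `"cpu"` and `"io"` are hypothetical, and the result is a starting point that would normally be set through the `worker_processes` directive and then tuned under real load.

```python
import os


def suggested_worker_count(workload, cpu_cores=None):
    """Return a starting worker count for a "cpu" or "io" bound workload.

    Hypothetical helper: encodes the rule of thumb of one worker per core
    for CPU-intensive work and about two per core for I/O-intensive work.
    """
    if cpu_cores is None:
        cpu_cores = os.cpu_count() or 1  # os.cpu_count() may return None
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if workload == "cpu":
        return cpu_cores
    if workload == "io":
        return 2 * cpu_cores
    raise ValueError(f"unknown workload type: {workload!r}")


if __name__ == "__main__":
    print(suggested_worker_count("cpu", cpu_cores=8))
    print(suggested_worker_count("io", cpu_cores=8))
```

On an 8-core machine this gives 8 workers for the CPU-intensive case and 16 for the I/O-intensive one.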
Thus NGINX is able to do more with fewer resources (e.g. memory and CPU).