Why we should use a Database proxy


In this article, I will discuss the following points:

  1. What is a database proxy

  2. A little introduction to a proxyless architecture

  3. The bottlenecks that databases cause even with ample system resources, and how their impact on overall throughput leads to a poor user experience

  4. How introducing a proxy helps solve these problems

What is a database proxy -

It is an intermediate component that sits between client applications and one or more database servers. Its primary purpose is to manage and control database connections, requests, and queries on behalf of the client applications, offering various benefits such as load balancing, security, and performance optimization.

Let’s dive deeper into each of these and see how a proxy helps achieve good performance.

How a proxyless architecture should be defined -

A proxy is essentially an intermediary that sits between the client and the server and relays all communication between them.

To understand what a proxyless architecture is, let's consider a scenario where there are tens of thousands of clients and a couple of instances of a service running independently in containers deployed on a cluster.

Since there are multiple instances, the traffic has to be distributed between them; otherwise performance will suffer. Without a proxy, we have to pull the entire cluster's network topology into the client applications, which adds complexity to the code and can become an overhead for developers. To solve this, we need a proxy that sits between the clients and the servers, abstracts away the network topology, and distributes the requests according to a load-balancing algorithm.
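To make the idea concrete, here is a minimal sketch of how a proxy might distribute requests with a round-robin algorithm. The backend addresses and the createRoundRobin helper are assumptions made for illustration; a real proxy would also need health checks and would get its targets from configuration or service discovery.

```javascript
// Hypothetical backend addresses, used only for illustration.
const backends = ["10.0.0.1:5432", "10.0.0.2:5432", "10.0.0.3:5432"];

// Returns a function that picks the next backend in round-robin order.
function createRoundRobin(targets) {
  if (targets.length === 0) {
    throw new Error("at least one backend is required");
  }
  let next = 0;
  return function pick() {
    const target = targets[next];
    next = (next + 1) % targets.length; // wrap around after the last target
    return target;
  };
}

const pickBackend = createRoundRobin(backends);

// Clients only talk to the proxy; the proxy decides where each request goes.
const routed = [];
for (let i = 0; i < 4; i++) {
  routed.push(pickBackend());
}
console.log(routed);
```

The fourth request wraps around to the first backend again, so the clients never need to know how many instances exist or where they live.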

The above scenario explains how a proxy acts as a load balancer in a distributed environment. Later in this blog, we will also see other benefits that a proxy offers.

To understand the bottlenecks, let’s assume that we are using an RDBMS as our primary database. There are several options on the market, such as Oracle, PostgreSQL, and MySQL, and we will build on whichever one best fits our requirements and use cases. To achieve high availability, we will replicate or shard the data as needed. With replication, read and write requests have to be routed to the right servers (typically writes to the primary and reads spread across the replicas), and the proxy, acting as a load balancer, takes care of that.

Some SQL queries are expensive, for example a join across two tables. If the exact same query is requested by different users, we should have a way to cache the response to avoid disk I/O and save CPU cycles. This is where a proxy comes to the rescue: it can cache the results of repeated queries.
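As a rough sketch of the caching idea, the snippet below keeps query results in memory, keyed by the SQL text and its parameters, and clears the cache whenever a write comes through. The QueryCache class, the runOnDatabase stand-in, the 30-second TTL, and the table names are all assumptions for illustration, not the behavior of any particular proxy.

```javascript
// A minimal in-memory result cache keyed by the SQL text and its parameters.
class QueryCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  key(sql, params) {
    return sql + "|" + JSON.stringify(params);
  }

  get(sql, params, now = Date.now()) {
    const k = this.key(sql, params);
    const entry = this.entries.get(k);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(k); // expired, drop it
      return undefined;
    }
    return entry.rows;
  }

  set(sql, params, rows, now = Date.now()) {
    this.entries.set(this.key(sql, params), { rows, expiresAt: now + this.ttlMs });
  }

  clear() {
    this.entries.clear();
  }
}

let databaseHits = 0;

// Hypothetical stand-in for the real (expensive) database call.
function runOnDatabase(sql, params) {
  databaseHits++;
  return [{ sql, params }];
}

const cache = new QueryCache(30000); // assumed TTL of 30 seconds

function handleQuery(sql, params = []) {
  const isRead = /^\s*select\b/i.test(sql);
  if (!isRead) {
    cache.clear(); // crude invalidation: any write empties the cache
    return runOnDatabase(sql, params);
  }
  const cached = cache.get(sql, params);
  if (cached !== undefined) return cached;
  const rows = runOnDatabase(sql, params);
  cache.set(sql, params, rows);
  return rows;
}

const joinSql =
  "SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id WHERE c.id = ?";
handleQuery(joinSql, [42]);
handleQuery(joinSql, [42]);
console.log(databaseHits); // prints 1: the second call was served from the cache
```

Clearing the whole cache on every write is deliberately simplistic. It keeps the sketch correct, at the cost of throwing away entries that the write did not affect.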

Another bottleneck a proxy addresses is connection overhead, which it solves with connection pooling. Opening and closing a lot of TCP connections can hurt throughput. Let's see how. In production, we will use HTTPS or a TLS-secured TCP connection to avoid security breaches. Establishing such a connection requires a TLS handshake, which takes CPU time. Some might argue that a dynamic thread pool handles this. But what if the number of connections grows very rapidly? There are solutions such as two-tier load balancing, but they do not always apply. Simply spinning up a new instance to handle the incoming traffic increases costs and is not always feasible. We should look for a smarter solution that also keeps costs down.

Here is what we can do. Rather than closing a connection after every request, we can keep a dynamic pool of connections that are reused to serve requests. This way we make effective use of the connections and of resources like memory and CPU.
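The pooling idea can be sketched in a few lines. The ConnectionPool class, the openConnection stand-in, and the maximum size of 10 are assumptions for illustration; a production pool would also queue waiting callers, time out idle connections, and check connection health.

```javascript
// A minimal connection pool: connections are created lazily up to maxSize
// and handed back to the idle list instead of being closed.
class ConnectionPool {
  constructor(createConnection, maxSize) {
    this.createConnection = createConnection;
    this.maxSize = maxSize;
    this.idle = [];
    this.inUse = new Set();
  }

  acquire() {
    let conn = this.idle.pop(); // reuse an idle connection if one exists
    if (conn === undefined) {
      if (this.inUse.size >= this.maxSize) {
        throw new Error("pool exhausted");
      }
      conn = this.createConnection(); // pay the connection cost only here
    }
    this.inUse.add(conn);
    return conn;
  }

  release(conn) {
    // Only connections handed out by this pool go back to the idle list.
    if (this.inUse.delete(conn)) {
      this.idle.push(conn);
    }
  }
}

let opened = 0;

// Hypothetical stand-in for opening a TCP connection and doing a TLS handshake.
function openConnection() {
  opened++;
  return { id: opened };
}

const pool = new ConnectionPool(openConnection, 10);

for (let i = 0; i < 100; i++) {
  const conn = pool.acquire();
  // ... send the query over conn ...
  pool.release(conn);
}
console.log(opened); // prints 1: all 100 requests reused the same connection
```

Because each request releases its connection before the next one starts, the expensive setup happens once instead of a hundred times. Under concurrent load the pool would grow, but never beyond the configured maximum.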

One more benefit is that proxies often provide logging and monitoring capabilities, allowing administrators to track and analyze database usage, performance, and potential issues.
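Because every query passes through the proxy, it is a natural place to record timings and errors. The following is a minimal sketch of that idea; the withMonitoring wrapper, the fakeExecute stand-in, the log format, and the table name are all hypothetical.

```javascript
// Wraps a query executor so that every call is counted, timed, and logged.
function withMonitoring(execute, log = console.log) {
  const stats = { count: 0, errors: 0, totalMs: 0 };

  function run(sql) {
    const start = process.hrtime.bigint();
    stats.count++;
    try {
      return execute(sql);
    } catch (err) {
      stats.errors++;
      log(`ERROR query="${sql}" message="${err.message}"`);
      throw err; // the caller still sees the failure
    } finally {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      stats.totalMs += elapsedMs;
      log(`query="${sql}" elapsed_ms=${elapsedMs.toFixed(3)}`);
    }
  }

  return { run, stats };
}

// Hypothetical stand-in for forwarding a query to the database.
function fakeExecute(sql) {
  if (sql.includes("missing_table")) {
    throw new Error("table not found");
  }
  return [];
}

const monitored = withMonitoring(fakeExecute);
monitored.run("SELECT 1");
try {
  monitored.run("SELECT * FROM missing_table");
} catch (err) {
  // already counted and logged by the wrapper
}
console.log(monitored.stats.count, monitored.stats.errors);
```

The log lines produced this way could then be shipped to a log aggregation system for indexing and analysis.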

In this article, I have discussed the important use cases of a database proxy.

I have also written a database proxy in JavaScript that requires the Node.js runtime. It is a CLI tool for creating databases and tables, and it does not have many features yet. I have also integrated health monitoring, along with aggregation and indexing of the error logs using the ELK Stack. To set it up, clone the repository and install the dependencies with “npm install”, then install the local module globally with “npm install -g”. This makes the Node.js file executable and accessible from anywhere in your shell. On Linux, it's a little different: the file has to be made executable using chmod.

Source code - https://github.com/raja-dettex/database-proxy