Ruby on Rails is a great framework for startups, but we often hear people talk about Rails scalability issues when a startup project grows too large (read: becomes very popular).
One of the key events that fueled the claim that Rails can't scale was Twitter's move of parts of its back end from Rails to Scala in order to handle its growing number of user requests.
But as a counterexample, we would like to mention that Shopify is an advanced Rails application that has scaled quite well many years in a row. So who’s right? Do you need to ditch this framework if your app goes big?
Let's take a look at how to scale a Ruby on Rails application to find out.
What Is Rails Scalability?
A framework’s scalability is the potential of an application built with this framework to grow and handle more user requests per minute (RPM) in the future. Strictly speaking, it's incorrect to talk about framework scalability, because it's not the framework that must, or can, scale, but rather the architecture of the entire server system. But we should acknowledge that, to a certain extent, application architecture does have an impact on scalability.
In short, phrases like 'How to scale Ruby on Rails' or 'Ruby on Rails scalability issues' are, strictly speaking, inaccurate.
Let's take a look at the example below. This is what the entire server architecture looks like at the very beginning of a Rails project.
What we usually have is a single server, on which we install the following software:
- Nginx server;
- Rack application server – Puma, Passenger, or Unicorn;
- Single instance of your Ruby on Rails application;
- Single instance of your database; usually, a relational database like PostgreSQL.
This server computer will be able to cope with, say, 1,000 or even up to 10,000 requests per hour easily. But let’s assume that your marketing is very successful, and your Rails application becomes much more popular; your server starts getting 10 or 100 times more requests. When the load increases to a high enough level, a single server architecture cracks under pressure. That is, the application becomes unresponsive to users.
That’s why we’ll explain how to resolve this scalability issue – serving data to users – with Ruby on Rails.
Let's Scale Your Ruby on Rails Application!
Vertical Scalability with Ruby on Rails
Scaling vertically is the simplest way to make the server handle an increased number of RPMs. Vertical scaling means adding more RAM, upgrading the server’s processor, etc. In other words, you give your server computer more power. Unfortunately, this approach doesn’t work in many situations, and here are a couple of reasons why.
Vertically scaling a server running an app helps only at the initial stage. When your traffic increases yet again, you eventually reach the point where upgrading the processor or adding more RAM is technically impossible.
But there’s another downside to this approach. When you need to scale your Ruby on Rails application, there are usually some parts that demand more computational resources than others, and a single bigger server cannot be sized for each part separately. For example, Facebook needs servers offering different performance for updating the news feed and for processing images. Image processing is used less frequently than the news feed, so Facebook installs a less powerful server for image processing than for the news feed. We’ll talk more about this architectural approach (Service-Oriented Architecture, or SOA) to scaling Rails apps shortly.
When vertical scalability is no longer practical, or when we look for other scalability options right away, we scale a Rails application horizontally.
Horizontal Scalability with Ruby on Rails
We can scale a Rails application horizontally in much the same way as applications built with many other frameworks. Horizontal scaling means converting the single-server architecture of your app to a three-tier architecture, where the web server and load balancer (Nginx), app instances, and database instances are located on different servers. This way, each machine handles a smaller, roughly equal share of the load.
Nginx
Nginx, a web server commonly used with Rails applications, is deployed on its own machine to serve as a load balancer and reverse proxy. A medium-powered server is enough for Nginx, because it requires little computing power to function normally even under high loads. In this setup, Nginx’s sole job is to filter requests and distribute the load among multiple servers.
We set up this server to receive the initial request and forward it to the first machine. The second request will be sent from Nginx to the second machine, and so on. When you have only three machines with your Rails application instances, then the fourth request from the client (browser) will be sent, naturally, to the first machine again.
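The rotation described above is usually called round-robin distribution. As a rough illustration only (this is plain Ruby, not Nginx configuration, and the `RoundRobinBalancer` class and host names are made up for the example), the selection logic can be sketched like this:

```ruby
# Minimal sketch of round-robin selection; class and host names are hypothetical.
class RoundRobinBalancer
  def initialize(backends)
    raise ArgumentError, "at least one backend is required" if backends.empty?

    @backends = backends
    @next_index = 0
  end

  # Returns the backend that should receive the next request,
  # then advances the pointer, wrapping around at the end of the list.
  def next_backend
    backend = @backends[@next_index]
    @next_index = (@next_index + 1) % @backends.size
    backend
  end
end

balancer = RoundRobinBalancer.new(%w[app1.internal app2.internal app3.internal])
assignments = Array.new(4) { balancer.next_backend }
puts assignments.inspect
# ["app1.internal", "app2.internal", "app3.internal", "app1.internal"]
```

With three machines, the fourth request wraps around to the first one, exactly as in the description.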
App Instances
As we mentioned previously, you need additional servers to run app instances separately from the Nginx server. From the user’s perspective, the app stays the same; they simply access different app instances thanks to Nginx.
Nginx does not talk to the Rails application directly. Between them sits an application server, which communicates with the application through Rack, a standard interface between Ruby web applications and web servers. There are many Rack-compatible application servers for Rails-based apps, the best known being Unicorn, Phusion Passenger, and Puma. Application servers are responsible for input-output, letting the application deal with user requests.
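To make the Rack interface more concrete: a Rack application is any Ruby object that responds to `call`, receives the request environment as a Hash, and returns a status, headers, and a body. The sketch below is an assumption-laden illustration, not a production setup. `HelloApp` is a made-up name, and a hand-written Hash stands in for the environment that a real application server would build, so the code runs without any gems.

```ruby
# A Rack-style application: any object that responds to #call, takes the
# request environment (a Hash) and returns [status, headers, body].
# HelloApp is a hypothetical name used only for this sketch.
HelloApp = lambda do |env|
  body = "Hello from #{env['PATH_INFO']}"
  [200, { "content-type" => "text/plain" }, [body]]
end

# An application server such as Puma would build `env` from the real HTTP
# request. Here a small hand-written Hash stands in for it.
fake_env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/products" }
status, _headers, body = HelloApp.call(fake_env)
puts status    # 200
puts body.join # Hello from /products
```

Because every application server speaks this same interface, the same application can be run under Unicorn, Passenger, or Puma without code changes.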
Although Rails applications are mostly monolithic, we can extract a functional block and use it as a standalone service. This will also lessen the load on servers.
Database Instances
Scaling a database is another important aspect of scaling a Rails application. We can deploy a database on the same server as the application to economize, but this saving has several drawbacks. When every application instance saves and retrieves data from its own database instance, data for the entire application is spread across many machines. This is very inconvenient for website users: if Nginx sends a user to a different app instance from the one where their data is stored, they can’t even sign in because their data is located on a different machine!
Another reason to separate a database from other servers is to create a fault-tolerant architecture. If our database receives too many requests and can’t manage them, it collapses. But other database instances in the data tier can accommodate this load, and the Rails app will continue to work.
That’s why, when you are ready to scale your Rails application, your database should be transferred to its own server right away. The database tier can be a single server with a database that’s used by all app instances. Or, this tier can consist of several machines each running databases. In order to update data across all databases, we use database replication. Once a database has new data to share, it informs all others about changes (which is standard procedure for multi-master relationships among databases).
In a master-slave replication variant, all writes go to the master (or main) database, and the other database instances (slaves, also called replicas) copy changes from it and typically serve read requests. The slaves receive updates only when the master database updates data. Multi-master database replication and master-slave replication are the two common approaches to making our data layer consistent across multiple databases.
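The master-slave layout can be sketched in a few lines of plain Ruby. This is only an in-memory illustration under stated assumptions: `ReplicatedStore` is a hypothetical class, Hashes stand in for database servers, and changes are copied immediately, whereas real databases perform replication themselves and often do so asynchronously.

```ruby
# In-memory sketch of master-slave (primary-replica) routing.
# Hashes stand in for database servers; all names here are hypothetical.
class ReplicatedStore
  attr_reader :master, :replicas

  def initialize(replica_count)
    raise ArgumentError, "need at least one replica" if replica_count < 1

    @master = {}
    @replicas = Array.new(replica_count) { {} }
    @next_replica = 0
  end

  # Every write goes to the master, which then propagates the change.
  # Real databases usually replicate asynchronously; this copy is immediate.
  def write(key, value)
    @master[key] = value
    @replicas.each { |replica| replica[key] = value }
    value
  end

  # Reads are spread across the replicas in turn, keeping load off the master.
  def read(key)
    replica = @replicas[@next_replica]
    @next_replica = (@next_replica + 1) % @replicas.size
    replica[key]
  end
end

store = ReplicatedStore.new(2)
store.write("user:1", "Alice")
puts store.read("user:1") # served by the first replica
puts store.read("user:1") # served by the second replica
```

Rails itself (version 6 and later) ships multiple-database support that can send reads to a replica; the sketch above only illustrates the idea behind it.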
When the database receives a large quantity of data, it must save it quickly, update its status quickly, and send new data back to the user. Some teams move from MySQL to PostgreSQL, or to a non-relational database such as MongoDB, to accommodate large quantities of data when they scale an app, but the right choice depends on the workload: all three are used in large production systems.
In order to further reduce the load on the database, several other techniques can be used. Caching is one important solution, although it’s only indirectly related to database scalability. A cache keeps the result of an expensive operation in fast storage so that it can be served to clients without being recomputed. Imagine that your Rails app compiles statistics about user activity, such as user sessions. Computing these on every request produces a high load on the CPU and the database. We can calculate this information once per day instead, and then show the cached data to the user at any time.
Two popular tools for this are Redis and Memcached, in-memory key-value stores that perform read/write operations extremely fast and are commonly used as cache back ends for Rails applications.
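The "compute once, serve many times" idea can be shown with a small in-process cache. This is a minimal sketch, not how Redis or Memcached work internally: `TtlCache`, the cache key, and the statistics lambda are all hypothetical, and the injectable `clock` exists only to make the expiry logic easy to test.

```ruby
# Tiny in-process cache with expiry, illustrating the fetch-or-compute
# pattern. TtlCache and expensive_daily_stats are hypothetical names.
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
  end

  # Returns the cached value for `key` if it is still fresh; otherwise runs
  # the block, stores its result for `expires_in` seconds, and returns it.
  def fetch(key, expires_in:)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

computations = 0
expensive_daily_stats = lambda do
  computations += 1 # stands in for a slow aggregate query
  { sessions: 1_250 }
end

cache = TtlCache.new
one_day = 24 * 60 * 60
3.times { cache.fetch("stats/sessions", expires_in: one_day) { expensive_daily_stats.call } }
puts computations # 1, the expensive block ran only once
```

In a Rails application the same pattern is available as `Rails.cache.fetch(key, expires_in: ...) { ... }`, with Redis or Memcached configured as the cache store so that all app instances share one cache.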
Other Options when Scaling a Rails App
Let's say you want to spend as little money as possible on new servers. Is it possible to scale your Rails application while minimizing server costs? Indeed it is.
First, consider code optimization. Let's assume your service offers image editing, which is demanding on the CPU. Given that your startup has evolved very quickly, it’s almost a given that your image editing algorithm wasn't optimized in its initial form, so there’s a lot of code to refactor and optimize. For example, at RubyGarage we needed to parse a huge CSV file with user data. It took as much as 74 seconds for the algorithm to parse the file. But when our developer removed unnecessary checks and reused some resources, the algorithm took only 21 seconds to accomplish the same task. Code optimization helps you get more out of existing server resources.
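As a generic illustration of what "reusing resources" can mean (this is not the code from the CSV example above, and the country list and method names are invented for the sketch), compare a loop that rebuilds a lookup structure for every row with one that builds it once:

```ruby
require "set"

# Hypothetical example of reusing a resource: both methods count the rows
# whose country code is allowed, but the second builds its lookup only once.
ALLOWED_COUNTRIES = %w[UA PL DE FR US].freeze

# Slow variant: rebuilds the lookup Set for every single row.
def count_allowed_slow(lines)
  lines.count do |line|
    allowed = Set.new(ALLOWED_COUNTRIES) # rebuilt on every iteration
    allowed.include?(line.split(",")[1])
  end
end

# Faster variant: the Set is built once and reused for all rows.
def count_allowed_fast(lines)
  allowed = Set.new(ALLOWED_COUNTRIES)
  lines.count { |line| allowed.include?(line.split(",")[1]) }
end

rows = ["alice,UA", "bob,BR", "carol,DE"]
puts count_allowed_slow(rows) # 2
puts count_allowed_fast(rows) # 2
```

Both variants return the same result; the second simply does less work per row, which matters once the file has millions of lines.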
Another option is to rewrite computationally intensive tasks in a lower-level compiled language, such as C++. Ruby isn't so great for heavy computation by comparison: it is interpreted, and the reference implementation, Matz's Ruby Interpreter (MRI), uses a global lock that prevents threads from running Ruby code in parallel.
It's also possible to stick with Ruby and Ruby on Rails in your project, but deploy a service-oriented architecture in order to scale your application. Imagine your project includes chat functionality (as one of our clients’ apps does). The chat can be loaded with many requests from users, while the other parts of this application are used infrequently. In that case, it's unnecessary to break the server structure into three tiers. But it's possible to break the application into two parts!
We deployed the part of the application responsible for chat functionality on a separate server as a stand-alone application. Chat could then communicate with the original application and database, but the load on the main server was decreased. Further, it will be easier, faster, and less expensive to scale just this small part of your application, either vertically or horizontally, should the load increase once more. Service-oriented architecture is a great solution for scaling Rails applications and is used by many companies, including Facebook and Shopify.
Ruby on Rails Application Architecture and Scalability
We mentioned before that the Rails application architecture can negatively affect scalability. Scalability issues become apparent if an application consists of many services that cannot run separately from other services.
Let's consider an example of a bad Rails application architecture. An application may keep its current state on the server that handled the request, for instance when a user enters new data in a form or visits several web pages in succession. If the load balancer sends this user to the next server, that server won't know anything about the new data. This prevents us from scaling the app horizontally. The application’s state must instead be saved either on the client (browser) or in the database.
If we avoid this common architectural mistake, and similar ones, when building an application, we will be able to scale it with far less trouble.
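The difference between server-local state and shared state can be sketched as follows. Everything here is a stand-in chosen for the illustration: `AppInstance` is a hypothetical class, and plain Hashes play the role of each server's memory and of the shared database.

```ruby
# Sketch of why per-server state breaks horizontal scaling.
# AppInstance and the Hash-based stores are hypothetical stand-ins.
class AppInstance
  # `store` is where form drafts are kept. Giving each instance its own
  # Hash models local state; giving all of them one Hash models a database.
  def initialize(store)
    @store = store
  end

  def save_draft(user_id, text)
    @store[user_id] = text
  end

  def load_draft(user_id)
    @store[user_id]
  end
end

# Local state: each instance only sees what it saved itself.
server_a = AppInstance.new({})
server_b = AppInstance.new({})
server_a.save_draft(42, "half-filled form")
p server_b.load_draft(42) # nil, the draft lives only on server A

# Shared state: any instance can serve the next request.
shared = {}
server_a = AppInstance.new(shared)
server_b = AppInstance.new(shared)
server_a.save_draft(42, "half-filled form")
p server_b.load_draft(42) # "half-filled form"
```

Once state lives in a shared store, it no longer matters which instance the load balancer picks for the next request.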
Tools That Help You Scale Rails Apps
Proper scalability of a Rails application is only possible when you find and eliminate bottlenecks. Finding bottlenecks is a bit difficult to do manually. And we can’t figure out scalability issues by just scratching our heads.
A much more effective way to locate problems when scaling applications is to use dedicated software. New Relic, Flame Graphs, Splunk, StatsD, and other software programs are all great options, and they work similarly: each checks and collects statistics about your application. We must then interpret the gathered data to figure out what to change.
We can use Flame Graphs to learn about CPU or RAM loads when running our application. When the app interacts with the database, we can track code paths. Learning what processes take the most time can lead us to our application’s bottleneck(s). When we’ve identified a bottleneck, we can design a solution to remove it.
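Before reaching for a full profiler, even simple timing can show which step dominates a request. The sketch below is far cruder than a flame graph, which samples whole call stacks. It uses Ruby's monotonic clock, the `time_steps` helper and step names are hypothetical, and `sleep` calls stand in for real work.

```ruby
# Times each named step and returns a Hash of step name => elapsed seconds.
# The helper and the example steps are hypothetical.
def time_steps(steps)
  steps.each_with_object({}) do |(name, work), timings|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    work.call
    timings[name] = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

# sleep stands in for real work such as a query or template rendering.
steps = {
  "load records" => -> { sleep 0.05 },
  "render view"  => -> { sleep 0.01 },
  "serialize"    => -> { sleep 0.005 }
}

timings = time_steps(steps)
slowest, seconds = timings.max_by { |_name, elapsed| elapsed }
puts format("slowest step: %s (%.3fs)", slowest, seconds)
```

Here the "load records" step stands out as the bottleneck, so that is where optimization effort should go first.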
You can see a strange-looking graphic with vibrant colors, which we took from Flame Graph, in the screenshot below. Whatever its appearance, the graphic is genuinely helpful for tracking how our code works from the very beginning.
In conclusion, we would like to repeat what Shopify said several years ago at a conference dedicated to Ruby on Rails scalability: “There isn't any magic formula to scaling a Rails application.”
In this article, we mentioned some problems that might appear when scaling a Ruby on Rails application, and described several solutions. But applications are unique. Therefore, you must know the characteristics of your particular application in order to overcome your application's specific limitations. It's important to know exactly what part of a Rails application to scale first, so that we optimize from crucial issues down to minor bottlenecks.