Increasing or decreasing the capacity of a system by making effective use of resources is known as scalability. A scalable system can handle increasing numbers of requests without adversely affecting response time and throughput.
In software engineering, scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be enlarged.
For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added.
Scalability, as a property of systems, is generally difficult to define and in any particular case it is necessary to define the specific requirements for scalability on those dimensions which are deemed important.
It is a highly significant issue in electronics systems, database, routers, and networking.
The concept of scalability applies to technology as well as business settings. The base concept is consistent - The ability for a business or technology to accept increased volume without impacting the contribution margin (= revenue - variable costs). For example, a given piece of equipment may have capacity from 1-1000 users, and beyond 1000 users, additional equipment is needed or performance will decline (variable costs will increase and reduce contribution margin).
Scalability is working in parallel.
Complexity and human-reliant approaches don’t scale; simplicity and algorithm-driven approaches do.
- In the past: “Works even if data doesn’t fit in main memory”
- Now: “Can make use of 1000s of cheap computers”
- In the past: If you have N data items, you must do no more than operations – polynomial time algorithms
- NOW If you have N data items, you must do no more than operations, for some large k. Numbers of computers. Polynomial-time algorithms must be parallelized
- Soon: Streaming: If you have N data items, you should do no more than operations
- As data sizes go up, you may only get one pass at the data
- The data is streaming —— you better make that one pass count
- Ex: Large Synoptic Survey Telescope (30TB I night)
A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.
Algorithm / Program
An algorithm, design, networking protocol, program, or other system is said to scale if it is suitably efficient and practical when applied to large situations (e.g. a large input data set or large number of participating nodes in the case of a distributed system). If the design fails when the quantity increases then it does not scale.
A scalable online transaction processing system or database management system is one that can be upgraded to process more transactions by adding new hardware resources (processors, devices and storage) and which can be upgraded easily and transparently without shutting it down.
“Scale up” is when you upgrade a machine to a more powerful machine (e.g. faster CPU, faster GPU, engine with more HP, etc…) to get more processing power.
To scale vertically (or scale up) means to add:
- resources (addition of CPUs or memory)
- or process (addition of the same process)
to a single node in a system.
Such vertical scaling of existing systems also enables them to use virtualization technology more effectively, as it provides more resources for the hosted set of operating system and application modules to share.
Taking advantage of such resources can also be called “scaling up”, such as expanding the number of Apache daemon processes currently running.
Example: You currently have a 800MHz CPU in your computer. To increase processing power, you decide to buy a new computer with an 8-core 3.4GHz CPU to replace your old computer.
“Scale out” is when you increase the number of processing machines (computers, processors, servers, etc) to increase processing power.
To scale horizontally (or scale out) means to add more nodes to a system, such as adding a new computer to a distributed software application. An example might be scaling out from one web server system to three.
As computer prices drop and performance continues to increase, low cost “commodity” systems can be used for high performance computing applications such as seismic analysis and biotechnology workloads that could in the past only be handled by supercomputers.
Hundreds of small computers may be configured in a cluster to obtain aggregate computing power.
The scale-out model has created an increased demand for shared data storage with very high I/O performance, especially where processing of large amounts of data is required, such as in seismic analysis. This has fuelled the development of new storage technologies such as object storage devices.
You also must scale out processes to achieve redundancy when you want to configure a highly available environment.
Example: Your wagon is currently pulled by one horse. To increase the horsepower of your wagon, you buy 10 more horses to pull your wagon.
There are tradeoffs between the two models.
Larger numbers of computers means increased management complexity, as well as a more complex programming model and issues such as throughput and latency between nodes; also, some applications do not lend themselves to a distributed computing model.
Scale up got expensive fast. An other solution is to buy cheaper, commodity boxes to scale_out instead of up, distributing the application (database,…) out over hundreds or even thousands of servers.
In the past, the price difference between the two models has favoured “scale out” computing for those applications that fit its paradigm, but recent advances in virtualization technology have blurred that advantage, since deploying a new virtual system over a hypervisor (where possible) is almost always less expensive than actually buying and installing a real one.
A number of different approaches enable databases to grow to very large size while supporting an ever-increasing rate of transactions per second.
Not to be discounted, of course, is the rapid pace of hardware advances in both the speed and capacity of mass storage devices, as well as similar advances in CPU and networking speed. Beyond that, a variety of architectures are employed in the implementation of very large-scale databases.
- Partitioning and cluster
One technique supported by most of the major DBMS products is the partitioning of large tables, based on ranges of values in a key field. In this manner, the database can be scaled out across a cluster of separate database servers. (MPP)
Also, with the advent of 64-bit microprocessors, multi-core CPUs, and large SMP multiprocessors, DBMS vendors have been at the forefront of supporting multi-threaded implementations that substantially scale up transaction processing capacity.
Network-attached storage (NAS) and Storage area networks (SANs) coupled with fast local area networks and Fibre Channel technology enable still larger, more loosely coupled configurations of databases and distributed computing power.
The widely supported X/Open XA standard employs a global transaction monitor to coordinate distributed transactions among semi-autonomous XA-compliant database resources. Oracle RAC uses a different model to achieve scalability, based on a “shared-everything” architecture that relies upon high-speed connections between servers.
While DBMS vendors debate the relative merits of their favoured designs, some companies and researchers question the inherent limitations of relational database management systems. GigaSpaces, for example, contends that an entirely different model of distributed data access and transaction processing, named Space based architecture, is required to achieve the highest performance and scalability. On the other hand, Base One makes the case for extreme scalability without departing from mainstream database technology. In either case, there appears to be no end in sight to the limits of database scalability.
More hardware for scalability ?
It is often advised to focus system design on hardware scalability rather than on capacity.
It is typically cheaper to add a new node to a system in order to achieve improved performance than to partake in performance tuning to improve the capacity that each node can handle.
But this approach can have diminishing returns (as discussed in performance engineering).
For example: suppose a portion of a program can be sped up by 70% if parallelized and run on four CPUs instead of one.
- α is the fraction of a calculation that is sequential,
- 1 − α is the fraction that can be parallelized,
- and P the number of processor
then the maximum speed-up that can be achieved is given according to Amdahl's Law:
Substituting the values with an α of 0.3, we get:
- for 4 processors: 2.105.
- for 8 processors: 2.581.
Doubling the processing power has only improved the speed-up by roughly one-fifth. If the whole problem was parallelizable, we would, of course, expect the speed up to double also. Therefore, throwing in more hardware is not necessarily the optimal approach.