You'll probably disagree with the ideas posted here or may have some of your own, either way, I encourage you to comments to make this list richer. In the end, the main goal of a scalable system is to maximize throughput and minimize response time with the minimum resource utilization.
NOTES:
- I've tried to keep this post clear and concise (as a checklist) so I left many explanations out.
- Some of the points are targeted towards managed languages like Java or .NET, but they should still be helpful if you're using other technologies.
Software Architecture/Design
- Partition your application into logical layers which can be scaled out to physical tiers when necessary. At the very least define: presentation, business logic, data access and DB layers initially.
- Make sure the above mentioned layers are testable independently.
- Low coupling: connect your applications via services exposing interfaces with well defined contracts
- High cohesion: design stateless services when possible, minimize their scope and promote reusability between them.
- For internal application services, promote the use of TCP or binary connections/serialization (make sure your choice is load balancer compatible). Use SOAP and REST when interoperability is a concern.Keep the business logic in the middle layer/tier. Clients should focus on presentation and authorization (trusted subsystems/facades)
- Avoid or minimize roundtrips between layers/tiers as much as possible:
- Use asynchronous calls from the browser instead of full page postbacks
- Promote middle tier asynchronous service calls
- Cache resources in the middle tier to avoid DB roundtrips and IO
Memory management
- Clean up unmanaged/disposable resources as soon as possible. Instantiate late, dispose early
- Avoid unnecessary boxing/unboxing
- Avoid excessive string concatenations, use string buffers/builders for this purpose
- Do not catch exceptions to validate processes
- Remove event handlers as soon as they're not in use
Multithreading/Concurrency
- Too many threads consumes resources, increases context switching and contention overuses CPU.
- Too few threads unnecessarily affects throughput, underuses CPU.
- Avoid the use of lock (pessimistic), rather promote the use of more optmimistic, lock-free patterns such as Compare-And-Swap (CAS) when possible (use Atomic variables).
- If lock is necessary, reduce granularity as much as possible (e.g. do not use synchronized methods in Java).
- Do not lock static methods, this locks all "instances" of the class
- If you have a long running single-threaded task, and you have multi-core server(s), analyze the possibility of parallelize such task.
Data access
- Preferrably do not use ORM's
- Use stored procedures, do not use inline SQL
- Keep an eye on large result sets, page/cache them efficiently
- Keep an eye on the isolation level at all times
- Focus on connection pooling, use as few trusted identities as possible when connecting to the DB
System Architecture
- Clustering.
- If your app server supports it, try to take advantage of this.
- Configure your Presentation Tier servers as Web Farms.
- Load Balancing. If clustering is not possible...
- For stateless services, scale out the servers by "cloning" them and load balance the requests.
- For stateful services, maintain session information in a coherent cache.
- NOTE: Session management is most effective when done in the Middle Tier, Web Farms should be kept as stateless as possible. This shields the client servers from the internals that deal with the data making your application more maintainable.
- Caching.
- Cache as much data as possible, starting with Reference Data
- For a multiple node cluster/system, use a distributed cache
- Define the cache invalidation strategies carefully, such as activation/passivation and expiration and eviction (e.g. LRU, LFU, etc.).
- Non-blocking I/O based application servers
- Almost all app servers at the time of writing support NIO operations, however is always good to double check.
Databases
- Separate OLTP and OLAP clearly and tune each one independently
- Schema design
- Look to achieve at least 3rd normal form in OLTP's.
- Analyze stored procedure queries and performance on an individual basis, choose your indexes wisely.
- To spread out reads:
- Use Replication (master-slave)
- To spread out writes:
- Use transparent horizontal table partitioning to different file groups/disks (e.g. SQL Server Data Partition Views or MySQL partitioning options)
- Use DB sharding. NOTE: DB routing logic should be kept in the Middle Tier