Today’s subscription businesses need the speed and flexibility to launch new products in the market, the ability to automate manual processes, and visibility into the key metrics required to make accurate business decisions. The SaaS infrastructure must support reliable 24x7x365 operations. Delivering these capabilities requires a reliable ‘enterprise-grade’ system with services built on a secure, high-performing and scalable infrastructure.
This Academy guide will cover the key capabilities required to support your mission-critical subscription business operations.
Note that cloud security is covered in its own guide.
SaaS applications need to deliver high-speed system performance. Because SaaS applications face the internet, vendors must invest continuously to keep improving linear scaling. Customers rely on their service providers to meet and maintain system performance requirements. Application vendors should provide detailed information about system performance and availability.
At the very heart of Zuora’s 9 Keys framework is a commitment to deliver the performance, scalability and capacity to support the requirements of complex operations in a mission-critical billing system.
‘High performance’ is typically measured by:
Synchronous APIs = operations that issue an API request and wait until they get a response.
Asynchronous APIs = processes that run in the background, e.g. monthly bill runs, data syncing with SFDC, large-scale data extracts.
For example, Zuora uses both synchronous and asynchronous APIs to communicate all requests within the application and meet its performance objectives.
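The difference between the two API styles can be illustrated with a minimal Python sketch. Everything here is hypothetical: the function names (`create_subscription`, `submit_bill_run`, `get_job_status`, `wait_for_job`) and the in-memory job store are illustrative stand-ins, not Zuora’s actual API. The synchronous call blocks until it has a result, while the asynchronous call returns a job ID right away and the caller polls for completion.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a billing service (not a real vendor API).
_executor = ThreadPoolExecutor(max_workers=2)
_jobs = {}


def create_subscription(account_id):
    """Synchronous style: the caller waits and receives the result directly."""
    return {"account_id": account_id, "status": "active"}


def _run_bill_run(account_ids):
    """Background work; the short sleep simulates a long-running bill run."""
    time.sleep(0.05)
    return {"invoices_created": len(account_ids)}


def submit_bill_run(account_ids):
    """Asynchronous style: return a job ID immediately; work continues in the background."""
    job_id = str(uuid.uuid4())
    _jobs[job_id] = _executor.submit(_run_bill_run, list(account_ids))
    return job_id


def get_job_status(job_id):
    """Report whether the background job has finished, with its result if so."""
    future = _jobs[job_id]
    if not future.done():
        return {"state": "running"}
    return {"state": "completed", "result": future.result()}


def wait_for_job(job_id, poll_interval=0.01, timeout=5.0):
    """Poll the job until it completes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_job_status(job_id)
        if status["state"] == "completed":
            return status["result"]
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
```

In this sketch a caller would use `create_subscription("A-1")` for an interactive request, and `wait_for_job(submit_bill_run(["A-1", "A-2"]))` for batch work that should not hold a connection open.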
Capacity and response times
I. Synchronous Transaction Capacity:
This is especially important for high-volume businesses such as B2C services. There is a common requirement for a sustained peak of at least 100 transactions per second. Another benchmark is Amazon.com, whose peak rate on Cyber Monday in 2012 was 306 orders per second. Zuora exceeds both of these benchmarks.
II. Asynchronous Transaction Capacity:
Asynchronous processing needs to scale out horizontally to handle whatever load it has to carry. As with any billing system, bill runs are intensive, so bill run performance needs to scale near-linearly as you fan out horizontally across multiple threads. Zuora is architected to scale out bill runs horizontally.
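As a rough illustration of fanning a bill run out horizontally, the sketch below partitions accounts into shards and bills each shard on its own worker. It is a simplified example under stated assumptions, not Zuora’s implementation: the account fields (`quantity`, `unit_price`) and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, num_shards):
    """Split items into num_shards roughly equal shards (round-robin)."""
    return [items[i::num_shards] for i in range(num_shards)]


def bill_account(account):
    """Hypothetical billing rule: amount due = quantity * unit price."""
    return account["quantity"] * account["unit_price"]


def bill_shard(shard):
    """Bill every account in one shard and return the shard total."""
    return sum(bill_account(account) for account in shard)


def run_bill_run(accounts, num_workers):
    """Fan the bill run out across num_workers workers and combine the totals."""
    shards = partition(accounts, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return sum(pool.map(bill_shard, shards))
```

Because each shard is independent, adding workers should not change the result, only how the work is spread. Threads keep this example self-contained. In a real deployment the shards would more likely be handed to separate processes or machines, which is where near-linear scaling would come from.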
III. Synchronous Transaction Response Times:
Response times vary based on the complexity of the operation that the API supports – e.g. some Zuora APIs perform a very targeted operation that can be completed very rapidly (in as little as 5 ms). Others perform an entire end-to-end business use case and can take longer to execute because they interact with third-party systems such as payment gateways. Most Zuora synchronous transactions execute in the range of 5 to 300 ms.
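Response-time ranges like these are usually reported as percentiles over measured samples rather than as a single average. The sketch below uses the nearest-rank method to summarize a set of latencies. The sample values are made up for illustration and are not measurements of any Zuora API.

```python
import math


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest rank covering at least pct percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Illustrative latencies in milliseconds (hypothetical data).
latencies_ms = [5, 7, 9, 12, 20, 35, 60, 120, 250, 300]

p50 = percentile(latencies_ms, 50)  # median response time
p95 = percentile(latencies_ms, 95)  # tail response time
```

Tracking a tail percentile such as the 95th alongside the median shows whether the slower, end-to-end operations stay within the expected range.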
How is your service provider demonstrating a commitment to performance and scalability?
Zuora continues to improve and invest in our infrastructure to support larger, more scalable enterprise customer architectures. Over time, Zuora has delivered:
You can have the fastest application in the world, but without a fast network you won’t see the performance. It’s important to use web performance solutions to improve your response time and throughput. Savvy cloud application vendors know how to do this. Zuora uses Akamai’s web application accelerator to slash internet latency.
As SaaS applications add more customer accounts and features, system performance must keep pace with the added scale to avoid bottlenecks.
There’s typically a performance tradeoff between horizontal (scale out) and vertical (scale up) architecture: you lose some vertical capability when you increase horizontal, and vice versa. Good companies should always prefer horizontal since it aligns with business growth. Vertical scale can be added later once horizontal scale is achieved. Zuora is architected and built on a horizontal architecture.
Best practices in building scalable SaaS architecture include the following:
Snapshot: Zuora’s own scalable architecture
Zuora’s core platform has been architected from the ground up to organically scale to massive volumes. The core system architecture and application are designed to scale horizontally at the application level, the messaging infrastructure level and the database level. The diagram below represents the architecture of Zuora’s cloud infrastructure.
Zuora utilizes the following resources to ensure that the platform is scalable at all levels:
Load Balancer/Web Servers:
Running mission-critical applications like a SaaS-based Relationship Business Management system requires a provider focused on maintaining high service availability and ensuring your applications and data are recoverable in the case of a disaster.
High Availability refers to ensuring the highest level of availability of the service by providing redundancy at all the layers of the architecture so that if one infrastructure component (network, server, storage) fails, overall service remains available.
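The effect of redundancy on availability can be estimated with standard reliability arithmetic. The sketch below assumes component failures are independent, which is a simplification, and the availability figures in the usage note are illustrative rather than measured values for any provider. Redundant copies of a component fail only if every copy fails, while layers that all must work multiply together.

```python
def redundant_availability(component_availability, copies):
    """Availability of `copies` redundant components, assuming independent failures."""
    return 1 - (1 - component_availability) ** copies


def serial_availability(layer_availabilities):
    """Availability of a stack in which every layer must be up at once."""
    result = 1.0
    for availability in layer_availabilities:
        result *= availability
    return result


def downtime_minutes_per_year(availability):
    """Expected downtime per 365-day year for a given availability."""
    return (1 - availability) * 365 * 24 * 60
```

For example, under these assumptions two redundant servers that are each 99% available give about 99.99% availability for that layer, and `serial_availability` then combines the network, server and storage layers into one service-level figure.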
Disaster Recovery addresses service continuity in the case of a disaster that affects the physical datacenter, so that service is maintained through a standby site. Two independent environments, typically in separate and distinct facilities, each contain their own data (in the file system and database) and executables. Data and configuration information are replicated between the production and standby sites.
Defining your recovery objectives (source – MSDN):
RTO (Recovery Time Objective): The duration of acceptable application downtime, whether from unplanned outage or from scheduled maintenance/upgrades. The primary goal is to restore full service to the point that new transactions can take place.
RPO (Recovery Point Objective): The ability to accept potential data loss from an outage. It is the time gap or latency between the last committed data transaction before the failure and the most recent data recovered after the failure. The actual data loss can vary depending upon the workload on the system at the time of the failure, the type of failure, and the type of high availability solution used.
RLO (Recovery Level Objective): This objective defines the granularity with which you must be able to recover data — whether you must be able to recover the whole instance, database or a set of databases, or specific tables.
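To make RTO and RPO concrete, the sketch below computes the values actually achieved in an outage from a few timestamps and compares them with the stated objectives. The function names, timestamps and objectives are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def achieved_rto(failure_time, service_restored_time):
    """Downtime: from the failure until new transactions can take place again."""
    return service_restored_time - failure_time


def achieved_rpo(last_commit_before_failure, last_recovered_commit):
    """Data-loss window: gap between the last committed and last recovered data."""
    return last_commit_before_failure - last_recovered_commit


def meets_objective(achieved, objective):
    """True when the achieved duration is within the stated objective."""
    return achieved <= objective


# Hypothetical outage timeline.
failure = datetime(2024, 1, 15, 10, 0, 0)
restored = datetime(2024, 1, 15, 10, 45, 0)
last_commit = datetime(2024, 1, 15, 9, 59, 58)
last_recovered = datetime(2024, 1, 15, 9, 59, 30)

rto = achieved_rto(failure, restored)            # 45 minutes of downtime
rpo = achieved_rpo(last_commit, last_recovered)  # 28 seconds of lost data
```

With objectives of, say, a 1-hour RTO and a 1-minute RPO, `meets_objective(rto, timedelta(hours=1))` and `meets_objective(rpo, timedelta(minutes=1))` would both hold for this timeline.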
Best practices for High Availability and DR capabilities of a Relationship Business Management system:
Zuora meets or exceeds these availability and DR requirements.
Leading SaaS application providers should use the right data centers for mission-critical applications. A good provider may choose to use either predominantly a private cloud delivery model or a public cloud service for delivery of services, depending on requirements and mission criticality. A hybrid approach can balance scale-out agility against cost to serve.
Zuora has two state-of-the-art datacenter facilities to ensure the highest levels of security, performance, availability and DR failover. Zuora uses a private cloud for its production environments and a public cloud for development and test.
Both of the datacenter facilities are 100% synchronized. The second datacenter runs in warm standby and is able to take over full service capacity. The datacenters are located in separate disaster zones (Las Vegas and San Jose) and are used by notable customers with high security and availability requirements (e.g. Wells Fargo and PayPal).