Data architecture is an important aspect in Software as a Service (SaaS) delivery model. This article is a study on the database of a single-tenant application that could be extended to support multiple tenants. The application is web-based developed using Microsoft technologies – classic ASP, ASP.Net, VB.net and SQL Server 2005.
The key aspects to be considered while moving to multi-tenant data architecture for the application are given below.
Firstly, data isolation should be maintained between different tenants. This is the basic concept of multi-tenancy where data of one tenant should not be available to other tenants.
Secondly, we need to consider some data statistics in the application. Data is inserted into primary tables every 20 seconds and there could be around 100,000 rows per day in each of these tables. In future, there should be provision for data to be added to these tables every 4 seconds. There are around 20-25 tables per tenant and around 5 common tables.
Finally, one of the most important requirements is that changes should be done with no impact to infrastructure as it would mean additional costs.
Multi-tenant data architecture:
With SQL Server (as this is the database being used), the approaches to implement multi-tenancy are Separate Database, Shared Database Separate Schema and Share Database Shared Schema (Refer to Multi-Tenant Data Architecture from MSDN
). Let’s look at the pros and cons of these approaches with respect to the candidate application.
: In this approach, each tenant would have its own separate database.
Data isolation between different tenants can be maintained
This approach is easiest to implement as the database design need not be modified. Tenant-specific connection information would be required to ensure that tenants connect to the correct database
Database backup and recovery process would be easy and will not affect other tenants.
The number of tenants per database server would be limited as per SQL Server settings.
As the number of tables per tenant is less, it does not justify the need for having a database per tenant.
For the common data tables, one of the two approaches has to be followed. Data could be replicated in all the databases which would mean that it has to be maintained across all tenant databases. Alternatively, a common database could contain the common tables used by all tenants.
Shared Database Separate Schema
: In this approach, multiple tenants’ data is stored in the same database with separate schemas containing tenant-specific tables. Each tenant would have access only to the specific schema. Shared tables will be stored in a common schema to which all the tenants would have access.
Data isolation between different tenants can be maintained by creating database users that have access only to specific schemas.
As there are no database design changes, it is easy to implement.
More number of users can be hosted per database server.
Data backup and recovery process needs to be defined for each tenant.
Additional security features needs to be implemented to ensure data isolation.
Shared Database Shared Schema:
In this approach, database and schema is shared across the tenants. Data tables will have data for all tenants and will be identified using TenantId field.
Amongst these 3 approaches, this can host the maximum number of tenants per server.
Database design would change and hence there would be additional development effort.
Any changes to the database design even for one tenant would impact all other tenants.
There could be as many as 1.2 million records per tenant per day in some of these tables. This would impact performance while accessing these tables.
Data backup and recovery needs to be defined for each tenant.
Shared database separate schema approach was suggested for this project as it was the most feasible. Separate database seemed to be the best option. However, due the restrictions on the number of tenants per server this option did not seem economically viable in the long run. Due to the huge data size and additional development/maintenance costs involved, the Shared Database Shared Schema approach was ruled out. Shared Database Separate Schema is the most appropriate as it could be implemented with least maintenance/development costs and is also economically viable.
This is how the data architecture for the suggested approach would be like. Firstly, there would be a common schema which would contain all the common database objects like shared tables and views, stored procedures and functions. This would also include the Tenant Configuration tables. Users will have read-only access to the common schema.
Next, for every tenant there would be one schema. A database user would be created for each tenant with access only to the tenant’s schema and common schema. Refer to figure below.
How to implement:
When a new tenant is added to the application, the following steps needs to be taken for creating the data infrastructure:
1. Create a new schema for the tenant – Schema_T001
2. Create a new database user T001 with access to the above schema and common schema. The default schema for the user T001 is set to this schema - Schema_T001.
3. Create tenant specific tables under the schema Schema_T001
4. Add Configuration details for the tenant in the Common configuration table. This would also include the connection string that will be used by the application to connect to the database.
5. Deny access to T001 on all other tenant schemas in the database.
When a user for this tenant accesses the application, the configuration details are used to connect to the database. Stored procedures will have queries like:
SELECT EmployeeId, EmployeeName FROM tblEmpDtl
As the tenant T001 has default schema set to Schema_T001, the above query will retrieve details from tblVehicle table in Schema_T001 and not from any other schema. Also as T001 is denied access on other tenant schemas, this user will not be able to access data from other tenants. For retrieving data from tables in common schema, table names should be prefixed with schema name Schema_Common.tblTenantCon
Using separate schema approach most of the key aspects are met with minimal impact to cost and time. In-built security features provided by SQL Server ensures that development and maintenance costs are less. By separating tenants into different schemas takes care of performance issues due to data volume and more concurrent users. Also this approach could be implemented on existing servers without any changes to the infrastructure.