This document describes how schema extensibility is typically achieved in commercial enterprise software applications; in particular, in customer relationship management (CRM) applications.
Flexibility Requirements
In CRM applications, implementing organizations need to extend the customer data model with business-specific information. The data model extensions consist of adding single-valued and/or multiple-valued properties to the customer data record.
- The data model extensions must be efficient. The customer data tables typically have tens of thousands to hundreds of thousands of records in a business-to-business setting, and millions to tens of millions of records in a business-to-consumer setting.
- The data model extensions must facilitate efficient query and join processing in the database.
- The data model extensions must not prevent the vendor from easily upgrading the implementation in the field.
- The data model extensions must allow the vendor to ship a single binary code image that can be run in all customer implementations.
General Approach
Vendors typically use a metadata-driven approach:
- Data model extensions are limited to certain tables and are described with schema extension tables.
- The customer of the application can extend the schema by describing the schema extension in a UI, which saves records in the schema extension tables.
- Once the customer has finished designing the schema extension, the application extends the schema by adding columns to the extensible tables (for single-valued properties) or by creating new sub-tables (for multi-valued properties).
The implementing organization wishes to add a "favorite color" field to their customer records. The extension table has the following schema:
Extension Table Schema:
COLUMN NAME | COLUMN TYPE |
Table_name | Varchar |
Prop_name | Varchar |
Prop_type | Varchar |
Prop_length | Integer |
Prop_precision | Integer |
Prop_singlevalued | Boolean |
Through a UI, the user creates the following record:
COLUMN NAME | COLUMN VALUE |
Table_name | Customer |
Prop_name | Favorite_Color |
Prop_type | Varchar |
Prop_length | 25 |
Prop_precision | Null |
Prop_singlevalued | True |
When the user is satisfied (ie when the user hits a button), the application executes the schema change by issuing the following SQL statement:
ALTER TABLE Customer
ADD COLUMN Favorite_Color varchar(25);
Notes:
The Column_precision column will only be used for NUMERIC fields.
Extension columns must typically be NULL-able or the user must provide a default value for the new column.
Example: Multi-Valued Schema Extension
The implementing organization wishes to associate a list of purchased products with a customer record. The schema extension record looks like this:
COLUMN NAME | COLUMN VALUE |
Table_name | Customer |
Prop_name | Purchases |
Prop_type | Integer |
Prop_length | 11 (or NULL) |
Prop_precision | Null |
Prop_singlevalued | False |
The application executes the following schema change in response:
CREATE TABLE Customer_Purchases(
Customer_id INT NOT NULL,
Purchases_id INT NOT NULL,
Purchases_Value INT (11),
PRIMARY KEY (customer_id, purchases_id),
FOREIGN KEY (customer_id)
REFERENCES Customer (customer_id)
);
Thus, the Customer_Purchases table associates a set of Purchases values with the Customer record.
Scalability Issues
Record Length Limits and Views
Databases typically limit the size of a single record. Implementations deal with this by putting all the extension fields in a separate table joined back to the main table. If necessary, a view can be created to maintain the illusion of a single table:
-- this table is not extensible:
CREATE TABLE Customer_Base (
Customer_id INT NOT NULL PRIMARY KEY,
[OTHER FIELDS GO HERE]
);
-- this table is extensible
CREATE TABLE Customer_Ext (
Customer_id INT NOT NULL PRIMARY KEY,
);
-- add view
CREATE VIEW Customer AS
SELECT CB.*, CX.*
FROM Customer_Base CB JOIN Customer_Ext CX
ON CB.customer_id = CX.customer_id
The view is mainly there to hide the details of the customer schema from report writers.
Indexes
It is generally not a good idea to allow users to easily create indexes on the single-valued schema extension columns. Indexes impose overhead and should only be added to extension fields after it is determined that the index will result in a speedup that justifies the index maintenance overhead.
Hi, found in google that you blogged on "NullPointerException: Illegal null name at com.tagish.auth.TypedPrincipal.".
ReplyDeleteI have the same problem and am wondering on how you solved it; your post on this seems to be gone...
Move-yo: The tagish post is kind of obsolete. I've been using pluggable ldap authentication providers if I want to use Active Directory to authenticate java web apps. e.g. http://community.jboss.org/wiki/ldapextloginmodule
ReplyDelete