CONCLUSIONS:
When using natural keys, as they have a business meaning, the parent key information is available within the child table.
Using surrogate key requires one more join: sql queries are more complex, and less efficient.
=> If not needed for other reasons, the rule is to avoid unnecessary surrogate keys here.
=> If a surrogate key is however chosen, we need to tune the join operation (indexing the foreign key for example).
2- Business key updates:
REQUIREMENT:
User SCOTT is moving to another company site. His old phone number was 99-1234. His new phone number is 11-6789.
There is two possible scenarios here: old calls from previous number are kept (scenario A-) , or they must be reassigned to the new phone number (scenario B-)
NATURAL KEY MODEL:
A- If business wants to keep the call history assigned to the old phone number, then there is no need to cascade the update:
update PHONELINES set USER_NAME=null where PREFIX_NO=99 and EXTENSION_NO=1234;
insert into PHONELINES (PREFIX_NO,EXTENSION_NO,USER_NAME) = ('11','6789','SCOTT') ;
commit;
There is no foreign key lock issues here because the CALLS foreign key is indexed: as it is part of the primary key, then the foreign key is the first column of the index.
If it were not the case (non-indexed foreign key) then the delete will wait until there is no concurrent insert in CALLS - even if they don't concern the same phone number.
So in that case the modification cannot be considered as an OLTP operation and may have to be planned at offline hours to avoid locks.
B- Now, if business wants to move the call history to assign it to the new phone number - as if calls were originated by this new number - then we really have a mutable key problem.
We need to cascade the update to all calls originated by the old number and Oracle (as well as most other RDBMS) does not have efficient ways to do that.
insert into PHONELINES (PREFIX_NO,EXTENSION_NO,USER_NAME) = ('11','6789','SCOTT') ;
update CALLS set PHONELINE_PREFIX_NO=11 and PHONELINE_EXTENSION_NO=6789 where PHONELINE_PREFIX_NO=99 and PHONELINE_EXTENSION_NO=1234;
commit;
That is a big query that may have performance or concurrency issue if run online. It should be planned as an offline operation.
It may even have to be considered as a maintenance operation, such as when the phone prefix plan changes do to company reorganization.
SURROGATE KEY MODEL:
A- If business wants to keep the call history assigned to the old phone number, then the new number is seen as a new phone line:
update PHONELINES set USER_NAME=null where PREFIX_NO=99 and EXTENSION_NO=1234;
insert into PHONELINES (PHONELINE_ID,PREFIX_NO,EXTENSION_NO,USER_NAME) values (PHONELINE_SEQUENCE.NEXTVAL,'11','6789','SCOTT') ;
commit;
B- If business want to move the call history to assign it to the new phone number, then the phone line do not change and only its phone number is changed:
update PHONELINES set PREFIX_NO=11 ,EXTENSION_NO=6789 where USER_NAME='SCOTT';
commit;
Both operations are quick as they do not update the primary key (PHONELINE_ID). They can be normal OLTP operations.
CONCLUSIONS:
Surrogate keys are necessary when the business key is not static enough (static as the with the '{frozen}' UML property)
But, if the updates to the business key are rare and can be considered as an offline batch operation, or as a maintenance operation, then business key can still be chosen as the primary key.
However, we have seen 2 scenarios here, and we need more information from business to make the choice.
The fact that we cannot identify a static natural key may point to the fact that we have done an incomplete functional analysis.
In our example, we can imagine that we must distinguish the phone number from the physical phone line, and have an association table that records the assignation of a phone number to a phone line. If we introduce a surrogate key here, we hide the fact that the surrogate key is in reality a new business entity - the phone line - distinct from the phone number.
The risk here is that one day business needs to manipulate that entity (as it has a business meaning) and then it is not a surrogate key anymore.
=> The rule is to introduce a surrogate key when the natural key can change during online activity.
=> However, it is a good practice to check first if that surrogate key does not hide a business concept that was forgotten during functional analysis.
3- Candidate key values are unknown at object creation (at insert):
REQUIREMENT:
We need to create a phone line, and have calls from it, even before a phone number is assigned to it.
NATURAL KEY MODEL:
Primary key cannot be null. So we must assign a dummy value (that is not a real natural key as it has no business meaning), and then we will have to update it and cascade the update to the calls that have already be recorded. A primary key that is not know at insert time is not a good primary key.
SURROGATE KEY MODEL:
There is no problem here when using a surrogate key. It is always assigned at insert time (usually with a sequence).
We can even create a row with only the surrogate key and update attributes later. But in that case, we cannot use not null integrity constraints on the other columns.
CONCLUSIONS:
Surrogate keys are needed when we don't have a natural key assigned at the creation of the row.
However, that may reveal an incomplete functional analysis.
In our example, it seems again that there is an ambiguity between phone numbers and phone lines, and we probably need 2 entities (tables) for that.
Phone number is not an attribute of the phone line in that use case. It should be another entity that is associated to a phone line.
=> Surrogate keys must be introduced when natural key is not known at insert.
=> However, it is a good practice to verify first if that surrogate key does not hide a business concept that was forgotten during functional analysis.
4- Partitioning possibilities:
REQUIREMENT:
We need scalability when number of customer increases, and we need to regularly purge old orders.
NATURAL KEY MODEL:
Table CALLS can be partitioned by range of CALL_START_DATETIME so the performance of queries on recent calls is not dependent on the history retention.
All indexes can be local so that purging old calls is a quick operation (truncate partition) without maintaining/rebuilding indexes.
Table CALLS can also be partitioned (or subpartitionned) by hash or list on PHONELINE_PREFIX_NO so that filtering calls from a specific prefix is quick.
SURROGATE KEY MODEL:
Table CALLS cannot be partitioned on PHONELINE_PREFIX_NO. It can be partitioned on PHONELINE_ID but that does not help queries on PREFIX_NO
Table CALLS can be partitioned by range of CALL_START_DATETIME but then the unique index that enforces the primary key (CALL_ID) cannot be local.
Purging old data when having a global index will be less efficient.
CONCLUSIONS:
There are severe restrictions on partitioning when introducing surrogate keys because less business attributes are present in the child table, and because unique index for the surrogate key cannot be local if it is not part of the partition key.
Two oracle features can overcome those problems when using surrogate keys:
- Referential partitioning (available in 11g) to partition on a parent table column.
- 'UPDATE INDEXES' clause when doing operations on partitioned tables in order to maintain the global indexes (but has some performance overhead)
=> If not needed for other reasons, the rule is to avoid unnecessary surrogate keys for tables that may be partitioned (small volume is not concerned)
=> If surrogate key is however chosen, some Oracle features can be investigated to limit the disadvantages.
5- Indexes overhead:
REQUIREMENT:
The CALLS table is heavily transactional and inserts on it is a transaction where performance requirements are critical.
In addition, we need indexes to search calls by origin (phone line) or destination (partner phone line)
NATURAL KEY MODEL:
We have 2 indexes on the calls table :
- one for the calling line / call time (PHONELINE_PREFIX_NO, PHONELINE_EXTENSION_NO, CALL_START_DATETIME) to enforce primary key, and to search by call origin
- one for the call partner (PARTNER_PREFIX_NO, PARTNER_EXTENSION_NO, CALL_START_DATETIME) to enforce uniqueness and allow searching by call destination
If both indexes have a b-tree of 2 levels, each insert on CALLS accesses 7 blocks (1 for the table, and, for each index, 2 branch blocks to follow b-tree plus 1 leaf block).
SURROGATE KEY MODEL:
We have 3 indexes on the calls table :
- one on CALL_ID for the primary key
- one for the calling line / call time (PHONELINE_ID_ID,CALL_START_DATETIME) to enforce uniqueness and search by call origin,
- one for the call partner (PARTNER_PHONELINE_ID,CALL_START_DATETIME) to enforce uniqueness and allow searching by call destination.
Here we have one more index to maintain for each inserted rows: 3 more blocks must be accessed which is in this case an additional performance cost of 42%
Besides that, that additional index on the surrogate key has constantly increasing values (as generating them from a sequence is the only way to avoid contention to generate unique id) and all inserts goes to the high end of the b-tree index. That is very good for index space as there is no need to keep free space in other blocks, but that can result in contention on that high end block. A possible solution is the use of reverse index, but then index will need free space in all blocks for new inserts. We have to decide to implement normal or reverse key index for that surrogate key.
CONCLUSIONS:
Surrogate key often introduce another index where natural key index can serve 2 goals: key enforcement as well as business meaning.
In addition when table is partitionned, the index cannot be local. So that this overhead is not scalable.
=> If not needed for other reasons, the rule is to avoid unnecessary surrogate keys when performance requirement for inserting new rows is important.
=> If still chosen, this overhead must be evaluated, the usage of reverse index or not must be analyzed carefully for the index on the surrogate key.
6- Special values handling:
REQUIREMENT:
We have calls that are not assigned to any partner phone lines:
PHONE 99-1234 dials an erroneous number : '1234-12'
PHONE 11-6789 calls his voicemail
NATURAL KEY MODEL:
Those numbers are not real phone lines, so they cannot have a key that has a business meaning.
We can have null value for the partner phone number:
insert into CALLS ( PHONELINE_PREFIX_NO, PHONELINE_EXTENSION_NO, CALL_START_TIME, PARTNER_PREFIX_NO, PARTNER_EXTENSION_NO, DURATION )
values ('99','1234',to_date('01-JAN-2007 08:30:00','dd-mon-yyyy hh24:mi:ss'), null , null , 3);
But we have only one special value - NULL - that means that number is unknown.
And in addition those calls will not be indexed with a regular index on (PARTNER_PREFIX_NO,PARTNER_EXTENSION_NO)
If we want to record the call to voicemail (and not have it as unknown also) then we need to introduce a special phone line when we create our PHONELINES table:
insert into PHONELINES (PREFIX_NO,EXTENSION_NO,USER_NAME) values ('XX','VMAIL',null)
And then the call to the voicemail is recorded as:
insert into CALLS ( PHONELINE_PREFIX_NO, PHONELINE_EXTENSION_NO, CALL_START_TIME, PARTNER_PREFIX_NO, PARTNER_EXTENSION_NO, DURATION )
values ('99','1234',to_date('01-JAN-2007 08:35:00','dd-mon-yyyy hh24:mi:ss'), 'XX', 'VMAIL', 60);
But that is not anymore natural key as it has only a technical meaning.
SURROGATE KEY MODEL:
Surrogate keys, as they have only a technical meaning can have infinite special values that are set when we create the PHONELINES table. We add a SPECIAL_NAME column on PHONELINES to differentiate them:
insert into PHONELINES (PHONELINE_ID,PREFIX_NO,EXTENSION_NO,USER_NAME,SPECIAL_NAME) values (-1,null,null,null,'error')
insert into PHONELINES (PHONELINE_ID,PREFIX_NO,EXTENSION_NO,USER_NAME,SPECIAL_NAME) values (-2,null,null,null,'voicemail')
Then CALLS can reference those values as any other phone line.
CONCLUSIONS:
A natural key must have a business signification for all values. Only NULL (that has the business meaning of 'unknown') is allowed.
If there is a need to have several technical values, that have no business meaning, then a surrogate key can be helpful.
But, that may also come from an incomplete functional analysis.
First, with the surrogate key we have introduced a new column to differentiate those values. That may hide the fact that the PHONELINE covers different business concepts.
In addition, what if the dialed number do not fit in the phone number format (2 + 4 digits) ? Or if a future evolution of the system allows external calls, with another digit pattern ?
In our case, we probably have 2 business concepts: the internal phone line, and the dialed number.
However, choosing surrogate keys to group similar business concepts can be a good choice in datawarehouse star schemas, in order to limit the number of dimensions.
Inheritance implementation is also a case where we need one key to cover different entities. An example is one table that group all company actors (suppliers, customers, employees) which have a different natural key.
=> Surrogate keys are useful to record several special (technical) values for one entity, or to implement all classes in an inheritance tree into one table.
=> However, it is a good practice to verify first if that surrogate key does not hide a business concept that was forgotten during functional analysis.
7- Surrogate keys visible to users:
REQUIREMENT:
A web application displays the call history, using an URL that gets directly to the phone line call history
NATURAL KEY MODEL:
URL will be something like 'http://application.intranet.com/showCalls?phone_prefix_no=99&phone_extension_no=1234'
The user can bookmark that link, send it by mail to a colleague, etc. The URL manipulates only business values.
SURROGATE KEY MODEL:
URL will be something like 'http://application.intranet.com/showCalls?phone_id=0026874'
The user now manipulates blindly a value that should not be used outside of the database.
In the case where a user moves and changes his phone number, then the bookmarked URL will point to incorrect data.
This is not a rare example. We all have bookmarks in our navigator that points to a '404: NOT FOUND' page because the page location has changed: page location was internal to the web application system, but were made visible outside to end users.
This is an example with URL. It is also frequent to see surrogate keys printed in some reports, on product labels or identification cards.
CONCLUSIONS:
When using surrogate keys, there is a risk that, one day, the key is visible outside the system, and then becomes a natural key.
Business often needs to identify its objects. If business has not been able to define a natural key, but need to have an identifier, we can see situations where they will use the surrogate key. Then, that key has both disadvantages of surrogate key and natural key.
If business do not have a natural key, but needs one, then we can generate one from a sequence, and make it visible to the end user. however, this is not a surrogate key. it is a real natural key that is assigned by the system, but then can be manipulated outside of it. An example is a customer number: the system checks if the customer is an existing one (same name, same address, etc...) and generate a value for new ones. Customer number is a natural key that has a business meaning, it can be exchange between several systems (customer care, orders, billing, accounting, etc), The customer is aware if his customer number.
In this example, Business had no natural key (as name, address, email, etc. are not good natural keys: they are not identifying, and are mutable). So the IT systems creates a natural key for them.
=> The rule is to introduce surrogate key only for technical reason, and not because business need an identifier that they can't define.
=> It is a good practice to verify first if that surrogate key does not hide the need for a new business attribute, needed for the whole information system (IT and non IT).
8- Identifying natural key:
REQUIREMENT:
From functional analysis, we have difficulties to define a real business key. For example, we are not sure that (phone line + call time) uniquely identifies a call when using special services such as conference calls (one phone line can call several numbers at the same time, and each one are a different call).
NATURAL KEY MODEL:
That is a real problem as we must be sure that primary key is unique. And changing the primary key definition concerns several tables, so it is a very deep change.
If a future evolution need to accept 2 calls that can be issued from the same phone at the same time then we need to revisit a big part of our data model, and related software components.
SURROGATE KEY MODEL:
As the referential integrity do not depend on the business key, we can accept badly defined business keys, or future evolution of the business concept definitions.
If 2 calls can be issued from the same phone at the same time, then only the CALLS table structure will change.
CONCLUSIONS:
Surrogate keys can help when business candidate keys are not properly identified. However, business usually need one day to identify the objects they define, and there is a risk when using surrogate keys that it hides some errors done during analysis.
For example, if business do not have a real mean to identify calls, then one day they will probably use our surrogate key to identify them: first within the system, later with external systems, and finally with the end user.
=> Candidate keys are often difficult to identify, and surrogate keys can be introduced.
=> However, that must not be used as an easy way to design on bad specifications.
9- Merging different systems:
REQUIREMENT:
Each company site have the same phone software, but deployed at each company sites, with their own databases.
Now that the network bandwidth is enhanced, and IT department wants to merge all those systems into one database.
NATURAL KEY MODEL:
As natural keys have a meaning outside the system, there is no problem to merge calls from the different company sites. There will not be duplicates as phoneline prefix is different among sites. This maintenance operations involves the following operation on the target database:
insert into PHONELINES select * from PHONELINES@DBLINK_SITE1 union all select * from PHONELINES@DBLINK_SITE2 ...
insert /*+ append */ into CALLS select * from CALLS@DBLINK_SITE1 union all select * from CALLS@DBLINK_SITE2 ...
commit;
Then change the sequence current values to the maximum existing ones.
That is straightforward and has optimal performance.
SURROGATE KEY MODEL:
Here each system has generated its surrogate keys from its own sequence. We cannot merge that as-is because we will have duplicates. We need to re-numerate the surrogate keys.
- import PHONELINES assigning a new PHONELINE_ID from the sequence
- join the source CALLS table with the PHONELINES table to translate surrogate key into PHONE_PREFIX_BO and PHONE_EXTENSION_NO
- join the result with the target PHONELINES table to get the new PHONELINE_ID associated with PHONE_PREFIX_BO and PHONE_EXTENSION_NO
- insert that result to target CALLS table
That usually introduces a step where identifying a natural key is needed to be consistent between the systems.
CONCLUSIONS:
Replicating data from different system is a complex operation when using surrogate keys because those keys are relevant to one system only.
That operation is not a frequent online operation, but same issue is encountered in several real-life situations:
- merge several systems into one
- distribute application in several systems
- share data with external systems
- feed a datawarehouse from different operational systems
For those operations, surrogate keys must be resolved to natural keys before data is transported.
=> If not needed for other reasons, adding surrogate keys adds complexity to several operations that transport data between different systems.
=> If surrogate key was introduced because of the difficulty to identify a business identifier, that difficulty will need to be solved for those operations.
10- Natural key evolution:
REQUIREMENT:
Our system evolves to manage several companies:
We need to add a COMPANY_CODE to the phone number to properly identify a phone.
NATURAL KEY MODEL:
We need to add COMPANY_CODE column to the PHONELINES table, and that column will be part of the key. Probably referencing a COMPANIES table.
As a consequence, all child tables must add the company code in their foreign keys: several DDL changes, and it increases the key length.
The good thing is that we can partition those table by the company code to get same performance as when they were multiple databases.
SURROGATE KEY MODEL:
We add COMPANY_CODE, but it is not part of the key, so there is no DDL changes to cascade. As we are using surrogate keys, we will probably add a COMPANIES table with COMPANY_ID surrogate key , and COMPANY_CODE non-key column. So the drawback is that we will have one more table to join if we use a predicate on company code. And no possibility to use partitioning on that.
CONCLUSIONS:
When a new attribute is introduced in the business key, the change is more complex when using natural keys.
But that evolution lead to disadvantages at both sides:
- the number of columns for composite natural keys, and the number of changes in the data model.
- the number of joins when using surrogate keys, the lack of possibility to partition child tables on that attribute.
As those points are some of the parameters that influence the decision on surrogate vs. natural key, the the decision should be re-analyzed.
=> Surrogate keys can be introduced for flexibility in schema evolution - if not prevented for other reasons.
=> However, in both cases, we need to investigate the consequences on other concerns.
11- Composite key:
REQUIREMENT:
Our system has a high depth of referential integrity.
We already added the company code to the phone line, and in addition to the model described above, one call has several services associated to it, each service has a rate that depends on the duration, so we have to detail calls into call services and call services into call service time buckets.
NATURAL KEY MODEL:
Each master-detail relationship adds a new column to the key. the CALL_SERVICE_TIME_BUCKET table will have the following primary key:
PHONELINE_PREFIX_NO,PHONELINE_EXTENSION_NO,CALL_DATETIME,SERVICE_CODE,BUCKET_NUMBER
Even if compound keys are not a problem by themselves, we can go to a limit where sql queries will be too complex to read, where indexes will be very large, and where joining tables on those columns will be less effective that with a single number.
There is no limit, but having 5 or more large columns can be bad.
SURROGATE KEY MODEL:
The key is always one single numeric column. However, composite keys are still needed for association tables that implement the many-to-many relationship.
CONCLUSIONS:
The complexity of composite keys must be analyzed to take a decision about natural or surrogate key.
=> Surrogate key may need to be introduced at some level to avoid composite natural keys become too large.
12- Cost Base Optimizer:
REQUIREMENT:
We have very different distribution of calls from the different sites (recognized by the prefix number)
and we need optimized performance for all queries.
NATURAL KEY MODEL:
We create histograms on PHONELINE_PREFIX_NO columns, so that the Oracle
optimizer can estimate the accurate cardinality for queries that are
done on a specific prefix.
SURROGATE KEY MODEL:
We can create that kind of histograms only on PHONELINES, that will
show the distribution of phone lines among company sites, but not the
distribution of calls.
CONCLUSIONS:
Natural keys have more information about the data that can be used by the optimizer.
Even without histograms, the CBO has more information when the
appropriate datatype is used and composite keys have statistics for all
part of them. So the complexity we have with composite keys can be a
real advantage for the optimizer.
=> If not needed for
other reasons, natural keys may be preferred to give more information
about data to the Oracle optimizer.
=> If surrogate key is introduced to replace a large composite key,
we must check if we don't hide important information (metadata) to the
optimizer.
13- Storage (Row Size):
REQUIREMENT:
Our system has to store a large call history and we want to avoid unnecessary storage, so we need to have the minimal rows length.
NATURAL KEY MODEL:
PHONELINES has the minimal size as it records only business columns
CALLS has however to store the foreign key (PHONELINE_PREFIX_NO,PHONELINE_EXTENSION_NO) that is 6 bytes.
SURROGATE KEY MODEL:
PHONELINES has an additional column for the foreign key, and an additional index on it.
If we have less than one thousand phone lines it takes only 3 bytes per row, so the overhead is not so important.
CALLS replaces the natural foreign key with the surrogate key that is a number. If we have one million calls, the number takes less than 4 bytes, instead of the 6 bytes for the primary key. Here again the gain is not so important.
CONCLUSIONS:
The difference on the average row size is not very important, and it depend on the size of the natural key, the depth of referential integrity, etc.
=> The size of rows is not a real argument to choose surrogate or natural key, except special cases (natural key very large, and big detail table for example)
14- Number of possible key values :
REQUIREMENT:
We need to be sure that there is no limit on the number of calls to store.
NATURAL KEY MODEL:
There is no limit: all values are possible as they have all a business meaning.
SURROGATE KEY MODEL:
A number can store 38 digits. No limit will be exhausted even when generating billions of numbers per seconds during hundred of years.
CONCLUSIONS:
=> number of values for the key is not an argument.
15- Data modeling tools:
REQUIREMENT:
We want to use some data modeling tool that do not support composite keys very well.
NATURAL KEY MODEL:
We may have issues as natural key is often composed of several columns.
SURROGATE KEY MODEL:
We still have issues as we cannot avoid composite keys for many-to-many association tables.
CONCLUSIONS:
Composite primary keys cannot be avoided even when using surrogate keys as much as possible.
In addition, the data model should be driven by business needs and performance requirements, not by the features of a tool .
=> This cannot be real argument.
16- SQL complexity:
REQUIREMENT:
We want our code to be easily readable, avoiding complex sql queries.
NATURAL KEY MODEL:
We have less joins, but when we have joins, they are more complex (several columns). In addition, checking that natural key is not updated during online transactions adds a little complexity.
SURROGATE KEY MODEL:
We have simpler joins, but more joins. In addition, checking that surrogate keys are never shown to the end user adds some complexity.
CONCLUSIONS:
Both have advantages and disadvantages.
=> The SQL complexity is not a real argument.
16- Index Organized Tables:
REQUIREMENT:
We need to quickly find the call history for a user (for billing purpose)
NATURAL KEY MODEL:
The query will access PHONELINES for a USER_NAME and then will get its calls by a nested loop, using the index on CALLS primary key (PHONELINE_PREFIX_NO , PHONELINE_EXTENSION_NO)
In order to improve that access, we define CALLS table as an IOT (index organized table) so that when calls are inserted, they are clustered together for the same phoneline.
SURROGATE KEY MODEL:
The query will access PHONELINES for a USER_NAME and
then will get its calls by a nested loop, using the unique index on
CALLS (PHONELINE_ID)
But as PHONELINE_ID is not in the primary key, we cannot cluster rows that are related with same phoneline. An IOT on primary key (CALL_ID)
CONCLUSIONS:
Heap tables are great to handle concurrent inserts. But rows are spread across the whole table and retrieving them will need a block read for each index entry.
Index Organized Tables gives an efficient way to cluster rows physically, so several index entries can be served by only one block read.
=> If not needed for other reasons, the rule is to avoid unnecessary surrogate keys when data can be accessed by its business key.
=> If a surrogate key is however chosen, we may need to reorganize the table frequently to keep a good clustering factor.
Conclusion
Conclusion
NATURAL KEYS
-
There are business attributes that identifies the business object instance
-
Those attributes are known during the whole business object life cycle (from its creation)
-
Those attributes values do not change during the whole business object life cycle - or it is exceptional.
-
Those attributes are used to access to specific business objects (and then are used in indexes, partition keys, IOT)
-
For each business object instance, the value of those attributes has a meaning for business.
-
The data has to be exchanged among different system.
-
We don't expect future evolution on the business object definition (and the key definition).
-
The key is composed with few attributes (5 starts to be high)
SURROGATE KEYS
-
No business attributes identifies the business concept, and business do not need an identifier.
-
The business identifier is not static during the object life cycle
-
The business identifier values must also include special technical (non-business) values
-
The functional specification cannot be made stable enough to limit future evolution of the business key
-
The surrogate key will never be visible outside if the system (in prints, reports, external systems, datawarehouses, etc)
-
There is no critical performance requirements to query that object
This is a case-by-case analysis, and some compromises may have to be done.
If a compromise has to be done, some RDBMS features can be implemented to limit the drawbacks
If surrogate key seems to be necessary, it is often a good idea to iterate on the functional specifications to be sure that it does not hide a missed business concept.
The goal the document was to list all concerns related with the choice of natural or surrogate key, so that the choice is not done by a systematic 'rule of thumb', but with a complete knowledge of the consequences.
References
...