Glossary of Terms Used in Microsoft Data Patterns
Retired Content |
---|
This content is outdated and is no longer being maintained. It is provided as a courtesy for individuals who are still using these technologies. This page may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. |
Version 1.0.0
Table 1: Definitions for Data Patterns
Access profile | A description of the characteristics of how application queries access the data store, such as search condition, size of the result set, frequency, and required response time. |
Acquire | A service that gets a movement set from a data source. Acquisition may be a simple one-step process, or it may be a multi-step process. Acquire can enrich the data by adding details (such as the time the data was acquired) to allow for management of the overall data integrity. It can acquire the movement set from the data structures directly, or it can acquire the set from other caches where only data changes are stored. Typically these caches are either database management system (DBMS) log record stores, message queuing system stores, or user-written databases. Acquire must either collect all changes, in which case the ordering of the changes is vital so that they are written correctly by the Write service; or it must collect the net change, which is the final result of all the changes that have occurred to the set since the last transmission. |
Aggregation | Creation of a compound record or element from individual records or elements, where a record is a collection of data elements in the data store. In relational terms, a record is a row, and elements are columns. |
Asynchronous | A style of processing where an application posts a request for an event to occur and then continues without waiting for the event. A separate service will recognize the request and take responsibility for ensuring that the event occurs. |
Cascading replication | A hierarchical assembly of replication building blocks used for related replication transmissions. In this structure, the source(s) replicate to intermediary targets (of which there may be several layers). The intermediaries switch roles and become sources for the next replication link. This process continues until the replicated data reaches the end target. This configuration is used to reduce workload on the source when there are many end targets that all want the same replication set or a very similar replication set. |
Composite movement set | A collection of one or more movement sets. Data in a composite movement set can come from one or more data stores. In a data store, the subset that is to be moved is a movement set. The sum of these movement sets is called a composite movement set and comprises all the data you want to move to the application. Composite movement sets are usually relevant to Extract-Transform-Load (ETL). |
Conflict | A situation that arises whenever two or more copies of the same data are updated independently within the same replication interval. Conflicts are detected only when one of the copies replicates its data to the other copy and the Write service discovers that the other copy of the data has been changed since the last replication. The conflict must be resolved by the Write service. |
Conflict detection | The process of detecting conflicting change transactions on the common data in a source and target during a transmission. |
Conflict resolution | The process of resolving conflicting change transactions on the common data in a source and target during a transmission. The resolution method specifies whether the source or the target change should overrule the other; or it may return data as an aggregated result of the conflicting transactions; or it may require manual intervention to resolve some conflicts, for example, those where complex business rules need to be invoked. |
Data movement | The act of reliably and repeatedly moving a copy of data from its current physical location(s) to different location(s) and possibly transforming its contents. This action requires several architectural components that are described in the patterns, and a process that is outside the scope of these patterns. |
Data movement building block | The fundamental architectural building block for data movement. This block is used to assemble all solutions that move data copies to the applications that need them. It consists of a source, a movement set, a data movement link, and a target data store. |
Data movement link | A connection between the source and target along which the relevant source movement set moves from one data store to another with appropriate security. This link includes the method of transmission of data at each step that moves the data (which includes any needed intermediary transient data stores). The data movement link also includes the Acquire, Manipulate, and Write replication services. |
Database | A collection of data managed by a DBMS. The scope of the term database can vary depending on the DBMS product used. For clarity, these patterns use the term as defined by Microsoft SQL Server. |
Full replication | A replication in which a whole replication set of complete rows is moved from the source to the target on every transmission through a replication link. (A full replication is also called a snapshot replication.) |
Immediate replication | A replication in which every change to the source triggers a replication transmission to the target. When using a database, the changes will be transmitted immediately after the changing transaction commits its changes to the DBMS. |
Incremental replication | A replication in which only the changes that have been made to rows since the last transmission are sent (as opposed to the complete replication set). When designing an incremental replication, you need to decide whether to send all changes that have occurred to any particular record during the replication interval, or whether to send only the net effect of those changes. |
Key updates | Changes to the primary key of a database record, such as SQL updates to the columns of the table key within a replication set. The replication must handle such key updates with special care. |
Manipulate | A service that changes the content or form of the movement set in some way and passes it on in a format that can easily be written to the target. Manipulations can vary in complexity from a null event (where Manipulate does not change the data) to very radical data alterations. |
Master-master | A source-target relationship in which the replication set can be changed at either the source or the target within a replication interval, and these changes are to be posted back to the other party on the next transmission in its direction. Thus the source and target are equals with respect to rights to make changes to the replication set. (Master-master replication is also known as peer-to-peer replication.) The write logic of a replication link must include logic for resolving multiple-updater conflicts, and two replication links must exist between the peers since they will swap source and target roles when exchanging data (see following figure). The Master-master relationship should not be confused with a pair of master-subordinate relationships between source and target. Although the configurations look similar, the pair of master-subordinate relationships does not provide the capability to update a common set of data at either end (see following figure). |
Master-subordinate | A source-target relationship in which the source replication set is written to the target without checking for conflicts. Either the replication data at the target data store is read-only, or any updates made at the target to the replicated data are not copied back to the source database and can be overwritten by a later replication transmission. |
Movement set | An identified subset of data that exists within a single source. A movement set is copied from that source and is sent across a data movement link to one or more targets. During the copy operation, the content and form of the movement set may change as it is acquired, manipulated, and written. |
On-demand replication | A replication in which transmissions are started by explicit operator request, as opposed to being triggered or scheduled by an automated process. |
Operation | An action performed on a row of data, such as an INSERT, UPDATE, or DELETE operation. |
Optimistic concurrency control | A data integrity technique that allows multiple parties to update different copies of the same set of data. When the changes are merged, a check is done to see if the changes affected the same data within the data store. If such a conflict is detected, it must be resolved by a defined method; for example, the more recent change overrules any older changes. |
Pattern | A three-part relationship between a general problem, its context, and its solution, which is based on real-world experience and is documented in a consistent, formal structure. A pattern encapsulates experienced practitioner knowledge and can be used as a starting place for creating solutions to specific situational problems. |
Pattlet | A placeholder for a pattern where there is good cause to believe that a pattern exists that has not yet been written. Usually expressed as a name, and a problem or solution statement, or both. |
Periodical replication | A replication in which transmissions are scheduled to be run at a fixed time or after a fixed interval. |
Pessimistic concurrency control | A data integrity technique that requires an application to acquire a data lock before it can change data. This means that only a single party can change the data at any point in time. |
Projection | A selected subset of columns from tables. If the replication set is not a full replication, then it is a projection. |
Publication | The Microsoft SQL Server term for a set of data made available for replication by a publisher. |
Publisher | In Microsoft SQL Server, the role of a platform that provides the source for a replication link. |
Push replication | Replication that is invoked at the source. |
Pull replication | Replication that is invoked at the target. |
Redundant data | Any data that does not provide new information but already exists elsewhere in the environment as an exact copy or is derived by more complex manipulations, such as aggregations. |
Related replication links | Replication links that require information about one another's actions because of the relationship they support between the source and target. Master-master replication uses two related replication links between source and target in opposite directions to allow changes to the replicated data at either end, and to transmit the changes to the counterpart. |
Replication | The act of reliably and repeatedly moving a copy of a set of data from its current physical location(s) to different location(s). If both source and target have updated the replicated data since the last replication transmission, the process of writing the data to the target may be complex. Otherwise, the process of moving the data is very simple. |
Replication building block | The fundamental architectural building block for replication. This block serves as the basis for all replication solutions. It consists of a source replication set to be replicated, a replication link, and a target database. |
Replication interval | The period of time between replication transmissions. |
Replication link | A connection between the source and target along which the relevant source replication set can be moved from one database to another with appropriate security. This link includes the method of transmission of data at each step that moves the data (which includes any needed intermediary, transient data stores). The replication link also includes the Acquire, Manipulate, and Write replication services. |
Replication set | A movement set that is used for data replication. A replication set consists of one or more replication units. |
Replication transmission | The act of moving a replication set from source to target. |
Replication unit | The smallest amount of data that can be discretely recognized in a transmission. The replication unit can be one of the following: the complete replication set; a table in the replication set; a transaction; a row (of a table in the replication set); or a column (of a row in a table in the replication set). |
Snapshot replication | A replication in which a whole replication set is moved from the source to the target on every transmission through a replication link. (A snapshot replication is also called a full replication.) |
Source | The data store that contains a movement set to be replicated. |
Subscriber | In Microsoft SQL Server, the role of a platform that acts as the target for a replication link. |
Subscription | In Microsoft SQL Server, the metadata that defines a replication set. |
Synchronization | The process of replicating and applying changes from a source to a target when data from the replication set may potentially have been updated at both ends, and these conflicts need to be detected and resolved. |
Synchronization building block | A refinement of the replication building block consisting of two replication links and a synchronization controller. The controller manages the synchronization and relates the replication link pair. |
Synchronous | A style of processing where an application requests that an event occur and waits for the event to complete, so that the application is certain of the result of its request before it proceeds. In the case of data operations, it means performing a set of data operations within a common unit of work as defined by DBMS commit services so that the state of the set of data is certain. |
Target | The data store in a data movement building block where the copy data is written. |
Topology | A layout of related data movement building blocks that provides a map of the source and target data stores and the links between them. By describing the relationships between these elements, a topology helps you to determine the provenance of the movement set and assess the impact of changes to the data movement set or to the configuration of the movement building blocks. |
Transaction | A collection of one or more manipulations of a database. A transaction should adhere to the ACID principles: Atomicity, Consistency, Isolation, and Durability. |
Transaction log | A special data store provided by a DBMS that allows copies of transactional database changes to be persisted to a location other than the data store itself. A transaction log's primary purpose is to allow a DBMS to recover from failures. |
Transactional replication | A type of incremental replication in which the replication unit is a transaction. |
Transform | A service of the complex data movement process commonly known as ETL (Extract-Transform-Load). |
Transmission | The process of moving the movement set from source to target according to defined functional and operational requirements. |
Trigger | A database object attached to a table that invokes additional actions on behalf of an initiating operation. The common usage of a trigger is to perform additional actions on certain kinds of manipulations and to free the application from the implementation of these actions. |
Write | A service that writes a movement set to the target data store(s). Write deals with any errors returned from the attempted write, which may be simple (such as database error codes) or more complex (such as multiple-updater conflicts). |
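
The Acquire and Incremental replication entries distinguish between collecting all changes in order and collecting only the net change. As a rough illustration, the following Python sketch collapses an ordered list of row operations into one net change per key. The `Change` record, the `net_changes` function, and the collapsing rules are assumptions made for this example and are not defined by the patterns; the sketch also assumes that each change carries the complete new row.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """One row operation captured since the last transmission (hypothetical shape)."""
    op: str               # "INSERT", "UPDATE", or "DELETE"
    key: int              # primary key of the affected row
    row: Optional[dict]   # complete new row contents; None for DELETE


def net_changes(changes: list[Change]) -> list[Change]:
    """Collapse an ordered change list into at most one net change per key."""
    net: dict[int, Change] = {}
    for change in changes:  # order matters: later changes build on earlier ones
        previous = net.get(change.key)
        if previous is None:
            net[change.key] = change
        elif previous.op == "INSERT" and change.op == "DELETE":
            # Row was created and removed within the interval: nothing to send.
            del net[change.key]
        elif previous.op == "INSERT" and change.op == "UPDATE":
            # The target has never seen this row, so the net change is still an INSERT.
            net[change.key] = Change("INSERT", change.key, change.row)
        elif previous.op == "DELETE" and change.op == "INSERT":
            # The row existed at the target, was removed, then recreated: send an UPDATE.
            net[change.key] = Change("UPDATE", change.key, change.row)
        else:
            # UPDATE after UPDATE, or DELETE after UPDATE: the latest change wins.
            net[change.key] = change
    return list(net.values())
```

Given an insert of key 1 followed by an update of key 1, the function returns a single INSERT that carries the updated row; an insert and a delete of the same key within the interval cancel out and produce no change at all.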
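
The Conflict, Conflict detection, Conflict resolution, and Optimistic concurrency control entries describe how a Write service notices that the target copy has changed since the last replication and then applies a defined resolution method. The following Python sketch shows one possible shape of that logic. It assumes a per-row version counter for detection and uses the "more recent change overrules older changes" rule for resolution; the `Row` and `IncomingChange` types, the `write` function, and the version bookkeeping are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Row:
    value: str
    version: int        # incremented on every update to the target copy (assumed bookkeeping)
    updated_at: float   # time of the last update, used only for resolution


@dataclass
class IncomingChange:
    key: int
    value: str
    base_version: int   # target row version the source last synchronized with
    updated_at: float   # time the change was made at the source


def write(target: dict[int, Row], change: IncomingChange) -> str:
    """Apply one replicated change to the target and report what happened."""
    current = target.get(change.key)
    if current is None:
        # The row does not exist at the target yet, so no conflict is possible.
        target[change.key] = Row(change.value, 1, change.updated_at)
        return "applied"
    if current.version == change.base_version:
        # The target copy is unchanged since the last replication.
        target[change.key] = Row(change.value, current.version + 1, change.updated_at)
        return "applied"
    # Conflict detected: the target copy changed independently.
    if change.updated_at > current.updated_at:
        # Resolution rule assumed here: the more recent change overrules the older one.
        target[change.key] = Row(change.value, current.version + 1, change.updated_at)
        return "conflict-source-wins"
    return "conflict-target-wins"
```

Other resolution methods mentioned in the glossary, such as always favoring the source, aggregating the conflicting changes, or deferring to manual intervention, would replace only the final branch of `write`.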