Directory Replication
Author: Jens Moller
Directory Replication is a complex topic, so I've tried to keep
the examples as simple and to the point as possible. All of the existing
production-quality Directory Server vendors have created their
own replication solutions, and in some cases it's
possible to replicate bits and pieces of Directory information on an
'As Needed' basis. There are efforts to
create a standardized replication model (you'll hear the term LDUP
mentioned when people discuss this), but as of yet it's not something that
you should count on.
First, realize that very few Directories have replication
capabilities that are compatible with other
Directories' replication models. Even the standards-based models often
do a few things in an inconsistent fashion. This does not make it
impossible to synchronize things; however, it often requires
an understanding of where the sources of record are and how to sync that
data in an appropriate way.
There is a wide range of solutions that the vendors provide. Interestingly
enough, many people write their own replication functions because they don't
like the solution provided by the vendor, or they need some functionality
that the vendor didn't provide.
Replication is never immediate - there is always a time lag between the time one
system is updated and the time the other systems pick up the changes to the data.
It's safe to assume that changes will take at least 5 minutes to show up on
remote systems; however, if there are thousands of updates being done, it may
take hours before a change is replicated to a remote system. This is normal
operation and something to consider when designing an application (all the
more reason to avoid storing frequently changing things in a Directory).
Recovery from failure modes will determine how well any replication
model really works. Over time the Directory Server will encounter most of
the possible failure conditions that will cause data consistency issues.
If you have very few writes/updates/deletes and mostly reads, there
will be few opportunities for a failure to occur. Most applications do far
more writes/updates/deletes than people think they do.
Master Servers allow Adds, Deletes, Modifies and general Read against
the Directory. All non-Master Servers operate in a Read Only fashion.
The concept of a GUID (Globally Unique Identifier) is gaining
popularity in Multi-Master
Directory Servers. It is a unique identifier that is created on a Master
Directory Server, and it will never be the same as a GUID created on a different
Master Directory Server from the same software vendor. When a Directory entry
is replicated, the GUID follows the entry. If duplicates show up but the
GUID values are different, this usually indicates a failed re-sync
operation as a result of an outage of one of the other Master Directory
Servers in a Multi-Master environment.
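The duplicate check described above can be sketched in a few lines of Python. This is only an illustration: the (dn, guid) pairs, the function name find_guid_conflicts and the crude DN normalization (trim and lowercase) are assumptions made for the example, not any vendor's export format or API.

```python
from collections import defaultdict


def find_guid_conflicts(entries):
    """Return {normalized DN: sorted GUIDs} for DNs seen with more than one GUID.

    `entries` is an iterable of (dn, guid) pairs, for example gathered from
    every Master. The normalization is deliberately naive (hypothetical).
    """
    by_dn = defaultdict(set)
    for dn, guid in entries:
        by_dn[dn.strip().lower()].add(guid)
    return {dn: sorted(guids) for dn, guids in by_dn.items() if len(guids) > 1}


sample = [
    ("uid=jdoe,ou=people,o=example", "guid-1"),
    ("UID=jdoe,ou=people,o=example", "guid-2"),   # same DN, different GUID
    ("uid=asmith,ou=people,o=example", "guid-3"),
]
conflicts = find_guid_conflicts(sample)
```

A DN that shows up in the result was most likely created independently on two Masters, rather than replicated from one to the other.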
Note: the Multi-Master examples discuss
simplified implementations of specific vendors' operations. These are
operational
models that vendors have made available in their products and are listed
here to show how they might work. Related problems are also listed.
Single Master Model
By far the most common, and easiest to implement. Many vendors offer it
as an option, and often, each implementation includes other capabilities.
How it Works
A transaction log file is written to by one process, while one or more
other processes read through the transaction log file and attempt to
update one or more remote systems. The transaction log file is the
single master list of updates that must take place on other servers.
Some systems offer replication agreements that allow subsets of the
updates in the transaction log to be replicated to different systems.
If the Master becomes unavailable, updates cannot be made to any Directory.
Some Servers will queue up modifications on the Read-Only Servers; however,
this is often not a good idea, since several Read-Only queues may attempt
to update the same entry and you have no idea what order any of these updates will
occur in.
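As a rough illustration of the log-driven approach described above, here is a minimal Python sketch. The Change and Replica classes, the seq counter and the replay function are all hypothetical names invented for this example; real servers use their own changelog formats and protocols.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    seq: int          # position in the Master's transaction log (assumed)
    dn: str           # Distinguished Name the change applies to
    op: str           # "add", "modify" or "delete"
    attrs: dict = field(default_factory=dict)


@dataclass
class Replica:
    entries: dict = field(default_factory=dict)   # dn -> attribute dict
    last_seq: int = 0                             # last log record applied

    def apply(self, change: Change) -> None:
        if change.op == "delete":
            self.entries.pop(change.dn, None)
        elif change.op == "add":
            self.entries[change.dn] = dict(change.attrs)
        elif change.op == "modify":
            # Raises KeyError if the DN is unknown on this replica, which is
            # exactly the DN mismatch case that loses updates in practice.
            self.entries[change.dn].update(change.attrs)
        else:
            raise ValueError(f"unknown operation: {change.op}")
        self.last_seq = change.seq


def replay(log, replica: Replica) -> int:
    """Apply, in log order, every record the replica has not seen yet."""
    applied = 0
    for change in sorted(log, key=lambda c: c.seq):
        if change.seq > replica.last_seq:
            replica.apply(change)
            applied += 1
    return applied


log = [
    Change(1, "uid=jdoe,ou=people,o=example", "add", {"mail": "jdoe@example.com"}),
    Change(2, "uid=jdoe,ou=people,o=example", "modify", {"mail": "john@example.com"}),
]
replica = Replica()
replay(log, replica)
```

Because the replica remembers the last record it applied, running replay again is harmless. If the log is truncated past that point while the replica is down, the missing records can never be delivered.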
Who Uses this model
- iPlanet (using LDAP)
- OpenLDAP (using LDAP)
- DCL (using DAP)
- OpenWave LDAP Server (using LDAP)
Good things about it
Simple to implement. As long as the network is reliable and there is not
an excessive amount of data to sync, it works remarkably well.
If you had to sync a bunch of unrelated Directory Servers, such as
DCL, Active Directory and iPlanet with each other at the same
time, this would be the easiest method to
use, and is reliable enough to put into a production environment.
Problem areas
Over time, you tend to lose occasional transactions. This occurs
as a result of many environmental factors. There is a dependency that the
'DN' (Distinguished Name) remain consistent on all of the nodes, otherwise
the transaction log file (which typically has only the DN and the changes
made to that DN in it) won't match records on all nodes.
This is not always easy to enforce, even with controls on the
sources of record on the Master system.
There are a multitude of actions that could cause one system to stop
replicating to a remote node, or instances where the remote node was
accidentally (or intentionally) updated by another process and suddenly you
get duplicate entries because the DNs don't exactly match each other.
Another common issue is where one of the nodes is intentionally shut down
and all the
other nodes sync data as normal, but the shut-down system cannot be
updated. If the Master System's Transaction Log file is
somehow truncated before the node is brought back on-line, then any additions,
modifications or deletions that occurred on the Master in the meantime cannot
be delivered to the node once it becomes available again.
Some of the Directory Servers manage the truncation of
the log files for you and you might never even know that this happened. Since
some of the replication tools are designed to retry a limited number
of times before skipping a transaction in the log file, you might not have
any indication that anything ever went wrong until something else gives you an
indication of the inconsistency.
If iPlanet V 4.x detects that something is out of sync
on a remote node, it disables processing on the Master and tries to re-sync
everything based on its replication model. In
theory this is a wonderful idea; in reality, it may lock up the Master
database for hours as it tries to resolve issues that can only be
resolved by manual intervention.
Safe Operation
You must assume that over time, if you have more than one node, there will be
occasional inconsistencies that require some sort of intervention. You need
to have a module that runs occasionally (once a week, for example) that compares
entries on nodes (and it needs to be run from the remote nodes) to see if
any differences exist and logs them.
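A comparison module of this kind could look roughly like the following Python sketch. It assumes that each node's entries have already been exported into a plain dictionary mapping DN to attributes; the function name compare_nodes and the sample data are illustrative only.

```python
import logging


def compare_nodes(master: dict, remote: dict) -> list:
    """Return a sorted list of (dn, problem) tuples.

    Both arguments map a DN to that entry's attribute dictionary.
    """
    problems = []
    for dn in sorted(master.keys() | remote.keys()):
        if dn not in remote:
            problems.append((dn, "missing on remote"))
        elif dn not in master:
            problems.append((dn, "missing on master"))
        elif master[dn] != remote[dn]:
            problems.append((dn, "attributes differ"))
    return problems


master_dump = {
    "uid=jdoe,o=example": {"mail": "john@example.com"},
    "uid=asmith,o=example": {"mail": "asmith@example.com"},
}
remote_dump = {
    "uid=jdoe,o=example": {"mail": "jdoe@example.com"},
    "uid=old,o=example": {"mail": "old@example.com"},
}
report = compare_nodes(master_dump, remote_dump)
for dn, problem in report:
    logging.warning("replication check: %s: %s", dn, problem)
```

The sketch only logs differences. Deciding which side is right still needs knowledge of the sources of record, so it is left to a person.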
Dual Master Model
This is the next step above a Single Master. Usually implemented because
it is generally better than a Single Master Model, and it offers the potential
that if the Primary Master goes down, you can still handle updates with a
Secondary Master.
How it Works
It's very similar to the Single Master Model in that transaction log files
are written to and replication agreements are drawn up on the Master Directory
Servers as to what pieces are to go where. There is usually a Primary Master
and a Secondary Master. The Primary Master is always the master if it is
operational. The Secondary Master takes over when the Primary Master is
not available.
Unlike the Single Master, writes need to be directed to a Proxy that decides
which of the Masters is running and which one gets Write/Update/Delete requests,
since only one of them should ever be in control at any one time. This whole
arrangement is substantially more complex to manage.
Typically, out on the Nodes where the replicated directories are, there is
a process that allows updates to occur, but they actually get queued up and
get sent to the Write/Update/Delete proxy. Reads are handled locally.
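The routing decision made by the write proxy can be sketched as follows. This is a simplified Python model: the Master and WriteProxy classes and the up flag (standing in for a real health check) are assumptions for illustration, not a vendor implementation.

```python
class Master:
    def __init__(self, name: str):
        self.name = name
        self.up = True          # stand-in for a real availability probe
        self.entries = {}

    def write(self, dn: str, attrs: dict) -> None:
        if not self.up:
            raise ConnectionError(f"{self.name} is unavailable")
        self.entries[dn] = dict(attrs)


class WriteProxy:
    def __init__(self, primary: Master, secondary: Master):
        self.primary = primary
        self.secondary = secondary

    def active_master(self) -> Master:
        # The Primary is always the master whenever it is operational.
        if self.primary.up:
            return self.primary
        if self.secondary.up:
            return self.secondary
        raise ConnectionError("no master available")

    def write(self, dn: str, attrs: dict) -> str:
        """Send the write to whichever master is in control; return its name."""
        master = self.active_master()
        master.write(dn, attrs)
        return master.name


primary, secondary = Master("primary"), Master("secondary")
proxy = WriteProxy(primary, secondary)
first = proxy.write("uid=jdoe,o=example", {"cn": "John Doe"})
primary.up = False
second = proxy.write("uid=asmith,o=example", {"cn": "Ann Smith"})
```

Note that the proxy only knows what its own probe tells it. Two proxies on opposite sides of a broken network link would each pick a different master.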
Who Uses this model
Good things about it
You are never without a Master Directory server as long as one of the
two Masters is available. The Secondary Master will attempt to re-sync
data based on timestamps (that are part of each entry) with the Primary
Master Directory Server.
Problem areas
If the network holds together, this is more reliable than Single Master;
however, it suffers from many of the same problems. It introduces a new problem
when the Primary Master appears to have failed
but in reality the network link between the Primary and Secondary Masters
has failed, and each of them assumes the Master role. If the Proxy is on multiple
systems (for redundancy), any system on either side of the network break will
operate as if a failover occurred. In these instances, the 2
Masters happily allow Adds, Updates and Deletes, syncing data to the nodes that
they are able to access. The problem comes when the 2 Masters are able to
talk to each other again: since both were allowing updates at the same
time, the resulting changes that appear in the Primary Master directory are
a mix of each server's changes. This can leave quite a mess, often causing
duplicate DN entries that don't show up as obvious duplicates after
an attempt at resolution by the Primary Master.
To allow the Read-Only systems to appear as if they allow writes, they
capture update requests and attempt to process them. These requests are dropped
into a local queue (on the same physical system as the Directory Server)
then sent to a proxy that decides which of the Multi Master Directory
Servers does the updates. If there are a
lot of updates, the additional processing introduces delays, and a change
may take quite a while to
actually be replicated back to the system that the
write appeared to occur on. So, 2 things happen:
- Write/Update/Delete operations are usually quite slow
- Data that was just Written/Updated/Deleted has not been
pushed back out to the system that the change occurred on yet, so Reads
there still return the old data (depending
on what else is happening, it could be hours before the update actually
shows up). Typically a 5 minute delay should be expected.
This confuses most applications and people who thought that they had just
successfully done an update: they request the data and it's not there.
Account Creation and Password resets are common places where this is noticed.
This is really not any different than would happen in a Single Master system;
however, the Multi-Master tends to be slower because there are more processes
involved.
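An application that has to read its own write can be written to wait for the change to come back instead of assuming it is there. The Python sketch below shows one way to do that. The read callable, the function name wait_until_visible and the default timings are assumptions for the example and would be replaced by a real directory lookup and locally chosen values.

```python
import time


def wait_until_visible(read, dn, expected, timeout=300.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll the local replica until `read(dn)` returns `expected`.

    Returns True once the change is visible, or False if `timeout` seconds
    pass first. `read` is any callable that looks up an entry by DN.
    """
    deadline = clock() + timeout
    while True:
        if read(dn) == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated replica that only shows the new value on the third read.
answers = iter([{"mail": "old"}, {"mail": "old"}, {"mail": "new"}])
visible = wait_until_visible(lambda dn: next(answers), "uid=jdoe,o=example",
                             {"mail": "new"}, sleep=lambda seconds: None)
```

A False result does not mean the update failed, only that it has not arrived yet, so the caller should report "pending" rather than an error.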
Casual alteration of the network can have a major impact on this Replication
Model.
Safe Operation
Operate this in the same way that you would a Single Master Directory. A
close monitoring of the networks is quite important. Schedule all network
changes at a high level and keep the Directory Management people informed.
Pseudo Multi Master Model
Novell NDS systems allow parts of a directory to be partitioned across different
NDS Directory Servers. When an update occurs, the change goes to the Directory
Server that manages that given piece of the Directory. A GUID (Globally Unique
Identifier) is used to track the entry and maintain its uniqueness in case of a
duplicate entry (the entries will always have a unique GUID).
Multiple servers that hold replicas of the same partition compose
a replica ring. NDS automatically synchronizes the servers in a
replica ring.
How it Works
- Master Replica
The first NDS server installed on the network holds the master
replica by default. Partitions have only one master replica. Other
replicas are created as read/write replicas
by default. The "creating a new partition" and "creating a new
replica" operations require that a master replica is available. If the master
replica is on a server that is down, NDS cannot create new partitions or
replicas until the server is up again.
- Read/Write Replica
NDS can access and change information in a read/write replica and
then propagate those changes to the other replicas. If users
cross slow WAN links or busy routers, you
can place a read/write replica locally to them to speed their
access of network resources. Keep in mind that the more
read/write replicas you create, the more network
traffic necessary to keep the replicas synchronized.
- Read-only Replica
Network clients can read network information from read-only
replicas, but they cannot change them. NDS synchronizes
read-only replicas with the changes from read/write
and master replicas.
- Subordinate Reference
Subordinate references are system-generated and contain
only enough information to allow NDS to resolve names across
partition boundaries. The system creates subordinate references
on servers that hold a replica of a parent partition, but not
of its child partitions. This allows NDS to walk the tree to
replicas of the parent partition's child partitions on
other servers. The system automatically deletes the
subordinate reference, if a replica of the child partition
is copied to the server holding a replica of the parent
partition.
Who Uses this model
Good things about it
The NDS database is loosely consistent, which means that
replicas are not guaranteed to hold the latest changes to
the database at any specific moment in time. To ensure
database integrity, NDS automatically synchronizes all
replicas. So, the database is guaranteed to completely
synchronize. If there is any period of time during which no
database changes occur, the database will completely
synchronize and all replicas will hold the most recent
information.
A good example of this occurs whenever users change
their passwords. If they attempt to login with their new
password immediately after they change it, the authentication
often fails. But, if they wait a few minutes and then
login with the new password, the system allows them in. This
is because NDS had time to synchronize the password with
all the replicas.
Fast synchronization, which is synchronization of all
object modifications, occurs every 10 seconds. Slow
synchronization, which is synchronization of the attributes
dealing with login time and network addresses, occurs every 5 minutes.
Instead of synchronizing all information for every object,
NDS only synchronizes the updated (delta) information.
Each attribute has a time stamp associated with it. NDS updates
this time stamp whenever the attribute is updated.
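The idea of sending only changed attributes, selected by their time stamps, can be illustrated with a small Python sketch. The data layout (attribute mapped to a value and time stamp pair) and the function names delta_since and apply_delta are assumptions for the example and do not reflect the actual NDS wire format.

```python
def delta_since(entry: dict, last_sync: float) -> dict:
    """Return only the attributes changed after `last_sync`.

    `entry` maps an attribute name to a (value, time stamp) tuple.
    """
    return {attr: vt for attr, vt in entry.items() if vt[1] > last_sync}


def apply_delta(replica_entry: dict, delta: dict) -> None:
    """Merge a delta into a replica, keeping the newer value per attribute."""
    for attr, (value, stamp) in delta.items():
        current = replica_entry.get(attr)
        if current is None or stamp > current[1]:
            replica_entry[attr] = (value, stamp)


source = {"mail": ("john@example.com", 10.0), "cn": ("John Doe", 5.0)}
replica = {"mail": ("jdoe@example.com", 3.0), "cn": ("John Doe", 5.0)}

delta = delta_since(source, last_sync=5.0)   # only "mail" changed since then
apply_delta(replica, delta)
```

Comparing time stamps per attribute means an older delta that arrives late cannot overwrite a newer value, which is one way a loosely consistent system can still converge.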
Problem areas
The partitions that you put things in can't hold very much.
Novell recommends never going over 1250 objects per NDS
container and per NDS partition. This is unrealistically low
for most Directory instances.
Too many things depend upon NetWare. The support applications
are not well integrated with LDAP based tools.
Safe Operation
NDS has built a reliable system by reducing the amount of data that any
one partitioned area can hold at a time, and by forcing you to have replica
rings if you need 100% uptime. Without the replica rings, you could easily
lose Master access to pieces of your Directory.
These issues make it hard to utilize NDS as a Directory Server for anything
outside of general Novell-created applications. In this case, the replication
model's complexity and need for a large number of systems is a limiting factor
for any directory instance.
Multi Master Model
How it Works
Windows 2000 Active Directory provides multi-master replication.
Multi-master replication means that rather than having a single point
where changes can be made, we actually have multiple points where
changes can be made. In Windows 2000, all DCs are equal with each
containing a read-write copy of the database.
If one of the domain controllers (in Microsoft's eyes, another term for a
Directory tree running
on a Windows 2000 server) goes down, you can make modifications to the Active
Directory database on any other domain controller.
Who Uses this model
- Microsoft Active Directory
Good things about it
If Administrator A is modifying an attribute for a
user object on one DC while Administrator B is modifying
the same attribute on a different DC, one of the updates
will have to end up overriding the other update. Windows 2000 AD
relies on something called USNs (Update Sequence Numbers) to
determine which update to use. If the USNs are identical,
Windows 2000 AD employs other tie-breaking methods.
GUIDs are also used to determine the uniqueness of the elements.
There is no specific limit to the number of entries allowed.
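The "one update has to win" rule for two administrators changing the same attribute can be sketched in Python as below. It follows the description above in comparing sequence numbers first. The tie-breakers used here (time stamp, then server name) are assumptions chosen for illustration, since the text does not say which ones Windows 2000 AD actually uses, and the Update class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    value: str
    usn: int            # sequence number attached to the update
    timestamp: float    # assumed first tie-breaker
    server_id: str      # assumed final tie-breaker, so there is always a winner


def resolve(a: Update, b: Update) -> Update:
    """Pick the winning update deterministically on every server."""
    return max(a, b, key=lambda u: (u.usn, u.timestamp, u.server_id))


admin_a = Update("555-0100", usn=7, timestamp=100.0, server_id="dc1")
admin_b = Update("555-0199", usn=8, timestamp=90.0, server_id="dc2")
winner = resolve(admin_a, admin_b)
```

The important property is that every server, given the same two updates, picks the same winner without talking to the others. The losing update is simply discarded, which is why the administrator who "lost" sees no error.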
In general Microsoft had the benefit of looking at the problems
that the other Directory Server systems had in place and trying
to pick out the good things and avoid the bad things. It is generally
successful, except that it modeled itself as a replacement for the
previous Domain Controller rather than a general purpose Directory Server.
Problem areas
The structure of Active Directory assumes a great many things, many of which
are not really X.500 compliant. You also get attributes that you were not
expecting (because AD simply requires them).
Microsoft limits what can be part of the Multi-Master replication scheme.
Certain domain and enterprise-wide operations not well
suited to multi-master placement reside on a single domain
controller in the domain or forest. The advantage of
single-master operation is to prevent the introduction of conflicts
while an operation master is offline, rather
than introducing potential conflicts and having to resolve them later.
Most actions require Microsoft-specific tools. Most
of these tools only support Object Classes known to Active Directory, so
if you extend the directory and you don't alter an existing Object Class,
you won't be able to use Microsoft tools to see that the data is getting
replicated. If you extend the schema and you don't set the attributes
properly at that time, you cannot correct your errors without re-installing
and rebuilding your Active Directory (it's a feature).
Safe Operation
Do all development and testing in a test environment. You will not be allowed
to 'unextend' an Active Directory Server or its replication agreements for any
attributes that you add.
Summary
- Replication cannot be a time critical function of your application
- It is very probable that replication will usually occur
quickly; however, at times, replicated data may take many hours
to get to all of the systems. This does not make the Directory inconsistent;
it is simply how it works in the real world.
- Replication success depends on how well the Directories manage failures.
- Problems can occur when there are network outages or network
performance issues.
- Validation of the data on remote systems should be a task that the
Directory Manager will do at a given interval. The question is not If,
rather, it is When things get out of sync. A plan must be in place
to deal with the sync errors.
- You cannot automate everything for all possible failure and recovery modes.
- Identifying a Directory entry by using GUIDs may help to resolve
issues where resyncing appears to introduce duplicates - at least you will
know if they are variants of existing entries or new ones.
- Don't trust anyone who tells you that the Directory Server will always
recover from all system outages. Eventually, you will find an instance that it
will not recover from. It may not happen often, but be prepared.
- There are no such things as transactions on a Directory Server.
There is no way to Roll Back a change; you have to do the appropriate
update/delete/write steps to put things back the way they were.
- All of the above methods work well to provide replication services. Many
organizations run well with Single Master systems; however, these will never be
completely redundant in case of an outage. Multi Master systems make things
more complex, but strive to provide better redundancy and scalability.
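Since there is no Roll Back, one common workaround is to record the previous state before each change so that the reverse steps can be applied later. The Python sketch below models the directory as a plain dictionary. The function name apply_with_undo and the convention that None means "delete" are assumptions for the example.

```python
def apply_with_undo(directory: dict, dn: str, new_attrs):
    """Apply a change and return a function that puts the old state back.

    `new_attrs` is the new attribute dict, or None to delete the entry.
    """
    previous = directory.get(dn)
    previous = dict(previous) if previous is not None else None

    if new_attrs is None:
        directory.pop(dn, None)
    else:
        directory[dn] = dict(new_attrs)

    def undo() -> None:
        # Compensating step: delete what was added, or restore what was there.
        if previous is None:
            directory.pop(dn, None)
        else:
            directory[dn] = dict(previous)

    return undo


directory = {"uid=jdoe,o=example": {"mail": "jdoe@example.com"}}
undo_change = apply_with_undo(directory, "uid=jdoe,o=example",
                              {"mail": "john@example.com"})
undo_change()   # the entry is back to its original attributes
```

This is not a transaction: the undo is just another write. It replicates with the same delays as any other change, and it can overwrite updates made by someone else in the meantime.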
Comments? Questions? Contact Engineering