Author: Jens Moller
Some applications need to look at diverse data sets. This data probably
already exists in your organization; however, it may not be accessible in a
way that ties it to the other organizations that may want or need to use it.
The reason for this is simple: organizations solved their business problems
by building point solutions. Often, these are customized very specifically
for the business need, which allows them to provide the best possible data
for that organization.
The problem is that there are often many instances of data that are related
to other instances of data, but these data stores cannot casually be merged
without substantially impacting the performance and usefulness of the other
systems. In reality, the issue is not typically one or two systems that need
to be tied together, but rather dozens of systems that have been created
specifically for the business need or are highly customized software packages.
There was no business reason to make them compatible at the start, but
business models have changed and there is a need to bring this data together.
What are Meta Directories?
The concept of 'Federated' databases is central to Meta Directory
implementation. 'Federated' databases basically allow you to join tables from
Relational Databases that are distinct and separate from each other. These
are all SQL based and can be very complex to set up and manage. These
'Federated' accesses are normally read only, because the data that they tend
to use is scattered within an organization and cannot be 'locked' in any
consistent fashion without causing major performance problems.
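The read-only join across separate stores described above can be illustrated
with a minimal sketch. It assumes SQLite's ATTACH DATABASE as a stand-in for a
real 'Federated' product, and the two stores, their table names and their rows
are hypothetical.

```python
import sqlite3

# Two independent stores standing in for separate departmental databases.
# "main" plays the part of an HR system; "facilities" is attached as a
# second, distinct database (both are in-memory placeholders).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS facilities")

conn.execute("CREATE TABLE main.employees (emp_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE facilities.badges (emp_id INTEGER, building TEXT)")
conn.executemany("INSERT INTO main.employees VALUES (?, ?)",
                 [(1, "Ann"), (2, "Raj")])
conn.executemany("INSERT INTO facilities.badges VALUES (?, ?)",
                 [(1, "HQ"), (2, "Lab 4")])


def where_is_everyone(connection):
    """Join across both stores with a plain SELECT; nothing is written."""
    query = """
        SELECT e.name, b.building
        FROM main.employees AS e
        JOIN facilities.badges AS b ON b.emp_id = e.emp_id
        ORDER BY e.emp_id
    """
    return connection.execute(query).fetchall()


rows = where_is_everyone(conn)
```

Both stores here live inside one process, so the sketch shows only the shape
of the query. In a real deployment each store would sit on its own server.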
A Meta Directory looks at this same issue, but approaches it in a slightly
different way. SQL is no longer the basis; instead, the access is abstracted
out further to LDAP (Lightweight Directory Access Protocol) and/or XML
(Extensible Markup Language). The actual access still tries to perform
functions similar to those of an SQL based 'Federated' database, but no longer
enforces SQL rules on the transactions, opening the access up to any form of
database,
not just those that can be configured to provide support for 'Federated'
access.
Basically, a Meta Directory Engine is a dynamic database access tool that can
be adapted to any existing set of databases to retrieve information and/or
provide translation services for the information returned. It should be able
to talk to practically any existing database that provides a TCP/IP access
point and be able to use it without requiring that the database, or the
applications that currently use it, be altered.
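One way to picture such an engine is as a registry of data sources that are
queried on demand and merged into a single answer. The sketch below is only an
illustration under stated assumptions: the class name MetaDirectoryEngine, its
methods, and the two in-memory "databases" are all hypothetical, and a real
source would wrap a network call to an existing system rather than a
dictionary.

```python
from typing import Callable, Dict, Optional

# A "source" is any callable that takes a key and returns a record or None.
# The existing database behind it is never modified by the engine.
Source = Callable[[str], Optional[Dict[str, str]]]


class MetaDirectoryEngine:
    """Hypothetical engine: asks every registered source and merges results."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def lookup(self, key: str) -> Dict[str, str]:
        merged: Dict[str, str] = {}
        for name, source in self._sources.items():
            record = source(key)
            if record is None:
                continue  # this source knows nothing about the key
            for field, value in record.items():
                # Prefix each field with its source so origins stay visible.
                merged[f"{name}.{field}"] = value
        return merged


# Placeholder stores standing in for two existing point solutions.
hr_db = {"jdoe": {"full_name": "Jane Doe", "dept": "Finance"}}
phone_db = {"jdoe": {"extension": "4417"}}

engine = MetaDirectoryEngine()
engine.register("hr", hr_db.get)
engine.register("phone", phone_db.get)
result = engine.lookup("jdoe")
```

The caller sees one record, while each underlying store remains untouched and
continues to serve its original applications.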
Typically a Meta Directory is a series of applications (usually called
'connectors' or 'listeners') that provide the services. The services are
typically defined using Policies, where the Policies are stored within an
LDAP Directory server (Meta Directory Policy Store).
Why use a Meta Directory?
- Consolidating Data
Organizations often have multiple data stores, frequently scattered across
the country or the world and need a way to connect this information together.
They may also have a desire to get a better handle on the status of existing
production or resource utilization. These resources can be data or they could
be functions of any sort - for example, finding out the thermostat settings
in all of their buildings, or determining where someone is today.
Many Meta Directory applications are used in Real-Time control applications.
Others are used in reporting. The most common use is to insert a Meta
Directory function in between applications, having the Meta Directory
application intercept the communications and act upon them. Most of the time
the data exists in a data store that the Meta Directory accesses. In some
instances, unique data will exist as Meta Data within the Meta Directory
Policy Store. These Policies are used to implement business rules.
- Data is owned by different groups
Often, key portions of data belong to groups that have little interest in what
other groups might be doing. Rather than duplicating all of the data (which
usually requires syncing and can be problematic when trying to enforce
business rules across diverse organizations), relationships are defined and
the data is mapped. The Meta Directory Engine appears to be a single system in
itself to the end users who access it, but in reality, it allows controlled
access to many sources of data.
- When data needs to be transformed to look consistent
Any organization that can define the structure of the data it needs to resolve
a specific business need, and identify that it needs to be the result of data
that appears in unique and separate data sources, is a candidate for a Meta
Directory.
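The transformation step can be sketched as a per-source mapping from native
field names to one agreed structure. Everything named here is an assumption
made for illustration: the source names, the field names and the converters
are hypothetical, not taken from any particular product.

```python
# Hypothetical mappings: unified attribute -> (native field, converter).
MAPPINGS = {
    "payroll": {
        "employee_id": ("EMPNO", str),
        "surname": ("LNAME", str.title),
    },
    "badge_system": {
        # This source pads IDs with leading zeros and stores names lowercase.
        "employee_id": ("id", lambda value: str(value).lstrip("0")),
        "surname": ("last", str.title),
    },
}


def to_unified(source_name, raw_record):
    """Translate one source-specific record into the unified structure."""
    mapping = MAPPINGS[source_name]
    return {
        unified: convert(raw_record[field])
        for unified, (field, convert) in mapping.items()
    }


from_payroll = to_unified("payroll", {"EMPNO": 1042, "LNAME": "SMITH"})
from_badges = to_unified("badge_system", {"id": "001042", "last": "smith"})
```

Two records that look nothing alike in their home systems come out identical,
which is what lets the person behind them be recognized as the same individual.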
How are they Different than LDAP Directory Servers or Relational Databases?
Most data stores are entities unto themselves. Their data, while accessible
to the outside world through various APIs or applications, is still bound up
in a point solution database. A Meta Directory is not really a single place
where data resides; rather, it's a method by which multiple databases appear
as a single data source.
Meta Directories don't care what the data is, only that they can access it
and return it as some part of a request for data. In this sense, the data
that it accesses is virtual, and can be from anything that can be connected
to a network.
Typically, a Meta Directory speaks any of LDAP, XML or SQL as needed. The
front end processing is most often done using LDAP or XML.
By default, most Meta Directory Applications are re-usable for many
different functions and until they have read their policies, they usually
don't do anything. The reason that they don't do anything is to provide a
level of security to their operations. Unless the Meta Directory applications
can identify their data sources and provide a security infrastructure for
access to that data, they should do practically nothing.
The Policies are read by the Meta Directory application at start-up as well
as at any interval that the application developers deem necessary.
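The "do nothing until policies are read" behavior, together with periodic
re-reading, can be sketched as follows. This is a minimal illustration, not a
description of any real product: the class PolicyGatedConnector, the policy
layout with an "allowed_sources" entry, and the five minute default interval
are all assumptions, and load_policy stands in for a query against the Policy
Store.

```python
import time


class PolicyGatedConnector:
    """Hypothetical connector that refuses all work until a policy is loaded."""

    def __init__(self, load_policy, refresh_seconds=300.0, clock=time.monotonic):
        self._load_policy = load_policy        # stand-in for a Policy Store read
        self._refresh_seconds = refresh_seconds
        self._clock = clock
        self._policy = None
        self._loaded_at = None

    def refresh(self):
        """Read the policy; called at start-up and again when it grows stale."""
        self._policy = self._load_policy()
        self._loaded_at = self._clock()

    def handle(self, source_name, key):
        if self._policy is None:
            raise PermissionError("no policy loaded; refusing all requests")
        if self._clock() - self._loaded_at >= self._refresh_seconds:
            self.refresh()
        if source_name not in self._policy["allowed_sources"]:
            raise PermissionError(f"source {source_name!r} not permitted by policy")
        # A real connector would contact the data source at this point.
        return f"would fetch {key!r} from {source_name!r}"


connector = PolicyGatedConnector(lambda: {"allowed_sources": {"hr"}})
try:
    connector.handle("hr", "jdoe")
    refused_before_load = False
except PermissionError:
    refused_before_load = True

connector.refresh()  # start-up read of the policy
answer = connector.handle("hr", "jdoe")
```

Failing closed in this way means a connector that cannot reach its policies
exposes nothing, rather than falling back to some built-in default.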
More about Policies
When you create a data map and define how a Meta Directory application will
support this, it should be committed to a safe application management system.
The Meta Directory functions are highly dependent on their policy data and
will not operate as expected without it. Other than configuration specific
data within a Policy, this data should never be altered casually by any users.
Any changes to Policy must be done by Engineers who understand the various
systems and their access.
Benefits - Pros
- Since a Meta Directory application enables existing data, its ROI appears
very early on.
- Dissimilar data that is related can be tied together. This allows
greater access and control of assets and resources.
- Data that is similar from different existing systems can operate as if
it were on a single unified data store. This alone often highlights
inconsistencies in the different systems and allows them to be addressed in a
controlled fashion; as a result, the data quality usually improves over time.
- Data Warehousing takes on a different perspective as Meta Directory
Application actions can be expanded very quickly by a very small staff. If
your business changes, your Meta Directory can adapt very quickly.
Issues - Cons
- The Meta Directory is sensitive to network availability: its
functionality is directly tied to being able to access its data sources.
- Someone needs to take ownership of data issues. Inconsistencies will
appear once data is made visible; some person or organization will need to
address them, otherwise the data will not be trusted.
- The Meta Directory applications will be highly distributed. Managing
them is a larger effort than managing point solution applications. Someone who
understands the whole system must be involved in resolving production issues.
- Interfaces to existing systems must be well documented. Such documentation
is not always available for existing systems that the Meta Directory is
attempting to utilize. Minor uncoordinated changes in the point solution applications
may greatly affect the ability of the Meta Directory applications to resolve
- Requires strong management backing - most point solution application
developers see Meta Directories as an invasion of their environment. Often
the biggest stumbling blocks are political rather than technical.
- Monitoring of applications that are used by the Meta Directory becomes
far more critical to operations. An outage in one area may bring all of the
Meta Directory functions down during that time. Some Meta Directory
applications cache data for these situations; however, it is often impossible
to cache all of the data required for all operations. Redundant data stores
may be required for some applications, and this must be considered in any
implementation.
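The caching behavior mentioned in the last item can be sketched as a wrapper
that remembers the last good answer from a data source and serves it when the
source is unreachable. The names CachingSource and thermostat_backend, and the
"live"/"stale" labels, are hypothetical choices made for this illustration.

```python
class CachingSource:
    """Hypothetical wrapper: serve the last known value during an outage."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable that contacts the real data source
        self._cache = {}

    def get(self, key):
        try:
            value = self._fetch(key)
        except ConnectionError:
            if key in self._cache:
                return self._cache[key], "stale"
            raise  # never seen this key, so there is nothing to fall back on
        self._cache[key] = value
        return value, "live"


# Simulated backend whose availability can be switched off.
state = {"up": True}


def thermostat_backend(building):
    if not state["up"]:
        raise ConnectionError("backend unreachable")
    return {"hq": 21.5}[building]


source = CachingSource(thermostat_backend)
first = source.get("hq")    # backend is up, value is cached
state["up"] = False
second = source.get("hq")   # backend is down, cached value is returned
```

A key that was never requested while the backend was up still fails during the
outage, which is the limitation noted above: a cache can only cover data it
has already seen.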
Meta Directory applications provide a service that is layered upon your
existing systems. Done correctly, it allows an organization to access data
in ways that previously were very hard to do and not adaptable to changes in
the business models.
Meta Directory solutions need not be expensive, or extensive. They are best
built out in controlled portions and adapted to organizational needs.
Typically, the communications and access methods need to be designed and
supported by the various data sources. Interface documents need to be
created and Service Level Agreements need to be hammered out to make sure
that the data that is required is there when it's needed.
It all comes down to a simple fact: a Meta Directory becomes infrastructure.
When it's working correctly, most people won't even know it's there.
Infrastructure is often hard to justify, because it becomes an invisible
operations layer. Once in place, it can invigorate an organization; if poorly
defined, it won't be used.
Any Meta Directory solution requires buy-in from the highest levels of
management, otherwise it is destined for failure.