Monday, July 24, 2006

Virtual Directory Caching

There's been a lot of discussion lately about Virtual Directories and caching, so here's my take on it.

First of all, the key concept behind a Virtual Directory is that the virtualized data is not moved from the back end repositories and should be accessed in real time as the Virtual Directory is queried by the LDAP client.


So in my humble opinion, the Virtual Directory must not rely on a cache when executing search operations. That is to say, caching mechanisms, when utilized, must be optional and not a core mechanism in the virtualization process. When caching is needed in a particular scenario, it should utilize a dynamic base where data remains only for a specified time.

I believe that caching mechanisms detract from the true nature of a Virtual Directory. The point of a Virtual Directory is that it is just that: virtual. One cannot point to any specific place and say “This is where the Virtual Directory data exists”. When any type of cache is implemented this is no longer the case, as the virtualized data now has a presence.

To minimize the “presence factor”, attributes held within the cache should have a definable duration or Time to Live (TTL). When the TTL expires, the information should be flushed from the cache. This reduces the risk of data being improperly obtained from the cache when queried. Constantly changing data must have a short TTL so that the most recent values are present, a must when the Virtual Directory is used for authorization or authentication. Less volatile data could have a longer TTL and potentially improve Virtual Directory performance.

Static, or file-based, caches in particular raise additional issues in the form of increased latency when executing and returning search queries. When the cache is held in a non-static form (i.e., server RAM), latency drops and there is less of a chance that the information held in the cache can be accessed by other people or applications.
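To make the TTL idea concrete, here is a minimal sketch of a RAM-held cache where each attribute gets its own lifetime. The class name TTLCache, its methods, and the key format are all made up for illustration and do not come from any particular Virtual Directory product.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory cache; each entry expires after its own TTL (seconds)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def put(self, key: str, value: Any, ttl: float) -> None:
        # Store the value together with the moment it stops being valid.
        self._entries[key] = (value, self._clock() + ttl)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # TTL expired: flush the entry so it has no lingering presence.
            del self._entries[key]
            return None
        return value
```

Under this sketch, volatile data such as group membership might be stored with a TTL of a few seconds, while a display name could be stored with a TTL of an hour. A get() that returns None means the caller has to go back to the real repository.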

Of course, that's not to say that caching is entirely bad, but I'll leave that for a future entry along with an examination of the political factors involved with a Virtual Directory.

Friday, July 14, 2006

Virtual Directory Models

I'm starting to think that there are three basic models to implementing a Virtual Directory solution:

1. The Virtual Directory runs in what one could refer to as a "Pure Virtual" model. When this is done, no caching is executed and the Virtual Directory solution can deliver a real time representation of back end repositories and act as a source for authorization. In this case there is never any chance of data being stored in any other location, which lets data owners feel secure and precludes any "political" issues.

2. The Virtual Directory can utilize a RAM based cache. This option would allow some or all query results to be held in cache for a specified amount of time. Subsequent queries to the Virtual Directory would first examine the cache and then proceed to check back end repositories if the desired data were not held in cache.

This scenario can be excellent for relatively "steady state" implementations such as white pages apps.
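The lookup order in model 2 (cache first, back end on a miss) can be sketched roughly as follows. This is only an illustration: CachingSearch and backend_search are hypothetical names, and the back end is any function that takes an LDAP filter string and returns a list of entries.

```python
import time
from typing import Callable, Dict, List, Tuple

Entry = Dict[str, List[str]]


class CachingSearch:
    """Hypothetical read-through wrapper: check the RAM cache, then the back end."""

    def __init__(
        self,
        backend_search: Callable[[str], List[Entry]],
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._backend_search = backend_search
        self._ttl = ttl
        self._clock = clock
        # Maps a filter string to (results, expiry time).
        self._cache: Dict[str, Tuple[List[Entry], float]] = {}

    def search(self, ldap_filter: str) -> List[Entry]:
        now = self._clock()
        hit = self._cache.get(ldap_filter)
        if hit is not None and now < hit[1]:
            # Still fresh: answer from RAM without touching the back end.
            return hit[0]
        # Miss or expired: query the real repository and remember the answer.
        results = self._backend_search(ldap_filter)
        self._cache[ldap_filter] = (results, now + self._ttl)
        return results
```

For a white pages app, a TTL of several minutes would likely be acceptable, since the data changes slowly and repeated lookups are common.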

3. The Virtual Directory can utilize what I refer to as a "static" cache. Static caches are located on physical media such as hard drives. Again, best practices would dictate that attributes be stored with configurable lifetimes and updated data would automatically replace old data.

In my opinion, there's a lot of overlap here with Metadirectory solutions. It would seem to me that the question of how the data is populated in this scenario is extremely important. If the connectors "pre-emptively" seek out data from the back end, then this is bad and does not hold with the "real-time" properties of a Virtual Directory.

There are some advantages to this scenario, especially when used in conjunction with a data sync tool. Using such a tool for populating a repository that is then virtualized helps to make sure that the data is more available in the event of issues with connected repositories. If a Virtual Directory in Scenario #1 or #2 has a problem connecting to back end repositories then there is a high chance that data will not be returned. If there is a physical store to fall back on then there is a degree of failover.
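The failover behavior described here can be sketched as below. It assumes, purely for illustration, that an unreachable back end surfaces as a ConnectionError; search_with_fallback, live_search and fallback_search are hypothetical names, not part of any product.

```python
from typing import Callable, Dict, List, Tuple

Entry = Dict[str, List[str]]
SearchFn = Callable[[str], List[Entry]]


def search_with_fallback(
    ldap_filter: str,
    live_search: SearchFn,
    fallback_search: SearchFn,
) -> Tuple[List[Entry], str]:
    """Try the live back end first; use the synced physical store if it is down.

    Returns the entries plus a label saying which source answered, so the
    caller knows when it may be looking at older, synchronized data.
    """
    try:
        return live_search(ldap_filter), "live"
    except ConnectionError:
        # Assumption: connection problems are raised as ConnectionError.
        return fallback_search(ldap_filter), "fallback"
```

Returning the source label is a design choice in this sketch: an application doing authorization may want to refuse "fallback" answers, while a white pages lookup can probably accept them.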

I'm sure we'll be revisiting this topic from time to time...

Saturday, July 08, 2006

Virtual Directory properties

I've been doing a lot of thinking about the Virtual Directory space. One of my colleagues, Matt F., has been doing a great job thinking about the uses of virtual directory technology.

I'm wondering if we can take this a step further and consider what is the base functionality of a virtual directory application. Certainly, we in the IdM world know that it is not a web server's presentation of a file container. :-)

I'll be thinking more on this in the coming days and see what comes of it.

What should be the characteristics of a Virtual Directory?

What functionality should an application have to be considered a true Virtual Directory?

To some extent these things are a matter of opinion, but I'm wondering what you think.

Thursday, July 06, 2006

IdM Implementation Process

As I’ve mentioned previously, a major obstacle to IdM project implementation is fragmented ownership. Consider:

    • HR owns the “Identity Data”
    • IT owns the technology infrastructure, email and PBX systems
    • Legal is responsible for compliance
    • Corporate Security owns the Access Systems and assists with compliance

This fragmented ownership makes it quite difficult to determine who owns, funds and administers the IdM project. So ultimately the following things need to happen:

  1. Identify an executive owner. Since there’s probably no CIdO in the organization, someone at the C-level needs to own the project. This person will champion the project at senior levels of management, fight for budget and acceptance of the system, and control the ultimate destiny of the project. Without this person there can be no clear vision or representation of the project and it will most likely be doomed to failure.
  2. Begin the Project plan. Typically this will require:
    a. Identifying repositories to be used
    b. Designating an authoritative repository; begin reconciling data into the repository and applying cleansing rules. Synchronization tools such as MaXware’s synchronization technology found in DSE and MIC are perfect for this.
    c. If desired, replicate applicable data to the repositories mentioned in (2a). This ensures that no matter where users look in the enterprise infrastructure, they will see clean, authoritative data.
    d. Outline and implement provisioning workflows for the repositories in (2a).
    e. Outline and implement password management, self service, administrative and other workflows as dictated by your IdM processes and software.
    f. Develop compliance and metrics reporting processes.
    g. Review processes and compliance checks on a scheduled basis.

This is not meant to be a comprehensive outline, but rather the beginnings of the implementation plan. This is a basic flow that will need to be expanded on as dictated by your organization's goals, IdM software, compliance needs and infrastructure. Many organizations will find themselves caught for a while in creating a cleansed and valid authoritative store. The important thing is that, as the project is designed, there are separate, distinct phases and the executive sponsor is kept aware of progress and milestones reached. Hopefully when IdM products and consultants are chosen by the organization, they will come equipped with the ability to make a detailed project plan.

Since there is no designated person or department that currently owns the Identity Process, I believe that understanding the underlying process to implementing your identity management project is essential and will make the difference in achieving the objective.