Thursday, November 06, 2008

Enterprise Identifiers

I've been thinking recently about how one should be identified within the enterprise.
It's no secret that organizations use several kinds of identifiers to identify people. Some examples include:
  • Government-issued numbers - Using identifiers such as Social Security or driver's license numbers is legal, if not mandatory, in some places. These are nice since they don't have to be maintained by private organizations, but they can also present problems, not to mention privacy and legal conflicts.
  • Email Address - IDs based on email address offer guaranteed uniqueness; however, addresses can change over time, particularly when they are based on the user's name and that name changes with life events.
  • Name combinations - Creating an ID that is made up of x characters from the first name and y characters from the surname has both good and bad points. These are wonderful because they are easy for the end user to remember; however, it can be a challenge for IT to make sure that all IDs are unique.
  • Application-centric IDs - Some applications create their own sequenced IDs. These tend to be easier for IT and application owners; however, they tend not to be as easy for people to remember. They also have the advantage of not revealing much personal information about who is behind the login ID. For instance, there's no way for anyone to know that user ID H10032 is really Matt Pollicove (FYI - not my userID). But they can give some basic information through simple formatting, such as having certain characters indicate employment status, whether or not the ID is for a service account, whether the account is tied to a particular group or location, etc. However, this degree of tight formatting tends to make the user IDs not terribly portable as status changes occur.

Once conventions are set for how users are to be identified within the enterprise, some additional challenges arise depending on how the user entries are used within the enterprise.
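
As a rough illustration of the name-combination scheme from the list above, and of the uniqueness problem it hands to IT, here is a minimal Python sketch. The function name, the defaults x=1 and y=7, and the numeric-suffix rule for resolving clashes are all assumptions made for the example, not a recommended convention.

```python
def name_based_id(first, last, taken, x=1, y=7):
    """Build an ID from x characters of the first name and y of the surname.

    `taken` is the set of IDs already in use (hypothetical stand-in for a
    directory lookup). On a clash, a numeric suffix is appended: 2, 3, ...
    """
    base = (first[:x] + last[:y]).lower()
    candidate = base
    suffix = 1
    while candidate in taken:
        suffix += 1
        candidate = f"{base}{suffix}"
    taken.add(candidate)
    return candidate


taken = set()
print(name_based_id("Matt", "Pollicove", taken))  # mpollico
print(name_based_id("Mary", "Pollicove", taken))  # mpollico2
```

The second call shows the catch: two people with the same initial and surname need a tie-breaker, and the resulting ID is no longer quite as easy to remember.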

Which one is right? Which one is wrong? I don't think there is any one correct answer besides what works for the organization. Certainly, from a security and privacy perspective, showing less information rather than more is better, but does this really solve anything? Identity information is still exposed via external services such as portals, white pages and other search methods. So even if personal information is abstracted behind a different identifier, it can still be determined.

4 comments:

Ash said...

Why not let the IdM system create its own internal id for the user? Doesn't ever have to be known by the user, but at least you can ensure its uniqueness, and map it to unique id's for all target apps.

Matt Pollicove said...

Well, that's certainly a thought, and it could work well using algorithms that create GUIDs. However, there's usually an idea of creating an identifier that's used throughout the enterprise, both for continuity of experience for users (so that they don't have to remember a separate login ID per application) and to provide the unique key for linking data between repositories.
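
To make the internal-ID idea concrete, here is a small Python sketch in which a GUID serves as the hidden key and each application login is mapped to it. The class and method names are hypothetical, and an in-memory dictionary stands in for whatever store an IdM product would actually use.

```python
import uuid


class IdentityMap:
    """Maps one internal GUID per person to that person's per-application logins."""

    def __init__(self):
        self._accounts = {}  # internal_id -> {app_name: login}

    def create_identity(self):
        # uuid4() is random, so no central counter is needed for uniqueness.
        internal_id = str(uuid.uuid4())
        self._accounts[internal_id] = {}
        return internal_id

    def link(self, internal_id, app, login):
        self._accounts[internal_id][app] = login

    def logins_for(self, internal_id):
        return dict(self._accounts[internal_id])

    def find_by_login(self, app, login):
        # Linear scan is fine for a sketch; a real store would index this.
        for internal_id, logins in self._accounts.items():
            if logins.get(app) == login:
                return internal_id
        return None


idm = IdentityMap()
person = idm.create_identity()
idm.link(person, "directory", "H10032")
idm.link(person, "mail", "jim.smith")
print(idm.find_by_login("mail", "jim.smith") == person)  # True
```

The GUID never has to be shown to the user; the enterprise-wide login ID discussed above would simply be one more value linked to it.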

Unknown said...

For internal users, I like generating a random ID, and then linking it to their HR record, if there is one. Do not use the person's name in generating the ID. Make it random, but it must start with a letter. It should be 5-7 characters long. When I see a 3-4 character ID, I assume it is a service account. Some target systems do not accept an ID of 8 or more characters. Use only characters from the set [A-H,J-K,M-N,P-Z,2-9] so as to avoid mistaking an I for a 1 or an l, or a 0 for an O.
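
The rules above can be sketched in a few lines of Python. This is only one possible reading of them: the function name is made up, and using the `secrets` module for randomness is an assumption of the example.

```python
import secrets

# The set [A-H,J-K,M-N,P-Z,2-9]: no I, L, O, 0 or 1.
LETTERS = "ABCDEFGHJKMNPQRSTUVWXYZ"
DIGITS = "23456789"


def random_login_id():
    """Return a random ID of 5-7 characters that starts with a letter."""
    length = 5 + secrets.randbelow(3)  # 5, 6 or 7
    first = secrets.choice(LETTERS)
    rest = "".join(secrets.choice(LETTERS + DIGITS) for _ in range(length - 1))
    return first + rest


print(random_login_id())
```

Each call returns a different value, so uniqueness still has to be checked against the IDs already issued.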

I agree with Matt that there is great value to the end-user of having the same login ID for all internal systems. This also makes it much easier to link accounts and to correlate audit log entries across multiple systems with a single person.

I differ from many of my IdM colleagues on one issue: Let the end-user select their own e-mail address. The IdM system should not generate it, only ensure its uniqueness, preferably its historical uniqueness as well. The IdM system should also control the domain (right-hand) portion of the address to ensure that the person is a member of its represented organization.

This particular identity attribute represents how the person would like to be identified outside the organization, which is especially important to customer-facing and leadership roles. Sometimes James N Smith prefers jim.smith or Norman P Tyes prefers paul.tyes.

Also, make it a reasonable process, if not completely automated, for a person to change their e-mail address, while still allowing messages to their old address to reach them.

Matt Pollicove said...

Adrian,

Those are some great thoughts. I don't know, however, that a randomized ID would work, since you'd have to keep checking for ID collisions and it could take several iterations to come up with a clean ID.
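
The check-and-retry loop in question might look like the following Python sketch. Everything here is illustrative: the function name, the fixed length of 6, the cap of 10 attempts, and the in-memory set standing in for a directory lookup are all assumptions.

```python
import secrets

LETTERS = "ABCDEFGHJKMNPQRSTUVWXYZ"  # no I, L or O
ALPHABET = LETTERS + "23456789"      # no 0 or 1


def allocate_unique_id(existing, length=6, max_attempts=10):
    """Generate random IDs until one is not in `existing`.

    Returns (new_id, attempts_used) and records the new ID in `existing`.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = secrets.choice(LETTERS) + "".join(
            secrets.choice(ALPHABET) for _ in range(length - 1)
        )
        if candidate not in existing:
            existing.add(candidate)
            return candidate, attempt
    raise RuntimeError(f"no free ID found in {max_attempts} attempts")


issued = set()
new_id, attempts = allocate_unique_id(issued)
print(new_id, attempts)
```

How many iterations this takes in practice depends on how full the ID space already is, so returning the attempt count makes that easy to monitor.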

The 8 character limit does make a lot of sense particularly when considering that there is always the chance that legacy systems will need to be supported as well.

I really hadn't thought about the I/1/l and O/0 conflicts. I think this would be a good thing to consider on a case-by-case basis. The "dictionary" of invalid names could cover most of this. I think if you remove those letters from the ID you won't need to worry about the numeric elements. Also, you could consider enforcing case sensitivity to help as well.

But these are some good things to think on. Thanks!!!!