Maintainable Metadata – An EDRMS / GCDOCS Checklist
We have often made the case that a metadata-based approach is the most forward-thinking one to take when implementing an EDRMS / GCDOCS. This post is intended as a corrective for the other extreme: “why not use as much metadata as possible?” and provides a checklist of criteria for ensuring your metadata is maintainable.
With the advent of Electronic Document and Records Management Systems (EDRMS), departments can use the full suite of functions available in systems such as GCDOCS to build a whole new user experience for daily work. One of the key business offerings of such systems is the ability to use metadata for searching and finding information. Documents created in the system can be defined, contextualized, and optimized for search upon creation. Metadata not only provides a means to find specific objects, but also structures the entire context of the information objects in a system, independent of folders or location. There is a lot to be gained from a thorough approach to metadata.
So why not go wild with metadata, and use as much as possible?
Every system has its limits. Metadata, like the information it describes, only has value if it is used. If it is not being actively searched on, or is too difficult to maintain within the limits of a system (like GCDOCS), then too much metadata can actually have a negative impact on user experience.
Here are some criteria I have used in the past to determine candidates for good pieces of metadata for GCDOCS / EDRMS implementations.
- Will users actually search on it?
- Does it provide value in terms of aggregating types of information?
If users are not actually going to search on the term, there is little value in implementing it. Metadata needs to reflect concepts users will understand and use. For instance, many users will probably want to search objects by generic terms like fiscal year, but the same may not be true for specific financial processes, which exist but are not really relevant at the level of the individual document or information resource.
- Can this metadata element be maintained as a set-list of values?
- Can the list of values be standardized?
It’s important to be able to standardize metadata values into consistent and manageable lists, and this is exactly what a controlled vocabulary entails. If a set of terms does not really form a logical grouping, or is too broad in scope, it is probably not a very good candidate for metadata. In such cases, alternative approaches might satisfy end-users. For instance, instead of maintaining messy metadata elements like “topic” or “keyword” as managed lists, let users enter this information into a description or keyword field. That information is still searchable as metadata, but it relieves some of the burden of metadata management.
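As a rough illustration of the split described above, here is a minimal sketch in plain Python. It does not use any GCDOCS API; the `ControlledField` class, the `tag_document` function, and the fiscal year values are all hypothetical names and data invented for the example. It shows one element held as a managed list, with topics and keywords left to a free-text description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlledField:
    """A metadata element maintained as a set list of allowed values."""
    name: str
    allowed_values: frozenset

    def validate(self, value: str) -> bool:
        # Ignore surrounding whitespace and case when matching against the list.
        normalized = value.strip().lower()
        return normalized in {v.lower() for v in self.allowed_values}


# Hypothetical controlled vocabulary: small, logically grouped, easy to manage.
fiscal_year = ControlledField(
    name="Fiscal Year",
    allowed_values=frozenset({"2022-2023", "2023-2024", "2024-2025"}),
)


def tag_document(fiscal_year_value: str, description: str) -> dict:
    """Build a metadata record, rejecting values outside the managed list."""
    if not fiscal_year.validate(fiscal_year_value):
        raise ValueError(
            f"{fiscal_year_value!r} is not in the {fiscal_year.name} list"
        )
    # Topics and keywords stay in free text: still searchable, no list to maintain.
    return {
        "Fiscal Year": fiscal_year_value.strip(),
        "Description": description.strip(),
    }
```

Calling `tag_document("2023-2024", "Budget forecast; keywords: travel, training")` returns a record with a validated fiscal year and an unconstrained description, while an unlisted value such as `"FY99"` raises a `ValueError`.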
Stability / Frequency of Change
- Do the values change frequently?
- Is it feasible to keep the list up-to-date?
Metadata that becomes out-of-date quickly or is altered too frequently will be very difficult to maintain, especially in a system like GCDOCS. This has an immediate impact on the quality of the system, as out-of-date information is detrimental to a good user experience. Metadata should therefore be stable and unlikely to change too often. Although the exact criteria will depend on the specific situation, I find that for GCDOCS, changes once or twice a year are about as frequent as you want to go (monthly would start to get tricky, especially using the out-of-the-box metadata management capabilities).
Project or contract numbers are an example of a metadata element whose list of values would change frequently, as each new contract is given a unique identifier. While perhaps valuable for searching, it is all but impossible to maintain a list of these values, since they are often generated on the spot and the list could grow without limit. Instead of a set metadata field for these unique identifiers, encourage users to incorporate them into their naming convention or into a description metadata field.
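To make the frequency rule of thumb concrete, the following is a minimal sketch, assuming you can obtain the dates on which a value list was changed. The function names and the threshold constant are hypothetical, and the limit of two changes per year simply encodes my own guideline above rather than any limit built into GCDOCS.

```python
from datetime import date

# Assumption: roughly twice a year is the most change a managed list should see.
MAX_CHANGES_PER_YEAR = 2


def changes_per_year(change_dates: list[date]) -> float:
    """Average number of list changes per year over the observed period."""
    if len(change_dates) < 2:
        return float(len(change_dates))
    span_days = (max(change_dates) - min(change_dates)).days
    # Treat histories shorter than a year as one full year to avoid inflating the rate.
    years = max(span_days / 365.25, 1.0)
    return len(change_dates) / years


def is_stable_enough(change_dates: list[date]) -> bool:
    """True if the list changes no more often than the rule of thumb allows."""
    return changes_per_year(change_dates) <= MAX_CHANGES_PER_YEAR
```

A fiscal year list updated every April would pass this check, whereas a contract number list that gains a value every month would not, which is the signal to move those identifiers into the naming convention or a description field.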
- Is it covered by another piece of metadata?
- Does the metadata duplicate or contradict a piece of system-generated metadata?
Some metadata will overlap partially with others. For instance, it’s possible to have overlaps between specific named document types and whole business activities (e.g. the Management Accountability Framework could be both a document type and a business activity, depending on your architecture). It’s important to reduce redundancy as much as possible, and to re-use existing metadata across the enterprise rather than repeating the information.
It’s also important to check that the system isn’t already capturing something automatically. In addition to content types, dates, and so on, the system tracks a lot of specific actions by means of audit and version control.
Automatically Inherited or User-Entered
- Does the client really need that piece of metadata?
- Could any other existing metadata meet their need?
User-entered forms are complicated to manage and should generally only be offered if the client can demonstrate a specific need to go beyond a generic description field or naming convention. They must also be willing to accept the discipline that comes with filling out metadata correctly each time they input a document. Otherwise, standard entry fields like the Title and Description should suffice.
The Right Context
- Does this metadata need to be mandatory, or can it be optional?
When determining whether something is suitable as metadata, it can also help to frame it in terms of mandatory vs optional on the part of the user. Will this metadata have value if not all objects are tagged with it? Will it be a burden on users to tag objects with this metadata? Is it worth making it mandatory in the system, or will this be a hindrance to usability?
- Is this metadata applicable to the entire enterprise, or only to specific business units?
Not all metadata will apply to every area or line of business, but some may be universally used. When determining the viability of a piece of metadata, it’s worth considering whether it will have value if it is used locally as opposed to across the enterprise. For example, is “fiscal year” generic enough that every group could potentially use it? Are there specific HR classifications which, while valuable as filters for HR, have little or no value for the rest of the organization?
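The checklist as a whole can be sketched as a simple review function. This is only an illustration of how the questions above fit together, not a tool that exists in GCDOCS; the `Candidate` fields, the `review` function, and the example values are hypothetical, and the default threshold of two changes per year is my own rule of thumb.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Answers to the checklist questions for one proposed metadata element."""
    name: str
    users_will_search: bool
    has_standard_value_list: bool
    changes_per_year: float
    duplicates_existing_metadata: bool
    enterprise_wide: bool


def review(candidate: Candidate, max_changes_per_year: float = 2.0) -> list[str]:
    """Return the concerns a candidate raises; an empty list means none."""
    concerns = []
    if not candidate.users_will_search:
        concerns.append("users are unlikely to search on it")
    if not candidate.has_standard_value_list:
        concerns.append("values cannot be standardized; use a description field")
    if candidate.changes_per_year > max_changes_per_year:
        concerns.append("values change too often to maintain")
    if candidate.duplicates_existing_metadata:
        concerns.append("already covered by existing or system-generated metadata")
    return concerns


def suggested_scope(candidate: Candidate) -> str:
    """Suggest where the element belongs if it passes the review."""
    return "enterprise" if candidate.enterprise_wide else "business unit"


fiscal_year = Candidate("Fiscal Year", True, True, 1.0, False, True)
contract_number = Candidate("Contract Number", True, False, 200.0, False, False)
```

Here `review(fiscal_year)` raises no concerns, while `review(contract_number)` flags both the lack of a standard list and the rate of change.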
Remember, the goal of metadata in an EDRMS is to contextualize and enable the information for use; anything else is wasted real estate that detracts from the user experience and makes the system difficult to manage.