Thursday, November 12, 2015

5 Tips for Creating Discoverable Metadata

The wonderful thing about the digital age is the sheer magnitude of information and resources available for exploring, discovering, and learning.  The greatest challenge is the sheer magnitude of information and resources available for exploring, discovering, and learning.

For information seekers, it’s daunting to sift through endless amounts of information and frustrating to distinguish good information from fallacies.  It’s aggravating when, in that search, they find themselves trudging down murky paths that lead to dead ends.

For content providers, it’s a constant battle trying to assist seekers in locating their carefully curated items.  It’s a game of hot and cold as they strive to lead them from one step to the next in their exploration.  It involves massaging and manipulating information in an effort to optimize placement within major search engines.  It often requires finding collaborative arrangements in order to aggregate their content with others to effectively expand their reach.   

Metadata librarians are on the front lines in this battle.  They represent an under-appreciated, yet critically important cog in the effort to make information openly available.  We are apt to appreciate the content, but slow to remember the individuals who spend countless hours sifting through the items, digging up information, and masterfully organizing that information to provide seekers with the best chance of discovery.

In an effort to understand this process better, I reached out to a couple of top-notch metadata librarians for insights and best practices for creating discoverable metadata.

Here are some tips:

1.  Think globally

Success in content curation and delivery is often measured by the expanse of its reach.  Fortunately, the Internet knows no boundaries.  Once online, content can theoretically reach a global audience.  However, in order for those items to be found, it is important to consider whether or not the descriptive data is understandable to a non-local person.  Think about what additional context may be included in order to clarify as much detail as possible. 

Take care in choosing which fields should hold that information.  An overly descriptive “Title” field may hinder rather than help your cause.  The “Description” field is the most appropriate place for extensive detail and context.

2.  Pay attention to detail

Anne-Marie Hamilton-Brehm from the Henderson District Public Library in Nevada said:

“You have to be a good editor with a sharp eye and determination to produce high quality results. Nobody’s perfect, but you have to be willing to learn and adapt and make things right.”

Part of that adaptation may require looking more deeply into the context of the item.  For example, you may have an old photo with some metadata attached to it.  Upon closer inspection of the subjects within the photo, you may find that the clothing styles are not consistent with the recorded date.  Or, you may notice landmarks or other clues that provide additional detail about the location where the photo was taken.  Further investigation and fact-finding may be merited.

In addition, it is important to inspect your metadata before loading it into your digital library platform to ensure consistency within metadata fields.  For example, are all names in a consistent format such as Last Name, First Name or are there some that start with the first name?  Given that much of the information contained has been entered by humans, human error is common.  Most digital library systems will allow for the creation of collection-based controlled vocabularies to help in this inspection.  However, it is still worth a glance-over to ensure all is correct. 
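As a rough illustration of that kind of pre-load inspection, here is a minimal Python sketch that flags names not written as “Last Name, First Name”.  The pattern, the function name, and the sample values are all assumptions made for this example; real name data will include cases this simple check does not handle.

```python
import re

# Assumed convention: "Last Name, First Name" (a comma followed by one space).
# The pattern is deliberately simple and only meant as a first-pass check.
LAST_FIRST = re.compile(r"^[^,]+, [^,]+$")


def find_inconsistent_names(names):
    """Return the names that do not match the 'Last, First' pattern."""
    return [name for name in names if not LAST_FIRST.match(name.strip())]


# Illustrative sample values, not taken from a real collection.
sample = ["Dickens, Charles", "Charles Dickens", "Austen, Jane", "Twain,Mark"]
flagged = find_inconsistent_names(sample)
print(flagged)
```

Anything the check flags still deserves a human look, since a name may be unusual rather than wrong.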

3.  Adhere to professionally accepted standards

Anna Neatrour, Digital Librarian at the J. Willard Marriott Library at the University of Utah, stated:

“Keeping track of standards and having a good understanding of how metadata can be enhanced, both at a local level and in a discovery system can be very useful. As we move towards linked data, using linked data compatible controlled vocabularies becomes more and more important.”

One of the greatest challenges with the Open Archives Initiative has been the consistency, or lack thereof, of field names, terms, references, etc.  Generally speaking, people are quite inconsistent in how they describe or reference things.  One data set may describe Baltimore, Maryland as “Baltimore, MD”.  Another may spell it out completely.  Likewise, the author Charles Dickens may be referred to as “Dickens, Charles”, or “Dickens, Charles J.H.” or “Charles John Huffam Dickens”, or “Dickens, Charles John Huffam”.  There are myriad ways to describe the same thing.  Researchers are often challenged by these inconsistencies.  Standardized methods are always appreciated and greatly improve the efficiency of research.

For data aggregation and harvesting purposes, it is important to seek a standardized method for describing certain aspects of your items.  The Library of Congress is a great place to start when seeking standardized terms.  You can find standardized subject headings, name authorities, etc. to assist in your efforts.  For field names, standards have been created to encourage consistency.  The most common, referred to as Dublin Core, was developed as part of the Dublin Core Metadata Initiative (DCMI).
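To make the idea of standardized field names concrete, here is a small Python sketch that writes an item’s metadata using the fifteen element names of the Dublin Core element set and rejects any field name outside that set.  The wrapper element “record”, the function name, and the sample values are assumptions for illustration only; your platform or aggregator will define its own record structure.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen element names in the Dublin Core element set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def to_dublin_core(fields):
    """Build a record with one dc:* child per field; reject unknown names."""
    record = ET.Element("record")  # wrapper name is a hypothetical choice
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"Not a Dublin Core element: {name}")
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Illustrative sample values, not taken from a real collection.
item = {
    "title": "Sample photograph",
    "creator": "Dickens, Charles",
    "coverage": "Baltimore, Maryland",
}
xml_text = ET.tostring(to_dublin_core(item), encoding="unicode")
print(xml_text)
```

A local field name such as “Photographer” would be rejected here, which is the point: it would first need to be mapped to a standard element such as “creator”.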

Staying current on all standards is essential to success in this area.  Anne-Marie Hamilton-Brehm mentioned:

“Access to collections depends on adherence to standard naming and coding conventions and file formats. You not only have to research and apply standards, but you have to update your knowledge continuously over time as standards and markup languages evolve.”

4.  Focus on Curation

According to Anne-Marie Hamilton-Brehm:

“Placing a collection and its materials in historical context with engaging descriptions helps you connect with your patrons. Providing additional details about images, documents, and other historical materials will encourage visitors to browse and may inspire them to donate related historical materials.”

The bottom line is, the more information and context you can provide, the more engaging your materials will be.  That engagement will open the minds of your patrons to additional discovery and, hopefully, to contributing additional information and materials.

5.  Leverage Technology

There exists a wide array of technological solutions to assist in the process of digital collection creation and curation.  Here are a few items worth mentioning.

Digital Library Platforms
There are many platforms available for archiving and displaying your digital content.  They are quite diverse and are called different things depending upon their core functionalities.  They may be referred to as Digital Library Systems, Digital Asset Management Systems, Institutional Repositories, Digital Management Systems, etc.

If your primary intention is to make your digital collections available for public consumption and dissemination, a Digital Library platform should do the trick.

In deciding on a platform, first and foremost, pick a system that is user-friendly and engaging for both the patron and the librarian.  Digital librarians should be focused on what they do best: the creation and curation of digital content.  Loading the materials into your online platform should be of minor concern.  Patrons should find their way into your collections and around them with ease and simplicity.  A buttery-smooth experience will only enhance their engagement and magnify their appreciation for and contribution to your various collections.

Many librarians are concerned about the technological expertise required to implement some of these systems, particularly since high-quality IT talent is difficult to come by in a library setting.  An open-source solution, though appealing, may be too difficult to implement and manage as a result.

An easy-to-use, out-of-the-box solution worth considering is Simple Digital Library or SimpleDL.  They are focused on simplicity and ease of use for librarians and patrons alike.  Implementation is immediate and the ability to customize and tailor toward a specific look and feel is slick and easy.  They have a quality staff that is able to design and implement a personalized interface for you at a very reasonable cost. 

Regarding this service, Ellen Dubinsky, Digital Librarian at the Clement C. Maxwell Library, said:

“SimpleDL has given us the ideal platform to present our digital image gallery—beautiful display, easy-to-use interface on both the front-end and back-end, and great tech support.”

Spreadsheets
As a metadata librarian, it is of paramount importance to become familiar with, and ideally expert in, the use of spreadsheet applications.

The beauty of spreadsheets is that they effectively divide and categorize all of the metadata in an easy-to-use, easy-to-review interface.  In addition, they improve efficiency tremendously through their fill-down tricks and various functions.

As an example, recently I was creating some metadata for some individual census records and realized that the best title for each item was going to be the last name followed by the first name and age.  The trouble was that there were hundreds of records and the data I wanted to include in the title field were in individual columnar fields.  It would have taken me hours to type each title by hand. 

Instead, I utilized a concatenation function in Microsoft Excel to bring all the data from the respective fields into a new field named “Title”.

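The data looked like this before:

[Image not available: a spreadsheet with the last name, first name, and age in columns E, F, and G.]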

The function I used to bring the data from columns E, F, and G together into D was:

=E2&", "&F2&", age "&G2

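The result was as follows:

[Image not available: column D, the new “Title” field, filled with values of the form last name, first name, age.]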

By copying and pasting the formula down to the bottom of the list, I was able to quickly and easily produce the title field I wanted.
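If your metadata lives in a CSV export rather than an open spreadsheet, the same concatenation can be sketched in a few lines of Python.  The column names (last_name, first_name, age) and the sample rows below are made up for illustration; they simply stand in for columns E, F, and G in the example above.

```python
import csv
import io

# Hypothetical sample data standing in for spreadsheet columns E, F, and G.
raw = """last_name,first_name,age
Smith,John,42
Doe,Jane,37
"""


def add_titles(rows):
    """Add a 'title' value shaped like the formula: Last, First, age N."""
    for row in rows:
        row["title"] = f'{row["last_name"]}, {row["first_name"]}, age {row["age"]}'
    return rows


# Read the CSV text into dictionaries, then build a title for every row.
rows = add_titles(list(csv.DictReader(io.StringIO(raw))))
for row in rows:
    print(row["title"])
```

For a real file, you would replace the in-memory text with an opened CSV file and write the rows back out with csv.DictWriter.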

This, of course, is one of many tricks that may be utilized with many spreadsheet applications to make working with data easier and more efficient.

Other Tools
Anna Neatrour offers a couple more technologies worth considering when working with metadata:

XSLT

“As I’ve been working as a metadata librarian, I’ve grown to appreciate the ways that metadata can be extracted and transformed through leveraging things like XSLT. If I was starting from the beginning, I would have learned XSLT earlier because I would have been much more efficient in some of my earlier work!”

OpenRefine


“Tools such as OpenRefine are becoming an essential part of a metadata librarian’s work now. Practices for descriptive metadata often evolve over time, and it is often time consuming to go back and review and enhance metadata in older collections. Having clear documentation and training for people developing descriptive metadata is key in getting it right the first time.”
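To give a feel for the kind of cleanup this involves, here is a minimal Python sketch loosely modeled on the key-collision idea behind OpenRefine’s clustering feature: values that reduce to the same simplified key are grouped as likely variants of one another.  This is a simplified approximation written for this post, not OpenRefine’s actual implementation, and the function names and sample values are assumptions.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Simplified key: fold to ASCII, lowercase, drop punctuation, sort unique tokens."""
    folded = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    cleaned = folded.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values sharing a fingerprint; keep only groups with more than one member."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


# Illustrative variants of the same name.
names = ["Dickens, Charles", "Charles Dickens", "dickens, charles", "Dickens, Charles J.H."]
clusters = cluster(names)
print(clusters)
```

Note that “Dickens, Charles J.H.” is not grouped with the others because the initials change its key, so a method like this finds formatting variants but still leaves fuller and shorter forms of a name for a person to reconcile.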

If you have any other tips or tricks you would like to share with the community, please comment below.  We would love to get your thoughts and feedback on this important topic.