Saturday, October 19, 2013

Increasing access through tagging

When trying to assign subject terms to items, the librarian has to find a balance between control and access, between specificity and broadness, between imposing meaning and allowing users to find their own meaning, between institutional viewpoints and user interpretation, between objectivity and subjectivity. It is a hard balance to achieve, and a blended approach that combines controlled terms imposed by the library with uncontrolled tags submitted by viewers may be best. Controlled and uncontrolled vocabularies each have pros and cons, and ideally, each could be used to offset the faults of the other.

Controlled vocabularies, such as Library of Congress Subject Headings and terms from the Getty Art and Architecture Thesaurus, aim to limit subject access to specific and uniform terms. They eliminate most redundancy, and they limit syntax and vocabulary so that all users are using the same terms. If the user knows the subject term, then finding items that fit that topic is easy, because all items will be tagged with exactly the same term, with no variations in spelling, punctuation, or phrasing. Controlled vocabulary also helps the institution keep its subjects politically neutral and objective. The terms were chosen by another organization, and any choices about the political correctness of terminology fall on the organization that developed the vocabulary.

On the other hand, controlled vocabularies often deviate from natural language, for example in the order of names. Users who search using natural language will search for Firstname Lastname, while many controlled vocabularies list the name as Lastname, Firstname. Controlled vocabularies also eliminate the synonyms and redundancy that can aid access. Users may not know the exact term and may search for a similar but different word (say, they search for “bubbler” instead of “water fountain”, or for “dress” instead of “dresses”). Users might then be frustrated that their search returns no results. Lastly, controlled vocabularies limit input and access for many users who do not fit the “default” demographic. The controlled vocabularies are often created by white institutions of power, and may not reflect the voice of everyone in society. The Library of Congress’s headings for social issues like feminism, racism, and the struggles of LGBTQ people may not use the terms that the people concerned would want used. They represent the viewpoint of only one social group. Controlled vocabularies also limit access for international users. The lack of redundancy and similar terms means that international users who are searching for pictures of trucks will get no hits on their search term “lorry”. The internet is an international forum, and it should be accessible to a varied and international community. Controlled vocabularies alone do not allow this.

Uncontrolled vocabularies offset many of the downsides of controlled vocabularies. They greatly improve access by allowing users to tag images with colloquial terms. They have redundancy, so different but similar searches will still return results. Natural language searches become much easier, as do searches with various spellings and syntax. For instance, while the “official” spelling of the term for Korean dancing girls is “kisaeng,” many people spell it “gisaeng,” and if an image were tagged with both, then both groups of people would be able to find it. While controlled vocabularies often try to remain detached and objective, assigning subjects only to what the cataloger considers the most important parts of the image, uncontrolled vocabularies and social tagging allow people to apply their subjective views to the image. Different social and political groups can use their own terms for issues. Small side objects in pictures can be highlighted by a tag, such as the musical instrument in the corner of the weaving picture that was mentioned in the LoC report. While it may not be the “point” of the image, uncontrolled vocabularies and user tagging would allow people looking for pictures of that particular instrument to still find the image. Uncontrolled vocabularies greatly increase the accessibility of an item by giving equal voice to all interpretations and objects in the image, in whatever terms the users find best to describe them.

This is not to say that uncontrolled vocabularies have no downsides. They have no rules about spelling or syntax, or about whether to use a singular or plural. They allow almost infinite combinations of terms and infinite broadness and specificity. No image can truly be tagged with every single variation of a term, so inevitably some tags would be missing from some images. A less thorough tagger may tag an image “dress” but forget “dresses,” and the searcher would then have the same problem as with controlled vocabularies. Another problem is controversy. Users at either end of a political spectrum may disagree with the other’s terminology, and may complain about the other viewpoint. This places a burden on the librarian to decide whether to give both views equal space, or to try to judge the political correctness of each point of view, thereby making a moral, social, and political statement on behalf of the institution.

Ultimately, I think the best solution would be to use both sets of terms. Let controlled vocabulary terms be added to the tags list, and then allow users to fill in the rest on their own. The user tags will allow better access and variety, and nothing on the internet is worth anything if it is not accessible to a wide audience. Also, by having both sets of terms, the cataloger can make sure that the areas that the institution wants highlighted are definitely available as access terms, while leaving the flexibility to allow more terms. There are always things in pictures that the cataloger won’t see or won’t think are important that others would. The things that users see can then become new access points for others. The wonderful thing about tagging on platforms like Flickr is that the number of access points is greatly increased. Controlled vocabularies are rooted in the days of card catalogs, when access was limited by the space that the cards took up. With more space, we can create more access points, and we can better describe the object by allowing users to help us create the access points that they want. It may not be as consistent as just using controlled vocabularies, and there may be items that don’t get tagged effectively for access. But the increase in access from uncontrolled vocabularies more than offsets the few items that fall between the cracks.
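
To make the blended approach concrete, here is a minimal sketch in Python of an item that carries both kinds of terms in one searchable list. Everything in it is an assumption made for illustration: the Item class, its field names, the search function, and the sample photo are hypothetical and do not describe how Flickr or any library catalog actually works. The search is deliberately a plain exact match, so it only shows how user tags add access points; it does not solve the singular/plural problem described above.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A catalogued image with two kinds of access points (hypothetical model)."""
    title: str
    controlled_terms: set[str] = field(default_factory=set)  # assigned by the cataloger
    user_tags: set[str] = field(default_factory=set)         # contributed by viewers

    def access_points(self) -> set[str]:
        # Both vocabularies feed one searchable list of terms, compared case-insensitively.
        return {term.casefold() for term in self.controlled_terms | self.user_tags}


def search(items: list[Item], query: str) -> list[str]:
    """Return titles of items that have the exact query among their access points."""
    q = query.strip().casefold()
    return [item.title for item in items if q in item.access_points()]


# Illustrative record: the cataloger has assigned a single controlled term.
photo = Item(title="Delivery truck, 1920s", controlled_terms={"Trucks"})

# With only the controlled term, a searcher who types "lorry" finds nothing.
before = search([photo], "lorry")

# After a viewer adds a colloquial tag, both searches succeed.
photo.user_tags.add("lorry")
after_lorry = search([photo], "lorry")
after_trucks = search([photo], "trucks")
```

In this sketch the institution's term is always present, so the cataloger's chosen access point is guaranteed, while each user tag simply adds one more way in.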

As for whether the Flickr Commons could benefit institutions of all sizes, I say yes. It increases the visibility of the collections and therefore of the institutions. It allows increased access to valuable material that can then be used by the internet community at large. Smaller institutions can better lobby for local support, as well as attract the attention of historians and researchers from the wider audience who can help identify items and add meaning and context to them. Institutions of all sizes can benefit from the increased name awareness and traffic.
