Competency G: Classification Systems & Metadata
“What are these forms for?”
“These are forms to get the forms that enable us to order more forms, sir.” Colonel Blake and Radar O’Reilly, M*A*S*H
Introduction
Classification systems are vital to organizing and retrieving information, so it stands to reason that Library and Information Science (LIS) professionals should know and understand them. From the Library of Congress Classification (LCC) and the Dewey Decimal Classification (DDC) to metadata standards such as MARC and Dublin Core, there are many ways to present data about data. Controlled vocabularies can also be proprietary, such as those used in online music or art databases.
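To make "data about data" concrete, the following is a minimal sketch of a simple Dublin Core record serialized to XML. The element names (title, creator, subject, type, date) and the namespace URI come from the Dublin Core element set; the wrapper element name, the function name, and every sample value are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core_xml(record):
    """Serialize a dict of Dublin Core elements to an XML string.

    Each key is a Dublin Core element name and each value is a list,
    because Dublin Core elements are repeatable.
    """
    # "record" is a hypothetical wrapper element, not part of Dublin Core.
    root = ET.Element("record")
    for element, values in record.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample item; none of these values describe a real holding.
sample_record = {
    "title": ["Example Field Recording"],
    "creator": ["Unknown"],
    "subject": ["Folk music", "Field recordings"],
    "type": ["Sound"],
    "date": ["1952"],
}

print(to_dublin_core_xml(sample_record))
```

Storing each value in a list reflects the fact that a single item can carry several subjects or creators, which is one reason consistent metadata entry takes care.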
How GLAM!
Salse-Rovira et al. (2024) aimed to produce a metadata schema and database for Galleries, Libraries, Archives, and Museums (GLAM), as these institutions often lack long-term digital backups and consistent, properly formatted metadata. They further propose splitting the maintenance of the database among GLAM staff according to department. For example, gallery and museum employees would handle data entry for each taxonomic item, while library and archives professionals would attend to the database, archival organization, and metadata schemas. The items being cataloged are often considered world heritage items and must be maintained for the betterment and enrichment of humanity.
Handwritten to Digital
Another important development is handwritten text recognition (HTR), sometimes called handwritten text identification. Nockels et al. (2024) note that before this technology existed, parsing handwritten manuscripts was difficult and time-consuming, and it drew on resources that could have been spent elsewhere. HTR is closely related to Optical Character Recognition (OCR), which recognizes printed characters individually before assembling them into words; HTR extends this kind of recognition to handwriting. This allows for the mass digitization of archives that previously had to be meticulously transcribed by hand and uploaded manually for each taxonomy.
Late et al. (2024) raise a further issue: little is known about how digitized photographs are used in the arts and humanities, particularly for cultural heritage sites. They state that while GLAM materials have previously been digitized for academic purposes, those materials have almost always been textual, and it is unclear how photographs and other visual assets are used, cataloged, and organized in a GLAM or even an academic environment. Through qualitative interviews, Late et al. (2024) attempt to classify the overall use and usefulness of digitized visual archives.
Linked Data and Google Strikes Again
Finally, library metadata in the digital realm is another point of interest for LIS professionals. Huwe (2024) states that library metadata is driving linked data innovation for the following reason: “Coders and industry leaders such as OCLC follow its development closely since it is fundamentally network-agnostic and might become a lingua franca that connects a spectrum of metadata systems.” Given the number of metadata schemas and classification systems used to organize library and information materials, this is an exciting development in digital archives and classification. Huwe (2024) goes on to summarize: “Second, linked data encourages the reuse of existing controlled vocabularies and finding aids. Continuous revision, onerous at first, will become more cost-effective over time. Complex database processes, long overdue for simplification, are additional candidates for a makeover.”
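The idea of reusing a controlled vocabulary across systems can be illustrated with a small sketch. Linked data expresses statements as subject, predicate, object triples identified by URIs; when two institutions point to the same vocabulary URI, their records can be merged and queried together. This is an illustration only, not a method described by Huwe (2024): the two predicate URIs are real Dublin Core terms, while every example.org URI, title, and function name below is hypothetical.

```python
# Real Dublin Core terms used as predicates.
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"

# Hypothetical shared controlled-vocabulary term (not a real authority URI).
FOLK_MUSIC = "http://example.org/vocab/folk-music"

# Each statement is a (subject, predicate, object) triple.
library_triples = [
    ("http://example.org/library/item/1", DCT_TITLE, "Example Songbook"),
    ("http://example.org/library/item/1", DCT_SUBJECT, FOLK_MUSIC),
]
archive_triples = [
    ("http://example.org/archive/item/9", DCT_TITLE, "Example Field Recording"),
    ("http://example.org/archive/item/9", DCT_SUBJECT, FOLK_MUSIC),
]


def items_about(triples, subject_uri):
    """Return the sorted URIs of all items tagged with the given subject URI."""
    return sorted(
        {s for s, p, o in triples if p == DCT_SUBJECT and o == subject_uri}
    )


# Because both institutions reuse the same subject URI, merging their
# data is simple concatenation; no term-by-term crosswalk is needed.
merged = library_triples + archive_triples
print(items_about(merged, FOLK_MUSIC))
```

In practice a dedicated RDF library and a published vocabulary would replace the plain tuples and the placeholder URI; the tuples are used here only to keep the sketch self-contained.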
Conclusion
Despite the overwhelming number of classification systems, metadata taxonomies, et cetera, there is great hope of streamlining these processes for future library professionals, so that existing taxonomies are easier to learn and use without requiring high levels of technical skill. Even if these systems seem overwhelming now, computer technology and digitization will likely soon make them an invaluable resource for years to come.
Artifacts and Evidence
Artifact 1
Assignment:
Course: INFO 220 Music Librarianship and Informatics
Description:
This project sought to develop a controlled vocabulary and an overall data management plan for both physical and digital assets. Like many other GLAM facilities, the typical music library has taxonomies and categorizations that are specific to its medium. This project also describes the need for cross-functional teams to come together to ensure that items are preserved, categorized, and archived in a tidy and responsible manner. Many audiovisual libraries deal in fragile documents, manuscripts, film reels, photographs, compact discs, and other physical assets that are difficult to transfer to an online database. There will certainly need to be more emphasis on these types of learning and teaching experiences in the future, particularly for the average librarian, if we expect to preserve our cultural heritage in the arts and humanities at large. The internet has made searching and finding easier than ever; however, collections such as these show a marked gap in consistency and metadata logging.
Artifact 2
Assignment:
Course: INFO 202 Information Retrieval and Database Design
Description:
This controlled vocabulary exercise taught my classmates and me how to create and organize a controlled vocabulary and prepare metadata for a catalog. It fed into our other projects in Caspio and helped us organize our alpha and beta testing information into a taxonomy that made sense. While indexing is admittedly not my strong suit, I am better equipped to understand this type of work than I once was, and I can at least see the correct way to begin or accomplish a project of this type, even if I require outside help from another resource, such as an Information Technology (IT) department. Of everything I have done with databases, inventing controlled vocabularies has always been one of my favorite metadata exercises. SJSU offers a class on Vocabulary Design, and I regret that I did not take it; I now feel I should have, as my classification skills would be stronger.
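As an illustration of what a controlled vocabulary does in a catalog, the sketch below maps variant terms entered by catalogers to a single preferred term. It is a minimal example under stated assumptions and is not the vocabulary built for this assignment: the terms, the dictionary, and the function names are all hypothetical.

```python
# Hypothetical vocabulary: each preferred term lists its accepted variants.
PREFERRED_TERMS = {
    "compact discs": ["cd", "cds", "compact disc"],
    "film reels": ["film reel", "reel film"],
    "photographs": ["photo", "photos", "photograph"],
}


def build_lookup(vocabulary):
    """Build a flat mapping from every variant (and preferred term) to its preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred] = preferred
        for variant in variants:
            lookup[variant] = preferred
    return lookup


def normalize(term, lookup):
    """Return the preferred term, or None if the term is not in the vocabulary."""
    # Lowercase and collapse whitespace so "  CD " and "cd" match.
    key = " ".join(term.lower().split())
    return lookup.get(key)


lookup = build_lookup(PREFERRED_TERMS)
print(normalize("  CD ", lookup))
print(normalize("Vinyl", lookup))
```

Returning None for an unknown term, rather than guessing, leaves the decision about adding a new term to a person, which matches how controlled vocabularies are usually maintained.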
Artifact 3
Assignment:
Course: INFO 202 Information Retrieval and Database Design
Description:
The alpha prototype returns. Snack chips, ahoy (no pun intended)! Not only did I complete extensive work in Caspio for this and other group projects in this class, but we were also responsible for developing controlled vocabularies for our databases. While I no longer recall (due to file loss) what those vocabulary terms were, I understand that there is a great and desperate need to create and maintain taxonomies for small projects of this type as well as larger ones. If I could return to school and continue to study, I would say that this project piqued my interest in learning SQL, Vocabulary Design, and related subjects. I did attempt a Python class after this project but quickly learned that, like many female and AFAB librarians (who are in the majority), I struggled with learning to code, likely due to being underserved in the field. Still, this project and this class were eye-opening for me.
References
Huwe, T. K. (2024). Library metadata is driving linked data innovation. Computers in Libraries, 44(11), 16–18. Retrieved from https://www.infotoday.com/cilmag/nov24/Huwe–Library-Metadata-is-Driving-Linked-Data-Innovation.shtml
Late, E., Ruotsalainen, H., & Kumpulainen, S. (2024). Image searching in an open photograph archive: Search tactics and faced barriers in historical research. International Journal on Digital Libraries, 25, 715–728. https://doi.org/10.1007/s00799-023-00390-1
Nockels, J., Gooding, P., & Terras, M. (2024). The implications of handwritten text recognition for accessing the past at scale. Journal of Documentation, 80(7), 148–167. https://doi.org/10.1108/JD-09-2023-0183
Salse-Rovira, M., Jornet-Benito, N., Guallar, J., Mateo-Bretos, M. P., & Silvestre-Canut, J. O. (2024). Universities, heritage, and non-museum institutions: A methodological proposal for sustainable documentation. International Journal on Digital Libraries, 25, 603–622. https://doi.org/10.1007/s00799-023-00383-0