Competency E: Information Retrieval Systems
“The brain, the brain,
The center of the chain
That’ll never happen if you use your brain
Neurons run through the cortex
Into the frontal lobe
Past the hypothalamus, and it’s good to go,” Claudia’s Study Song, The Baby-Sitters Club Movie
Introduction
Information retrieval systems are a complex and sometimes overwhelming aspect of the library and information science field. If library management, marketing, and so forth are difficult for the average librarian, one can only imagine that information retrieval, especially the design of information systems and databases, is harder still. Even so, it’s time to press on and explore the fascinating world of information retrieval.
Internal Data Retrieval Programs Overview
There are nearly as many ways to organize, store, and retrieve information as there are kinds of information, which is to say an overwhelming number. Despite this daunting reality, information professionals manage to come up with innovative ways to draw out information, both new and old. For example, Koopman et al. (2024) have developed an agent for farmers called AgAsk, a data retrieval program that uses natural language processing to assist farmers in finding information from scientific resources. A tool like this can help farmers find vital information quickly when making last-minute decisions regarding weather, crop yield, and planting. Built on the Telegram messaging platform, this tool aims to answer the most pressing questions for farmers and agricultural specialists.
External Data Retrieval Programs (Just Say Google, It’s Okay)
Internal programs for data searching are not the only consideration. Jahani et al. (2024) investigate how well digital library content can be retrieved using popular and well-known search engines, such as Google. They determined that while many offline databases had been studied, there was little to no research on how working and active databases for digital libraries interacted with Google, Bing, and other popular search engines in tandem. Unsurprisingly, Jahani et al. (2024) found that Google dominated this landscape at 32.55% of access. While other search engines were studied, it’s clear that many people (myself included) rely on Google regularly to access all kinds of information, including digital library databases.
Gender Bias and Information Retrieval
Other popular and well-known sites, such as Wikipedia, also suffer from bias, overuse, and overreliance, with measurable consequences. Centelles and Ferran-Ferrer (2024) state: “In these cases, categories like ‘woman’ or ‘non-binary person’ are prohibited for the organization of content and thus information retrieval. These community-based decisions lead to some dysfunctions, which are particularly critical in languages that use grammatical gender, such as Catalan and Italian. Addressing this bias is important for providing equitable information retrieval and knowledge representation.”
As the overwhelming majority of library professionals identify as women or nonbinary (Department for Professional Employees, 2024), this is indeed problematic. While one might dismiss the Romance-language versions of Wikipedia as inherently gendered along a binary, and thus difficult to manage equitably in that regard, the phenomenon persists even in the English-language version. Combined with the statistic that female and AFAB professionals are still paid only 83.6 percent of what a male employee would be (Department for Professional Employees, 2024), this means that most librarians are severely underpaid and yet persist in pursuing meaningful work in data and database design.
As a nonbinary (AFAB) member of the LGBTQIA+ community, I can confirm that gender bias is a real and concerning fact of life, one that makes meaningful study exceptionally difficult, particularly in information retrieval, database management, and other forms of analytics. Coding languages like Python, SQL, and R are overwhelmingly learned and taught by men. As of 2023 (Statista, n.d.), 75.7% of software developers identified as male. While the number of female developers is rapidly increasing, this imbalance still presents an issue for the library and information science (LIS) field. If most library professionals are female and most software developers are male, who is making the databases and who is running and testing them?
Conclusion
Ultimately, database design and information retrieval are vital to the information professions, including to those holding a Master’s in Library and Information Science (MLIS), and the gender imbalance between those who build databases and those who use them is a clear signal that changes need to be made. Pilot programs that support female and nonbinary librarians in education and hiring would be a huge boon to library professionals worldwide.
Artifacts and Evidence
Artifact 1
Assignment:
Course: INFO 202 Information Retrieval and Database Design
Description:
This group project involved designing a conceptual database for a collection of items that a stakeholder might need organized. We chose snack chips, with the intended users being owners of sandwich shops and convenience stores who need to restock and keep an inventory of new, exciting, and novel flavors to pair with their sandwiches. All bags of chips were assumed to be snack-sized, that is to say, individually sized as one might find in the typical sub or sandwich shop. Family-sized chip bags were not considered in this process, nor were standard flavors that we considered “mundane”. My part of this project included research into the demographics of snack chip consumers, and I also did some partial work in Caspio (which was later lost to computer error) to organize the chips accordingly within the database. I believe that some basic SQL was also introduced early in this class to demonstrate how it was meant to work, but it was likely not used in this particular assignment or any other after its first introduction.
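Because the original Caspio prototype was lost, the following is only a minimal sketch of how an inventory like this might be modeled, written in Python with the built-in sqlite3 module rather than in Caspio. The table name, column names, and sample rows are all assumptions made for illustration and do not reproduce the fields our team actually used.

```python
import sqlite3


def build_chip_db():
    """Create an in-memory database with a hypothetical 'chips' table."""
    conn = sqlite3.connect(":memory:")
    # Assumed columns: is_novel is 1 for a new or unusual flavor and 0 for a
    # standard one; bag_size is either 'snack' or 'family'.
    conn.execute(
        """
        CREATE TABLE chips (
            id INTEGER PRIMARY KEY,
            brand TEXT NOT NULL,
            flavor TEXT NOT NULL,
            is_novel INTEGER NOT NULL,
            bag_size TEXT NOT NULL,
            in_stock INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    # Made-up sample rows, not data from the original project.
    sample = [
        ("Brand A", "Dill Pickle", 1, "snack", 12),
        ("Brand B", "Original", 0, "snack", 40),
        ("Brand C", "Honey Sriracha", 1, "family", 5),
        ("Brand D", "Maple Bacon", 1, "snack", 0),
    ]
    conn.executemany(
        "INSERT INTO chips (brand, flavor, is_novel, bag_size, in_stock) "
        "VALUES (?, ?, ?, ?, ?)",
        sample,
    )
    return conn


def novel_snack_flavors(conn):
    """Return novel flavors sold in snack-sized bags, sorted by name."""
    rows = conn.execute(
        "SELECT flavor FROM chips "
        "WHERE is_novel = 1 AND bag_size = 'snack' "
        "ORDER BY flavor"
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_chip_db()
    print(novel_snack_flavors(conn))  # ['Dill Pickle', 'Maple Bacon']
```

The query mirrors the scope described above: family-sized bags and standard flavors are excluded, so only the novel, snack-sized flavors are returned.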
Artifact 2
Assignment:
Course: INFO 202 Information Retrieval and Database Design
Description:
This assignment allowed me to reflect on the class overall, and the group work I completed with my peers, which ended up being a semester-long assignment with various parts or sections. I discussed my extensive work doing a little bit of everything, but particularly in working with the technological side of things in Caspio. One of my teammates had a family emergency early into the class, and our copy of the prototype database was lost the night before it was due. I quickly whipped up another copy of the database for the assignment, and by the time our colleague came back, we had been able to restore the prototype to its former glory. My team was very proud of the achievement and impressed by my gumption and work ethic, which increased morale and set the tone for the remainder of the class. While all members of the team contributed equally, I felt particularly proud of my contributions therein and was emboldened by my newfound skills.
Artifact 3
Assignment:
Course: INFO 202 Information Retrieval and Database Design
Description:
This assignment was the beta evaluation of another team’s work, which organized data about houseplants to help users choose the proper indoor plants for their home environment. Our team found the project to be delightful, well-managed, and thoughtfully produced, and we were not eager to criticize it. As a personal fan of indoor and outdoor gardening alike, I still think of this project idea to this day. I believe that our team suggested adding a category or checkbox that would allow users to filter out plants that would be harmful to pets, but otherwise we could see no fault in this particular project, and those who received our project on snack chips could find little to no fault in ours, either. I don’t recall whether the indoor plant team traded with us project-for-project or if another team received our work, but in any case, my classmates, whether in our group or not, were hardworking and dedicated to the pursuit of database design.
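The pet-safety filter we suggested could work roughly like the sketch below. This is not the other team’s implementation; the Plant record, its fields, and the sample entries are hypothetical placeholders chosen only to show how one extra yes/no field supports the kind of checkbox filter described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    """Hypothetical plant record; field names are assumptions."""
    name: str
    light: str           # assumed values: "low", "medium", "bright"
    toxic_to_pets: bool  # the extra field behind the suggested checkbox


# Placeholder entries, not real plant data.
SAMPLE_PLANTS = [
    Plant("Plant A", "low", True),
    Plant("Plant B", "low", False),
    Plant("Plant C", "bright", False),
    Plant("Plant D", "medium", True),
]


def filter_plants(plants, light=None, pet_safe_only=False):
    """Return names of plants matching the optional light level and pet filter."""
    results = []
    for plant in plants:
        if light is not None and plant.light != light:
            continue  # wrong light level for this home
        if pet_safe_only and plant.toxic_to_pets:
            continue  # checkbox ticked: skip plants harmful to pets
        results.append(plant.name)
    return results


if __name__ == "__main__":
    print(filter_plants(SAMPLE_PLANTS, pet_safe_only=True))
    print(filter_plants(SAMPLE_PLANTS, light="low", pet_safe_only=True))
```

Leaving the checkbox off (pet_safe_only=False) returns every plant that matches the other criteria, so the filter adds an option without changing existing searches.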
References
Centelles, M., & Ferran-Ferrer, N. (2024). Assessing knowledge organization systems from a gender perspective: Wikipedia taxonomy and Wikidata ontologies. Journal of Documentation, 80(7), 124–147. https://doi.org/10.1108/JD-11-2023-0230
Department for Professional Employees, AFL-CIO. (n.d.). Library professionals: Facts and figures. Retrieved November 8, 2024, from https://www.dpeaflcio.org/factsheets/library-professionals-facts-and-figures
Jahani, H., Azzopardi, L., & Sanderson, M. (2024). Measuring the retrievability of digital library content using analytics data. Journal of the Association for Information Science and Technology, 75(11), 1233–1248. https://doi.org/10.1002/asi.24886
Koopman, B., Mourad, A., Li, H., van der Vegt, A., Zhuang, S., Gibson, S., Dang, Y., Lawrence, D., & Zuccon, G. (2024). AgAsk: An agent to help answer farmers’ questions from scientific documents. International Journal on Digital Libraries, 25(3), 569–584. https://doi.org/10.1007/s00799-023-00369-y
Statista Research Department. (n.d.). Worldwide developer gender distribution 2022. Statista. Retrieved November 8, 2024, from https://www.statista.com/statistics/1446245/worldwide-developer-gender-distribution/