NCC News: News

Japanese Studies Spotlight: Automating Book Selections and Donation Intake for Japanese (or any other language) Materials

by Paula Curtis on 2024-02-14T16:48:14-05:00

The NCC is collaborating with institutions and scholars to release a monthly series on our blog entitled Japanese Studies Spotlight. These features showcase exciting online collections available to researchers and students in Japanese Studies, introducing the archive or project, describing their contents, and demonstrating how they can be usefully engaged in research or in the classroom. If you are interested in submitting something to the series, please contact Paula R. Curtis, NCC’s Digital Media Manager, at digitalmediamanager@nccjapan.org.


Adam H. Lisbon, Japanese and Korean Studies Librarian, University of Colorado Boulder
 

Introduction

In my ten years as a subject liaison librarian supporting Japanese (and Korean) collections, the most significant constraint on my time has been communicating about purchases and donations with faculty and student researchers, donors, acquisitions, cataloging, and vendors. Because these groups vary widely in their ability to speak an Asian language, and many speak none at all, linguistic barriers make collaboration difficult. To address these challenges and gather bibliographic data quickly, I created a program, Chinese, Japanese, Korean Material Processing (CJKmP), which scrapes data from WorldCat via the FirstSearch database (instead of WorldCat.org) and organizes it into a spreadsheet-friendly TSV (tab-separated values) file format. Because the software’s output is designed for spreadsheets, I also created three downloadable spreadsheets that work in tandem with CJKmP.
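To make the TSV idea concrete, here is a minimal Python sketch of writing bibliographic records as tab-separated text. It is not CJKmP's code (CJKmP is written in AutoHotkey), and the column names and the sample record are hypothetical placeholders, not the software's actual fields.

```python
import csv
import io
import re

# Hypothetical column names; CJKmP's actual columns may differ.
FIELDS = ["title", "author", "publisher", "year", "isbn", "oclc_number"]

def records_to_tsv(records):
    """Serialize a list of dicts to tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    for record in records:
        # A tab or line break inside a field would shift columns when pasted
        # into a spreadsheet, so replace each run of them with one space.
        cleaned = {key: re.sub(r"[\t\r\n]+", " ", str(record.get(key, "")))
                   for key in FIELDS}
        writer.writerow(cleaned)
    return buffer.getvalue()

if __name__ == "__main__":
    # Placeholder record for illustration only (not real catalog data).
    sample = [{"title": "サンプル書名", "author": "著者名", "year": "2020"}]
    print(records_to_tsv(sample))
```

Each record becomes one row, so the text can be pasted or imported directly into a spreadsheet with one field per column.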

CJKmP is published under an open access license, and is available for no charge. The code is also freely available, so it is possible for any user to customize the software for their particular collections or research needs.
 

Who Should or Could Use CJKmP?

Any librarian selecting Chinese, Japanese, or Korean (CJK) materials for their collections will benefit from this software, and it will be especially useful for those who work and coordinate with non-CJK speakers or who need to support library or archival collections in a language they don’t speak proficiently. Researchers who are interested in exploring bibliographic data as a source of information on Asian-language collections could also use this program.
 

How Steep is the Learning Curve? Is it Open Access?

CJKmP is designed to be as easy to use as possible and to require minimal technical expertise. It works in tandem with Excel spreadsheets that you can download from the GitHub site: start the software, open one of the spreadsheets, open a web browser and log into FirstSearch, and you are good to go. Familiarity with bibliographic data matters much more than technical skill, since you need to be able to find materials in FirstSearch and understand the nature of the books you are recording in a spreadsheet.

The Read Me explains a few initial setup steps. You need Windows (10 or above) and institutional access to the FirstSearch database. The software itself is fairly straightforward; the biggest hurdle is learning which keys to press to run its features. There is also a tutorial mode to help you the first few times you pull data from FirstSearch into a spreadsheet, and you can turn it off once you feel comfortable with the workflows.

Figure 1. CJKmP GUI (graphical user interface)

The software also has various “checks” to make sure you are doing what is necessary for it to run. For example, it will “see” that you haven’t opened a spreadsheet, or that you haven’t loaded FirstSearch in a browser window, and display a notification telling you what you need to do. In terms of technical expertise with Excel, it is helpful but not necessary to be familiar with concepts like tables, conditional formatting, charts, and some basic formulas.

There are also more experimental features for cleaning up ISBNs, translating titles, and getting price data that are covered in the Read Me guide.
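The Read Me covers how CJKmP itself cleans ISBNs. As a general illustration of what ISBN cleanup usually involves, the Python sketch below strips punctuation, converts full-width digits (common in Japanese-language data) to ASCII, validates the standard ISBN-10 and ISBN-13 check digits, and converts ISBN-10 to ISBN-13. The function names are my own, and nothing here is taken from CJKmP's code.

```python
import re
import unicodedata

def clean_isbn(raw):
    """Normalize full-width characters, then keep only digits and X."""
    normalized = unicodedata.normalize("NFKC", raw)
    return re.sub(r"[^0-9Xx]", "", normalized).upper()

def is_valid_isbn10(isbn):
    """Check an ISBN-10: weights 10 down to 1, total divisible by 11."""
    if not re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(isbn))
    return total % 11 == 0

def is_valid_isbn13(isbn):
    """Check an ISBN-13: alternating weights 1 and 3, total divisible by 10."""
    if not re.fullmatch(r"[0-9]{13}", isbn):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                for i, ch in enumerate(isbn))
    return total % 10 == 0

def isbn10_to_isbn13(isbn10):
    """Prefix 978, drop the old check digit, and compute the new one."""
    core = "978" + isbn10[:9]
    total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                for i, ch in enumerate(core))
    return core + str((10 - total % 10) % 10)
```

For example, `clean_isbn("ISBN 978-0-306-40615-7")` yields a bare 13-digit string that can then be passed to `is_valid_isbn13` before it is used as a search key.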

When explaining software, showing is better than telling. I welcome everyone to watch this basic introduction of CJKmP in action. I review the process of downloading the software and spreadsheets, understanding the interface, loading FirstSearch, and the steps to extract bibliographic data to create a list of books to order (or use for any other purpose):

 

 

Under the Hood: Using CJKmP’s Software for Custom Needs


Creating a list of materials for purchase:

If you download the “Orders Template” spreadsheet, you can quickly build a list of materials to buy with complete bibliographic data. This is useful when the patrons you support ask you to purchase books, or for general collection development. You can enter a title, ISBN, or OCLC number into the spreadsheet to search FirstSearch/WorldCat and bring the matching data back into the spreadsheet, or simply browse FirstSearch and add records to the spreadsheet as you go.

Intaking donations:

If you need to review accepted donations before they are processed and sent to cataloging, you can quickly create a list if your donor didn’t provide one, or add more comprehensive data than the donor provided. If your colleagues cannot read Chinese, Japanese, or Korean, you can print paper slips from the generated list that show each book’s OCLC number and the collection it should go to.

Selection Lists:

This spreadsheet works like the other two but has additional columns so that you can share it with a collaborator, who can then select the individual books they want from the list you’ve created. In one of these columns, the other user simply types a “y” for “yes” to indicate that they want you to purchase a book. This is useful for understanding research faculty’s interests and for proactively building such lists.
 

Additional Features in the Spreadsheets

Using conditional formatting, these spreadsheets make it easier to catch errors and ensure you have complete data on an item. Some examples include:

  • Cells change color to flag duplicate titles, ISBNs, or OCLC numbers.
  • Formatting errors, such as multiple ISBNs in one cell, are flagged.
  • Blank red cells indicate incomplete data, e.g., a missing collection/location where a book should be shelved.
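The spreadsheets perform these checks with Excel conditional formatting. For readers who export their lists and want to run similar checks in a script, here is a rough Python equivalent. The field names (`title`, `isbn`, `location`) and the ISBN-matching pattern are assumptions for illustration, not the spreadsheets' actual rules.

```python
import re
from collections import Counter

def find_duplicates(rows, field):
    """Return values of `field` that appear in more than one row (blanks ignored)."""
    counts = Counter((row.get(field) or "").strip() for row in rows)
    return sorted(value for value, n in counts.items() if value and n > 1)

def has_multiple_isbns(cell):
    """True if a cell appears to hold more than one ISBN-like number."""
    # Loose pattern: a run of 10 to 17 digits/hyphens, optionally ending in X.
    candidates = re.findall(r"[0-9][0-9\-]{8,15}[0-9Xx]", cell)
    return len(candidates) > 1

def rows_missing(rows, field):
    """Return 1-based row numbers whose `field` is blank."""
    return [i for i, row in enumerate(rows, start=1)
            if not (row.get(field) or "").strip()]

if __name__ == "__main__":
    # Placeholder rows for illustration only.
    rows = [
        {"title": "A", "isbn": "9780306406157", "location": "Stacks"},
        {"title": "A", "isbn": "9780306406157; 0306406152", "location": ""},
    ]
    print(find_duplicates(rows, "title"))
    print([has_multiple_isbns(r["isbn"]) for r in rows])
    print(rows_missing(rows, "location"))
```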

The spreadsheets also include sheets that provide statistical tracking. For example, you can see a count of how many books come from each publisher, get a breakdown of the years of publication, etc. These sheets are completely optional and can be deleted if you are not interested in using them.
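The same kinds of tallies can be reproduced outside Excel. The following Python sketch counts records per publisher and groups publication years by decade. The field names are hypothetical, and grouping by decade is my own choice of breakdown rather than something the spreadsheets are stated to do.

```python
import re
from collections import Counter

def count_by(rows, field):
    """Count rows per distinct value of `field`, most frequent first."""
    counts = Counter((row.get(field) or "").strip() or "(blank)" for row in rows)
    return counts.most_common()

def count_by_decade(rows, field="year"):
    """Group publication years into decades, e.g. 1994 -> '1990s'."""
    decades = Counter()
    for row in rows:
        # Catalog dates often carry punctuation such as "[1994]" or "2003.",
        # so take the first four-digit run rather than parsing the whole cell.
        match = re.search(r"[0-9]{4}", row.get(field) or "")
        label = f"{int(match.group()) // 10 * 10}s" if match else "(unknown)"
        decades[label] += 1
    return sorted(decades.items())
```

Both functions return lists of `(label, count)` pairs, which are easy to write back out as two-column tables.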

CJKmP is programmed in AutoHotkey (AHK). This language was designed with a simple syntax and works “out of the box” without requiring you to set up a custom digital environment. The code is saved in a simple-to-edit format and runs immediately. The premise is that “hot keys” are created to run scripts. A hot key is essentially a keyboard shortcut that runs your AHK script (in most programs, for example, Ctrl + S for save is a hot key!).

Figure 2. Notes on the Hot key functions from the CJKmP GitHub instructions.

At its most basic, AHK lets you write simple scripts that send keystrokes, mimicking what you would otherwise type by hand. It is far more feature-rich than this basic premise suggests: you can create GUIs, manipulate text, use Windows OS processes, and more to build sophisticated programs and automate various parts of your own work.
 

The Takeaway

I had a working prototype of CJKmP as far back as 2019. In the summers of 2018 and 2019, I hired a student assistant to help me process a particularly large donation. In 2018, I hadn’t yet designed a useful automation tool, and my student worker was able to process about 50-60 books. The next year, with a functional prototype, my student worker processed nearly 300 books.

This automated process also saved my colleagues time. I learned that my acquisitions colleagues processed rush orders for Japanese books by looking up the title on Amazon Japan and then comparing the kana/kanji of the title (which they can’t read) to the title I wrote in my spreadsheet. In response, I created a new column that automatically generates a URL to the book’s Amazon Japan record.
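The post does not say how that URL column is built, so the following Python sketch shows only one plausible approach: constructing an Amazon Japan search link from the ISBN, with the title as a fallback. The function name and the choice of a search URL (rather than a direct product link) are my assumptions, and in the spreadsheet itself this would be a formula rather than Python.

```python
import re
from urllib.parse import urlencode

# Assumed endpoint: Amazon's site search page, which takes the query in "k".
AMAZON_JP_SEARCH = "https://www.amazon.co.jp/s"

def amazon_jp_search_url(isbn="", title=""):
    """Build an Amazon Japan search URL from an ISBN, falling back to the title."""
    digits = re.sub(r"[^0-9Xx]", "", isbn)  # drop hyphens and spaces
    query = digits or title.strip()
    if not query:
        return ""  # nothing to search for
    # urlencode percent-encodes kana/kanji as UTF-8, so titles are safe to pass.
    return AMAZON_JP_SEARCH + "?" + urlencode({"k": query})
```

Searching by ISBN means a colleague can click through to the matching record without having to compare characters they cannot read.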

Automatically placing detailed bibliographic data into spreadsheets benefited me in ways beyond simply saving time. Because I no longer spend my time on basic tasks like copying and explaining data, I can focus on higher-quality selections for my collections, faculty, and graduate and undergraduate students. It also freed me to engage with my colleagues in more settings and take on more service opportunities. Because I was more present, I was able to advocate for the unique challenges of building collections in East Asian languages and educate my colleagues about them.

I am excited for CJKmP to be available to the larger information services community. More people using it also means more opportunities for feedback. I look forward to making improvements, enhancing accessibility, and creating more tutorials. If you would like more help using CJKmP, I’d be happy to meet one-on-one (email me) to help with setup and demonstrate its abilities.


North American Coordinating Council on Japanese Library Resources
北米日本研究資料調整協議会
Copyright 2017