Pinyin Romanization: New Developments and Possibilities

Philip Melzer
Regional and Cooperative Cataloging Division
Library of Congress
U.S.A.

melzer@mail.loc.gov


Background

As noted in Suping Lu's recent article, "A study on the Chinese Romanization Standard in Libraries" (1995), the Library of Congress, in 1979 and 1980, recommended that the library community undertake conversion from Wade-Giles to pinyin for the romanization of Chinese. It was anticipated that "more and more people will, in the future, approach Chinese through pinyin romanization..."; "... fewer and fewer library users will have a working knowledge of Wade-Giles." The library community, however, voted to retain Wade-Giles.

In 1990, the Library of Congress again investigated the feasibility of converting from Wade-Giles to pinyin. By that time, all other government agencies, as well as most of the scholarly and international communities, used pinyin to romanize Chinese. Libraries were virtually the only institutions in America that continued to use Wade-Giles.

Several different approaches to conversion were studied (for example, superimposition, split or separate files, and dual data in records). Unfortunately, for every scenario, difficulties and disadvantages predominated. Each approach involved considerable cost and human involvement, and none could guarantee a uniform practice that would be convenient for library users.

In 1990, conversion did not appear to be feasible, economically or technically, in the foreseeable future. So, even though the Library supported an eventual conversion to pinyin, and thought such a change was inevitable, it could not justify embarking on any kind of conversion project at that time. Nevertheless, the Library planned to explore, with RLG and OCLC, possibilities of machine conversion of existing MARC records to pinyin. Also, it would lay groundwork by investigating and seeking agreement on an approach to pinyin word division.

Recent developments

The Library has been monitoring recent developments that seem to indicate that conversion to pinyin might now be possible without many of the complications predicted several years ago. For example, Karl Lo has demonstrated the utility of his front-end conversion program, which allows OCLC users to key in a brief command to view, in pinyin romanization, a record cataloged in Wade-Giles, or to go from pinyin to Wade-Giles.

The Library has also been in touch with the National Library of Australia (NLA), which, as part of constructing a national database, will be converting approximately 800,000 records from Wade-Giles to pinyin over the coming months.

NLA's project is significant for several reasons. First, it makes use of an independent conversion software program that identifies and converts Wade-Giles data in MARC records, and then reassembles the records. The program even identifies Wade-Giles in records that contain a mixture of languages. Parallel databases (containing Wade-Giles and pinyin) are being created. Those files will be maintained indefinitely, to assist other libraries as they also convert to pinyin at their own pace.

Word division follows the guidelines for Wade-Giles: the text is monosyllabic, except that words which were hyphenated in Wade-Giles are automatically connected in pinyin. NLA is not planning to convert authority records at this time, but anticipates that such conversion could be accomplished in much the same way as conversion of other bibliographic data.
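The word-division rule described above can be illustrated with a minimal sketch. This is not NLA's program, whose internals are not described here. The syllable table, `convert_syllable`, and `convert_phrase` are hypothetical names introduced only for illustration, and the table covers just the handful of syllables used in the example.

```python
# Hypothetical, deliberately tiny syllable table. A real converter would
# need the full Wade-Giles syllabary; unknown syllables raise KeyError here.
WG_TO_PINYIN = {
    "chung": "zhong",
    "kuo": "guo",
    "pei": "bei",
    "ching": "jing",
    "t'u": "tu",
    "shu": "shu",
    "kuan": "guan",
}


def convert_syllable(syllable: str) -> str:
    """Convert one Wade-Giles syllable, preserving an initial capital."""
    pinyin = WG_TO_PINYIN[syllable.lower()]
    return pinyin.capitalize() if syllable[0].isupper() else pinyin


def convert_phrase(phrase: str) -> str:
    """Convert a space-separated Wade-Giles phrase to pinyin.

    Syllables stay separate, except that syllables hyphenated in
    Wade-Giles are connected into a single pinyin word.
    """
    words = []
    for word in phrase.split():
        syllables = [convert_syllable(s) for s in word.split("-")]
        # Connect the hyphenated syllables; only the first keeps its capital.
        words.append(syllables[0] + "".join(s.lower() for s in syllables[1:]))
    return " ".join(words)


if __name__ == "__main__":
    print(convert_phrase("Chung-kuo t'u shu kuan"))  # Zhongguo tu shu guan
    print(convert_phrase("Pei-ching"))               # Beijing
```

In this sketch the hyphen is the only signal used to connect syllables, which mirrors the rule as stated: no new word-division decisions are made during conversion.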

NLA has identified a number of problems that it will have to deal with to produce error-free records. A certain amount of human review is required, for example, to distinguish between di and de. Nevertheless, early results have been highly encouraging. In a recent test run, 8,000 records were converted without error after being put through the program and human review. Library staff are now reviewing examples of converted records that NLA has generously sent to us.
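One way to picture the human-review step is a pass that flags syllables with more than one possible pinyin form instead of converting them automatically. The sketch below is an assumption-laden illustration, not a description of NLA's procedure: it assumes the di/de ambiguity arises from the Wade-Giles syllable "ti", and the names `AMBIGUOUS` and `flag_for_review` are hypothetical.

```python
# Assumption: Wade-Giles "ti" may correspond to pinyin "di" or "de",
# so a program cannot choose between them without human judgment.
AMBIGUOUS = {"ti": ("di", "de")}


def flag_for_review(syllables):
    """Return (position, syllable, candidates) for each ambiguous syllable."""
    return [
        (i, s, AMBIGUOUS[s.lower()])
        for i, s in enumerate(syllables)
        if s.lower() in AMBIGUOUS
    ]


if __name__ == "__main__":
    # Each flagged entry would be passed to a reviewer rather than converted.
    print(flag_for_review("wo ti shu".split()))
```

Records with no flagged syllables could pass straight through conversion, which confines human effort to the ambiguous cases.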

The Library of Congress is following NLA's progress with considerable interest. If it succeeds, such a program could make it possible to convert to pinyin without encountering some of the disadvantages that were anticipated in 1990. In the coming months, new options for conversion and cooperative efforts will be discussed within the Library, with the major utilities, and then with the library community.

Preparing for conversion will involve plans and decisions about several related issues. A standard for word division will have to be proposed, discussed, and settled upon. It will be necessary to draft a plan for converting authority records. Identifying Wade-Giles headings, finding Wade-Giles references, and verifying that both have been converted properly are all expected to be major challenges. Agreement will have to be reached on a MARC format to accommodate multiple romanization schemes. And then, of course, an overall plan for conversion will have to be developed and agreed upon.

The Library is interested in your views on this topic. Please convey any comments to the author at: melzer@mail.loc.gov


Bibliography

Lu, Suping. (1995). "A study on the Chinese Romanization Standard in Libraries." Cataloging and Classification Quarterly, 21, 81-97.


Copyright © 1996 Philip Melzer.
Submitted to CLIEJ May 30, 1996.