Joint Statement on Pinyin Conversion by LC, OCLC, RLG, and CEAL

Philip Melzer
Regional and Cooperative Cataloging Division
Library of Congress
U.S.A.

melzer@mail.loc.gov


Representatives of the Research Libraries Group (RLG), the Online Computer Library Center (OCLC), and the Council on East Asian Libraries (CEAL) met on May 29, 1997, with representatives of the Library of Congress's Pinyin Task Group at the Library of Congress to begin planning for the conversion from Wade-Giles romanization of Chinese to pinyin.

Participants included:

The participants engaged in a wide-ranging discussion of many aspects of conversion. Because CEAL's recently formed Pinyin Task Force had not yet had an opportunity to meet, its representative took part primarily as an observer. He said at the outset that East Asian librarians were concerned about the costs that conversion would impose on everyone. CEAL was also uncertain whether conversion was a necessary step, and hoped that a front-end conversion program might suffice. Other participants noted that front-end conversion, while convenient, would not change the content of bibliographic or authority records.
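The distinction drawn here between front-end conversion and conversion of the records themselves can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `front_end_search`, the four-syllable mapping table, and the two sample headings are hypothetical and do not describe any program discussed at the meeting.

```python
# Illustrative subset only; a full table would cover every Chinese syllable.
PINYIN_TO_WG = {
    "zhong": "chung",
    "guo": "kuo",
    "bei": "pei",
    "jing": "ching",
}

# Hypothetical stored headings, still romanized in Wade-Giles.
RECORDS = ["chung kuo shih", "pei ching"]


def front_end_search(query, records):
    """Translate a pinyin query to Wade-Giles, then search unchanged records."""
    # Syllables missing from the table are passed through as typed.
    wg_query = " ".join(PINYIN_TO_WG.get(s, s) for s in query.lower().split())
    # Simple substring match against the stored Wade-Giles headings.
    return [r for r in records if wg_query in r]


hits = front_end_search("zhong guo", RECORDS)
```

The user types pinyin and gets matching records back, but what is retrieved and displayed is still Wade-Giles: the stored headings are never rewritten.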

Most favored declaring an implementation 'Day 1', after which time Wade-Giles should be abandoned in favor of pinyin. It was anticipated that 'Day 1' would not occur until at least 1999 because of participants' prior organizational commitments. Nevertheless, LC was urged to announce, as soon as possible, that it would convert to pinyin, so that others might begin to plan retrospective conversion projects. It was also felt that, as a first step, LC should issue a pinyin standard -- to define pinyin for the library community.

It was agreed that conversion of existing files should occur as close to 'Day 1' as possible, but that 'Day 1' should not be delayed indefinitely because of conversion complexities. LC was urged to decide on its implementation date, as well as its approach to the conversion of its files, after consulting widely and taking others' concerns into account.

LC intends to distribute its database after conversion. It is not known, at this time, whether other libraries will convert at the same time.

Participants discussed at length the advantages and disadvantages of using various datafiles as the basis for conversion of LC records. LC proposed using those LC records already converted by NLA as the starting point for conversion. RLG suggested that its files be utilized so that aggregators might be conveniently retained. It was agreed that specifications for conversion should be made available so that libraries and utilities could adapt and adopt them as necessary.

This, in turn, led to discussion of word division and syllable aggregation. LC favors the separation of individual syllables in pinyin except in personal names and geographic jurisdictions, as done by the National Library of Australia. RLG advocated looking first at the PRC standard for word division [GB3259-92]. Word division is part of the romanization rules for Japanese and Korean, which RLIN users also apply to the parallel CJK fields. RLG explained that it developed "aggregation rules" for Chinese to support word searching both for Chinese materials and across all three CJK languages, in which many terms are in common use. The "aggregator", however, is RLIN-specific and is not communicated on export. RLG would prefer adoption of a GB standard. If word division is not included as part of the Chinese romanization rules, then the RLIN-specific aggregators could still be used to support word searching, as is now done for Wade-Giles; RLG could map the aggregator to a USMARC character such as "joiner" for libraries that wish to retain word division in their local systems, while other libraries could strip the character out. Aggregators in current records could be used as the basis for converting Wade-Giles in records to pinyin that includes word division.
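The last point, using aggregators in existing records to produce pinyin with or without word division, can be sketched in Python as follows. This is only an illustration under stated assumptions: the "+" aggregator mark, the function name `convert`, and the seven-syllable table are hypothetical stand-ins, not RLIN's actual encoding or any agreed conversion specification.

```python
# Illustrative subset only; a real table would cover every Wade-Giles syllable.
WG_TO_PINYIN = {
    "chung": "zhong",
    "kuo": "guo",
    "jen": "ren",
    "min": "min",
    "pei": "bei",
    "ching": "jing",
    "ch'ing": "qing",
}

AGGREGATOR = "+"  # hypothetical stand-in for the RLIN-specific aggregator


def convert(text, keep_division=True, joiner=""):
    """Convert aggregated Wade-Giles text to pinyin.

    keep_division=True joins the syllables of each word with `joiner`;
    keep_division=False strips the aggregation and separates all syllables.
    """
    words = []
    for word in text.lower().split():
        syllables = []
        for syllable in word.split(AGGREGATOR):
            if syllable not in WG_TO_PINYIN:
                # Surface unknown syllables rather than guessing a conversion.
                raise ValueError(f"no pinyin mapping for {syllable!r}")
            syllables.append(WG_TO_PINYIN[syllable])
        separator = joiner if keep_division else " "
        words.append(separator.join(syllables))
    return " ".join(words)


joined = convert("chung+kuo jen+min")
separated = convert("chung+kuo jen+min", keep_division=False)
```

A library that wants to retain word division could pass an explicit joiner character in place of the empty string, while one that follows syllable-by-syllable separation would call the function with `keep_division=False`.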

OCLC was concerned that, because searching a utility was justification for establishing a new heading, unconverted Wade-Giles headings would have to be re-established after 'Day 1'. LC representatives favored converting as many bibliographic and authority records as possible initially -- perhaps prior to or in conjunction with 'Day 1' -- and then converting on an as-needed, as-encountered basis, in a manner similar to AACR2 implementation. It was recognized that, even if that could be accomplished, databases would still be mixed after conversion because they would continue to include some records with Wade-Giles elements, but this approach was preferred over the creation of split files.

LC representatives said they hoped to convert as much of the name authority file as possible. Several procedural issues arose: for example, would conversion include NARs contributed via OCLC or RLIN? It was suggested that 670 fields not be converted, but concern was expressed over the resulting lack of agreement between that field and the corresponding headings and references. Further, it was agreed that records with Chinese 1xx headings that were not created according to systematic Wade-Giles romanization (i.e., headings in a well-established English-language form), but that do include systematic Wade-Giles 4xx references, would not be systematically converted.

It was agreed that name authorities should bear some indication that they have been converted to pinyin. LC will be exploring the possibility of adding a "flag" to identify language, script and romanization scheme at the field level in a forthcoming proposal to MARBI. Although it could be too costly to add such an identifier to past records, it could be added during the conversion process, and to new records after implementation.

Subject headings containing Wade-Giles elements were not seen as a major problem because they are not numerous.

Conversion of files and a switch to pinyin will affect classification, especially for works of literature by personal authors, geographical locations, and second cutters. However, individual libraries make their own policies about cuttering. Therefore, LC was urged to set and announce its policy, so that others could take whatever steps they felt were appropriate.

RLG offered to establish a listserv to further effective communications between the parties as planning progresses. It was agreed that the parties would meet again on the issue of pinyin conversion when it is found to be mutually desirable.


Originally posted on EASTLIB, this joint statement is published here with permission from the original sender.