The Development of China Digital Library and Its Influence on China and the World

 

Sun Wei

Chief Engineer

National Library of China

Beijing, China

sunw@nlc.gov.cn

 

ABSTRACT: This paper discusses how the implementation of the digital library in China affects the country and the world.

 

The concept of the digital library started out as the “electronic library”. As the Internet became the dominant trend of the 1990s, many research institutions (mainly in the USA) recognized that information on the Internet was expanding massively while remaining disordered and inefficient to use. The concept of the digital library was put forward in the USA to address this problem.

 

The National Library of China is involved in most of China’s national research programs on digital libraries. A digital library network has come into being in China in recent years. The establishment of the national digital library signified the formation of China’s digital library system. The Chinese National Science Digital Library and the National Educational Digital Library have been completed. The National Digital Library of China, the Communist Party Digital Library and the National Defense Industry Digital Library are under construction.

 

Despite the problems in the development of digital libraries in China, we are glad to observe that the models of the digital library in China have exerted their influence upon the development of digital libraries throughout the world.

 

The digital library will have a great impact on education, research and the development of rural areas. A digital library is a networked information system. Standardization is key: without it, a digital library is a mere information island and is difficult to access. Research on digital library standards and legislation also promotes research on digitization standards in China.

 

I: Introduction

 

Since 1995 when China started observing and conducting research on digital library, we have experienced three stages:

 

  1. information gathering on digital libraries (1995-1996);
  2. research and experimentation on digital libraries (1997-2000); and
  3. preparation for the construction of digital libraries (2000-2002).

 

After former President Jiang Zemin visited the National Library of China, investment from the central and local governments increased.

 

The implementation of China’s digital library is one of the most important parts of the Internet in China. It will enrich Chinese online resources.

 

This article summarizes the influence of the development of China’s National Digital Library on China and the world.

 

II: History and Status of China Digital Library

 

The concept of the digital library appeared in the late 1980s. At that time, it was called the “electronic library”. As the Internet became a global trend in the 1990s, many research institutions (mainly in the USA) recognized that information on the Internet was expanding massively while remaining disordered and inefficient to use. To solve this problem, the concept of the “Digital Library” was put forward in the USA.

 

In 1995, China began to collect information on digital libraries. In 1997, the National Library of China (NLC) began the first national-level research project, China’s Pilot Digital Library, funded by the State Development Planning Commission (SDPC). Since then, NLC has been involved in many of China’s national research programs on digital libraries, including:

 

·        1998       Project no. 863-306 “Knowledge Grid - Digital Library System”

·        1999       “Zhongguancun Science Park – Digital Library Cluster” Project

·        2000       Project no. 863-300 “China Digital Library Application System”

·        2002       “China Digital Library Standard” Project

 

Over the past six years, a network of digital libraries has been completed in China. It includes digital content providers (Tsinghua Tongfang清华同方, SSReader北京时代超星, Wanfang Data万方数据, and VIP Information重庆维普咨讯); electronic publication agents (Founder方正科技, and ShuSheng北京书生数字技术); a digital content producer and system integrator; digitization system researchers and integrators (Digital Innovation Technology北京中数创新技术, China soft北京中科软件, Founder Technology方正科技, Tsinghua Tongfang, and ShuSheng); system integrators for search engines (TRS北京拓尔思信息技术, and Tell You Information Technology浙江天宇信息技术); and system integrators (Lenovo联想, Taiji太极计算机, and Huadi华迪计算机). In this network, a digital library is both a user and a provider of resources. Since the network was completed, the library’s digital content has come from several sources, including:

 

·        Specific digital content of the library

·        Shared digital content with other institutions

·        Digital content purchased from Internet content providers (ICPs)

 

The architecture of China’s digital library system is composed of several national-level professional digital libraries. Of these, the China National Science Digital Library (CNSDL) and the China Academic Library and Information System (CALIS) have been completed; the China National Digital Library (CNDL), the Communist Party Digital Library (CPDL), and the National Defense Digital Library (NDDL) are under construction; and digital libraries for the medical field, agriculture and irrigation works will be built.

 

The concept of the digital library, both in theory and in practice, has been accepted by information collection institutions such as libraries. Many institutions have tried to construct digital libraries, especially in the area of digital content production. For example, the National Library of China experimented with the digitization of its traditional collections. In the digitization of multi-media content, China Central Television (CCTV) and China National Radio (CNR) created video and audio standards for collecting, cataloguing, searching and service. The State Archives Administration of China (SAAC) and the Palace Museum (PM) are developing standards for the industrial digitization of manuscripts and cultural relics. The State Intellectual Property Office of China (SIPOC) has also implemented a standard for digital patents. All of these implementations have improved the production and application of digital content. CNSDL attempted to construct a service-based digital library. CALIS is especially advanced in standardization and management.

 

With the knowledge of the development of digital library in China, we are now able to study how our development has influenced China and the world.


III: Contributions to China’s Internet

 

When I visited Singapore in 1998 as the technical adviser to Mr. Sun Chengjian (former vice director of NLC), South China Morning Post was the most popular online Chinese resource. Beijing Telecom hoped to cooperate with NLC to fill its bandwidth with digital content, thus strengthening the presence of Chinese information on the Internet.

 

In 1998, Chinese information accounted for only 1% of the Internet, and few people searched for Chinese information online. Because of the imbalance between China’s information input and output, the expenses of the international LAN in China were much higher than those of other countries.

 

By the end of 2003, things had changed. Chinese information had become the second largest body of information on the Internet, and its usage rate ranked fourth. The number of Internet users reached 87 million, and 36 million computers were connected to the Internet. The bandwidth of international connection reached 54G. The gap between input and output was less than 3%. According to the latest statistics of the China Internet Network Information Center (CNNIC), a total of 10.2% of Internet use in China is for educational purposes, of which 9.1% is for learning and 1.1% for research. Of all visitors, 42.3% log on to the Internet to search for information; 34.5% for entertainment; 62.1% for news; 5.8% for online learning; and 3.5% for electronic journals. In a recent survey conducted by CNNIC, those who had never heard of the digital library accounted for 2.5%; those who had heard of it but did not quite understand it, 19.5%; those who had some knowledge of it, 52.4%; and those who knew it very well, 25.6%. The “digital library” has become one of the most popular buzzwords in the Statistical Survey of CNNIC.

 

CNNIC reported in July, 2002 that “the fifth trend is: the Internet has grown so rapidly from lack of resources to massive information, forecasting the arrival of a Content-Centered Age. China’s Digital Library Project has enhanced the growth of massive information on the Internet.”

 

The following are some related statistics provided by the State Council’s Information Office of China (SCIOC):

 

Amount of web pages
    Total                            311,864,590
    Static web pages                 226,725,557
    Dynamic web pages                 85,139,033
    Proportion (static : dynamic)    2.66:1
    Pages per website                523.7

Bytes of web pages
    Total                            6,059,431,526KB
    Bytes per page                   19.43KB
    Average bytes per website        10,174.51KB
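The figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative, and reading the last figure as an average per website is an assumption inferred from the numbers rather than something the source states.

```python
# Figures copied from the SCIOC statistics quoted above.
total_pages = 311_864_590
static_pages = 226_725_557
dynamic_pages = 85_139_033
total_kb = 6_059_431_526
pages_per_site = 523.7

# Static and dynamic pages should add up to the reported total.
pages_consistent = (static_pages + dynamic_pages == total_pages)

# Static-to-dynamic proportion (reported as 2.66:1).
static_to_dynamic = round(static_pages / dynamic_pages, 2)

# Average page size in KB (reported as 19.43KB).
kb_per_page = round(total_kb / total_pages, 2)

# Assumption: the last reported figure is the average KB per website,
# i.e. pages per website multiplied by KB per page.
kb_per_site = pages_per_site * kb_per_page

print(pages_consistent, static_to_dynamic, kb_per_page, round(kb_per_site))
```

Under that assumption, the computed per-website average falls within roughly 1KB of the reported 10,174.51KB, which suggests the reported figures are internally consistent.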

 

According to statistics supplied by large-scale content providers in China, such as Sursen and Founder, 1.1 million books have been digitized, accounting for 40-60% of all publications from 1949 to 2000. In total, 450 million image documents have been produced, totaling 9.9TB, of which 1.1TB are full-text documents.

 

According to statistics from journal digitization companies, such as Tsinghua Tongfang (www.thtf.com.cn) and Chongqing Weipu (www.cqvip.com), 12,000 journals (65-70% of existing journals) have been digitized. Total service storage has reached 3.6TB and is growing by 450GB per year. There are also 600 electronic newspapers and news websites.

 

These statistics show what digital libraries in China have contributed to Chinese content on the Internet. Traditional books and journals occupy a large place on China’s Internet. Many Chinese in the US access Chinese books and journals online. Personally, I believe that, in spite of some legal issues and technical problems that remain to be resolved, the development of China’s digital libraries has contributed significantly to the robust growth of China’s Internet.

 

At the end of 1999 and the beginning of 2000, delegations from the University of California-Berkeley, Ohio University, and Singapore National Library each came to Beijing to see NLC’s digital library project and the information system of the Palace Museum. They spoke highly of both. Their technical researchers observed: “China developed its digital library with existing technology and for the purpose of practical use.” Their librarians commented: “It is amazing that China has produced such large-scale digital content in a few years.”

 

IV: Influence on International Digital Library Development

 

Despite some problems in the development of digital libraries in China, I am glad to note that China’s digital library model has influenced the implementation of digital libraries internationally.

 

A. Digital library model applied to and approved by the world

 

In cooperation with companies and research institutions in China, and taking into account the practical environment of China’s networks, NLC was the first to advocate building the digital library on the principles of metadata and object data, following the pattern of “book title” to “table of contents” to “page image”. NLC is also one of the first libraries to have planned and begun the industrial-scale digitization of traditional content.

The chart shows the workflow of industrial production. NLC linked digital objects with MARC as the basic metadata and processed the image files. This ensures that users can find and browse books on the Internet through the catalogue, the bibliography and the Chinese library classification. The following is the original model of the resource-preservation-oriented digital library.

This model has been applied not only in China but also in recent international digital library projects, such as the China-US Million Book Digital Library Project, the US-India Million Book Digital Library Project, and the US-Egypt Million Book Digital Library Project. The main advantages of adopting this model are:

 

1.      MARC and classification are well-established schemes in the library profession. Using MARC as the metadata for searching is easy to implement. Tables of contents, abstracts and hyperlinks are coded systematically in MARC. Therefore, this model is suited to a standard digital library.

2.      It is easy and inexpensive to use industrial digitization for traditional publications.

3.      The single-page model is an effective way to transfer data on the Internet and meet readers’ needs, and it is much more effective than the whole-book model. With compression and extraction technology, the average size of a single page is 22KB, much smaller than a PDF file.

4.      Using algorithms and rules to resolve the relationships between records and pages saves time compared with hypertext processing.
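The “book title” to “table of contents” to “page image” pattern, together with the rule-based linking in item 4, can be sketched as a small data model. This is a minimal illustration in Python, not NLC’s actual implementation: the class names, the `record_id` field, the `front_matter` offset and the file-naming rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TocEntry:
    heading: str      # chapter or section title from the table of contents
    start_page: int   # printed page number where the section begins


@dataclass
class Book:
    record_id: str             # hypothetical identifier of the catalogue record
    title: str
    page_count: int            # number of printed (numbered) pages
    front_matter: int = 0      # scanned pages that precede printed page 1
    toc: List[TocEntry] = field(default_factory=list)

    def image_path(self, printed_page: int) -> str:
        """Rule-based link: derive the image file name from the page number."""
        if not 1 <= printed_page <= self.page_count:
            raise ValueError(f"page {printed_page} is out of range")
        scan_index = printed_page + self.front_matter
        # Hypothetical naming rule: one image file per scanned page.
        return f"{self.record_id}/{scan_index:06d}.img"

    def open_section(self, heading: str) -> str:
        """Follow title -> table of contents -> page image for one section."""
        for entry in self.toc:
            if entry.heading == heading:
                return self.image_path(entry.start_page)
        raise KeyError(heading)


book = Book(
    record_id="demo0001",
    title="Sample Title",
    page_count=350,
    front_matter=8,
    toc=[TocEntry("Chapter 1", 1), TocEntry("Chapter 2", 37)],
)
print(book.open_section("Chapter 2"))
```

Because the page image is located by a rule rather than by a stored link, no per-page hyperlink has to be authored, and only the requested page needs to be sent to the reader.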

 

B. US Research on China’s Digital Library Technology

 

1. Single page technology

 

Whole-book processing technology, developed for handbooks to be used with computers, is the prevalent technology in the US. In the Internet age, this technology is also applied to digital content processing. Consequently, the digital file of a whole book is often too big to be transferred or used online.

 

Comparison of the size of a single-page document in different formats:

 

Format / size      Format / size      Size ratio
Txt / 2K           PDF / 7KB          1 : 3.5
Txt / 2K           DOC / 25KB         1 : 12.5
Tiff / 840K        PDF / 161KB        1 : 0.192

 

The above table shows that the image-based PDF file of a scanned page (161KB) is about 80 times larger than the text file (Txt), 23 times larger than the text-based PDF file, and 6.4 times larger than the MS Word file. If the whole book is 350 pages, it would take a long time to transfer in PDF format. The single-page technology, invented in China, reduces the scanned page to 22KB. Moreover, the single-page link technology makes the page in the bibliography correspond correctly to the image page. Recently, NetLibrary (www.netlibrary.com) in the US has adopted the single-page technology.
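The cost of sending a whole book instead of a single page can be estimated with simple arithmetic. The sketch below is in Python and uses the 350-page book, the 161KB image-based PDF page and the 22KB single page from the text; the link speed is a hypothetical value chosen only for illustration, and 1KB is treated as 8 kilobits for simplicity.

```python
KB_PER_MB = 1024


def book_size_mb(pages: int, kb_per_page: float) -> float:
    """Total size in MB of a book delivered as one file."""
    return pages * kb_per_page / KB_PER_MB


def transfer_seconds(size_kb: float, link_kbit_per_s: float) -> float:
    """Approximate transfer time, treating 1KB as 8 kilobits."""
    return size_kb * 8 / link_kbit_per_s


PAGES = 350             # book length used in the text
IMAGE_PDF_KB = 161      # image-based PDF page, from the comparison table
SINGLE_PAGE_KB = 22     # compressed single page, from the text
LINK_KBIT_PER_S = 512   # hypothetical link speed, not from the source

whole_book_mb = book_size_mb(PAGES, IMAGE_PDF_KB)
whole_book_s = transfer_seconds(PAGES * IMAGE_PDF_KB, LINK_KBIT_PER_S)
one_page_s = transfer_seconds(SINGLE_PAGE_KB, LINK_KBIT_PER_S)

print(f"whole book: {whole_book_mb:.1f}MB, about {whole_book_s / 60:.0f} minutes")
print(f"single page: {SINGLE_PAGE_KB}KB, about {one_page_s:.2f} seconds")
```

Under these assumptions the whole book is roughly 55MB, while a reader who only wants one page waits for 22KB; the exact times depend entirely on the assumed link speed.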

 

2. Digital resources service model

 

In the US, the digital resources service model is quite simple: it is either “pay for all” or “free for all”, and it cannot provide a customized service. The model in China is diversified: customers can pay for all or part of the content, which allows content producers to provide customized services. NetLibrary has also adopted this model. China’s Digital Library has set up a mirror site at San Diego State University. The Library of Congress is interested in the system and service model of China’s Digital Library and is conducting tests on the system. The East Asia Library of Stanford University has attempted to mirror parts of China’s digital library system.

 

V: Influence on Resource Research, Education, and Application

 

A. Protection of Historical Heritage

Before the cultural relic management system was introduced at the Palace Museum (www.dpm.org.cn), research on calligraphy and paintings depended on manual work with magnifiers or microfilms. Because of Beijing’s dry climate and dust, it is very hard to guarantee that microfilm is not scratched. For example, the famous hand scroll “A City of Cathay” (清明上河图) was filmed 9 times in a year. Now that it has been digitized with Kodak technology, researchers can study the painting on a 21-inch color screen. Details of the painting can be enlarged on the screen, and the warps and wefts of the silk fabric can be identified to distinguish original color from dust. Thanks to the high definition of the digital resources, researchers can make an appointment to study the digital painting in a designated place. Digital technology also supports the long-term preservation of the painting. This is a typical case of using digital resources to serve research needs.

 

“Si Ku Quan Shu” (四库全书), or the “Complete Library in Four Branches of Literature”, consists of 3,503 works, 79,337 volumes, 36,304 books, and 997,000,000 words. Before the full text was digitized, researchers depended on manual work and note-taking. Now researchers can use many more convenient research methods. Linked searching across simplified and traditional Chinese can identify differences among various copies. Common words and phrases can be found with a search engine to analyze regional differences in wording across books. Researchers can also extract parts of works automatically.
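Linked searching across simplified and traditional Chinese can be approximated by normalizing both the query and the text to one script before matching. The following Python sketch only illustrates the idea and is not the method used by the Si Ku Quan Shu system: the four-character mapping table and the sample passages are hypothetical, and a real system would need a complete conversion table and handling for characters with more than one counterpart.

```python
from typing import Dict, List

# Tiny illustrative mapping from traditional to simplified characters.
# A production system would need a complete table (assumption: one-to-one).
TRADITIONAL_TO_SIMPLIFIED: Dict[str, str] = {
    "書": "书",
    "庫": "库",
    "國": "国",
    "經": "经",
}


def normalize(text: str) -> str:
    """Map every known traditional character to its simplified form."""
    return "".join(TRADITIONAL_TO_SIMPLIFIED.get(ch, ch) for ch in text)


def linked_search(query: str, passages: Dict[str, str]) -> List[str]:
    """Return the ids of passages that match the query in either script."""
    needle = normalize(query)
    return [pid for pid, text in passages.items() if needle in normalize(text)]


# Hypothetical sample passages keyed by a copy identifier.
passages = {
    "copy_a": "四庫全書",   # traditional script
    "copy_b": "四库全书",   # simplified script
    "copy_c": "永樂大典",   # unrelated title
}

hits = linked_search("四库", passages)
print(hits)
```

A simplified-script query then finds the matching passage in both the traditional and the simplified copy, which is what allows the same search to be run across different copies of a text.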

 

The Digital Chorography Project of NLC has the functions of image and full-text comparison, full-text search, and knowledge search. The project aims to establish standards, enlarge the distribution of rare books, link related fields of knowledge and serve the world.

 

The application of digital libraries will resolve the conflict between preservation and research. Using digital library technology for full-text searching and mining will enhance research on cultural heritage.

 

B. Influence on Education

 

The establishment of CALIS has supported digital reference resources in Chinese universities. In July 2004, Mrs. Li Xiaoming, an official of the Ministry of Education, attended the Digital Library System Construction Conference. Listed here are the digital journals (including electronic journals) published mainly in universities:

 

 

University                            1996        2003
Peking University                    1,098      20,309
Tsinghua University                    611      19,318
Xi’an Jiaotong University              290      19,426
Shanghai Jiaotong University           246      18,720
Dalian University of Technology        109       8,165
Suzhou University                      107      10,883

 

Thanks to the 211 Project, most universities now use networked digital resources as their main reference resources.

 

Within a few years, CNSDL became a digital library that provides resources and services to the research institutes of the Chinese Academy of Sciences (CAS) throughout the country.

 

The digital library experimental system at NLC has analyzed a large quantity of data and found that 86% of people search for popular science resources and less than 10% for literary works. The result shows that digital library services can provide resources more effectively, serving the education and research needs of the public.

 

China is set to extend digital library resources and technology to rural areas. The rural distance learning system of China Education Television (CETV) distributes resources through networks and satellites. The NDCNC (全国文化信息共享工程) system distributes entertainment and knowledge.

 

C. Application of Digital Library in Other Areas

 

Digital library technology is not only applicable in libraries but also in other fields.

 

  1. Digital library technology is applied in museums to show cultural relics and buildings. See the Palace Museum at: www.dpm.org.cn
  2. Digital library technology is applied in archives. The Qingdao Digital Archive, the first in China, has passed inspection by the State Archives Administration of China and is open to the public. A digital archive consists of a digital, networked system for collection, management and application. The Qingdao Digital Archive has 4 comprehensive software and hardware platforms managing 14 sub-systems. Its four archive information databases contain 5.5 million pieces of information in the catalog database, 0.7 million pages in the full-text database, 16,000 photos and 2,000 multi-media discs.
  3. The patent system is also a classified digital library system, which includes cataloging, digitization, management and search of patent files. See: http://www.sipo.gov.cn/sipo/zljs/default.htm

 

D. Standards and Regulations Facilitate the Development of China’s Information Standardization

 

Early digital library research focused on multi-media technology. After 2000, the Chinese library community placed its emphasis on research into digital library standards, as librarianship is the most standardized profession. A digital library is a networked information service system. Without standardization, a digital library will become a mere information island. Since 2002, when China focused its research on digital library standards and regulations, much progress has been made in resource regulation, interoperability and data exchange standards. In a way, research on digital library standards in China has promoted China’s research on information standardization in general. See various standards at the following websites:

 

 

To sum up, the development of digital library in China has exerted remarkable influence on China and the world. China’s 5,000 years of cultural heritage will be better inherited and promoted in the digital library age.


Originally presented in Chinese at the 3rd China-US Library Conference at Shanghai Library on 23 March 2005 (http://www.nlc.gov.cn/culc/en/index.htm).
Translated into English by Qi Xin and submitted to CLIEJ on 30 May 2005.
Copyright © 2005 Sun Wei & Qi Xin

Sun, Wei. (2005). "The Development of China Digital Library and Its Influence on China and the World," Chinese Librarianship: an International Electronic Journal, no.20 (December 1, 2005). URL: http://www.iclc.us/cliej/cl20sun.htm
Article Editor: Yong'an Wu