Lewis Lancaster, Electronic Cultural Atlas Initiative, UC Berkeley
January 24, 2001
A small grant is requested to enable the design, development, and testing
of standards for entries in gazetteers and also for characterizing gazetteers
themselves at the level of complexity needed to provide effective support
for humanities and historical computing.
Digital library research has become concerned with a steadily increasing
range of genres and materials and, more challengingly, with the use
of diverse digital genres in conjunction with each other. Researchers
associated with the Electronic Cultural Atlas Initiative are investigating
means of combining textual information with geospatial data, enabling
cultural, historical, and social data to be represented in time and
space through Geographical Information Systems (GIS). Linking the mention
of place names to maps involves three different genres: toponym-rich
texts, GIS maps, and, mediating between the two, gazetteers,
structured records about locations and their names.
Making gazetteers available to the users and contributors of digital
library resources, permitting indirect referencing and GIS mapping of
places, is key to communicability among digital resources with geospatial
information. Gazetteer resources can also integrate scientific and demographic
data about places with the global, multilingual records of human culture (art,
literature, biography, history, and other fields) that are rapidly being
digitized.
Important gazetteers are being developed by a number of digital humanities
projects worldwide. The rise of a networked environment makes it possible
to draw on multiple, network-accessible gazetteer servers. Unfortunately,
the effective use of gazetteers in historical and humanities computing
is impeded by the lack of standards both for the records about places
within gazetteers and for the records describing the gazetteers themselves.
The emerging standards for conventional gazetteer entries, based largely
upon contemporary North American gazetteers that focus on environmental
science, are inadequate for humanities computing. New work is required
to extend these standards to accommodate:
- multiple toponyms in multiple scripts that refer to the same geospatial
location
- the instability of toponyms over time
- changing boundaries, locations, and spatial footprints of places
(as towns become cities or rivers burst their banks)
In addition, the range of geographical entity types currently used in
gazetteer place name type thesauri (bridge, tumulus, church) is simply
not detailed enough to accommodate the range of place name types found
in global, historical texts about human culture.
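As an illustration of the three extensions listed above, the following minimal sketch models a gazetteer entry whose names and footprints each carry their own period of validity. It is not the ADL Gazetteer Content Standard or any schema proposed here: the class names, field names, and helper functions are hypothetical, and the bounding boxes are illustrative values rather than surveyed ones.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Toponym:
    name: str
    language: str                      # language tag, e.g. "zh"
    script: str                        # script code, e.g. "Hani" or "Latn"
    valid_from: Optional[int] = None   # first year of use; None = open-ended
    valid_to: Optional[int] = None     # last year of use; None = open-ended


@dataclass
class Footprint:
    bbox: Tuple[float, float, float, float]  # (west, south, east, north)
    valid_from: Optional[int] = None
    valid_to: Optional[int] = None


@dataclass
class GazetteerEntry:
    entry_id: str
    feature_type: str
    names: List[Toponym] = field(default_factory=list)
    footprints: List[Footprint] = field(default_factory=list)


def in_range(year, start, end):
    """True if year falls within the inclusive, possibly open-ended range."""
    return (start is None or start <= year) and (end is None or year <= end)


def names_in_year(entry, year):
    """All names (in any script) recorded as valid for the given year."""
    return [n.name for n in entry.names
            if in_range(year, n.valid_from, n.valid_to)]


def footprint_in_year(entry, year):
    """The first footprint valid for the given year, or None."""
    for fp in entry.footprints:
        if in_range(year, fp.valid_from, fp.valid_to):
            return fp
    return None


# Hypothetical sample entry: one place, two names over time, two scripts each.
entry = GazetteerEntry(
    entry_id="example-1",
    feature_type="city",
    names=[
        Toponym("北平", "zh", "Hani", 1928, 1949),
        Toponym("Beiping", "zh", "Latn", 1928, 1949),
        Toponym("北京", "zh", "Hani", 1949, None),
        Toponym("Beijing", "zh", "Latn", 1949, None),
    ],
    footprints=[  # illustrative bounding boxes only
        Footprint((116.2, 39.8, 116.5, 40.0), None, 1949),
        Footprint((115.4, 39.4, 117.5, 41.1), 1950, None),
    ],
)

print(names_in_year(entry, 1935))
print(names_in_year(entry, 2000))
```

Dating names and footprints separately lets a query for a given year return the name forms and the spatial extent that applied at that time, which a single name plus a single coordinate pair cannot express.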
A small grant will support three necessary tasks for enabling interoperability
between gazetteers, texts, and maps in a distributed environment:
- Standards validation, enhancement, and development. Gazetteer content
standards need to be tested on real global, historical and cultural
data, and enhanced as necessary to support the international exchange
of gazetteer data. Gazetteer metadata standards and protocols must
be developed to allow interoperability among gazetteers and toponym-rich
texts in diverse languages and formats.
- Infrastructure design and testing. We will create a multilingual
union gazetteer prototype by importing XML records from several gazetteers
in Chinese and English along with qualitative textual information
about those places. These records will be used to establish a Unicode-enabled
database to link and enhance gazetteer and text records.
- GIS visualization. The creation of metadata for the union gazetteer
database will enable the data in the enhanced records to be viewed
using the time and space visualization tools of the Electronic Cultural
Atlas Initiative (ECAI) and to be linked to other globally distributed
records about the same places.
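The import step in the second task might look roughly like the sketch below, which loads XML records from two sources into a single Unicode-capable table and then retrieves every name form attached to records that share a queried name. The XML element and attribute names, the table layout, and the `load` and `linked_names` helpers are all assumptions made for illustration; they do not reflect the export format of any participating gazetteer. Matching on exact name strings is deliberately naive and would wrongly merge distinct places that share a name.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML exports from two gazetteers (schema invented for this sketch).
SOURCE_A = """<gazetteer source="A">
  <entry id="a1"><name lang="zh">北京</name><name lang="en">Beijing</name></entry>
  <entry id="a2"><name lang="zh">上海</name></entry>
</gazetteer>"""

SOURCE_B = """<gazetteer source="B">
  <entry id="b7"><name lang="en">Beijing</name><name lang="en">Peking</name></entry>
</gazetteer>"""


def load(conn, xml_text):
    """Insert one row per (source, entry, name) found in the XML text."""
    root = ET.fromstring(xml_text)
    source = root.get("source")
    for entry in root.iter("entry"):
        for name in entry.iter("name"):
            conn.execute(
                "INSERT INTO names VALUES (?, ?, ?, ?)",
                (source, entry.get("id"), name.get("lang"), name.text),
            )


def linked_names(conn, query):
    """Names carried by any record, in any source, that contains the query name."""
    rows = conn.execute(
        "SELECT DISTINCT n2.name FROM names AS n1 "
        "JOIN names AS n2 "
        "ON n1.source = n2.source AND n1.entry_id = n2.entry_id "
        "WHERE n1.name = ?",
        (query,),
    )
    return {row[0] for row in rows}


# SQLite stores text as Unicode, so Chinese and English names share one table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names (source TEXT, entry_id TEXT, lang TEXT, name TEXT)"
)
load(conn, SOURCE_A)
load(conn, SOURCE_B)

print(linked_names(conn, "Beijing"))
```

A production union gazetteer would need to link records on more than name equality (for example feature type, footprint overlap, and period of validity), which is part of what the proposed standards work is meant to make possible.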
Extending Existing Research
Several organizations working on related projects have already agreed
to participate in meetings and to collaborate in the exchange of digital
materials in a testbed environment. The research, based at UC Berkeley,
will be carried out in consultation with a global community of scholars.
- The Alexandria Digital Library (ADL) (www.alexandria.ucsb.edu).
The ADL Gazetteer currently contains more than 4.2 million entries
with worldwide coverage. The contents are based primarily on data
from two U.S. federal government gazetteers, which emphasize named
features that appear on topographic maps rather than historical and
cultural materials. ADL developed a Gazetteer Content Standard and
a Feature Type Thesaurus to support its gazetteer development and
to encourage the growth of standards-based gazetteers and interoperability
among distributed gazetteer services. The standards developed under
this grant will enhance future developments by ADL.
- The Electronic Cultural Atlas Initiative (ECAI) (www.ias.berkeley.edu/ecai).
ECAI is developing a globally distributed temporo-spatial library
of cultural and historical resources with a centralized metadata catalogue
and a GIS viewer. The development of gazetteer reference systems will
enable ECAI users and project developers to conduct queries across
alternative or fluid toponyms and to contribute new data to gazetteers.
It will also free humanities scholars with little or no training in
geography from the need to determine the geospatial location of the
places they study.
- Academia Sinica Computing Center (http://www.ascc.net/center/index_e.html).
Academia Sinica is providing global access to a corpus of 2,500 years
of Chinese historical writing through their Scripta Sinica project.
The corpus currently amounts to 300 million Chinese characters. They
have recently embarked on the development of a historical and contemporary
gazetteer of China containing over 70,000 historical names and an
additional 5,000 contemporary names. The research carried out under
this grant will contribute to their goal of linking these two projects.
Subsequent Research
Gazetteer standards enhancement, gazetteer metadata standards development,
and prototype linkage of diverse gazetteers and text records in a multilingual
environment can be accomplished in one year with a grant of $100,000.
These are valuable achievements on their own. However, these activities
will realize their maximum potential through their application to subsequent
projects. This small grant enables us to accomplish the necessary preconditions
for developments of the following kinds:
- Linking texts, gazetteers, and maps in a distributed environment.
Moving beyond the database implementation described above, subsequent
developments in multilingual gazetteer metadata will link the information
about places found in distributed resources, even if those resources
are not structured according to a uniform gazetteer record standard.
- Creating topical indexes about places. Having prototyped the capacity
to associate toponyms with the texts in which they are mentioned,
we intend to link gazetteer research to ongoing developments in the
automated creation of topical indexes. This will make it possible
to conduct queries about places by subject matter and to create second-level
thematic gazetteers on the basis of texts that name places in conjunction
with people, events, or any other topic.
- Geographical access to library catalogues. Geographical access to
bibliographies and online library catalogues is currently only weakly
supported, even though geographical information is provided in several
parts of the internationally accepted MARC format. Ordinarily, searching
is supported only for title words (20X-24X) and subject heading words
(650 & 651). Gazetteer research will enable sophisticated geographical
access using these clues and visualization in GIS (e.g., "Map
the geographical spread through time of publishing on this topic"
or "Find works on folklore within the area in which this language
is spoken"). Such refinements can now be designed, but they will
become feasible only if gazetteers can be brought to the requisite
degree of completeness and standardization.
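To make the catalogue example concrete, the sketch below resolves place names in bibliographic records through a gazetteer and keeps the works whose footprints intersect a query area. It assumes a drastically simplified record structure (a dictionary keyed by MARC tag, with 245 as the title field and 651 as the geographic subject heading) rather than real MARC parsing; the records, place names, coordinates, and function names are all hypothetical.

```python
# Bounding boxes are (west, south, east, north) in degrees.

def intersects(a, b):
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def works_in_area(records, gazetteer, area):
    """Titles of records with a 651 place name whose footprint meets the area."""
    titles = []
    for record in records:
        for place in record.get("651", []):
            bbox = gazetteer.get(place)  # None if the gazetteer lacks the name
            if bbox is not None and intersects(bbox, area):
                titles.extend(record.get("245", []))
                break  # one matching place is enough for this record
    return titles


# Hypothetical catalogue records, reduced to tag -> list of field values.
records = [
    {"245": ["Folklore of Northport"], "651": ["Northport"]},
    {"245": ["A history of Southvale"], "651": ["Southvale"]},
    {"245": ["Untitled pamphlet"], "651": ["Unknown Place"]},
]

# Hypothetical gazetteer: place name -> footprint.
gazetteer = {
    "Northport": (10.0, 50.0, 11.0, 51.0),
    "Southvale": (10.0, 40.0, 11.0, 41.0),
}

# Hypothetical query area, e.g. the region in which a language is spoken.
area = (9.5, 49.5, 10.5, 50.5)

print(works_in_area(records, gazetteer, area))
```

The third record shows why gazetteer completeness matters: a work whose place name is absent from the gazetteer cannot be found by any spatial query, however well the catalogue record itself is formed.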