Insight Storage and Retrieval in a Computer-supported Environment
Prepared in relation to the UNEP-HEMIS design proposal on environment/development
1.0 Background to comments
The comments in this note derive from consideration of several long-term concerns
explored in connection with programs of the Union of International Associations.
These are described in Annex I under the headings:
1.1 Maintenance of databases on international organizations
1.2 Maintenance of databases on "world problems" and "human potential".
1.3 Conceptual and terminological analysis.
1.4 Visualization of relationships.
1.5 Metaphors as vehicles of transdisciplinarity.
1.6 Institutional information systems.
2. Current concerns
In the light of the above explorations, the following general concerns emerge:
2.1 Dissociation of terminology from concept handling: Because of the
necessary biases of the library sciences, and their reflection in information
system design, some fundamental distinctions are not adequately preserved. The
point is perhaps best made with data records on "world problems":
(a) Many problems are only fuzzily defined. There is no standard terminology.
Some require one (and usually more) strings of terms to capture the range of
words used to name them. New strings may have to be added from time to time
to reflect current intellectual fashions. Some may have to be deleted as sub-problems
or variants are separated out of that initially identified.
(b) Problems are of course named with different strings of words in other languages.
(c) Computers are least efficiently used when records do not have a unique
identifier. There is merit in separating the concept identifier from the names
or descriptors which may from time to time be attached to it. Concept management
may thus be kept distinct from data management.
(d) The challenge of classifying and re-classifying the set of "world problems"
is a continuing one. There is not necessarily any final solution. There may
be a range of temporary solutions. Classification exercises should be kept separate
from concept management and data management.
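The separation argued for in (a) to (d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the example terms and the scheme labels ("scheme_A", "scheme_B") are hypothetical and are not taken from any UIA or HEMIS system. The concept is known to the computer only by its identifier; term strings in any language are attached and detached over time; classifications are held in a separate table so that they can be revised without touching either.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class Concept:
    """A concept known only by its identifier; names are attached separately."""
    concept_id: int
    # language code -> set of term strings currently used to name the concept
    terms: dict[str, set[str]] = field(default_factory=dict)


class ConceptRegistry:
    def __init__(self) -> None:
        self._next_id = count(1)
        self.concepts: dict[int, Concept] = {}
        # Classification is kept apart from the concepts themselves:
        # scheme name -> {concept_id -> category label}
        self.schemes: dict[str, dict[int, str]] = {}

    def new_concept(self) -> int:
        cid = next(self._next_id)
        self.concepts[cid] = Concept(cid)
        return cid

    def attach_term(self, cid: int, lang: str, term: str) -> None:
        self.concepts[cid].terms.setdefault(lang, set()).add(term)

    def detach_term(self, cid: int, lang: str, term: str) -> None:
        self.concepts[cid].terms.get(lang, set()).discard(term)

    def classify(self, scheme: str, cid: int, category: str) -> None:
        self.schemes.setdefault(scheme, {})[cid] = category

    def find(self, term: str) -> set[int]:
        """Return every concept id to which the term is attached, in any language."""
        wanted = term.casefold()
        return {
            c.concept_id
            for c in self.concepts.values()
            for names in c.terms.values()
            if any(n.casefold() == wanted for n in names)
        }


# Illustrative use: one concept, several names, two competing classifications.
registry = ConceptRegistry()
acid = registry.new_concept()
registry.attach_term(acid, "en", "acid rain")
registry.attach_term(acid, "en", "acid precipitation")
registry.attach_term(acid, "fr", "pluies acides")
registry.classify("scheme_A", acid, "atmosphere")
registry.classify("scheme_B", acid, "transboundary pollution")
registry.detach_term(acid, "en", "acid precipitation")  # a name falls out of use
```

Dropping a name, or reclassifying under a new scheme, leaves the identifier and the remaining records untouched, which is the point of keeping the three activities distinct.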
2.2 Insight capture. Clearly there is advantage in distinguishing the
problem of data capture, handling and retrieval from information handling and
retrieval. But much is assumed about the quality and value of "information",
when already there is a problem of information overload. From the mass of information,
there is a need to distinguish:
(a) Insights: A greater stress is needed on what might be called "insight capture",
which has much to do with the question of tools to facilitate comprehension.
Unless the user is able to learn while using, he will be unable to respond to information
beyond the "mechanical" response to the question asked. Here I have been interested
in the potential of metaphor and leitbild.
(b) Perceptions: In dealing with "world problems" (of which "acid rain" is an
interesting example), the distortions to which information is subject in order to
serve various vested interests (including protecting the reputations of eminent
scientists) need to be borne in mind.
(c) Concepts: With the increasing constraints on resources and learning time,
there is merit in distinguishing key concepts from the mass of other forms of
information. The question might be asked for each discipline what are the contents
of concept sets relevant to different levels of competence -- given that a key
individual in an institution may only be able to master a level below that which
is considered desirable by relevant professions (which may be less than disinterested
in defining what is desirable).
2.3 "Harmonization". There are many hidden problems in the stress placed
upon the term "harmonization". There is a long track record of clever ways and
reasons by which harmonization has been avoided. The challenge faced by ACCIS
in dealing with over 300 UN databases would make a valuable case study, if the
political background could be revealed. Some of the reasons are political, some are to do with
psychological resistance to other people's empire-building tendencies, etc. Unless
these issues are clarified, any future proposal will be vulnerable -- just as
the ACCIS initiatives have always been vulnerable and undermined.
2.4. Potential of new technologies. There is a need to implement software techniques
that enable the user to navigate through the system whilst retaining an understanding
of the whole. There are impressive new developments from Xerox PARC -- beyond
"windows" to "walls" and "rooms". Virtual reality may also make a great difference
in a very short time, especially with the advantages for CD-ROM. However it
is very clear that investments are made in response to market needs, and the
need for non-specialized access is not considered a priority. It would be easy
to argue that investments are made in response to quantity needs and not to
quality needs. They are made in such a way as not to threaten the positions
of experts seeking to protect their domains. Experience suggests that the more
international, interdisciplinary, intersectoral, or sensitive-beyond-established-boundaries,
a project is defined to be, the less probable the funding (especially on a long-term basis).
2.5 Priorities. There is much hype about the information society and
the global village. Many still aspire to a "global brain", or claim that it
already exists. The issue in any new project, constrained by resources, is how
to ensure innovative results for modest investment. An interesting example in
relation to the environment is the creation of a database on all species (whether
plant or animal) and their relationships in order to track vulnerability to
pollution, risks to food webs, endangered species, etc. Given the number of
species, an easy response is that such a project is not feasible. However valuable
approximations to the desired result may be achieved by using higher level clusters
of species (classes, families, etc), only extending into specific species where
resources and interest justify the work. No such database appears to exist,
although clearly it may be made as simple as resources require whilst still
retaining an overview -- however lacking in detail. Such a project is an interesting
challenge for CD-ROM and for data visualization, including virtual reality representations
of food webs with threatening factors.
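The approximation strategy described in 2.5 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a table of parent links between taxa and a table of notes recorded at whatever level resources have allowed; the names PARENT, NOTES, lineage and best_available are hypothetical, and the notes are placeholders rather than real findings. A query about a specific taxon falls back to the nearest higher-level cluster that has a record.

```python
# child taxon -> parent taxon (a tiny illustrative fragment, not a real database)
PARENT = {
    "Salmo trutta": "Salmonidae",
    "Salmonidae": "Actinopterygii",
    "Actinopterygii": "Animalia",
}

# Notes are recorded only where resources and interest have justified the work.
NOTES = {
    "Actinopterygii": "placeholder note held at class level",
    "Salmo trutta": "placeholder note held at species level",
}


def lineage(taxon):
    """Yield the taxon itself and then each ancestor up to the root."""
    while taxon is not None:
        yield taxon
        taxon = PARENT.get(taxon)


def best_available(taxon):
    """Return (level, note) from the most specific level that holds a record."""
    for level in lineage(taxon):
        if level in NOTES:
            return level, NOTES[level]
    return None  # nothing recorded anywhere on the path to the root
```

Here a query on "Salmonidae" is answered from the class-level record, while "Salmo trutta" is answered from its own; detail can be added species by species without ever losing the overview.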
3.1 Mission statement: As noted above, it is important to be aware of
the factors opposing harmonization, especially those that cannot be printed
in widely circulated reports. There is also a questionable assumption that policy-making
bodies are anxious to receive harmonized information. Policy-making bodies in
fact are usually relieved to be able to take advantage of lack of harmonization
to pursue policies that might otherwise be more readily challenged (cf "acid
rain"). Is it certain that "sound management" is the real desire of policy-making
bodies? There is merit in reflecting on what happened to previous ambitious projects
of this nature operating across institutional boundaries within the intergovernmental
system. Briefly the techniques used were: failure to implement, implementation
under a watered-down mandate, redefinition to a narrower mandate, reduction
of budgets, appointment of incompetent personnel, restriction on exchange of
data, effective opposition to any harmonization (implementation of different
bibliographic standards, incompatible computer systems, etc). ACCIS and its
predecessors (IOB, TABS) merit study in that light. UNISIST and UNBIS also merit
reflection. Why is there no UN system-wide documentation system, including the
Specialized Agencies? Why has it proved impossible to extend this to other intergovernmental
3.2 Scope of data: It is always possible to demonstrate that "no other
group holds data on..." by suitably specifying criteria. There remains the question
of to what extent HEMIS starts incorporating environmental data from domains
of other institutions (eg health, education, labour, etc) -- despite the views
of those bodies.
3.3 Standard pleas: It is important to be aware of the tendency of international
meetings to make standard pleas for further action, without the action in fact
being necessarily wanted (eg who would say no to a bibliography, a newsletter,
or another meeting?). A distinction should be made between:
(a) what users say they want
(b) what users will really use on a regular basis
(c) what users want but are unable to name (but usually find themselves obliged
to pay expensive consultants or intelligence services to locate)
3.4 Track records of cooperation: It is appropriate to ask for some
sort of objective study of the track record of cooperation in relation to documentation
between international institutions. It is easy to claim that much has been achieved.
It is not so easy to uncover what was not done and why. It is important to appreciate
the level of intellectual and institutional investment in existing filing and
classification systems. New projects cannot hope to claim that they are "neutral"
and aimed solely at facilitating access to others. They will necessarily be
perceived as a threat by others -- if only in competing for scarce resources.
Much tokenism is exhibited in disguising this reality.
3.5 Fashionable topics: The international community is constantly exposed
to new fashionable topics. But every decade or so there is a major "paradigm"
shift of which "development", "energy" and "environment" are examples. These
place institutions under severe political pressure to totally redesign their
information systems to accommodate a new pattern of subject and institutional
linkages. At this point in time, "environment" offers a clear pattern and set
of priorities. The question is whether, when the next policy "surprise" emerges,
an environmental information system will be organized in a sufficiently flexible
manner to be reconfigured to handle the new set of priorities which Member States
are liable to require.
3.6 Design considerations: There is great advantage in adopting a modular
approach to design. This obviously facilitates maintenance and future developments
(in response to revised requirements). But it also allows progress to be sensitive
to the availability of resources.
3.7 Multi-lingual access: The issue here, as noted above, is the relation
between a word, a term, a concept, and the various kinds of data identifier
at the computer level (record keys, etc). Ideally there would be a fundamental
distinction between the computer key problem, the conceptual problem, and the
terminological problem (in whatever language).
3.8 Keyword vs Fulltext: Given the potential for information overload
with many existing systems, surely emphasis needs to be placed on levels of
insight or priority, even if these are user defined. Fulltext "information"
will not necessarily offer insight -- which may call for artificial intelligence
to separate out the dross. A mining metaphor is appropriate: it takes a lot
of ore to extract relatively little mineral, which then has to be processed
to render it into a useful form. Possibly attention should be given to the sequence:
(a) concept, (b) keyword(s), (c) concepts network, (d) bibliographic information,
(e) abstract, (f) fulltext. The more focus is placed on (e) or (f), the less
investment is devoted to improving the quality of (a) through (c). It could
be argued that users benefit most in terms of creativity from manipulating the
system to discover more interesting questions. This requires facilities to handle
the (a) through (c) items. Items (e) and (f) can always be found in response
to even the most inappropriate questions. To what extent does HEMIS have a responsibility
for encouraging users to ask more interesting questions?
3.9 Windows: One might question why such a hardware-demanding platform
is required if the widest access is sought. In many ways Windows is a ploy to
increase the unit costs of hardware, increasing the unavailability of such facilities
to a wider user group in a time of global budget-cutting.
3.10 Closer cooperation: It is worth questioning the assumption that
a new tool leads to closer cooperation. Consider older projects such
as UNISIST and UNBIS (as noted above).
3.11 Utilization: It is worth questioning whether the kinds of retrieval
system that tend to be designed do not facilitate responses that are less than
useful -- thus leading to underuse of the system, and to pathetic attempts to
boost user statistics to justify an inappropriate investment.
3.12 Graphic interfaces: It is readily assumed that iconic approaches
designed in the West are appreciated in other cultures. However there was early
concern that even at the level of control knobs or buttons on tape-recorders,
there were distinct cultural preferences to which manufacturers had not been
3.13 Document access priority: One approach to prioritizing, if required,
might focus on navigation of the references, without giving access to documents.
3.14 IRS: Is it necessary to use this well-known acronym, which can only
breed confusion, especially in the USA?
3.15 Hyper-links: There is a need to distinguish between the facility
of clicking through a hyper-link pathway and that of acquiring some overview
of where one is in a system of such pathways. Current techniques reinforce the
rat-in-the-maze approach, because mapping pathways using graphical techniques
is avoided (see references to Mapping Hypertext, for example). Contextuality
is a requirement. Knowing that one is in the "water pollution" hyper-text maze
is less helpful than seeing where one is in that maze.
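The distinction can be made concrete with a short Python sketch. It assumes the hyper-link pathways are available as a simple adjacency table; the table LINKS and the functions neighbourhood and render are hypothetical illustrations, not part of the HEMIS design. The first function reports what lies within a given number of clicks of the current position; the second renders the maze as an indented outline with the current position marked.

```python
from collections import deque

# node -> nodes reachable by one click (illustrative fragment only)
LINKS = {
    "water pollution": ["acid rain", "heavy metals"],
    "acid rain": ["forest damage"],
    "heavy metals": ["mercury"],
    "forest damage": [],
    "mercury": [],
}


def neighbourhood(links, start, radius):
    """Breadth-first search: map each node within `radius` clicks to its distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == radius:
            continue  # do not expand beyond the requested radius
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


def render(links, root, current, indent=0, visited=None):
    """Return outline lines for the maze below `root`, marking the current node."""
    visited = set() if visited is None else visited
    if root in visited:
        return []  # guard against cycles in the link structure
    visited.add(root)
    marker = "  <-- you are here" if root == current else ""
    lines = ["  " * indent + root + marker]
    for child in links.get(root, []):
        lines.extend(render(links, child, current, indent + 1, visited))
    return lines


print("\n".join(render(LINKS, "water pollution", "mercury")))
```

A textual outline is of course a poor substitute for a graphical map, but even this much context tells the user which part of the maze has been reached and what lies alongside it.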
3.16 Thesauri: Several issues here:
(a) "Every keyword is related to one unique identifier".
-- This would seem to imply that concept and word are unambiguously tied. This
may indeed be a valid assumption in the case of the HEMIS subject matter. "Mercury"
is always mercury (except when it is another planet). This is less clearly the
case of "health", which may be understood in a variety of ways. Even less so
in the case of value-loaded terms such as "quality of life" or "well-being".
In addition to the points made above, I refer you to the study by Huff on homonyms
and homographs. It is not clear (Fig 10) whether field (F) resolves all these
-- What happens in the case of keywords that can only be rendered unambiguous
by context: mathematical "calculus" vs urinary "calculus"?
-- Is the track record such that it is possible to believe in harmonization
in relation to keywords? The UN record on defining "aggression" is an interesting
example. "Peace" might be another. "Transnational corporations" is another interesting
one. What is of interest are the distinct concepts of aggression or peace, not
that they all happen to use the same inadequate keyword.
-- With the increasing interest in the cognitive function of metaphors, to
what extent would the system be able to handle data using keywords metaphorically?
Is HEMIS supposed to be a metaphor-free system? Is this handled by provision
for synonyms, even though the synonym may then point beyond the same subject
field to make a metaphoric point?
(b) "This thesaurus is based on the INFOTERRA definitions". Here there are
several problems. INFOTERRA is a governmental system and therefore subject to
pressure from Member States to include or exclude terms which are politically
embarrassing. Thesauri can be studied for such bias, recognizing that classification
is a political act. It is unclear to what extent INFOTERRA has allowed itself
to be sensitive to non-UNEP issues -- bearing in mind an early briefing from
UNEP that any communication that did not correspond to the mandates of one of
its 12 departments could not be processed. Tying a classification system to
the changing political matrix of institutional mandates as to what constitutes
"environment" (or what topics relate to what departments) is intellectually
embarrassing, as the early history of UNEP reveals.
(c) What is the conceptual significance of someone disagreeing with the thesaurus
structure? And what can who do about it and when? Some issues are directly to
do with conflicting classifications -- as can readily be seen with competing
taxonomic classification systems.
(d) "The HEMIS thesaurus may be enlarged by partner institutes in deeper hierarchy
levels". What is the OECD Macrothesaurus experience in this regard?
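The difficulty raised under (a) can be sketched in Python. The index below is a hypothetical illustration: the concept identifiers (C001 and so on) and the subject-field labels are invented for the example and do not come from the HEMIS or INFOTERRA thesauri. A keyword maps to a list of candidate concepts rather than to one, and a subject field, where the user can supply one, narrows the list.

```python
# keyword -> list of (concept identifier, subject field); all values hypothetical
KEYWORD_INDEX = {
    "mercury": [("C001", "chemistry"), ("C002", "astronomy")],
    "calculus": [("C003", "mathematics"), ("C004", "medicine")],
    "lead": [("C005", "chemistry")],
}


def resolve(keyword, field=None):
    """Return the concept ids for a keyword, narrowed by subject field if given."""
    candidates = KEYWORD_INDEX.get(keyword.casefold(), [])
    if field is not None:
        candidates = [c for c in candidates if c[1] == field]
    return [cid for cid, _ in candidates]


def is_ambiguous(keyword):
    """A keyword is ambiguous when it points at more than one concept."""
    return len(resolve(keyword)) > 1
```

Under this sketch resolve("calculus") returns two identifiers and resolve("calculus", field="medicine") returns one. It does nothing for terms such as "health" or "quality of life", where the competing senses lie within a single subject field; that remains a conceptual problem rather than an indexing one.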
3.11 Retrieval: In relation to our own system, there seems to be a dangerous
reliance on descriptors when an approach more tolerant of fuzziness is required.
3.12 Editing thesaurus: Making this a centralized responsibility
creates a situation from which key thesauri have suffered -- especially
when they become heavily institutionalized and staffed by people with a heavy intellectual
investment in one philosophy as opposed to another. An alternative is to allow
users to make proposed (coded) modifications -- with approval (coded) following
as and when the central office can get around to it. This allows users to encode
new search links in a decentralized mode, making the system immediately relevant.
Users without such privileges can choose to follow their own philosophy of proposed
extensions -- or ignore them.
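The decentralized alternative can be sketched as follows, in Python. The sketch assumes that each thesaurus link carries a status code and the code of whoever entered it; the class and method names, the user codes and the example terms are hypothetical. Proposed links are usable immediately by anyone who chooses to follow their author, and become visible to all once the central office approves them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    status: str       # "approved" or "proposed"
    proposed_by: str  # code of the user or office that entered the link


class Thesaurus:
    def __init__(self):
        self.links = []

    def propose(self, source, target, user):
        """Any privileged user may add a link; it starts out as 'proposed'."""
        self.links.append(Link(source, target, "proposed", user))

    def approve(self, source, target):
        """The central office confirms a link whenever it gets around to it."""
        self.links = [
            Link(link.source, link.target, "approved", link.proposed_by)
            if (link.source, link.target) == (source, target)
            else link
            for link in self.links
        ]

    def related(self, term, follow=()):
        """Approved links always apply; proposed ones only if their author is followed."""
        return sorted(
            link.target
            for link in self.links
            if link.source == term
            and (link.status == "approved" or link.proposed_by in follow)
        )


t = Thesaurus()
t.propose("water pollution", "groundwater", "user_17")
t.propose("water pollution", "marine litter", "user_42")
t.approve("water pollution", "marine litter")
```

After these steps a user who follows nobody sees only "marine litter", while a user who follows user_17 also sees "groundwater" without waiting for approval.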
3.13 Disagreement: Advances in knowledge come in part from the development
of alternative views. These usually have implications for the way information
is classified. The question is then how is a system going to hold the two (or
more) competing classification variants during the transition phase. Or is the
classification going to resist the emerging perspective until the replacement
of the old becomes authorized -- but by whom? The pattern of relationships between
data elements could usefully reflect an extension of the citation analysis approach,
namely to show that Document C is critical of Document F and supportive of Document
3.14 Obsolete documents: One of the problems in information systems
is the accumulation of items which are obsolete or discredited. How are these
to be distinguished by the system -- especially since they are a basis for prioritizing?
Using the date is totally inappropriate, since older material may be valuable
either for data or as an original insight. The responsibility could be left
to the user, but this imposes a heavy scanning burden (as well as increasing
costs) for lack of a sensible technical solution. An alternative is to build
on citation analysis techniques, recognizing that it is often the uncited publications
which herald the future, not those which are part of an intellectual fashion.
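One way of building on citation analysis without resorting to dates is sketched below in Python. It assumes that citations are recorded with a stance (critical or supportive), as suggested under 3.13; the document labels are arbitrary placeholders, and the simple majority rule in standing is an assumption made for illustration, not an established bibliometric measure. Uncited items are reported as a category of their own rather than being treated as obsolete.

```python
from collections import Counter

# (citing document, cited document, stance); labels are arbitrary placeholders
CITATIONS = [
    ("C", "F", "critical"),
    ("C", "A", "supportive"),
    ("D", "F", "critical"),
    ("E", "A", "supportive"),
]


def standing(doc, citations=CITATIONS):
    """Summarise how a document is regarded by the documents that cite it."""
    tally = Counter(stance for _, cited, stance in citations if cited == doc)
    if not tally:
        return "uncited"      # possibly neglected, possibly ahead of its time
    if tally["critical"] > tally["supportive"]:
        return "contested"    # more critical than supportive citations
    return "supported"
```

With the placeholder data, standing("F") gives "contested", standing("A") gives "supported" and standing("B") gives "uncited". The label only informs the user's prioritizing; nothing is withdrawn from the system on the strength of it.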
3.15 Interdisciplinarity: Despite being concerned with one of the most
interdisciplinary subject areas, the HEMIS approach seems to have no special
provision for the challenging issue of interdisciplinary searches and conceptual
integration. Is this question assumed to be irrelevant because searches can
be made using any range of keywords or using "interdisciplin*" ? Should such
a system not be doing more to encourage searching in terms of the relevant categories
around any specialized search? This is not a hierarchical notion, but is this
reflected in the network solution? How is integration reinforced, and to what
extent are such issues the responsibility of HEMIS rather than of the user?
3.16 Security classification: No intergovernmental system can be created
without making provision for security classifications. This does not seem to
be mentioned. There is also the issue of how a thesaurus would be extended to
cover classified topics. One of the issues with INFOTERRA was restricted access
excluding "nongovernmental" participation -- and thus presumably excluding new
topics which such bodies were exploring before they could be considered worthy
topics by UNEP.
My conclusion is that in the main the problems are not technical, especially
if HEMIS is designed to deal only with information items which are unambiguously
defined (namely excluding all social or politically loaded factors) and are
more focused on information (in response to pre-defined questions) rather than
"insight" (in response to interactive questioning), and issues of how information
is to be communicated and comprehended. There are some very elegant technical solutions,
especially those which allow the end-user to approach the data through whatever
classification system he chooses, even a personalized one. What is required
at this stage is a sense of what tends to happen to interesting information
system proposals in the light of past experience -- and who would be motivated
to discuss it, able to do so, and in a useful way without covering up the sad
INFORMATION CONCERNS OF UNION OF INTERNATIONAL ASSOCIATIONS IN A COMPUTER-SUPPORTED
The comments in the accompanying note derive from consideration of several
long-term concerns explored in connection with programs of the Union of International
Associations. The references here are to publications and reports produced by
1.1 Maintenance of databases on international organizations
This is a long-term project, originating in 1910, which gives rise periodically
to production of hardcopy versions. CD-ROM versions are scheduled for 1993.
- Yearbook of International Organizations (annual, 3 vols, 5,500 pages).
Profiles some 20,000 international non-profit bodies, whether governmental
or nongovernmental, in all fields of human activity. Organizations have
some 80,000 registered links amongst themselves, and some 220,000 country-organization
- International Congress Calendar (quarterly). Lists some 10,000 future
international conferences, notably those of international organizations.
1.2 Maintenance of databases on "world problems" and "human potential".
This long-term project, started in 1972, gives rise periodically to an:
- Encyclopedia of World Problems and Human Potential (2nd ed 1986; 3rd
ed 1991; 4th ed 1994, 2,400 pages). Profiles some 13,000 problems as perceived
by international organizations and other international constituencies. Registers
some 80,000 links between them. Also has smaller databases on: human values
(2,300), strategies (8,300), human development concepts (4,000).
1.3 Conceptual and terminological analysis. This work was initially
undertaken with the Committee on Conceptual and Terminological Analysis (COCTA)
and resulted in a report:
Further work was undertaken in discovering ways to deal with the fuzzy concepts
underlying "world problems" and "human values". This work took its main form
in the design of databases and software to handle the above publications. However
a report was presented to a UNESCO-sponsored Conference on Conceptual and Terminological
Analysis (Bielefeld, 1981):
Related work was undertaken in connection with the United Nations University
project on Goals, Processes and Indicators of Development (1978-1982), notably:
Further work was undertaken in relation to socio-cultural aspects of Information
Overload and Information Underuse, a project of the United Nations University:
This work continues in connection with the Encyclopedia of World Problems and
Human Potential because of the level of conceptual ambiguity in information
1.4 Institutional information systems. In parallel with the above initiatives
(during the various struggles of IOB from which ACCIS finally emerged), reports
were prepared for two UN-sponsored conferences on the information challenges
facing the intergovernmental organizations and especially the Specialized Agencies:
1.5 Visualization of relationships. There is a long-term concern that
current means of presenting information in text form do not highlight the importance
of relationships between data elements. Various approaches have been explored
to the challenge of representing complex networks of entities in a comprehensible
way using computer graphics techniques. These have been summarized in the following
1.6 Metaphors as vehicles of transdisciplinarity. Aspects of the above
work on concepts are now dealt with in a long-term project on the use of metaphors
to handle higher orders of complexity. Here the concern is with the unexplored
cognitive potential of metaphors to provide conceptual scaffolding as a guide
to transdisciplinary initiatives and problems of governance. Reports include:
- Innovative global management through metaphor (Paper for Conference
on Social Innovation in Global Management, Cleveland, 1989)
- Through Metaphor to a Sustainable Ecology of Development Policies.
In: Trzyna, T C and Gotelli, I (Eds): The Power of Convening; collaborative
policy forums for sustainable development (Proceedings of an International
Workshop sponsored by the Commission on Sustainable Development of IUCN-The
World Conservation Union, California Institute of Public Affairs, and Center
for Politics and Policy (Claremont CA, October 1989)). Sacramento CA, California
Institute of Public Affairs, 1990, pp. 64-81.
- Recontextualizing Social Problems through Metaphor: transcending the
"switch" metaphor. Paper prepared for the International Conference on Demography
Issues and Sustainable Development organized by Development Alternatives
(New Delhi, 1990). In: Transnational Associations, 43, 1991, 1, pp. 37-46
- Guiding Metaphors and Configuring Choices (Paper for the Development
Administration division of the UN Department of Technical Cooperation for
Development). Brussels, UIA, 1991.
- Metaphors as Transdisciplinary Vehicles of the Future (Paper for the
UNESCO-sponsored Conference on Science and Tradition, Paris, 1991)
- Through Metaphor to a Sustainable Ecology of Development Policies (Paper
for an IUCN sponsored workshop on policy forums, Claremont CA, 1990)
1.7 Translation. Although most information is held in English, non-English
organization names and keywords are also used for access and indexing. Investigations
into the possibility of machine translation from English into other languages
are being undertaken, notably for the Yearbook of International Organizations
(which last appeared in French in 1980). These distinguish three levels:
- (a) Unambiguous "translation by substitution" (eg names of countries)
- (b) Machine-assisted translation using packages such as SYSTRAN, EUROTRAN or
- (c) Conventional translation
Anthony Judge. Knowledge representation in a computer-supported environment.