Design of an information system
to facilitate the production of concept thesauri by different
schools of thought
Notes on the design of an information system to facilitate the production of concept
thesauri by different schools of thought. Notes presented to a workshop of the
Committee of Conceptual and Terminological Analysis (COCTA) of the International
Political Science Association (Bellagio, September 1971) and subsequently developed
into a more extensive proposal, Relationship between Elements of Knowledge:
use of computer systems to facilitate construction, comprehension and comparison
of the concept thesauri of different schools of thought.
In a discussion about any form of thesaurus, it would seem to be
important to distinguish between the problems of identifying and
processing the entities to be included and the problem of classifying
these entities or rejecting some on the basis of a particular set
of criteria. The first problem is a much simpler one and its
solution, if sufficiently general, can be arranged so as to minimize
controversy and therefore maximize acceptance of the procedure. The
second problem is more complex and the solution may raise real or
perceived theoretical issues or even be perceived as a threat by
certain schools of thought. This could undermine any attempt to
solve the problems of conceptual anarchy.
In these notes some of the difficulties are summarized and an attempt
is made to demonstrate how adequate structuring of the required
information system may help to eliminate or isolate these problems
in such a way as to render the project feasible. These notes do
not therefore touch upon any theoretical issues. They are solely
concerned with the design of a practical information system to be
used as a tool by scholars in order to facilitate solution of such problems.
1. The identification of entities to be included in a thesaurus and the practical
problems of incorporating these entities into an information system need to
be distinguished from the theoretical problems of classifying and interrelating
such entities. The first is a relatively fast and unskilled operation and the
second is relatively slow and skilled. This means that the technique of identifying
the entity within the system by a numerical tag derived from a classification
scheme should be avoided.
2. The experience of the success of the Académie Française with respect to the French
language, and of the International Federation for Documentation with respect to the
Universal Decimal Classification of subject areas, would seem to indicate that such
approaches tend to:
- be slow in responding to change to the point of acting as a constraint
on innovation to those dependent upon them (the UDC Committees in some areas
are rumoured to be 10 to 15 years behind in coding the backlog)
- give rise to a proliferation of competing alternatives for groups of users
with slightly different perspectives on subject areas (e.g. UDC, Dewey and
UN/OECD Aligned List of Descriptors) who need a tool with slightly different
- become associated with particular schools of thought, organizations or
personalities who resent criticism of their perspective and alienate potential
- become viewed as authoritarian and a vehicle of some form of conceptual
imperialism. Unfortunately, the organization of relations between entities
is equated with the imposition of a new set of relations. The organizers
are perceived as acquiring power.
These points raise the important question of providing an adequate degree
of flexibility and responsiveness to the needs of future users and those
in other disciplines and schools of thought (whose theoretical requirements
it is clearly difficult to predict). At the same time, there is the problem
of ensuring that the system meets the needs of those who initially invest
effort in the undertaking.
3. Overdesigning the information handling system to meet immediately perceived
needs may reduce its usefulness and relevance to others and therefore increase
the difficulty of ensuring adequate funds over a long period. (The degree
of "hygiene" may be inversely proportional to the utility or relevance
of the system to potential users.)
4. The actual procedures for incorporating new entities into any 'approved'
list within the system may appear bureaucratic and stultifying unless the
system is user-oriented. There is therefore the old problem of minimizing
the bureaucratic desire for due process and order and maximizing user participation.
5. Any single classification scheme may come to be treated by some funding
bodies as a basis for their system of resource allocation for research. This
tendency might be encouraged by the strategies of those seeking research support
and could artificially distort research directions.
6. There is an important possibility that the classification scheme may
constitute or operate as some form of paradigm (or perhaps "meta-paradigm").
Whilst this is acceptable for the solution of immediate problems of conceptual
hygiene it may in the long-term encourage scholasticism. The design of alternatives
to a particular classification scheme, or its redefinition, should therefore
7. Some thought needs to be given to the potential users of the thesaurus,
and the manner in which use of it might be facilitated. The initiative comes
from one school of thought in political science. How can it be made useful
to:
- other schools of thought
- social science disciplines (in a narrow sense)
- social science disciplines (in a broad sense)
8. The assumption has been made that the major users would be scholars or
students. There may, however, be possibilities of wider use, which raises problems
of how users can introduce filters to eliminate excessive detail and
other features of the system which are not immediately essential.
1. A computer-based concept registration or tagging system should
be set up which would allocate sequence numbers to concepts on a continuing
basis. The criteria for concept registration should be kept to a minimum
to ensure that the system remains "open" to a wide variety of users.
This approach permits rapid inclusion and organization of the data and rapid
production of updated concept lists. These would facilitate the scrutiny
of the data in various phases and in terms of the perceptions of different
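
The registration step described in point 1 can be sketched as follows. This is a minimal illustration in Python and not part of the original proposal: the class name `ConceptRegister`, its methods and the sample labels are all hypothetical. The only acceptance criterion assumed here is a non-empty label, and the sequence number deliberately carries no classificatory meaning.

```python
from itertools import count


class ConceptRegister:
    """Hypothetical register: allocates sequence numbers, imposes no classification."""

    def __init__(self):
        self._numbers = count(1)   # source of the next sequence number
        self.entries = {}          # sequence number -> label

    def register(self, label):
        # Assumed minimal criterion: the label must not be empty.
        label = label.strip()
        if not label:
            raise ValueError("a label is required")
        seq = next(self._numbers)
        self.entries[seq] = label
        return seq

    def listing(self):
        # Updated concept list, in order of registration.
        return sorted(self.entries.items())


register = ConceptRegister()
register.register("power")
register.register("legitimacy")
register.register("power")   # same label, registered as a separate entity
print(register.listing())
```

Two entities with the same label simply receive different numbers. Whether they represent the same concept is a question left to the modelling phase.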
2. Evaluation, classification and identification of concept inter-relationships
would be made independently by a limited number of contributing groups,
possibly associated organizationally with the international academic bodies.
These groups would be primarily concerned with allocating codes to be fed
back to the computer system so that ordered and refined concept thesauri
could be produced to reflect the perceptions and needs of the contributing
groups. An important aspect of this coding function by groups would be the
rejection of those conceptions registered which are considered to be of little
value to the group's perspective.
From the computer data handling point of view, each contributing
group would be building, refining and maintaining its own
"model". Each such model would be handled as an independent
optional qualifier on the sequentially ordered concept list.
From the point of view of any such group, the computer system
would be viewed as holding the concepts in which it is interested
in the order of its own preferred classification scheme.
There would of course be the opportunity at any time to look
at the same concept list through the classification scheme
of any other contributing school of thought.
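
The idea of a model as an optional qualifier on the shared list might be sketched as below. The concept labels, model names and classification codes are invented for illustration, and the assumption that codes are dotted decimal numbers is only one possible choice. A concept missing from a model is taken to have been rejected, or not yet coded, by that group.

```python
# Shared register: sequence number -> label (illustrative entries only).
concepts = {1: "power", 2: "legitimacy", 3: "anomie", 4: "authority"}

# Each model maps sequence number -> that group's own classification code.
models = {
    "school_a": {1: "1", 4: "1.1", 2: "2"},
    "school_b": {3: "1", 1: "2", 2: "2.1"},
}


def code_key(code):
    # Compare dotted codes numerically, so that "1.10" sorts after "1.2".
    return tuple(int(part) for part in code.split("."))


def view(model_name):
    """List (code, sequence number, label) in the order of one model's scheme."""
    model = models[model_name]
    ordered = sorted(model.items(), key=lambda item: code_key(item[1]))
    return [(code, seq, concepts[seq]) for seq, code in ordered]


for name in models:
    print(name, view(name))
```

The register itself is never reordered. Each call to `view` presents the same concepts in the order preferred by one group, and switching to another group's perspective only requires a different model name.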
3. Once the concept registration system is running smoothly and the professional
groups are interacting effectively with the system to feed back their classification
of the concepts within their own models, other groups of different levels
of "multi-disciplinarity" may constitute themselves to work on
the integration into "meta-models" of two or more of the models
already produced (e.g. for political science and sociology into a social science
4. There is no reason why, for example, a copy (on computer
tape) of the concept list and various models should not be
made available to universities for comparative research on the
models or as a tool in the educational process. Alternative
models could be constructed which could be made generally
With respect to research, it is clearly important to enable the
user to examine the thesauri at different levels of abstraction
by introducing filters. In addition there is the possibility
of comparative study of the manner in which different
perceive and interrelate phenomena.
With respect to education, it is possible to develop educational
meta-models which would permit selection of concepts by filters
corresponding to different educational levels (e.g. an "atom"
may be viewed as a billiard ball-type structure in the elementary
stages, a miniature solar system, a system of electrically
charged potential clouds or, in the final stage, as something
which can only be described with mathematical symbols). At each
level a precise definition in the appropriate terms could be
provided. In addition the approach could permit individual
students to create their own concept thesaurus and to learn
from the differences between their own and those of particular
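
The filters mentioned in point 4 could work roughly as in the following sketch. It assumes a hypothetical educational meta-model in which each concept carries definitions keyed by a numeric level. The level numbers, the sequence numbers and the second entry are invented for illustration; the wording for the "atom" follows the example in the text.

```python
# Hypothetical educational meta-model: sequence number -> {level: definition}.
meta_model = {
    101: {  # "atom"
        1: "a billiard ball-type structure",
        2: "a miniature solar system",
        3: "a system of electrically charged potential clouds",
        4: "something described only with mathematical symbols",
    },
    102: {  # a concept assumed to be introduced only at a later stage
        3: "illustrative definition of an advanced concept",
    },
}


def select(meta_model, level):
    """Keep concepts introduced at or below `level`, each with the most
    advanced definition available at that level."""
    selected = {}
    for seq, by_level in meta_model.items():
        usable = [lv for lv in by_level if lv <= level]
        if usable:
            selected[seq] = by_level[max(usable)]
    return selected


print(select(meta_model, 2))
```

The same function serves as a filter on detail: concepts with no definition at or below the requested level are simply omitted from the result.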
5. Each contributing group may wish to distinguish differently between or
interrelate the "entities" tagged in the computer sequential register.
There is no reason why "concepts", "propositions", "relationships",
"problems", etc. should not all be treated as entities and appropriately
distinguished and interrelated (or ignored and rejected) at the modelling
phase. It might, for example, be particularly valuable to include "theories",
"frameworks of inquiry", etc. by first giving each a sequential
number (as indicated above) and then (in the modelling phase) relating them
to the major variables considered significant and necessary to define the
frame of discourse associated with that theoretical viewpoint.
This would permit the same system to handle concept thesauri, inventories
of propositions, inventories of problems, etc.
6. At a later stage users of one model might find it useful to produce an
"authoritative" list of terms to be used for those concepts of interest
to them. This could also be incorporated into the computer system.
7. It would require further study to determine whether models need to demonstrate
concept interrelationships by using the decimal classification scheme or whether
a more economical and flexible computer technique could not be developed. The
latter might avoid the need to have both a sequential tag and a decimal number.
Instead a concept would be identified by its sequential number plus a number
to uniquely identify the model.
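
One way to read point 7 is that relationships are stored explicitly under a key made of the sequence number and the model number, so that no decimal code is needed to express them. The sketch below assumes this reading. The table `depends_on`, the function `dependents` and the sample numbers are hypothetical.

```python
# (sequence number, model number) -> sequence numbers the concept depends upon.
depends_on = {
    (4, 1): [1],   # in model 1, concept 4 depends on concept 1
    (2, 1): [1],   # in model 1, concept 2 depends on concept 1
    (1, 2): [3],   # in model 2, concept 1 depends on concept 3
}


def dependents(seq, model_number):
    """Sequence numbers of the concepts that depend on `seq` within one model."""
    return sorted(
        s for (s, m), parents in depends_on.items()
        if m == model_number and seq in parents
    )


print(dependents(1, 1))
print(dependents(1, 2))
```

Because the model number is part of the key, the same concept can occupy quite different positions in different models without any renumbering.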
8. Since one of the great disadvantages of computer use is the tendency
to generate long, indigestible and impenetrable lists (however ordered), some
attention could usefully be paid to the technique of displaying networks
of concepts directly onto television-type screens under computer control.
The user can then penetrate and interact with the network and can readily:
- have networks at different levels of the "ladder of abstraction"
displayed (i.e. the computer can be used to explore conceptual schemes nested
within one another like Chinese boxes)
- add or eliminate explanatory texts to clarify concept interrelationships.
(This is particularly useful for the successive clarification of conceptual
schemes at gradually increasing levels of complexity - in the educational
Data to be included on each entity
A. Identification or Registration Phase
1. Entity sequential number
2. Conventional labels or terms
2.1 Language 1 data
2.1.1 Language code
2.1.2 Label in language (of code given in 2.1.1)
2.2 Language 2 data etc
3. Text of definition
3.1 Language 1 definition
3.2 Language 2 definition
4. Sequence numbers of other entities with same label
B. Classification or Modelling Phase
1. Model number
2. Type of entity code (e.g. concept, proposition, problem, etc.)
3.1 Date first used
3.2 Date last used
4. Relative importance code
5. Compositional relationship coding (i.e. sequence numbers of entities
representing concepts which: (i) are dependent upon this concept, (ii) upon
which this concept depends, (iii) are horizontally related to this concept)
6. Comprehension relationship coding (i.e. sequence numbers of entities
representing concepts as in B.5 but reflecting relationships between
concepts in terms of ease of comprehension)
7. Systemic relationship coding (i.e. sequence numbers of entities representing
the systems corresponding to concepts and reflecting the manner in which such
systems are nested or interact with one another)
C. Authoritative Term Phase
1. Text of authoritative term
1.1 Language 1 term
1.2 Language 2 term
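
The outline above might translate into a record structure along the following lines. This is a sketch under stated assumptions rather than a specification: the class and field names are hypothetical, dates are held as plain strings, language codes such as "en" and "fr" are illustrative, and authoritative terms (Phase C) are assumed to belong to a particular model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelEntry:
    """Phases B and C: how one model classifies one entity."""
    model_number: int                     # B.1
    entity_type: str                      # B.2, e.g. "concept"
    first_used: Optional[str] = None      # B.3.1
    last_used: Optional[str] = None       # B.3.2
    importance: Optional[int] = None      # B.4
    # B.5 and B.6: relationship kind -> sequence numbers of related entities
    compositional: Dict[str, List[int]] = field(default_factory=dict)
    comprehension: Dict[str, List[int]] = field(default_factory=dict)
    systemic: List[int] = field(default_factory=list)             # B.7
    # C.1: language code -> authoritative term (assumed to be per model)
    authoritative_terms: Dict[str, str] = field(default_factory=dict)


@dataclass
class Entity:
    """Phase A data, plus any number of optional model entries."""
    sequence_number: int                                         # A.1
    labels: Dict[str, str] = field(default_factory=dict)         # A.2
    definitions: Dict[str, str] = field(default_factory=dict)    # A.3
    same_label: List[int] = field(default_factory=list)          # A.4
    models: Dict[int, ModelEntry] = field(default_factory=dict)

    def classify(self, entry: ModelEntry) -> None:
        # Adding or replacing a model entry never touches the Phase A data.
        self.models[entry.model_number] = entry


power = Entity(1, labels={"en": "power", "fr": "pouvoir"})
power.classify(ModelEntry(model_number=1, entity_type="concept",
                          compositional={"dependents": [4], "horizontal": [2]}))
print(power)
```

Keeping the Phase A fields separate from the per-model entries reflects the distinction drawn throughout these notes between registering an entity and classifying it.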