Other Initiatives toward a Concept Inventory
Originally appeared in 1973 as part of Toward
a Concept Inventory
1. Concept coding schemes
There have been many attempts at isolating and classifying elements of meaning
at the root of complex concepts. De Grolier (41) notes that methods and the
need for them have been regularly discovered and rediscovered since the
time of Leibniz or even earlier. He then states:
"We draw attention to these 'anteriorities', not in order to underrate the
work performed by the various researchers or teams of researchers who, in
most cases, truly believed that they had discovered a 'new method' but to
persuade them, rather than to advocate unilaterally any one 'exclusive' process,
to agree that they are all engaged in work on common basic principles, whatever
may be the differences (at times very minor) in the coding method or the particular
type of machine adopted."
De Grolier has summarized the work on classification around the world but
only a few initiatives seem to be directly related to this project. Usually
the work has been directed towards solving a classification problem in some
particular field, which strongly influences the design of the scheme. The following,
noted by de Grolier, are of more direct relevance:
1.1 Perry and Kent (Western Reserve University): Developed a coding
method for the field of metallurgy based on 'semantic analysis' of complex
terms into 'individual terms'. 30,000 terms were assembled from a variety
of sources. The notation is however very cumbersome.
1.2 S.M. Newman (U.S. Patent Office): A "vast attempt at defining or
redefining concepts, which could perhaps be entitled - to paraphrase a famous
title - 'In search of lost simplicity': to discover or rediscover non-equivocal
terms beyond the complications of natural language, which 'unfortunately'
does not have 'uniform or logical rules for the denomination of devices or
things'". In effect this is an attempt at creating a metalanguage, but again
it results in a cumbersome notation.
1.3 C.G. Smith (U.S. Patent Office): Suggested a system which would
isolate "ultimate concepts .... required in the definition of more specific
concepts .... There is a basic layer of concepts which do not require definition.
It is the use of such elemental concepts which is contemplated in the present
system .... A fundamental feature is to seek beneath composite words the basic
organization of elemental concepts which they represent, and to develop the
essential combination for the definition of these words." (42) This was conceived
mainly for patentable contrivances on the US Patent Office Interrelated Logic
Accumulating Scanner. It does however permit chains of related concepts to
1.4 Cordonnier: Worked on methods "to symbolize the elementary points
of view of the classification of ideas and .... to study the grouping of these
symbols in order to obtain composite symbols representing the structure of
complex concepts". He also suggests that "intuition permits the representation
in an intellectual space of a logical figure, to n dimensions, a synthesis
of the relationships between a group of ideas into the different classes which
arrange them naturally according to the various possible individual viewpoints".
1.5 M.E. Stevens: Worked on the use of computers to handle interrelationships
between terms and to 'define', by supplying the generic and descriptive terms
related to the term of which the definition is sought; 'develop', by furnishing
specific examples of a generic term; 'localize', by indicating the place which
can be associated with the proposed concept; 'match', by comparing several
proposed terms together, in order to find a 'common point' making it possible
to relate to these terms another term possessing the same characteristic;
and carry out other logical operations. (43)
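Three of the operations attributed to Stevens above ('define', 'develop' and 'match') can be illustrated with a minimal Python sketch. It is not a reconstruction of Stevens's programs: the vocabulary, the two lookup tables and the function names are all hypothetical, and 'localize' is omitted because the source gives too little detail to sketch it.

```python
# Hypothetical toy vocabulary; none of these terms come from Stevens's work.
GENERIC = {            # term -> its generic (broader) term
    "copper": "metal",
    "iron": "metal",
    "metal": "material",
    "oak": "wood",
    "wood": "material",
}
DESCRIPTORS = {        # term -> descriptive terms attached to it
    "copper": {"conductive", "ductile"},
    "iron": {"magnetic", "ductile"},
    "oak": {"hard"},
}


def define(term):
    """Return the generic term and the descriptive terms related to `term`."""
    return GENERIC.get(term), sorted(DESCRIPTORS.get(term, set()))


def develop(generic_term):
    """Return the specific examples recorded under a generic term."""
    return sorted(t for t, g in GENERIC.items() if g == generic_term)


def match(*terms):
    """Return the descriptive terms shared by all given terms (the 'common point')."""
    sets = [DESCRIPTORS.get(t, set()) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []


def related_by(descriptor):
    """Return every term possessing a given characteristic."""
    return sorted(t for t, d in DESCRIPTORS.items() if descriptor in d)
```

Under these assumptions, matching "copper" with "iron" yields the shared characteristic "ductile", which can then be passed to related_by to find other terms possessing it.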
2. The ADMINS system
Work has been in progress for some years at the M.I.T. Center for International
Studies on the development of very general systems for time shared computer
data management (44). An item of data is treated as a sequence of categories
of information in n-adic relations applied to a specific entity. N-adic data
descriptions for social science propositional inventories are noted as being
quite complicated, e.g. 'violence' is 'power' over 'power' over 'well-being'.
The ADMINS system makes use of a "calculus of relations" for stating the
derivation of a new relation that draws on those already existing, and which
yields a new relational record between particular entities. It is in the structuring
of the programming language around the relational record and in achieving intimate
interaction with many storage levels that this system differs from most procedure
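The idea of deriving a new relation from those already existing can be illustrated, very loosely, by relational composition. The Python sketch below is not the ADMINS language and makes no claim about how ADMINS stores its records; the relation names and the entities are hypothetical.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


# Hypothetical binary relations between particular entities.
member_of = {("alice", "union_x"), ("bob", "party_y")}
based_in = {("union_x", "geneva"), ("party_y", "oslo")}

# A new relational record derived from the two existing relations.
active_in = compose(member_of, based_in)
```

Here active_in links each person to the place where their organization is based, a relation that was never entered directly.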
3. Citation indexing
The citation indexing method described by Eugene Garfield (and implemented
in the form of the Science Citation Index and the recently initiated
Social Science Citation Index) is of great interest to this project if
the focus on documents can be replaced by a focus on concepts (45).
The traditional philosophy of classification system design implies that individual
entities (usually documents) can be treated as though they were independent
of one another. This basic fallacy not only results in the loss of important
informational links, but it is basically inefficient. Little or no effort is
made to establish a possible relationship between the entity being classified
and the entities already classified. There are exceptions to this rule, but
generally the building-block development of human knowledge is not perceptibly
reflected in traditional classification systems. In conventional word indexing
systems, the indexers cannot afford the time to establish linkages between concepts.
Each addition to the body of knowledge is treated as one of a series of independent
events, like molecules of a gas. But the literature is not an "ideal gas" -
the molecules interact. Similarly, the body of knowledge, partly embodied in
the literature, is composed of highly interrelated elements. It is a heavily
cross-linked network. The clearly-visible linkages are those ordinarily provided
by authors in the form of explicit citations. Less clearly seen are implicit
references as in eponyms and neologisms. Almost invisible linkages exist in
the natural language expressions which obscure the relationships, especially
to an unskilled observer. Conventional bibliography is essentially a simple
listing or inventory of publications which disregards most of the interrelationships
between the items in the inventory. In contrast, citation indexing integrates
this necessary and useful listing in a huge graph or network. In this graph,
each entity (in this case a document) is a node or vertex in a huge multi-dimensional
network. By analogy, this model of the literature (which Garfield considers
to be equivalent to man's knowledge) is like a large road map in which the cities
and towns share varying degrees of connectivity. Even the smallest hamlets are
nodes on the citation map of science.
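The step from a simple listing to a network can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Garfield's procedure: the four documents and their reference lists are hypothetical, and the function names are chosen for this example only.

```python
from collections import defaultdict

# Hypothetical documents: identifier -> identifiers of the documents it cites.
REFERENCES = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
}


def citation_index(references):
    """Invert the reference lists: cited document -> documents citing it."""
    cited_by = defaultdict(list)
    for doc_id, refs in references.items():
        for ref in refs:
            cited_by[ref].append(doc_id)
    return {doc: sorted(citers) for doc, citers in cited_by.items()}


def descendants(index, start):
    """Follow citation links forward from `start` to every dependent node."""
    seen, stack = set(), [start]
    while stack:
        for citer in index.get(stack.pop(), []):
            if citer not in seen:
                seen.add(citer)
                stack.append(citer)
    return sorted(seen)


INDEX = citation_index(REFERENCES)
```

The conventional bibliography corresponds to the keys of REFERENCES alone; the inverted index adds the linkages, so that descendants(INDEX, "A") recovers every document that builds, directly or indirectly, on "A".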
Garfield refers to previous work of his on this type of historical map (46).
The powerful technique illustrated by Figure 7 is reproduced from one of his
papers (45). Since each document is an "event" and bears a date, a graphical
history may be displayed, but with the important advantage of being able to
show the interrelationships among events. This is a legitimate starting point
for the historian.
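A dated sequence of this kind can be sketched as follows. The events, years and citation pairs are hypothetical, and the tabular output is only a stand-in for the graphical display described above.

```python
# Hypothetical events (identifier, year) and (citing, cited) pairs.
EVENTS = [("D", 1962), ("A", 1950), ("C", 1958), ("B", 1955)]
CITES = [("B", "A"), ("C", "A"), ("D", "B"), ("D", "C")]


def historical_map(events, cites):
    """Return rows (year, event, earlier events it draws on), oldest first."""
    year_of = dict(events)
    rows = []
    for event, year in sorted(events, key=lambda e: (e[1], e[0])):
        # Keep only links pointing to events that are not later in time.
        sources = sorted(cited for citing, cited in cites
                         if citing == event and year_of[cited] <= year)
        rows.append((year, event, sources))
    return rows
```

Each row places an event on the time axis and lists the earlier events to which it is linked, which is the information a plotted historical map would convey.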
There is clearly no technical obstacle to handling conceptual entities in
the same manner as documents. This would be of value to both the historical
and educational model types.
Garfield himself refers to the possibility of having such graphs displayed
directly onto a computer-controlled TV screen or plotted onto graph paper by
a plotting device. Computers currently plot such graphs on standard line printers
as output from the commonly-used PERT programs.
Garfield is only concerned with the time or historical dimension as a means
of sequencing entities, and only with the citation relationship between such
entities. There is no reason, however, why other dimensions and relationships
should not be used: geographical, educational, logical, etc., corresponding
in fact to more of the model-types listed in an earlier heading.
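One way of admitting several dimensions is to label each link with its relationship type, so that the same set of concepts yields a different map for each type. The Python sketch below assumes this design; the concepts, the links and the relation names are hypothetical.

```python
from collections import defaultdict

# Hypothetical concept links: (source, relation type, target).
LINKS = [
    ("feedback", "historical", "servomechanism"),
    ("feedback", "logical", "system"),
    ("homeostasis", "logical", "feedback"),
    ("homeostasis", "educational", "feedback"),
]


def build_network(links):
    """Index links by relation type so each dimension can be mapped separately."""
    network = defaultdict(lambda: defaultdict(set))
    for source, relation, target in links:
        network[relation][source].add(target)
    return network


def neighbours(network, concept, relation):
    """Return the concepts linked from `concept` along one relation type."""
    return sorted(network.get(relation, {}).get(concept, set()))


NETWORK = build_network(LINKS)
```

A dimension with no recorded links, such as "geographical" in this toy data, simply returns an empty list, so further relationship types can be added without restructuring what is already stored.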
4. Subject Classification Schemes
There is a wide variety of subject classification schemes for document handling.
The Universal Decimal Classification and Dewey systems have become widely used
but many other systems exist for specialized subject areas.
The most recent international review of these schemes, in the UNISIST Study
of the feasibility of a world science information system, has this comment to make:
"Librarians and information specialists would generally agree that a world-wide
scheme of subject categorization is needed to facilitate document and information
exchanges .... Opinions differ, however, when it comes to deciding which scheme
best suits the purpose. Several encyclopedic classifications are in competition
- the Dewey Decimal Classification, the List of Subject Headings used in the
Library of Congress, the Colon Classification, the Universal Decimal Classification,
etc. and although the last named has benefitted from extensive international
support through FID, it is by no means the unique candidate for worldwide recognition
as the standard subject category list. Its advantages and shortcomings were
examined by the UNISIST Working Group on Research Needs in Documentation, who
came to a twofold conclusion:
(a) organizational and technical measures could be taken to obviate the managerial
drawbacks of UDC, e.g. slow revision procedure, infrequent re-editions, etc.;
(b) on the other hand, no clear answer could be given to the more controversial
question of overall or local inadequacy, as regards the content and structures
of UDC divisions .... further studies and experiments are required to assess
the potential value of UDC in its present state, as the unique world list
of subject headings for broad categorization, or "shallow" indexing of documents."
As the UNISIST extract above acknowledges, UDC is one amongst many classification
schemes which are in competition. The tendency for different classifying groups
to favour different category breakdowns should be contained and facilitated
within an information system and not left to deteriorate into sordid squabbles
which do not recognize the value, for the advance of knowledge, of alternative views
and of a continuing effort at reconceptualization, restructuring and redefinition.
Also of interest is the UN/OECD Aligned List of Descriptors which has now
been developed into a "macrothesaurus". This is primarily oriented around mission-focused
topics which emerge in the work of the major intergovernmental agencies concerned
with economic and social development. From the perspective of this proposal,
the following operations have been blurred together:
- entities are labelled by terms
- terms have to be classified into semantic fields to be incorporated
- terms have to be translated and agreed as terms to avoid language dependence
When terms are dropped, the reverse procedure affecting the structure of the
list must be followed. Each of these steps involves operational and intellectual
difficulties which tend to slow down and resist modification. In addition, the
List makes great efforts to be flexible by being term-oriented. To do this it
has had to avoid hierarchical classification of any depth. This choice is not
in the interests of those users who need a "deep" classification structure.
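The three operations listed above can be kept apart in a record that holds the entity, its agreed terms and its semantic field as separate fields. The Python sketch below is one possible arrangement, not the structure of the Aligned List; the class, the function names and the sample entry are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    """One entity, with labelling, classification and translation held apart."""
    entity_id: str
    terms: Dict[str, str] = field(default_factory=dict)  # language -> agreed term
    semantic_field: Optional[str] = None                  # assigned in a distinct step


def label(entry, language, term):
    """Attach (or replace) the agreed term for one language."""
    entry.terms[language] = term


def classify(entry, semantic_field):
    """Place the entity in a semantic field, independently of its terms."""
    entry.semantic_field = semantic_field


def drop_term(entry, language):
    """Remove one language's term without touching the entity or its field."""
    entry.terms.pop(language, None)
```

For example, an entry labelled in English and French and then classified keeps its classification when the French term is dropped, because the entity rather than the term carries the structure.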
Originally (1967-68) it was intended that UNISIST should cover the basic natural
sciences but arguments were put forward for the inclusion of technology "or
at least some of its branches, especially medicine, agriculture, building and
construction". Ultimately, "the position of the ICSU/Unesco Central Committee
was that UNISIST should devote its primary effort to the basic sciences ...
and at the same time be sympathetic to a progressive inclusion of the applied
and engineering sciences - and eventually the social sciences - on an equal
footing with the former" (UNISIST Report, pp. 135-6). No time scale was given.
The special problems of social sciences are ignored in this vague intention
to broaden UNISIST. Whilst the latter may prove to be a dramatic success in
the field of the natural sciences, it is questionable whether the same
techniques can be successfully applied to the social sciences without
doing violence to the process by which the latter develop.
In the natural sciences, invariants in the objective world are represented
by signs which can in most cases be directly and unambiguously attached to the
object in question, to the satisfaction of the natural science community. The
sign for the object and the conceptualization of it are intimately and unambiguously
related. Another sign in another language may be used but the rules of transformation
are clear (the natural language verbiage is another matter, but is less significant).
It is a case of "one sign, one concept, one object". It is therefore possible
to infer that knowledge transfer tends to accompany information transfer. (This
inference may however be very dangerous in the case of non-Indo-European language
users, for whom the "objective" nature of the world may appear less significant).
But any extension of the world science information system, as it is conceived,
to the social sciences would only be of superficial significance if the above
distinctions were not reflected in the design of the system. This is because
in the social sciences, most of the debate concerns the relation between perceptual
invariants detected (by the consensus of a group), signs (selected by the group)
and the associated conceptual meaning - as has been recently pointed out by
Jean Piaget (47):
"All the social and human sciences are more or less closely concerned, in
their diachronic aspects, with the development of Knowledge (as a subject)
... The foregoing considerations show that the human sciences, in so far as
they necessarily include in their field of study the subject of knowledge
- the source of the logical and mathematical structures on which they depend
- do not merely maintain a set of interdisciplinary relations between one
another... but are part of an extensive circuit or network that really covers
all the sciences ... It was essential to recall this so as to be able to shape
our conclusions in such a way that they might succeed in revealing the true
significance of interdisciplinary relations.
"For their significance far exceeds that of a mere tool for facilitating work,
which is all they would amount to if used solely in a common exploration of
the boundaries of knowledge. This way of viewing collaboration between specialists
in different branches of knowledge would be the only possible one if we admitted
a thesis to which far too many research workers still unwittingly cling -
that the frontiers of each branch of knowledge are fixed once and for all,
and that they will inevitably remain so in the future. But the main object
of a work such as this ... is to push back the frontiers horizontally and
to challenge them transversally. The true object of interdisciplinary research,
therefore, is to reshape or reorganize the fields of knowledge, by means
of exchanges which are in fact constructive recombinations." [p. 521-524,
The natural sciences are therefore primarily interested in the debate on the,
usually tangible, content of categories (which are considered to be relatively
permanent), and the dynamic lies in subdividing the categories and discovering
relationships between their content. Whereas the social sciences, unable to
latch onto an unambiguous content, are primarily interested in the categories
themselves and their interrelationships, and the dynamic lies in reformulating,
reshaping, and regrouping the system of categories in an effort to get closer
to the content. Both natural and social science have conceptual parsimony as
a criterion, whereas the "sciences humaines" are interested in multiplying the
number of possible concepts and increasing their variety. It is clear that the
natural sciences could easily adjust to an arbitrary permanent category hierarchy,
whereas the social sciences would be straitjacketed and ill-served by
any such system.
Perhaps the clearest example of the need for a concept- or knowledge-oriented
approach in the case of the social sciences (as opposed to a subject/descriptor
approach) is given by the confusion of meanings associated with the concept
"democracy". Few people know that Unesco arranged an expert meeting to clarify
its meaning. The meeting concluded that at least thirty distinct meanings were
required and in use (48). The report was withdrawn from circulation for political
reasons - it is political dynamite. It means that in most international debates
(in which the word is a vital element of the consensus of interest and common
goal on which the discussion is founded) participants are simply talking past
one another, and resolutions containing the word are of questionable significance.
In fact, the multiplicity of interpretations implicit in term-oriented discussions
and report production may be considered a direct stimulus to the production
of further reports giving clarifying or alternative interpretations - thus
further clogging document systems.
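The difference between a term-oriented and a concept-oriented register can be sketched as follows. Each concept receives its own identifier and a term is merely one of its attributes, so a single word may point to several entries. The identifiers and meaning labels below are placeholders, not the meanings identified in the Unesco report.

```python
# Hypothetical register: concept identifier -> (term, placeholder meaning label).
CONCEPTS = {
    "C001": ("democracy", "placeholder meaning 1"),
    "C002": ("democracy", "placeholder meaning 2"),
    "C003": ("autonomy", "placeholder meaning 3"),
}


def concepts_for(term, concepts):
    """Return the identifiers of every concept expressed by `term`."""
    return sorted(cid for cid, (t, _) in concepts.items() if t == term)


def is_ambiguous(term, concepts):
    """A term is ambiguous when it labels more than one registered concept."""
    return len(concepts_for(term, concepts)) > 1
```

A document indexed under the term alone cannot say which entry is meant; one indexed under "C001" or "C002" can.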
5. Concept Dictionaries
The outstanding importance of dictionaries in the modern world explains why
some lexicographers are dissatisfied with the mechanical method of arranging
words in alphabetical order, and would prefer to classify them according to
the concepts which they express. One would be mistaken in believing that this
is a recent trend, since one finds tentative systematic vocabularies at Babylon
in the third millennium before Christ.
It would obviously be a great convenience if conceptual dictionaries of different
languages, periods, or single authors could conform to the same general pattern
so that they could be readily compared with one another. To this end one would
require a conceptual framework so comprehensive and yet so elastic that the
most diverse languages and the most idiosyncratic writers would fit smoothly
into it. Such a broad classification of concepts was put forward by R. Hallig
and W. von Wartburg in 1952.
The German research on "semantic fields" later inspired Georges Matoré, whose
La méthode en lexicologie, domaine français (Method in lexicology
in the field of the French language), Paris, Didier, 1953, offers (p. 70-4)
a diagram of a "comprehensive classification of lexicon facts" different from
that of Hallig and Wartburg, and, moreover, less satisfactory. At the eighth
International Congress of Linguists (Oslo, 1957) there was a (rather disappointing)
discussion on the subject "To what extent can meaning be said to be structured?"
(p. 636-704 of the Proceedings).
Needless to say, the Hallig-Wartburg system is only one of various possible
ways in which concepts could be classified; the aim was not so much to devise
an ideal scheme as to have a unique basis for specific investigations. If this
idea were to be widely adopted, a series of coordinated research projects could
be planned with sufficient flexibility to adapt the scheme to the material
examined, and yet with enough common ground to make the results comparable.
For a more detailed review of initiatives in this area, see Ullman (49) and
de Grolier (50).
6. "World Problems" Identification
The author is currently engaged in a project co-sponsored by the Union of International
Associations, Mankind 2000 and the Center for Integrative Studies. This is
an attempt to identify, "register" and describe worldwide problems with a view
to the publication of a Yearbook of World Problems (51). (Work to date
suggests that there may be some 2000-5000 such problems.) The approach is
similar in philosophy to that proposed here
for concepts. Classification of problems is seen as a second and distinct phase.
A crude model is being used to facilitate data collection. Two other models
will be used to plot problem interrelationships. It is hoped to be able to
map and plot problem networks.