Originally appeared in 1973 as part of Toward a Concept Inventory
There have been many attempts at isolating and classifying elements of meaning at the root of complex concepts. De Grolier (41) notes that methods and the need for them have been regularly discovered and rediscovered since the time of Leibniz or even earlier. He then states:
"We draw attention to these 'anteriorities', not in order to underrate the work performed by the various researchers or teams of researchers who, in most cases, truly believed that they had discovered a 'new method' but to persuade them, rather than to advocate unilaterally any one 'exclusive' process, to agree that they are all engaged in work on common basic principles, whatever may be the differences (at times very minor) in the coding method or the particular type of machine adopted."
De Grolier has summarized the work on classification around the world but only a few initiatives seem to be directly related to this project. Usually the work has been directed towards solving a classification problem in some particular field, which strongly influences the design of the scheme. The following, noted by de Grolier, are of more direct relevance:
1.1 Perry and Kent (Western Reserve University): Developed a coding method for the field of metallurgy based on 'semantic analysis' of complex terms into 'individual terms'. 30,000 terms were assembled from a variety of sources. The notation is however very cumbersome.
1.2 S.M. Newman (U.S. Patent Office): A "vast attempt at defining or redefining concepts, which could perhaps be entitled - to paraphrase a famous title - 'In search of lost simplicity': to discover or rediscover non-equivocal terms beyond the complications of natural language, which 'unfortunately' does not have 'uniform or logical rules for the denomination of devices or things'". In effect this is an attempt at creating a metalanguage, but it again results in a cumbersome notation.
1.3 C.G. Smith (U.S. Patent Office): Suggested a system which would isolate "ultimate concepts .... required in the definition of more specific concepts .... There is a basic layer of concepts which do not require definition. It is the use of such elemental concepts which is contemplated in the present system .... A fundamental feature is to seek beneath composite words the basic organization of elemental concepts which they represent, and to develop the essential combination for the definition of these words." (42) This was conceived mainly for patentable contrivances on the US Patent Office Interrelated Logic Accumulating Scanner. It does however permit chains of related concepts to be handled.
1.4 Cordonnier: Worked on methods "to symbolize the elementary points of view of the classification of ideas and .... to study the grouping of these symbols in order to obtain composite symbols representing the structure of complex concepts". He also suggests that "intuition permits the representation in an intellectual space of a logical figure, to n dimensions, a synthesis of the relationships between a group of ideas into the different classes which arrange them naturally according to the various possible individual viewpoints".
1.5 M.E. Stevens: Worked on the use of computers to handle interrelationships between terms and to 'define', by supplying the generic and descriptive terms related to the term of which the definition is sought; 'develop', by furnishing specific examples of a generic term; 'localize', by indicating the place which can be associated with the proposed concept; 'match', by comparing several proposed terms together, in order to find a 'common point' making it possible to relate to these terms another term possessing the same characteristic; and carry out other logical operations. (43)
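Three of the operations attributed to Stevens can be illustrated with a minimal sketch. It assumes a toy table of generic (broader) terms; the table, the sample terms and the function names are hypothetical and are not taken from Stevens' work.

```python
from typing import Dict, Set

# Hypothetical toy thesaurus: each term maps to its recorded generic terms.
BROADER: Dict[str, Set[str]] = {
    "steel": {"alloy"},
    "bronze": {"alloy"},
    "alloy": {"material"},
}


def define(term: str) -> Set[str]:
    """'Define': supply the generic terms recorded for a term."""
    return set(BROADER.get(term, set()))


def develop(generic: str) -> Set[str]:
    """'Develop': furnish the specific terms recorded under a generic term."""
    return {term for term, parents in BROADER.items() if generic in parents}


def match(*terms: str) -> Set[str]:
    """'Match': find the generic terms shared by all proposed terms."""
    generic_sets = [define(term) for term in terms]
    return set.intersection(*generic_sets) if generic_sets else set()
```

Here `match("steel", "bronze")` yields the shared generic term, which is one simple reading of the 'common point' mentioned above.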
Work has been in progress for some years at the M.I.T. Center for International Studies on the development of very general systems for time-shared computer data management (44). An item of data is treated as a sequence of categories of information in n-adic relations applied to a specific entity. N-adic data descriptions for social science propositional inventories are noted as being quite complicated, e.g. 'violence' is 'power' over 'power' over 'well-being'.
The ADMINS system makes use of a "calculus of relations" for stating the derivation of a new relation that draws on those already existing, and which yields a new relational record between particular entities. It is in the structuring of the programming language around the relational record and in achieving intimate interaction with many storage levels that this system differs from most procedure languages.
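The idea of deriving a new relation from existing ones can be sketched as follows. This is only an illustration using ordinary composition of binary relations; it does not reproduce ADMINS notation, and the relation names and sample records are hypothetical.

```python
from typing import Set, Tuple

# A binary relation is modelled as a set of (left, right) pairs.
Relation = Set[Tuple[str, str]]


def compose(r: Relation, s: Relation) -> Relation:
    """Derive the pairs (a, c) for which (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


# Hypothetical existing relational records.
member_of: Relation = {("Smith", "Committee A"), ("Jones", "Committee B")}
reports_to: Relation = {("Committee A", "Council"), ("Committee B", "Council")}

# A new relational record between particular entities, derived from the two above.
answerable_to: Relation = compose(member_of, reports_to)
```

The derived relation is itself a set of records, so it can serve as input to further derivations.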
The citation indexing method described by Eugene Garfield (and implemented in the form of the Science Citation Index and the recently initiated Social Science Citation Index) is of great interest to this project if the focus on documents can be replaced by a focus on concepts (45). The traditional philosophy of classification system design implies that individual entities (usually documents) can be treated as though they were independent of one another. This basic fallacy not only results in the loss of important informational links, but is also basically inefficient. Little or no effort is made to establish a possible relationship between the entity being classified and the entities already classified. There are exceptions to this rule, but generally the building-block development of human knowledge is not perceptibly reflected in traditional classification systems. In conventional word indexing systems, the indexers cannot afford the time to establish linkages between concepts.
Each addition to the body of knowledge is treated as one of a series of independent events, like molecules of a gas. But the literature is not an "ideal gas" - the molecules interact. Similarly, the body of knowledge, partly embodied in the literature, is composed of highly interrelated elements. It is a heavily cross-linked network. The clearly visible linkages are those ordinarily provided by authors in the form of explicit citations. Less clearly seen are implicit references as in eponyms and neologisms. Almost invisible linkages exist in the natural language expressions which obscure the relationships, especially to an unskilled observer. Conventional bibliography is essentially a simple listing or inventory of publications which disregards most of the interrelationships between the items in the inventory. In contrast, citation indexing integrates this necessary and useful listing in a huge graph or network. In this graph, each entity (in this case a document) is a node or vertex in a huge multi-dimensional network. By analogy, this model of the literature (which Garfield considers to be equivalent to man's knowledge) is like a large road map in which the cities and towns share varying degrees of connectivity. Even the smallest hamlets are nodes on the citation map of science.
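The difference between a simple listing and a citation index can be made concrete with a minimal sketch: the reference lists supplied by authors are inverted so that each item points to the later items that cite it. The document identifiers are hypothetical, and the function is an illustration rather than a description of how the Science Citation Index is produced.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical inventory: each document with the documents it explicitly cites.
references: Dict[str, List[str]] = {
    "D3": ["D1", "D2"],
    "D2": ["D1"],
    "D1": [],
}


def citation_index(refs: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the reference lists: map each cited item to the items citing it."""
    cited_by: Dict[str, List[str]] = defaultdict(list)
    for citing, cited_items in refs.items():
        for cited in cited_items:
            cited_by[cited].append(citing)
    # Sort each list so the result does not depend on input order.
    return {cited: sorted(citing) for cited, citing in cited_by.items()}
```

Nothing in the sketch depends on the nodes being documents; the same inversion applies if the identifiers stand for concepts.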
Garfield refers to previous work of his on this type of historical map (46). Figure 7, reproduced from one of his papers (45), illustrates this powerful technique. Since each document is an "event" and bears a date, a graphical history may be displayed, but with the important advantage of being able to show the interrelationships among events. This is a legitimate starting point for the historian.
There appears to be no technical obstacle to handling conceptual entities in the same manner as documents. This would be of value to both the historical and educational model types.
Garfield himself refers to the possibility of having such graphs displayed directly onto a computer-controlled TV screen or plotted onto graph paper by a plotting device. Computers currently plot such graphs on standard line printers as output from the commonly-used PERT programs.
Garfield is only concerned with the time or historical dimension as a means of sequencing entities, and only with the citation relationship between such entities. There is no reason, however, why other dimensions and relationships should not be used: geographical, educational, logical, etc., corresponding in fact to more of the model-types listed in an earlier heading.
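A minimal sketch of this generalization follows, assuming that each link between entities carries a label naming the kind of relationship, and that each entity carries a date for historical sequencing. The class names, the relationship labels and all sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Concept:
    name: str
    year: int  # hypothetical date used for historical sequencing


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: str  # e.g. "citation", "logical", "geographical", "educational"


# Hypothetical sample data.
CONCEPTS = [
    Concept("concept B", 1960),
    Concept("concept A", 1950),
    Concept("concept C", 1970),
]
LINKS = [
    Link("concept B", "concept A", "citation"),
    Link("concept C", "concept A", "logical"),
    Link("concept C", "concept B", "citation"),
]


def links_of_kind(links: List[Link], kind: str) -> List[Link]:
    """Select one dimension of the network, e.g. only the logical links."""
    return [link for link in links if link.kind == kind]


def chronological(concepts: List[Concept]) -> List[str]:
    """Sequence the entities along the historical dimension."""
    return [c.name for c in sorted(concepts, key=lambda c: (c.year, c.name))]
```

Each relationship type then gives a different view of the same set of entities, with the citation view being only one of them.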
There is a wide variety of subject classification schemes for document handling. The Universal Decimal Classification and Dewey systems have become widely used but many other systems exist for specialized subject areas.
The most recent international review of these schemes in the UNISIST Study of the feasibility of a world science information system has this comment to make:
"Librarians and information specialists would generally agree that a world-wide scheme of subject categorization is needed to facilitate document and information exchanges .... Opinions differ, however, when it comes to deciding which scheme best suits the purpose. Several encyclopedic classifications are in competition - the Dewey Decimal Classification, the List of Subject Headings used in the Library of Congress, the Colon Classification, the Universal Decimal Classification, etc. and although the last named has benefitted from extensive international support through FID, it is by no means the unique candidate for worldwide recognition as the standard subject category list. Its advantages and shortcomings were examined by the UNISIST Working Group on Research Needs in Documentation, who came to a twofold conclusion:
(a) organizational and technical measures could be taken to obviate the managerial drawbacks of UDC, e.g. slow revision procedure, infrequent re-editions, etc.;
(b) on the other hand, no clear answer could be given to the more controversial question of overall or local inadequacy, as regards the content and structures of UDC divisions .... further studies and experiments are required to assess the potential value of UDC in its present state, as the unique world list of subject headings for broad categorization, or "shallow" indexing of documents." (p.95)
As the UNISIST extract above acknowledges, UDC is one amongst many classification schemes which are in competition. The tendency for different classifying groups to favour different category breakdowns should be contained and facilitated within an information system and not left to deteriorate into sordid squabbles which do not recognize the value to knowledge advance of alternative views, and a continuing effort at reconceptualization, restructuring and redefinition of knowledge.
Also of interest is the UN/OECD Aligned List of Descriptors which has now been developed into a "macrothesaurus". This is primarily oriented around mission-focused topics which emerge in the work of the major intergovernmental agencies concerned with economic and social development. From the perspective of this proposal, the following operations have been blurred together:
When terms are dropped, the reverse procedure affecting the structure of the list must be followed. Each of these steps involves operational and intellectual difficulties which tend to slow down and resist modification. In addition, the List makes great efforts to be flexible by being term-oriented. To do this it has had to avoid hierarchical classification of any depth. This choice is not in the interests of those users who need a "deep" classification structure.
Originally (1967-68) it was intended that UNISIST should cover the basic natural sciences but arguments were put forward for the inclusion of technology "or at least some of its branches, especially medicine, agriculture, building and construction". Ultimately, "the position of the ICSU/Unesco Central Committee was that UNISIST should devote its primary effort to the basic sciences ... and at the same time be sympathetic to a progressive inclusion of the applied and engineering sciences - and eventually the social sciences - on an equal footing with the former" (UNISIST Report, pp. 135-6). No time scale was given.
The special problems of social sciences are ignored in this vague intention to broaden UNISIST. Whilst the latter may prove to be a dramatic success in the field of the natural sciences, it is questionable whether the same techniques can be successfully applied to the social sciences without doing violence to the process by which the latter develop.
In the natural sciences, invariants in the objective world are represented
by signs which can in most cases be directly and unambiguously attached to the
object in question, to the satisfaction of the natural science community. The
sign for the object and the conceptualization of it are intimately and unambiguously
related. Another sign in another language may be used but the rules of transformation
are clear (the natural language verbiage is another matter, but is less significant).
It is a case of "one sign, one concept, one object". It is therefore possible
to infer that knowledge transfer tends to accompany information transfer. (This
inference may however be very dangerous in the case of non-Indo-European language
users, for whom the "objective" nature of the world may appear less significant).
But any extension of the world science information system, as it is conceived,
to the social sciences would only be of superficial significance if the above
distinctions were not reflected in the design of the system. This is because
in the social sciences, most of the debate concerns the relation between perceptual
invariants detected (by the consensus of a group), signs (selected by the group)
and the associated conceptual meaning - as has been recently pointed out by
Jean Piaget (47):
"All the social and human sciences are more or less closely concerned, in their diachronic aspects, with the development of Knowledge (as a subject) ... The foregoing considerations show that the human sciences, in so far as they necessarily include in their field of study the subject of knowledge - the source of the logical and mathematical structures on which they depend - do not merely maintain a set of interdisciplinary relations between one another... but are part of an extensive circuit or network that really covers all the sciences ... It was essential to recall this so as to be able to shape our conclusions in such a way that they might succeed in revealing the true significance of interdisciplinary relations.
"For their significance far exceeds that of a mere tool for facilitating work, which is all they would amount to if used solely in a common exploration of the boundaries of knowledge. This way of viewing collaboration between specialists in different branches of knowledge would be the only possible one if we admitted a thesis to which far too many research workers still unwittingly cling - that the frontiers of each branch of knowledge are fixed once and for all, and that they will inevitably remain so in the future. But the main object of a work such as this ... is to push back the frontiers horizontally and to challenge them transversally. The true object of interdisciplinary research, therefore, is to reshape or reorganize the fields of knowledge, by means of exchanges which are in fact constructive recombinations." (p. 521-524, emphasis added)
The natural sciences are therefore primarily interested in the debate on the,
usually tangible, content of categories (which are considered to be relatively
permanent), and the dynamic lies in subdividing the categories and discovering
relationships between their content. Whereas the social sciences, unable to
latch onto an unambiguous content, are primarily interested in the categories
themselves and their interrelationships, and the dynamic lies in reformulating,
reshaping, and regrouping the system of categories in an effort to get closer
to the content. Both natural and social science have conceptual parsimony as
a criterion, whereas the "sciences humaines" are interested in multiplying the
number of possible concepts and increasing their variety. It is clear that the
natural sciences could easily adjust to an arbitrary permanent category hierarchy,
whereas the social sciences would be strait-jacketed and ill-served by
any such system.
Perhaps the clearest example of the need for a concept- or knowledge-oriented approach in the case of the social sciences (as opposed to a subject/descriptor approach) is given by the confusion of meanings associated with the concept "democracy". Few people know that Unesco arranged an expert meeting to clarify its meaning. The meeting concluded that at least thirty distinct meanings were required and in use (48). The report was withdrawn from circulation for political reasons - it is political dynamite. It means that in most international debates (in which the word is a vital element of the consensus of interest and common goal on which the discussion is founded) participants are simply talking past one another, and resolutions containing the word are of questionable significance. In fact, the multiplicity of interpretations implicit in term-oriented discussions and report production may be considered a direct stimulus to the production of further reports giving clarifying or alternative interpretations - thus further clogging document systems.
The outstanding importance of dictionaries in the modern world explains why
some lexicographers are dissatisfied with the mechanical method of arranging
words in alphabetical order, and would prefer to classify them according to
the concepts which they express. One would be mistaken in believing that this
is a recent trend, since one finds tentative systematic vocabularies at Babylon
in the third millennium before Christ.
It would obviously be a great convenience if conceptual dictionaries of different languages, periods, or single authors could conform to the same general pattern so that they could be readily compared with one another. To this end one would require a conceptual framework so comprehensive and yet so elastic that the most diverse languages and the most idiosyncratic writers would fit smoothly into it. Such a broad classification of concepts was put forward by R. Hallig and W. von Wartburg in 1952.
The German research on "semantic fields" later inspired Georges Matoré, whose La méthode en lexicologie, domaine français (Method in lexicology in the field of the French language), Paris, Didier, 1953, offers (p. 70-4) a diagram of a "comprehensive classification of lexicon facts" different from that of Hallig and Wartburg, and, moreover, less satisfactory. At the eighth International Congress of Linguists (Oslo, 1957) there was a (rather disappointing) discussion on the subject "To what extent can meaning be said to be structured?" (p. 636-704 of the Proceedings).
Needless to say, the Hallig-Wartburg system is only one of various possible ways in which concepts could be classified; the aim was not so much to devise an ideal scheme as to have a unique basis for specific investigations. If this idea were to be widely adopted, a series of coordinated research projects could be planned with sufficient flexibility to adapt the scheme to the material examined, and yet with enough common ground to make the results comparable.
For a more detailed review of initiatives in this area, see Ullman (49) and de Grolier (50).
The author is currently engaged in a project co-sponsored by the Union of International Associations, Mankind 2000 and the Center for Integrative Studies. This is an attempt to identify, "register" and describe worldwide problems with a view to the publication of a Yearbook of World Problems (51). (Work to date suggests that there might be some 2000-5000 such problems.) The approach is similar in philosophy to that proposed here for concepts. Classification of problems is seen as a second and distinct phase. A crude model is being used to facilitate data collection. Two other models will be used to plot problem interrelationships. It is hoped to be able to map and plot problem networks.