Documentation or Knowledge?
'The possibilities open to thinking are the possibilities of recognising relationships and the discovery of techniques of operating with relationships on the mental or intellectual plane, such as will in turn lead to ever wider and more penetratingly significant systems of relationships.'
(B.L. Whorf. Language, Thought and Reality)
This paper addresses the practical problems of developing a means of filing concepts and other theoretical constructs in a data bank. Such concepts would be filed as entities having a distinct meaning and not in terms of the word by which they happen to be represented in a particular school of thought. The reason for this approach is that many of the words on which most reliance is placed in the social sciences, e.g. 'group', 'class', 'power', or 'structure', have acquired a multiplicity of overlapping meanings (1, 2).
The concept file so created would be used to generate lists, to facilitate classification and interrelation of concepts to produce concept thesauri, and, finally, to facilitate the allocation of 'authoritative' terms to permit the production of terminological thesauri.
The object of this project would be to ensure that any qualified person -- with a few safeguards -- would be free to register entities in the file which would then become available for secondary analysis at any interested research centre.
One form such analysis might take would be the construction and comparison of various models or classification schemes for theoretical entities. At a tertiary level, efforts could be made to link such entities with each other, cutting across the boundaries of disciplines, ideologies, epistemological approaches, paradigms or problems. This activity would provide new alternative means of approaching the entities held on the file but would not affect their use for more restricted purposes.
In this paper particular attention has been paid to some of the techniques available to analyse complex entity networks or structures. Because of this complexity and the problems of comprehending it, the use of interactive computer graphics has been examined as a powerful means of simplifying the task and making the project more widely significant.
A project to handle, structure, and analyse theoretical constructs is proposed, to be operated in three distinct phases: a concept inventory phase, a concept classification phase, and a term allocation phase.
A translation phase, to make the project more widely relevant, would run in parallel with the above three. Each succeeding phase builds on the previous one, but need not necessarily follow it immediately in time for the project as a whole to be of value.
1. Concept Inventory Phase
A computer-based concept registration or tagging system should be set up which would allocate sequence numbers to concepts on a continuing basis. The criteria for concept registration should be kept to a minimum to ensure that the system remains "open" to a wide variety of users and contributors.
This approach permits rapid inclusion and organisation of the data and rapid production of updated concept lists. These would facilitate the scrutiny of the data in later phases and in terms of the perceptions of different need groups.
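The registration step described above can be illustrated with a minimal sketch. It assumes an in-memory file and a registration criterion limited to a free-text description plus the contributor's name; the class name `ConceptRegistry` and its fields are hypothetical and are not part of the proposal.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ConceptRegistry:
    """Hypothetical concept inventory that allocates sequence numbers."""
    _next: count = field(default_factory=lambda: count(1))
    entries: dict = field(default_factory=dict)  # sequence number -> record

    def register(self, description: str, contributor: str) -> int:
        # The only criterion applied here is that a description is supplied;
        # no classification code is needed to file the entity.
        seq = next(self._next)
        self.entries[seq] = {"description": description,
                             "contributor": contributor}
        return seq

    def listing(self):
        # Updated concept list in sequence-number order.
        return [(seq, rec["description"])
                for seq, rec in sorted(self.entries.items())]


registry = ConceptRegistry()
# Two distinct meanings that different schools might both label 'group'
# are filed as separate entities.
a = registry.register("a set of persons in regular interaction", "contributor A")
b = registry.register("a category of persons sharing an attribute", "contributor B")
```

Because the sequence number carries no meaning, registration stays fast and does not commit the file to any classification scheme.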
2. Concept Classification Phase
Evaluation, classification and identification of concept interrelationships would be made independently by a limited number of contributing groups, possibly associated organisationally with the international academic bodies. These groups would be primarily concerned with allocating codes to be fed back to the computer system so that ordered and refined concept thesauri could be produced to reflect the perceptions and needs of the contributing groups. An important aspect of this coding function by groups would be the rejection of those registered concepts which are considered to be of little value to the group's perspective.
From the computer data handling point of view, each contributing group would be building, refining, and maintaining its own 'model'. Each such model would be handled as an independent optional qualifier on the sequentially-ordered concept list.
From the point of view of any such group, the computer system would be viewed as holding the concepts in which it is interested in the order of its own preferred classification scheme.
There would of course be the opportunity at any time to look at the same concept list through the classification scheme of any other contributing school of thought. Concepts would be identified by their sequential number plus a number which would identify the model employed.
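One way to picture a model as an optional qualifier on the sequentially ordered list is sketched below. The classification codes, the two model numbers, and the `sequence/model` form of the identifier are illustrative assumptions; the text only requires that a concept be identified by its sequence number plus a model number.

```python
# Sequentially ordered concept list (sequence number -> description).
concepts = {
    1: "a set of persons in regular interaction",
    2: "a category of persons sharing an attribute",
    3: "capacity to affect the behaviour of others",
}

# Each model is an independent qualifier: model number -> {sequence number -> code}.
# A concept a group rejects is simply left uncoded in that group's model.
models = {
    1: {1: "B.2", 2: "A.1"},
    2: {1: "C.1", 2: "C.3", 3: "A.5"},
}


def view(model_no):
    """Concept list in the order of one group's classification scheme."""
    codes = models[model_no]
    ordered = sorted(codes, key=lambda seq: codes[seq])
    return [(codes[seq], seq, concepts[seq]) for seq in ordered]


def identifier(seq, model_no):
    """Hypothetical notation: sequence number plus model number."""
    return f"{seq}/{model_no}"
```

Calling `view(1)` and `view(2)` presents the same file through two different schemes, and neither call alters the underlying sequence numbers.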
3. Term Allocation Phase
At a later stage users of one model might find it useful to produce an 'authoritative' list of terms to be used for those concepts of interest to them. This could also be incorporated into the computer system. Such terms could then be used to produce standard terminological thesauri for the users of one model.
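As a rough sketch of term allocation, the example below attaches 'authoritative' terms to sequence numbers separately for each model and then lists them alphabetically. The terms, model numbers, and the function name `terminological_thesaurus` are assumptions made for illustration only.

```python
# Concept file (sequence number -> description).
concepts = {
    1: "a set of persons in regular interaction",
    2: "a category of persons sharing an attribute",
}

# Authoritative terms per model: model number -> {sequence number -> term}.
terms = {
    1: {1: "group", 2: "class"},
    2: {1: "collectivity", 2: "group"},
}


def terminological_thesaurus(model_no):
    """Alphabetical term list for the users of one model."""
    allocated = terms[model_no]
    return sorted((term, seq, concepts[seq]) for seq, term in allocated.items())
```

In this sketch the word 'group' points to concept 1 for users of model 1 and to concept 2 for users of model 2, which is the kind of overlap the concept file is meant to make explicit.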
1. The simple and unambiguous administrative task of filing entities is merged into the complex intellectual task of coding and classifying them. This makes the whole project lengthy, costly, and complex.
In this project the identification of entities to be included in a thesaurus and the practical problems of incorporating these entities into an information system are distinguished from the theoretical problems of classifying and interrelating such entities. The first is a relatively fast and unskilled operation and the second is a relatively slow and skilled one.
The technique of identifying the entity within the system by a numerical tag derived from a classification scheme is avoided. The savings in labor associated with this technique are only significant in a system in which all operations are manual. Where computers can be used, the two types of operation can be distinguished in order to save resources, speed up operations and increase the flexibility of reconceptualization of any classification scheme.
2. The classification of theoretical constructs may be associated with an intellectual and material investment in a document physical-location system. This opposes any flexibility or major reconceptualization of relationships between entities.
In this project there is no direct relationship between the classification scheme(s) and the physical problem of locating source documents.
3. The classification scheme may be rigid and "final", based upon a high commitment to a particular set of theoretical assumptions of limited comprehensiveness, and therefore unable to adapt to new types of interrelationships.
In this project both rigid and rapidly evolving classification schemes can be used to interrelate the entities handled.
4. The classification scheme may be exclusive or "inhospitable" and therefore of limited use.
In this project both exclusive and hospitable schemes may be used. This gives it a wide range of uses.
5. Some systems are specifically designed with the special problems of a particular field of knowledge in mind. This makes them difficult to use in other areas.
In this project both specialized and general schemes can be used. Overdesigning the information handling system to meet immediately-perceived needs would reduce its usefulness and relevance to others and therefore increase the difficulty of ensuring adequate funds over a long period. (The degree of "hygiene" introduced may be inversely proportional to the utility or relevance of the system to potential users.)
6. Even adequate universal schemes may become viewed as authoritarian and a vehicle for some form of conceptual imperialism. Unfortunately the organization of relations between entities is equated with the imposition of a new set of relations. The organizers are perceived as acquiring power. Exclusive or rigid schemes, once created, are viewed and defended as unique and "universally applicable" by their proposers, thus eliminating any possibility of more comprehensive, better-funded joint efforts.
In this project, every effort has been made to ensure that it does not become associated with particular schools of thought, organizations or personalities who might resent criticism of their perspective and alienate potential collaborators. All such individualism is contained within the model building activity which does not jeopardize other models or the project as a whole.
7. The actual procedures for incorporating new entities into any "approved" list within the system may appear bureaucratic and stultifying unless the system is user-oriented. There is therefore the old problem of minimizing the bureaucratic desire for due process and order and maximizing user participation.
In this project suggestions have been made concerning means of maximizing user participation.
8. The system may be designed with only one type of user in mind, e.g. scholars or students. New systems, which compete for the same resources, then have to be created for other users of the same data.
In this project some consideration has been given to methods of introducing "filters" in conjunction with special models in order to show special relationships between entities in a manner significant to other types of user. Some of the needs of users not immersed in the Western cultural perspective have also been considered.
9. The notation used to indicate the position of an entity in a classification scheme may be very complex. This may make data handling very difficult.
In this project it is not necessary to use a notation in order to file the entity. Only a simple sequence number is required. A standard notation for use in print, but independent of the organization of the system, has been suggested.
10. The system may be viewed as a "one-shot" job using all the appropriate specialists. This is the case with some concept directories. Even so, nonparticipants criticize the position taken by the participants, thus suggesting the need for new projects.
In this project it is not necessary to limit classification to the views of one specialist. A number of competing specialists can participate together or separately without jeopardizing the ability of the system to adapt and respond to new proposals.
11. Systems may be slow (up to decades) in responding to proposals for change, to the point of acting as a constraint on innovation to those dependent upon them.
In this project, modifications and alternatives can be handled without difficulty.
12. A system proposal may raise problems of standardization for purposes of handling bibliographical or other data. The system design may then become a pawn in debate between the different schools of standardization and information handling.
In this project there are no features which could become a major issue in the ongoing debate, since it is not a conventional documentation system and does not have major bibliographical concerns.
13. A system proposal may constitute a threat to other systems competing for the same resources -- particularly if major changes are proposed for existing systems.
This project does not appear to compete with other systems. It can be considered complementary to some documentation systems.
14. A system may demand, or be designed for, complex computer systems to the point of being unusable in less-richly-endowed environments.
This project is based on a very simple filing system for entities and relationships between them. The resulting file may however then be subjected to analyses of varying power depending on the computer environment available.
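A minimal sketch of this idea follows: the file itself is only a list of sequence-number pairs, while the analyses applied to it can range from a simple neighbour lookup to a traversal of everything reachable from one entity. The sample pairs and the function names are hypothetical, and relationships are assumed to be undirected for simplicity.

```python
from collections import defaultdict, deque

# Relationship file: pairs of sequence numbers (assumed undirected here).
relationships = [(1, 2), (2, 3), (4, 5)]


def adjacency(pairs):
    """Simple analysis: which entities are directly related to each entity."""
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def connected(adj, start):
    """More powerful analysis: every entity reachable from `start`
    (breadth-first search over the relationship network)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen


adj = adjacency(relationships)
```

The neighbour lookup needs little more than a sorted file, whereas the traversal, and the graph-theoretic techniques cited in the notes, can be added where the computer environment allows.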
15. A system design may raise fundamental theoretical issues, and therefore alienate important potential supporters.
In this project the accent is on providing a simple technique for filing entities and relationships in a way which permits a number of general analytical and display techniques to be used. Every effort has been made to avoid giving a final and exclusive definition of what is incorporated. Such theoretical debates are carefully confined to the activities of modelling groups which are each free to ignore or accept entities and relationships filed by other modelling groups.
The next step is to obtain critical comments on the various proposals put forward and to undertake pilot projects in some of the following areas:
Exactly how much pilot project activity is required will depend upon the speed with which it is desired that the project as a whole should move forward and the range of interests it is desired that the project should serve. These must be decided.
No comments have been made on the funding required, since cost estimation depends on decisions taken for the next stage. The computer programs envisaged for the filing of entities and relationships and the generation of lists and thesauri are, however, fairly simple to prepare and cheap to run. The other major costs would be collection of conceptual entities (unless done voluntarily by a team using existing material), administration (unless incorporated within the budget of some existing institute) and travel costs of those concerned with modelling (unless it was decided to switch immediately to the postal modelling concept outlined).
1. Fred W. Riggs. Concepts, Words and Terminology. Honolulu, University of Hawaii, Social Sciences Research Institute, 1971, 66 p. (Committee on Conceptual and Terminological Analysis, Working Paper No. 13.)
2. Giovanni Sartori. Concept misformation in comparative politics. American Political Science Review, 64, December 1970, 4, pp. 1033-1053.
3. T. E. Dalenius and O. Frank. Control of classification. Review of the International Statistical Institute, 36, 3, 1968, pp. 279-295 (includes formal description of classification and introduces various parameters useful for control purposes).
4. N. Jardine and R. Sibson. Mathematical taxonomy. Wiley, 1971.
5. UNESCO. UNISIST; study report on the feasibility of a world science information system. Paris, Unesco, 1971.
6. A. Martinet. Arbitraire linguistique et double articulation. Cahiers Ferdinand de Saussure, 15, 1957, 107 (cited by Georges Mounin, Les problèmes théoriques de la traduction, Paris, Gallimard, 1963, p. 122-123).
7. G. Sjöblom. Theoretical testing of approaches in political science. Paper presented at a conference of the International Studies Association, Bellagio, 1971.
8. Eric de Grolier. A Study of General Categories applicable to Classification and Coding in Documentation. Paris, UNESCO, 1963, pp. 17-60, 61-142, 143-158.
9. Geoffrey Vickers. Value Systems and the Social Process. London, Pelican, 1971.
10. Paragraph numbers refer to columns in figure 3 of the computer record layout. No attempt has been made at this preliminary stage to indicate how many character positions would be required for each zone in the record.
11. Jacques Halkin. Proposal and wishes for an open structure in the communication of information. Scheduled for publication in: A.I.Mikhailov (Ed) The Theoretical Problems of Information Retrieval Systems. (The Hague, International Federation for Documentation, 1971)
12. OECD / Centre for Educational Research and Innovation. Interdisciplinarity; problems of teaching and research in the universities. Paris, OECD, 1972, 321 p.
13. A. Kaufman. Graphs, dynamic programming and finite games. Academic, 1967.
14. Claude Berge. Théorie des graphes et ses applications. Paris, Dunod, 1958, 277 p.
15. Claude Flament. Théorie des graphes et structures sociales. Paris, Mouton, 1965.
16. J. Clyde Mitchell (Ed). Social Networks in Urban Situations, Manchester University Press, 1969
17. Norman Schofield. A topological model of international relations. (Paper presented to Peace Research International meeting, London, 1971.)
18. George M. Beal et al. System linkages among women's organizations. Department of Sociology and Anthropology, Iowa State University, 1967.
19. Robert O. Anderson. A sociometric approach to the analysis of interorganizational relationships. Institute for Community Development and Services, Michigan State University, 1969.
20. D. Cartwright. The potential contributions of graph theory to organisation theory. In: M. Haire (Ed.) Modern Organization Theory, Wiley, 1959.
21. In the field of documentation a thesaurus may be represented 'graphically' but more for the visual presentation facility than for any graph theoretic possibilities. For example: the 'genetic maps' of the U.S. Armed Services Technical Information Agency (ASTIA), the concentric circle diagram of the Technische Dokumentatie- en Informatie Centrum voor de Krijgsmacht (TDCK, The Hague), the arrow diagrams used by EURATOM and the Bureau d'etudes van Dijk in Brussels (see Figure 4). See also the computer established 'association maps' of Lauren B. Doyle (Indexing and abstracting by association. American Documentation, October 1962). See also: Kurt Lewin. Principles of Topological Psychology. McGraw-Hill, 1936; E. Zierer. The theory of graphs in linguistics. Mouton, 1970, 62 p.; R. Quillian. Semantic memory. In: M. Minsky (Ed.). Semantic Information Processing. M.I.T., 1968, pp. 225-270; R.B. Banerji. A language for the description of concepts. Unpublished paper, System Research Center, Case Institute of Technology, 1964.
22. V.E. Benes. Mathematical Theory of Connecting Networks and Telephone Traffic. Academic, 1965, p. 53.
23. C. Berge. The Theory of Graphs and its Applications. Methuen, 1962.
24. C. Flament. Applications of Graph theory to Group Structure. Prentice-Hall, 1963
25. F. Harary and R. Z. Norman. Graph Theory as a Mathematical Model in Social Sciences. University of Michigan, 1953.
26. F. Harary, R. Z. Norman and D. Cartwright. Structural Models: an introduction to the theory of directed graphs. Wiley, 1965.
27. This term is used widely to cover both the more common 'alphascopes', which can display letters and numbers on predetermined lines, and the 'vector displays' with light-pen facility, which can also generate lines and curves. It is the latter device which is discussed here. See, for example: Ivan Sutherland. Computer displays. Scientific American, 222, June 1970, pp. 56-8. Interactive graphics in data processing. IBM Systems Journal, 7, 3 and 4, 1968, whole double issue. Computer Graphics 1970; an international symposium. Brunel University, 1970, 3 vols.
28. Dean Brown and Joan Lewis. The process of conceptualisation; some fundamental principles of learning useful in teaching with or without the participation of computers. Educational Policy Research Center, Stanford Research Institute, 1968, pp. 16-18.
29. Jay Forrester. World Dynamics. Wright-Allen, 1971, p.14-15.
30. Computer graphics. Datamation, May 1966. p. 22-27.
32. Douglas C. Engelbart. Augmenting Human Intellect; a conceptual framework. Menlo Park, Stanford Research Institute, 1962, pp. 34-37 (AFOSR-3223)
33. Nilo Lundgren. Toward the decentralized intellectual workshop. Innovation (New York), 1971.
Douglas C. Engelbart. Intellectual implications of multi-access computer networks. Stanford Research Institute, 1970. (Conference paper).
For dialogue implications, see the 1969 statement of the U.S. National Academy of Sciences Committee on Scientific and Technical Communication (SATCOM) that: "More exciting than retrieval of information from a static store is evolutionary indexing, in which users' additions, modifications, restructuring, and critical commentaries steadily improve the initial indexing." National Science Foundation funding of investigation into this approach was recommended.
34. L. Tesler, H. Enea and K.M. Colby. A directed graph representation for computer simulation of belief systems. Mathematical Biosciences, 2, 1/2, 1968, pp. 19-40.
35. K.M. Colby, L. Tesler, H. Enea. Experiments with a Search Algorithm on the Data Base of a Human Belief Structure. Stanford University, Artificial Intelligence Project, 1969, (Memo AI-94).
36. John C. Loehlin. Computer Models of Personality. Random House, 1968
37. K. M. Colby and D. C. Smith. Dialogue Between Humans and an Artificial Belief System. Stanford University, Artificial Intelligence Project, 1969. (Memo AI-97)
38. T. S. Kuhn. The Structure of Scientific Revolutions. Chicago, University of Chicago Press, 1962. See: K.M. Colby and H. Enea. Heuristic method for computer understanding of natural language in context-restricted on-line dialogue. Mathematical Biosciences, 1, 1-25, 1967.
39. K. M. Colby and H. Enea. Heuristic method for computer understanding of natural language in context-restricted on-line dialogue. Mathematical Biosciences, 1, 1-25, 1967
40. Eric de Grolier. A Study of General Categories applicable to Classification and Coding in Documentation. Paris, UNESCO, 1963.
41. C. G. Smith. Descriptive documentation, International Conference on Scientific Information, 1958; Proceedings. Washington, National Academy of Sciences, 1959, p. 1103.
43. Mary E. Stevens. A machine model of recall. Paris, UNESCO, NS/ICIP/J.5.4, 1959. See also: T. Kilburn, R.L. Grimsdale and F.H. Sumner. Experiments in machine learning and thinking. Paris, UNESCO, NS/ICIP/5/6/15, 1959.
44. Stuart D. McIntosh and D.M. Griffel. The requirements for a computer-based information system. M.I.T., Center for International Studies, 1968, (C/68-14c), 82 p. - Computers and categorization (Paper presented to the Classification Research Conference, Bangalore, 1969). M.I.T., Center for International Studies, 1969 (C/69-28), 41 p.
45. Eugene Garfield. Primordial concepts, citation indexing and historio-bibliography. Journal of Library History, 2(3), pp. 235-249 (1967). See also: Eugene Garfield. "Science Citation Index; a new dimension in indexing." Science, 144, pp. 649-654, 1964.
46. Eugene Garfield. Citation indexing: a natural science literature retrieval system for the social sciences. American Behavioral Scientist, 7, 10, pp. 58-61 (1964)
47. Jean Piaget. General problems of interdisciplinary research and common mechanisms. In: Unesco. Main trends of research in the social and human sciences. Paris, Unesco, vol.1, 1970, pp. 467-528.
48. F. A. Casadio, Director, Società Italiana per l'Organizzazione Internazionale.
49. S. Ullmann. Semantics: an introduction to the science of meaning. Oxford, Blackwell, pp. 254-5.
50. Eric de Grolier. A Study of General Categories Applicable to Classification and Coding in Documentation. Paris, Unesco, 1962, pp. 226-228 (Note 89).
51. To be a sister volume to, and cross-reference, the UIA's Yearbook of International Organizations, which is now produced via computer permitting access to data for research purposes.
52. Maurice Line (Ed). Information Requirements of Researchers in the Social Sciences. Bath University, 1971, 2 vols.
53. B. L. Whorf. Language, Thought, and Reality. Wiley, 1958, 278 p.
54. Marshall Walker. The Nature of Scientific Thought. Prentice-Hall, 1963, p. 103.
55. David Bohm. The Special Theory of Relativity. N.Y., Benjamin, 1965
56. Georges Mounin. Les problèmes théoriques de la traduction. Paris, Gallimard, 1963.
57. A special issue of the ETC (Institute of General Semantics), 15,2, March 1958 is entirely devoted to interpretation and intercultural communication. It gives many examples of this sort of problem.
58. Georges Mounin. op.cit. p. 67-68
59. Colin Cherry. World Communication: threat or promise? London, Wiley, 1971
60. UNISIST Study Report. op.cit., p.1, p.20, p.103, p.115, p.118, p.152 (This point is examined in more detail in COCTA Working Paper No. 3, p.65).
61. The author recently had to wait seven months for an in-print ordered publication. Its title: Foundations of Access to Knowledge. Syracuse University Press, 1966.
62. Lee Thayer. Communication and communication systems; in organization, management, and interpersonal relations. Homewood, Irwin, 1968, p. 202.
63. UNITAR/EUR 3/2, 1971, p.2.
64. UNITAR. The Interest of the United Nations Institute for Training and Research in the question of United Nations documentation. Geneva, UNITAR/Eur 3/1, 1971, p. 1. UN Document A/7576, 25 July 1969, para. 2, shows that document production by New York HQ increased by 50% from 1964 to 1967, to 600 million page-units. This does not include production of the regional or Geneva offices or specialized agencies. A recent UNITAR document (UNITAR/Eur/3/2) notes that there will probably be one million journals in 30 years' time. Currently it is estimated that about 2000 books (i.e. 1 million pages) are printed every minute throughout each day.
65. United Nations. UN Document A/8319, 2 June 1971 (or JIU/REP/71/4). But stemming the generation of new knowledge in developed countries is about as feasible as lowering the birth rate in developing countries. To severely reduce one means of storing and disseminating such knowledge, without seeking a more appropriate complementary medium, could only be counter-productive and unsatisfactory.
This work is licensed under a Creative Commons licence.