
1977

Knowledge-Representation in a Computer-Supported Environment




Originally published in International Classification. 4, 1977, 2, p. 76-81. Completely revised version of a paper which first appeared in 1973 and was published in International Associations, 1974, pp. 208-208  

Summary

Discussion of problems in knowledge handling policy and indication of new software and hardware possibilities, especially those making use of graphic representational devices. The necessity for a more adequate knowledge representation is demonstrated in 19 statements contrasting present documentation and information analysis procedures (as inadequate for current needs) with possibilities of future methods and measures. Reference is made to the consequent redefinition of relationships between conventional knowledge handling processes, if only in the special institutional settings where this approach will most probably be adopted.

1. Pressing problems in knowledge handling policy

At a time when we are exposed to:

  1. a multitude of documents in every specialized field of knowledge,
  2. a multiplicity of often-unsuspected interconnections between the concerns of different specializations, and
  3. an increasing need to interrelate the knowledge of seemingly unrelated fields,

we are having difficulty in:

  (i) producing documents cheaply,
  (ii) distributing them widely, rapidly and in sufficient languages, and
  (iii) organizing the documentation centres, libraries and information systems to handle them.

The complexity of the knowledge handling system is such that conceptual ambiguity is the rule rather than the exception. At the same time we are running short of the paper which permits us the luxury of our incredibly ineffective, document-oriented system.

Furthermore, and more serious, the cumbersome nature of the knowledge handling system effectively prevents the maintenance of "thinking momentum" (2) on any issue, whether for an individual or in group interaction between researchers. Such disruption of innovation is increasingly intolerable as well as dangerous because of our dependence upon collective, innovative and rapid responses to the many problems of society. The scholar's relaxed acceptance of extended delays (deriving from the monastic tradition and the priorities of the gentlemen-of-leisure who fathered many of the sciences) can no longer set the standard for knowledge handling [1].

The US National Science Foundation has invested heavily over the past decade in abstracting and indexing services for a range of disciplines. It recently summarized the current state of affairs as follows:

This is not the place to detail the evidence in support of this view. A significant practical example, however, is the case of the United Nations. A former President of the General Assembly remarked that "the United Nations is drowning in its own words and suffocating in its own documentation" (4). The UN Joint Inspection Unit notes that "the point of saturation has now been reached and indeed overstepped and that the law of diminishing returns is taking over" (5). The solution implemented, however, is "to set once and for all, and strictly enforce, a reasonable but drastically reduced ceiling to the volume of documentation its various bodies call for and its services produce" (5). It can be argued that such a response to the problem is incredibly short-sighted in view of mankind's need for new knowledge and the right of all to participate in the generation of that knowledge and to receive the associated information. To reduce severely the means of storing and disseminating such knowledge within the world's key organizational system, without seeking a more appropriate complementary medium, can only be counter-productive and unsatisfactory.

If some limit is being reached then the National Science Foundation, continuing the above quotation, considers that:

The same document identifies other interrelated goals:

As is noted below, the NSF is currently funding field experiments amongst groups of scientists. As has been noted elsewhere (6), it is difficult to convey the nature of the communication process in this new computer-supported, paper-less environment. "Most of our intuitions about face-to-face interaction simply do not apply to this new and unusual form of communication... it is not surprising that computer conferencing might actually establish an altered state of communication in which the realities of face-to-face communication are distorted and entirely new patterns of interaction emerge" (6). Some impression of the significance of existing applications may be gained from the following section. A major investment in creating and experimenting with such environments has been made over the past decade through the ARPANET at the Center for Augmenting Human Intellect, Stanford Research Institute (7, 8, 9).

2. Software and Hardware

It is much to be regretted that those who have an understanding of the existing, and increasingly available, computer hardware and peripheral equipment are rarely able to envisage innovative uses for that equipment outside certain specialized sectors of engineering and fundamental defence research.

Consequently, when such equipment is used elsewhere, the applications do not constitute breakthroughs in the ability to respond flexibly to relationship complexity and inter-sectoral contact, but only a greater ability to handle the increasing quantities of data within a predefined sector. This lack is matched by that of those in other sectors who could benefit immediately from a wide variety of innovative, though relatively simple, applications but who are unaware of the possibilities.

It is impossible to explore adequately in this context the significance of these devices and applications for classification-related questions. Some possibilities can however be indicated (7).

1. Computer graphics devices (CRT displays). These are now familiar to many, if only at airline reservation desks. Much less well known is the form of this device that can handle not only lines of text but also highly complex relationship networks (such as arrow diagrams) in two or three dimensions. It offers many possibilities for assisting the user in exploring and comprehending such (concept) networks, and in re-representing them for others interested in alternative or simplified displays. Very complex domains may be represented on displays of several hundred colours [2].

2. Graph plotters. Complex relationship charts of up to several square meters in size may be drawn, with several colours, under computer control on the basis of data selected by the user (possibly after viewing a display of successive parts of it with the previous device). The significance of a knowledge user or institution being able to obtain and represent the structure of a knowledge domain in this way remains to be appreciated (7).

3. Computer-assisted structure elucidation. Programs are now in use for the interactive exploration by chemists of possible molecular structures in the light of inferred structural fragments and various constraints (12, 13). The approach bears many similarities to an application which should be available to users wishing to explore concept structures (e.g. in association with the proposal of the Committee on Conceptual and Terminological Analysis, and Unesco's current Interconcept program) (10, 14).

4. Computer-conferencing. The US National Science Foundation is now investigating the consequences of providing computer terminals to individuals who are members of geographically dispersed "invisible colleges" (3). The scholars so linked, whether in one or more countries, can exchange, store and comment on information according to an evolving agenda. Such applications may include those mentioned above. To date, however, the structures of the "agendas" governing the relationship between the items chosen for a particular computer conference are of the simple hierarchical variety. The computer does nevertheless facilitate linking and sequencing keywords in the series of interventions constituting the transcript of a conference, so that such associative networks may be explored in a manner somewhat similar to citation analysis. The significance of being able to use such an environment to facilitate interaction in relation to a complex evolving network of concepts, and to use the environment to explore and experimentally re-structure collectively such a network, remains to be appreciated--particularly with regard to interdisciplinary and intersectoral communication [3].
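The keyword linking mentioned under computer-conferencing can be illustrated with a minimal sketch. It assumes a transcript held as a list of (speaker, text) pairs and a hand-chosen keyword list; the function name `keyword_links` and the sample interventions are hypothetical and are not drawn from any of the systems cited.

```python
from collections import defaultdict
from itertools import combinations

def keyword_links(transcript, keywords):
    """Map each pair of keywords to the interventions in which both occur.

    transcript: list of (speaker, text) tuples, in conference order.
    keywords:   lower-case keywords to track (an assumption: a real
                system would need stemming and phrase matching).
    """
    links = defaultdict(list)
    for position, (speaker, text) in enumerate(transcript):
        # Crude tokenization: strip commas and full stops, split on spaces.
        words = set(text.lower().replace(",", " ").replace(".", " ").split())
        present = sorted(k for k in keywords if k in words)
        # Every pair of keywords sharing an intervention becomes a link.
        for a, b in combinations(present, 2):
            links[(a, b)].append(position)
    return dict(links)

# Hypothetical three-intervention transcript.
transcript = [
    ("A", "Classification depends on the concept network."),
    ("B", "A concept can be displayed as a network on a terminal."),
    ("C", "Classification by terminal is feasible."),
]
links = keyword_links(
    transcript, ["classification", "concept", "network", "terminal"]
)
```

Each key in `links` is an associative link between two keywords, and the list of positions plays a role loosely comparable to the citing documents in citation analysis.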

3. Knowledge representation

The previous section indicates a breakthrough in terms of hardware in support of a new knowledge-handling environment, and the NSF initiative indicates that this is being very seriously explored. What is still not appreciated, apparently, is the significance in such an environment of the decomposition of the "texts" of a particular author into sentences or even words. The computer permits this (by the very nature of its operation) and facilitates any recombination of his statements into new configurations (perhaps blended with those of his colleagues). This is important for concept analysis (14, 15, 17).
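As a rough illustration of such decomposition and recombination, the sketch below splits texts into sentence records that keep their author, then regroups statements from several authors around a keyword. The function names, the naive sentence splitter and the sample texts are all assumptions made for the example.

```python
import re

def decompose(author, text):
    """Split a text into sentence records tagged with their author."""
    # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [{"author": author, "sentence": s} for s in sentences]

def recombine(records, keyword):
    """Gather every statement mentioning the keyword, whoever wrote it."""
    return [r for r in records if keyword.lower() in r["sentence"].lower()]

# Hypothetical texts by two authors.
records = decompose("Author A", "Concepts form networks. Documents are secondary.")
records += decompose("Author B", "Networks can be displayed. Paper is scarce.")
about_networks = recombine(records, "networks")
```

Here `about_networks` blends one statement from each author into a new configuration while preserving the origin of each.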

A stage is therefore reached in which a given text is treated by the computer as a network of key words embedded in a field of explanatory comments. The structure of the network bears an iconic relationship to the knowledge it represents. Knowledge innovation is more and more closely represented by the changes to the structure and content of that network. The explanatory and introductory comment, which constitutes the great bulk of any text, is only of secondary significance and can be stripped away, given a much lower handling priority, or reprocessed into a more compact and comprehensible form by communication and education specialists. Soergel, in discussing the possibility of an automated encyclopedia, discusses this point (18) with a quotation from Bohnert et al (19):

Little attention has been given to this problem of assimilation, other than a heightened emphasis on "speed reading". It would seem that much is to be gained by looking at the ability of the graphics devices discussed above to provide structured diagrams and displays in which a deliberate attempt is made to use interface programs (possibly selected according to the presentation preference of the user) to provide a high degree of iconicity. The relationship structures displayed should correspond closely to the relationships within the knowledge structure which has to be absorbed as a gestalt for learning to take place (20).

In parallel columns below, an attempt is made to clarify the distinction between a hypothetical knowledge-oriented system, now technically feasible, and the current approach. The intention is not to imply that the former should replace the latter but rather to show that the former offers various means of avoiding some of the key problems faced by the latter - the two are however complementary. The distinction is basically between integration or fragmentation in the handling of information.
 

PRESENT: Document/Information System
FUTURE: Knowledge-representation System

PRESENT: Index tends to be based on simple hierarchy or alphabetic listing of subject, author and title, which can be handled on catalogue cards. Budgetary constraints usually prevent widespread introduction of sophisticated classification and cross-referencing techniques.
FUTURE: "Index" constitutes a complex network giving a representation of entities and relationships and the dynamics of any points under debate. This complexity can only be handled by multidimensional computer techniques. Cross-references are necessarily inserted by the author to define the location of his innovation. Others may be inserted automatically, optionally, or experimentally by computer.

PRESENT: Users want rapid access to documents; the index is a temporary inconvenience to gain access to a document.
FUTURE: Users want rapid access to the "network index" which represents the needed items of knowledge and their relationships; documents are a temporary inconvenience only used if it is necessary to re-examine data and detailed arguments justifying the entities and relationships incorporated. (Document access is a secondary problem for which a documentation system may be used.)

PRESENT: Access to knowledge via documents means multiple reproduction and transfer of documents to a variety of libraries where they may or may not be used.
FUTURE: Access to knowledge is direct and does not require reproduction and transfer of documents. (Only one copy of the document justifying the amendment need exist on microfiche so that copies need only be prepared when the data and arguments must be re-examined in detail.)

PRESENT: Out-of-date, rejected, low quality, false, old documents are retained in the system and indexed with no index indication of their status.
FUTURE: Out-of-date, rejected, false, etc. entities or relationships may be eliminated from the system by listing them on paper, microfilm, or other "documents" with the bibliographical source from which they were obtained (i.e. they are available if required but do not clog the system).

PRESENT: Only the knowledge held in the documents physically available at that location is accessible. The index frequently only indicates the documents held in the documentation centre in question.
FUTURE: All knowledge is on-line, although the supporting documents may not be physically accessible without delay.

PRESENT: Research is conducted primarily using documents, notes and file cards as a stimulus to creativity.
FUTURE: Research is conducted primarily using the knowledge-representation structure (i.e. the graphical representation) as a stimulus to creativity. Private and tentative amendments can be made experimentally, shared electronically with selected colleagues, and then destroyed, stored or released electronically to a wider audience. The author's "notes and file cards" can be effectively integrated into the system to facilitate his thinking processes.

PRESENT: Different styles of documents are produced on the same topic for research, education, public information and propaganda, program management, policy making, etc., purposes. The same material is repeated, with some extensions and some omissions, for each audience. This leads to a "spastic" or "aphasic" response to new situations, by different portions of society due to delays in production of the documents for different audiences and to significant variations in the importance given by the authors to different items of information.
FUTURE: The entities and relationships entered on the basis of research insights are also used for other purposes. Instead of producing different documents and reprocessing the insights, different identified "filters" are used in presenting or displaying the entities and relationships to different audiences. In this way, each new research insight is immediately incorporated into each other form of knowledge-representation; each portion of society works from the same data base. (Problems registered by non-research bodies are immediately evident as a challenge to research.) In this way if an element of knowledge represented cannot be understood, the user merely calls for a new method of representation (of the same knowledge) possibly using isomorphs (or even analogies) from a domain with which he is familiar. (At any point he can move into a programmed learning mode and be instructed with simpler representations or work from an area of knowledge with which he is familiar.)

PRESENT: Each new document must carry a lot of verbal packaging to explain and define the context within which innovative elements are introduced. Such contextual material is repeated by each author concerned with that domain of knowledge. There is no guarantee that the rephrasing of earlier arguments (necessary for status and copyright reasons) will constitute an improvement facilitating greater comprehension (rather than inhibiting it).
FUTURE: The author need only enter the specific entities or relationships which constitute his innovation. (Since the academic's status is bound up with his specific modifications to the knowledge structure and not the verbalizations held in a document, the problem of adequate verbalization may be handled separately. Hopefully a limited number of skilled verbal presentations, from a minimum number of different perspectives and literary styles, could be constantly updated by professional writers using the best verbal arguments by any appropriate academic or communicator.)

PRESENT: Articles retain permanently their total length and degree of encumbrance to the document system.
FUTURE: Articles may retain their full length only for a period of days before being shortened or stripped (by computer) of explanatory matter and represented as a network of concepts - or simply stored on microfilm or erased.

PRESENT: Alternative concepts or contradictory evidence published elsewhere can be conveniently ignored in a document or textbook--particularly where the counter argument comes from another discipline (or a school of thought publishing in a different language). The risk of explicit published criticism is low in many fields, therefore the degree of support for (or criticism of) any particular element in a document remains unclear.
FUTURE: Alternative concepts, relationships or contradicting evidence are immediately forced on one's attention - even in the case of relationships linking to other disciplines. The degree of support for (or criticism of) any particular element is clearly evident. Members of qualified professions may "vote" on particular amendments to the knowledge structure which is their concern.

PRESENT: Interdisciplinary links are ignored if the author has no interest in them. As a result there is no built-in process within the documentation system which encourages integrative studies to counter-balance the further fragmentation of knowledge. Integrative studies have low status, being equated with educational texts, general reviews and journalism.
FUTURE: Interdisciplinary links are already held in position whether the author wants to ignore them or not. Integration of isolated items of knowledge into higher orders of synthesis is facilitated and may be undertaken experimentally, selectively and largely by computer program (searches may be made for various degrees of isomorphism between concept structures in different domains). Integrative innovations acquire a high status as a means of comprehending wide domains of knowledge and controlling the associated information.

PRESENT: The documentation system does not permit panoramic summary of any permanent representation of knowledge in a particular domain.

Each verbal summary extant at a particular moment is under criticism and subject to reserves from different schools of thought within the discipline or in other disciplines. In this important respect a document arising from a single group of authors can never contain the totality of views in a domain of knowledge. Only the non-concretized interaction between a succession of documents approximates to it. These invisible qualifiers on any document are a feature of the "collective mentality" of the members of the discipline. The knowledge of the discipline at any moment is very much in (and between) the minds of its members rather than on paper or in a row of books.

The forum of academic debate is concretized in a scattering of journal articles and other documents. There is little interaction between the journals but the debate is somewhat summarized in the various collections of abstracts in which the contents index gives some indication of the interventions on related topics.
FUTURE: Each entity link and qualification is indicated in the knowledge-representation system. In effect one "layer" of the "collective mentality" of a discipline is rendered visible. Each modification to knowledge in the domain can be entered on an hour-by-hour basis.

The knowledge-representation system constitutes a "thinking forum" in which the juxtaposition of relevant ideas from all sources is maximized. The researcher can expose himself to a pattern of theoretical formulations in the process of being continually improved, and to which he can contribute. Concepts and relationships can be "registered" by postcard, but more dynamic possibilities are increasingly available. A dozen or more specialists in a particular field (the "invisible college" for that topic) can contribute simultaneously to work on ideas being written on one "mental note pad" via electronic dialogue support systems which help them to respond to each other's ideas (even if they are a continent apart) with a rapidity that allows each of them to maintain thinking momentum.

PRESENT: Thinking momentum is constantly interrupted when access to new documents is required. (Long delays, 2 - 3 months, are normal; 50 months or more from initiation of research to appearance in abstracts.)
FUTURE: Thinking momentum is maintained since the essence of any new domain of knowledge is always accessible - all the links and entities are there (delays are measured in seconds for data links).

(This mode of operation should be compared with some discussions between academics interested in the same topic in which progress is frustrated because if someone thinks of a good idea he wants to "publish" it (to gain credit) before contributing to the thinking momentum of his colleagues - this may mean a delay of months.)

PRESENT: Author has "published" when document is in circulation and "available"; index entries are of little significance to the author. Texts must be at least several pages in length before they are considered "documents" worthy of registration in an information system. The documentation system is embarrassed when faced with obtaining "ephemeral" or "phantom" material which has not been made commercially available through the normal publishing channels.
FUTURE: Author has "published" when the appropriate knowledge structure in the "index" has been modified; incorporation in "index" (through a terminal) is of highest priority for the author. Acceptable amendments to the knowledge structure can be as little as a single line of text in length, or simply the indication of a relationship between existing, but hitherto unrelated, items of knowledge. Even in the course of rapid change to the knowledge structure the paternity of each emerging formulation is identified and registered (if the author so desires).

PRESENT: Author's status, credibility, pride and interest are primarily associated with visible documents on library shelves and only secondarily with the research community's collective judgment on their value. The documentation problem is aggravated by the "publish or perish" code which governs much of academic life. Unless an academic produces a document he is "invisible" and loses status.
FUTURE: Author's status, credibility, pride and interest are associated with the visible entities and links in the graphic representation accessible to all. By switching emphasis to the specific entities and relationships which the academic has formulated, successfully confirmed or criticized - his status is determined by the bonds and entities with which he is associated. Each of his contributions is "visible" until it is superseded. They are not subject to the vagaries of document distribution patterns and the journal referee system.

PRESENT: The key figures in a discipline and the relationships between their spheres of influence are unclear.
FUTURE: The "luminaries" in a particular discipline are all visible together with the relationships between their spheres of influence.

PRESENT: The direction of research is governed in part by shifting fashions of credibility, status and politically determined funding (e.g. "environment", "resources", "population") which obscure the basic knowledge structure. This is only partly evident in print but is controlled by an ongoing informal dialogue centred upon the elders of the discipline who legitimate consideration of particular entities and relationships.
FUTURE: It is quite evident which issues are currently under debate and the manner in which the demise of a set of entities and relationships will weaken the status of a whole set of dependent elements. Current fashions would not obscure the basic knowledge structure. Ideally the system would also act as a continually updated voting board for each element, providing an opportunity for members of the profession to indicate their approval, whilst at the same time providing an appropriate focus for counter-arguments and alternatives.

PRESENT: The world's publishing and purchasing capacity, and the consequent necessity for the journal referee system and increasing costs, limits arbitrarily (and in many cases inequitably) the number and variety of viewpoints which can be expressed on any subject. The nature of the referee system leads to an inhibition of innovation.
FUTURE: The reduction of the volume of text required to "carry" any conceptual innovation, and the integration of the referee and editorial system at the computer level permits a greater number and variety of viewpoints and more subtle and equitable mechanisms for the expression of peer-group support or criticism.

PRESENT: Considerable intellectual, administrative and technical investments are made in achieving a unified standard of classification and description which determine the structural specifications of information systems. Relationships between different standardized schemes of this type are not facilitated nor are experiments with amendments or alternatives to any particular scheme.
FUTURE: The information is handled in a very flexible format. A choice may be made at any time between a variety of classification schemes. Some of these may be universal schemes, others may be specialized, and others may be experimentally employed by the user. Considerable use is made of computer power to switch between classification schemes and to restructure them (tentatively) in the light of new insights and relationship coding schemes.
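Several features attributed above to the FUTURE system (publication by amending the network, registered paternity, retirement of superseded items, voting, and audience "filters" over a single data base) can be gathered into one small data structure. The following is only a sketch under stated assumptions: the class names, field names and sample relationships are hypothetical and do not describe any existing system.

```python
from dataclasses import dataclass, field

@dataclass
class Relationship:
    source: str
    target: str
    kind: str
    author: str                     # registered "paternity" of the amendment
    audiences: set = field(default_factory=lambda: {"research"})
    status: str = "current"         # or "superseded" / "rejected"
    votes_for: int = 0
    votes_against: int = 0

class KnowledgeNetwork:
    def __init__(self):
        self.relationships = []

    def publish(self, rel):
        """An author 'publishes' by amending the network itself."""
        self.relationships.append(rel)
        return rel

    def retire(self, rel, status="superseded"):
        """Retired items stay on record but drop out of the live view."""
        rel.status = status

    def view(self, audience):
        """Apply an audience 'filter' over the single shared data base."""
        return [r for r in self.relationships
                if r.status == "current" and audience in r.audiences]

    def contributions(self, author):
        """Everything an author has entered, current or not."""
        return [r for r in self.relationships if r.author == author]

# Hypothetical sample data.
net = KnowledgeNetwork()
r1 = net.publish(Relationship("soil erosion", "food supply", "aggravates",
                              "Author A", {"research", "policy"}))
r2 = net.publish(Relationship("soil erosion", "deforestation", "caused by",
                              "Author B"))
r1.votes_for += 3                   # members of the profession "vote"
net.retire(r2)                      # r2 is superseded but still recorded
policy_view = net.view("policy")
```

The same records serve every audience; only the filter passed to `view` changes, and a retired relationship remains retrievable through `contributions`.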

4. Conclusion

The above section attempts to give an understanding of the special characteristics of the knowledge-handling environment which will be increasingly accessible, if only to those in privileged institutions. For whilst there are few technical and economic constraints to prevent such an environment becoming widely accessible, it is probable that this will be obstructed by socio-political factors, including recognition of vulnerability to abuse and government control. On the other hand, there is some probability that government agencies will come to favour and promote the widespread existence of such a system as permitting a sophisticated improvement over telephone surveillance of intellectuals and social change agents.

Whatever the general outcome, it is highly probable that such environments will be developed for creative thinkers in key research disciplines and policy environments and for the conferences and institutions in which they interact. The key to the attractiveness for them of such (micro)environments is the manner in which the processes of thinking and communication are blended with those of storage, retrieval, classification and reclassification. In fact it is the intimate relationship between shared creative thinking and exploratory integrative reclassification in the light of new insights that is the chief feature of such environments. Of special interest is the manner in which the processes of:

effectively blur together into a new and more dynamic process whose nature remains to be explored and for which the current division of labour is inadequate.

It is unlikely that any encyclopedic system based on large amounts of textual information will be as practical or significant as the dynamic, multi-perspective, participative system outlined here--although there may be points of contact between the two approaches.

It is interesting that the right note was sounded by the US National Academy of Sciences Committee on Scientific and Technical Communication (SATCOM) in 1969 when it was stated that: "More exciting than retrieval of information from a static store is evolutionary indexing, in which user's modifications, restructuring and critical commentaries steadily improve the initial indexing. . ."

The challenge for those active in the field of classification will be to provide their proposed schemes or amendments as computer program packages or optional modules which can be easily employed in such environments, so that the user can restructure (possibly only temporarily) the data base with which he is working and perceive it in an alternative light. Hopefully this would lead to improvements in the ability to classify and enhance comprehension of inter- and trans-disciplinary concepts (21, 22, 23).
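One way to picture classification schemes supplied as optional modules is to treat each scheme as an interchangeable function applied to the same records. This is a minimal sketch, assuming each item carries the attributes the schemes need; the scheme functions and sample items are hypothetical.

```python
from collections import defaultdict

def by_discipline(item):
    """One pluggable scheme: classify by originating discipline."""
    return item["discipline"]

def by_problem(item):
    """An alternative scheme: classify by the problem addressed."""
    return item["problem"]

def restructure(items, scheme):
    """Group the same items under whichever scheme the user selects.

    The underlying records are left untouched, so the regrouping is
    temporary and another scheme can be applied at any time.
    """
    groups = defaultdict(list)
    for item in items:
        groups[scheme(item)].append(item["name"])
    return dict(groups)

# Hypothetical data base of three items.
items = [
    {"name": "soil erosion", "discipline": "geology", "problem": "food"},
    {"name": "crop yield", "discipline": "agronomy", "problem": "food"},
    {"name": "aquifer depletion", "discipline": "geology", "problem": "water"},
]
disciplinary = restructure(items, by_discipline)
problem_centred = restructure(items, by_problem)
```

Switching from `by_discipline` to `by_problem` shows the same data base in an alternative light without altering it.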



Notes

[1] The author had the experience in 1970 of having to wait seven months for an in-print publication ordered from Belgium by telephone and telex (and a personal visit to the distributor in the USA). Its title: Foundations of Access to Knowledge.

[2] Further comments on the use of graphics devices are made in the appendices to the publication listed under (11), itself an experiment to set up a large data base on which the use of such devices could be tested. See also the book review of this work in this issue, p. 114.

[3] For further details on computer conferencing and a bibliography see (16).


References

1. Les Problèmes du Langage dans la Société Internationale. Bruxelles, Union des Associations Internationales, 1975

2. Ivan Sutherland. Computer graphics. In: Datamation (1966) May, pp. 22-27; also in: Scientific American, 1970, June and 1974, June.

3. National Science Foundation, Div. of Science Information: Program Announcement; operational trials of electronic information exchange for small research communities. Washington, D.C.: NSF 76-45, Appendix C.

4. The Interest of the United Nations Institute for Training and Research in the question of United Nations Documentation. Geneva: U.N. Institute for Training and Research (UNITAR) 1971. p. 1 = EUR 3/1.

5. United Nations Joint Inspection Unit. Report. United Nations 1971. = Doc. A/8319, 2 June 1971 or JIU/REP/71/4.

6. Jacques Vallee, et al. The computer conference, an altered state of communication? In: The Futurist, 9, 1975, 3, p. 116.

7. D. C. Engelbart. Augmenting human intellect; a conceptual approach. Stanford Research Institute, 1962. (AFOSR-3223).

8. D. C. Engelbart. Intellectual implications of multi-access computer networks. Stanford Research Inst. 1970. (Conference paper).

9. D. C. Engelbart, et al. The augmented knowledge workshop. In: Blanc & Cotton (Ed.): Computer networking. IEEE Press 1976.

10. Anthony Judge. Relationships between elements of knowledge. Use of computer systems to facilitate construction, comprehension and comparison of the concept thesauri of different schools of thought. Honolulu, Hawaii: Univ. of Hawaii, Soc. Sci. Res. Inst. 1971. 150 p. = COCTA Working Paper No. 3. (Later abridged for presentation to the 9th Congress of IPSA under the title: Toward a concept inventory; suggestions for a computerised procedure.) (Committee on Conceptual and Terminological Analysis of the International Political Science Association).

11. Union of International Associations. Yearbook of World Problems and Human Potential. Union of International Associations and Mankind 2000, 1976 (currently titled Encyclopedia of World Problems and Human Potential).

12. R. E. Carhart, et al. Applications of artificial intelligence for chemical inference; an approach to computer-assisted elucidation of molecular structure. In: J. Amer. Chem. Soc. 97 (1975) p. 5755-5762.

13. Computer-assisted structure elucidation (CONGEN); a program for the construction of structural isomers with constraints. Stanford, Calif.: Stanford Univ. Dept. of Chemistry 1976.

14. Giovanni Sartori. Conceptual and Terminological Analysis (CTA); interconnected information and knowledge representation in the social sciences. Pittsburgh: Intern. Studies Assoc. 1974. = COCTA Working Paper No. 24.

15. D. Michie (Ed.). Machine Representations of Knowledge. Proc. of a NATO Advanced Study Inst., Santa Cruz, Calif. 1975. Dordrecht: Reidel 1976.

16. Transnational Associations. Brussels: Union of International Associations. Special Issue, 29 (1977) No. 10.

17. Fred W. Riggs. Conceptual analysis; an illustration of two methods. In: Les Problèmes du Langage dans la Société Internationale. Bruxelles: Union des Associations Internationales 1975. p. 203-217.

18. D. Soergel. An automated encyclopedia; a solution of the information problem. In: International Classification, 4, 1977, 1, pp. 4-10.

19. H. G. Bohnert and Manfred Kochen. The automated multilevel encyclopedia as a new mode of scientific communication. In: ADI Proc. Vol. 10, 1963, p. 269.

20. Dean Brown and J. Lewis. The process of conceptualization. Some fundamental principles of learning useful in teaching with or without the participation of computers. Stanford, Calif.: Stanford Univ. Educational Policy Res. Center 1968.

21. B. M. Kedrov. Concerning the synthesis of the sciences. In: Intern. Classificat. 1, 1974, 1, pp. 3-11.

22. Erich Jantsch. Towards interdisciplinarity and transdisciplinarity. OECD 1972, pp. 97-121.

23. Anthony Judge. Integrative, unitary and transdisciplinary concepts. In: Yearbook of World Problems and Human Potential. Brussels: Union of International Associations and Mankind 2000, 1976, Section K.

this work is licenced under a creative commons licence.