
1992

Network Mapping: Software Possibilities

Summary of the problem



Annex 3 of Visualization of International Relationship Networks (1992)


Summary

The problem is most easily described by analogy. Consider a relational database whose records are subway stations, each record indicating which other stations that station is directly connected to (and possibly on what "line").
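The subway analogy can be made concrete with a minimal sketch. The station names, line names and the function build_adjacency are hypothetical illustrations, not drawn from the actual database; each connection is assumed to run in both directions.

```python
from collections import defaultdict

# Hypothetical records: (station, directly connected station, line).
CONNECTIONS = [
    ("Alpha", "Beta", "red"),
    ("Beta", "Gamma", "red"),
    ("Beta", "Delta", "blue"),
]


def build_adjacency(records):
    """Return {station: {neighbour: line}}, treating each link as two-way."""
    adjacency = defaultdict(dict)
    for station, neighbour, line in records:
        adjacency[station][neighbour] = line
        adjacency[neighbour][station] = line
    return dict(adjacency)


network = build_adjacency(CONNECTIONS)
```

Here network["Beta"] lists the three stations directly connected to Beta together with the line on which each connection lies.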

Software "modules"

(a) Relational database

The data is currently held and maintained in a Revelation database (version G2B) running on a Novell network. The database has been specially developed as a text database with facilities to manage networks of relationships between the records. It is desirable that when the data is displayed in map form, interactive changes to the map should be carried back as updates to the database. But since the prime requirement is for publishable hardcopy maps, this requirement may be sacrificed in the short term.

It is appropriate to note that Version G2B can now be upgraded to Advanced Revelation and that some new software has been specifically developed in relation to the upgraded version only.

(b) Map design

Several approaches may be taken to the problem of map design:

(i) Network analysis: This uses specialized extensions of sociometrics to take data of the type described above and to position the elements in relation to each other on the basis of various measures of distance, with those most connected tending to be placed at the centre of a network and those least connected at the periphery. The advantage of this approach is that it endeavours to mirror the network on the basis of its internal characteristics. A number of software packages exist to perform the necessary computations. Various ways of describing a network and identifying key components result from such analysis.

The disadvantage of such software is that it has been developed for relatively small networks only (100 to 300 nodes). Few of the packages are designed to permit mapping of the resultant network. Data is output in matrix form only or as indices in relation to key elements. More seriously, such networks when mapped result in maps which, although they reflect the data, are not designed to enhance the comprehensibility of the data (other than in a purely scientific sense). Such computations can consume considerable amounts of computer time, even on fast machines.

This approach is being explored using test data from the UIA Revelation database consisting of some 5,000 nodes. The work is currently being done on a Mac II using software developed at Dartmouth College by Joel Levine of the Department of Mathematical Social Sciences. This software has not been adapted to run under MS-DOS.
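The principle of placing the most connected elements at the centre and the least connected at the periphery can be illustrated with a simple sketch. This is not the Dartmouth software, nor any particular sociometric package: it is an assumed, simplified measure in which each node's radius is its total hop distance to all other nodes, less the smallest such total. The function names are hypothetical and the network is assumed to be connected.

```python
import math
from collections import deque


def hop_distances(adjacency, start):
    """Breadth-first search: number of links from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def radial_layout(adjacency):
    """Return {node: (x, y)}; smaller total distance means closer to the centre."""
    totals = {n: sum(hop_distances(adjacency, n).values()) for n in adjacency}
    best = min(totals.values())
    ordered = sorted(adjacency, key=lambda n: (totals[n], n))
    coords = {}
    for i, node in enumerate(ordered):
        radius = totals[node] - best  # the most central node(s) get radius 0
        angle = 2 * math.pi * i / len(ordered)
        coords[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return coords
```

One search is run per node, so the cost grows with the number of nodes multiplied by the number of links. Even this crude measure suggests why such computations become expensive on networks of several thousand nodes.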

(ii) "Crude mapping": A simplistic approach could be taken. This would involve positioning the nodes on a grid determined by the subjects with which they are associated. Such a subject grid (with positions determined by a 4-character identifier) is in use to categorize the UIA data into some 3,000 categories. Relationships would then be plotted between the nodes.

In this case comprehensibility is achieved through the link to the matrix and not through determining the shape of the network. Use of a grid could severely undermine the memorability of the network. It would however be relatively easy to develop and quick to run. A key question would be what kind of interaction it would be possible to have with such a map and whether it would be possible to shift from a detailed focus on a specialized cell of the grid to a wider focus and back (a zoom facility).
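A sketch of the grid approach follows. The structure of the 4-character identifier is not specified here, so the decoding below is purely an assumption for illustration: the first two characters are taken as a column number and the last two as a row number. The function names are likewise hypothetical. The zoom function indicates how a shift between a detailed and a wider focus might be handled by selecting a range of cells.

```python
def grid_cell(subject_code):
    """Decode a 4-character code into (column, row). ASSUMED format: CCRR digits."""
    if len(subject_code) != 4 or not subject_code.isdecimal():
        raise ValueError(f"expected four digits, got {subject_code!r}")
    return int(subject_code[:2]), int(subject_code[2:])


def place_on_grid(node_subjects):
    """Map {node: subject_code} to {node: (column, row)}."""
    return {node: grid_cell(code) for node, code in node_subjects.items()}


def zoom(positions, col_range, row_range):
    """Keep only the nodes whose cell lies inside the inclusive ranges given."""
    (c_lo, c_hi), (r_lo, r_hi) = col_range, row_range
    return {
        node: (col, row)
        for node, (col, row) in positions.items()
        if c_lo <= col <= c_hi and r_lo <= row <= r_hi
    }


positions = place_on_grid({"A": "0102", "B": "0103", "C": "4520"})
detail = zoom(positions, (0, 9), (0, 9))  # narrow focus on one corner of the grid
```

Widening the ranges passed to zoom restores the broader view; no recomputation of positions is needed, which is part of what makes this approach quick to run.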

(iii) Topological manipulation: In this approach, the network of relationships between nodes would be simplified using topological constraints. For example, a string of interlinked nodes would be represented by a straight line. The position of the nodes on the line might be equidistant or determined by some logarithmic function based on the distance from the centre of the line. The aim would be to introduce symmetry elements into the data so that it acquires a distinct and memorable pattern or shape. Some of the algorithms required presumably correspond to those of pattern recognition problems.
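The straight-line example can be sketched as follows. The function chain_positions and its parameters are hypothetical, and the particular logarithmic function (the natural logarithm of one plus the number of steps from the centre) is only one assumed choice among many.

```python
import math


def chain_positions(chain, spacing=1.0, logarithmic=False):
    """Place a string of linked nodes on a horizontal line centred on x = 0.

    Equidistant by default; with logarithmic=True the gaps shrink with
    distance from the centre of the line.
    """
    mid = (len(chain) - 1) / 2
    positions = {}
    for i, node in enumerate(chain):
        offset = i - mid  # signed number of steps from the centre
        if logarithmic:
            x = math.copysign(math.log1p(abs(offset)), offset) * spacing
        else:
            x = offset * spacing
        positions[node] = (x, 0.0)
    return positions


even = chain_positions(["a", "b", "c", "d", "e"])
logged = chain_positions(["a", "b", "c", "d", "e"], logarithmic=True)
```

Both variants are symmetric about the central node, which is the property the approach seeks to exploit. The logarithmic variant compresses the ends of the line so that long strings occupy less space.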

(c) Plotting

Once coordinates have been determined, software is required to plot the network, whether onto the screen or onto a graph plotter. Many packages exist for this purpose. A distinction should however be made here between adequate quality plots (for working purposes) and high-quality plots for publication in book form. The latter question is discussed later.

The problem in plotting is to be able to introduce distinguishing elements into the plot. These may include variations in line thickness (corresponding to some measure of importance or proximity), variations in node size (corresponding to the number of connections to the node) and the introduction of identifying labels for the nodes.
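These distinguishing elements can be illustrated with a short sketch that draws directly from computed coordinates. SVG is used only as a convenient stand-in for screen or plotter output. The function network_svg, the scale and margin values, and the rule relating node radius to number of connections are all assumptions. Coordinates are assumed to be non-negative.

```python
from xml.sax.saxutils import escape


def network_svg(coords, edges, scale=40, margin=30):
    """coords: {node: (x, y)}; edges: [(node_a, node_b, weight)]. Returns SVG text."""
    # Node size reflects the number of connections to the node.
    degree = {node: 0 for node in coords}
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1

    def px(node):
        x, y = coords[node]
        return margin + x * scale, margin + y * scale

    parts = ['<svg xmlns="http://www.w3.org/2000/svg">']
    for a, b, weight in edges:
        (x1, y1), (x2, y2) = px(a), px(b)
        # Line thickness reflects the weight (importance or proximity) of the link.
        parts.append(
            f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" '
            f'stroke="black" stroke-width="{weight}"/>'
        )
    for node in coords:
        x, y = px(node)
        r = 3 + 2 * degree[node]
        parts.append(f'<circle cx="{x}" cy="{y}" r="{r}"/>')
        # Identifying label placed just to the right of the node.
        parts.append(f'<text x="{x + r + 2}" y="{y}">{escape(node)}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


svg = network_svg(
    {"A": (0, 0), "B": (1, 0), "C": (1, 1)},
    [("A", "B", 1), ("B", "C", 3)],
)
```

Each link is drawn as a straight segment between the two nodes it connects, so the lines pass through the nodes rather than being fitted to them.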

A key requirement is that the plot be made from the data as processed by one of the above techniques, rather than from data which is manually input. A distinction must also be made between a curve-fitting approach and one in which the lines pass through the nodes, as is required here. A distinction also needs to be made between plotting a graph (from left to right) and plotting a network in which there is no privileged direction. The latter form is more characteristic of CAD programs (see below).

(d) Drawing

It is desirable to move towards an interactive approach to the data. In other words, once a plot is made for a segment of the overall network, editors should be able to modify the network. Such modifications might take one of two forms. The first would consist of simply moving portions of the plot to make it more comprehensible, making room for labels and improving the aesthetics. The second might also involve the capacity to add or delete features from the network. It would of course be highly desirable that the latter changes should be carried back into changes to the relational database. This can raise severe problems of compatibility between the relational database and the drawing/plotting software, whether in terms of software or of intermediate files. Such features are available in many CAD programs. It is however important to recognize that the CAD software is here used to "design" logical or topological constructs rather than buildings or mechanical parts. This is not a limitation but it may permit use of simpler (and cheaper) CAD software.

It is appropriate to note that the variant of CAD software used for interactive printed circuit board (PCB) design has many features of value to the present application, especially the "auto-router" feature which positions connections on the circuit board in the most economic manner (avoiding cross-overs, etc.). Unfortunately the positioning criteria do not make for maximum comprehensibility.
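The second form of modification, in which links added or deleted on the map are carried back to the database, can be sketched as a comparison between the links before and after editing. The function names are hypothetical and links are assumed to be undirected. The sketch stops at a list of changes, since how these would be applied to the relational database depends on the software involved.

```python
def link_key(a, b):
    """Treat links as undirected: (a, b) and (b, a) identify the same link."""
    return tuple(sorted((a, b)))


def map_changes(original_links, edited_links):
    """Return the links to add to, and delete from, the database."""
    before = {link_key(a, b) for a, b in original_links}
    after = {link_key(a, b) for a, b in edited_links}
    return {"add": sorted(after - before), "delete": sorted(before - after)}


changes = map_changes(
    [("A", "B"), ("B", "C")],  # links as held in the database
    [("B", "A"), ("C", "D")],  # links after editing the map
)
```

Moving nodes to improve the layout alters only coordinates, not the set of links, and so produces no database changes under this scheme.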

(e) Interface software

In the case of Advanced Revelation there exists a software product CAD/Base which offers "complete integration of CAD drawings with a database environment", via industry standard DXF files. The drawing is viewed as a Revelation file and the drawing elements as Revelation records and fields. The drawing exists as a master file in both the Revelation and CAD environments. Changes in one environment are reflected in the other automatically without any intermediate file conversion required.

Clearly this offers interesting opportunities for using the network map as a menu through which users can select individual nodes on which they can immediately access additional text data.
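To indicate what an exchange via DXF might involve, the following sketch writes the links of a network as LINE entities in a minimal ASCII DXF file. It does not represent CAD/Base, whose internal workings are not described here. The function links_to_dxf is hypothetical, and the file contains only an ENTITIES section, which some CAD programs may not accept without additional header sections.

```python
def links_to_dxf(coords, links, layer="0"):
    """Return minimal ASCII DXF text with one LINE entity per link.

    DXF alternates group codes and values, one per line:
    0 = entity type, 8 = layer, 10/20 = start x/y, 11/21 = end x/y.
    """
    out = ["0", "SECTION", "2", "ENTITIES"]
    for a, b in links:
        (x1, y1), (x2, y2) = coords[a], coords[b]
        out += [
            "0", "LINE", "8", layer,
            "10", f"{x1:.3f}", "20", f"{y1:.3f}",
            "11", f"{x2:.3f}", "21", f"{y2:.3f}",
        ]
    out += ["0", "ENDSEC", "0", "EOF"]
    return "\n".join(out) + "\n"


dxf = links_to_dxf({"A": (0, 0), "B": (1, 2)}, [("A", "B")])
```

Node labels and record identifiers would also need to be carried in the file if the drawing elements are to be matched back to database records. That is omitted from this sketch.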

(f) High-quality graphic output

One objective is the production of maps to be printed in book form. To achieve this one approach might be to produce output in a form which can be handled by PC-TeX to create files for output on a high quality laser printer.

(g) Integration of features

It is possible that CAD/Base offers an appropriate means of integrating the different features discussed above (except the last). It is also possible that such a product, which is relatively expensive, can be considered as "overkill", and that a more compact approach would be more suitable and easier to make available to others. If the emphasis is on the simpler strategy of generating hardcopy, this would certainly be the case. To the extent that interaction with the data is desirable, then more features would be required, even though only a selection of standard CAD features would be necessary.

For the user, there is obviously great merit in ease of use as an adjunct to normal text editing procedures. Ideally such a package would bear some resemblance to the more sophisticated forms of "outliner", such as MORE and INSPIRATION running on Apple machines. In these an essentially hierarchical outline of topics can be opened up into standard text processing or converted into bullet charts. What is required is an equivalent which is tied into a relational database environment. The different approaches to network "map design" noted above might then be options in the way the data was manipulated for presentation, as is the case in standard business graphics (bar charts, pie charts, etc).

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
