
1992

Network Mapping

Clarification of Requirements



Annex 5 of Visualization of International Relationship Networks


1. Background

The Union of International Associations is recognized as a non-profit scientific research institute under Belgian law governing international bodies established there. It functions as a clearing house for information on 15,000 international non-profit organizations and their preoccupations. As such it produces a series of reference books from a data base shortly to be placed on-line via the European Space Agency (ESA-IRS). The books are:
  • Yearbook of International Organizations
    • International Organization Participation
    • Global Action Networks
  • International Congress Calendar
  • Encyclopedia of World Problems and Human Potential
2. Data

Magnetic tape files are generated or maintained as part of the production of the reference books. In all they currently represent in excess of 60 megabytes of data. The tapes are all 1600 bpi, EBCDIC, odd parity, composed of fixed records of 180 characters, 20 records per block. This note concerns the further processing of a subset of the above data taking the form of cross-references between (the numbered) entries of a certain type. The cross-references of current interest are of different types, each denoted by a 2-digit numeric code. They indicate a relationship between two entities, each denoted by a 5-character alphanumeric code (ANNNN). The relationships include: more general problem, more specific problem, aggravating problem (and aggravated problems), alleviating problem (and alleviated problems), associated problem, concerned organizations, concerned discipline, relevant human values, etc. The entities may be international organizations, world problems, treaties, or various kinds of concepts.
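As an illustration of the cross-reference structure described above, the following minimal sketch validates and parses one cross-reference. The field layout (three whitespace-separated fields), the reading of ANNNN as one letter followed by four digits, and the names `CrossRef` and `parse_crossref` are assumptions made for the example; the actual positions of the fields within the 180-character tape records are not specified in this note.

```python
import re
from typing import NamedTuple

# Assumed reading of "ANNNN": one letter followed by four digits.
ENTITY_CODE = re.compile(r"^[A-Z]\d{4}$")
# Relationship type: a 2-digit numeric code.
REL_CODE = re.compile(r"^\d{2}$")


class CrossRef(NamedTuple):
    source: str    # entity the relationship starts from
    rel_type: str  # 2-digit relationship code
    target: str    # entity the relationship points to


def parse_crossref(line: str) -> CrossRef:
    """Parse a hypothetical 'SOURCE TYPE TARGET' line into a CrossRef."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    source, rel_type, target = parts
    if not (ENTITY_CODE.match(source) and ENTITY_CODE.match(target)):
        raise ValueError(f"bad entity code in {line!r}")
    if not REL_CODE.match(rel_type):
        raise ValueError(f"bad relationship code in {line!r}")
    return CrossRef(source, rel_type, target)
```

For example, `parse_crossref("P1234 07 P0042")` would yield a record linking the two entities by relationship type 07, where both the entity codes and the type code are invented for the illustration.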

3. Preliminary analysis of the cross-reference tape

This is being undertaken by a separate programme (in Brussels). Basically this takes each entity (of the 3,000) as a starting point and produces a record tracing the chain of relationships out from it along each branch. The trace can be continued through up to 9 relationship links. Each record contains the chain for one branch out from the starting entity for one type of relationship. The programme also produces complementary records on a parallel file with the name of each entity registered in the above trace (partly for use as an index). Other products of the programme enable loops in the network to be detected and corrected if appropriate. A statistical summary enables especially interesting nodes to be selected.
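The tracing step might be sketched as follows. This is not the Brussels programme itself, only a minimal illustration under stated assumptions: cross-references are given as (source, type, target) triples, and the function names `build_index` and `trace_chains` are hypothetical. Each returned chain corresponds to one branch out from the starting entity for one relationship type, limited to 9 links, and any link that would return to an entity already on the branch is reported as a loop rather than followed.

```python
from collections import defaultdict

MAX_LINKS = 9  # the trace is continued through up to 9 relationship links


def build_index(crossrefs):
    """Map (source, rel_type) to the list of target entities."""
    index = defaultdict(list)
    for source, rel_type, target in crossrefs:
        index[(source, rel_type)].append(target)
    return index


def trace_chains(index, start, rel_type, max_links=MAX_LINKS):
    """Return (chains, loops) for one starting entity and one relationship type."""
    chains, loops = [], []

    def walk(path):
        extended = False
        if len(path) - 1 < max_links:  # number of links used so far
            for target in index.get((path[-1], rel_type), []):
                if target in path:
                    loops.append(path + [target])  # record the loop, do not follow it
                    continue
                extended = True
                walk(path + [target])
        if not extended and len(path) > 1:
            chains.append(path)  # end of one branch

    walk([start])
    return chains, loops
```

With the invented links P0001 to P0002, P0002 to P0003, P0003 back to P0001, and P0001 to P0004 (all of one type), tracing from P0001 gives two branches and reports the one loop.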

4. General processing requirement

The data on the above tapes must be analyzed in such a way as to produce a set of coordinates on tape which can be used to drive a graph plotter of one form or another so as to portray suitable representations of such networks of relationships. At this point two possibilities are envisaged:

- generation of "maps" as a substitute for questionnaires to the concerned international bodies and to enable the relationship networks to be "cleaned up" by editorial staff (ideally this would be done in-house).

- generation of clean maps with higher quality typography for publication in the 1985 edition of the Yearbook of World Problems and Human Potential. This can be done by Computaprint (London) once the drive tape is formatted.

Briefly explained, the problem may be compared to that of getting a computer to plot a suitable schematic map of a subway system (or a road network) when only the links between locations are given. In other words the optimal position must be determined for each location and for the links between them. An additional desirable constraint is that the geometry should be relatively comprehensible, namely ordered rather than arbitrary.

5. Breakdown of the task

The task may be split into the following problem areas.
  • Processing strategies and run time choices between them
  • Optimal distribution of network across selected surface ("packing")
  • Determination of suitable scale for map(s)
  • Determination of mapping details (typography etc)
  • Fitting in entity name text.
These points are discussed separately in the following sections.

6. Processing strategies and run time choices between them

The following strategies might be envisaged as run time options:
  • 6.1 Global: Get tops of all "most connected" trees out onto a single map with whatever detail density considerations will permit.
  • 6.2 Sub-global: Split global map into N sub-maps on which greater detail is possible.
  • 6.3 Top-of-tree focused: Work out maps for many individual problem clusters, each map being centred on the top of the relevant tree.
  • 6.4 Chain-focused: Work out maps for networks ignoring links to tops of trees or to unnecessary details.
  • 6.5 Selection: Work out only from nodes specified individually at run time, whether according to the 6.1 or 6.3 strategy.
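Two of these options can be illustrated with a short sketch. It assumes the network is available as a list of (entity, entity) links and treats "most connected" as a simple count of links per node, which is only one possible reading. The function names are hypothetical: `most_connected` picks candidate nodes for a global map (6.1) or for run time selection (6.5), and `focused_submap` extracts the neighbourhood of one node out to a chosen number of links (6.3).

```python
from collections import Counter, defaultdict, deque


def most_connected(links, n):
    """Return the n nodes with the most links (ties broken by code order)."""
    counts = Counter()
    for a, b in links:
        counts[a] += 1
        counts[b] += 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [node for node, _ in ranked[:n]]


def focused_submap(links, centre, depth):
    """Return (nodes, links) within `depth` links of `centre`."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    distance = {centre: 0}
    queue = deque([centre])
    while queue:  # breadth-first search outwards from the centre
        node = queue.popleft()
        if distance[node] == depth:
            continue
        for nxt in neighbours[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    kept = [(a, b) for a, b in links if a in distance and b in distance]
    return set(distance), kept
```

The depth parameter here plays the role of a run time option controlling how much detail is carried onto each map.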
7. Optimal distribution of network across selected surface ("packing")

Various strategies might be used depending partly on the surface chosen.
  • 7.1 Cluster analysis: Some variant of multi-dimensional factor analysis might be used. The weighting of the entities could be determined by the number of their direct and/or indirect relationships of specific types.
  • 7.2 Iterative procedure: It is possible that a fairly simple iterative procedure could be used to determine the optimal locations on a surface. This might be initiated by first positioning (selected?) tops of trees before inserting nodes of a more detailed level.
  • 7.3 Planar surface: Given the objective of plotting the maps on a planar surface, one option is to project the results of any multi-dimensional analysis onto such a surface and thus determine the plotting coordinates. A planar surface would also simplify any iterative procedure.
  • 7.4 Spherical surface: Determining the positions of the nodes on a sphere is conceptually more attractive since it preserves and emphasizes the "globality" of the network in contrast to the arbitrarily bounded planar surface. The results of any multi-dimensional analysis could be projected onto such a surface. In the case of any iterative procedure this could be initiated with (selected) tops-of-trees positioned in an equidistant (regular) pattern over that surface. Such a surface of course raises questions as to how the results are to be represented on a plane surface.
  • 7.5 Triangulated spherical surface: In order to be able to plot a map represented on a spherical surface, this may be represented by a suitably triangulated polyhedral approximation onto which the spherical results are projected. Coordinates for each triangular map can then be plotted. The set of triangles may subsequently be edge-linked to provide a map of larger portions of the network.
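As one possible reading of the "fairly simple iterative procedure" on a planar surface (7.2 and 7.3), the sketch below uses a force-directed layout: all nodes repel one another, linked nodes attract, and the allowed movement per step shrinks over time. This technique is substituted here purely for illustration and is not specified by the note. The function name `planar_layout`, the unit-square surface, and all numeric constants are assumptions.

```python
import math
import random


def planar_layout(nodes, links, iterations=200, seed=0):
    """Place nodes (a list) in the unit square by iterative force balancing."""
    rng = random.Random(seed)  # fixed seed gives a repeatable layout
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = math.sqrt(1.0 / max(len(nodes), 1))  # preferred spacing between nodes
    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[a][0] += dx / d * f
                disp[a][1] += dy / d * f
                disp[b][0] -= dx / d * f
                disp[b][1] -= dy / d * f
        # Attraction along each link.
        for a, b in links:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[a][0] -= dx / d * f
            disp[a][1] -= dy / d * f
            disp[b][0] += dx / d * f
            disp[b][1] += dy / d * f
        # Move each node a limited distance; the limit shrinks each step.
        limit = 0.1 * (1.0 - step / iterations)
        for n in nodes:
            dx, dy = disp[n]
            d = math.hypot(dx, dy) or 1e-9
            move = min(d, limit)
            pos[n][0] = min(1.0, max(0.0, pos[n][0] + dx / d * move))
            pos[n][1] = min(1.0, max(0.0, pos[n][1] + dy / d * move))
    return {n: (p[0], p[1]) for n, p in pos.items()}
```

The pairwise repulsion makes each iteration quadratic in the number of nodes, so for a network of several thousand entities this would more plausibly be applied per sub-map than to the whole network at once. A spherical variant would need a different distance measure and is not attempted here.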
8. Determination of suitable scale for map(s)

The following factors determine the possible scale of the map:
  • 8.1 Size of plotted map: This would normally be A4 page size but chart size maps could be produced for certain purposes.
  • 8.2 Plotting constraints: These would be determined by how much data could be fitted on without running the risk of over-writing.
  • 8.3 Readability: Clearly whilst it may be possible to plot a very detailed map this may be undesirable if the map thereby becomes unreadable. The desirable "density" limits must be a run time option.
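A minimal sketch of how these factors might be combined is given below. It assumes node positions are already available as coordinates in a unit square, scales them to an A4 page (210 mm by 297 mm), and applies a density limit supplied at run time. The margin, the default limit of 0.5 nodes per square centimetre, and the function names are illustrative assumptions, not values taken from this note.

```python
A4_MM = (210.0, 297.0)  # A4 page, width and height in millimetres


def to_page(pos, page=A4_MM, margin=15.0):
    """Scale unit-square coordinates to page coordinates in millimetres."""
    width = page[0] - 2 * margin
    height = page[1] - 2 * margin
    return {n: (margin + x * width, margin + y * height)
            for n, (x, y) in pos.items()}


def density_ok(n_nodes, page=A4_MM, margin=15.0, max_per_cm2=0.5):
    """Check the node count against a run time density limit."""
    area_cm2 = (page[0] - 2 * margin) * (page[1] - 2 * margin) / 100.0
    return n_nodes / area_cm2 <= max_per_cm2
```

With these defaults the plotting area is about 480 square centimetres, so a map of 200 nodes passes the check and one of 300 nodes would need to be split or plotted at chart size.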
9. Determination of mapping details (typography, etc)

In order to increase the readability of the map it is valuable to take advantage of the sophisticated typographical possibilities when the map is directly projected onto print-ready film via a vector generator as opposed to being plotted by computer driven pens. Codes to trigger such typographical features need to be provided with the coordinates for the drive tape.

Possibilities include:
  • 9.1 Size of node: The size of the "dot" representing a problem (etc) may be determined by its connectedness (whether at the same level or at a lower level of detail).
  • 9.2 Thickness of link: The thickness of the line representing the link between two problems may be dependent on the relative importance ('size') of the nodes so connected and possibly also on their distance apart.
  • 9.3 Type of link: Possibilities such as using full line, dotted line, or some other variant may also be envisaged to distinguish different types of relationship or different levels of detail.
  • 9.4 Colour: Whether plotted or vector-generated, different colour conventions may be envisaged to distinguish different types of relationship. When vector-generated onto film, this possibility is related to that of overlays in general.
  • 9.5 Overlays: It may be useful to envisage the possibility of overlays in order to produce maps of higher density and to print those using several colours.
  • 9.6 Stereo-effect: In order to facilitate comprehension of complex networks, consideration should be given to the production of map overlays (in red and green) calculated to create a stereo-effect (as is done with the representation of complex molecules).
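Some of these possibilities (9.1 to 9.3) could be expressed as simple functions producing the attributes to accompany each coordinate. In the sketch below the logarithmic scaling, the numeric constants, the relationship codes in `LINK_STYLE`, and all function names are hypothetical, chosen only to show the shape of such a mapping.

```python
import math

# Hypothetical mapping from 2-digit relationship codes to line styles.
LINK_STYLE = {"10": "full", "20": "dotted", "30": "dashed"}


def node_radius(degree, base=0.5, step=0.4):
    """Dot radius (mm) growing slowly with the number of links of a node."""
    return base + step * math.log2(1 + degree)


def link_width(radius_a, radius_b, factor=0.2):
    """Line thickness (mm) based on the sizes of the two nodes connected."""
    return factor * (radius_a + radius_b) / 2.0


def link_style(rel_type):
    """Line style for a relationship type, defaulting to a full line."""
    return LINK_STYLE.get(rel_type, "full")
```

A logarithmic scale is used so that a few very highly connected nodes do not dominate the map; a linear or stepped scale would serve equally well as a run time option.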
10. Fitting in entity name text

This is the problem of leaving space beside nodes for a name to be inserted (by vector generator or plotter). Various techniques can be used to achieve this and avoid over-writing:
  • 10.1 Vary type size: According to the "size" of the node, more or less space may be reserved. Smaller type sizes may be used for less connected nodes.
  • 10.2 Truncate names: This possibility may be explored given that the names tend to be long.
  • 10.3 Omit names: This technique can be used for the smallest nodes or least connected nodes.
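The three techniques can be combined in one small decision function, sketched below. The thresholds, point sizes, and the name `label_for` are assumptions for illustration; in practice they would be run time options tuned against the density of the map.

```python
def label_for(name, degree, max_chars=24, omit_below=2):
    """Return (text, point_size) for a node label, or None to omit it."""
    if degree < omit_below:
        return None  # 10.3: omit names for the least connected nodes
    # 10.1: larger type for more connected nodes.
    if degree < 5:
        size = 6
    elif degree < 10:
        size = 8
    else:
        size = 10
    # 10.2: truncate long names, marking the cut with "...".
    if len(name) > max_chars:
        name = name[:max_chars - 3].rstrip() + "..."
    return name, size
```

Truncation at a fixed character count is the simplest option; cutting at a word boundary or using an abbreviated name from the data base, where one exists, may give more readable results.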


Creative Commons License
This work is licensed under a Creative Commons Licence.