Clarification of Requirements
Annex 5 of Visualization
of International Relationship Networks
The Union of International Associations is recognized as a non-profit scientific
research institute under Belgian law governing international bodies established
there. It functions as a clearing house for information on 15,000 international
non-profit organizations and their preoccupations. As such it produces
a series of reference books from a data base shortly to be placed on-line
via the European Space Agency (ESA-IRS). The books are:
* Yearbook of International Organizations
* International Organization Participation
* Global Action Networks
* International Congress Calendar
* Encyclopedia of World Problems and Human Potential
Magnetic tape files are generated or maintained as part of the production
of the reference books. In all they currently represent in excess of 60
megabytes of data. The tapes are all 1600 bpi, EBCDIC, odd parity, composed
of fixed records of 180 characters, 20 records per block. This note concerns
the further processing of a subset of the above data taking the form of
cross-references between (the numbered) entries of a certain type. At the
present time interest is focused on cross-references of different types,
denoted by a 2-digit numeric code. They indicate a relationship between
two entities, each denoted by a 5-character alphanumeric code (ANNNN). The
relationships include: more general problem, more specific problem, aggravating
problem (and aggravated problems), alleviating problem (and alleviated
problems), associated problem, concerned organizations, concerned discipline,
relevant human values, etc. The entities may be international organizations,
world problems, treaties, or various kinds of concepts.
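As a rough illustration of the tape layout described above, the following Python sketch splits one physical block into its fixed-length records and decodes them from EBCDIC. The `cp037` code page is an assumption (the note says only "EBCDIC"), the function names and the idea of reading from a disk image of the tape are hypothetical, and the field layout inside each 180-character record is not specified in this note, so the sketch stops at the record level.

```python
RECORD_LEN = 180          # characters per fixed record (from the note)
RECORDS_PER_BLOCK = 20    # records per block (from the note)
BLOCK_LEN = RECORD_LEN * RECORDS_PER_BLOCK  # 3600 bytes per full block

# Assumption: code page 037 is one common EBCDIC variant; the note does not
# say which variant the tapes actually use.
EBCDIC_CODEC = "cp037"


def split_block(block):
    """Decode one tape block (bytes) into a list of 180-character strings."""
    if len(block) % RECORD_LEN != 0:
        raise ValueError("block length is not a multiple of the record length")
    return [
        block[i:i + RECORD_LEN].decode(EBCDIC_CODEC)
        for i in range(0, len(block), RECORD_LEN)
    ]


def read_records(path):
    """Yield decoded records from a disk image of a tape (hypothetical file)."""
    with open(path, "rb") as tape:
        while True:
            block = tape.read(BLOCK_LEN)
            if not block:
                break
            yield from split_block(block)
```

Extracting the 2-digit relationship code and the two ANNNN entity codes from each record would be a further step that depends on column positions not given here.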
3. Preliminary analysis of the cross-reference tape
This is being undertaken by a separate programme (in Brussels). Basically
this takes each entity (of the 3,000) as a starting point and produces
a record tracing the chain of relationships out from it along each branch.
The trace can be continued through up to 9 relationship links. Each record
contains the chain for one branch out from the starting entity for one
type of relationship. The programme also produces complementary records
on a parallel file with the name of each entity registered in the above
trace (partly for use as an index). Other products of the programme enable
loops in the network to be detected and corrected if appropriate. A statistical
summary enables especially interesting nodes to be selected.
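The trace described above can be pictured with a short Python sketch. It is not the Brussels programme, whose internals are not described here; it only assumes that the cross-references are available as a mapping from (entity, relationship type) to target entities, and every name in it is hypothetical. It follows each branch for at most 9 links and sets aside any path that returns to an entity already on the branch, so that loops can be inspected.

```python
MAX_LINKS = 9  # the note limits each trace to 9 relationship links


def trace_chains(links, start, rel_type, max_links=MAX_LINKS):
    """Return (chains, loops) found from `start` for one relationship type.

    `links` maps (entity, rel_type) -> list of target entity codes.
    Each chain is one branch, given as a list of codes beginning with `start`.
    Each loop is a path whose last code already appears earlier on the path.
    """
    chains, loops = [], []

    def walk(path):
        extended = False
        if len(path) - 1 < max_links:            # links used so far
            for target in links.get((path[-1], rel_type), []):
                if target in path:
                    loops.append(path + [target])  # record the loop, do not follow it
                    continue
                extended = True
                walk(path + [target])
        if not extended and len(path) > 1:
            chains.append(path)                  # this branch ends here

    walk([start])
    return chains, loops
```

Counting how many chains pass through each entity would give one simple version of the statistical summary mentioned above, though the note does not say which statistics are actually produced.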
4. General processing requirement
The data on the above tapes must be analyzed in such a way as to produce
a set of coordinates on tape which can be used to drive a graph plotter
of one form or another so as to portray suitable representations of such
networks of relationships. At this point two possibilities are envisaged:
- generation of "maps" as a substitute for questionnaires to
the concerned international bodies and to enable the relationships networks
to be "cleaned up" by editorial staff (Ideally this would be
- generation of clean maps with higher quality typography for publication
in the 1985 edition of the Yearbook of World Problems and Human Potential.
This can be done by Computaprint (London) once the drive tape is formatted.
Briefly explained, the problem may be compared to that of getting a computer
to plot a suitable schematic map of a subway system (or a road network)
when only the links between locations are given. In other words the optimal
position must be determined for each location and the links between them.
An additional desirable constraint is that the geometry should be relatively
comprehensible, namely ordered rather than arbitrary.
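To make the subway-map comparison concrete, the sketch below derives plotting coordinates from link data alone using a simple spring-style iteration: every pair of nodes repels, linked nodes attract, and the permitted movement shrinks on each pass. This is one illustrative technique chosen for the example, not a method selected by this note. The function name, the constants and the number of passes are all assumptions, and nothing here enforces the "ordered rather than arbitrary" geometry asked for above.

```python
import math
import random


def layout(nodes, links, iterations=200, seed=0):
    """Return {node: (x, y)} computed only from the list of (a, b) links."""
    rng = random.Random(seed)
    # Start from random positions in the unit square.
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))   # assumed preferred spacing between nodes
    step = 0.1                        # assumed maximum move on the first pass

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels, which spreads the map out.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                force[a][0] += f * dx / d
                force[a][1] += f * dy / d
                force[b][0] -= f * dx / d
                force[b][1] -= f * dy / d
        # Linked nodes attract, which keeps related entities together.
        for a, b in links:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            force[a][0] -= f * dx / d
            force[a][1] -= f * dy / d
            force[b][0] += f * dx / d
            force[b][1] += f * dy / d
        # Move each node along its net force, by no more than `step`.
        for n in nodes:
            fx, fy = force[n]
            mag = math.hypot(fx, fy) or 1e-9
            move = min(mag, step)
            pos[n][0] += fx / mag * move
            pos[n][1] += fy / mag * move
        step *= 0.98                  # shrink the moves so the layout settles
    return {n: (p[0], p[1]) for n, p in pos.items()}
```

The resulting coordinates are in arbitrary units and would still need to be scaled to the chosen page size before being written to a drive tape.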
5. Breakdown of the task
The task may be split into the following problem areas.
Processing strategies and run time choices between them
Optimal distribution of network across selected surface ("packing")
Determination of suitable scale for map(s)
Determination of mapping details (typography etc)
Fitting in entity name text.
These points are discussed separately in the following sections.
6. Processing strategies and run time choices between them
The following strategies might be envisaged as run time options:
6.1 Global: Get tops of all "most connected" trees out onto
a single map with whatever detail density considerations will permit.
6.2 Sub-global: Split global map into N sub-maps on which greater detail
6.3 Top-of-tree focused: Work out maps for many individual problem
clusters, each map being centred on the top of the relevant tree.
6.4 Chain-focused: Work out maps for networks ignoring links to tops
of trees or to unnecessary details.
6.5 Selection: Work out only from nodes specified individually at run
time, whether according to the 6.1 or 6.3 strategy.
7. Optimal distribution of network across selected surface ("packing")
Various strategies might be used depending partly on the surface chosen.
7.1 Cluster analysis: Some variant of multi-dimensional factor analysis
might be used. The weighting of the entities could be determined by the
number of their direct and/or indirect relationships of specific types.
7.2 Iterative procedure: It is possible that a fairly simple iterative
procedure could be used to determine the optimal locations on a surface.
This might be initiated by first positioning (selected?) tops of trees
before inserting nodes of a more detailed level.
7.3 Planar surface: Given the objective of plotting the maps on a planar
surface, one option is to project the results of any multi-dimensional
analysis onto such a surface and thus determine the plotting coordinates.
A planar surface would also simplify any iterative procedure.
7.4 Spherical surface: Determining the positions of the nodes on a
sphere is conceptually more attractive since it preserves and emphasizes
the "globality" of the network in contrast to the arbitrarily
bounded planar surface. The results of any multi-dimensional analysis could
be projected onto such a surface. In the case of any iterative procedure
this could be initiated with (selected) tops-of-trees positioned in an
equidistant (regular) pattern over that surface. Such a surface of course
raises questions as to how the results are to be represented on a plane.
7.5 Triangulated spherical surface: In order to be able to plot a map
represented on a spherical surface, the sphere may be represented by a suitably
triangulated polyhedral approximation onto which the spherical results
are projected. Coordinates for each triangular map can then be plotted.
The set of triangles may subsequently be edge-linked to provide a map of
larger portions of the network.
8. Determination of suitable scale for map(s)
The following factors determine the possible scale of the map:
8.1 Size of plotted map: This would normally be A4 page size but chart-size
maps could be produced for certain purposes.
8.2 Plotting constraints: These would be determined by how much data
could be fitted on without running the risk of over-writing.
8.3 Readability: Clearly whilst it may be possible to plot a very detailed
map this may be undesirable if the map thereby becomes unreadable. The
desirable "density" limits must be a run time option.
9. Determination of mapping details (typography, etc)
In order to increase the readability of the map it is valuable to take
advantage of the sophisticated typographical possibilities when the map
is directly projected onto print-ready film via a vector generator as opposed
to being plotted by computer-driven pens. Codes to trigger such typographical
features need to be provided with the coordinates for the drive tape.
9.1 Size of node: The size of the "dot" representing a
problem (etc) may be determined by its connectedness (whether at the same
level or at a lower level of detail).
9.2 Thickness of link: The thickness of the line representing the link
between two problems may be dependent on the relative importance ('size')
of the nodes so connected and possibly also on their distance apart.
9.3 Type of link: Possibilities such as using full line, dotted line,
or some other variant may also be envisaged to distinguish different types
of relationship or different levels of detail.
9.4 Colour: Whether plotted or vector-generated, different colour conventions
may be envisaged to distinguish different types of relationship. When vector-generated
onto film, this possibility is related to that of overlays in general.
9.5 Overlays: It may be useful to envisage the possibility of overlays
in order to produce maps of higher density and to print those using several
9.6 Stereo-effect: In order to facilitate comprehension of complex
networks, consideration should be given to the production of map overlays
(in red and green) calculated to create a stereo-effect (as is done with
the representation of complex molecules).
10. Fitting in entity name text
This is the problem of leaving space beside nodes for a name to be inserted
(by vector generator or plotter). Various techniques can be used to achieve
this and avoid over-writing:
10.1 Vary type size: According to the "size" of the node,
more or less space may be reserved. Smaller type sizes may be used for less
10.2 Truncate names: This possibility may be explored given that the
names tend to be long.
10.3 Omit names: This technique can be used for the smallest nodes
or least connected nodes.
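The three techniques listed above could be combined into a single labelling rule, as in the Python sketch below. The function name, the thresholds, the point sizes and the use of a trailing full stop to mark a truncated name are all assumptions made for illustration; the note leaves these choices open.

```python
def label_for(name, connections, max_chars=24, omit_below=2, large_from=10):
    """Return (text, point_size) for a node label, or None to omit the name.

    `connections` stands in for the "size" of the node; all thresholds
    are illustrative defaults, not values taken from the note.
    """
    if connections < omit_below:
        return None                                   # 10.3: omit the name
    size = 10 if connections >= large_from else 7     # 10.1: vary type size
    if len(name) > max_chars:
        name = name[:max_chars - 1].rstrip() + "."    # 10.2: truncate the name
    return name, size
```

For example, `label_for("War", 12)` gives `("War", 10)`, while a node with a single connection gets no label at all. A fuller treatment would also check each label's footprint against neighbouring nodes to avoid over-writing.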