Section of Report of a Preliminary Investigation of the Possibility of Using Computer Data Processing Methods (1968): a summary of the various parts of this report, and details of its contents (with links to the various parts), are provided separately
This study was initiated in order to take advantage of the facilities offered by computer typesetting in the production of reference publications. A description of the computer typesetting process is given in a note in Exhibit 19 (which refers to Exhibits 20, 21, 22, 23, 24). The process is a relatively new one in Europe except for its use in newspaper production. A recent survey showed that only about 1% of such systems are used by the group of organizations of which the UIA forms a part (see Exhibit 25). The same survey showed that of the existing systems, 78% were in the U.S.A. and Canada.
There are several methods by which computer typesetting can be used by the UIA. The process can be used for the preparation of the Yearbook of International Organizations only (2500 main entries; 1000 pages) plus its quarterly supplements. This could be combined with production of the annual International Congress Calendar plus its monthly supplements (3000 entries), and with the bibliographical Yearbook of International Congress Proceedings (8000 entries; 800 pages) plus its monthly supplements (200 entries per month). Each publication has information common to the others, so that changing or modifying one (e.g. address or organization title) requires a change in another.
Once data is stored on magnetic tape, it is possible to search the tape and thus perform part of the UIA research and study function. Alternatively, since many of the UIA publications require contact with the organizations listed in the Yearbook, the typesetting feature can be combined with the semi-commercial activities of the UIA. It is also possible to combine the three activities together, namely publications, research and semi-commercial activities.
1. Production of Publications without Research or Semi-Commercial Processing
The simplest procedure would be to treat the Yearbook, Calendar and Bibliography separately, although they might be dealt with one after the other on the computer. The advantages of combining the Yearbook and Calendar processing are discussed in a later section. The treatment of the texts for each publication would then be approximately the same, and for this reason the procedure will be illustrated by discussing the preparation of the Yearbook.
The main body of the Yearbook text, including the dictionary entries on individual organizations, cross-references, initials, etc., would be punched onto paper tape. This procedure could either be done by a service bureau or else at the UIA offices by a retrained secretary or a temporarily hired specialized secretary, using a hired machine of the Friden Flexowriter type.
The paper tape would then be sent to the computer service bureau where it would be converted into characters on a magnetic tape (as described in Exhibit 19). At the same time a proof would be supplied for checking at the UIA. It would be possible to reduce the number of expensive proof cycles by using a Flexowriter, which produces a typed output whilst the paper tape is being punched. The typed output could be checked prior to sending a tape to the service bureau. At this stage the typed text would look like that in Exhibit 22.
Once the text has been checked the magnetic tape contains data in a form which can be quickly processed to prepare a second paper tape which would act as instructions to an automatic composing machine. No further proof reading would be required.
The above paragraphs have described the method by which the UIA would ensure the preparation of the main body of the Yearbook text. This would presumably be done at a time when the maximum amount of up to date information was available, i.e. just after questioning international organizations prior to the preparation of a new edition of the Yearbook. Once the main body of the text is on magnetic tape, there would be no further necessity to retype the unchanging portions of text for later editions.
Modifications and Additions: Following the preparation of the first computer prepared edition, modifications to the text stored on magnetic tape would have to be made from time to time. These would be inserted in the appropriate place in the stored text sequence by the following procedure. A modifications and additions (called 'movements') paper tape would be punched at a service bureau or at the UIA as before. This would be sent to the computer service bureau where it would be converted to magnetic tape with production of a proof. Once the proof is approved the movements tape could be used (either on its own or with information on the main tape) to prepare the quarterly supplements to the Yearbook. A paper tape would be prepared by the computer containing instructions for an automatic composing machine. The Yearbook supplement could then be combined with the remaining magazine pages to produce the usual monthly issue of the magazine.
This procedure would be continued on a quarterly basis until the next edition of the Yearbook was to be printed. A larger movements paper tape would then be prepared on the basis of the replies from individual international organizations. This would be converted to magnetic tape with the usual proof and when correct would be combined with the main body of stored text. This could be done at the same time as the stored text for the various preceding quarterly supplements were incorporated into the main body of stored text. Alternatively, the latter could be combined with the main tape at the time of preparation of the individual supplements.
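The update cycle described above can be sketched in a few lines. This is only an illustration under stated assumptions: the main tape is modelled as an in-memory mapping from a sort key (here the organization title) to the entry text, and the names `Movement` and `apply_movements` are hypothetical, not taken from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movement:
    """One record on the hypothetical 'movements' tape."""
    key: str     # sort key, assumed here to be the organization title
    action: str  # "add" or "modify"
    text: str    # the new or corrected entry text


def apply_movements(master: dict[str, str], movements: list[Movement]) -> dict[str, str]:
    """Return a new main file with the movements merged in.

    The input mapping is copied, so the previous edition is not destroyed.
    """
    updated = dict(master)
    for m in movements:
        if m.action not in ("add", "modify"):
            raise ValueError(f"unknown action: {m.action!r}")
        updated[m.key] = m.text
    # Restore the alphabetical sequence of the stored text.
    return dict(sorted(updated.items()))
```

A quarterly supplement would then correspond to the list of movements itself, while a new edition corresponds to the merged result.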
The sections of text which are not worth storing, namely editorials, tables, contents pages, etc., which are completely modified from year to year, could be prepared in the same way, or else the manuscript could be sent directly to the printer in the normal manner.
Once all new information has been incorporated into the main tape, it can then be processed by computer to prepare a complete paper tape to instruct the automatic composing machine. Pagination would be performed automatically. There would be no further necessity for editorial examination of text.
Advantages and Disadvantages: The above procedure would be the simplest to implement and would, as stated earlier, also be applicable to the preparation of the Calendar and supplements, and the Bibliography and supplements. For each of these publications, this procedure has the disadvantage that the manuscript has to be prepared precisely as it is required in the final publication. No advantage is taken of the facility, once the text is stored on magnetic tape, of being able to search through the stored text, pick out items, resort them, and then produce text in a different order, without destroying the original text. This facility would be extremely useful in the preparation of various indexes to the Calendar and Bibliography. It would also be useful in the preparation of the geographical index and the classified list of organizations in the Yearbook.
A further advantage of this facility is that it could enable proofs of each entry to be prepared and sent automatically to the organizations concerned for checking, using the addresses automatically picked out and printed by computer.
Use of this facility means that the reordered text does not have to be prepared in manuscript form with the accompanying costs and errors (both logical and manipulative). In the case of the Yearbook, it could be used to prepare and sort cross-references, either from the organization title or on the basis of keywords in the body of the text. In this way the index of executive officers of organizations could also be prepared and collated with the dictionary presentation of information. This facility would also make it possible to prepare specialized directories of groups of organizations, e.g. medical, educational, etc. without any additional editorial work.
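As a rough illustration of the search-and-resort facility, the sketch below derives a geographical index from stored entries without altering them. The record fields `title` and `country` and the function name are assumptions made for the example; the report does not specify an entry format.

```python
from collections import defaultdict


def geographical_index(entries: list[dict]) -> dict[str, list[str]]:
    """Group organization titles by country, both sorted alphabetically.

    The stored entries are only read, never modified.
    """
    index = defaultdict(list)
    for entry in entries:
        index[entry["country"]].append(entry["title"])
    return {country: sorted(titles) for country, titles in sorted(index.items())}
```

The same pattern, keyed on a subject keyword instead of a country, would yield a classified list or a specialized directory.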
Several UIA publications which can only be prepared at long intervals or not at all, could be prepared in this way. These include the Directory of Periodicals, which would be based on the publications section of each entry. These possibilities, as distinct from statistical surveys, could be of great importance in improving the market for Yearbook information, as opposed to the Yearbook as an isolated publication.
There is one very important difficulty which prevents extensive use of this facility. This is discussed in a later section on the combination of UIA research with the production of publications. It is clear that the more detailed the indexing requirement becomes for the production of publications, the closer the computer search methods come to being surveys and studies of the stored data. The indexing problem, therefore, blends into the research problem.
Calendar Production: The Calendar text could be prepared in a manner similar to that outlined above. There is however another possibility. At present a considerable amount of time is spent by the person preparing the Calendar in manipulating card files and indexes to check whether a given meeting has already been listed correctly. This manipulation would still be required under a computer system, since the above procedure does not imply any selection of text prior to preparation of print instructions. The person preparing the Calendar must have reasonable knowledge of the many international organizations and their meetings.
An alternative procedure would not require any such knowledge or involve any manipulation of card files. Every international meeting would be punched directly onto paper tape in the UIA offices as though each were a bona fide new entry. This avoids the preparation of card files. Each entry would bear a status field which would not necessarily be printed in the Calendar. This status field would be a grading of the reliability of the information source, based on a few simple rules (e.g. date; newspaper 1; organization's periodical 7; letter from organization 9). The computer would then process the paper tape, comparing the new entries against the main tape. Where an entry had not previously been mentioned, it would be prepared for the Calendar supplement in the normal way. Where an entry had been mentioned previously, the computer would select whether the new entry should be printed as a modification on the basis of the respective dates and status fields. The above procedure would avoid the problem of retraining personnel in the Calendar section.
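The selection step might look something like the following sketch. The report only says that the decision rests on the respective dates and status fields; the specific rule used here (a higher status wins, and on equal status the more recently received item wins) is an assumption, as are the `key` field used to match meetings and the names `MeetingEntry` and `classify`.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MeetingEntry:
    key: str        # hypothetical identifier used to match the same meeting
    text: str       # the entry as it would be printed
    received: date  # date on which the information was received
    status: int     # reliability grade, e.g. newspaper 1, periodical 7, letter 9


def classify(new: MeetingEntry, main_tape: dict[str, MeetingEntry]) -> str:
    """Return "new", "modification" or "ignore" for an incoming entry."""
    old = main_tape.get(new.key)
    if old is None:
        return "new"
    if new.text == old.text:
        return "ignore"
    # Assumed rule: compare status first, then date of receipt.
    if (new.status, new.received) > (old.status, old.received):
        return "modification"
    return "ignore"
```

Under this rule a newspaper report would never override a letter from the organization, however recent; whether that is desirable is exactly the kind of policy question the status rules would have to settle.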
Other advantages of computer-aided Calendar production are that some attempt could be made to follow through series of meetings from year to year, incorporating constant information (e.g. approximate number of participants). This data could be used as a basis for automatically preparing questionnaires to organizations which are expected to hold a further meeting in a series, but for which no data is available. It would be possible to consider including with this data a proof covering all meetings scheduled by the organization for which data was available. This would improve the quality of the information and have the psychological advantage of showing the organization what free publicity was being given to its activities. Once the organizers' addresses can be conveniently used for the preparation of mailings, publicity could be automatically sent on the UIA Congress Science Series of publications.
Combination of Calendar and Yearbook Production: There are two items of data common to the Yearbook and Calendar (in the case of international organizations), namely the organization title and the organization address. It would be possible to prepare a Calendar movements paper tape at the UIA without including details of the address of the organization (where no special meeting address was supplied), so that the computer would perform the time consuming operation of consulting the Yearbook tape for the exact current address. This would then be transferred to the Calendar tape, prior to preparing the paper tape for the automatic composing machine.
Normally a change of organization address means that alterations have to be made to the Yearbook and to each of the cards on meetings scheduled for future years. Where there are several meetings per year, this is a time consuming operation. By keeping all addresses current on one tape only and transferring the required information just prior to preparing the final paper tape, these multiple corrections are avoided. There are two disadvantages to this. Firstly, it is more appropriate where the whole Calendar is reprinted, since modifications have to be included in the supplements anyway. Secondly, because the Yearbook and Calendar are ordered alphabetically and chronologically respectively, inserting the appropriate address in each calendar entry involves a complex, and probably uneconomical, search procedure.
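The address transfer amounts to a lookup from the Calendar entries into the Yearbook data. The sketch below assumes both files fit in memory and that the organization title is the matching key; the field names and the function `attach_addresses` are hypothetical.

```python
def attach_addresses(calendar_entries: list[dict], yearbook_addresses: dict[str, str]) -> list[dict]:
    """Copy each Calendar entry, filling in the current address.

    A special meeting address, where supplied, takes precedence over the
    organization's address held in the Yearbook file.
    """
    completed = []
    for entry in calendar_entries:
        address = entry.get("meeting_address") or yearbook_addresses.get(entry["organization"])
        completed.append({**entry, "address": address})
    return completed
```

A keyed lookup of this kind hides the difficulty noted above: on sequential magnetic tape, with one file in alphabetical and the other in chronological order, each lookup would imply a sort or a search pass rather than a direct access.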
A further advantage of combining the Calendar and Yearbook processing is that it would be useful to have the meetings of a given organization filed together with the Yearbook data.
Combination of Calendar and Bibliography Production: There are two items of data common to the Calendar and Bibliography, namely the organization title and the title of the meeting. The current format of the entries in both these publications has been made as similar as possible. This has been done to save recomposition costs, either using normal or computer methods.
The system was planned to operate as follows. Notification of a planned international meeting would be received by the UIA. An entry would be prepared for inclusion in the next Calendar supplement. The same entry would appear in successive annual Calendars until the meeting was held. During the intervening period, portions of the text would be reused, even if there were modifications to the basic data. The modified text would, of course, appear in an appropriate Calendar supplement. Just before or just after the meeting was held, the Calendar information would be used as a means of contacting the organizer of the meeting to indicate that the report could be listed in the Bibliography. Since the bibliographical and calendar entries are very similar, the latter would simply be modified when the bibliographical information was received and would appear in the monthly bibliographical list. The same entry would then appear in the cumulative bibliographical volumes. If these each covered an eight year period (e.g. 1960-67, 1962-69, etc.), the same entry would be reprinted four times in these volumes as the series progressed. It is clear that for a meeting in 1980, a portion of the data could remain constant through 10 to 20 reprints, which could represent a considerable saving in composition costs.
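The figure of four reprints follows from the overlap of the cumulative volumes, and can be checked with a short calculation. The sketch assumes, as in the example periods given, that volumes span eight years and start every two years from 1960; the function name is illustrative.

```python
def volumes_containing(year: int, first_start: int = 1960, span: int = 8, step: int = 2) -> list[tuple[int, int]]:
    """List the (first_year, last_year) cumulative volumes that include `year`."""
    return [
        (start, start + span - 1)
        for start in range(first_start, year + 1, step)
        if year <= start + span - 1
    ]
```

For a meeting held in 1966 this gives the volumes 1960-67, 1962-69, 1964-71 and 1966-73, i.e. four appearances. Only meetings in the first few years of the series, before enough volumes overlap, appear fewer times.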
Using normal printing techniques, there is one disadvantage to this system. It is not yet clear whether the proportion of constant text is sufficiently high to warrant the increased costs of manipulating the print metal. This disadvantage would also arise with the increased sorting costs of the stored texts. The other disadvantage is that it is not yet clear whether the Calendar or Bibliography will remain commercially viable.
Combination of Bibliography and Yearbook Production: The only advantage of doing this would be to enable the bibliographical data to be filed with other data on the organization so that a complete selection of the organization's publications could be supplied. The disadvantage of attempting to combine the two processes is that the Bibliography and Yearbook are ordered chronologically and alphabetically respectively. If the Bibliography had been in alphabetical order of organizations, as is normally the case, the two tapes could be combined easily.
Combination of Calendar, Yearbook and Bibliography Production: Each publication could be processed separately, but provision could be made for items in one tape of stored text to be transferred to or modified by items in one of the other tapes. Different degrees of combination must therefore be considered. This type of system offers no special advantages. The advantages come from combinations between two publications, and these have been discussed above. This system would combine most of the disadvantages listed above.
2. Production of Publications Combined with Research Processing
The advantage of combining the two processes is that separate coding of essentially the same information is avoided. Nearly all current surveys performed by the UIA are based on examination of the files on individual organizations or their meetings. If the costly manual surveys could be performed automatically, the UIA would be able to considerably increase its output of statistical data on international organizations. It has for example been suggested that a companion volume of statistical data could be sold with or incorporated into the Yearbook of International Organizations.
The great difficulty in combining the production of publications with research is deciding to what extent the computer is going to examine each organization or meeting entry to pick up data for statistical purposes or indexing. The more computer time that must be devoted to examining alphabetical data which is not in a precise position within a given entry, the more expensive the operation becomes. It is therefore not sufficient to declare that the UIA wants all possible research and indexing facilities. Each extra facility increases programming costs and computer time and also the rigidity of the system. In order to decide how best to combine publication production with research, precise specifications of the relative desirability of different indexing and research features must be supplied.
The simplest procedure by which the two could be combined would be to indicate at the end of each Yearbook organization entry, which is to be converted to printed text, the keywords to be used for indexing. In the case of other data for survey purposes, this could be coded numerically according to some rules. The computer is much more efficient at processing numerical data. An ideal organization entry would therefore have to be analyzed to determine the number of items of data that might be usefully coded at the time of preparation of each entry (e.g. code(s) for aim(s), date of foundation, member distribution by country, budget and sources of finance, activities, publications, etc.).
It is not necessary to decide how these codes are to be analyzed, since this would be done by a separate program (or a standard program). It is however most important to decide what questions are to be asked during the life of the program, since it is often complicated and costly to modify the code positions and insert new positions. The obvious disadvantage of detailed coding is that it increases the time required to prepare an entry, since the intellectual effort is performed at the time of coding rather than at the time at which the questions are asked. If too much coding is required, this may also considerably increase the space requirement on the tape.
If a separate field for coding is adopted, it is then debatable whether the two functions should not be separated. The computer can scan coded data more rapidly when it is in a standard form, rather than when each code block is separated from the next by a variable length of text. The separate treatment of research is discussed in a later section, as are the present indications of the likely questions to be asked.
It seems clear that the search facility to be used with publication preparation should be restricted to picking up indicated keywords for the preparation of indexes or the production of specialized directories. The detailed survey facility can be more satisfactorily performed on a separate basis. The only argument against this is that combining the two processes increases the volume of data processed. This is important where the set-up time of the computer forms a significant portion of the run time. This point is discussed in a later section.
3. Production of Publications Combined with Semi-Commercial Processing
The main link between the publication production and the semi-commercial processing is the names and addresses of international organizations which are listed in the Yearbook and to which the magazine, publicity or questionnaires are sent during semi-commercial processing. A change of address in the one system must be duplicated in the other. The advantage of combining the two is that this duplication would be avoided and there would be a guarantee that the mailing file was complete. The disadvantage of combining the two systems is the computer time required to scan through a mass of text to pick out the addresses and names of organizations. This could be avoided by maintaining the names and addresses on a separate magnetic tape or disk common to both the Yearbook file and to a file containing data on subscription payments, etc. There is however a further disadvantage, namely that international organizations only form about 25% of the mailing file. The purely commercial customer data could not be satisfactorily combined with the Yearbook data even through a common file. Since the number of modifications to international organization name and address data is relatively low, it might be preferable to use the same input format for the data for separate publication and semi-commercial processing systems. The data would then be punched once but processed twice by the computer, once in each system.
Semi-commercial processing includes all use of name and address data for preparing publicity mailings, questionnaires, sending magazines to subscribers, reminders, invoices, receipts, congress fee processing, etc. This can be split into two groups, namely mailing and accounting procedures. Both these groups include data processing problems which form the bulk of processing in commercial computer installations. There is therefore a considerable advantage in treating these problems in a separate system, since standard program packages or at least standard tested techniques can be applied to their solution.
4. Semi-Commercial Processing of UIA Data Only
For the UIA, the main difficulty in using a computer for these problems is that the volume of data that would be processed on a monthly basis is very close to the point at which it would cost more to set up the computer than to perform the processing. In other words, the fixed costs of operation are equal to, if not greater than, the variable costs of processing. This is not necessarily a disadvantage, provided that the total processing cost is less than the current system costs.
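The relation between fixed and variable costs can be made concrete with a small calculation. The report quotes no cost figures at this point, so the numbers in the usage note are purely hypothetical, as are the function names.

```python
def monthly_cost(setup_cost: float, cost_per_record: float, records: int) -> float:
    """Total cost of one monthly run: fixed set-up plus per-record processing."""
    return setup_cost + cost_per_record * records


def break_even_volume(setup_cost: float, cost_per_record: float) -> float:
    """Volume at which the variable processing cost equals the set-up cost."""
    return setup_cost / cost_per_record
```

With a hypothetical set-up cost of 100 units and 0.25 units per record, the break-even volume is 400 records; below that volume the run is dominated by set-up. The run is still worthwhile if `monthly_cost(...)` is lower than the cost of the current manual system.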
There are two possible means of reducing the cost to the UIA. The first is to make use of the less sophisticated equipment which is used in conjunction with electronic computers. This includes sorters, collators, tabulators and card-entry computers. The disadvantage here is that the volume of data is still very low and the lack of flexibility of these machines means that many runs using different programs have to be performed to achieve the desired result. In some cases, even with many runs, the processing required cannot be satisfactorily performed.
5. Semi-Commercial Processing of Data of Organizations Cooperating with UIA
The second solution is to combine the data processing of the UIA with that of other organizations to increase volume, thus reducing monthly costs, and to reduce the initial cost of preparing the programs required. This solution would be in line with the UIA's aim to pioneer the use of new techniques for international organizations.
6. Semi-Commercial Processing with Research Processing
Each organization in the mailing file must be provided with a code to indicate its importance and field of interest. This coding could be developed to include all the detailed coding required for research purposes. The advantage of this would be that the volume of processing for the two operations would be increased thus reducing the costs. The disadvantage is that only 25% of the mailing file would need to be scanned for statistical studies. This would make the processing inefficient. In addition, the increased number of codes would increase the storage space required on tape, which would also contribute towards increasing the processing time.
Semi-Commercial Processing with Publication Production (3)
This system has been discussed under Publication Production.
The data processing for research purposes may be split into two groups. These are the statistical surveys of large groups of organizations or their meetings, and individual queries on particular organizations, meetings or the resultant publications. The first group results in articles published in the UIA magazine or is financed under a contract. The second group arises from correspondence queries and is in most cases understood to be performed free of charge. This group cannot be eliminated since it is the ability of the UIA to answer such questions which helps to build its reputation. These queries also represent matters of vital interest to persons in the process of establishing or facilitating the establishment of international links. In terms of the major UIA objective, it is therefore important that such queries should be answered wherever possible. A note on research processing is given in Exhibit 26. It would not be practical to produce a system which could answer all conceivable questions on the data on organizations and their activities. The system should be designed to answer as many general survey questions as possible, together with particular queries that correspond to the general survey level of complexity. All complex questions, particularly those requiring information which is specific to a given organization or which require cross-reference between several files, should be done manually.
Research with the Production of Publications (2): This system has been discussed under the Production of Publications. The disadvantages of combining the two processing problems were mentioned there. It is clear that it would be more efficient to separate any survey operation which requires precoding of entries.
Research with Semi-Commercial Processing (6): This has been discussed under Semi-Commercial Processing.
7. Research without any other Processing
The main advantage of separating the research from the other processing is that the system can be designed to answer queries in the most efficient manner. If the research was combined with other processing it would make the processing for publications or sales much more complex, it would increase the cost of designing and running such systems and it would make it difficult for alterations to be made to the survey codes. The simplest method of implementing a research system would be to code information on organizations, meetings and their publications, punch the codes onto standard cards and scan these either directly using a classical card sorter, or indirectly by first converting the codes to tape and then conducting a standard tape search for which many standard programs are available. The latter method would be better if complex comparisons were necessary.
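A coded card file of this kind can be modelled as fixed-width records in which each code occupies known columns. The layout below (organization number, aim code, year of foundation, country code) is invented for illustration; the report does not define a card format, and the function names are likewise hypothetical.

```python
from collections import Counter

# Hypothetical card layout: zero-based column ranges, end exclusive.
FIELDS = {
    "org_id": (0, 5),
    "aim": (5, 7),
    "founded": (7, 11),
    "country": (11, 13),
}


def field(card: str, name: str) -> str:
    """Read one coded field from its fixed position on the card."""
    start, end = FIELDS[name]
    return card[start:end]


def select(cards: list[str], name: str, value: str) -> list[str]:
    """Pick out the cards carrying a given code, as a sorter pass would."""
    return [card for card in cards if field(card, name) == value]


def tabulate(cards: list[str], name: str) -> Counter:
    """Count the cards for each value of a field, as in a simple survey."""
    return Counter(field(card, name) for card in cards)
```

Because every code sits at a fixed position, adding a new code only means assigning it unused columns, which is why leaving spare space on the card matters.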
Provided space was left on the card or cards for further codes, these could be added if some additional item proved to be worth scanning regularly.
It is possible to combine the computer typesetting, the research and the semi-commercial systems together. The advantage of this is that all addresses are updated only once in one system and do not have to be duplicated in a second system. Once the data files have been set up for operation, the computer time would be usefully employed. This would mean that the cost of setting up two or three systems separately would be avoided.
The disadvantages of combining the three procedures into one integrated process are:
this work is licenced under a creative commons licence.