30th October 2006 | Draft
Generating a Million Questions from UIA Databases
Problems, Strategies, Values
Also published in modified form in Statistics, Visualizations and Patterns (Vol 5 of the Yearbook of International Organizations, Munich, K G Saur Verlag, 6th edition, 2006/2007, as sections 10.1.1 and 10.1.2)
Background
The experiment described below follows from an initial interest of the German Research Centre for Artificial Intelligence (DFKI), in support of the questions project of the international nonprofit organization Dropping Knowledge – as clarified during a workshop on the online databases of the Union of International Associations (Saarbrucken, 8 December 2005). Dropping Knowledge subsequently appropriated this information as the basis for establishing an online web facility to enable people worldwide to ask questions and to be exposed to answers -- thereby creating a "Living Library". The categorization of the questions was undertaken using the ontology developed by the UIA (cf Enabling a Living Library, 2006).
The concern here, in contrast, is whether it was possible to generate a Questions
database from three long-established UIA databases: World
Problems-Issues, Global Strategies-Solutions,
and Human Values. These databases
are part of the online Encyclopedia
of World Problems and Human Potential, originally initiated in collaboration
with Mankind 2000, whose development was most recently funded by the European
Commission. The databases are integrated with others on international organizations,
international meetings, biographies and bibliographies (cf Yearbook
of International Organizations, International
Congress Calendar).
The thousands of problems, strategies and values identified from the documented
preoccupations of the network of international organizations (governmental
and nongovernmental) provide a relatively objective focus for the generation
of questions associated with those preoccupations -- or implicit in them.
Clearly a particular interest in this experiment is to determine in what ways
the result of generating questions could be meaningful and significant. The
work builds on the possibilities of the use of such databases for simulations
(cf Simulating
a Global Brain: using networks of international organizations, world problems,
strategies, and values, 2001).
WH-questions
There is an extensive literature on what are termed “WH-questions”. “WH-questions” refer to questions of the type: How? Why? Where? What? Which? When? Who? Further comments on studies in relation to such questions are noted below.
The Questions database was first generated experimentally in December 2005, and then more comprehensively in October 2006. In each case this was done by applying a template of the WH-questions to the titles of Problems, Strategies and Values. The "seed title" (XXXX) is embedded in a suitable template phrase. For example:
- How is XXXX caused?
- Who is responsible for XXXX?
- Where does XXXX occur?
- When does XXXX occur?
- What is XXXX?
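
The template step listed above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the Questions database: the names `Template`, `PROBLEM_TEMPLATES` and `generate_questions`, and the seed title "deforestation", are assumptions introduced here.

```python
from typing import List, NamedTuple


class Template(NamedTuple):
    wh: str       # WH-query word, e.g. "How"
    phrase1: str  # text placed before the seed title
    phrase2: str  # text placed after the seed title (may be just "?")


# The five example templates listed above (hypothetical variable name)
PROBLEM_TEMPLATES = [
    Template("How", "is", "caused?"),
    Template("Who", "is responsible for", "?"),
    Template("Where", "does", "occur?"),
    Template("When", "does", "occur?"),
    Template("What", "is", "?"),
]


def generate_questions(seed_title: str, templates: List[Template]) -> List[str]:
    """Embed one seed title (the XXXX slot) in every template."""
    questions = []
    for t in templates:
        # No space is inserted when the closing phrase is only a question mark
        sep = "" if t.phrase2.startswith("?") else " "
        questions.append(f"{t.wh} {t.phrase1} {seed_title}{sep}{t.phrase2}")
    return questions


# "deforestation" is an illustrative seed title, not taken from the databases
for question in generate_questions("deforestation", PROBLEM_TEMPLATES):
    print(question)
```

Keeping the WH-word, the two phrases and the seed as separate fields mirrors the column layout of the template table, so each database can be given its own template list.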
The range of templates is illustrated by the following table:

Templates used experimentally to generate questions (generated query; ** = not finally used)

| Source database | WH-query | Phrase-1 | Seed | Phrase-2 |
|---|---|---|---|---|
| World Problems-Issues (13 templates applied) | How | much | XXXX | is there? |
| | How | does | XXXX | happen? (**) |
| | How | is | XXXX | caused? (**) |
| | Why | does | XXXX | happen? |
| | Why | give priority to | XXXX | ? |
| | Why | does God allow | XXXX | to happen? (**) |
| | Why | be concerned by | XXXX | ? (**) |
| | Where | does | XXXX | occur? |
| | What | is | XXXX | ? |
| | What | causes | XXXX | ? |
| | What | results in | XXXX | ? (**) |
| | Who | causes | XXXX | ? |
| | Who | is responsible for | XXXX | ? |
| | Who | is concerned about | XXXX | ? |
| | When | does | XXXX | occur? |
| | When | will | XXXX | occur? |
| | When | did | XXXX | arise? |
| | Which | kind of | XXXX | ? |
| Global Strategies-Solutions (14 templates applied) | How | can | XXXX | be enabled? |
| | Why | is | XXXX | unsuccessful? |
| | Why | give priority to | XXXX | ? |
| | Where | is | XXXX | undertaken? |
| | Where | is | XXXX | successful? |
| | What | is required for | XXXX | ? |
| | What | causes | XXXX | to fail? |
| | Who | undertakes | XXXX | ? |
| | Who | is responsible for | XXXX | ? |
| | Who | is concerned about | XXXX | ? |
| | When | is | XXXX | undertaken? |
| | When | will be | XXXX | undertaken? |
| | When | was | XXXX | undertaken? |
| | Which | kind of | XXXX | ? |
| Human Values (constructive or destructive -- 9 templates applied) | How | is | XXXX | elicited? |
| | Why | is | XXXX | valued? |
| | Why | give priority to | XXXX | ? |
| | Where | is | XXXX | found? |
| | What | is | XXXX | ? |
| | Who | exemplifies | XXXX | ? |
| | Who | values | XXXX | ? |
| | When | is | XXXX | evident? |
| | Which | kind of | XXXX | ? |
| Human Values (polarities -- 10 templates applied) | How | are | XXXX | related? |
| | How | can | XXXX | be reconciled? |
| | How | can | XXXX | be transcended? |
| | Why | is the | XXXX | relation so challenging? |
| | Where | are | XXXX | reconciled? |
| | What | transcends | XXXX | ? |
| | Who | embodies | XXXX | ? |
| | Who | exemplifies the | XXXX | ambiguity? |
| | When | are | XXXX | transcended? |
| | Which | kind of | XXXX | relationship? |

As is clear from the table above, different templates were used both according to the source database and according to the WH-Question. Since many Problems and Strategies have a number of alternative titles (notably employing synonyms), these too have been used as seeds for the generation of alternative titles for a question -- effectively constituting alternative formulations of the same question (but clustered together in the Question entry). Although they may be accessed through their keywords, they are not treated as distinctly profiled questions.
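
A minimal sketch of how alternative titles might be clustered under a single Question entry follows. The structure `QuestionEntry`, the function `build_entry` and the example titles are hypothetical; the source does not describe the actual record layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QuestionEntry:
    """One question, with its alternative formulations kept alongside it."""
    main: str
    alternatives: List[str] = field(default_factory=list)


def build_entry(main_title: str, alternative_titles: List[str],
                wh: str, phrase1: str, phrase2: str) -> QuestionEntry:
    """Apply one WH-template to the main title and to each alternative title."""
    def fill(title: str) -> str:
        # No space is inserted when the closing phrase is only a question mark
        sep = "" if phrase2.startswith("?") else " "
        return f"{wh} {phrase1} {title}{sep}{phrase2}"

    return QuestionEntry(
        main=fill(main_title),
        alternatives=[fill(title) for title in alternative_titles],
    )


# Illustrative titles only, not taken from the UIA databases
entry = build_entry("soil erosion", ["land erosion"], "What", "causes", "?")
print(entry.main)          # What causes soil erosion?
print(entry.alternatives)  # ['What causes land erosion?']
```

Holding the alternatives inside the entry, rather than as separate entries, reflects the statement that they are not treated as distinctly profiled questions.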
Preliminary results
The very preliminary results in generating these questions in December 2005 are indicated in the following table.

Preliminary results (December 2005)

| | Seed entities: Total | Seed entities: Selected | WH-templates used | Questions generated: Main | Questions generated: WH-Variants |
|---|---|---|---|---|---|
| Problems | 59205 | 12995 | 13 | 168935 | 239252 |
| Strategies | 42032 | 12848 | 14 | 179872 | 167426 |
| Values | 3257 | 3209 | 9 / 10 | 29111 | 16470 |
| Totals | 104494 | 29052 | | 377918 | 423148 |
| Totals (Main + WH-Variants) | | | | 801066 | |

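
The preliminary figures can be cross-checked arithmetically. For Problems and Strategies, the Main count equals the number of selected seed entities multiplied by the number of templates applied, which suggests one main question per selected seed per template. This reading is inferred from the numbers and is not stated in the source. Values is left out of the product check because two template sets (9 and 10) were applied. The variable names below are illustrative.

```python
# Figures copied from the preliminary results (December 2005)
selected_seeds = {"Problems": 12995, "Strategies": 12848}
templates_used = {"Problems": 13, "Strategies": 14}
main_questions = {"Problems": 168935, "Strategies": 179872, "Values": 29111}
wh_variants = {"Problems": 239252, "Strategies": 167426, "Values": 16470}


def expected_main(database: str) -> int:
    """Assumed relation: selected seeds x templates applied."""
    return selected_seeds[database] * templates_used[database]


for database in selected_seeds:
    matches = expected_main(database) == main_questions[database]
    print(database, expected_main(database), matches)

total_main = sum(main_questions.values())
total_variants = sum(wh_variants.values())
print(total_main, total_variants, total_main + total_variants)
# prints: 377918 423148 801066
```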
Indication of distribution of seed entities by type

| Type | Problems-Issues profiles 1996 | Problems-Issues profiles 2000 | % | Problems-Issues links 1996 | Problems-Issues links 2000 | % | Strategies-Solutions profiles 1996 | Strategies-Solutions profiles 2000 | % | Strategies-Solutions links 1996 | Strategies-Solutions links 2000 | % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 0 | 196 | n.a. | 0 | 3,507 | n.a. | 0 | 1,518 | n.a. | 0 | 16,767 | n.a. |
| B | 170 | 187 | 10% | 5,300 | 7,090 | 34% | 158 | 154 | -3% | 3,697 | 4,253 | 15% |
| C | 575 | 722 | 26% | 13,816 | 19,347 | 40% | 1,100 | 1,089 | -1% | 17,096 | 25,206 | 47% |
| D | 2,162 | 2,740 | 27% | 30,613 | 52,451 | 71% | 3,315 | 3,452 | 4% | 19,374 | 43,329 | 124% |
| E | 3,857 | 5,378 | 39% | 29,626 | 52,587 | 78% | 3,008 | 5,298 | 76% | 11,092 | 50,677 | 357% |
| F | 3,072 | 3,917 | 28% | 38,625 | 61,604 | 59% | 1,382 | 1,972 | 43% | 7,015 | 19,580 | 179% |
| G | 2,153 | 30,279 | 1,306% | 5,979 | 47,112 | 688% | 7,685 | 13,107 | 71% | 3,604 | 69,059 | 1,816% |
| Other | 214 | 12,716 | 5,842% | 905 | 26,255 | 2,801% | 12,850 | 6,105 | -52% | 61,129 | 34,070 | -44% |
| Total | 12,203 | 56,135 | 360% | 124,864 | 269,953 | 116% | 29,498 | 32,695 | 11% | 123,007 | 262,941 | 114% |

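
The % columns in the distribution table appear to give the relative change from 1996 to 2000, rounded to a whole percent, with "n.a." where the 1996 count is zero. That interpretation is an assumption checked against several rows, not a definition given in the source. The function name `percent_change` is illustrative.

```python
from typing import Optional


def percent_change(count_1996: int, count_2000: int) -> Optional[int]:
    """Relative change from 1996 to 2000 as a whole percent.

    Returns None (shown as "n.a." in the table) when the 1996 count is zero.
    """
    if count_1996 == 0:
        return None
    return round((count_2000 - count_1996) / count_1996 * 100)


# Rows checked against the table: type B and type G problem profiles,
# type B strategy profiles, and type A (no 1996 figure)
print(percent_change(170, 187))     # 10
print(percent_change(2153, 30279))  # 1306
print(percent_change(158, 154))     # -3
print(percent_change(0, 196))       # None
```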
Indication of seed entity relationships (hierarchical links and functional links)

| | Broader | Narrower | Related | Aggravating | Aggravated by | Reducing | Reduced by |
|---|---|---|---|---|---|---|---|
| Problems | 26403 | 35500 | 14264 | 31024 | 31105 | 1507 | 1529 |
| Strategies | 27134 | 32541 | 3010 | 3302 | 2902 | 17826 | 16911 |
| Values | | 11392 | | | | | |
| Totals | | | | | | | |

The subsequent generation of the Questions in October 2006 gave the following results:
Final results of question generation (October 2006)

| | Problems: Main | Problems: WH-Variants | Strategies: Main | Strategies: WH-Variants | Values: Main | Values: WH-Variants | Value-Polarities: Main | Value-Polarities: WH-Variants | Totals: Main | Totals: WH-Variants | Totals: All |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Seed entities (per database) | 45892 | | 31055 | | 2978 | | 229 | | | | 80154 |
| WH-templates | 7 | 6 | 7 | 7 | 7 | 2 | 7 | 3 | 28 | 18 | 46 |
| Main | 321244 | - | 217385 | - | 20846 | - | 1603 | - | 561078 | - | 561078 |
| (Alternative titles) | 133917 | 114786 | 107422 | 107422 | 9807 | 2802 | - | - | - | - | - |
| WH-Variants | - | 275352 | - | 217385 | - | 5956 | - | 687 | - | 499380 | 499380 |
| Total questions | 321244 | 275352 | 217385 | 217385 | 20846 | 5956 | 1603 | 687 | 561078 | 499380 | 1060458 |
| Broader | 433244 | 371352 | 370174 | 370174 | 79765 | 22790 | 0 | 687 | 883183 | 765003 | 1648186 |

Narrower |
346577 |
297066 |
284396 |
284396 |
0 |
0 |
79744 |
34176 |
710717 | |