Disclaimer: This document reflects only the author's view and the Commission is not
responsible for any use that may be made of the information it contains.
This project has received funding from the European Union's Horizon 2020 research and innovation programme under
grant agreement No 687847
D4.1 Enriched Semantic Models of Emergency Events
History
Table of contents
History
Table of contents
List of tables
List of figures
Executive summary
1 Introduction
1.1 Objectives and Modelling Principles
1.2 Design Approach and Methodology
1.2.1 The NeOn Modelling Approach
1.2.2 The Qualitative and Structural Design Methodology
1.2.3 Ontology Evaluation
2 Structure of this document
Part I: Requirements Analysis and Model Specifications
3 Introduction
4 Requirements Information Sources
4.1 COMRADES General Requirements
4.2 Stakeholder Interviews
4.3 Ushahidi Data Structures
4.4 Crisis Related Datasets
5 The Ontology Requirement Specification Document (ORSD)
5.1 COMRADES Aims and Model Purpose
5.2 Intended Use and Users
5.3 Competency Questions
5.3.1 Work Package Requirements
5.3.2 Interviews and Qualitative Requirements
5.3.3 Ushahidi Data Structures
5.3.4 Crisis Related Datasets
6 Term Glossary
7 Summary
Part II: COMRADES Ontology Model
8 Introduction
9 Model Principles
10 Ontology Components
10.1 Classes and Relations
List of tables
List of figures
Executive summary
COMRADES (Collective Platform for Community Resilience and Social Innovation
during Crises, www.comrades-project.eu) aims to empower communities with
intelligent socio-technical solutions to help them reconnect, respond to, and recover
from crisis situations.
Based on the NeOn methodology [1] and a qualitative and structural design approach
[2], we created an Ontology Requirement Specification Document (ORSD) [3] that
highlights the needs and specifies the competency questions that the model must
address in order to comply with the COMRADES model requirements.
Although we cannot completely evaluate the ontological model since some data is not
yet available for the model (i.e. the COMRADES platform is not yet fully developed),
we show that the model can successfully represent 102 different competency questions.
1 Introduction
The representation of crisis events and micro-events is a key aspect of the
COMRADES European project, which aims to create an open-source community
resilience platform to help manage emergency crises. The model needs
to be easily integrated into the Ushahidi¹ platform, as Ushahidi will be used as the backbone of
the developed resilience platform.
The focus of this deliverable is to provide a common semantic model that can be used
across all the different aspects of the platform. In particular, the proposed model should
support the collection and organisation of user reports and related information. In the
rest of this document, we refer to this model as the COMRADES model.
Our development approach follows a qualitative and structural design methodology [2]
in which requirements and modelling needs are extracted from stakeholder interviews,
existing platform data structures (e.g. Ushahidi) and datasets (e.g. Twitter² and
ACLED³). Using datasets as input while developing the COMRADES
model is motivated by the need to represent a large variety of input sources in the
model. This differs from other ontology development methods, in which existing models
are first reviewed by matching them to existing datasets and then extended. The
COMRADES development approach first identifies requirements from the datasets and
other sources, then creates an ontology, and finally aligns it with existing
ontologies where possible. This approach has two advantages: it integrates better with requirements
that are not specified in existing data, such as requirements obtained from interviews,
and it yields a simpler ontology model.
¹ Ushahidi, https://www.ushahidi.com/.
² Twitter, https://twitter.com/.
³ ACLED, http://www.acleddata.com/.
In order to develop the COMRADES platform, a model that allows for such analysis is
required. Although a few models have been designed in the past, most of them
focus on particular aspects of crises and tend to be overly complex. The aim of the
proposed model is to provide a flexible and minimal model that addresses the
representation of events and the associated evidence and resources, and provides a means
for coordinating action automatically.
The COMRADES model is directly linked with the other WP4 tasks as well as with the needs of the other
work packages. In particular, it needs to support event and micro-event modelling
(T4.2) and action coordination (T4.3), as well as multilingual processing (T3.1), content
informativeness representation (T3.2) and content validity assessment (T3.3).
Although there are different technologies for representing ontologies, we decided to use
RDF/OWL because the COMRADES project needs to deal with data obtained from social
media and online data sources, and RDF/OWL is a semantic technology particularly
well suited to this setting. Moreover, since the COMRADES platform will be web based,
RDF/OWL eases the integration of the model, as web frameworks can manipulate RDF/OWL
data easily.
Even though different methods exist for building ontologies and data models, we
decided to rely on two approaches for building the COMRADES model. First,
we partially follow the NeOn methodology [1], a comprehensive approach
for specifying, developing and evaluating ontologies. Second, we apply the
qualitative and structural design approach [2] to include non-ontological resources
and user studies during the specification phase of the COMRADES model.
Although the COMRADES model may be applied to crises and scenarios not
covered by the COMRADES project, its goal is limited to fulfilling the
project requirements, so that it remains a relatively simple model that can be easily
reused within the COMRADES project. The COMRADES model therefore
follows the project requirements rather than attempting to provide a single model that suits all
existing and future crisis platforms. Nevertheless, we aim to provide a model that can
be extended easily, so that it may be integrated into additional scenarios in the future.
Several methods exist for designing ontological models, such as Methontology
[4], On-To-Knowledge [5], and DILIGENT [6]. However, we decided to focus on the
NeOn approach since it helps the integration of existing models and the reuse of
non-ontological models.
For creating the COMRADES model, we follow Scenarios 1 and 9 of the NeOn methodology. To
some extent we also try to reuse some commonly used ontologies, as outlined in other
scenarios. However, we do not strictly follow the ontology reuse scenario as it is not
the focus of the COMRADES data model.
The main scenario for developing the COMRADES model is Scenario 1, as the
COMRADES ontological model needs to be developed from scratch. An important
task of this scenario is the creation of an Ontology Requirement Specification
Document (ORSD) [3] that describes the purpose, scope, implementation language,
target group and intended uses of the specified model. In particular, this specification
document needs to define a set of requirements that are defined as a set of Competency
Questions (CQs). In order to create the ontology requirement specification, we need to
collect knowledge from different sources and design competency questions.
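
To illustrate how a competency question can act as a testable requirement, the following minimal Python sketch pairs a question with a check over a handful of triples. The question text, the `ex:` identifiers and the helper names are hypothetical placeholders and are not taken from the ORSD.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

# A triple is a (subject, predicate, object) statement.
Triple = Tuple[str, str, str]


@dataclass
class CompetencyQuestion:
    """A requirement phrased as a question, plus a way to answer it."""
    text: str
    answer: Callable[[Set[Triple]], List[str]]


def subjects_with(triples: Set[Triple], predicate: str, obj: str) -> List[str]:
    """Return the sorted subjects that carry the given predicate/object pair."""
    return sorted(s for (s, p, o) in triples if p == predicate and o == obj)


# Hypothetical toy data; the ex: names are placeholders, not COMRADES terms.
TRIPLES: Set[Triple] = {
    ("ex:report1", "ex:language", "en"),
    ("ex:report2", "ex:language", "fr"),
    ("ex:report3", "ex:language", "en"),
}

CQ_LANGUAGE = CompetencyQuestion(
    text="Which reports are written in English?",
    answer=lambda t: subjects_with(t, "ex:language", "en"),
)


def is_supported(cq: CompetencyQuestion, triples: Set[Triple]) -> bool:
    """In this sketch, a CQ is supported if the data yields at least one answer."""
    return len(cq.answer(triples)) > 0
```

In this sketch a model "supports" a question when the data it describes can produce an answer; a real evaluation would more likely express each question as a query against the RDF/OWL model.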
The COMRADES model needs to integrate with the existing Ushahidi platform and to
map existing non-ontological resources (i.e. existing datasets). Although the second
scenario is in principle designed to help with this task by proposing a non-ontological
resource reuse process, it focuses on glossaries, dictionaries, lexicons, classification
schemes, taxonomies and thesauri. These types of resources are very different from
the ones we integrate when designing the COMRADES model, as we focus on
existing data structures from the Ushahidi platform and third-party
datasets, which are more complex models than dictionaries and thesauri. As a result, the
second scenario is not really suitable for our task.
Although Scenarios 3, 4, 5, 6, 7 and 8 could also be applied to the design of the
COMRADES model, the main focus is to provide modelling support for the different
project tasks, the integrated datasets and the Ushahidi platform⁴. Existing crisis ontologies
are not completely relevant for COMRADES since they either focus on very specific
use cases or are not designed to integrate with a large variety of data sources. As a
consequence, the COMRADES modelling task does not concentrate on these
scenarios.
Nevertheless, when possible, we try to map some of the key COMRADES concepts to
existing ontologies that are not necessarily designed for representing crises (e.g. SIOC,
FOAF, DC Terms).
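
As a rough illustration of such a mapping, the sketch below keeps an alignment table from model concepts to terms in SIOC, FOAF and DC Terms and expands them to full IRIs. The namespace IRIs and the external terms are the published ones; the left-hand concept names and the choice of which concept maps to which term are assumptions made for this example only.

```python
from typing import Optional

# Published namespace IRIs of the three vocabularies.
NAMESPACES = {
    "sioc": "http://rdfs.org/sioc/ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical alignment: the keys are placeholder concept names,
# not the actual COMRADES class or property names.
ALIGNMENT = {
    "Post": "sioc:Post",
    "UserAccount": "sioc:UserAccount",
    "Agent": "foaf:Agent",
    "title": "dcterms:title",
    "created": "dcterms:created",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'sioc:Post' into a full IRI."""
    prefix, local = curie.split(":", 1)
    return NAMESPACES[prefix] + local


def external_iri(concept: str) -> Optional[str]:
    """Return the aligned external IRI, or None when no mapping is defined."""
    curie = ALIGNMENT.get(concept)
    return expand(curie) if curie else None
```

Concepts without a suitable external term simply stay unmapped, which matches the "when possible" wording above.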
One of the main shortcomings of the NeOn methodology is that the approach does not
consider existing data structures and datasets as part of the development process. Even
though the second scenario proposes the integration of non-ontological resources, its
focus is not on existing data structures but on non-practical knowledge (e.g. thesauri,
dictionaries). Therefore, the NeOn approach is mostly suitable when: 1) the new
ontology needs to represent or integrate completely new datasets; or 2) the new
ontology needs to integrate with existing ontologies.
⁴ As discussed previously, the Ushahidi platform will be the backbone of the COMRADES resilience platform. As a result, the model needs to support all the data structures used in the Ushahidi software.
In this context we propose to integrate elements from the qualitative and structural
design methodology [2] where the design of a particular model is extracted from
qualitative studies (e.g. interviews, surveys, etc.) and the structural analysis of datasets
and the structure of existing software platforms or processes (e.g. thread structure of
the data, user social interactions).
In order to use this methodology, we need to: 1) obtain requirements and perceived
needs by stakeholders using interviews or surveys (qualitative phase); and 2) collect
the data that needs to be represented (e.g. Ushahidi data format, Twitter posts)
(structural phase).
Following these phases, the interviews are used to obtain functional requirements and
identify important features that are necessary for designing a new model. Similarly,
data structures are analysed to create a common representation that feeds into the
competency questions and the model implementation.
Even though a complete evaluation of the COMRADES model requires the actual
deployment of the model as part of the COMRADES resilience platform and the
integration of the inputs and outputs of the different work packages, we propose to
focus on a theoretical evaluation, as the project has not yet produced any data.
3 Introduction
According to the first scenario of the NeOn methodology, the first step for creating a
new ontological model from scratch is to create an Ontology Requirement
Specification Document (ORSD) [1]. In order to do so, we need to collect different
information.
As outlined in Figure 2, the ORSD is divided into seven parts. To fill
each of these parts, we use stakeholder interviews and the COMRADES project
description (i.e. work package needs and project aims), and analyse the structure of the
Ushahidi platform and different crisis-related datasets.
Figure 2. Template for creating an Ontology Requirement Specification Document (ORSD) (Image source [1])
We mostly follow the structure outlined by Figure 2. First, we identify the purpose of
the model, its scope and level of formality. Then, we use both the COMRADES project
description and interviews for framing the intended use and users of the model. Finally,
we create competency questions using the qualitative and structural design
methodology discussed in the introduction.
As mentioned in the introduction of this document, we use the qualitative and
structural design approach to determine the requirements of the COMRADES
model.
In this section, we present the different data sources investigated for designing
the COMRADES model, and identify the model requirements that can be derived from
those sources.
A few requirements for the COMRADES model are clearly outlined by the tasks of each work
package (WP). In particular, the work on content informativeness and validity
assessment (WP3) and emergency event detection, modelling and matchmaking (WP4)
stipulates the following five tasks:
- Multilingual Content Processing (Task 3.1)
- Content Informativeness Classification (Task 3.2)
- Content and Source Validity Assessment (Task 3.3)
- Emergency Event Identification and Clustering (Task 4.2)
- Semantic Matchmaking of Emergency Events (Task 4.3)
Looking at each task, we can observe that for T3.1 the model needs to be able to
represent different types of social media data and to attach different pieces of
information to them, such as language, topics and named entities. For T3.2, we need to
represent the informativeness of individual messages. For T3.3, the model needs to
represent user profiles and the trustworthiness of particular pieces of information.
For the WP4 tasks, documents need to be categorised and events need to be
represented (T4.2). For T4.3, events need to be clustered in order to match related
events.
Many requirements come directly from analysing the needs of existing communities
dealing with emergency situations, which were gathered by WP2 and will be delivered
in D2.1 in March 2017. In this context, 8 interviews were conducted as part of the
work on community requirements and evaluation of resilience platform (WP2) and the
sociotechnical requirements of the COMRADES resilience platform.
The interviews, which will be fully described in D2.2 due March 2017, involved a
specialist in ICT for disaster management and 7 community leaders. Each interviewee
was asked questions about how they currently use technologies when dealing with
crises, and specifically: "What sociotechnical requirements should be considered to
design a social platform to boost communities' resilience in a disaster situation?"
The interviewed ICT specialist was Dr Marc van den Homberg (MA), a senior disaster
management expert at CORDAID⁵ (a development aid organisation in the
Netherlands). He is currently working in Bangladesh with local communities to help
them deal with floods. During the interview, he shared experiences from past and
current projects.
The interviews with the community leaders focused on their perception of how they
currently deal with disaster situations and how it is supported by technology. They
shared insights concerning how a new technology could improve crisis management.
The interviewed community leaders were the following:
- Adin (AD), Director at Hysteria⁶, a community laboratory that is focused on
youth empowerment, art and city issues in Semarang, Indonesia. A current user
of the Ushahidi platform.
- Milan Mukhia (MI) from CORDAID. Milan has worked on humanitarian
services in disaster zones in different countries for 12 years. Coordinating
collaborations among stakeholders is his main focus.
- Salina Shakya (SA) from CORDAID. Works for the project Parivartan⁷,
helping to facilitate the process for society to go back to normal life after an
earthquake disaster.
- Lumanti Joshi (LU), from Lumanti⁸, a support group for shelter in Nepal.
An architect, he has worked on organising communities for building
reconstruction plans after disasters for 13 years. His focus is bridging
community and government by creating structured plans.
- Chuks (CH), Deputy Director at Reclaim Naija⁹ in Nigeria. The goal of the
Reclaim Naija project is to monitor elections in real time. Citizens
use Ushahidi to report incidents such as fraud or violence. The aim is to change
the paradigm of elections in Nigeria by empowering grass-root communities
towards civic participation.
- Elsa Marie DSilva (EL) is the founder of the project Safecity¹⁰, which aims at
making the problem of sexual harassment more evident to the whole society
⁵ CORDAID, https://www.cordaid.org/.
⁶ Hysteria, http://grobakhysteria.or.id.
⁷ Parivartan, http://parivartannepal.org.np.
⁸ Lumanti, http://lumanti.org.np.
⁹ Reclaim Naija, http://reclaimnaija.net.
¹⁰ Safecity, http://safecity.np.
The interviews show a strong need for a platform
that supports anonymous reports, privacy management, the collection and visualisation
of event locations and reports, methods for searching for particular events, and the
ability to assign tasks to reports. The interviews identify requirements for the
COMRADES platform, and by extension the COMRADES model, in terms of
functionality, usability, data needs, performance and external data sources.
Users need to create reports of incidents with geolocation, time and date, and the source of
the information (e.g. data source, person reporting the incident), while retaining
the ability to submit anonymous reports and feedback. The reliability of information
needs to be available, and reports need to be approved. It should also be possible to
assign actions to reports and check their status. Reports should be available in different
languages if possible. Information should also be categorised (e.g. needs, resources).
In terms of data sources, the model should support adding multiple data
sources such as social media (e.g. Flickr, Instagram, Twitter), SMS and WhatsApp.
Besides the need to represent reports of events and external data, the interviews
showed a strong need for identifying the reliability of information, for privacy
management, and for assigning tasks to solve particular issues. Therefore, the
COMRADES model needs to provide a simple representation for external data and for
tasks, as well as for access management of information and its
trustworthiness.
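
The requirements gathered from the interviews can be summarised as a small record type. The following Python sketch is only one possible reading of those requirements: the class names, field names, default values and the 0 to 1 reliability scale are assumptions made for illustration and do not reflect the COMRADES ontology itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Task:
    """An action assigned to a report, with a status that can be checked."""
    description: str
    status: str = "open"


@dataclass
class Report:
    """A user report covering the fields requested by the interviewees."""
    text: str
    created: datetime                               # time and date of the report
    location: Optional[Tuple[float, float]] = None  # (latitude, longitude)
    source: str = "web"                             # e.g. "twitter", "sms", "whatsapp"
    reporter: Optional[str] = None                  # None keeps the report anonymous
    language: str = "en"
    categories: List[str] = field(default_factory=list)  # e.g. "needs", "resources"
    reliability: Optional[float] = None             # assumed scale from 0 to 1
    approved: bool = False
    tasks: List[Task] = field(default_factory=list)

    @property
    def is_anonymous(self) -> bool:
        """A report is anonymous when no reporter is recorded."""
        return self.reporter is None

    def assign(self, description: str) -> Task:
        """Attach a new open task to this report and return it."""
        task = Task(description)
        self.tasks.append(task)
        return task
```

For example, `Report(text="Road flooded", created=datetime(2017, 1, 1), source="sms")` creates an anonymous, unapproved report to which tasks can then be assigned.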
4.3 Ushahidi Data Structures
As part of the development of the COMRADES platform, an audit of the different data
structures used in the Ushahidi platform was performed (D5.1). The Ushahidi data
structures cover a wide range of key COMRADES needs such as the representation of
users, posts and categories. Since such data structures are all formatted in JSON, they
need to be translated into an ontological model so they can be integrated into the
COMRADES model.
¹¹ Connected Development, http://connecteddevelopment.org.
By analysing the different APIs of the Ushahidi platform, we obtain the data structures
listed in Table 1.
Table 1
DATA STRUCTURE NAME | DESCRIPTION | PROPERTIES | RELATIONS
Post (Survey) | A survey is the core unit of the Ushahidi platform. All social media data is transformed into a survey. | Id, url, title, content, created, updated, source, location, type, allowed_privileges | Parent (Post), form, user (creator), tags
Tag | Tags (or categories) can be applied across all posts, regardless of the Post's Form. | Id, url, tag, slug, type, description, created, color, icon, role, allowed_privileges | Parent (Tag)
The information associated with the data structures consists of either
properties or relations. Properties are textual fields that are not shared across data
structures, whereas relations are used for linking different data structures together.
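
The distinction between properties and relations can be sketched as a translation from a JSON post into triples, where properties become literal values and relations become links to other resources. This is a minimal sketch under stated assumptions: the `ex:` prefix, the resource naming scheme and the simplified payload (relations given as plain identifiers) are hypothetical and do not reproduce the actual Ushahidi API responses.

```python
import json
from typing import Any, Dict, List, Tuple

Triple = Tuple[str, str, str]

# Field split loosely based on the Post row of Table 1.
POST_PROPERTIES = ["title", "content", "created", "updated", "source", "type"]
# Relation name -> kind of resource it points to (hypothetical naming).
POST_RELATIONS = {"parent": "post", "form": "form", "user": "user", "tags": "tag"}


def post_to_triples(post: Dict[str, Any], base: str = "ex:") -> List[Triple]:
    """Translate one post into (subject, predicate, object) triples."""
    subject = f"{base}post/{post['id']}"
    triples: List[Triple] = []
    # Properties become literals (JSON-encoded so strings stay quoted).
    for name in POST_PROPERTIES:
        if post.get(name) is not None:
            triples.append((subject, base + name, json.dumps(post[name])))
    # Relations become links to other resources; lists yield one link each.
    for name, kind in POST_RELATIONS.items():
        value = post.get(name)
        if value is None:
            continue
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            triples.append((subject, base + name, f"{base}{kind}/{target}"))
    return triples
```

Missing or null fields are skipped, so a post with only a title, a creator and two tags yields four triples.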
It is important to note that some of the features derived from the APIs would benefit
from being modelled as relations rather than properties. For instance, the icon used for
representing a Tag should be converted to a relation that links to a Media resource, so
that any media can be used for representing a particular category.
An important aspect of the Ushahidi data model is the concept of forms. Forms are
associated with particular posts and are used for representing arbitrary textual input
using customisable fields. This is particularly challenging in an ontological context as
it can add a lot of complexity to the model.
Many of the crisis-related datasets and data sources that can be used for data analysis
purposes by the COMRADES project come from social media and particularly Twitter.
Crisis-related datasets are generally divided into high-level and low-level
datasets. High-level datasets contain citizen reports or social media reports about
discrete events that occur in large-scale crises, whereas low-level datasets focus on the
general description of events. Compared to high-level datasets, low-level datasets have
more information about the specifics of particular events and are typically created
manually by experts or organizations that verify reports.
Unfortunately, such low-level data tends to be created after events occur and contains
aggregated information. In contrast, the high-level datasets
tend to be unfiltered and unverified reports of discrete events that lack clear context. In
COMRADES, we are more interested in the high-level datasets because: 1) they tend to
contain more real-time information than the low-level datasets; and 2) they are
much larger than their low-level counterparts.
The following table (Table 2) lists the different datasets that have been investigated so
far. The available data can be divided according to the data that was used for building
a particular dataset. We distinguish three types of data source: social media data (i.e.
Twitter posts), user reports (e.g. Ushahidi, ACLED) and news agency data (e.g. news
websites). Each data type has advantages and disadvantages. Social media data is
widely available; however, its reliability is unclear and its format is highly unstructured, so
it requires complex analysis in order to be converted into usable data. Citizen reports
are scarcer but potentially more useful, as they are formatted specifically for
describing events. Finally, news data has the advantage of being more reliable and can
contain disaster relief information. However, such data is more
likely to become available after an event occurs and is low-level, as it summarizes a situation.
DATASET | DESCRIPTION | DATA TYPE | PERIOD | SIZE
Crisis Lex T6 | 6 Crises / Annotated by relatedness | Twitter (Social Media) | 2012-2013 | ~60k Tweets
ACLED | Event summaries created weekly about events occurring in Africa and Asia. | Event Summaries (Created and verified manually). Uses the CAMEO event taxonomy. | 1997-Now (Africa) / 2010-Now (Asia) | Weekly datasets (expanding)
HDX | The Humanitarian Data Exchange is a dataset repository that contains multiple datasets about different crises and related resources in different formats. | Citizen Reports / Event Summaries / Social Media | 2014-Now | 4163 Datasets / 244 Locations / 804 Sources (expanding)
In terms of data formats, existing social media datasets tend to be based on Twitter data.
They therefore directly follow the Twitter message format and contain short texts
with user information and sometimes user GPS coordinates that can be used for
identifying the location of particular events.
created for monitoring the USA presidential elections of 2016¹² has custom fields
about candidates in each report.
Finally, many of the news-agency-based datasets, such as the GDELT¹³, ACLED¹⁴ and
Phoenix Data Project¹⁵ datasets, follow the CAMEO [8] model, which provides a
taxonomy to identify the type of event mentioned as well as the actors involved.
Since there are many similarities between the different data models listed in Table 2,
and since each dataset uses different
terminology for describing similar types of data, we decided to translate the data structures
found in each dataset into the same format (Table 3).
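
Translating dataset-specific terminology into a shared format amounts to a field renaming step. The sketch below shows the idea in Python; the per-dataset field names and the common field names are illustrative assumptions and are not the contents of Table 3.

```python
from typing import Any, Dict

# Hypothetical mappings: source field name -> common field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "twitter": {"text": "content", "created_at": "created", "lang": "language"},
    "acled": {"notes": "content", "event_date": "created", "country": "location"},
}


def to_common(dataset: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the known fields of a record; fields without a mapping are dropped."""
    mapping = FIELD_MAPS[dataset]
    common: Dict[str, Any] = {"dataset": dataset}
    for source_name, common_name in mapping.items():
        if source_name in record:
            common[common_name] = record[source_name]
    return common
```

With this approach, supporting an additional dataset only requires adding one mapping entry, while downstream code keeps working on the common field names.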
¹² Ushahidi USA Elections, https://usaelectionmonitor.ushahidi.io.
¹³ GDELT, http://www.gdeltproject.org/data.html#rawdatafiles.
¹⁴ ACLED, http://www.acleddata.com/data/.
¹⁵ Phoenix Data Project, http://phoenixdata.org/data.
As with the Ushahidi data structures, there are some features that may not be useful for
the COMRADES model. For instance, the CrisisNet dataset provides data source
information that is not necessary for the COMRADES model as this is not used by the
different tools developed by the COMRADES platform and the Ushahidi platform.
By analysing the different properties of each dataset, we distinguish four different data
structures: 1) reports, events and posts; 2) geolocation information; 3) user and account
information; and 4) data sources. The reports, events and posts hold the main
documents of the datasets, whereas user and account information represents document
creators or the organisations or information sources involved in events. Geolocation
data structures hold location information related to events and users. Finally, data source is only
used by the CrisisNet dataset, and stores information about how data is accessed.
In general, it appears that the crisis-related datasets hold more information about events
than the Ushahidi data structures, even though Ushahidi can use forms for modelling
this type of data. In particular, the datasets that follow the CAMEO taxonomy support
different types of events and actors, with many different properties such as the actors
involved in particular events as well as the organisations they belong to.
In summary, the analysis of the crisis-related datasets shows that the Ushahidi data
structures already support many of the requirements of the existing datasets, except for the
representation of domain-specific information (e.g. Twitter posts) and the rich user and
event models that are mostly provided by the CAMEO taxonomy and the Twitter data.
As the first step in designing the ORSD, we need to define the purpose, scope
and level of formality of the model. The aim of the COMRADES project is to create a
community resilience platform that helps communities to
reconnect, respond to, and recover from crisis situations. The model therefore needs to provide a representation
that allows communities or individuals to do so. In other words, the model needs to enable the representation of
individuals and groups of individuals, allow communication between individuals,
enable communities to respond to crises by gathering critical information, and support recovery
by allowing the organization of resources and aid.
In order to do so, the COMRADES project aims to build on top of the Ushahidi
platform by providing new intelligent algorithms aimed at helping communities,
citizens, and humanitarian services with analysing, verifying, monitoring, and
responding to emergency events.
The model needs to be general enough to cover a wide variety of scenarios
and must therefore be flexible. In the case of ontological development, a flexible model
needs to offer relatively loose semantics (i.e. avoid overspecialisation) so that new
types of users or resources do not require major ontological modifications.
In terms of scope, the model aims to support the modelling of crises and their recovery
through social media analysis and manual data input.
The COMRADES model needs to satisfy different user groups such as governmental
organizations and non-governmental groups, as well as individuals. Such individuals
may have many different aims and goals. The model also needs to support algorithmic
needs by allowing software to assert new information itself (e.g. trustworthiness,
extracted entities).
Following the interviews with stakeholders, we distinguish four different types of users
for the COMRADES model:
(1) Platform stakeholders: The individuals or organisations that supply
community platforms such as Ushahidi.
(2) Local community groups: Community members of local activist groups.
(3) Responders: Organisations and individuals that use information gathered by
platforms in order to organise the response and recovery of a particular crisis.
(4) Individuals and small citizen groups: Individuals or small communities that
are affected by a particular crisis.
Each of these user groups has different needs that define how the COMRADES
model will be used. For instance, platform stakeholders need to make sure that the
platform can be deployed easily. Community groups need to be able to assess a given
crisis situation. Responders require the ability to analyse a given situation and organise
recovery. Finally, individuals and citizen groups need to understand the situation and
to be able to ask for assistance by reporting crisis events.
In summary, the COMRADES model needs to cater for the four different user types
mentioned above, as well as to be useful in situations where users are looking for or
are willing to provide information about crises and where responders are
organising resources in order to solve a particular situation.
A key part of the ORSD is to define competency questions that specify what types of
queries the model should be able to support. As previously discussed, we perform three
types of analysis: 1) study stakeholder interviews to better understand their needs;
2) analyse the work package requirements of the COMRADES project; and 3) study
the structure of the Ushahidi community.
In the following sections we convert the knowledge sources discussed in the previous
section into competency questions. If a particular question is already covered by
another knowledge source, we do not add an additional question.
5.3.1 Work Package Requirements
For T3.2, T3.2 and T3.3, we obtain the following competency questions:
As previously discussed, the interviews conducted with the stakeholders showed the
need for a strong reporting model with methods for managing the access to
information, multiple data sources and task management. The competency questions
derived from the interviews are listed below:
CQ30: What are the types of posted document (Forms) available in the platform?
CQ31: What are the types of media available?
CQ32: What are the categories of documents that are in the platform?
CQ33: What are the document collections in the platform?
CQ34: What are the different user roles?
The COMRADES model also needs to support queries that retrieve the different
properties of each class in the model:
6 Term Glossary
Now that we have extracted a set of competency questions, we can extract the terms
that are used most often in the questions in order to guide the development of the model.
The idea is that the most frequent terms are key aspects of the model and need to be
modelled prominently (e.g. as classes), whereas infrequent terms may not need to be
represented as prominently in the final model.
The typical NeOn approach requires competency questions that are linked to actual
data in order to extract each of those terms. This is different from the type of
competency question that we have, since the previously listed questions are
conceptualised from data structures and general interviews. As a result, our
competency questions are not data specific and are more conceptual.
The NeOn methodology [9] distinguishes three different types of terms: 1) competency
question terms; 2) competency question answer terms; and 3) object terms. The
competency question terms are the top words that appear in competency questions,
whereas answer terms are the ones that appear in the answers to the competency
questions. The object terms are the named entities that are extracted from competency
questions and answers.
Since we do not have instantiated competency questions that contain both data-specific
questions and answers, we generate the glossary terms as follows: 1) we extract the
most frequent terms appearing in our competency questions; 2) we extract the most
frequent terms appearing in the data structures that we have used for creating our
competency questions. The idea is that, besides the terms extracted from the competency
questions, the property descriptions and names of the different datasets and of the
Ushahidi platform can help identify the key concepts and attributes of the
COMRADES model.
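The frequency-based extraction of glossary terms can be sketched as follows. This is a minimal illustration and not the procedure actually used in the project: the stop-word list, the tokenisation rule and the function name `top_terms` are assumptions, and only three of the competency questions (CQ32 to CQ34) are used as input.

```python
import re
from collections import Counter

# Assumed stop-word list; the deliverable does not state which words were filtered out.
STOP_WORDS = {"what", "are", "the", "of", "in", "that", "is", "a", "an", "different"}


def top_terms(questions, n=5):
    """Count the non-stop-word tokens of all questions and return the n most frequent."""
    counts = Counter()
    for question in questions:
        tokens = re.findall(r"[a-z_]+", question.lower())
        counts.update(token for token in tokens if token not in STOP_WORDS)
    return counts.most_common(n)


questions = [
    "What are the categories of documents that are in the platform?",  # CQ32
    "What are the document collections in the platform?",              # CQ33
    "What are the different user roles?",                               # CQ34
]
print(top_terms(questions, n=1))
```

The sketch does not merge singular and plural forms, so "document" and "documents" are counted separately. Table 4 appears to follow the same convention, since it lists both "Event" and "Events".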
The top terms extracted from the competency questions are listed in Table 4.
Competency Question: Document (27), Event (17), User (13), created (10), Type (9), Information (8), Message (8), Collection (7), Category (6), Platform (6), Actor (6), Media (6), updated (6), associated (5), related (4), Role (4), Report (4), name (3), description (3), Source (3), language (3), Account (3), Events (3), reliable (3).
Data Structures: created (13), allowed_privileges (11), id (10), URL (10), Form (9), data (9), User (8), Type (8), updated (7), Post (6), Message (6), Media (6), Multiple (5), Event (5), Creator (4), Posts (4), description (4), sources (4), Collection (4), Crises (4), social (4), contact (4), name (3), annotated (3).
Table 4 Top Terms Extracted from the Competency Question, Crisis Related Dataset and the Ushahidi Data Structures.
7 Summary
In order to specify the COMRADES model, we analysed the COMRADES project
requirements and the Ushahidi platform as well as related crisis datasets in order to
extract the data requirements for the model. We also analysed user requirements from
stakeholder interviews in order to derive the model requirements from the future user
perspectives. This approach was based on the structural and qualitative design
approach discussed in the introduction. The analysis helped us to better understand the
aims of the model, its future usage and users. We also produced a set of competency
questions that form the basis of the model implementation. The next part of this
document reuses those findings for fully specifying and implementing the
COMRADES ontology. In particular, we reuse the competency questions for guiding
the development of the ontology as well as the common data structures observed in the
Ushahidi platform and crisis related datasets.
8 Introduction
In the previous sections, we created the Ontology Requirement Specification
Document (ORSD) [1] for the COMRADES model based on multiple analyses and
extracted a set of competency questions as well as a glossary of key terms.
In the following sections we create the COMRADES ontology16 based on the ORSD
by identifying the key components of the model and then identifying the relations
between each component as well as the properties of the ontology. We also integrate
the ontological model with different existing ontologies for improving the model
interoperability and usability. To simplify the usage of the model across different
communities, we also translate the ontology classes, properties and relations into different
languages. Finally, we discuss how domain knowledge can be added to the
COMRADES ontology.
9 Model Principles
Before introducing the COMRADES model, we discuss the main approach used for
organising the gathering of information and resources about crises.
Many of the datasets and data structures analysed when creating the ORSD are centred
on reports and the ingestion of external documents rather than the direct modelling of
events. In this context, we decide to centre the COMRADES model on reports rather
than events. Reports are clustered together to describe events that result in real-world
situations, and external documents (or other information sources such as other
reports or an informant) are used for documenting what is discussed in a report.
Reports can be used in different ways for documenting events, needs, resources and so
on, and form the base of the COMRADES model. The advantage of a report-centred
approach is that it allows a more organic gathering of information related to
events without needing rigid data structures. This is particularly suitable for resilience
platforms that are deployed in a large variety of situations where the types of reports are
context specific.
We use a situation model for documenting how events affect their environment.
Typically, a situation would involve different entities (e.g. local population, building,
political situation) and would define the state that was induced by the situation. For
example, a building explosion (situation) would induce a particular building (entity) to
be collapsed (status).
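The situation model can be illustrated with a short sketch that reuses the building explosion example. The class names mirror the concepts of situation and entity, but the field names and the `affect` method are hypothetical and are not part of the ontology itself.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Something that can be affected by a situation (e.g. a building)."""
    name: str


@dataclass
class Situation:
    """A situation and the states it induces on the entities it involves."""
    description: str
    # Maps the name of each affected entity to the state induced by the situation.
    states: dict = field(default_factory=dict)

    def affect(self, entity: Entity, state: str) -> None:
        """Record that this situation puts the given entity into the given state."""
        self.states[entity.name] = state


explosion = Situation("building explosion")
explosion.affect(Entity("building"), "collapsed")
print(explosion.states)
```

Keeping the state on the situation rather than on the entity means that the same entity can be in different states in different situations, which matches the idea that a state is induced by a situation.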
16 COMRADES Ontology, http://socsem.open.ac.uk/ontologies/comrades.
For representing users and the permissions associated with documents, reports and other
model classes, we use the concepts of roles and accounts, where users hold roles that are
associated with user permissions. User accounts hold
platform-specific user information such as the user contribution reliability.
Finally, we add a simple model for representing tasks that can be attached to reports
and assigned to users.
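A minimal sketch of how roles, accounts and tasks relate to each other is given below. It is only an illustration of the principle: the role name, the permission names, the user name and all field names are hypothetical, and the ontology itself expresses these links as RDF classes and relations rather than Python objects.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Role:
    """A role groups the permissions granted to the users holding it."""
    name: str
    permissions: set = field(default_factory=set)


@dataclass
class Account:
    """Platform-specific information about a user, including the roles held."""
    user: str
    roles: list = field(default_factory=list)
    contribution_reliability: float = 0.0  # assumed numeric scale

    def can(self, permission: str) -> bool:
        """True if at least one of the account's roles grants the permission."""
        return any(permission in role.permissions for role in self.roles)


@dataclass
class Task:
    """A task attached to a report and optionally assigned to an account."""
    report_id: str
    description: str
    assignee: Optional[Account] = None


moderator = Role("moderator", {"read_report", "edit_report"})
account = Account("alice", roles=[moderator])
task = Task("report-1", "Verify the reported location", assignee=account)
print(account.can("edit_report"), account.can("delete_report"))
```

Permissions are attached to roles rather than to accounts, so changing what a group of users may do only requires editing one role.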
10 Ontology Components
Based on the previous model principles, we discuss the classes, relations and properties
of the COMRADES model. We refer to the COMRADES namespace as com in the
following sections.
The competency questions show that many properties and relations are focused on
different types of documents and that both the Ushahidi platform and the crisis-related
datasets prefer modelling events indirectly using user-submitted reports or
automatically generated documents. As a consequence, we decide to centre the
representation of crisis-related information around the concepts of com:Report,
com:Situation, com:Event, com:Document and com:Informant.
Besides associating reports with documents and informants, reports are also connected
with the events (com:Event) and situations (com:Situation) that they are
describing or updating. Events are things that happen or take place, whereas
situations are used for representing the states (com:State) of entities
(com:Entity).
The different types of information collected by the COMRADES model can be grouped
and categorised in different ways. For instance, com:Report and com:Document
Another important part of the model is the representation of the actors, organisations
and accounts that are used for representing the creator of a com:Document and the
person who posted a com:Report, as well as the people and organisations that created a
com:Situation or com:Event.
Users (com:Agent) are all defined as a subclass of com:Informant, which can be used
as the information source of a com:Report when no document source
(com:Document) is available but the information comes from a known
individual.
For contributions within the COMRADES model, the com:Account class is used for
abstracting contributor-specific information that only exists within the COMRADES
model, such as the number of documents created by a com:Agent.
The Ushahidi platform also supports the assignment of tasks to platform users. We
support tasks by adding the com:Task class to the model and linking it to com:Account
and com:Report so that reports can be used for assigning tasks.
10.2 Properties
Contrary to relations, properties do not link a class to other classes of the
COMRADES ontology. The properties required for each class can be
directly extracted from the competency questions as well as the previously analysed
data structures. The properties of the classes displayed in Figure 3 are listed in the
following table (Table 5):
Class | Property | Description
Report | informativeness | How informative is the report (i.e. useful for crisis analysis).
Report | language | The language of the report.
Situation | informativeness | How informative is the situation (i.e. useful for crisis analysis).
Event | informativeness | How informative is the event (i.e. useful for crisis analysis).
It is important to note that the reliability of the different elements of the ontology is
not represented as a property. Instead, the reliability and trustworthiness of resources are
represented using the Veracity ontology17 [10].
Although many ontologies have been designed for representing crises or related
information, most of them do not focus on the concepts of report and document. Rather
than using those concepts, existing models prefer focusing on the event representation
of emergency crises and ignore the collection of evidence and user-submitted reports
as a means of representing event-related information. Task representation is also
generally absent from crisis-related ontologies.
A few ontologies have been designed for modelling events in crisis situations, such as
MOAC18 (Management of a Crisis) and HXL (Humanitarian eXchange Language)
[11]. However, despite modelling resources, processes, damages, and disasters (fire,
people trapped, medical emergency), these models do not provide representations for
documents and reports. The need for more complete models was highlighted by Liu et
al. [12]. Moreover, existing semantic models were mostly designed for providing a
static view of emergency situations, where elements are captured but not their temporal
evolution.
17 Veracity Ontology, http://purl.org/net/veracity/ns.
18 MOAC, http://www.observedchange.com/moac/ns/.
Most of the ontologies reused in the COMRADES ontology are widely used.
The main reason for reusing such ontologies is that it improves the
reusability of the model by allowing it to be used similarly to existing ontologies.
The COMRADES ontology reuses five different ontologies for modelling its
components and properties. The main ontology reused for representing the different
elements of the COMRADES model is the SIOC ontology [13], which provides
constructs for representing online communities. We reuse the SIOC ontology for
representing documents, reports, collections, permissions and roles as well as
different properties and relations of the model.
We also reuse the FOAF21 (Friend Of A Friend) ontology for representing users in the
model as it integrates well with the SIOC ontology and provides ways for representing
agents and organisations.
For modelling geolocation, we use the Geonames22 and WGS8423 ontologies as they
provide basic representations of geolocation coordinates that can be used for
identifying the location of events and other resources.
The Dublin Core24 model is also used as it provides many properties, relations and
classes specifically designed for modelling documents. Finally, for representing the
trustworthiness of the different content of the platform we use the Veracity ontology
[10] as it provides methods for asserting the reliability of different resources. The
different mappings are described in Figure 3.
19 CURIO, http://purl.org/net/curio/.
20 SIOC, http://rdfs.org/sioc/spec/.
21 FOAF, http://xmlns.com/foaf/spec/.
22 Geonames Ontology, http://www.geonames.org/ontology#.
23 WGS84 Ontology, https://www.w3.org/2003/01/geo/#vocabulary.
24 Dublin Core, http://dublincore.org/documents/dcmi-terms/.
12 Multilingual Support
One of the aims of the COMRADES model is to support multiple languages so that the
model can be used by different communities around the world. In order to do so, we
translate the names of the classes, properties and relations of the ontology into different
languages using the language tagging features of RDF [14].
At the moment, we only translate labels into Spanish and French and do not translate
the descriptions of the ontology classes, properties and relations. Nevertheless, such
translations can be added later if necessary, and their absence does not affect the usage of the
COMRADES model as the ontological concepts are translation independent.
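The use of language tags for labels can be sketched as follows. This is only an illustration: the French and Spanish labels shown for com:Report are hypothetical examples and are not copied from the published ontology, the helper names `to_turtle` and `label_for` are assumptions, and the English fallback is a design choice of this sketch. The generated lines use the Turtle syntax for language-tagged literals and assume that the com and rdfs prefixes are declared elsewhere.

```python
# Hypothetical labels for one class; the translations are illustrative only.
LABELS = {
    "com:Report": {"en": "Report", "fr": "Rapport", "es": "Informe"},
}


def to_turtle(labels):
    """Serialise the labels as Turtle triples with RDF language tags."""
    lines = []
    for term, by_lang in labels.items():
        for lang, text in sorted(by_lang.items()):
            lines.append(f'{term} rdfs:label "{text}"@{lang} .')
    return "\n".join(lines)


def label_for(term, lang, labels=LABELS, fallback="en"):
    """Return the label in the requested language, or the fallback label if missing."""
    by_lang = labels[term]
    return by_lang.get(lang, by_lang[fallback])


print(to_turtle(LABELS))
print(label_for("com:Report", "fr"))
print(label_for("com:Report", "de"))  # no German label, falls back to English
```

Because the class identifier (com:Report) stays the same in every language, adding or removing a translation does not change any data that already uses the ontology.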
13 Domain Knowledge
The specification of domain knowledge in the COMRADES ontology is mostly
centred on: 1) The definition of user organisations, religious groups and ethnic groups;
2) The specification of report types, document types and event types, and; 3) The
definition of categories, entity types and entity statuses.
Although different methods can be used for creating such resources, for example
domain-specific gazetteers, we decided not to enforce any specific domain knowledge
in order to simplify the integration of the COMRADES model into existing datasets and
tools.
Each tool and dataset can specify its own domain knowledge depending on the model
usage specifics. If interoperability between different datasets or models is required,
resources can be linked to external entity resources such as DBpedia25 so that similar
entities or resources can be identified more easily even if the COMRADES ontology
is used in different contexts.
14 Summary
In the previous sections we introduced the COMRADES ontology based on the ORSD.
First, we analysed the competency questions and ORSD glossary in order to create a
high-level version of the COMRADES model. Second, we implemented the ontology
using RDF/OWL and aligned the implemented ontology with existing ontological
models. We also translated ontological classes, properties and relations into different
languages to simplify the usage of the ontology in different communities.
25 DBpedia, http://dbpedia.org.
During the model development we decided not to implement any specific domain
knowledge, since enforcing default domain knowledge
can complicate the integration of the model into existing tools. Rather than
proposing default domain knowledge, the COMRADES model provides classes that
can be extended depending on the model usage or the integrated datasets. This allows
for a more targeted usage of the model and a simpler integration of the model into
existing applications or tools.
15 Introduction
Although different methods can be used for evaluating ontologies, many methods rely
on mapping existing data and then evaluating if the competency questions can be
verified on real data. Since we do not have datasets that cover all the parts of the
COMRADES ontology, we decided to perform a theoretical evaluation by checking if
the classes and properties of the COMRADES ontology can be mapped to the
competency questions.
In the following section, we discuss the evaluation approach and how competency
questions are mapped to the ontology properties, relations and classes of the
COMRADES model. We also show how the current model represents the current
competency questions.
16 Ontology Evaluation
In order to evaluate the COMRADES ontology, we first extract the key classes,
properties and relations associated with each competency question. Then, we check whether
a path exists between the extracted properties, relations and classes.
Finally, we consider a competency question validated if such a path exists.
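The path-based check can be sketched as a reachability test over a graph whose nodes are classes and whose edges are relations. This is a minimal illustration under stated assumptions, not the evaluation tooling used in the project: the two relations come from the path reported for CQ20, their direction is assumed, the graph is treated as undirected, and the function names are hypothetical.

```python
from collections import deque

# Relations as (domain class, relation, range class). Both triples follow the path
# reported for CQ20; the direction of each relation is an assumption.
RELATIONS = [
    ("com:Report", "com:describes", "com:Event"),
    ("com:Report", "com:informant", "com:Informant"),
]


def build_graph(relations):
    """Build an undirected adjacency map over the classes linked by the relations."""
    graph = {}
    for source, _relation, target in relations:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set()).add(source)
    return graph


def path_exists(graph, start, goal):
    """Breadth-first search: True if goal can be reached from start."""
    if start not in graph or goal not in graph:
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph[node] - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return False


def validate(graph, classes):
    """A competency question is validated if all of its classes are connected."""
    first = classes[0]
    return all(path_exists(graph, first, other) for other in classes[1:])


graph = build_graph(RELATIONS)
print(validate(graph, ["com:Event", "com:Informant"]))
print(validate(graph, ["com:Event", "com:Task"]))
```

In this reduced graph the second check fails only because no relation involving com:Task has been listed, which shows how a missing relation surfaces as an unvalidated competency question.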
For each competency question, we list the classes and relations that need to be
connected and evaluate whether the competency question is validated (i.e. whether there is a path
between the classes, relations and properties associated with the competency question).
The mappings and results for each competency question are listed below (Table 6):
CQ | PATH | CQ VALID?
CQ1 | com:Message | Yes (COUNT)
CQ20 | com:Event com:describes com:Report com:informant com:Informant | Yes (Agent properties)
16.2 Results
Similarly, some of the competency questions are ambiguous because of the loose usage of
the term document. In the implementation of the COMRADES model, some of those
documents are actually reports. We corrected those mappings when validating the
competency questions.
There are also some competency questions that are not represented directly but can be
represented by adding subclasses to the existing model. For instance, the com:Agent
involved in a com:Event can be represented through a com:Situation and a new
type of com:Entity.
17 Conclusions
We introduced the COMRADES ontology as a model that supports the representation
of events and related information during emergency crises. We based the development
of the model on the NeOn methodology [1] and on a qualitative and structural design
approach [2] and evaluated the COMRADES ontology by mapping competency
questions to ontology properties, relations and classes.
Although the model is not yet populated with the input and output data of the different
components of the COMRADES platform, since they are still under development, we
provided a partial evaluation of the COMRADES ontology by mapping a list of
competency questions to the COMRADES ontology properties, relations and classes.
Competency questions are commonly used in ontology evaluation practices, to test the
capability of the model in answering all required queries.
Since the needs and requirements of the COMRADES resilience platform are likely to
evolve during the project, we designed the model to be easily extensible. For instance,
additional types of data and reports can be added to the model and new types of events
or resources can be specified. Further evaluations will be performed on the model in
the COMRADES platform when further data becomes available.
Appendix
18 References
[1] M.C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The neon methodology for ontology engineering, in: Ontol. Eng. a Networked World, 2012: pp. 9–34. doi:10.1007/978-3-642-24794-1_2.
[2] G. Burel, Community and Thread Methods for Identifying Best Answers in Online Question Answering Communities, (2016). http://oro.open.ac.uk/46144/ (accessed November 30, 2016).
[3] M.C. Suárez-Figueroa, A. Gómez-Pérez, B. Villazón-Terrazas, How to write and use the ontology requirements specification document, in: Lect. Notes Comput. Sci. (Including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), 2009: pp. 966–982. doi:10.1007/978-3-642-05151-7_16.
[4] M.F. Lopez, A. Gomez-Perez, J.P. Sierra, A.P. Sierra, Building a chemical ontology using Methontology and the Ontology Design Environment, IEEE Intell. Syst. 14 (1999) 37–46. doi:10.1109/5254.747904.
[5] S. Staab, R. Studer, H.P. Schnurr, Y. Sure, Knowledge processes and ontologies, IEEE Intell. Syst. Their Appl. 16 (2001) 26–34. doi:10.1109/5254.912382.
[6] D. Vrandecic, S. Pinto, C. Tempich, Y. Sure, The DILIGENT knowledge processes, J. Knowl. Manag. 9 (2005) 85–96. doi:10.1108/13673270510622474.
[7] M.C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The Neon methodology framework: a scenario-based methodology for ontology development, Appl. Ontol. 10 (2015) 107–145. doi:10.1007/978-3-642-24794-1.
[8] P. Schrodt, Ö. Yilmaz, The CAMEO (conflict and mediation event observations) actor coding framework, Annu. Meet. . (2008). http://eventdata.parusanalytics.com/papers.dir/APSA.2005.pdf (accessed December 13, 2016).
[9] A. Pérez, M.D.F. Baonza, B. Villazón, Neon methodology for building ontology networks: Ontology specification, Methodology. (2008) 1–18. doi:10.1016/j.landurbplan.2011.04.007.
[10] G. Burel, A.E.C. Basave, M. Rowe, A. Sosa, Representing, proving and sharing trustworthiness of web resources using Veracity, Knowl. Eng. Manag. by Masses. (2010) 421–430. http://ekaw2010.inesc-id.pt/accepted_short_papers.html.
[11] C. Keßler, C. Hendrix, The Humanitarian eXchange Language: Coordinating disaster response with semantic web technologies, Semant. Web. 6 (2015) 5–21. doi:10.3233/SW-130130.
[12] S. Liu, D. Shaw, C. Brewster, Ontologies for crisis management: a review of state of the art in ontology design and usability, ISCRAM 2013 - 10th Int. Conf. Inf. Syst. Cris. Response Manag. (2013) 349–359. http://windermere.aston.ac.uk/~kiffer/papers/Liu_ISCRAM13.pdf.
[13] J.G. Breslin, S. Decker, SIOC: an approach to connect web-based communities, Int. J. Web Based Communities. 2 (2006) 133–142. doi:10.1504/IJWBC.2006.010305.