
An experimental evaluation of a crowdsourcing-based approach for flood risk management


Ranieri de Brito Moreira, Lívia Castro Degrossi, and João Porto de Albuquerque


University of São Paulo,
Institute of Mathematical and Computing Sciences,
São Carlos – SP, Brazil




Volunteered geographic information (VGI) is a potential source of information to complement other sources in flood risk management. However, there is still not enough experimental evidence about the usefulness of VGI in different situations and scenarios. We conducted an experimental evaluation to verify whether VGI, obtained through a crowdsourcing platform, can be useful in the flood risk management context. The experiment took place at two points of the watershed of São Carlos/SP in Brazil with 15 participants. The results show that volunteered geographic information is, on average, comparable to sensor data. Thus, we conclude that VGI produced through crowdsourcing can be a useful source of information for flood risk management.

Keywords: Experimental Evaluation, Experimental Software Engineering, Volunteered Geographic Information, Flood Risk Management, Disaster Management

1 Introduction

The increase in the number of natural disasters has been a growing concern for national and international organizations, due to their environmental risks and potential hazards [1]. The occurrence of a natural disaster is related to environmental characteristics and social system vulnerability [2]. The prevention of natural disasters and the reduction of social system vulnerability are themes of major concern at the local, national and international levels [3].

Disaster management (DM) consists of a set of activities that aim at preventing or reducing the social and economic impacts of natural disasters [4, 5]. DM is divided into four main phases: mitigation, preparedness, response and recovery. In each of these phases, different types of actions, policies and information are required. In particular, investments in the first two phases (mitigation and preparedness) are expected to greatly reduce the impact and losses caused by a natural disaster.

According to “The International Disaster Database” (http://www.emdat.be), floods are the most frequent type of disaster worldwide, and in Brazil in particular. Flood risk management (FRM) is a subarea of disaster management that aims to control a flood, prepare for its occurrence and mitigate its impacts [6]. In particular, efficient preparation requires the monitoring of risk areas before a disaster [7, 8]. Accurate information about the current state of environmental variables is mandatory for simulating the effects and severity of a disaster [6, 8]. This type of simulation helps to reduce the impact of floods, allows the affected population to take preventive measures and enables agencies to develop emergency response actions.

However, effectively managing flood risk depends on the availability of up-to-date and accurate information about environmental variables for improving situational awareness [4]. Such information can be obtained from different sources, including sensors, satellites and other technologies. In addition, another source that can be used is information provided by volunteers, so-called Volunteered Geographic Information (VGI) [9].

The use of “humans as sensors” can be a valuable source of information in the context of disaster management, due to the potentially large number of volunteers [4]. Indeed, the increase in the number of mobile devices with GPS (Global Positioning System), along with the interactions enabled by Web 2.0, has made it easier for the general public to create geographic information.

However, despite the advantages of volunteers’ participation in gathering information, several challenges still need to be faced. The quality of data generated by volunteers is a major concern. Volunteered information can be created with omissions, exaggerations, and even errors. Another challenge is the lack of structure of volunteered information. It is often regarded as insufficiently structured, documented and validated [10], while information collected with devices such as satellites and sensors has well-defined patterns and structures. Thus, there is a need to verify whether volunteered information can be useful in the flood risk management context, i.e. whether it is sufficiently structured and accurate to contribute to this application domain.

In this context, Degrossi et al. [11] proposed a crowdsourcing-based approach for gathering useful volunteered information for the flood risk management context. The authors performed an experimental evaluation of the approach with 10 participants at one point of the watershed of São Carlos/SP city. Due to the small number of participants and points of the watershed, there is still not enough experimental evidence to support the claim that volunteered information is useful in this context. Thus, this paper presents an experimental evaluation to verify whether volunteered information, obtained through a crowdsourcing platform, can be useful for the flood risk management context in Brazil.

The remainder of this paper is structured as follows: Section 2 presents the theoretical background for this work. Section 3 explains the experimental evaluation. Section 4 presents the results. Finally, Section 5 presents the conclusions of this study.

2 Background

In the next sections, we present the concepts and underlying principles related to this work.

2.1 Flood Risk Management

Flood Risk Management is the process of managing a flood risk situation. It aims at controlling a flood, being prepared for it and minimizing its impacts [6]. To this end, FRM comprises actions before, during and after a flood. These actions involve early warning and scenario forecasting, contingency plans and restoration [12].

Among natural disasters occurring worldwide, floods are the most frequent, representing 30% of natural disasters [8]. The increase in the number of floods is associated with climate change, being aggravated due to urban sprawl and the phenomenon of rapid urbanization without the availability of essential services [13, 14]. The number of affected people and financial and economic damages increase every year [15].

The preparation phase aims at reducing residual risk through early warning systems and measures that can be taken to minimize flood impacts. This requires constant monitoring of risks and danger assessment. Every dollar invested in flood prevention reduces the damage incurred in a natural disaster by US$ 25 [12].

Currently, geographic information and related technologies play a fundamental role in all phases of FRM. Natural disasters are typically monitored using different devices such as sensors, satellites, seismometers, among others. However, these devices do not provide information about the impacts of a disaster. Volunteered geographic information can be a valuable source of information about the impacts of natural disasters [16], due to the potentially large number of volunteers who act as “sensors”, noticing important parameters of a disaster in a local environment [4].

2.2 Volunteered Geographic Information

At first, the creation of geographic information was carried out by official agencies. However, with the increase of interactions made possible by Web 2.0, the use of devices with GPS (Global Positioning System) and the access to broadband Internet, geographic information is now also being produced by people who have little formal qualification. This type of information is called Volunteered Geographic Information (VGI) [9].

Among the advantages associated with VGI, researchers emphasize its use to enhance, update or complement existing geospatial data [9]. In some scenarios, volunteered information has better quality than data provided by specialized organizations, since in different parts of the world the official data is outdated or was acquired with older and less precise technologies than those currently available to the general public [17].

However, despite the advantages of citizens’ participation in collecting information, there are many challenges to be faced. Data quality is a major concern. Information that comes from many individuals can raise doubts about its credibility [18]. According to [10], the credibility of VGI can be understood as a subjective concept that describes whether a piece of information can be trusted, considering any possible intentional or unintentional omission or exaggeration error. Moreover, it is not known beforehand how and from whom the information will be provided.

Another challenge refers to location. Unlike in-situ sensors, people are in constant movement, so the observations they make need to be located in order to become useful [4]. Furthermore, VGI is often regarded as poorly structured, documented and validated [10].

Recent natural disasters have shown that volunteered information can improve situational awareness by providing an overview of the present situation [4]. This occurs because VGI offers a great opportunity to raise awareness due to the potentially large number of volunteers who observe important parameters of disaster management in a local environment [4, 9, 19]. In addition, despite recent advances in the development of sensors, their observations may not be available due to communication interruptions or even the destruction of the sensor, and a sensor is not able to measure certain phenomena such as hail storms [4].

In this scenario, different software platforms, also called crowdsourcing platforms, have been employed for gathering volunteered information and enabling its visualization and analysis. The term crowdsourcing can be understood as a production model where the intelligence and knowledge of volunteers are used to solve problems, create content and develop new technologies. Furthermore, the term refers to a way of organizing work that involves an information system for coordinating and following up tasks carried out by people [20].

3 Experimental Evaluation

The effectiveness of an approach can be verified through an experimental evaluation conducted in a controlled and well-defined way. To evaluate their approach, Degrossi et al. [11] conducted an experimental evaluation to verify whether volunteered information, obtained through a crowdsourcing platform, was useful in the context of flood risk management. According to Degrossi et al. [11], volunteered information is considered useful if it can be used in hydrological models or decision support systems. The results indicated that volunteered information can be useful in the flood risk management context, since the average of volunteered information about water level was equal to the average of the measurements carried out by a sensor. However, there is not enough experimental evidence for such a statement, since the number of participants in that evaluation was small (10 participants). Thus, we saw a need to conduct another experimental evaluation to verify the usefulness of this type of information.

The experimental evaluation conducted in this work aimed at verifying the results obtained in the first experiment. For this, we selected fifteen new participants from among the students of an Experimental Software Engineering course taught at the university.

Unlike in the first experiment, the participants in this one had no experience with the context of flood risk management. The use of volunteers without experience brings this experimental evaluation closer to the real scenario of the approach proposed by Degrossi et al. [11], where any volunteer can provide information about environmental variables in the context of flood risk management.

In addition, unlike the first experiment, two points of the watershed of São Carlos/SP city were selected for conducting the experimental evaluation.

Before the experiment, participants received training on the crowdsourcing platform (Flood Citizen Observatory²), the mechanism used to interpret the water level in the riverbed, and how to insert a report in the platform.

The Flood Citizen Observatory is a crowdsourcing platform that aims at obtaining volunteered information for the flood risk management context, “more specifically about flooded areas and water level in the riverbed” [11]. After training, participants were taken to the two points of the watershed of São Carlos/SP city to collect information about the water level in the riverbed and insert it in the crowdsourcing platform.
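As an illustration of the data this procedure yields, the sketch below models one report per participant and collection point and averages the reported water levels per point. The record fields, point names and readings are all hypothetical; they are not the schema of the Flood Citizen Observatory nor data from the experiment.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class WaterLevelReport:
    """One volunteered report (hypothetical fields, not the platform's schema)."""
    participant_id: int
    point: str             # collection point where the volunteer observed the river
    water_level_cm: float  # water level as read by the volunteer


def mean_level_by_point(reports):
    """Group reports by collection point and return the mean water level of each."""
    levels = defaultdict(list)
    for report in reports:
        levels[report.point].append(report.water_level_cm)
    return {point: mean(values) for point, values in levels.items()}


# Invented readings for illustration only; these are NOT the study's data.
reports = [
    WaterLevelReport(1, "point_a", 40.0),
    WaterLevelReport(2, "point_a", 50.0),
    WaterLevelReport(1, "point_b", 70.0),
    WaterLevelReport(2, "point_b", 90.0),
]
means = mean_level_by_point(reports)
```

Keeping the collection point in every record allows each per-point average to be compared with the sensor installed at the same location.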

To guide this experimental evaluation, the research question defined by Degrossi et al. [11] was selected. The question aims at determining whether the difference between the average of information provided by participants (volunteers) and the average of data measured by a sensor is statistically significant.

As noted by Degrossi et al. [11], it is important to verify this difference because if it is significant “volunteer information may not reflect the real state of the environmental variable observed, resulting in erroneous predictions about the flood risk”.

Thus, the research question of this work is:

RQ) Is the difference between volunteered information and sensor data significant?

For this research question, two hypotheses were defined, a null hypothesis and an alternative hypothesis:

Null Hypothesis (H0): the average of volunteered information is equal to the average of sensor data.

µ(volunteer) = µ(sensor)

Alternative Hypothesis (H1): the average of volunteered information is different from the average of sensor data.

µ(volunteer) ≠ µ(sensor)

For each experiment variable, that is, the average of volunteered information and the average of sensor data, metrics were defined for measuring the variable.
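A comparison of two averages such as the one stated by H0 and H1 can be sketched as a two-sample test. The example below is a minimal sketch that assumes Welch's t-test (unequal variances); this section does not name the statistical test used, so the choice of test, the function names and the sample values are assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error contributed by each sample mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples are constant; t is undefined")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


def reject_h0(t, t_critical):
    """Two-sided decision: reject H0 when |t| exceeds the critical value."""
    return abs(t) > t_critical


# Invented water levels in centimetres; these are NOT the study's data.
volunteer = [49.0, 51.0, 53.0, 52.0, 50.0]
sensor = [49.0, 50.0, 51.0, 50.0, 50.0]

t, df = welch_t(volunteer, sensor)
# 2.571 is the two-sided 5% critical value of Student's t for 5 degrees of
# freedom; rounding df down to 5 is the conservative choice.
decision = reject_h0(t, t_critical=2.571)
print(f"t = {t:.3f}, df = {df:.2f}, reject H0: {decision}")
```

With these invented values the statistic stays below the critical value, so H0 would not be rejected. Failing to reject H0 means only that the data show no significant difference between the averages; it does not prove that they are equal.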

