An experimental evaluation of a crowdsourcing-based approach for flood risk
Ranieri de Brito Moreira, Lívia Castro Degrossi, and João Porto de Albuquerque
University of São Paulo,
Institute of Mathematical and Computing Sciences,
São Carlos – SP, Brazil
Abstract. Volunteered geographic information (VGI) is a potential source of information to complement other sources in flood risk management. However, there is still not enough experimental evidence about the usefulness of VGI in different situations and scenarios. We conducted an experimental evaluation to verify whether VGI, obtained through a crowdsourcing platform, can be useful in the flood risk management context. The experiment took place at two points of the watershed of São Carlos/SP in Brazil with 15 participants. The results show that volunteered geographic information is, on average, comparable to sensor data. Thus, we conclude that VGI produced through crowdsourcing can be a useful source of information for flood risk management.
Keywords: Experimental Evaluation, Experimental Software Engineering, Volunteered Geographic Information, Flood Risk Management, Disaster Management

1 Introduction
The increase in the number of natural disasters has been a growing concern for national and international organizations, due to their environmental risks and potential hazards. The occurrence of a natural disaster is related to environmental characteristics and social system vulnerability. The prevention of natural disasters and the reduction of social system vulnerability are themes of major concern at the local, national and international levels.
Disaster management (DM) consists of a set of activities that aim at preventing or reducing the social and economic impacts of natural disasters [4, 5]. DM is divided into four main phases: mitigation, preparedness, response and recovery.
In each of these phases, different types of actions, policies and information are required. In particular, investments in the first two phases (mitigation and preparedness) are expected to greatly reduce the impact and losses caused by a natural disaster.
According to “The International Disaster Database”¹, floods are the most frequently occurring disaster in the world, and in Brazil in particular. Flood risk management (FRM) is a subarea of disaster management that aims to control a flood, prepare for its occurrence and mitigate its impacts. In particular, efficient preparation requires the monitoring of risk areas before a disaster [7, 8]. Accurate information about the current state of environmental variables is mandatory for enabling the simulation of the effects and severity of a disaster [6, 8]. This type of simulation helps to reduce the impact of floods, allows the affected population to take preventive measures and enables agencies to develop emergency response actions.
¹ Extracted from: http://www.emdat.be
However, effectively managing flood risk depends on the availability of up-to-date and accurate information about environmental variables for improving situational awareness. Such information can be obtained from different sources, including sensors, satellites and other technologies. Another source that can be used is information provided by volunteers, so-called Volunteered Geographic Information (VGI).
The use of “humans as sensors” can be a valuable source of information in the context of disaster management, due to the potentially large number of volunteers. Indeed, the increase in the number of mobile devices with GPS (Global Positioning System), along with the interactions enabled by Web 2.0, has facilitated the creation of geographic information by the general public.
However, despite the advantages of volunteers’ participation in gathering information, several challenges still need to be faced.
The quality of data generated by volunteers is a major concern. Volunteered information can be created with omissions, exaggerations, and even errors. Another challenge is the lack of structure of volunteered information. It is often regarded as insufficiently structured, documented and validated, while information collected with devices such as satellites and sensors has well-defined patterns and structures. Thus, there is a need to verify whether volunteered information can be useful in the flood risk management context, i.e. whether it is sufficiently structured and accurate to contribute to this application domain.
In this context, Degrossi et al. proposed a crowdsourcing-based approach for gathering useful volunteered information for the flood risk management context.
The authors performed an experimental evaluation of the approach with 10 participants at one point of the watershed of São Carlos/SP. Due to the small number of participants and points of the watershed, there is still not enough experimental evidence to support the claim that volunteered information is useful in this context. Thus, this paper presents an experimental evaluation to verify whether volunteered information, obtained through a crowdsourcing platform, can be useful in the flood risk management context in Brazil.
The remainder of this paper is structured as follows: Section 2 presents the theoretical background for this work. In Section 3, the experimental evaluation is explained. Section 4 presents the results. Finally, Section 5 presents the conclusions of this study.
2 Background
In the next sections, we present the concepts and underlying principles related to this work.
2.1 Flood Risk Management
Flood Risk Management is the process of managing a flood risk situation. It aims at controlling a flood, being prepared for it and minimizing its impacts. To this end, FRM comprises actions before, during and after a flood. These actions involve early warning and forecasting scenarios, contingency plans and restoration.
Among natural disasters occurring worldwide, floods are the most frequent, representing 30% of natural disasters. The increase in the number of floods is associated with climate change and is aggravated by urban sprawl and rapid urbanization without the availability of essential services [13, 14]. The number of affected people and the financial and economic damages increase every year.
The preparation phase aims at reducing residual risk through early warning systems and measures that can be taken to minimize flood impacts. This requires constant monitoring of risks and danger assessment. Every dollar invested in flood prevention reduces the damage incurred in a natural disaster by US$ 25.
Currently, geographic information and related technologies play a fundamental role in all phases of FRM. Natural disasters are typically monitored using different devices such as sensors, satellites and seismometers, among others. However, these devices do not provide information about the impacts of a disaster. Volunteered geographic information can be a valuable source of information about the impacts of natural disasters, due to the potentially large number of volunteers who act as “sensors”, noticing important parameters of a disaster in a local environment.
2.2 Volunteered Geographic Information
At first, the creation of geographic information was carried out by official agencies.
However, with the increase of interactions made possible by Web 2.0, the use of devices with GPS (Global Positioning System) and access to broadband Internet, geographic information is being produced by people who have little formal qualification. This type of information is called Volunteered Geographic Information (VGI).
Among the advantages associated with VGI, researchers emphasize its use to enhance, update or complement existing geospatial data. In some scenarios, volunteered information has better quality than data provided by specialized organizations, since in different parts of the world the official data is outdated or was acquired with older and less precise technologies than those currently available to the general public.
However, despite the advantages of citizens’ participation in collecting information, there are many challenges to be faced. Data quality is a major concern.
Information coming from many individuals can raise doubts about its credibility.
The credibility of VGI can be understood as a subjective concept that describes whether a piece of information can be trusted, considering any possible intentional or unintentional omission or exaggeration error. Moreover, it is not known beforehand how, or from which source, the information will be provided.
Another challenge refers to location. Unlike in-situ sensors, people are in constant movement, so the observations they make need to be located in order to become useful. Furthermore, VGI is often regarded as poorly structured, documented and validated.
Recent natural disasters have shown that volunteered information can improve situational awareness by providing an overview of the present situation. This is because VGI offers a great opportunity to raise awareness, due to the potentially large number of volunteers who observe important parameters of disaster management in a local environment [4, 9, 19]. Moreover, despite recent advances in the development of sensors, sensor observations may be unavailable due to communication interruptions or even the destruction of the sensor; in addition, a sensor cannot measure certain phenomena, such as hail storms.
In this scenario, different software platforms, also called crowdsourcing platforms, have been employed for gathering volunteered information and enabling its visualization and analysis. The term crowdsourcing can be understood as a production model where the intelligence and knowledge of volunteers are used to solve problems, create content and develop new technologies. Furthermore, the term refers to a way of organizing work that involves an information system for coordinating and following up tasks carried out by people.
3 Experimental Evaluation
The effectiveness of an approach can be verified through an experimental evaluation conducted in a controlled and well-defined way. To evaluate their approach, Degrossi et al. conducted an experimental evaluation to verify whether volunteered information, obtained through a crowdsourcing platform, was useful in the context of flood risk management. According to Degrossi et al., volunteered information is considered useful if it can be used in hydrological models or decision support systems. The results indicated that volunteered information can be useful in the flood risk management context, since the average of volunteered information about water level was equal to the average of the measurements carried out by a sensor. However, there is not enough experimental evidence for such a statement, since the number of participants in that evaluation was small (10 participants). Thus, we saw a need to conduct another experimental evaluation to verify the usefulness of this type of information.
The experimental evaluation conducted in this work aimed at verifying the results obtained in the first experiment. For this, we selected fifteen new participants. The participants were selected from among the students of an Experimental Software Engineering course taught at the university.
Unlike in the first experiment, the participants in this one had no experience with the context of flood risk management. The use of volunteers without experience brings this experimental evaluation closer to the real scenario of the approach proposed by Degrossi et al., where any volunteer can provide information about environmental variables in the context of flood risk management.
In addition, unlike in the first experiment, two points of the watershed of São Carlos/SP were selected for conducting the experimental evaluation.
Before the experiment, participants received training on the crowdsourcing platform (Flood Citizen Observatory²), the mechanism used to interpret the water level in the riverbed, and how to insert a report in the platform.
The Flood Citizen Observatory is a crowdsourcing platform that aims at obtaining volunteered information for the flood risk management context, “more specifically about flooded areas and water level in the riverbed”. After training, participants were taken to the two points of the watershed of São Carlos/SP to collect information about the water level in the riverbed and insert it in the crowdsourcing platform.
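To make the notion of a structured, georeferenced report more concrete, the following is a minimal sketch of how one volunteered water-level observation could be represented and validated. It is not the data model of the Flood Citizen Observatory: the class name `WaterLevelReport`, all field names, the coordinate convention (decimal degrees) and the example values are assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WaterLevelReport:
    """Hypothetical structure for one volunteered water-level observation."""

    volunteer_id: str
    latitude: float        # decimal degrees (assumed convention)
    longitude: float       # decimal degrees (assumed convention)
    water_level_cm: float  # water level read by the volunteer, in centimetres
    observed_at: datetime  # when the observation was made

    def __post_init__(self):
        # Basic sanity checks so that malformed reports are rejected early.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be within [-90, 90]")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be within [-180, 180]")
        if self.water_level_cm < 0:
            raise ValueError("water level cannot be negative")


# Illustrative values only; these are NOT data from the experiment.
report = WaterLevelReport(
    volunteer_id="participant-01",
    latitude=-22.0,
    longitude=-47.9,
    water_level_cm=50.0,
    observed_at=datetime(2014, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(report)
```

Attaching a location and a timestamp to every report, and rejecting out-of-range values at creation time, is one simple way to address the structure and location concerns discussed in Section 2.2.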
To guide this experimental evaluation, the research question defined by Degrossi et al. was selected. The question aims at determining whether the difference between the average of information provided by participants (volunteers) and the average of data measured by a sensor is statistically significant.
As noted by Degrossi et al., it is important to verify this difference because if it is significant “volunteer information may not reflect the real state of the environmental variable observed, resulting in erroneous predictions about the flood risk”.
Thus, the research question of this work is:
RQ) Is the difference between volunteered information and sensor data significant?
For this research question, two hypotheses were defined, a null hypothesis and an alternative hypothesis:
Null Hypothesis (H0): the average of volunteered information is equal to the average of sensor data.
µ(volunteer) = µ(sensor)
Alternative Hypothesis (H1): the average of volunteered information is different from the average of sensor data.
µ(volunteer) ≠ µ(sensor)
For each experiment variable, that is, the average of volunteered information and the average of sensor data, metrics were defined for measuring the variable.
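As an illustration of how H0 and H1 could be checked in practice, the sketch below runs a two-sided permutation test on the difference of means. This is a stand-in technique chosen because it needs only the Python standard library; it is not necessarily the statistical test applied in this study. The function name `permutation_test_mean_diff`, the readings, and the significance level of 0.05 are all assumptions.

```python
import random
import statistics


def permutation_test_mean_diff(volunteer, sensor, n_resamples=5000, seed=0):
    """Two-sided permutation test for H0: mean(volunteer) == mean(sensor).

    Returns the observed difference of means and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = statistics.mean(volunteer) - statistics.mean(sensor)
    pooled = list(volunteer) + list(sensor)
    n_vol = len(volunteer)
    extreme = 0
    for _ in range(n_resamples):
        # Under H0 the group labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_vol]) - statistics.mean(pooled[n_vol:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical water-level readings in centimetres (NOT data from the study).
volunteer_cm = [52, 48, 50, 55, 47, 51, 49, 53]
sensor_cm = [50, 51, 49, 50, 52, 50, 51, 49]

diff, p = permutation_test_mean_diff(volunteer_cm, sensor_cm)
alpha = 0.05  # assumed significance level
print(f"difference of means = {diff:.3f} cm, p = {p:.3f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

Failing to reject H0 in such a test would be consistent with the averages being comparable, although it does not by itself prove that they are equal.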