Cadernos de Saúde Pública
ISSN 1678-4464
Vol. 36, no. 9
Rio de Janeiro, September 2020
ARTICLE
The correspondence between the structure of the terrestrial mobility network and the spreading of COVID-19 in Brazil
Vander Luis de Souza Freitas, Thais Cláudia Roma de Oliveira Konstantyner, Jeferson Feitosa Mendes, Cátia Souza do Nascimento Sepetauskas, Leonardo Bacelar Lima Santos
http://dx.doi.org/10.1590/0102-311X00184820
COVID-19; Public Health Surveillance; Epidemics
Introduction
The world is currently facing a global public health emergency due to the COVID-19 pandemic, declared on March 11th, 2020 by the World Health Organization (WHO) 1. As of June 4th, 2020, more than 6.7 million cases had been confirmed worldwide, with almost 400,000 deaths. In Brazil, the first documented case was in the city of São Paulo on February 26th, 2020. Since then, about 615,000 cases and 34,000 deaths have been confirmed in the national territory 2 (Worldometers COVID-19: coronavirus pandemic. https://www.worldometers.info/coronavirus/, accessed on 15/May/2020; Ministério da Saúde. Painel coronavírus. https://covid.saude.gov.br/, accessed on 14/May/2020).
The inter-city mobility network serves as a proxy for the transmission network, which is vital for understanding outbreaks, especially in Brazil, a country of continental dimensions 3,4,5,6,7. The complex network approach 8 emerges as a natural mechanism to handle mobility data computationally, taking areas as nodes (fixed) and movements between origins and destinations as connections (flows). Some network measures can be used to find the structurally more vulnerable areas in the context of the current study. The degree of a node is the number of cities to which it is connected, showing the number of possible destinations. The strength captures the total number of people who travel to (or come from) such places in each time frame. From a probability perspective, the cities that receive more people are more vulnerable to SARS-CoV-2. The betweenness centrality, on the other hand, considers the entire network to depict the topological importance of a city in the routes that are more likely to be used.
In this context, this work investigates the correspondence between network measures and the emergence of cities with confirmed cases of COVID-19 at two scales: Brazil and the State of São Paulo. Specifically, we analyze (i) the Brazilian inter-city mobility networks under different flow thresholds, which discard the lowest-frequency trips and matter especially at the beginning of the outbreak, when the interiorization of the disease was not yet in progress; and (ii) the correspondence between the network statistics and the spreading of COVID-19 in Brazil.
Methods
The most common mobility data used in studies of this nature in Brazil are the pendular (commuting) trips from the 2010 national census 9. In this research, we use the Brazilian Institute of Geography and Statistics (IBGE) road data from 2016 10, which contain the flows between cities for terrestrial vehicles on which it is possible to buy a ticket (mainly buses and vans). These data seek to quantify the interconnection between cities, the attraction that urban centers exert for the consumption of goods and services, and the long-distance connectivity of Brazilian cities. The North region is not included in this paper because neither the fluvial nor the air transport modes are covered, and their roles are crucial to understanding the spreading process there, especially in the Amazon region. According to an investigation of the seroprevalence of antibodies to SARS-CoV-2 11, Northern cities are among those with the highest values, and six of them are located along a 2,000 km stretch of the Amazon river.
The above-cited IBGE data 10 contain the travel frequency (flow) between pairs of Brazilian cities/districts in a general/typical week, considering only origins and destinations, without any information about possible intermediate connections between them. The frequencies are aggregated over the round trip, which means that the number of trips from city A to city B is the same as from B to A. We produce two undirected networks with different numbers N of nodes to cover two scales (country and state):
(1) N = 4,987 - Brazil without the North region: nodes are cities and edges are the flow of direct travels between them.
(2) N = 620 - São Paulo State: a subset of the previous network, containing only cities within São Paulo State. For simplicity, no further analysis is performed to evaluate how the network depends on neighboring cities outside the state.
Some cities are not present in our networks because of a simplification made by IBGE: it groups small neighboring municipalities with almost no flow into single nodes. For simplicity, and considering that such places had no cases in the first days of the outbreak, they are not individually accounted for in our analysis.
We focus on two versions of each network, defined by a flow threshold η: η_0 (η = 0), which is the original network from the IBGE data, and η_d (η = d). Here d is the highest flow threshold that produces the network with the largest diameter. The motivation behind η_d is to obtain a threshold high enough to discard the least frequent connections without disregarding the most frequent ones 6,12.
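The threshold selection can be illustrated with a minimal sketch. It is not the authors' code: the toy flows, the function names, the strict comparison (flow > η), and the choice of measuring the diameter as the longest finite shortest path in a possibly disconnected network are all assumptions made for illustration.

```python
from collections import deque

def threshold_network(flows, eta):
    """Adjacency dict keeping only edges whose flow exceeds eta (assumed strict)."""
    adj = {}
    for (u, v), f in flows.items():
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if f > eta:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def diameter(adj):
    """Longest finite shortest path (in hops), using a BFS from every node."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

def find_eta_d(flows):
    """Highest candidate threshold whose network attains the largest diameter."""
    candidates = [0] + sorted(set(flows.values()))
    diameters = {eta: diameter(threshold_network(flows, eta)) for eta in candidates}
    d_max = max(diameters.values())
    return max(eta for eta, d in diameters.items() if d == d_max)

# Hypothetical weekly flows between four cities (not IBGE data).
toy_flows = {
    ("A", "B"): 10, ("B", "C"): 8, ("C", "D"): 6,
    ("A", "C"): 2, ("B", "D"): 1, ("A", "D"): 1,
}
eta_d = find_eta_d(toy_flows)
print(eta_d)
```

In this toy case the weak edges act as shortcuts, so removing them stretches the network into a chain, while higher thresholds start breaking the chain apart and the diameter shrinks again.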
We also use COVID-19 data from the state daily bulletins and the Brazilian Ministry of Health 2, which are reported by place of residence and notification date. They show that, until June 4th, 2020, the number of cities with at least one confirmed COVID-19 patient is 3,851 in the Brazil without the North region network, which corresponds to 77% of the nodes. The analogous number for São Paulo State is 535 (86% of the nodes). With these data, we track how well each measure detects vulnerable cities as the virus spreads, following the order in which cities notified their first case.
Complex network measures
The topological degree (k) 8 of a node is the number of links it has to other nodes. As here the networks are undirected, there is no distinction between incoming and outgoing edges.
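In a connected graph, there is at least one shortest path between any pair of nodes v and w. The betweenness centrality 8 (b) of a node i is the sum, over all such pairs, of the fraction of shortest paths that pass through i:
$$b_i = \sum_{v \neq i \neq w} \frac{\sigma_{vw}(i)}{\sigma_{vw}},$$
where σ_vw is the number of shortest paths between v and w, and σ_vw(i) is the number of those paths that pass through i.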
Although it is a pointwise measure, it considers non-local information related to all shortest paths on the network. It is worth highlighting that in this context this centrality index is not a transportation (physical) measure but a mobility (process) one. Besides, neither degree nor betweenness accounts for the network flows here; both are computed on the binary (unweighted) networks. The diameter of a network is the distance between its farthest nodes, given by the longest shortest path.
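As an illustration of the unweighted betweenness described above, the following sketch uses Brandes' accumulation scheme on an adjacency dictionary. The function name, the toy path graph, and the convention of counting each unordered pair of endpoints once (no normalization) are assumptions, since the text does not state which convention was used.

```python
from collections import deque

def betweenness(adj):
    """Unweighted betweenness via Brandes' algorithm; each unordered pair counts once."""
    b = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}  # predecessors on shortest paths from s
        order = []                    # nodes in order of non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}  # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                b[w] += delta[w]
    # In an undirected graph every pair was visited from both endpoints.
    return {v: value / 2 for v, value in b.items()}

# Hypothetical chain of four cities: A - B - C - D.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
scores = betweenness(path)
print(scores)
```

The two inner cities of the chain lie on every route between the ends, so they receive all the betweenness, whereas the end cities receive none.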
The strength (s) 8 of a node, on the other hand, is the accumulated flow over its incident edges:
$$s_i = \sum_{j} F_{ij},$$
in which F_ij is the flow between nodes i and j.
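Degree and strength can both be read directly from an edge list of flows. The sketch below assumes a dictionary keyed by unordered city pairs; the city names and flow values are hypothetical.

```python
def degree_and_strength(flows):
    """Return (degree, strength) dicts from undirected flows {(i, j): F_ij}."""
    degree, strength = {}, {}
    for (i, j), f in flows.items():
        for node in (i, j):
            degree[node] = degree.get(node, 0) + 1        # one more neighbor
            strength[node] = strength.get(node, 0) + f    # accumulate incident flow
    return degree, strength

# Hypothetical weekly flows between three cities.
toy_flows = {("A", "B"): 10, ("A", "C"): 5, ("B", "C"): 1}
k, s = degree_and_strength(toy_flows)
print(k["A"], s["A"])
```

All three toy cities have the same degree but different strengths, which is the distinction the two measures are meant to capture.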
Figure 1 Illustration of network measures: strength, degree, and betweenness.
We assess which of the computed measures (s, k, and b) of the mobility networks best approximates the spreading of COVID-19 in Brazil. We compare the n top-ranked cities of each measure with the n cities that contain confirmed cases. We vary n from 1 to the number of cities with confirmed cases to follow the transmission dynamics. To verify whether the rate of correspondence between the top-ranked cities from the network measures and the cities with COVID-19 cases is statistically significant, we compare it with the result of picking cities at random, instead of under the measures' guidance, via a hypothesis test with simulated distributions 13. We perform 100,000 simulations for each n, choosing n nodes by chance and recording the rate of positive cases.
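The comparison and the simulation-based test can be sketched as follows. This is an illustrative reading of the procedure, not the authors' implementation: the function names, the toy scores, the one-sided empirical p-value, and the reduced number of simulations are assumptions.

```python
import random

def correspondence_rate(scores, infected, n):
    """Fraction of the n top-ranked cities that are among the cities with cases."""
    top = sorted(scores, key=scores.get, reverse=True)[:n]
    return sum(1 for city in top if city in infected) / n

def random_rates(nodes, infected, n, simulations, seed=0):
    """Rates obtained when the n cities are picked uniformly at random."""
    rng = random.Random(seed)
    nodes = list(nodes)
    rates = []
    for _ in range(simulations):
        sample = rng.sample(nodes, n)
        rates.append(sum(1 for city in sample if city in infected) / n)
    return rates

def empirical_p_value(observed, rates):
    """Share of random draws doing at least as well as the observed rate."""
    return sum(1 for r in rates if r >= observed) / len(rates)

# Hypothetical strengths for six cities and the first three cities with cases.
toy_strength = {"A": 100, "B": 80, "C": 60, "D": 40, "E": 20, "F": 10}
first_infected = {"A", "B", "D"}
n = len(first_infected)

observed = correspondence_rate(toy_strength, first_infected, n)
rates = random_rates(toy_strength, first_infected, n, simulations=1000)
p = empirical_p_value(observed, rates)
print(observed, p)
```

With only six toy cities the random baseline often matches the ranking; on networks with thousands of nodes the gap between a guided and a random choice is what the significance test quantifies.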
Geographical visualization
A geographical approach for complex systems analysis is especially important for mobility phenomena. Santos et al. 14 proposed a graph where the nodes have a known geographical location, and the edges have spatial dependence, the (geo)graph. It provides a simple tool to manage, represent, and analyze geographical complex networks in different domains 6,12 and it is used in this work. The geographical manipulation is performed with the PostgreSQL Database Management System (https://www.postgresql.org/) and its spatial extension PostGIS. Lastly, the maps are produced using the Geographical Information System ArcGIS (http://www.esri.com/software/arcgis/index.html).
Results
This section presents the results of the topological analysis for the previously mentioned networks. Under η_0 (original flows), the Brazil without the North region network has N = 4,987 nodes, |E| = 59,453 edges, average strength <s> = 1,169.4, average degree <k> = 23.8, and average betweenness <b> = 5,219.4. When the threshold η_d (the highest threshold with maximum diameter) is considered, it has |E| = 2,482, <s> = 414, <k> = 1, and <b> = 1,385.6. The São Paulo State network, on the other hand, has N = 620 nodes, |E| = 4,796 edges, average strength <s> = 1,132.4, average degree <k> = 15.5, and average betweenness <b> = 504.2 under η_0. The more restricted version, with η_d, has |E| = 486, <s> = 535, <k> = 1.6, and <b> = 169.9.
Two nodes are connected when there is a nonzero flow between them, which means that the number of connections |E| decreases as the threshold η increases. The resulting networks are undirected and, throughout the paper, both the degree and the betweenness measures are computed on unweighted edges rather than on the flows. The diameter of the networks is computed for varying η, and the highest threshold with maximum diameter is found for both networks: η_d = 507.55 for Brazil without the North region and η_d = 169.9 for São Paulo State.
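As a quick consistency check on the reported figures, the average degree of an undirected network follows from the handshake identity, since each edge contributes to the degree of two nodes. The worked values below use only the N and |E| reported above and assume that N stays fixed when the threshold is applied, so that isolated nodes still count.

```latex
\[
\langle k \rangle = \frac{2|E|}{N}
\]
\begin{align*}
\text{Brazil without the North region, } \eta_0: \quad & \frac{2 \times 59{,}453}{4{,}987} \approx 23.8 \\
\text{Brazil without the North region, } \eta_d: \quad & \frac{2 \times 2{,}482}{4{,}987} \approx 1.0 \\
\text{S\~ao Paulo State, } \eta_0: \quad & \frac{2 \times 4{,}796}{620} \approx 15.5 \\
\text{S\~ao Paulo State, } \eta_d: \quad & \frac{2 \times 486}{620} \approx 1.6
\end{align*}
```

All four values agree with the reported averages after rounding.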
Following the (geo)graphs approach, it is possible to visualize nodes and edges of the Brazilian mobility network in the geographical space for η_d in
Figure 2 Maps of the Brazil without the North region and São Paulo State networks and the topological strength associated with each node/city for thresholds η_0 and η_d.
Figure 3 Correspondence (rate) between the n top ranked cities for different network criteria: s, k, and b, and cities that have at least one patient with COVID-19 in Brazil without the North region and São Paulo State.
According to
High-frequency oscillations are perceived in
Interestingly, on March 31st, the high-frequency oscillations start to diminish in São Paulo State. A few days later, after April 7th, the betweenness centrality with η_d starts to be a poor predictor for Brazil without the North region and then for São Paulo State.
Discussion
We present a complex network-based analysis of the Brazilian inter-city mobility networks aimed at identifying cities that are vulnerable to the spread of SARS-CoV-2. The networks are built from the 2016 IBGE terrestrial mobility data, which give the flow of people between cities in a general/typical week. The cities are modeled as nodes and the flows as weighted edges. The geographical graphs, (geo)graphs, are visualized within Geographical Information Systems.
Two scales are investigated, the Brazilian cities without the North region, and the State of São Paulo. The former does not account for the North due to the high number of fluvial routes and some intrinsic local characteristics that are not represented with the terrestrial data. The State of São Paulo is crucial in the ongoing pandemic since the first documented case was in the state capital, and it is currently one of the main focuses of the virus spreading.
Three network measures are studied, namely the strength, degree, and betweenness centrality, under two flow thresholds to account for different mobility intensities, the original flow data and networks with only the edges with higher weights. Other network measures were preliminarily tested, including the weighted version of the betweenness centrality. However, the integrals of the correspondence curves of
Regarding
Due to their importance in mobility, many cities of
Both s and b with η_0 yield good results at the beginning of the pandemic for the Brazil without the North region network, but s alone becomes the best predictor from the end of April. The most important cities, due to their high flow of travelers and their role in the most used routes, are reached first, followed by those with smaller flows, probably because of the interiorization of the virus, that is, the outbreak reaching the countryside cities. This behavior is even more pronounced in São Paulo State, in which s under η_d is the best option at first, neglecting lower-flow places, especially in April, but s under η_0 becomes the best option from May onwards.
In the ongoing pandemic, from May 1st onwards, the s index with η_0 is the best predictor and may help to identify which countryside cities are about to receive new cases. Moreover, it may help in subsequent waves of the disease. In the case of another pandemic, one could first compute the strength of the networks from the most recent IBGE data and identify the top-ranked cities. For Brazil, it is enough to check the strength on the original data, as we presented, since it produces results similar to those of the betweenness centrality and is computationally cheaper to obtain. For the State of São Paulo, it is better to check the strength index with threshold η_d in the first weeks and only then switch to η_0. As our results show, the correspondence is statistically significant and, along with other information about the regions, such as where the first cases were notified, the pandemic could be closely traced.
It is worth mentioning that the COVID-19 data come from the Brazilian Health Surveillance System, which is fed with data provided by each city in the country. Updating this information is a complex and dynamic process, and there may be delays or errors in the data transfer. Moreover, considering the size and heterogeneity of Brazil, it is important to highlight that there are differences in the capacity to detect cases in a timely manner and in the quality of the information 19. On the other hand, in late January 2020, almost a month before the first Brazilian COVID-19 case, the epidemiological surveillance guidelines and the National Contingency Plan for COVID-19 were published. One of the main objectives of these documents was to provide early guidance to the Brazilian Unified National Health System (SUS) service network on identifying COVID-19 cases 20. Besides the limitations of the health surveillance system, the IBGE data lack information about possible intermediate stops between origins and destinations, as they give only the travelers' initial and final positions.
As future work, we intend to analyze fluvial and aerial mobility data as well, as they include valuable information about the transport of people and goods. The former is fundamental to the discussion of the dynamics for the Brazilian North region, especially the Amazon, and the latter captures long-range connections which are relevant in a possible future moment of re-emergence of the disease in the country, especially by foreign travelers. Lastly, one could check for correspondences between the measures of networks and data from other epidemics, and analyze control measures based on topological properties associated with the mobility network 21.
Acknowledgments
We would like to thank Jussara R. Angelo (Oswaldo Cruz Foundation, Rio de Janeiro, Brazil) for the valuable discussions.
References
This is an open-access article distributed under the terms of the Creative Commons Attribution License