### Data source

We defined COVID-19 cases as cases of SARS-CoV-2 infection that were newly confirmed as positive by polymerase chain reaction test. Data on COVID-19 cases across the 47 prefectures from March 1 to December 31, 2020, were obtained from the Toyo Keizai Online “The Novel Coronavirus Disease (COVID-19) Situation Report in Japan” by Kazuki Ogiwara, in which the dates of cases are based on diagnosis (reporting)^{29}. As a sensitivity analysis, presented in the supplementary section, we also dated cases by symptom onset using the report by the Tokyo government. In that report, persons with an unknown onset date (e.g., no symptoms even if the test result is positive, no memory of onset date) are excluded from the graph^{30}.

“LocationMind xPop” data are people-flow data derived from individual location data sent from mobile phones, with users’ consent, through applications such as the “docomo map navi” service (map navi, local guide) provided by NTT DOCOMO, INC. The data are processed collectively and statistically to conceal private information. The original location data are GPS data (latitude, longitude) sent at intervals of no less than 5 min and do not include information that identifies individuals. Informed consent for this study was not obtained; it was not necessary because the data were anonymized to prevent individuals from being identified and personal information, such as gender, age and occupation, was unknown. The mobility data used in the present study were provided by LocationMind and comprised the daily interprefectural origin–destination (OD) mobility flow volumes among prefectures in Japan. Travel volumes were aggregated from GPS trajectories created from the mobile phone applications, with no residual information traceable to individual users. The data give the number of displacements (or trips) observed between any two consecutive locations where a user spent at least 15 min. LocationMind calibrated the travel flows using the national census so that they represent the total population of Japan. NTT Docomo is the largest mobile phone operator in Japan and accounts for approximately 44% of total mobile phone subscribers^{31}.

### Variables

Two types of human mobility were used as objective variables: intraprefectural human mobility, which is the number of people moving within one prefecture, and interprefectural human mobility, which is the number of people moving across prefecture borders. We also examined the dates these time series changed dramatically during 2020 and assessed their relationship with the introduction of government policies to control human mobility in Tokyo, Hokkaido, and Okinawa.

### Statistical analysis

The effects of human mobility suppression do not appear immediately, and ongoing suppression of human mobility is required to reduce the number of infected people. It is therefore difficult to directly compare the number of newly infected people with human mobility; instead, we compared human mobility with the IAI, which is the rate of change in the speed of infection. This measurement gives an intuitive picture of COVID-19 infection and provides a good indication of infection status. To obtain it, we log-transformed the daily number of new infections and then took the difference between consecutive days. The IAI is related to the SIR model^{32,33}. The SIR model aims to predict the number of individuals who are susceptible to infection, actively infected, or have recovered from infection. The model assumes that no infection control measures are taken, so that the infectious disease eventually spreads through the whole population. In reality, however, only 0.186% of people in Japan had been infected by December 31, 2020, so almost the entire population remained susceptible. Under this condition, the SIR model reduces to a simpler exponential function. The SIR model is represented by Eq. (1):

$$ \frac{dS}{dt}\left( t \right) = -\beta \frac{S\left( t \right)I\left( t \right)}{N\left( t \right)}, \quad \frac{dI}{dt}\left( t \right) = \beta \frac{S\left( t \right)I\left( t \right)}{N\left( t \right)} - \gamma I\left( t \right), \quad \frac{dR}{dt}\left( t \right) = \gamma I\left( t \right) $$

(1)

where \(I\left( t \right)\) is the total number of individuals infected, \(S\left( t \right)\) is the number of susceptible individuals, \(R\left( t \right)\) is the number of recovered individuals, and \(N\left( t \right)\) is the total population at day \(t\), and \(\beta\) and \(\gamma\) are the infection and recovery rate parameters. Considering \(\frac{S\left( t \right)}{N\left( t \right)} \approx 1\), the SIR model reduces to Eq. (2):

$$ \frac{dI}{dt}\left( t \right) = \left( \beta - \gamma \right)I\left( t \right) $$

(2)
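The approximation behind Eq. (2) can be checked numerically. The following is a minimal sketch, not the authors' code: it integrates Eq. (1) with a forward-Euler step and compares the result with the exponential solution of Eq. (2), \(I\left( 0 \right)\exp \left( {\left( {\beta - \gamma } \right)t} \right)\). The function name `simulate_sir` and all parameter values (\(\beta = 0.3\), \(\gamma = 0.2\), a population of one million, 10 initial infections) are illustrative assumptions and are not estimates from this study.

```python
import math

def simulate_sir(beta, gamma, n_total, i0, days, dt=0.01):
    """Integrate the SIR equations of Eq. (1) with a forward-Euler step."""
    s, i, r = n_total - i0, i0, 0.0
    steps = int(round(days / dt))
    for _ in range(steps):
        new_inf = beta * s * i / n_total * dt   # flow S -> I during dt
        new_rec = gamma * i * dt                # flow I -> R during dt
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return s, i, r

# Illustrative (assumed) values, chosen so that S/N stays close to 1.
beta, gamma = 0.3, 0.2
n_total, i0, days = 1_000_000, 10.0, 30

s_end, i_end, r_end = simulate_sir(beta, gamma, n_total, i0, days)

# Closed-form solution of Eq. (2) with initial value I(0) = i0.
i_approx = i0 * math.exp((beta - gamma) * days)
rel_error = abs(i_end - i_approx) / i_approx

print(f"SIR I({days}) = {i_end:.1f}, exponential approximation = {i_approx:.1f}")
print(f"relative difference = {rel_error:.4f}")
```

With these assumed values the susceptible fraction barely moves from 1 over the 30 days, so the two curves stay within a few percent of each other. The gap widens once a substantial share of the population has been infected.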

Equivalently, it reduces to Eq. (3):

$$ I\left( t \right) = \exp \left( kt \right) $$

(3)

where \(k = \beta - \gamma\). Here, the effective reproduction number is defined as \(R_{0} = \frac{\beta }{\gamma }\), and \(k = \gamma \left( R_{0} - 1 \right)\) holds^{32}. Since the daily infection number is \(I\left( t \right) - I\left( t - 1 \right) = \exp \left( k \right)\), Eq. (4)

$$ k = \log \left( I\left( t \right) - I\left( t - 1 \right) \right) $$

(4)

holds, where \(I\left( t \right) - I\left( t - 1 \right)\) is the number of new infections. In practice, however, the infection growth rate changes from day to day because of government policies, changes in the properties of the virus, or personal infection control. We therefore consider \(k\left( t \right)\), which is parameter \(k\) at day \(t\), and let \(\Delta k\left( t \right) = k\left( t \right) - k\left( t - 1 \right)\). This is termed the IAI, which may change according to the influence of government policies, genetic characteristics of the virus, and individual infectious disease control measures. Because \(\Delta k\left( t \right)\) is the difference between the logarithms of the new infection numbers on consecutive days, the number of new infections increases if \(\Delta k\left( t \right) > 0\) (that is, if the day-over-day ratio \(\exp \left( \Delta k\left( t \right) \right)\) exceeds 1), decreases if \(\Delta k\left( t \right) < 0\), and does not change if \(\Delta k\left( t \right) = 0\). In the SIR model, \(\Delta k\left( t \right)\) is constant for any day \(t\). If \(\gamma \left( t \right)\), which is \(\gamma\) at day \(t\), were given, the effective reproduction number at day \(t\) would be \(R_{0} \left( t \right) = k\left( t \right)/\gamma \left( t \right) + 1\); however, \(\gamma \left( t \right)\) is difficult to observe.
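The log-difference construction of the IAI can be illustrated with a few lines of Python. This is a sketch under stated assumptions rather than the code used in the study: the function name `infection_acceleration_indicator` and the toy counts are hypothetical, and the rejection of zero counts is a choice made here only because \(\log 0\) is undefined.

```python
import numpy as np

def infection_acceleration_indicator(new_cases):
    """Return delta_k(t) = log n(t) - log n(t-1) for daily new-case counts n(t)."""
    n = np.asarray(new_cases, dtype=float)
    if np.any(n <= 0):
        # Assumption of this sketch: days with zero reported cases must be
        # handled by the caller beforehand, because log(0) is undefined.
        raise ValueError("new_cases must be strictly positive")
    k = np.log(n)        # k(t): logarithm of the number of new infections
    return np.diff(k)    # delta_k(t) = k(t) - k(t-1)

# Hypothetical daily counts: doubling, doubling, flat, halving.
new_cases = [10, 20, 40, 40, 20]
iai = infection_acceleration_indicator(new_cases)
print(np.round(iai, 3))
```

For these toy counts the indicator equals \(\log 2\), \(\log 2\), \(0\), and \(-\log 2\): positive values correspond to a day-over-day increase in new cases, zero to no change, and negative values to a decrease.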

We calculated the cross-correlation function (CCF)^{34} between the mobility time series and the corresponding IAI time series at lags of 0 to 21 days (3 weeks) to evaluate the lagged similarity between human mobility and the IAI. Both the IAI and human mobility were smoothed with a 7-day moving average and standardized, and the CCF was computed as Eq. (5):

$$ r^{hk} \left( l \right) = \frac{\sum\nolimits_{t = l}^{T} \left( k\left( t \right) - \underline{k} \right)\left( h\left( t - l \right) - \underline{h} \right)}{\sqrt{\sum\nolimits_{t = l}^{T} \left( k\left( t \right) - \underline{k} \right)^{2}} \sqrt{\sum\nolimits_{t = l}^{T} \left( h\left( t - l \right) - \underline{h} \right)^{2}}} $$

(5)

where \(h\left( t \right)\) is the mobility time series, \(k\left( t \right)\) is the IAI time series, \(\underline{h}\) and \(\underline{k}\) are the mean values of each series, and \(l\) is the lag. This function gives the correlation between the two time series at each lag \(l\).
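A minimal sketch of the lagged correlation in Eq. (5) is given below, assuming NumPy. The helper names (`moving_average`, `standardize`, `cross_correlation`) and the synthetic series are assumptions made for illustration. In the synthetic example the IAI-like series is constructed to follow the mobility-like series with a delay of 5 days, so the correlation should peak at lag 5.

```python
import numpy as np

def moving_average(x, window=7):
    """Mean over each full window; returns len(x) - window + 1 values."""
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(window) / window, mode="valid")

def standardize(x):
    """Subtract the mean and divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def cross_correlation(k, h, max_lag=21):
    """r(l) of Eq. (5): correlation between k(t) and h(t - l) for l = 0..max_lag."""
    k = np.asarray(k, dtype=float)
    h = np.asarray(h, dtype=float)
    if k.ndim != 1 or k.shape != h.shape:
        raise ValueError("k and h must be 1-D arrays of equal length")
    k_dev = k - k.mean()   # deviations from the full-series means
    h_dev = h - h.mean()
    n = len(k)
    r = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        a = k_dev[lag:]        # k(t) for t = lag .. n-1
        b = h_dev[:n - lag]    # h(t - lag) for the same t
        r[lag] = np.sum(a * b) / (np.sqrt(np.sum(a ** 2)) * np.sqrt(np.sum(b ** 2)))
    return r

# Synthetic (assumed) data: k(t) equals h(t - 5) before standardization.
rng = np.random.default_rng(0)
smooth = moving_average(rng.normal(size=211), window=7)   # 205 values
h = standardize(smooth[5:])     # mobility-like series, length 200
k = standardize(smooth[:200])   # IAI-like series delayed by 5 days

ccf = cross_correlation(k, h, max_lag=21)
best_lag = int(np.argmax(ccf))
print("lag with the highest correlation:", best_lag)
```

Here the means are taken over each full series, following the definition of \(\underline{h}\) and \(\underline{k}\) given with Eq. (5), while the sums run only over the overlapping part of the two series.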

Finally, since the IAI varies from day to day^{35} and clearly has change points where the value changes abruptly, we conducted change point detection analysis using the binary segmentation (BinSeg) method^{36}, allowing both the mean and the variance to change, to identify factors that dramatically change human mobility and COVID-19 infection. Change point detection identifies abrupt and significant changes in the behavior of a time series. In this calculation, \(l\) is the likelihood function of the normal distribution (not the lag used in Eq. (5)), \(X\) is one domain of the time series, \(\hat{\mu }\) and \(\hat{\sigma }\) are the estimated mean and variance of the domain \(X\), and the total binary segmented likelihood is given by Eq. (6):

$$ \begin{array}{*{20}c} {P\left( t \right) = l\left( {X_{1} {|}\hat{\mu }_{1} ,\hat{\sigma }_{1} } \right) + l\left( {X_{2} {|}\hat{\mu }_{2} ,\hat{\sigma }_{2} } \right)} \\ \end{array} $$

(6)

where \(l(X_{1} |\hat{\mu }_{1} ,\hat{\sigma }_{1} )\) is the likelihood of the days before day \(t\) and \(l(X_{2} |\hat{\mu }_{2} ,\hat{\sigma }_{2} )\) is the likelihood of the days after day \(t\). The change point is estimated as the date when \(P\) is at its maximum value. To detect multiple change points, we conducted binary segmentation recursively and used the modified BIC (MBIC) as the stopping criterion. The minimum time between two change points was set to 30 days (1 month).
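The search described by Eq. (6) can be sketched as follows. This is an illustrative recursive variant of binary segmentation with a normal mean-and-variance log-likelihood, not the implementation used in the study. In particular, the stopping rule below uses a simple BIC-style penalty of \(3\log n\) as an assumed stand-in and is not the MBIC. All function names and the synthetic series are hypothetical.

```python
import numpy as np

def normal_loglik(x):
    """Maximized normal log-likelihood of a segment (mean and variance estimated)."""
    m = len(x)
    var = max(float(np.var(x)), 1e-12)   # guard against zero variance
    return -0.5 * m * (np.log(2.0 * np.pi * var) + 1.0)

def best_split(x, min_len):
    """Return the split t that maximizes P(t) = l(X1) + l(X2), and that maximum."""
    best_t, best_p = None, -np.inf
    for t in range(min_len, len(x) - min_len + 1):
        p = normal_loglik(x[:t]) + normal_loglik(x[t:])
        if p > best_p:
            best_t, best_p = t, p
    return best_t, best_p

def binary_segmentation(x, min_len=30, penalty=None):
    """Recursively split the series until no split beats the penalty."""
    x = np.asarray(x, dtype=float)
    if penalty is None:
        penalty = 3.0 * np.log(len(x))   # assumed BIC-style penalty, NOT the MBIC
    change_points = []

    def recurse(start, end):
        seg = x[start:end]
        if len(seg) < 2 * min_len:       # both halves must keep min_len points
            return
        t, p = best_split(seg, min_len)
        gain = 2.0 * (p - normal_loglik(seg))
        if gain <= penalty:
            return
        change_points.append(start + t)
        recurse(start, start + t)
        recurse(start + t, end)

    recurse(0, len(x))
    return sorted(change_points)

# Synthetic (assumed) series: the mean shifts from 0 to 5 at index 100.
rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(5.0, 1.0, 100)])
change_points = binary_segmentation(series, min_len=30)
print("detected change points:", change_points)
```

The `min_len=30` argument plays the role of the 30-day minimum spacing between change points. A larger penalty yields fewer change points, so the choice of stopping criterion directly controls how many are reported.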

Network analysis has been used in medical research on topics such as gene coexpression, disease co-occurrence, and the topological dynamics of the spread of infectious disease^{37,38,39}. In the present study, the amount of human mobility in Japan between February 20 and April 26, 2020, was retrieved from the above data source for human mobility, and plots of human mobility between the prefectures were constructed. The network graphs were constructed based on the correlation of changes in the numbers of confirmed cases between two geographical areas (e.g., prefectures): if the correlation was > 0.5, the two areas were connected in the network. In addition, the difference in interprefectural mobility was calculated and plotted, as shown in Fig. 2. In the plot, several major prefectures (Hokkaido, Tokyo, Kanagawa, Aichi, Osaka and Okinawa) are highlighted in color. All analyses were conducted using R (version 4.1.0) and Python (version 3.9.10).
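The thresholding rule for the network can be sketched as below, assuming NumPy. The sketch interprets "changes in the numbers of confirmed cases" as day-to-day differences, which is an assumption, and the function name `correlation_network`, the area labels, and the toy counts are hypothetical.

```python
import numpy as np

def correlation_network(case_counts, names, threshold=0.5):
    """Connect two areas when the correlation of their case changes exceeds threshold.

    case_counts: array of shape (n_days, n_areas) with daily confirmed cases.
    Returns a list of (area_a, area_b, correlation) edges.
    """
    counts = np.asarray(case_counts, dtype=float)
    changes = np.diff(counts, axis=0)           # day-to-day changes (assumed meaning)
    corr = np.corrcoef(changes, rowvar=False)   # area-by-area correlation matrix
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if corr[i, j] > threshold:
                edges.append((names[i], names[j], float(corr[i, j])))
    return edges

# Toy (assumed) counts for three hypothetical areas over six days.
names = ["A", "B", "C"]
case_counts = np.array([
    [1, 2, 8],
    [3, 6, 4],
    [2, 4, 5],
    [5, 10, 2],
    [4, 8, 3],
    [8, 16, 1],
])
edges = correlation_network(case_counts, names)
print(edges)
```

In this toy example the counts of area B are exactly twice those of area A, so their changes are perfectly correlated and the two areas are linked, whereas area C moves in the opposite direction and remains unconnected.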

### Ethical statement

All procedures contributing to this work comply with the Declaration of Helsinki. The analyses were based on an anonymous administrative database; therefore, patient consent was not required. This study was approved by the ethics committee of Juntendo University, Japan (M20-0190).
