---
title: "Introduction to PowerSDI"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to PowerSDI}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: bibliography.bib
---


# Introduction

{PowerSDI} calculates the Standardized Precipitation (SPI; @mckee1993) and Standardized Precipitation-Evapotranspiration (SPEI; @VicenteSerrano2010) indices using gridded-data from the NASA-POWER project.
The package includes functions (`ScientSDI()`, `Reference()`, `Accuracy()` and `PlotData()`), which evaluate how well the drought indices meet their conceptual assumptions.
The `OperatSDI()` function allows users to download NASA-POWER data for specific monitoring periods.
The package adopts a quasi-weekly basic time scale (`quart.month`), allowing for four index calculations per month: days 1 to 7, days 8 to 14, days 15 to 21, and days 22 to 28, 29, 30, or 31 depending on the month.
For instance, if the time scale is set to 4 (`TS = 4`), it corresponds to a moving window with a 1-month length that is calculated four times each month.
If `TS = 48`, the time scale corresponds to a moving window with a 12-month length that is calculated four times each month.
{PowerSDI} uses functions from the {nasapower} [@Sparks2018; @Sparks2023] and {Lmom} [@Hosking2023] packages.

# Basic Instructions: Getting Started

Load the library in your R session.


```r
library("PowerSDI")
```

## Using PlotData() to visually inspect NASA-POWER data

The `PlotData()` function allows a visual inspection of the indices' inputs (precipitation and potential evapotranspiration) obtained from the NASA-POWER project.
These inputs (in millimetres) are show at the 1-quart.month time scale.

## Example 1: Inspecting indices' input for Campinas-SP, Brazil


```r
PlotData(
  lon = -47.3,
  lat = -22.65,
  start.date = "1991-01-01",
  end.date = "2022-12-31"
)
#> Just a sec. Downloading NASA POWER data and calculating the other parameters.
```

<div class="figure" style="text-align: center">
<img src="PlotData-1.png" alt="plot of chunk PlotData"  />
<p class="caption">plot of chunk PlotData</p>
</div>

## ScientSDI(): Replacing suspicious data and calculating the distributions' parameters

The first step of the SPI and SPEI algorithms is to calculate the cumulative probabilities of their input variables [@Guttman1999].
The `ScientSDI()` function estimates the parameters of the gamma, generalized extreme value (GEV), or generalized logistic distributions (GLO) through the L-moments method. This function also allows users to replace suspicious values from the data sample.

## Example 2: Applying the ScientSDI() function for Campinas-SP, Brazil

The plots generated by the `PlotData()` function for Campinas revealed one suspicious rain value larger than 250 mm.
Since this suspicious record represents less than 1% of the data sample, we replaced it with (`RainUplim = 250`) before calculating the parameters of the gamma and GEV distributions. The PE data showed no suspicious values.

```r
Campinas <- ScientSDI(
  lon = -47.3,
  lat = -22.65,
  start.date = "1991-01-01",
  end.date = "2022-12-31",
  distr = "GEV",
  TS = 4,
  Good = "no",
  RainUplim = 250,
  RainLowlim = NULL,
  PEUplim = NULL,
  PELowlim = NULL
)
#> Removed row(s) above limit: 420 for rain
#> The calculations started on: 1991-01-01
```

Check the SDI values.


```r
head(Campinas$SDI)
#>   Year Month quart.month   Rain  PE.Harg    PE.PM  PPE.Harg    PPE.PM        SPI
#> 1 1991     1           4 244.61 170.7373 146.4035  73.87270  98.20646 0.09109635
#> 2 1991     2           5 314.52 162.4491 135.7954 152.07087 178.72457 1.03838634
#> 3 1991     2           6 323.98 155.9019 131.5357 168.07813 192.44427 1.33262036
#> 4 1991     2           7 300.04 151.1668 128.4917 148.87323 171.54828 1.01676070
#> 5 1991     2           8 209.40 135.6084 115.1231  73.79158  94.27689 0.47371145
#> 6 1991     3           9 240.75 128.9954 109.1543 111.75463 131.59571 1.08669958
#>    SPEI.Harg    SPEI.PM Categ.SPI Categ.SPEI.Harg Categ.SPEI.PM
#> 1 -0.1936137 -0.1557081    Normal          Normal        Normal
#> 2  0.8203645  0.8938564   mod.wet          Normal        Normal
#> 3  1.3080700  1.4026983   mod.wet         mod.wet       mod.wet
#> 4  0.9191998  0.9590826   mod.wet          Normal        Normal
#> 5  0.4402641  0.5283875    Normal          Normal        Normal
#> 6  1.0332707  1.1195075   mod.wet         mod.wet       mod.wet
```

Check the DistPar


```r
head(Campinas$DistPar)
#>        lon    lat quart.month alfa.rain beta.rain probzero.rain loc.harg  sc.harg
#> [1,] -47.3 -22.65           1  7.408528  29.87982             0 13.73670 74.65810
#> [2,] -47.3 -22.65           2  8.764616  27.03333             0 34.70564 77.84755
#> [3,] -47.3 -22.65           3  9.446275  24.96878             0 36.18941 71.86870
#> [4,] -47.3 -22.65           4  9.797904  25.09108             0 60.80832 88.85291
#> [5,] -47.3 -22.65           5  7.515294  30.45793             0 50.58665 98.79377
#> [6,] -47.3 -22.65           6  8.519209  25.71193             0 40.44269 87.16465
#>         sh.harg   loc.pm     sc.pm      sh.pm TS
#> [1,] 0.04643763 35.10085  77.23558 0.05326288  4
#> [2,] 0.10306408 54.98155  82.17268 0.12652166  4
#> [3,] 0.05582726 54.32006  75.73398 0.05512823  4
#> [4,] 0.35711219 80.30166  96.80879 0.39667736  4
#> [5,] 0.51834901 69.37979 106.06754 0.58681673  4
#> [6,] 0.42728104 55.63610  91.23542 0.44648066  4
```

## Accuracy(): Verifying how well NASA-POWER data actually represent real-world/observed data

A basic assumption related to remote sensing data is that they really represent the "real-world" conditions.
Thus, the `Accuracy()` function calculates the following measures of accuracy: the absolute mean error (AME), root-mean-square-error (RMSE), original (dorig), modified (dmod), refined (dref) Willmott's indices of agreement, and Pearson determination coefficient (R2).

## Example 3: Applying the Accuracy function to compare observed (obs) and NASA-POWER PE data in Campinas-SP, Brazil

In the example below, `data("ObsEst")` contains pairs of observed and estimated potential evapotranspiration (PE) data.
The observed and estimated data were obtained from a weather station in Campinas and from the NASA-POWER project, respectively.
PE was estimated using the Hargreaves & Samani method [@Hargreaves1985].


```r
data("ObsEst")

Compare_PE_Cps <- Accuracy(obs_est = ObsEst, conf.int = "No")

Compare_PE_Cps
#>       AME     RMSE     dorig      dmod     dref     RQuad
#>  2.470223 3.144231 0.9718557 0.8386579 0.838266 0.9197235
```

## OperatSDI(): Generating routine operational NASA-SPI and NASA-SPEI estimates

Considering that the parameters of the gamma (SPI) and GEV (SPEI) distributions have already been estimated by the `ScientSDI()` function, we can now use the `OperatSDI()` function to calculate these two indices on a routine basis.

## Example 4: Applying the OperatSDI function to calculate the SPI and SPEI in Campinas-SP, Brazil

The OperatSDI function enables users to download NASA-POWER data only for the period they intend to monitor.


```r
parms <- Campinas$DistPar # Obtained in Example 2
SDI <- OperatSDI(
  lon = -47.3,
  lat = -22.65,
  start.date = "2023-08-01",
  end.date = "2023-08-07",
  parms = Campinas$DistPar,
  TS = 4
)
#> Calculating...
#> Considering the selected `TS`,4, the calculations started on: 2023-06-22

SDI
#>     Lon    Lat Year Month quart.month  Rain       PE       PPE         SPI       SPEI
#> 1 -47.3 -22.65 2023     7          27 15.67 83.24573 -67.57573 -0.17758163 -0.3648784
#> 2 -47.3 -22.65 2023     7          28 16.54 90.45608 -73.91608 -0.09934836 -0.2155000
#> 3 -47.3 -22.65 2023     8          29 16.50 98.82854 -82.32854 -0.23425126 -0.5376825
#>   Categ.SPI Categ.SPEI
#> 1    Normal     Normal
#> 2    Normal     Normal
#> 3    Normal     Normal
```

# Not So Basic Instructions:

## The ScientSDI() function: Removing suspicious data and assessing the indices' conceptual assumptions

Standardized indices, both SPI and SPEI are expected to provide spatially consistent interpretations of their values [@Guttman1999].
Therefore, their frequency distributions are expected to always approach the standard normal distribution regardless of the region, season, or time scale at which the indices are calculated [@Wu2006; @Stagge2015; @Blain2017].
In this context, the ScientSDI function calculates two normality-checking procedures described by @Wu2006 and @Stagge2015.
Additionally, the calculation algorithm of the SPI and SPEI relies on fitting a parametric distribution to their input data.
Therefore, the `ScientSDI()` function also applies the widely-used Lilliefors [@Lilliefors1967] and Anderson-Darling [@Anderson1954] goodness-of-fit tests to the gamma and GEV/GLO distributions.
The function `ScientSDI()` allows users to select significance levels ranging from 5 to 10% to run these tests.
Additionally, the SPEI algorithm often uses the generalized extreme value (GEV) or the generalized logistic (GLO) [@VicenteSerrano2010; Begueria2013; @Stagge2015; @Stagge2015a; @VicenteSerrano2015; @Blain2017].
A review of these studies suggests that the performance of these two distributions for calculating the SPEI tends to be similar to each other over most of the range of the index possible values (e.g. -2.0:2:0).
However, these studies also found significant differences between the two probability functions in the lower and upper tails [@VicenteSerrano2015].
In this context, the `ScientSDI()` function also allows the users to choose between these two models when calculating the SPEI.

## Example 5.1: Applying the ScientSDI() function with the GEV and verifying conceptual assumptions (Campinas-SP).


```r
Campinas.GEV <- ScientSDI(
  lon = -47.3,
  lat = -22.65,
  start.date = "1991-01-01",
  end.date = "2022-12-31",
  distr = "GEV",
  TS = 4,
  Good = "yes",
  sig.level = 0.95,
  RainUplim = 250,
  RainLowlim = NULL,
  PEUplim = NULL,
  PELowlim = NULL
)
#> Removed row(s) above limit: 420 for rain
#> The calculations started on: 1991-01-01
```

Check the goodness of fit.


```r
head(Campinas.GEV$GoodFit)
#>       Lili.Rain      Crit Lili.PPEHarg      Crit Lili.PPEPM      Crit   AD.Rain
#> [1,] 0.11611448 0.1396603   0.09643114 0.1222336 0.10632197 0.1236653 0.6390502
#> [2,] 0.11600471 0.1428267   0.09792190 0.1253470 0.08309344 0.1267864 0.9182668
#> [3,] 0.13690470 0.1413462   0.13726836 0.1223576 0.13620839 0.1265062 0.8535305
#> [4,] 0.12338364 0.1381886   0.12333484 0.1208748 0.11852563 0.1225573 0.7133716
#> [5,] 0.05656565 0.1435980   0.06200571 0.1226508 0.05552784 0.1203243 0.1413624
#> [6,] 0.07813975 0.1403786   0.08737362 0.1228580 0.09892327 0.1225442 0.6182675
#>           Crit AD.PPEHarg      Crit  AD.PPEPM      Crit
#> [1,] 0.7332467  0.3436237 0.5310983 0.3620798 0.5272992
#> [2,] 0.7060077  0.3130080 0.5854438 0.2911684 0.6386024
#> [3,] 0.7261569  0.5908702 0.5463382 0.6880857 0.5640758
#> [4,] 0.7292376  0.6114124 0.5115284 0.6220234 0.5193580
#> [5,] 0.7369261  0.1526510 0.5074809 0.1144797 0.5235433
#> [6,] 0.7463400  0.2623157 0.5169554 0.3154090 0.5154911
```

Check the normality.


```r
head(Campinas.GEV$Normality)
#>      SPI.Shap            SPI.Shap.p           SPI.AbsMed          
#> [1,] "0.967900296734503" "0.463209059642815"  "0.0997964641569041"
#> [2,] "0.96619340372341"  "0.420858893940406"  "0.0104781137427224"
#> [3,] "0.975582768775712" "0.682405906920205"  "0.129821941048217" 
#> [4,] "0.953414578207946" "0.180061525951205"  "0.168690442662682" 
#> [5,] "0.930008418348121" "0.0391836955173206" "0.291796230478462" 
#> [6,] "0.940736018465501" "0.0785562689884901" "0.198234408496805" 
#>      SPEI.Harg.Shap      SPEI.Harg.Shap.p    SPEI.Harg.AbsMed      
#> [1,] "0.968381434480606" "0.475655103439624" "0.146211508037505"   
#> [2,] "0.965983791350213" "0.415857065519545" "0.0430601773978751"  
#> [3,] "0.977751926965489" "0.747676355214095" "0.146518713888501"   
#> [4,] "0.978980071731148" "0.76943094784405"  "0.000945228380311781"
#> [5,] "0.966177284178725" "0.401102999039477" "0.116531689396188"   
#> [6,] "0.948926963831835" "0.134350899702926" "0.0873165947811334"  
#>      SPEI.PM.Shap        SPEI.PM.Shap.p      SPEI.PM.AbsMed       SPI.testI 
#> [1,] "0.968798285250651" "0.486612320910051" "0.165875549466575"  "Normal"  
#> [2,] "0.975597247176717" "0.682843863812409" "0.109628810903115"  "Normal"  
#> [3,] "0.984353877822483" "0.918729743222738" "0.122398459123283"  "Normal"  
#> [4,] "0.975156717722638" "0.651747489859029" "0.0378055330733986" "Normal"  
#> [5,] "0.967804350100931" "0.441114432321209" "0.106759663113951"  "NoNormal"
#> [6,] "0.944436532965484" "0.100097027435938" "0.116250636028372"  "NoNormal"
#>      SPEI.Harg.testI SPEI.PM.testI SPI.testII SPEI.Harg.testII SPEI.PM.testII
#> [1,] "Normal"        "Normal"      "Normal"   "Normal"         "Normal"      
#> [2,] "Normal"        "Normal"      "Normal"   "Normal"         "Normal"      
#> [3,] "Normal"        "Normal"      "Normal"   "Normal"         "Normal"      
#> [4,] "Normal"        "Normal"      "Normal"   "Normal"         "Normal"      
#> [5,] "Normal"        "Normal"      "NoNormal" "Normal"         "Normal"      
#> [6,] "Normal"        "Normal"      "Normal"   "Normal"         "Normal"
```

## Example 5.2: Applying the ScientSDI() function with the GLO and verifying conceptual assumptions (Campinas-SP).


```r
Campinas.GLO <- ScientSDI(
  lon = -47.3,
  lat = -22.65,
  start.date = "1991-01-01",
  end.date = "2022-12-31",
  distr = "GLO",
  TS = 4,
  Good = "yes",
  sig.level = 0.95,
  RainUplim = 250,
  RainLowlim = NULL,
  PEUplim = NULL,
  PELowlim = NULL
)
#> Removed row(s) above limit: 420 for rain
#> The calculations started on: 1991-01-01
```

Check the goodness of fit.


```r
head(Campinas.GLO$GoodFit)
#>       Lili.Rain      Crit Lili.PPEHarg      Crit Lili.PPEPM      Crit   AD.Rain
#> [1,] 0.11611448 0.1408241   0.07793223 0.1267743 0.08361417 0.1294131 0.6390502
#> [2,] 0.11600471 0.1438986   0.07754312 0.1248404 0.07863811 0.1290235 0.9182668
#> [3,] 0.13690470 0.1394402   0.11096348 0.1283290 0.10719062 0.1275217 0.8535305
#> [4,] 0.12338364 0.1390644   0.10560516 0.1236858 0.10132682 0.1254306 0.7133716
#> [5,] 0.05656565 0.1378510   0.05057530 0.1234080 0.04670884 0.1240447 0.1413624
#> [6,] 0.07813975 0.1377996   0.11117785 0.1267045 0.12181073 0.1251562 0.6182675
#>           Crit AD.PPEHarg      Crit   AD.PPEPM      Crit
#> [1,] 0.7021159  0.2188857 0.5549677 0.21990399 0.5609734
#> [2,] 0.7345871  0.2930398 0.5376209 0.27915703 0.5696080
#> [3,] 0.7112916  0.4200223 0.5585982 0.51359779 0.5502796
#> [4,] 0.7054471  0.5535407 0.5338313 0.57487863 0.5582377
#> [5,] 0.7249042  0.1123251 0.5474743 0.09516567 0.5540126
#> [6,] 0.7072738  0.4195567 0.5615291 0.50640990 0.5446173
```

Check the normality.


```r
head(Campinas.GLO$GoodFit)
#>       Lili.Rain      Crit Lili.PPEHarg      Crit Lili.PPEPM      Crit   AD.Rain
#> [1,] 0.11611448 0.1408241   0.07793223 0.1267743 0.08361417 0.1294131 0.6390502
#> [2,] 0.11600471 0.1438986   0.07754312 0.1248404 0.07863811 0.1290235 0.9182668
#> [3,] 0.13690470 0.1394402   0.11096348 0.1283290 0.10719062 0.1275217 0.8535305
#> [4,] 0.12338364 0.1390644   0.10560516 0.1236858 0.10132682 0.1254306 0.7133716
#> [5,] 0.05656565 0.1378510   0.05057530 0.1234080 0.04670884 0.1240447 0.1413624
#> [6,] 0.07813975 0.1377996   0.11117785 0.1267045 0.12181073 0.1251562 0.6182675
#>           Crit AD.PPEHarg      Crit   AD.PPEPM      Crit
#> [1,] 0.7021159  0.2188857 0.5549677 0.21990399 0.5609734
#> [2,] 0.7345871  0.2930398 0.5376209 0.27915703 0.5696080
#> [3,] 0.7112916  0.4200223 0.5585982 0.51359779 0.5502796
#> [4,] 0.7054471  0.5535407 0.5338313 0.57487863 0.5582377
#> [5,] 0.7249042  0.1123251 0.5474743 0.09516567 0.5540126
#> [6,] 0.7072738  0.4195567 0.5615291 0.50640990 0.5446173
```

## The Accuracy() function: Applying the Accuracy function and calculating confidence intervals.

The `Accuracy()` function may also provide confidence intervals (CI) for AME, RMSE, dorig, dmod and dref.
As emphasized by @Willmott1985, the CI specifies a range of values within which these statistics are expected to vary by chance.
Consequently, users can interpret the magnitude of the CI as an indicator of the reliability of the estimated values for the comparison metrics @Willmott1985.
The function allows users to select between 90 and 95% CIs.

## Example 6: Applying the Accuracy function and calculating confidence intervals (Campinas-SP).


```r
data("ObsEst")
Compare_PE_Cps_withCI <-
  Accuracy(obs_est = ObsEst,
           conf.int = "yes",
           sig.level = 0.95)
#> Just a sec
Compare_PE_Cps_withCI
#>  AMECIinf      AME AMECIsup RMSECIinf     RMSE RMSECIsup dorig_CIinf     dorig
#>  2.467782 2.470223 2.473673  3.140299 3.144231  3.148564   0.9717026 0.9718557
#>  dorigCIsup dmodCIinf      dmod dmodCIsup dref_CIinf     dref dref_CIsup RQuad_CIinf
#>   0.9718942 0.8382366 0.8386579 0.8387874  0.8377858 0.838266   0.838368   0.9194093
#>      RQuad RQuad_CIsup
#>  0.9197235    0.919914
```

# References