Correlation between rasters
In this blog article, I propose a way to look at the spatial repartition of correlation between two rasters. This is an update of a blog post I wrote in 2018 : Spatial correlation between rasters. The main difference is that I use the new {terra} package instead of the old {raster} package. The {terra} package is a modern package for spatial data manipulation and replaces the now-retired {raster} package. I also use {tmap} instead of {ggplot2} for map visualizations. Last but not least, the examples are reproducible and the code is available on my Blog tips github repository.
Libraries necessary for the analysis
# Libraries
library(terra)
library(dismo)
library(tmap)
library(mapview)
library(mapedit)
Comparison between biome datasets
For the example, we use raster layers of biome datasets included in package {dismo}. Although, these datasets share the same extent, origin and resolution, we will transform coarsely one to simulate a different extent, origin and resolution. This is a common situation when working with spatial data, and likely the one you will face. We will then crop, aggregate and resample the two rasters to have the same extent, origin and resolution. This will allow us to compare them.
We use bio1
= Mean annual temperature (°C * 10) and bio12
= Total annual precipitation (mm). We will transform bio12
to simulate a different extent, origin and resolution.
# get list of predictor variables in dismo
fnames <- list.files(
path = paste(system.file(package = "dismo"), "/ex", sep = ""),
pattern = "grd", full.names = TRUE
)
# Get Bio1 and Bio12
bio1 <- rast(fnames[1])
bio12 <- rast(fnames[2])
# Transform Bio12 to simulate a different extent, origin and resolution
bio12_trans <- aggregate(bio12, 4)
Visualizing Spatial Patterns
Clear, engaging visualizations are crucial in spatial analysis. The {tmap} package makes this easy while offering publication-ready quality. Let’s map our variables side by side.
# Temperature and precipiation maps
tm1 <- tm_shape(bio1) +
tm_raster(
title = "Temperature (°C)", palette = "-RdBu",
midpoint = mean(values(bio1), na.rm = TRUE)
) +
tm_layout(title = "Mean annual temperature")
tm2 <- tm_shape(bio12_trans) +
tm_raster(title = "Precipitation", palette = "Greens") +
tm_layout(title = "Total annual precipitation (mm)")
# Arrange maps
tmap_arrange(tm1, tm2)
These maps visually highlight the spatial variation in both variables and hint at potential correlations.
Resampling and Aggregating Rasters to allow comparison
We need to ensure both rasters have compatible resolutions. We use the smaller resolution of the two rasters to resample the other. We will also stack the two rasters for further analysis.
# Aggregate and resample chlorophyll raster
bio1_r <- resample(bio1, bio12_trans)
# Stack rasters for analysis
bio_1_12 <- c(bio1_r, bio12_trans)
names(bio_1_12) <- c("temperature", "precipitation")
# Temperature and precipiation maps
tm1 <- tm_shape(bio1_r) +
tm_raster(
title = "Temperature (°C)", palette = "-RdBu",
midpoint = mean(values(bio1_r), na.rm = TRUE)
) +
tm_layout(title = "Mean annual temperature")
tm2 <- tm_shape(bio12_trans) +
tm_raster(title = "Precipitation", palette = "Greens") +
tm_layout(title = "Total annual precipitation (mm)")
# Arrange maps
tmap_arrange(tm1, tm2)
Exploring Relationships Between Variables
One of our key goals is to investigate the relationship between temperature and precipitation. We can start by calculating the correlation between the two rasters.
The overall correlation is a usual operation using function cor()
between values of the two rasters. This returns a single value.
# Correlation between layers
cor(
values(bio_1_12)[, 1],
values(bio_1_12)[, 2],
use = "na.or.complete"
)
## [1] 0.6514446
Estimate a linear model
Another way to see if there is correlation between both layers is to define a linear model (here lm(precipitation ~ temperature)
). To have a look at the spatial heterogeneity, we can plot the distribution of residuals of the model. (Do not forget to retrieve missing residuals from NA
values.). This map shows where observations deviates from the model.
lm1 <- lm(values(bio_1_12)[, 2] ~ values(bio_1_12)[, 1])
summary(lm1)
##
## Call:
## lm(formula = values(bio_1_12)[, 2] ~ values(bio_1_12)[, 1])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1357.57 -394.08 -55.32 354.48 2457.50
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -327.1220 90.5790 -3.611 0.000335 ***
## values(bio_1_12)[, 1] 8.3878 0.4369 19.200 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 606.8 on 500 degrees of freedom
## (1754 observations effacées parce que manquantes)
## Multiple R-squared: 0.4244, Adjusted R-squared: 0.4232
## F-statistic: 368.6 on 1 and 500 DF, p-value: < 2.2e-16
# Create and plot residual raster
resid_r <- rast(bio_1_12, 1)
values(resid_r) <- NA
values(resid_r)[
!is.na(values(bio_1_12)[, 1]) &
!is.na(values(bio_1_12)[, 2])
] <- lm1$residuals
tm_shape(resid_r) +
tm_raster(title = "Residuals", palette = "-RdBu", midpoint = 0) +
tm_layout(title = "Model Residuals")
Calculate local correlation using focal()
on two rasters at the same time
The local correlation is a different way to see the relation between two covariates in space. The idea a simple. Around each cell of the rasters, we define a focal area (square, round, …). For the example, I chose a 5*5 square. We calculate the correlation between the 25 values of each raster in this square using function cor()
. This local correlation is recorded for the central cell. This operation is repeated over the entire raster thanks to focal()
function of library {raster}.
However, we need to use a trick to use the function on two rasters at the same time. We define a third raster (bio_1_12_nb
) that records the positions of values from 1
to ncell()
. focal()
will thus extract the positions for which we can get values of the two rasters (as they now have the same extent, origin and resolution).
Be careful if you manipulate big rasters, this may use some more RAM.
bio_1_12_nb <- rast(bio_1_12, 1)
values(bio_1_12_nb) <- 1:ncell(bio_1_12)
matrix_bio_1_12 <- values(bio_1_12) # stack as raster [MW]
focal_cor <- focal(
x = bio_1_12_nb,
w = matrix(1, 5, 5),
fun = function(x, y = matrix_bio_1_12) {
cor(y[x, 1], y[x, 2],
use = "na.or.complete"
)
},
filename = file.path(extraWD, "focal_cor.tif"),
overwrite = TRUE
)
The resulting map shows areas where correlation between precipitation and temperature is negative, null or positive. This is a different way to see relation between covariates.
focal_cor <- raster(file.path(extraWD, "focal_cor.tif"))
# Plot local correlation
tm_shape(focal_cor) +
tm_raster(title = "Local Correlation", palette = "-RdYlGn", midpoint = 0) +
tm_layout(title = "Local Correlation Map")
This blog post is a rewriting of a previous blog post I wrote in 2018 : Spatial correlation between rasters. I use ChatGPT with this command : I would like to use package 'terra' instead of 'raster'. And use 'tmap' instead of 'ggplot2'
and the resulting code did not need further editing. Hence, if you too, need to update your code using {raster} package, you can use ChatGPT to help you…
The R-script of this article is available on my Blog tips github repository
Citation:
For attribution, please cite this work as:
Rochette Sébastien. (2024, Dec. 06). "Spatial correlation between rasters using 'terra'". Retrieved from https://statnmap.com/2024-12-06-spatial-correlation-between-rasters-using-terra/.
BibTex citation:
@misc{Roche2024Spati,
author = {Rochette Sébastien},
title = {Spatial correlation between rasters using 'terra'},
url = {https://statnmap.com/2024-12-06-spatial-correlation-between-rasters-using-terra/},
year = {2024}
}