11  Trapezoidal Integration

The examples in Chapter 11 require that the search path contains the following namespaces,

library(groupedHyperframe)

The chained trapezoidal rule of numerically approximating a definite integral is an ancient idea since the dawn of human civilization.

Let \(\{x_k; k=0,\cdots,N\}\) be a partition of the real interval \([a,b]\) such that \(a = x_0 < x_1 < \cdots < x_{N-1} < x_N = b\) and \(\Delta x_k= x_k - x_{k-1}\) be the length of the \(k\)-th sub-interval, then the chained-trapezoidal approximation of the integration of the function \(y=f(x)\) is,

\[ \begin{split} \int_a^b f(x)\,dx & \approx \sum_{k=1}^{N}{\dfrac{y_{k-1}+y_k}{2}}\Delta x_{k} \\ & = \left(\frac{y_0}{2}+\sum _{k=1}^{N-1}y_k+\frac{y_N}{2}\right)\Delta x, \quad\iff\ \forall k, \Delta x_{k}\equiv \Delta x \end{split} \tag{11.1}\]

Consider a toy example x \(=(x_0,x_1,x_2,x_3,x_4,x_5)^t\) and y \(=\{y_k = f(x_k);k=0,1,\cdots,5\}\) (Listing 11.1),

Listing 11.1: Data: toy example x and y
x. = seq.int(from = 10L, to = 20L, by = 2L)
set.seed(23); x = rnorm(n = length(x.), mean = x., sd = .5)
set.seed(12); y = rnorm(n = length(x.), mean = 1, sd = .2)
list(x = x, y = y)
# $x
# [1] 10.09661 11.78266 14.45663 16.89669 18.49830 20.55375
# 
# $y
# [1] 0.7038865 1.3154339 0.8086511 0.8159990 0.6004716 0.9455408

Note that the author intentionally makes the toy example data x not equidistant, as neither the chained trapezoidal rule framework nor the functions pracma::trapz() and pracma::cumtrapz() (Borchers 2025, v2.4.6) require the equidistant assumption.

Function pracma::trapz() (Listing 11.2) calculates a trapezoidal integration (Equation 11.1, Figure 11.1).

Listing 11.2: Review: function pracma::trapz() (Borchers 2025)
pracma::trapz(x, y)
# [1] 9.247523

In R nomenclature, the term “cumulative” indicates an inclusive scan, that the ith element of the output is the operation result of the first-i-elements of the input. For example, the function base::sum() (Listing 11.3) is a summation, while the function base::cumsum() (Listing 11.4) is a cumulative summation.

Listing 11.3: Review: function base::sum()
sum(1:5)
# [1] 15
Listing 11.4: Review: function base::cumsum()
cumsum(1:5)
# [1]  1  3  6 10 15

Function pracma::cumtrapz() (Listing 11.5) calculates a cumulative trapezoidal integration. Note that the first element of the output is always 0, as a trapezoidal integration needs at least \(\left(x_0, y_0\right)\) and \(\left(x_1, y_1\right)\). The first 0-value is often removed for downstream analysis.

Listing 11.5: Review: function pracma::cumtrapz() (Borchers 2025)
pracma::cumtrapz(x, y)
#          [,1]
# [1,] 0.000000
# [2,] 1.702340
# [3,] 4.542215
# [4,] 6.524337
# [5,] 7.658653
# [6,] 9.247523

11.1 (Cumulative) Average Vertical Height

The average vertical height of a trapezoidal integration is the chained-trapezoid (Equation 11.1) divided by the length of \(x\)-domain \(b-a\),

\[ \begin{split} (b-a)^{-1}\displaystyle\int_a^b f(x)\,dx & \approx (b-a)^{-1} \sum_{k=1}^{N}{\dfrac{y_{k-1}+y_k}{2}}\Delta x_{k} \\ & = N^{-1}\left(\frac{y_0}{2}+\sum _{k=1}^{N-1}y_k+\frac{y_N}{2}\right), \quad\iff\ \forall k, \Delta x_{k}\equiv \Delta x \end{split} \tag{11.2}\]

Function vtrapz() (Listing 11.6) calculates the average vertical height of a trapezoidal integration (Equation 11.2). The (tentative) prefix v indicates “vertical”. The shaded rectangle in Figure 11.1 has the same area as the trapezoidal integration. Therefore, the author defines the vertical height of the shaded rectangle as the average vertical height of the trapezoidal integration.

Listing 11.6: Example: function vtrapz()
vtrapz(x, y)
# [1] 0.8843263

The S3 generic function cumvtrapz() calculates the cumulative average vertical height of a trapezoidal integration. Package groupedHyperframe (v0.3.2) implements the following S3 methods (Table 11.1),

Table 11.1: S3 methods groupedHyperframe::cumvtrapz.* (v0.3.2)
visible isS4
cumvtrapz.default TRUE FALSE
cumvtrapz.fv TRUE FALSE
cumvtrapz.fvlist TRUE FALSE
cumvtrapz.hyperframe TRUE FALSE

The default method cumvtrapz.default() takes a numeric-vector input of x. The output (Listing 11.7) has,

  • the 1st element of NaN, i.e., the result of 0 (trapzoidal integration) divided by 0 (length of \(x\)-domain). The first NaN-value is often removed for downstream analysis (default parameter value rm1 = TRUE);
  • the 2nd element of \((y_0+y_1)/2\);
  • the \((i+1)\)-th element of \(i^{-1}(y_0/2+\sum_{k=1}^{i-1}y_k+y_i/2)\), if \(\{x_k;k=0,\cdots,i\}\) are equidistant, i.e., \(\forall k\in\{1,\cdots,i\}\), \(\Delta x_{k}\equiv \Delta x\). Note that the R does not use zero-based numbering; i.e., R vector indices start at 1L instead of 0L;
  • the \((N+1)\)-th and last element of Equation 11.2 (Listing 11.6).
Listing 11.7: Example: function cumvtrapz.default()
cumvtrapz(x, y, rm1 = FALSE)
#           [,1]
# [1,]       NaN
# [2,] 1.0096602
# [3,] 1.0417859
# [4,] 0.9594490
# [5,] 0.9115603
# [6,] 0.8843263

11.2 Visualization

The S3 generic function visualize_vtrapz() visualizes the concept of (cumulative) average vertical height of a trapezoidal integration. Package groupedHyperframe (v0.3.2) implements the following S3 methods (Table 11.2),

Table 11.2: S3 methods groupedHyperframe::visualize_vtrapz.* (v0.3.2)
visible isS4
visualize_vtrapz.density TRUE FALSE
visualize_vtrapz.ecdf TRUE FALSE
visualize_vtrapz.function TRUE FALSE
visualize_vtrapz.fv TRUE FALSE
visualize_vtrapz.fvlist TRUE FALSE
visualize_vtrapz.loess TRUE FALSE
visualize_vtrapz.numeric TRUE FALSE

Figure 11.1 visualizes the toy example (Listing 11.1) of the non-equidistant \((x_0,x_1,\cdots,x_5)^t\) and \((y_0,y_1,\cdots,y_5)^t\).

Listing 11.8: Figure: Visualize a toy example
Code
visualize_vtrapz(x, y) + 
  ggplot2::labs(y = NULL) + 
  ggplot2::theme_minimal()
Figure 11.1: Visualize a toy example

Figure 11.2 visualizes the kernel stats::density() of a random normal sample of size 1e3L, using the elements $x and $y. Obviously, this practice is meaningless mathematically, as a density function has \(x\)-domain of \((-\infty, \infty)\), thus the average vertical height of the trapzoidal integration under a density curve should be zero. This practice is meaningful to an empirical density though, if we use the range of the actual observations as the \(x\)-domain.

Listing 11.9: Figure: Visualize stats::density()
Code
set.seed(12); rnorm(n = 1e3L) |> density() |> visualize_vtrapz() + 
  ggplot2::theme_minimal()
Figure 11.2: Visualize stats::density()

Figure 11.3 visualizes the empirical cumulative distribution function stats::ecdf() of a random normal sample of size 1e3L, using the objects x and y in the enclosing environment. Obviously, this practice is meaningless mathematically, as a distribution function has \(x\)-domain of \((-\infty, \infty)\), thus the average vertical height of the trapzoidal integration under a distribution curve should be zero. This practice is meaningful to an empirical distribution though, if we use the range of the actual observations as the \(x\)-domain.

Listing 11.10: Figure: Visualize stats::ecdf()
Code
set.seed(27); rnorm(n = 1e3L) |> ecdf() |> visualize_vtrapz() + 
  ggplot2::theme_minimal()
Figure 11.3: Visualize stats::ecdf()

Figure 11.4 visualizes the local polynomial regression fit stats::loess() with the numeric response $dist and the numeric predictor $speed in the data set cars from package datasets shipped with R version 4.5.2 (2025-10-31).

Listing 11.11: Figure: Visualize stats::loess()
Code
loess(dist ~ speed, data = datasets::cars) |> visualize_vtrapz() + 
  ggplot2::theme_minimal()
Figure 11.4: Visualize stats::loess()

Figure 11.5 visualizes the receiver operating characteristic (ROC) curve of the point-pattern (Baddeley et al. 2025) swedishpines (Section 10.18, Listing 11.12), from the S3 method spatstat.explore::roc.ppp() (Section 15.1). Note that the following two values are mathematically equivalent, although not numerically identical,

  • the return of the “pseudo” pipeline <ppp.object> |> roc() |> vtrapz() (Listing 11.13), i.e., the average vertical height of the trapezoidal integration under the ROC-curve, and
  • the return of the pipeline <ppp.object> |> auc() (Listing 11.14), i.e., the area-under-ROC-curve (AU-ROC) via the S3 method spatstat.explore::auc.ppp().

because the \(x\)-domain is \([0,1]\) for an ROC curve.

Listing 11.12: Review: ROC of swedishpines
swedishpines_roc = spatstat.data::swedishpines |>
  spatstat.explore::roc.ppp(covariate = 'x')
Listing 11.13: Review: vtrapz of ROC
Code
swedishpines_roc |>
  with.default(expr = {
    vtrapz(x = p, y = raw)
  })
# [1] 0.5407941
Listing 11.14: Review: auc of point-pattern
Code
spatstat.data::swedishpines |> 
  spatstat.explore::auc.ppp(covariate = 'x')
# [1] 0.5407314
Listing 11.15: Figure: Visualize spatstat.explore::roc()
Code
swedishpines_roc |> 
  visualize_vtrapz() + 
  ggplot2::theme_minimal()
Figure 11.5: Visualize spatstat.explore::roc()

Figure 11.6 visualizes the conditional mean \(E(r)\) and variance \(V(r)\) between the points and the marks (Schlather, Ribeiro, and Diggle 2003), from function spatstat.explore::Emark() and spatstat.explore::Vmark() (Section 15.1) with an \(r\)-vector of default length-513L (v3.6.0.1). The return of the pipeline <ppp.object> |> Emark() |> cumvtrapz(),

  • indicates the cumulative average vertical height of the trapzoidal integration of the conditional mean \(E(r)\);
  • is denoted by <numeric-mark>.E.cumvtrapz in the output of the S3 method cumvtrapz.hyperframe() (Section 3.3.2, Section 20.11);
  • was named “standardized area under the conditional mean curve (AUcMean)” (Chervoneva et al. 2021).

The return of the pipeline <ppp.object> |> Vmark() |> cumvtrapz(),

  • indicates the cumulative average vertical height of the trapzoidal integration of the conditional variance \(V(r)\);
  • is denoted by <numeric-mark>.V.cumvtrapz in the output of the S3 method cumvtrapz.hyperframe() (Section 3.3.2, Section 20.11);
  • was named “standardized area under the conditional variance curve (AUcVar)” (Chervoneva et al. 2021).
Listing 11.16: Figure: Visualize Emark() and Vmark()
Code
fig_vtrapz_E = spatstat.data::spruces |>
  spatstat.explore::Emark() |>
  visualize_vtrapz(draw.rect = FALSE)
fig_vtrapz_V = spatstat.data::spruces |>
  spatstat.explore::Vmark() |>
  visualize_vtrapz(draw.rect = FALSE)
(fig_vtrapz_E + fig_vtrapz_V +
  patchwork::plot_layout(ncol = 2L)) & 
  ggplot2::theme_minimal()
Figure 11.6: Visualize Emark() and Vmark()