18  groupedData

Function nlme::groupedData() (Pinheiro, Bates, and R Core Team 2025, v3.1.168) creates a grouped data frame, i.e., an R object of S3 class 'groupedData'.

Listing 18.1 summarizes the S3 methods for the class 'groupedData' in package nlme,

Listing 18.1: Existing (but not exported) S3 methods nlme::*.groupedData
suppressPackageStartupMessages(library(nlme))
.S3methods(class = 'groupedData', all.names = TRUE) |> 
  attr(which = 'info', exact = TRUE) |>
  subset.data.frame(subset = (from != 'groupedHyperframe'))
#                           visible                from       generic  isS4
# [.groupedData               FALSE registered S3method             [ FALSE
# as.data.frame.groupedData   FALSE registered S3method as.data.frame FALSE
# asTable.groupedData         FALSE registered S3method       asTable FALSE
# collapse.groupedData        FALSE registered S3method      collapse FALSE
# formula.groupedData         FALSE registered S3method       formula FALSE
# isBalanced.groupedData      FALSE registered S3method    isBalanced FALSE
# lme.groupedData             FALSE registered S3method           lme FALSE
# lmList.groupedData          FALSE registered S3method        lmList FALSE
# print.groupedData           FALSE registered S3method         print FALSE
# update.groupedData          FALSE registered S3method        update FALSE

The examples in Chapter 18 require the following package to be attached to the search path,

library(groupedHyperframe)

Package groupedHyperframe (v0.3.2) implements the following S3 method for the class 'groupedData' (Table 18.1),

Table 18.1: S3 methods groupedHyperframe::*.groupedData (v0.3.2)
method                            visible  from               generic                                  isS4
as.groupedHyperframe.groupedData  TRUE     groupedHyperframe  groupedHyperframe::as.groupedHyperframe  FALSE

18.1 Create groupedHyperframe

The S3 generic function as.groupedHyperframe() has been introduced in Section 14.1 (Table 14.2). The S3 method as.groupedHyperframe.groupedData() converts a grouped data frame into a grouped hyper data frame (groupedHyperframe, Chapter 19) using its grouping structure.
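
The grouping structure that the conversion relies on is stored in the display formula of the grouped data frame, and can be inspected with accessors from package nlme alone. The following is a minimal sketch, assuming only that package nlme is installed; the object names `f` and `g` are illustrative and not part of either package.

```r
# Display formula of the grouped data frame, via the S3 method formula.groupedData()
f = nlme::Remifentanil |>
  formula()

# Grouping part of the display formula, as a one-sided formula
g = nlme::Remifentanil |>
  nlme::getGroupsFormula()

f
g
```

The one-sided formula `g` is the same grouping structure that a grouped hyper data frame reports in its header line.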

Listing 18.2 converts the grouped data frame Remifentanil from package nlme (Pinheiro, Bates, and R Core Team 2025, v3.1.168) into a grouped hyper data frame.

Data: Remifentanil
nlme::Remifentanil |>
  head(n = 3L)
# Grouped Data: conc ~ Time | Subject
#   ID Subject Time  conc  Rate      Amt   Age  Sex  Ht Wt    BSA     LBM
# 1  1       1  0.0    NA 71.99 107.9850 30.58 Male 171 72 1.8393 56.5075
# 2  1       1  1.5  9.51 71.99  35.9950 30.58 Male 171 72 1.8393 56.5075
# 3  1       1  2.0 11.50 71.99  37.4348 30.58 Male 171 72 1.8393 56.5075
Listing 18.2: Example: function as.groupedHyperframe.groupedData() on Remifentanil
Remifentanil_g = nlme::Remifentanil |> 
  as.groupedHyperframe()
Example: a grouped hyper data frame Remifentanil_g
Remifentanil_g
# Grouped Hyperframe: ~Subject
# <environment: 0x31998b7d0>
# 
# 65 Subject
# 
# Preview of first 10 (or less) rows:
# 
#         Time      conc      Rate       Amt ID Subject Age    Sex  Ht   Wt    BSA     LBM
# 1  (numeric) (numeric) (numeric) (numeric) 30      30  21 Female 165 55.9 1.6095 42.8260
# 2  (numeric) (numeric) (numeric) (numeric) 21      21  24 Female 161 58.6 1.6131 43.0953
# 3  (numeric) (numeric) (numeric) (numeric) 25      25  32 Female 157 45.9 1.4278 36.4631
# 4  (numeric) (numeric) (numeric) (numeric) 23      23  23 Female 163 50.0 1.5215 39.5740
# 5  (numeric) (numeric) (numeric) (numeric) 29      29  25 Female 163 54.5 1.5782 41.7695
# 6  (numeric) (numeric) (numeric) (numeric) 28      28  30 Female 178 79.1 1.9708 55.4106
# 7  (numeric) (numeric) (numeric) (numeric) 32      32  54 Female 167 45.0 1.4806 37.4038
# 8  (numeric) (numeric) (numeric) (numeric) 64      64  55 Female 168 74.8 1.8455 50.6969
# 9  (numeric) (numeric) (numeric) (numeric) 22      22  33 Female 163 70.5 1.7607 47.7487
# 10 (numeric) (numeric) (numeric) (numeric) 45      45  72 Female 165 54.4 1.5910 42.1204
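
To see why some columns appear as hypercolumns, e.g., `(numeric)`, while others keep a single value per Subject, one can check which columns vary within a group. The sketch below uses base R only; it is an illustration of the idea and not necessarily how as.groupedHyperframe.groupedData() decides internally. The names `d` and `varies` are hypothetical.

```r
# Plain data frame copy of the grouped data frame
d = nlme::Remifentanil |>
  as.data.frame()

# TRUE if a column takes more than one distinct value within at least one Subject
varies = vapply(d, FUN = function(x) {
  tapply(x, INDEX = d$Subject, FUN = function(v) length(unique(v)) > 1L) |>
    any(na.rm = TRUE)
}, FUN.VALUE = NA)

names(d)[varies]   # vary within Subject: candidates for hypercolumns
names(d)[!varies]  # constant within Subject: one value per row suffices
```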

Listing 18.3 converts the grouped data frame bdf from package nlme (Pinheiro, Bates, and R Core Team 2025, v3.1.168) into a grouped hyper data frame.

Data: bdf
nlme::bdf |>
  head(n = 3L)
# Grouped Data: langPOST ~ IQ.verb | schoolNR
# <environment: 0x3166498b0>
#   schoolNR pupilNR IQ.verb  IQ.perf sex Minority repeatgr aritPRET classNR aritPOST langPRET langPOST ses denomina schoolSES satiprin natitest meetings currmeet mixedgra percmino aritdiff homework
# 1        1   17001    15.0 12.33333   0        N        0       14     180       24       36       46  23        1        11  3.42857        0      1.7  1.83333        0       60       12  2.33333
# 2        1   17002    14.5 10.00000   0        Y        0       12     180       19       36       45  10        1        11  3.42857        0      1.7  1.83333        0       60       12  2.33333
# 3        1   17003     9.5 11.00000   0        N        0       10     180       24       33       33  15        1        11  3.42857        0      1.7  1.83333        0       60       12  2.33333
#   classsiz groupsiz IQ.ver.cen avg.IQ.ver.cen grpSiz.cen
# 1       29       29   3.165938      -1.514062   5.899432
# 2       29       29   2.665938      -1.514062   5.899432
# 3       29       29  -2.334062      -1.514062   5.899432
Listing 18.3: Example: function as.groupedHyperframe.groupedData() on bdf
bdf_g = nlme::bdf |> 
  as.groupedHyperframe()
Example: a grouped hyper data frame bdf_g
bdf_g
# Grouped Hyperframe: ~schoolNR
# <environment: 0x316bc3560>
# 
# 131 schoolNR
# 
# Preview of first 10 (or less) rows:
# 
#     pupilNR   IQ.verb   IQ.perf      sex Minority  repeatgr  aritPRET   classNR  aritPOST  langPRET  langPOST       ses mixedgra  percmino  homework  classsiz  groupsiz IQ.ver.cen grpSiz.cen schoolNR
# 1  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)       47
# 2  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)      103
# 3  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)        2
# 4  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)      123
# 5  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)       10
# 6  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)      258
# 7  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)       27
# 8  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)       12
# 9  (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)      109
# 10 (factor) (numeric) (numeric) (factor) (factor) (ordered) (numeric) (numeric) (numeric) (numeric) (numeric) (numeric) (factor) (numeric) (numeric) (numeric) (numeric)  (numeric)  (numeric)      192
#    denomina schoolSES satiprin natitest meetings currmeet aritdiff avg.IQ.ver.cen
# 1         3        11  2.85714        0  2.11111  1.83333       13     -3.5215621
# 2         3        12  3.00000        1  2.10000  2.16667       17     -5.0840621
# 3         1        11  3.00000        0  1.60000  1.66667       27     -2.8340621
# 4         2        20  3.14286        1  2.10000  2.00000       27     -2.0840621
# 5         1        15  3.00000        0  2.60000  2.66667       27     -1.3340621
# 6         3        14  2.71429        0  2.00000  2.00000       13     -1.1912049
# 7         1        17  3.28571        0  1.10000  1.00000        9     -0.4054907
# 8         1        20  3.14286        0  2.70000  2.66667       17     -2.4007288
# 9         1        15  3.28571        0  1.60000  1.33333       17     -1.0090621
# 10        3        19  3.57143        1  3.20000  3.16667       16     -1.0840621

Converting a (grouped) data frame with a substantial amount of duplicated information into a grouped hyper data frame does not necessarily reduce the memory allocation, because the hyperframe object (Chapter 20) carries additional auxiliary information. Even when the conversion does reduce the memory allocation, the saved file is not much smaller, and may even be larger, than that of the data frame if xz compression is used for both.

Advanced: Reducing memory allocation
unclass(object.size(Remifentanil_g) / object.size(nlme::Remifentanil))
# [1] 0.3094831
f = replicate(n = 2L, expr = tempfile(fileext = '.rds'))
Remifentanil_g |> saveRDS(file = f[1L], compress = 'xz')
nlme::Remifentanil |> saveRDS(file = f[2L], compress = 'xz')
file.size(f[1L]) / file.size(f[2L])
# [1] 1.206928
Advanced: Not reducing memory allocation
unclass(object.size(bdf_g) / object.size(nlme::bdf))
# [1] 25.78725
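
As a generic illustration of how splitting data into many small pieces can increase memory allocation, the base-R sketch below compares one long numeric vector against the same values stored as a list of many short vectors. This is an assumption-laden analogy only: it shows per-object overhead in R and is not a description of how a hyperframe stores its columns. The names `x`, `g`, `xs` and `ratio` are hypothetical.

```r
x = rnorm(n = 1e4L)                  # one long numeric vector
g = rep(seq_len(1e3L), each = 10L)   # 1000 groups of 10 values each
xs = split(x, f = g)                 # same values, as a list of 1000 short vectors

# Each short vector carries its own header, and the list carries names,
# so the split representation is larger than the single vector
ratio = unclass(object.size(xs) / object.size(x))
ratio
```

A ratio above 1 here means that the split representation needs more memory than the original vector, even though no value has been added.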