Overview
The var1 family models time-series data using a lag-1 vector autoregressive (VAR) model. This framework decomposes multivariate time-series into two network structures:
- Temporal network: directed relationships encoding how variables at one time point predict variables at the next time point (the beta matrix).
- Contemporaneous correlations/network: undirected relationships among the residuals (innovations) at the same time point (the sigma_zeta or omega_zeta matrix). This is a network only when the GGM parameterization is used.
The model is fitted to the Toeplitz (block) covariance matrix of the augmented data, where both the lagged ($t-1$) and current ($t$) measurements are included. Importantly, the block of covariances of the lagged variables is modeled completely separately (using a Cholesky decomposition) and does not enter the temporal or contemporaneous model estimation.
The gvar() wrapper function sets contemporaneous = "ggm", resulting in a graphical VAR model where the contemporaneous structure is represented as a Gaussian Graphical Model (network of partial correlations among innovations).
The VAR(1) Model
The VAR(1) model specifies that each variable at time $t$ is a linear function of all variables at time $t-1$, plus an innovation term:
$$\boldsymbol{y}_t = \boldsymbol{\mu} + \boldsymbol{B}(\boldsymbol{y}_{t-1} - \boldsymbol{\mu}) + \boldsymbol{\zeta}_t$$
Where:
- $\boldsymbol{\mu}$ (mu): the $p \times 1$ vector of stationary means.
- $\boldsymbol{B}$ (beta): the $p \times p$ matrix of lag-1 regression coefficients. Element $b_{ij}$ represents the effect of variable $j$ at time $t-1$ on variable $i$ at time $t$. This is the temporal network.
- $\boldsymbol{\zeta}_t$: the innovation (residual) vector at time $t$, with covariance matrix $\boldsymbol{\Sigma}_\zeta$. This captures the contemporaneous relationships among variables after removing temporal effects.
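The equation above can be made concrete with a small base R simulation. This is a minimal sketch, not psychonetrics code: the values of mu, B and Sigma_zeta are assumptions chosen for illustration, and the multivariate regression at the end is only a quick check that the rows and columns of B follow the convention described above.

```r
set.seed(1)

# Assumed illustration values (not taken from any dataset):
mu <- c(2, 3)                                   # stationary means
B  <- matrix(c(0.3, 0.2,
               0.0, 0.4), 2, 2, byrow = TRUE)   # B[i, j]: effect of j at t-1 on i at t
Sigma_zeta <- diag(2)                           # innovation covariance (identity here)

n_time <- 5000
y <- matrix(NA_real_, n_time, 2)
y[1, ] <- mu
for (tt in 2:n_time) {
  zeta    <- rnorm(2)                           # innovations with identity covariance
  y[tt, ] <- mu + B %*% (y[tt - 1, ] - mu) + zeta
}

# Regress y_t on y_{t-1}; coef() has predictors in rows and outcomes in columns,
# so it is transposed to match the B[i, j] convention:
fit   <- lm(y[-1, ] ~ y[-n_time, ])
B_hat <- t(coef(fit)[-1, ])
round(B_hat, 2)
```

With a long series the estimates in B_hat should be close to B, and the column means of y close to mu.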
Under stationarity, define $\boldsymbol{\Sigma}_0 = \text{var}(\boldsymbol{y}_t)$ as the $p \times p$ stationary (lag-0) covariance matrix and $\boldsymbol{\Sigma}_1 = \text{cov}(\boldsymbol{y}_t,\, \boldsymbol{y}_{t-1})$ as the $p \times p$ lag-1 cross-covariance matrix. The model then implies:
- Stationary covariance: $\text{vec}(\boldsymbol{\Sigma}_0) = (\boldsymbol{I} - \boldsymbol{B} \otimes \boldsymbol{B})^{-1}\text{vec}(\boldsymbol{\Sigma}_\zeta)$
- Lag-1 covariance: $\boldsymbol{\Sigma}_1 = \boldsymbol{B}\boldsymbol{\Sigma}_0$
The model is fitted by maximum likelihood estimation on the combined Toeplitz block covariance matrix formed from $\boldsymbol{\Sigma}_0$ and $\boldsymbol{\Sigma}_1$.
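As a worked illustration of these implied covariances, the base R sketch below computes $\boldsymbol{\Sigma}_0$ and $\boldsymbol{\Sigma}_1$ from assumed values of B and Sigma_zeta and stacks them into a block matrix for the augmented vector $(\boldsymbol{y}_{t-1}, \boldsymbol{y}_t)$. The block ordering and all numeric values are assumptions made for this example. Note that psychonetrics models the lagged block separately, so this shows the stationary structure rather than the package internals.

```r
# Assumed illustration values:
B <- matrix(c(0.3, 0.2,
              0.0, 0.4), 2, 2, byrow = TRUE)
Sigma_zeta <- matrix(c(1.0, 0.3,
                       0.3, 1.0), 2, 2)
p <- nrow(B)

# Stationarity requires all eigenvalues of B to lie inside the unit circle:
stopifnot(all(Mod(eigen(B)$values) < 1))

# Stationary covariance from the vec formula:
Sigma_0 <- matrix(
  solve(diag(p^2) - kronecker(B, B), as.vector(Sigma_zeta)),
  p, p
)

# Lag-1 cross-covariance cov(y_t, y_{t-1}):
Sigma_1 <- B %*% Sigma_0

# Sigma_0 should satisfy Sigma_0 = B Sigma_0 B' + Sigma_zeta:
all.equal(Sigma_0, B %*% Sigma_0 %*% t(B) + Sigma_zeta)

# Block Toeplitz covariance of the augmented vector (y_{t-1}, y_t):
Sigma_aug <- rbind(cbind(Sigma_0, t(Sigma_1)),
                   cbind(Sigma_1, Sigma_0))
round(Sigma_aug, 3)
```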
Contemporaneous Parameterizations
The contemporaneous covariance matrix $\boldsymbol{\Sigma}_\zeta$ (the covariance among innovations) can be parameterized in four ways using the contemporaneous argument:
| Type | Key Matrix | Equation |
|---|---|---|
| "cov" (default) | sigma_zeta | $\boldsymbol{\Sigma}_\zeta$ (directly estimated) |
| "chol" | lowertri_zeta | $\boldsymbol{\Sigma}_\zeta = \boldsymbol{L}_\zeta\boldsymbol{L}_\zeta'$ |
| "prec" | kappa_zeta | $\boldsymbol{\Sigma}_\zeta = \boldsymbol{K}_\zeta^{-1}$ |
| "ggm" | omega_zeta, delta_zeta | $\boldsymbol{\Sigma}_\zeta = \boldsymbol{\Delta}_\zeta(\boldsymbol{I} - \boldsymbol{\Omega}_\zeta)^{-1}\boldsymbol{\Delta}_\zeta$ |
The gvar() wrapper is shorthand for var1(..., contemporaneous = "ggm"). This is the recommended parameterization for network analysis, as it represents the contemporaneous structure as a network of partial correlations among innovations.
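The four parameterizations describe the same covariance matrix in different ways. The base R sketch below starts from an assumed innovation covariance matrix, derives the matrices corresponding to each parameterization, and checks that each one reproduces the original. The object names mirror the psychonetrics matrix names only for readability; nothing here calls the package.

```r
# Assumed innovation covariance matrix (illustration only):
Sigma_zeta <- matrix(c(1.0, 0.4, 0.2,
                       0.4, 1.5, 0.3,
                       0.2, 0.3, 2.0), 3, 3)

# "chol": lower-triangular factor with Sigma = L L'
L_zeta <- t(chol(Sigma_zeta))

# "prec": precision matrix with Sigma = K^{-1}
K_zeta <- solve(Sigma_zeta)

# "ggm": partial correlations (Omega) and a diagonal scaling matrix (Delta)
Delta_zeta <- diag(1 / sqrt(diag(K_zeta)))
Omega_zeta <- -cov2cor(K_zeta)
diag(Omega_zeta) <- 0

# Each parameterization reproduces the same covariance matrix:
all.equal(L_zeta %*% t(L_zeta), Sigma_zeta)
all.equal(solve(K_zeta), Sigma_zeta)
all.equal(Delta_zeta %*% solve(diag(3) - Omega_zeta) %*% Delta_zeta, Sigma_zeta)
```

Because the parameterizations are equivalent when saturated, the choice matters mainly for which parameters can be constrained to zero during model search.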
Data Preparation
Time-series data typically comes from experience sampling methodology (ESM) or ecological momentary assessment (EMA) studies, where participants are measured repeatedly over time. The var1() function handles the data augmentation (creating lagged variables) internally.
Key Arguments for Time-Series Data
- vars: character vector of variable names to include in the model.
- dayvar: string indicating the column name for the assessment day. When specified, the first measurement of a day is not regressed on the last measurement of the previous day. Only add this if the data has multiple observations per day.
- beepvar: string indicating the column name for the assessment beep number within a day. Non-consecutive beeps are treated as missing.
- idvar: string indicating the column name for subject ID. When multiple subjects are present, their data is concatenated (but each subject's time-series is treated separately for lagging). Note: for proper multi-level modeling with between-subject effects, use panelvar() or panelgvar() instead for datasets with limited repeated measures, and mlVAR otherwise.
Example Data Structure
# Typical ESM data format:
# id day beep relaxed sad nervous concentration
# 1 1 1 3 2 1 4
# 1 1 2 4 1 2 3
# 1 1 3 2 3 1 4
# 1 2 1 3 2 2 3 # New day: not lagged from previous row
# ...
# Fit model with day structure:
model <- gvar(data, vars = c("relaxed", "sad", "nervous", "concentration"),
              dayvar = "day", beepvar = "beep")
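To illustrate what the day and beep rules mean for the lagged data, here is a minimal base R sketch on a hypothetical toy data frame. It mimics the rules as described above (no lagging across days, no lagging across skipped beeps). It is not the psychonetrics implementation, and the helper lag1() is introduced only for this example.

```r
# Hypothetical toy data in the ESM format shown above:
esm <- data.frame(
  day     = c(1, 1, 1, 2, 2, 2),
  beep    = c(1, 2, 3, 1, 3, 4),   # beep 2 is missing on day 2
  relaxed = c(3, 4, 2, 3, 5, 4)
)

# Illustrative helper: shift a vector down by one row
lag1 <- function(x) c(NA, head(x, -1))

# A lagged value is kept only if the previous row is from the same day
# and its beep number is exactly one lower:
same_day  <- esm$day == lag1(esm$day)
next_beep <- esm$beep == lag1(esm$beep) + 1
valid     <- !is.na(same_day) & same_day & next_beep

esm$relaxed_lag1 <- ifelse(valid, lag1(esm$relaxed), NA)
esm
```

In this toy example only three of the six rows end up with a usable lagged value: the first row of each day and the row following the skipped beep have none.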
Example: Graphical VAR on Simulated Data
This example demonstrates the basic workflow using simulated data from the graphicalVAR package. We simulate a 2-variable VAR(1) process with known parameters and recover them using gvar().
library("psychonetrics")
library("dplyr")
library("graphicalVAR")
# Simulate a 2-variable VAR(1) process:
beta <- matrix(c(
  0, 0.5,
  0.5, 0
), 2, 2, byrow = TRUE)
kappa <- diag(2)
simData <- graphicalVARsim(50, beta, kappa)
# Fit graphical VAR model:
model <- gvar(simData)
model <- model %>% runmodel
# Parameter estimates:
model %>% parameters
# Confidence interval plot for temporal effects:
CIplot(model, "beta")
# Extract the temporal network:
getmatrix(model, "beta")
# Extract the contemporaneous correlations/network:
getmatrix(model, "omega_zeta")
Example: Clinical Time-Series (ESM Data)
This example uses ESM data from a single clinical patient (Epskamp, van Borkulo, van der Veen, Servaas, Isvoranu, Riese, & Cramer, 2018), available at https://osf.io/c8wjz/. The dataset contains 7 variables measured multiple times per day.
# Download the data from https://osf.io/c8wjz/
tsdata <- read.csv("Supplementary2_data.csv")
# Encode time variable:
tsdata$time <- as.POSIXct(tsdata$time, tz = "Europe/Amsterdam")
# Extract day variable:
tsdata$Day <- as.Date(tsdata$time, tz = "Europe/Amsterdam")
# Variables to use:
vars <- c("relaxed", "sad", "nervous", "concentration",
          "tired", "rumination", "bodily.discomfort")
# Fit graphical VAR with FIML for missing data:
model <- gvar(tsdata, vars = vars, dayvar = "Day",
              estimator = "FIML")
model <- model %>% runmodel
# Model search:
model_pruned <- model %>% prune(alpha = 0.01)
model_final <- model_pruned %>% stepup(criterion = "bic")
# Compare models:
compare(saturated = model, pruned = model_pruned, final = model_final)
# Extract networks:
temporal <- getmatrix(model_final, "PDC")
contemporaneous <- getmatrix(model_final, "omega_zeta")
# Visualize both networks side by side:
library("qgraph")
labs <- gsub("\\.", "\n", vars)
L <- averageLayout(temporal, contemporaneous)
layout(t(1:2))
qgraph(temporal, layout = L, theme = "colorblind", cut = 0,
       directed = TRUE, diag = TRUE,
       title = "Temporal", vsize = 12, mar = rep(6, 4),
       asize = 5, labels = labs)
qgraph(contemporaneous, layout = L, theme = "colorblind", cut = 0,
       title = "Contemporaneous", vsize = 12, mar = rep(6, 4),
       asize = 5, labels = labs)
The dayvar = "Day" argument ensures the first measurement of each day is not regressed on the last measurement of the previous day, properly handling overnight gaps in ESM data.
Do not use dayvar when there is only one measurement per day. Because the first measurement of each day is excluded from the lagged regression, using dayvar with daily data will result in all observations being removed.
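The effect of applying the day rule to daily data can be seen by counting how many consecutive rows fall on the same day. The helper count_lagged_pairs() below is hypothetical and exists only for this illustration; it is not part of psychonetrics.

```r
# Illustrative helper: number of consecutive row pairs that share a day
count_lagged_pairs <- function(day) sum(day[-1] == day[-length(day)])

count_lagged_pairs(1:10)                  # one measurement per day: 0 usable pairs
count_lagged_pairs(rep(1:10, each = 3))   # three beeps per day: 20 usable pairs
```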
Derived Network Matrices (PDC & PCC)
In addition to the raw model parameters (beta, omega_zeta), psychonetrics can compute standardized network matrices that are easier to interpret and compare:
PDC: Partial Directed Correlations
The Partial Directed Correlations (PDC) are the standardized version of the temporal regression coefficients. Specifically, the PDC matrix is the transpose of the beta matrix after standardization. This transposition is needed because in the beta matrix, element $b_{ij}$ represents the effect of variable $j$ on variable $i$, whereas qgraph() expects element $(i,j)$ to represent an edge from node $i$ to node $j$. Extract with:
temporal_network <- getmatrix(model, "PDC")
Do not use getmatrix(model, "beta") directly in qgraph(). Because of the row/column convention difference described above, the beta matrix will display edges in the wrong direction. Always use getmatrix(model, "PDC") for visualization, which transposes and standardizes the temporal effects correctly.
When visualizing the PDC matrix with qgraph(), use directed = TRUE and diag = TRUE to display the directed edges and auto-regressive effects.
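The row/column convention can be checked on a tiny example. The sketch below covers the transposition step only; the standardization that turns regression coefficients into partial directed correlations is deliberately left out, and the variable names A and B are assumptions for illustration.

```r
# beta[i, j]: effect of variable j at t-1 on variable i at t
beta <- matrix(0, 2, 2, dimnames = list(c("A", "B"), c("A", "B")))
beta["B", "A"] <- 0.5      # A at t-1 predicts B at t

# Edge-list convention: element (from, to)
edges <- t(beta)
edges["A", "B"]            # 0.5, drawn as an edge A -> B
edges["B", "A"]            # 0, no edge B -> A
```

Passing beta itself to a plotting function that reads element (i, j) as an edge from i to j would draw the arrow from B to A instead.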
PCC: Partial Contemporaneous Correlations
The Partial Contemporaneous Correlations (PCC) are the standardized precision matrix of the innovations. When contemporaneous = "ggm", the PCC is equivalent to the omega_zeta matrix. Extract with:
contemporaneous_network <- getmatrix(model, "PCC")
# Equivalent to omega_zeta when using gvar():
contemporaneous_network2 <- getmatrix(model, "omega_zeta")
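To see why a standardized precision matrix yields partial correlations, the base R sketch below computes it from an assumed innovation covariance matrix and compares one element with the textbook formula for a partial correlation given a single third variable. All values are illustrative and no psychonetrics functions are used.

```r
# Assumed innovation covariance matrix (illustration only):
Sigma_zeta <- matrix(c(1.0, 0.4, 0.2,
                       0.4, 1.5, 0.3,
                       0.2, 0.3, 2.0), 3, 3)

# Standardize the precision matrix and flip the sign of the off-diagonals:
K   <- solve(Sigma_zeta)
PCC <- -cov2cor(K)
diag(PCC) <- 0

# Partial correlation of innovations 1 and 2 given innovation 3,
# computed from the marginal correlations:
R <- cov2cor(Sigma_zeta)
pc_12_given_3 <- (R[1, 2] - R[1, 3] * R[2, 3]) /
  sqrt((1 - R[1, 3]^2) * (1 - R[2, 3]^2))

all.equal(PCC[1, 2], pc_12_given_3)
```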
Summary
The var1 family provides the foundation for time-series network modeling in psychonetrics. Key takeaways:
- The VAR(1) model decomposes time-series data into temporal (directed, lag-1) and contemporaneous (undirected, same time point) structures
- The gvar() wrapper models contemporaneous relationships as a GGM (graphical VAR)
- Use dayvar and beepvar to properly handle ESM data with multiple measurements per day
- Extract standardized networks using getmatrix(model, "PDC") (temporal) and getmatrix(model, "PCC") or getmatrix(model, "omega_zeta") (contemporaneous)
- The same workflow applies: runmodel → prune → stepup → compare
- For multi-subject panel data, see the DLVM1 Family page, which covers a separate model family with wrappers panelvar(), panelgvar(), dlvm1(), and panellvgvar()
Next Steps
Now that you understand var1 models, you can explore:
- General Tutorial for the full psychonetrics workflow
- Varcov Family for cross-sectional network models
- LVM Family for latent variable and latent network models
- Examples for more advanced applications
Further Reading
- Epskamp, S. (2020). Psychometric network models from time-series and panel data. Psychometrika, 85(1), 206–231. DOI: 10.1007/s11336-020-09697-3
- Epskamp, S., Waldorp, L. J., Mottus, R., & Borsboom, D. (2018). The Gaussian graphical model in cross-sectional and time-series data. Multivariate Behavioral Research, 53(4), 453–480.
- Epskamp, S., van Borkulo, C. D., van der Veen, D. C., Servaas, M. N., Isvoranu, A. M., Riese, H., & Cramer, A. O. J. (2018). Personalized network modeling in psychopathology: The importance of contemporaneous and temporal connections. Clinical Psychological Science, 6(3), 416–427. DOI: 10.1177/2167702617744325