IMMERSE Project

Video tutorial repository: "lpa_enum"

Latent Profile Analysis

IMMERSE Project: Institute of Mixture Modeling for Equity-Oriented Researchers, Scholars, and Educators

Video Series Funded by IES Training Grant (R305B220021)

Dina Arch

IMMERSE Project

The Institute of Mixture Modeling for Equity-Oriented Researchers, Scholars, and Educators (IMMERSE) is an IES funded training grant (R305B220021) to support Education scholars in integrating mixture modeling into their research.

Please visit our website to learn more and apply for the year-long fellowship.
Follow us on Twitter!

How to reference this walkthrough: This work was supported by the IMMERSE Project (IES - R305B220021)

Visit our GitHub account to download the materials needed for this walkthrough.

Example: PISA Student Data

The first example closely follows the vignette used to demonstrate the tidyLPA package (Rosenberg et al., 2019).

This example uses the PISA data collected in the U.S. in 2015. To learn more about these data, see here.
To access the 2015 U.S. PISA data and documentation in R, use the following code:

devtools::install_github("jrosen48/pisaUSA15")
library(pisaUSA15)

Latent Profile Models:

Model 1 (Class-invariant / Diagonal): Equal variances; covariances fixed to 0
Model 2 (Class-varying / Diagonal): Free variances; covariances fixed to 0
Model 3 (Class-invariant / Non-diagonal): Equal variances; equal covariances
Model 4: Free variances; equal covariances
Model 5: Equal variances; free covariances
Model 6 (Class-varying / Non-diagonal): Free variances; free covariances
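To see how the six specifications differ in complexity, the following base R sketch counts free parameters for a mixture with `k` classes and `p` continuous indicators. The helper name `count_lpa_parameters()` is hypothetical (it is not part of tidyLPA or MplusAutomation), and the count assumes freely estimated class-specific means plus `k - 1` class proportions.

```r
# Hypothetical helper: count free parameters for a k-class LPA with p indicators.
# `variances` is "equal" or "varying"; `covariances` is "zero", "equal", or "varying".
count_lpa_parameters <- function(k, p, variances = "equal", covariances = "zero") {
  n_pairs <- p * (p - 1) / 2          # number of unique indicator pairs
  n_means <- k * p                    # class-specific means
  n_class_props <- k - 1              # class proportions sum to 1
  n_variances <- switch(variances,
                        equal = p,
                        varying = k * p,
                        stop("unknown variances option"))
  n_covariances <- switch(covariances,
                          zero = 0,
                          equal = n_pairs,
                          varying = k * n_pairs,
                          stop("unknown covariances option"))
  n_means + n_class_props + n_variances + n_covariances
}

# The six specifications listed above, evaluated for 2 classes and 4 indicators
specs <- data.frame(
  model = 1:6,
  variances = c("equal", "varying", "equal", "varying", "equal", "varying"),
  covariances = c("zero", "zero", "equal", "equal", "varying", "varying"),
  stringsAsFactors = FALSE
)
specs$parameters <- mapply(count_lpa_parameters, k = 2, p = 4,
                           variances = specs$variances,
                           covariances = specs$covariances)
print(specs)
```

Comparing these counts with the `Parameters` column of the fit table is a quick way to confirm that each model was specified as intended.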

Load packages

library(naniar)
library(tidyverse)
library(haven)
library(glue)
library(MplusAutomation)
library(here)
library(janitor)
library(gt)
library(tidyLPA)
library(pisaUSA15)
library(cowplot)
library(filesstrings)
here::i_am("lpa_enum.Rmd")  # must match this file's name relative to the project root

Prepare Data


pisa <- pisaUSA15[1:500,] %>%
  dplyr::select(broad_interest, enjoyment, instrumental_mot, self_efficacy)

Descriptive Statistics

ds <- pisa %>% 
  pivot_longer(broad_interest:self_efficacy, names_to = "variable") %>% 
  group_by(variable) %>% 
  summarise(mean = mean(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE)) 

ds %>% 
  gt() %>% 
  tab_header(title = md("**Descriptive Summary**")) %>%
  cols_label(
    variable = "Variable",
    mean = md("M"),
    sd = md("SD")
  ) %>%
  fmt_number(c(2:3),
             decimals = 2) %>% 
  cols_align(
    align = "center",
    columns = mean
  )

Enumeration

`tidyLPA`

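Enumerate using `estimate_profiles()`:

Estimate models with $K = 1:4$ classes
Each model has 4 continuous indicators
If `variances` and `covariances` are omitted, the default variance-covariance specification (model 1) is used
Change `variances` and `covariances` to indicate the model(s) you want to specify (see the tidyLPA vignette); the call below requests models 1 through 4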
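The `variances` and `covariances` arguments are paired element by element, so each position defines one model. As a rough illustration in base R, the lookup below maps each pair to the model numbers listed in "Latent Profile Models". The `model_key` vector is a hypothetical helper built from that list, not an object exported by tidyLPA.

```r
# Argument vectors as they would be passed to estimate_profiles()
variances   <- c("equal", "varying", "equal", "varying")
covariances <- c("zero", "zero", "equal", "equal")

# Hypothetical lookup: "variances/covariances" pair -> model number
model_key <- c(
  "equal/zero"      = 1,
  "varying/zero"    = 2,
  "equal/equal"     = 3,
  "varying/equal"   = 4,
  "equal/varying"   = 5,
  "varying/varying" = 6
)

# Pair the two vectors position by position and look up the model number
requested <- data.frame(
  variances,
  covariances,
  model = unname(model_key[paste(variances, covariances, sep = "/")])
)
print(requested)
```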


# Run LPA models 
pisa %>% 
    estimate_profiles(1:4,
                      package = "MplusAutomation",
                      ANALYSIS = "starts = 100, 20;",
                      variances = c("equal", "varying", "equal", "varying"),
                      covariances = c("zero", "zero", "equal", "equal"),
                      keepfiles = TRUE)

# Move files to folder 
files <- list.files(here(), pattern = "^model")
file.move(files, here("tidyLPA"))
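For readers without the filesstrings package, the same move can be sketched with base R. This is an illustration only: it runs in a temporary directory, and the file names are hypothetical stand-ins for the files that the enumeration step writes.

```r
# Work in a temporary directory so nothing in the project is touched
work_dir <- file.path(tempdir(), "lpa_move_demo")
dir.create(work_dir, showWarnings = FALSE)

# Hypothetical stand-ins for files written during enumeration
demo_files <- file.path(
  work_dir,
  c("model_1_class_1.inp", "model_1_class_1.out", "notes.txt")
)
file.create(demo_files)

# Create the target folder explicitly; file.rename() does not create it
target_dir <- file.path(work_dir, "tidyLPA")
dir.create(target_dir, showWarnings = FALSE)

# Move only files whose names start with "model"
to_move <- list.files(work_dir, pattern = "^model", full.names = TRUE)
moved <- file.rename(to_move, file.path(target_dir, basename(to_move)))
```

`moved` is a logical vector with one entry per file, which makes it easy to spot a failed move before reading the output back in.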

Table of Fit

APA formatted model fit table with additional fit indices
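For reference, the additional fit indices computed in this section can be written as follows, where $LL$ is the model log likelihood, $p$ the number of free parameters, and $n$ the number of observations. The notation is chosen here for illustration; the formulas restate the `mutate()` steps used in this section.

$$
\begin{aligned}
\mathrm{BIC} &= -2LL + p\,\ln(n) \\
\mathrm{aBIC} &= -2LL + p\,\ln\!\left(\frac{n + 2}{24}\right) \\
\mathrm{CAIC} &= -2LL + p\,[\ln(n) + 1] \\
\mathrm{AWE} &= -2LL + 2p\,[\ln(n) + 1.5] \\
\mathrm{SIC} &= -\tfrac{1}{2}\,\mathrm{BIC} \\
\mathrm{BF}_{K,K+1} &= \exp(\mathrm{SIC}_{K} - \mathrm{SIC}_{K+1}) \\
cmP_{K} &= \frac{\exp(\mathrm{SIC}_{K} - \mathrm{SIC}_{\max})}{\sum_{j} \exp(\mathrm{SIC}_{j} - \mathrm{SIC}_{\max})}
\end{aligned}
$$

The Bayes factor compares each $K$-class model with the $(K+1)$-class model of the same variance-covariance specification, and $cmP_K$ is normalized within that specification.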

Extract data:

output_pisa <- readModels(here("tidyLPA"), quiet = TRUE)

enum_extract <- LatexSummaryTable(
  output_pisa,
  keepCols = c(
    "Title",
    "Parameters",
    "LL",
    "BIC",
    "aBIC",
    "BLRT_PValue",
    "Observations"
  )
)

allFit <- enum_extract %>%
  mutate(aBIC = -2 * LL + Parameters * log((Observations + 2) / 24)) %>%
  mutate(CAIC = -2 * LL + Parameters * (log(Observations) + 1)) %>%
  mutate(AWE = -2 * LL + 2 * Parameters * (log(Observations) + 1.5)) %>%
  separate(Title, c("Model", "Class"), sep = "with") %>% 
  mutate(SIC = -.5 * BIC) %>%
  drop_na(SIC) %>% 
  group_by(Model) %>% 
  mutate(expSIC = exp(SIC - max(SIC))) %>%
  mutate(BF = exp(SIC - lead(SIC))) %>%
  mutate(cmPk = expSIC / sum(expSIC)) %>%
  ungroup() %>% 
  unite(Title, c("Model", "Class"), sep = "with", remove = TRUE) %>% 
  dplyr::select(1:5, 8:9, 6, 12, 13) %>%
  mutate(Title = str_to_title(Title)) %>% 
  arrange(Title)
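As a worked illustration of these calculations, the base R sketch below applies the same formulas to a small set of made-up values. The log likelihoods and sample size are hypothetical and are not results from the PISA models.

```r
# Hypothetical fit results for one model specification, K = 1 to 4
n <- 480                                   # hypothetical number of observations
fit <- data.frame(
  classes    = 1:4,
  parameters = c(8, 13, 18, 23),           # model 1 with 4 indicators
  ll         = c(-2700, -2600, -2560, -2550)  # hypothetical log likelihoods
)

# Information criteria
fit$bic  <- -2 * fit$ll + fit$parameters * log(n)
fit$abic <- -2 * fit$ll + fit$parameters * log((n + 2) / 24)
fit$caic <- -2 * fit$ll + fit$parameters * (log(n) + 1)
fit$awe  <- -2 * fit$ll + 2 * fit$parameters * (log(n) + 1.5)

# Schwarz information criterion, Bayes factor (K vs. K + 1), and cmPk
fit$sic <- -0.5 * fit$bic
fit$bf  <- exp(fit$sic - c(fit$sic[-1], NA))   # base R equivalent of lead()
exp_sic <- exp(fit$sic - max(fit$sic))         # subtract the max for numerical stability
fit$cmpk <- exp_sic / sum(exp_sic)

print(round(fit, 3))
```

The last row has no Bayes factor because there is no larger model to compare against, which is why the table shows a missing-value marker there.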

Create table:

allFit %>%
  gt() %>%
  tab_header(title = md("**Model Fit Summary Table**")) %>%
  cols_label(
    Title = "Classes",
    Parameters = md("Par"),
    LL = md("*LL*"),
    BLRT_PValue = "BLRT",
    BF = md("BF"),
    cmPk = md("*cmPk*")
  ) %>%
  tab_footnote(
    footnote = md(
      "*Note.* Par = Parameters; *LL* = model log likelihood;
BIC = Bayesian information criterion;
aBIC = sample size adjusted BIC; CAIC = consistent Akaike information criterion;
AWE = approximate weight of evidence criterion;
BLRT = bootstrapped likelihood ratio test p-value;
BF = Bayes factor;
*cmPk* = approximate correct model probability."
    ),
    locations = cells_title()
  )  %>%
  tab_options(column_labels.font.weight = "bold") %>%
  fmt_number(
    9,
    decimals = 2,
    drop_trailing_zeros = TRUE,
    suffixing = TRUE
  ) %>%
  fmt_number(c(3:8, 10),
             decimals = 0) %>%
  sub_missing(1:10,
              missing_text = "--") %>%
  fmt(
    c(8, 10),
    fns = function(x)
      ifelse(x < 0.001, "<0.001",
             scales::number(x, accuracy = 0.01))
  ) %>%
  fmt(
    9,
    fns = function (x)
      ifelse(x > 100, ">100",
             scales::number(x, accuracy = .1))
  ) %>% 
  tab_row_group(
    label = "Model 1",
    rows = c(1:4)) %>%
  tab_row_group(
    label = "Model 2",
    rows = c(5:8)) %>%
  tab_row_group(
    label = "Model 3",
    rows = c(9:12)) %>%
  tab_row_group(
    label = "Model 4",
    rows = c(13:16)) %>%    
  row_group_order(
    groups = c("Model 1","Model 2", "Model 3", "Model 4")
  )
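The row groups above are hard-coded as four rows per model. Because `drop_na(SIC)` removes any model without a BIC, a failed model shifts the rows and the labels. One possible safeguard, sketched in base R, is to derive the row indices from the titles. The titles here are hypothetical and assume the "Model m With k Classes" format produced by `str_to_title()`.

```r
# Hypothetical titles for 2 models x 3 class solutions
titles <- paste0("Model ", rep(1:2, each = 3),
                 " With ", rep(1:3, times = 2), " Classes")

# Mimic one model being dropped (for example, by drop_na(SIC))
titles <- titles[-5]

# Strip everything from "With" onward to recover the model label
model_label <- sub("\\s*With.*$", "", titles)

# Row indices per model, usable as the `rows` argument of a row group
row_groups <- split(seq_along(titles), model_label)
print(row_groups)
```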

Information Criteria Plot

Plot information criteria

allFit %>% 
  filter(grepl("Model 1", Title)) %>% 
  dplyr::select(2:7) %>% 
  rowid_to_column() %>% 
  pivot_longer(`BIC`:`AWE`,
               names_to = "Index",
               values_to = "ic_value") %>%  
  mutate(Index = factor(Index,
                        levels = c("AWE", "CAIC", "BIC", "aBIC"))) %>% 
  ggplot(aes(x = rowid, y = ic_value,
             color = Index, shape = Index,
             group = Index, lty = Index)) +
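  geom_point(size = 2.0) + geom_line(linewidth = .8) +  # `linewidth` replaces `size` for lines in ggplot2 >= 3.4.0
  scale_x_continuous(breaks = 1:4) +  # one tick per class solution (K = 1:4)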
  scale_colour_grey(end = .5) +
  theme_cowplot() +
  labs(x = "Number of Classes", y = "Information Criteria Value") +
  theme(legend.title = element_blank(),
        legend.position = "top")
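The plot is usually read by looking for the class solution where each index reaches its minimum or levels off. A small base R sketch of the minimum-value check is shown below. The values are hypothetical, not the PISA results.

```r
# Hypothetical information criteria for K = 1 to 4 (not tutorial results)
ic <- data.frame(
  classes = 1:4,
  BIC  = c(5450, 5280, 5230, 5245),
  aBIC = c(5425, 5240, 5172, 5170),
  CAIC = c(5458, 5293, 5248, 5268),
  AWE  = c(5520, 5390, 5385, 5440)
)

# For each index, report the number of classes with the lowest value
best_k <- sapply(ic[-1], function(x) ic$classes[which.min(x)])
print(best_k)
```

When the indices disagree, as aBIC does in this made-up example, the elbow in the plot and substantive interpretability are typically weighed alongside the minimum.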

Compare models

# MplusAutomation Method using `compareModels` 

parallelModels <- readModels(here("tidyLPA"))

compareModels(parallelModels[["model_3_class_2.out"]],
  parallelModels[["model_4_class_2.out"]], diffTest = TRUE)
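To make the difference test concrete, the sketch below computes a scaled log-likelihood difference test by hand, following the Satorra-Bentler formula described in the Mplus documentation for MLR estimation. All numbers are hypothetical. Whether `compareModels()` applies exactly this correction for a given estimator is an assumption to check against its help page.

```r
# Hypothetical values for two nested 2-class models (not tutorial results).
# Null model: Model 3 (equal variances); alternative: Model 4 (free variances).
ll_null <- -2500; p_null <- 19; c_null <- 1.2   # log likelihood, parameters, scaling factor
ll_alt  <- -2480; p_alt  <- 23; c_alt  <- 1.3

# Difference-test scaling correction
cd <- (p_null * c_null - p_alt * c_alt) / (p_null - p_alt)

# Scaled chi-square difference statistic and its degrees of freedom
trd <- -2 * (ll_null - ll_alt) / cd
df  <- p_alt - p_null

# Upper-tail p-value from the chi-square distribution
p_value <- pchisq(trd, df = df, lower.tail = FALSE)

cat("TRd =", round(trd, 2), " df =", df, " p =", format.pval(p_value), "\n")
```

The test is only meaningful for nested models, such as two variance-covariance specifications with the same number of classes.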

Latent Profile Plot

source("plot_lpa_function.txt")

plot_lpa_function(model_name = output_pisa$model_1_class_2.out)

Save figure

ggsave(here("figures", "C4_LPA_Plot.png"), dpi="retina", height=5, width=7, units="in")

References

Hallquist, M. N., & Wiley, J. F. (2018). MplusAutomation: An R package for facilitating large-scale latent variable analyses in Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 25(4), 621-638.

Miller, J. D., Hoffer, T., Suchner, R., Brown, K., & Nelson, C. (1992). LSAY codebook. Northern Illinois University.

Muthén, B. O., Muthén, L. K., & Asparouhov, T. (2017). Regression and mediation analysis using Mplus. Los Angeles, CA: Muthén & Muthén.

Muthén, L. K., & Muthén, B. O. (1998-2017). Mplus user's guide (8th ed.). Los Angeles, CA: Muthén & Muthén.

Rosenberg, J. M., van Lissa, C. J., Beymer, P. N., Anderson, D. J., Schell, M. J. & Schmidt, J. A. (2019). tidyLPA: Easily carry out Latent Profile Analysis (LPA) using open-source or commercial software [R package]. https://data-edu.github.io/tidyLPA/

R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/

Wickham, H., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
