FounderScholar

Exploring the science of startups

How to Create an APA Style Interaction Graph in R

When running regression analyses, analyzing the interactions between predictor variables is an important tool for understanding how one predictor influences the relationship between another predictor and the outcome variable. Examining p-values and coefficients is not enough, though, we also need to plot the lines to really understand what is happening in the relationship.

This is very easy to do using the R programming language with a number of packages.

library(ggeffects)
model <- lm(Sepal.Length ~ Petal.Length * Petal.Width, data = iris)
plot <- ggeffects::ggpredict(model, c("Petal.Length",
                              "Petal.Width")) |> plot()
plot
ggeffects package interaction graph in R

The above interaction graph looks amazing from only two lines of code. What more do we really need?

Not much. When we’re deep in analysis exploring data, running models, and testing hypotheses, this is all we need to understand the interaction relationships. We can even make the plot look more professional with the interactions package.

library(tidyverse)
library(interactions)
model <- lm(Sepal.Length ~ Petal.Length * Petal.Width, data = iris)
interact_plot(model,
              pred = Petal.Length,
              modx = Petal.Width,
              modx.values = "plus-minus",
              ) +
  ylim(0,NA)
interactions package interaction graph in R

What’s Wrong With the Default Interaction Graphs in R?

The problem with the default interaction graphs in R only emerges when we are ready to drop our figures and tables into a manuscript. Many academic journals, white papers, and websites don’t have strict guidelines on a graph’s formating. Some variation of the two above graphs would be just fine.

The American Psychological Association (APA), on the other hand, publishes guidance on their preferred formatting for graphs and figures in their publication manual. You can also use Google Images to find examples.

A short story to illustrate the pain. I used R for the analyses in my dissertation project. When it came to formatting all these interaction graphs (and I had many) in APA style to drop into my manuscript, I couldn’t find any satisfactory way to do it in R. In desperation, I resorted to entering data into the spreadsheet template available from Jeremy Dawson.

Later, I did run into Kris Preacher’s website and this blog post. The post shows how to generate interaction graphs in APA format. It is very useful and you may need to do this if you have more complex requirements. But it involves both running R code and then plugging numbers into an online form. This is error-prone and time-consuming.

How to Create an APA Style Interaction Graph in R

The code below will allow you to create a simple APA-style interaction graph in R with two predictors (a 2-way interaction). It should, of course, be modified for your specific needs. This code requires a continuous DV, continuous IV, and a continuous moderator. See Part 2 of this topic if your data have a categorical moderator.

##
# Generate an APA compatible interaction plot
##
library(tidyverse)
library(ggeffects)
library(jtools)
library(forcats)

# Declare data frame
df <- iris

# Declare names of variables in data frame
dv_var <- "Sepal.Length"
iv_var <- "Petal.Length"
mod_var <- "Petal.Width"

# Declare plot labels
dv_label <- "Sepal Length"
iv_label <- "Petal Length"
mod_label <- "Petal Width"

# Declare location of legend
legend_pos <- "bottomright"

##
# The rest of the code should not be modified unless you want
# to customize
##

# Rename variables in df
df <- df |>
  rename(dv = all_of(dv_var),
         iv = all_of(iv_var),
         modv = all_of(mod_var))

# Generate your model, using the new var names
model <- lm(dv ~ iv * modv, data = df)

# Calculate one standard deviation above and below
# the predictor and moderator
cond1 <- mean(df$iv) - sd(df$iv)
cond2 <- mean(df$iv) + sd(df$iv)
mod_low <- mean(df$modv) - sd(df$modv)
mod_high <- mean(df$modv) + sd(df$modv)
mods <- c(mod_low, mod_high)
conds <- c(cond1, cond2)

# Generate the dependent variable predictions. These
# will be our y-axis values
predictions <- ggpredict(model,
                         terms = c("iv [conds]",
                                   "modv [mods]"))

# Update the data frame with the variables needed
# to build the plot
plot_vars <- predictions |>
  mutate(mod_order = ifelse(group == mod_low, 1, 2)) |>
  mutate(iv1_order = ifelse(x == cond1, 1, 2)) |>
  mutate(mod = ifelse(group == mod_low,
                      paste("Low", mod_label),
                      paste("High", mod_label))) |>
  mutate(iv1 = ifelse(x == cond1,
                      paste("Low", iv_label),
                      paste("High", iv_label)))

# Generate the plot
p <- ggplot(plot_vars,
            aes(x = fct_reorder(iv1, iv1_order),
                y = predicted,
                shape = fct_reorder(mod, mod_order),
                group = fct_reorder(mod, mod_order))) +
  geom_point(size = 2) +
  geom_line(aes(linetype = fct_reorder(mod, mod_order))) +
  labs(x = '', y = dv_label) +
  ylim(0,NA) +
  theme_apa(legend.pos = all_of(legend_pos)) +
  scale_colour_grey() +
  theme(legend.background = element_rect(fill = "white",
                                         colour = "black"),
        axis.text = element_text(size=12, colour = "black"))

# Output the plot
p
Custom APA style interaction graph in R

For a simple 2-way interaction graph, the above code only requires declaring:

  1. the data frame,
  2. variable names,
  3. variable labels, and optionally,
  4. the legend position.

This code forces the y-axis to include zero, which is a good data visualization practice. Notice the graph has a solid white background and border. It also standardizes the fonts, among other things.

Have fun formatting your next interaction graph in R and please let me know in the comments below if you have questions or suggestions for improvement.

Sign up to explore the intersection of evidence-based knowledge and hands-on startup experience.

Feature Box

Several times a month you’ll receive an article with practical insight and guidance on launching and growing your business. In addition, you’ll be exposed to interesting, influential or useful research, framed for those in the startup community.

11 Replies

  1. Jana

    Hello
    I tried to run this code. But I encounter an error in row 52 :
    # Update the data frame with the variables needed
    # to build the plot
    plot_vars
    mutate(mod_order = ifelse(group == mod_low, 1, 2)) |>
    mutate(iv1_order = ifelse(x == cond1, 1, 2)) |>
    mutate(mod = ifelse(group == mod_low,
    paste(“Low”, mod_label),
    paste(“High”, mod_label))) |>
    mutate(iv1 = ifelse(x == cond1,
    paste(“Low”, iv_label),
    paste(“High”, iv_label)))

    I get the error message:
    Error in `mutate()`:
    ℹ In argument: `mod_order = ifelse(group == mod_low, 1, 2)`.
    Caused by error:
    ! object ‘group’ not found
    Backtrace:
    1. dplyr::mutate(…)
    11. base::ifelse(group == mod_low, 1, 2)

    I do not know how to resolve this. I would be happy, if you could see if you detect a mistake?

    Kind regards

    1. Jana

      I fixed this error. For my R version there were “” necessary around group and x. However, now the code works and I get a plot. But this plot does not look right. It only shows something for high IV and Mod. For low it only shows NA. I can not add a picture here,, but maybe you have encountered this problem before?

  2. I just posted a new article that has code to generate an interaction plot if your moderator is a factor (categorical) variable. https://www.founderscholar.com/how-to-create-an-apa-style-interaction-graph-in-r-part-2/

  3. One note is the current version of this code only works for a continuous IV and a continuous moderator. So, if you get errors about doubles or numbers, that is probably the issue. I hope that you can modify this code to work with a moderator that is a factor. I may update it in the future to handle this.

  4. Everett

    Hello! Thank you for the very helpful thread. I have run the code exactly as-is, except for changing the variable names to the names in my own dataframe. When I run the lines

    predictions <- ggpredict(model,
    terms = c("iv [conds]",
    "modv [mods]"))

    I receive the error message

    Error: variables ‘iv’, ‘modv’ were specified with different types from the fit

    the variables are of type "double". and I cannot reason why it isn't running. Do you have advice for this?

    Best,

    Everett

    1. Hi Everett, can you post a few rows of your data? Maybe with just the three variables you are using in the analysis? Then I can run it on my computer.

  5. Jessica

    Hi, I have some experience in Python and about an hour’s worth of class in R – but I’ve always been interested in learning more about R. Unfortunately I’ve never had time and now my capstone paper is due in a week, and I stumbled across your page on making interaction graphs in R, great! I just wanted to see how my data would look and thought your code would make the plot a breeze, and it would be my gateway into R. However, upon adding my data, I’m running into the error:
    Error in UseMethod(“rename”) :
    no applicable method for ‘rename’ applied to an object of class “function”
    once the “# Rename variables in df” block runs. I did not change anything in the code other than the variables for my dataset, what’s the issue here? Thanks in advance!

    1. Laura

      I have the same issue

      1. Happy to help you trouble shoot it if you provide more info.

    2. Sorry it didn’t work for you. Can you post more information? For instance, can you paste e.g. 5 rows of your data here so I can run it and see the error? Let me know what the dv, iv, and moderator is, please. Also, paste your lm() function so I can see what you’re trying to plot.

Leave a Reply