Creating Progressively Blurred Images and a GIF in R
Step 1: Set Up Your R Environment
Before you start, make sure you have R installed on your computer. You can download it from The Comprehensive R Archive Network (CRAN). Additionally, you’ll need to install RStudio, a popular integrated development environment for R, which you can get from the RStudio website.
Step 2: Install Required Packages
Open RStudio and install the required packages. You’ll need magick for image processing, here for easy file path management, and gifski for creating GIFs. Install these packages using the following commands:
install.packages("magick")
install.packages("here")
install.packages("gifski")Step 3: Load the Packages
Load the installed packages into your R session with the library() function:
library(magick)
library(here)
library(gifski)Step 4: Define the Function to Create Blurred Images
You’ll create a function named create_blurred_and_gif. This function takes an image file path as input, creates a series of progressively blurred images, and compiles them into a GIF (Credit). Here’s the function definition:
create_blurred_and_gif <- function(image_file) {
# Define the base path and load the original image
base_path <- here("projects", "hindsight-activity")
original <- image_read(image_file)
# Parameters for image processing
num_images <- 20
max_blur <- 99.9999
# Generate images
png_paths <- vector("character", num_images)
for (i in 1:num_images) {
blur_amount <- max_blur * (1 - (i-1) / (num_images-1))
blurred <- image_blur(original, radius = blur_amount, sigma = blur_amount / 2)
png_path <- file.path(base_path, sprintf("blurred_%02d.png", i))
png_paths[i] <- png_path
image_write(blurred, path = png_path)
}
# Create and save the GIF
gif_path <- file.path(base_path, "blurred_images.gif")
gifski(png_files = png_paths, gif_file = gif_path, width = 800, height = 600, delay = 0.5)
}Step 5: Prepare Your Image
Place the image you want to use in the projects/hindsight-activity folder within your RStudio project directory.
Step 6: Run the Function
Call the function with the path to your image. Ensure the path is correct:
image_path <- here("projects", "hindsight-activity", "your_image.png")
create_blurred_and_gif(image_path)Step 7: View the GIF
The GIF will be saved in the same directory as your original image. You can view it by navigating to the folder or by using R:
utils::browseURL(here("projects", "hindsight-activity", "blurred_images.gif"))Conclusion
This simple tutorial shows you how to leverage R and its powerful packages to create visually appealing educational tools. Such techniques are particularly useful in psychological studies, visual effects demonstrations, or any field requiring detailed visual analysis.







Experimental Design
Revised Experimental Design:
a. Between-Subjects Factor:
- Condition: “old” vs “new” (2 levels)
b. Within-Subjects Factors:
Phase: Phase 1 vs Phase 2 (2 levels)
Fame Level: Extremely Famous vs Moderately Famous (2 levels)
Participants:
- 60 participants total (30 per between-subjects condition)
Stimuli:
20 celebrity faces in total
10 faces per phase
In each phase:
5 extremely famous celebrities
5 moderately famous celebrities
Procedure:
Phase 1: All participants view 5 extremely famous and 5 moderately famous faces
Phase 2:
“Old” condition: Same 10 faces as Phase 1 (5 extremely, 5 moderately famous)
“New” condition: 10 new faces (5 extremely, 5 moderately famous)
Measurements for each trial:
F1 score (time to identify)
Correct identification (binary: 1 or 0)
This revised description correctly captures the full factorial design of the experiment. It’s a 2 (Condition: old vs new) × 2 (Phase: 1 vs 2) × 2 (Fame Level: extremely vs moderately famous) mixed factorial design. The Condition is between-subjects, while Phase and Fame Level are within-subjects factors.
Step-by-Step Guide to Simulating Experimental Data in R
Introduction
In psychological experiments, simulating data can be crucial for planning studies, testing data analysis pipelines, and teaching statistical concepts. This guide will explain how to simulate data for an experiment in which participants are shown pictures of celebrities and must identify them. The dataset will include different conditions and phases, and track whether participants correctly identify the celebrities.
Requirements
To follow along, you will need R installed on your computer. The script uses the following packages: - dplyr for data manipulation - tidyr for reshaping data - purrr for functional programming tools - haven for exporting data to SPSS
You can install these packages using the following command if you haven’t already:
install.packages(c("dplyr", "tidyr", "purrr", "haven"))Step 1: Set Up Your Environment
First, load the necessary libraries and set a seed for reproducibility:
library(dplyr)
library(tidyr)
library(purrr)
library(haven)
# Set seed for reproducibility
set.seed(4747)Step 2: Define Experiment Parameters
Define the basic parameters of your experiment:
n_participants <- 60 # Total number of participants
n_faces_per_phase <- 10 # Number of faces shown per phase
n_famous <- 5 # Number of extremely famous faces
n_moderately_famous <- 5 # Number of moderately famous facesStep 3: Generate Participant Data
Create a data frame for your participants. Each participant is assigned to one of two conditions:
participants <- data.frame(
participant_id = 1:n_participants,
condition = rep(c("old", "new"), each = n_participants / 2)
)Step 4: Generate Face Data
Create a data frame for the faces that will be shown to participants:
faces <- data.frame(
face_id = 1:n_faces_per_phase,
fame_level = rep(c(rep("extremely_famous", n_famous), rep("moderately_famous", n_moderately_famous)), 2)
)Step 5: Simulate Trial Data
Define a function to simulate the outcome of each trial, including the time taken to identify the celebrity and whether the identification was correct:
simulate_trial <- function(fame_level, phase, condition) {
base_score <- ifelse(fame_level == "extremely_famous", 15, 25)
base_score <- base_score + rnorm(1, 0, 2)
if (phase == 2) {
if (condition == "old") {
base_score <- base_score - 1.45 - rnorm(1, 0.5, 0.5)
} else {
base_score <- base_score - 1.45 - rnorm(1, 0, 0.5)
}
}
correct <- rbinom(1, 1, ifelse(fame_level == "extremely_famous", 0.9, 0.7))
return(list(score = base_score, correct = correct))
}Step 6: Generate Data for Both Phases
Use the simulate_trial function to generate data across two phases for all participants and faces:
all_data <- map_df(1:2, function(phase) {
crossing(participants, faces) |>
mutate(
trial_results = pmap(list(fame_level, phase, condition), simulate_trial),
score = map_dbl(trial_results, "score"),
correct = map_int(trial_results, "correct"),
phase = phase
) |>
select(-trial_results)
})Step 7: Reshape the Data
Pivot the data so each row contains the scores from both phases and a single correctness indicator:
all_data <- all_data |>
pivot_wider(
id_cols = c(participant_id, face_id, condition, fame_level),
names_from = phase,
values_from = c(score, correct),
names_glue = "{.value}_{phase}"
) |>
mutate(
correct = correct_1
) |>
select(-correct_1, -correct_2)Step 8: Label and Export Data
Add labels to your variables and export the data to an SPSS file:
spss_data <- all_data |>
mutate(
condition = factor(condition, levels = c("old", "new")),
fame_level = factor(fame_level, levels = c("extremely_famous", "moderately_famous")),
condition = labelled(as.numeric(condition), labels = c(Old = 1, New = 2), label = "Experimental Condition"),
fame_level = labelled(as.numeric(fame_level), labels = c("Extremely Famous" = 1, "Moderately Famous" = 2), label = "Celebrity Fame Level"),
score_1 = labelled(score_1, label = "Time to Identify Celebrity (Phase 1)"),
score_2 = labelled(score_2, label = "Time to Identify Celebrity (Phase 2)"),
correct = labelled(correct, labels = c("Incorrect" = 0, "Correct" = 1), label = "Correct Identification Across Phases"),
participant_id = labelled(participant_id, label = "Participant ID"),
face_id = labelled(face_id, label = "Face ID")
)
write_sav(spss_data, here("FALL24", "weeks", "08-anova", "visual_hindsight_data.sav"))
cat("\nData has been saved as 'visual_hindsight_data.sav' in SPSS format\n")Creating an Answer Key for Practice Assignment
```{r}
library(tidyverse)
library(dplyr)
library(jmvReadWrite)
library(haven)
library(jmv)
## Formatting p values
format.pnum <- function(p, precision = 0.001) {
# Calculate the number of digits needed based on the precision
digits <- max(0, -floor(log10(precision)))
# Format the p-value with the required number of digits
if (p < precision) {
formatted_p <- format(precision, nsmall = digits, scientific = FALSE)
} else {
formatted_p <- format(p, nsmall = digits, scientific = FALSE)
}
# Remove leading zeros if present
formatted_p <- sub("^0\\.", ".", formatted_p)
return(formatted_p)
}
# Formatting p-values formula
format.p <- function(p, precision = 0.001) {
digits <- -log(precision, base = 10)
p <- formatC(p, format = 'f', digits = digits)
if (p < .001) {
p = paste0('< ', precision)}
if (p >= .001) {
p = paste0('= ', p) }
sub("0", "", p)
}
#BG <- jmvReadWrite::read_omv('data/hindsight_BGanova.omv')
#WG <- jmvReadWrite::read_omv('data/hindsight_WGanova.omv')
# Load the data
BG <- haven::read_sav("hindsight-bias/data/hindsight_BGanova.sav")
# Preprocess and mutate the data
BGdata <- BG |>
filter(correct == 1) |> # Limit to correct trials
mutate(hb = score_2 - score_1) # Compute Mdiff for hindsight bias
# Run the ANOVA
BGanova <- jmv::ANOVA(data = BGdata,
dep = "hb",
factors = "condition",
emMeans = list("condition"),
emmPlots = TRUE,
emmTables = TRUE)
print(BGanova)
#Use the str() function to inspect the structure of the BGanova object and identify where the ANOVA table is stored.
#print(names(BGanova))
#str(BGanova)
# Assuming the ANOVA results are stored in a component, let's say `main`
if ("main" %in% names(BGanova)) {
BGanova_table <- as.data.frame(BGanova$main)
print(BGanova_table)
} else {
print("ANOVA results component not found.")
}
# Once you have the ANOVA table as a data frame, you can extract the p-value, F-statistic, and Mean Square Error by referencing the appropriate columns. These columns are typically named something like p, F, and MSE or Error.
# Assuming the ANOVA table is already in BGanova_table
# and that you have checked the names of the columns:
# Extract p-value (usually from the first row, 'condition' effect)
p_value <- BGanova_table$p[1]
# Extract F-statistic (usually from the first row, 'condition' effect)
F_statistic <- BGanova_table$F[1]
# Extract Mean Square Error (from the 'ms' column, second row for residuals)
MSe <- BGanova_table$ms[2]
# Extract degrees of freedom
df1 <- BGanova_table$df[1]
df2 <- BGanova_table$df[2]
# Print extracted values
print(paste("P-Value: ", p_value))
print(paste("F-Statistic: ", F_statistic))
print(paste("Mean Square Error: ", MSe))
# Round the values
F_value_rounded <- round(F_statistic, 2)
MSe_rounded <- round(MSe, 2)
p_value_rounded <- ifelse(p_value < .001, "< .001", round(p_value, 3))
# Print extracted values
print(paste("P-Value: ", p_value_rounded))
print(paste("F-Statistic: ", F_value_rounded))
print(paste("Mean Square Error: ", MSe_rounded))
# Create the formatted string
apa_result <- sprintf("F(%d, %d) = %.2f, MSe = %.2f, p %s", df1, df2, F_value_rounded, MSe_rounded, p_value_rounded)
# Print the result
cat("Results of the between-groups ANOVA:\n")
cat(apa_result, "\n")
```Computing Descriptives for BG Analysis
```{r}
#| label: Get Descriptives
#| echo: false
# Assuming BGdata is loaded and contains the correct columns 'condition' and 'hb'
library(dplyr)
BGdescriptives <- BGdata |>
group_by(condition) |>
summarise(
n = n(),
mean = mean(hb, na.rm = TRUE),
sd = sd(hb, na.rm = TRUE),
.groups = "drop"
)
# Print BGdescriptives to verify the structure
print(BGdescriptives)
# Get the unique conditions
conditions <- unique(BGdescriptives$condition)
# Check if we have exactly two conditions
if (length(conditions) == 2) {
# Extract values for the first condition
condition1_values <- BGdescriptives[BGdescriptives$condition == conditions[1], ]
condition1_n <- condition1_values$n
condition1_mean <- condition1_values$mean
condition1_sd <- condition1_values$sd
# Extract values for the second condition
condition2_values <- BGdescriptives[BGdescriptives$condition == conditions[2], ]
condition2_n <- condition2_values$n
condition2_mean <- condition2_values$mean
condition2_sd <- condition2_values$sd
# Print extracted values
cat("Condition", conditions[1], "- n:", condition1_n, "Mean:", condition1_mean, "SD:", condition1_sd, "\n")
cat("Condition", conditions[2], "- n:", condition2_n, "Mean:", condition2_mean, "SD:", condition2_sd, "\n")
} else {
cat("Expected 2 conditions, but found", length(conditions), "conditions.\n")
print(BGdescriptives)
}
```Within-Groups Analysis
```{r}
#| label: WG data prep
#| echo: false
## Within Group Results
WG <- haven::read_sav("data/hindsight_WGanova.sav")
WGdata <- WG |>
mutate(
EXTREMEdiff = EXTREMEavg_2 - EXTREMEavg_1,
MODERATEdiff = MODERATEavg_2 - MODERATEavg_1,
) |>
select("participant_id",
"EXTREMEdiff",
"MODERATEdiff")
WGanova <- anovaRM(
data = WGdata,
rm = list(
list(
label = 'Fame Level',
levels = c('Moderate', 'Extremely'))),
rmCells = list(
list(
measure = 'MODERATEdiff',
cell = 'Moderate'),
list(
measure = 'EXTREMEdiff',
cell = 'Extremely')),
rmTerms = list(
'Fame Level'))
#print(WGanova)
# Assuming the ANOVA results are stored in a component, let's say `main`
if ("rmTable" %in% names(WGanova)) {
WGanova_table <- as.data.frame(WGanova$rmTable)
#print(WGanova_table)
} else {
print("ANOVA results component not found.")
}
# Assuming the object is named WGanova_table
# Extract F value
WG_F_value <- WGanova_table$`F[none]`[1]
# Extract p value
WG_p_value <- WGanova_table$`p[none]`[1]
# Extract degrees of freedom
WG_df1 <- WGanova_table$`df[none]`[1]
WG_df2 <- WGanova_table$`df[none]`[2]
# Round the values
WG_F_value_rounded <- round(WG_F_value, 2)
WG_p_value_apa <- ifelse(WG_p_value < .001, "< .001", round(WG_p_value, 3))
WG_p_value_rounded <- format.pnum(WG_p_value)
``````{r}
#| label: Generate means
#| echo: false
# Calculate means and SDs for each condition
moderate_stats <- WGdata |>
summarise(
n = n(),
mean = mean(MODERATEdiff, na.rm = TRUE),
sd = sd(MODERATEdiff, na.rm = TRUE)
)
extreme_stats <- WGdata |>
summarise(
n = n(),
mean = mean(EXTREMEdiff, na.rm = TRUE),
sd = sd(EXTREMEdiff, na.rm = TRUE)
)
## Moderate descriptives
moderate_n <- round(moderate_stats$n, 2)
moderate_Mean = round(moderate_stats$mean, 2)
moderate_SD <- round(moderate_stats$sd, 2)
## Extreme descriptives
extreme_n <- round(extreme_stats$n, 2)
extreme_Mean = round(extreme_stats$mean, 2)
extreme_SD <- round(extreme_stats$sd, 2)
# Print the results
#cat("Moderate condition: N =", round(moderate_stats$n, 2), ", Mean =", round(moderate_stats$mean, 2), ", SD =", round(moderate_stats$sd, 2), "\n")
#cat("Extreme condition: Mean =", round(extreme_stats$mean, 2), ", SD =", round(extreme_stats$sd, 2), "\n")
```Conclusion
You have now created a simulated data set for a psychological experiment using R. This dataset can be used for analysis, teaching, or further study design. Adjust the parameters and functions as needed to fit the specific needs of your experiment or hypothesis.