Compare commits
4 Commits
a31f490499 ... master

| Author | SHA1 | Date |
|---|---|---|
| | d0d0e0f8d4 | |
| | 42ea52d859 | |
| | 094201eeb4 | |
| | ba54687da2 | |

.gitignore (vendored, 5 lines)

```diff
@@ -1 +1,4 @@
 .history/
+eohi3_2.csv
+*~
+.~lock*
```

Deleted file (LibreOffice lock):

```diff
@@ -1 +0,0 @@
-,ladmin,KrakenMint,21.01.2026 16:44,file:///home/ladmin/.config/libreoffice/4;
```

eohi3/00 - var creation.md (new file, 315 lines)
# Variable Creation Scripts Documentation

This document describes the data processing scripts used to create derived variables in the EOHI3 dataset. Each script performs specific transformations and should be run in sequence.

---

## datap 04 - combined vars.r

### Goal

Combine self-perspective and other-perspective variables into single columns. For each row, values exist in either the self-perspective variables OR the other-perspective variables, never both.

### Transformations

#### Past Variables (p5 = past)

Combines `self[VAL/PERS/PREF]_p5_[string]` and `other[VAL/PERS/PREF]_p5_[string]` into `past_[val/pers/pref]_[string]`.

**Source Variables:**

- **Values (VAL)**: `selfVAL_p5_trad`, `otherVAL_p5_trad`, `selfVAL_p5_autonomy`, `otherVAL_p5_autonomy`, `selfVAL_p5_personal`, `otherVAL_p5_personal`, `selfVAL_p5_justice`, `otherVAL_p5_justice`, `selfVAL_p5_close`, `otherVAL_p5_close`, `selfVAL_p5_connect`, `otherVAL_p5_connect`, `selfVAL_p5_dgen`, `otherVAL_p5_dgen`
- **Personality (PERS)**: `selfPERS_p5_open`, `otherPESR_p5_open` (note: typo in source data), `selfPERS_p5_goal`, `otherPERS_p5_goal`, `selfPERS_p5_social`, `otherPERS_p5_social`, `selfPERS_p5_agree`, `otherPERS_p5_agree`, `selfPERS_p5_stress`, `otherPERS_p5_stress`, `selfPERS_p5_dgen`, `otherPERS_p5_dgen`
- **Preferences (PREF)**: `selfPREF_p5_hobbies`, `otherPREF_p5_hobbies`, `selfPREF_p5_music`, `otherPREF_p5_music`, `selfPREF_p5_dress`, `otherPREF_p5_dress`, `selfPREF_p5_exer`, `otherPREF_p5_exer`, `selfPREF_p5_food`, `otherPREF_p5_food`, `selfPREF_p5_friends`, `otherPREF_p5_friends`, `selfPREF_p5_dgen`, `otherPREF_p5_dgen`

**Target Variables:**

- `past_val_trad`, `past_val_autonomy`, `past_val_personal`, `past_val_justice`, `past_val_close`, `past_val_connect`, `past_val_DGEN`
- `past_pers_open`, `past_pers_goal`, `past_pers_social`, `past_pers_agree`, `past_pers_stress`, `past_pers_DGEN`
- `past_pref_hobbies`, `past_pref_music`, `past_pref_dress`, `past_pref_exer`, `past_pref_food`, `past_pref_friends`, `past_pref_DGEN`

#### Future Variables (f5 = future)

Combines `self[VAL/PERS/PREF]_f5_[string]` and `other[VAL/PERS/PREF]_f5_[string]` into `fut_[val/pers/pref]_[string]`.

**Source Variables:**

- **Values (VAL)**: `selfVAL_f5_trad`, `otherVAL_f5_trad`, `selfVAL_f5_autonomy`, `otherVAL_f5_autonomy`, `selfVAL_f5_personal`, `otherVAL_f5_personal`, `selfVAL_f5_justice`, `otherVAL_f5_justice`, `selfVAL_f5_close`, `otherVAL_f5_close`, `selfVAL_f5_connect`, `otherVAL_f5_connect`, `selfVAL_f5_dgen`, `otherVAL_f5_dgen`
- **Personality (PERS)**: `selfPERS_f5_open`, `otherPERS_f5_open`, `selfPERS_f5_goal`, `otherPERS_f5_goal`, `selfPERS_f5_social`, `otherPERS_f5_social`, `selfPERS_f5_agree`, `otherPERS_f5_agree`, `selfPERS_f5_stress`, `otherPERS_f5_stress`, `selfPERS_f5_dgen`, `otherPERS_f5_dgen`
- **Preferences (PREF)**: `selfPREF_f5_hobbies`, `otherPREF_f5_hobbies`, `selfPREF_f5_music`, `otherPREF_f5_music`, `selfPREF_f5_dress`, `otherPREF_f5_dress`, `selfPREF_f5_exer`, `otherPREF_f5_exer`, `selfPREF_f5_food`, `otherPREF_f5_food`, `selfPREF_f5_friends`, `otherPREF_f5_friends`, `selfPREF_f5_dgen`, `otherPREF_f5_dgen`

**Target Variables:**

- `fut_val_trad`, `fut_val_autonomy`, `fut_val_personal`, `fut_val_justice`, `fut_val_close`, `fut_val_connect`, `fut_val_DGEN`
- `fut_pers_open`, `fut_pers_goal`, `fut_pers_social`, `fut_pers_agree`, `fut_pers_stress`, `fut_pers_DGEN`
- `fut_pref_hobbies`, `fut_pref_music`, `fut_pref_dress`, `fut_pref_exer`, `fut_pref_food`, `fut_pref_friends`, `fut_pref_DGEN`

### Logic

- Uses the self value if present (not empty/NA), otherwise uses the other value
- If both are empty/NA, the result is NA
- Assumes mutual exclusivity: each row has values in either self OR other, never both
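The combination rule above can be sketched as a small helper (a minimal Python illustration of the logic only; the actual script is R and operates on whole columns, and the name `combine` is invented here):

```python
def combine(self_val, other_val):
    """Coalesce a self-/other-perspective pair per the rules above:
    use the self value if present, else the other value, else None (R's NA)."""
    def present(v):
        return v is not None and v != ""
    if present(self_val):
        return self_val
    if present(other_val):
        return other_val
    return None
```

Because the rows are mutually exclusive, at most one argument is ever filled, so the order of the two checks only matters for malformed rows.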
### Validation Checks

1. **Conflict Check**: Verifies no rows have values in both self and other for the same variable
2. **Coverage Check**: Verifies combined columns have the expected number of non-empty values (self_count + other_count = combined_count)
3. **Sample Row Check**: Shows examples of how values were combined

### Output

- Updates existing target columns in `eohi3.csv`
- Creates backup `eohi3_2.csv` before processing

---

## datap 05 - ehi vars.r

### Goal

Calculate EHI (End of History Illusion) variables as the difference between past and future variables. Each EHI variable represents the change from past to future perspective.

### Transformations

**Calculation Formula:** `ehi_[pref/pers/val]_[string] = past_[pref/pers/val]_[string] - fut_[pref/pers/val]_[string]`

#### EHI Variables Created

**EHI Preferences:**

- `ehi_pref_hobbies` = `past_pref_hobbies` - `fut_pref_hobbies`
- `ehi_pref_music` = `past_pref_music` - `fut_pref_music`
- `ehi_pref_dress` = `past_pref_dress` - `fut_pref_dress`
- `ehi_pref_exer` = `past_pref_exer` - `fut_pref_exer`
- `ehi_pref_food` = `past_pref_food` - `fut_pref_food`
- `ehi_pref_friends` = `past_pref_friends` - `fut_pref_friends`
- `ehi_pref_DGEN` = `past_pref_DGEN` - `fut_pref_DGEN`

**EHI Personality:**

- `ehi_pers_open` = `past_pers_open` - `fut_pers_open`
- `ehi_pers_goal` = `past_pers_goal` - `fut_pers_goal`
- `ehi_pers_social` = `past_pers_social` - `fut_pers_social`
- `ehi_pers_agree` = `past_pers_agree` - `fut_pers_agree`
- `ehi_pers_stress` = `past_pers_stress` - `fut_pers_stress`
- `ehi_pers_DGEN` = `past_pers_DGEN` - `fut_pers_DGEN`

**EHI Values:**

- `ehi_val_trad` = `past_val_trad` - `fut_val_trad`
- `ehi_val_autonomy` = `past_val_autonomy` - `fut_val_autonomy`
- `ehi_val_personal` = `past_val_personal` - `fut_val_personal`
- `ehi_val_justice` = `past_val_justice` - `fut_val_justice`
- `ehi_val_close` = `past_val_close` - `fut_val_close`
- `ehi_val_connect` = `past_val_connect` - `fut_val_connect`
- `ehi_val_DGEN` = `past_val_DGEN` - `fut_val_DGEN`

### Logic

- Converts source variables to numeric (handling empty strings and NA)
- Calculates the difference: past - future
- Result can be positive (past > future), negative (past < future), or zero (past = future)
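The subtraction with missing-value handling described above amounts to the following (a Python sketch for illustration; the real script does this in R on whole columns, and `ehi`/`to_num` are hypothetical names):

```python
def ehi(past, fut):
    """EHI score = past - future; empty strings and None propagate as missing (NA in R)."""
    def to_num(v):
        # empty string or None means missing, anything else is coerced to numeric
        if v is None or v == "":
            return None
        return float(v)
    p, f = to_num(past), to_num(fut)
    if p is None or f is None:
        return None
    return p - f
```

A positive result means the past rating exceeded the future rating, matching the sign convention stated in the Logic section.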
### Validation Checks

1. **Variable Existence**: Checks that all target variables exist before processing
2. **Source Variable Check**: Verifies source columns exist
3. **Random Row Validation**: Checks 5 random rows showing source values, target value, expected calculation, and match status

### Output

- Updates existing target columns in `eohi3.csv`
- Creates backup `eohi3_2.csv` before processing

---

## datap 06 - mean vars.r

### Goal

Calculate mean variables for various scales by averaging multiple related variables. Creates both domain-specific means and overall composite means.

### Transformations

#### Domain-Specific Means

**Past Preferences MEAN:**

- **Source Variables**: `past_pref_hobbies`, `past_pref_music`, `past_pref_dress`, `past_pref_exer`, `past_pref_food`, `past_pref_friends` (6 variables)
- **Target Variable**: `past_pref_MEAN`

**Future Preferences MEAN:**

- **Source Variables**: `fut_pref_hobbies`, `fut_pref_music`, `fut_pref_dress`, `fut_pref_exer`, `fut_pref_food`, `fut_pref_friends` (6 variables)
- **Target Variable**: `fut_pref_MEAN`

**Past Personality MEAN:**

- **Source Variables**: `past_pers_open`, `past_pers_goal`, `past_pers_social`, `past_pers_agree`, `past_pers_stress` (5 variables)
- **Target Variable**: `past_pers_MEAN`

**Future Personality MEAN:**

- **Source Variables**: `fut_pers_open`, `fut_pers_goal`, `fut_pers_social`, `fut_pers_agree`, `fut_pers_stress` (5 variables)
- **Target Variable**: `fut_pers_MEAN`

**Past Values MEAN:**

- **Source Variables**: `past_val_trad`, `past_val_autonomy`, `past_val_personal`, `past_val_justice`, `past_val_close`, `past_val_connect` (6 variables)
- **Target Variable**: `past_val_MEAN`

**Future Values MEAN:**

- **Source Variables**: `fut_val_trad`, `fut_val_autonomy`, `fut_val_personal`, `fut_val_justice`, `fut_val_close`, `fut_val_connect` (6 variables)
- **Target Variable**: `fut_val_MEAN`

**EHI Preferences MEAN:**

- **Source Variables**: `ehi_pref_hobbies`, `ehi_pref_music`, `ehi_pref_dress`, `ehi_pref_exer`, `ehi_pref_food`, `ehi_pref_friends` (6 variables)
- **Target Variable**: `ehi_pref_MEAN`

**EHI Personality MEAN:**

- **Source Variables**: `ehi_pers_open`, `ehi_pers_goal`, `ehi_pers_social`, `ehi_pers_agree`, `ehi_pers_stress` (5 variables)
- **Target Variable**: `ehi_pers_MEAN`

**EHI Values MEAN:**

- **Source Variables**: `ehi_val_trad`, `ehi_val_autonomy`, `ehi_val_personal`, `ehi_val_justice`, `ehi_val_close`, `ehi_val_connect` (6 variables)
- **Target Variable**: `ehi_val_MEAN`

#### Composite Means

**EHI Domain-Specific Mean:**

- **Source Variables**: `ehi_pref_MEAN`, `ehi_pers_MEAN`, `ehi_val_MEAN` (3 variables)
- **Target Variable**: `ehiDS_mean`

**EHI Domain-General Mean:**

- **Source Variables**: `ehi_pref_DGEN`, `ehi_pers_DGEN`, `ehi_val_DGEN` (3 variables)
- **Target Variable**: `ehiDGEN_mean`

### Logic

- Converts source variables to numeric (handling empty strings and NA)
- Calculates row means using `rowMeans()` with `na.rm = TRUE` (ignores NA values)
- Each mean represents the average of non-missing values for that row
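The `rowMeans(..., na.rm = TRUE)` behaviour can be mimicked like this (illustrative Python, not the R script; `row_mean` is an invented name):

```python
def row_mean(values):
    """Mean of the non-missing values in one row, like rowMeans(..., na.rm = TRUE);
    returns None (NA) when every value is missing."""
    nums = [float(v) for v in values if v is not None and v != ""]
    return sum(nums) / len(nums) if nums else None
```

Note that a row with some missing items is averaged over the remaining items only, so means in the same column can be based on different numbers of sources.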
### Validation Checks

1. **Variable Existence**: Uses `setdiff()` to check source and target variables exist
2. **Random Row Validation**: Checks 5 random rows showing source variable names, source values, target value, expected mean calculation, and match status

### Output

- Updates existing target columns in `eohi3.csv`
- Creates backup `eohi3_2.csv` before processing

---

## datap 07 - scales and recodes.r

### Goal

Recode various variables and calculate scale scores. Includes recoding categorical variables, processing cognitive reflection test (CRT) items, calculating ICAR scores, and recoding demographic variables.

### Transformations

#### 1. Recode other_length2 → other_length

**Source Variable**: `other_length2`
**Target Variable**: `other_length`

**Recoding Rules:**

- Values 5-9 → "5-9"
- Values 10-14 → "10-14"
- Values 15-19 → "15-19"
- Value "20+" → "20+" (handled as special case)
- Empty strings → preserved as empty string (not NA)
- NA → NA
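The binning rules above can be sketched as follows (illustrative Python only; the treatment of out-of-range numeric values as missing is an assumption, since the rules do not cover them):

```python
def recode_other_length(v):
    """Bin other_length2 into the labelled ranges above; '' and None pass through."""
    if v is None or v == "":
        return v            # empty strings and NA are preserved as-is
    if v == "20+":
        return "20+"        # non-numeric response, handled as a special case
    n = int(v)
    if 5 <= n <= 9:
        return "5-9"
    if 10 <= n <= 14:
        return "10-14"
    if 15 <= n <= 19:
        return "15-19"
    return None             # out-of-range values treated as missing (assumption)
```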
#### 2. Recode other_like2 → other_like

**Source Variable**: `other_like2`
**Target Variable**: `other_like`

**Recoding Rules:**

- "Dislike a great deal" → "-2"
- "Dislike somewhat" → "-1"
- "Neither like nor dislike" → "0"
- "Like somewhat" → "1"
- "Like a great deal" → "2"
- Empty strings → preserved as empty string (not NA)
- NA → NA

#### 3. Calculate aot_total (Actively Open-Minded Thinking)

**Source Variables**: `aot01`, `aot02`, `aot03`, `aot04_r`, `aot05_r`, `aot06_r`, `aot07_r`, `aot08`
**Target Variable**: `aot_total`

**Calculation:**

1. Reverse code `aot04_r`, `aot05_r`, `aot06_r`, `aot07_r` by multiplying by -1
2. Calculate mean of all 8 variables: 4 original (`aot01`, `aot02`, `aot03`, `aot08`) + 4 reversed (`aot04_r`, `aot05_r`, `aot06_r`, `aot07_r`)
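The reverse-code-then-average step can be sketched as follows (a Python illustration; the dict interface of `aot_total` is invented here, while the real script operates on data-frame columns):

```python
def aot_total(scores):
    """scores: dict of the eight AOT items keyed by name.
    Items aot04_r..aot07_r are reverse-coded by multiplying by -1,
    then all eight values are averaged."""
    reversed_items = {"aot04_r", "aot05_r", "aot06_r", "aot07_r"}
    vals = [(-v if k in reversed_items else v) for k, v in scores.items()]
    return sum(vals) / len(vals)
```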
#### 4. Process CRT Questions → crt_correct and crt_int

**Source Variables**: `crt01`, `crt02`, `crt03`
**Target Variables**: `crt_correct`, `crt_int`

**CRT01:**

- "5 cents" → `crt_correct` = 1, `crt_int` = 0
- "10 cents" → `crt_correct` = 0, `crt_int` = 1
- Other values → `crt_correct` = 0, `crt_int` = 0

**CRT02:**

- "5 minutes" → `crt_correct` += 1, `crt_int` unchanged
- "100 minutes" → `crt_correct` unchanged, `crt_int` += 1
- Other values → both unchanged

**CRT03:**

- "47 days" → `crt_correct` += 1, `crt_int` unchanged
- "24 days" → `crt_correct` unchanged, `crt_int` += 1
- Other values → both unchanged

**Note**: `crt_correct` and `crt_int` are cumulative across all 3 questions (range: 0-3)
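The cumulative scoring can be sketched with an answer-key table (Python illustration; `CRT_KEY` and `score_crt` are invented names):

```python
# Per item: (answer that scores crt_correct, answer that scores crt_int)
CRT_KEY = {
    "crt01": ("5 cents", "10 cents"),
    "crt02": ("5 minutes", "100 minutes"),
    "crt03": ("47 days", "24 days"),
}

def score_crt(responses):
    """Cumulative correct/intuitive counts across the three CRT items (each 0-3).
    Any other answer leaves both counters unchanged."""
    correct = sum(responses.get(item) == key[0] for item, key in CRT_KEY.items())
    intuitive = sum(responses.get(item) == key[1] for item, key in CRT_KEY.items())
    return correct, intuitive
```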
#### 5. Calculate icar_verbal

**Source Variables**: `verbal01`, `verbal02`, `verbal03`, `verbal04`, `verbal05`
**Target Variable**: `icar_verbal`

**Correct Answers:**

- `verbal01` = "5"
- `verbal02` = "8"
- `verbal03` = "It's impossible to tell"
- `verbal04` = "47"
- `verbal05` = "Sunday"

**Calculation**: Proportion correct = (number of correct responses) / 5

#### 6. Calculate icar_matrix

**Source Variables**: `matrix01`, `matrix02`, `matrix03`, `matrix04`, `matrix05`
**Target Variable**: `icar_matrix`

**Correct Answers:**

- `matrix01` = "D"
- `matrix02` = "E"
- `matrix03` = "B"
- `matrix04` = "B"
- `matrix05` = "D"

**Calculation**: Proportion correct = (number of correct responses) / 5

#### 7. Calculate icar_total

**Source Variables**: `verbal01`-`verbal05`, `matrix01`-`matrix05` (10 variables total)
**Target Variable**: `icar_total`

**Calculation**: Proportion correct across all 10 items = (number of correct responses) / 10
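All three ICAR scores follow one proportion-correct pattern (a Python sketch using the answer keys listed above; the function and constant names are illustrative):

```python
VERBAL_KEY = {"verbal01": "5", "verbal02": "8",
              "verbal03": "It's impossible to tell",
              "verbal04": "47", "verbal05": "Sunday"}
MATRIX_KEY = {"matrix01": "D", "matrix02": "E", "matrix03": "B",
              "matrix04": "B", "matrix05": "D"}

def proportion_correct(responses, key):
    """Share of items answered correctly, e.g. icar_verbal = hits / 5."""
    return sum(responses.get(item) == ans for item, ans in key.items()) / len(key)

def icar_total(responses):
    """Proportion correct over all 10 ICAR items (verbal + matrix)."""
    return proportion_correct(responses, {**VERBAL_KEY, **MATRIX_KEY})
```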
#### 8. Recode demo_sex → sex

**Source Variable**: `demo_sex`
**Target Variable**: `sex`

**Recoding Rules:**

- "Male" (case-insensitive) → 0
- "Female" (case-insensitive) → 1
- Other values (e.g., "Prefer not to say") → 2
- Empty/NA → NA

#### 9. Recode demo_edu → education

**Source Variable**: `demo_edu`
**Target Variable**: `education` (ordered factor)

**Recoding Rules:**

- "High School (or equivalent)" or "Trade School" → "HS_TS"
- "College Diploma/Certificate" or "University - Undergraduate" → "C_Ug"
- "University - Graduate (Masters)" or "University - PhD" or "Professional Degree (ex. JD/MD)" → "grad_prof"
- Empty/NA → NA

**Factor Levels**: `HS_TS` < `C_Ug` < `grad_prof` (ordered)
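The two demographic recodes can be sketched as lookup tables (a Python illustration; the R script builds an ordered factor, represented here only by the `EDU_ORDER` list, and all names are invented):

```python
EDU_MAP = {
    "High School (or equivalent)": "HS_TS",
    "Trade School": "HS_TS",
    "College Diploma/Certificate": "C_Ug",
    "University - Undergraduate": "C_Ug",
    "University - Graduate (Masters)": "grad_prof",
    "University - PhD": "grad_prof",
    "Professional Degree (ex. JD/MD)": "grad_prof",
}
EDU_ORDER = ["HS_TS", "C_Ug", "grad_prof"]  # ordered factor levels, lowest first

def recode_sex(v):
    """Male -> 0, Female -> 1 (case-insensitive), other -> 2, empty/None -> None."""
    if v is None or v == "":
        return None
    v = v.strip().lower()
    if v == "male":
        return 0
    if v == "female":
        return 1
    return 2

def recode_education(v):
    """Map demo_edu labels to the three collapsed levels; empty/None -> None."""
    return EDU_MAP.get(v) if v else None
```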
### Validation Checks

Each transformation includes:

1. **Variable Existence Check**: Verifies source and target variables exist
2. **Value Check**: Verifies expected values exist in source variables (warns about unexpected values)
3. **Post-Processing Verification**: Checks 5 random rows showing source values, target values, and calculations

### Output

- Updates existing target columns in `eohi3.csv`
- Creates backup `eohi3_2.csv` before processing

---

## Script Execution Order

These scripts should be run in the following order:

1. **datap 04 - combined vars.r** - Combines self/other variables into past/future variables
2. **datap 05 - ehi vars.r** - Calculates EHI variables from past/future differences
3. **datap 06 - mean vars.r** - Calculates mean variables for scales
4. **datap 07 - scales and recodes.r** - Recodes variables and calculates scale scores

Each script creates a backup (`eohi3_2.csv`) before processing and includes validation checks to ensure transformations are performed correctly.
eohi3/DA00_fmaxVALS.r (new file, 149 lines)

```r
library(SuppDists)
library(dplyr)
library(tidyr)

setwd("/home/ladmin/Documents/DND/EOHI/eohi3")

between_vars <- c("perspective", "temporalDO")
within_vars_MEAN <- c(
  "past_pref_MEAN", "past_pers_MEAN", "past_val_MEAN",
  "fut_pref_MEAN", "fut_pers_MEAN", "fut_val_MEAN"
)
within_vars_DGEN <- c(
  "past_pref_DGEN", "past_pers_DGEN", "past_val_DGEN",
  "fut_pref_DGEN", "fut_pers_DGEN", "fut_val_DGEN"
)

df <- read.csv("eohi3.csv", stringsAsFactors = FALSE, check.names = FALSE, na.strings = "NA")

anova_data_MEAN <- df %>%
  select(pID, all_of(between_vars), all_of(within_vars_MEAN)) %>%
  filter(!is.na(perspective), perspective != "",
         !is.na(temporalDO), temporalDO != "")

long_data_MEAN <- anova_data_MEAN %>%
  pivot_longer(
    cols = all_of(within_vars_MEAN),
    names_to = "variable",
    values_to = "MEAN_SCORE"
  ) %>%
  mutate(
    time = ifelse(grepl("^past_", variable), "past", "fut"),
    domain = case_when(
      grepl("_pref_MEAN$", variable) ~ "pref",
      grepl("_pers_MEAN$", variable) ~ "pers",
      grepl("_val_MEAN$", variable) ~ "val",
      TRUE ~ NA_character_
    )
  ) %>%
  mutate(
    TIME = factor(time, levels = c("past", "fut")),
    DOMAIN = factor(domain, levels = c("pref", "pers", "val")),
    perspective = factor(perspective),
    temporalDO = factor(temporalDO)
  ) %>%
  select(pID, perspective, temporalDO, TIME, DOMAIN, MEAN_SCORE) %>%
  filter(!is.na(MEAN_SCORE))

cell_vars_MEAN <- long_data_MEAN %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    variance = var(MEAN_SCORE, na.rm = TRUE),
    .groups = "drop"
  )

fmax_by_cell_MEAN <- cell_vars_MEAN %>%
  group_by(TIME, DOMAIN) %>%
  summarise(
    Fmax_observed = max(variance, na.rm = TRUE) / min(variance, na.rm = TRUE),
    df_min = min(n) - 1L,
    .groups = "drop"
  )

# k = number of between-subjects cells compared (perspective x temporalDO)
k <- 4

fmax_table_MEAN <- fmax_by_cell_MEAN %>%
  rowwise() %>%
  mutate(
    alpha_0.05 = SuppDists::qmaxFratio(0.95, df = df_min, k = k),
    alpha_0.01 = SuppDists::qmaxFratio(0.99, df = df_min, k = k)
  ) %>%
  ungroup() %>%
  mutate(
    Fmax_observed = round(Fmax_observed, 4),
    alpha_0.05 = round(alpha_0.05, 4),
    alpha_0.01 = round(alpha_0.01, 4)
  ) %>%
  select(TIME, DOMAIN, Fmax_observed, alpha_0.05, alpha_0.01)

# ---- MEAN: Print observed Hartley ratios ----
cat("\n--- Hartley ratios (MEAN) ---\n")
fmax_table_MEAN %>%
  mutate(across(where(is.numeric), ~ format(round(., 4), nsmall = 4))) %>%
  print()

# ---- DGEN: Observed Hartley ratios ----
anova_data_DGEN <- df %>%
  select(pID, all_of(between_vars), all_of(within_vars_DGEN)) %>%
  filter(!is.na(perspective), perspective != "",
         !is.na(temporalDO), temporalDO != "")

long_data_DGEN <- anova_data_DGEN %>%
  pivot_longer(
    cols = all_of(within_vars_DGEN),
    names_to = "variable",
    values_to = "DGEN_SCORE"
  ) %>%
  mutate(
    time = ifelse(grepl("^past_", variable), "past", "fut"),
    domain = case_when(
      grepl("_pref_DGEN$", variable) ~ "pref",
      grepl("_pers_DGEN$", variable) ~ "pers",
      grepl("_val_DGEN$", variable) ~ "val",
      TRUE ~ NA_character_
    )
  ) %>%
  mutate(
    TIME = factor(time, levels = c("past", "fut")),
    DOMAIN = factor(domain, levels = c("pref", "pers", "val")),
    perspective = factor(perspective),
    temporalDO = factor(temporalDO)
  ) %>%
  select(pID, perspective, temporalDO, TIME, DOMAIN, DGEN_SCORE) %>%
  filter(!is.na(DGEN_SCORE))

cell_vars_DGEN <- long_data_DGEN %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    variance = var(DGEN_SCORE, na.rm = TRUE),
    .groups = "drop"
  )

fmax_by_cell_DGEN <- cell_vars_DGEN %>%
  group_by(TIME, DOMAIN) %>%
  summarise(
    Fmax_observed = max(variance, na.rm = TRUE) / min(variance, na.rm = TRUE),
    df_min = min(n) - 1L,
    .groups = "drop"
  )

fmax_table_DGEN <- fmax_by_cell_DGEN %>%
  rowwise() %>%
  mutate(
    alpha_0.05 = SuppDists::qmaxFratio(0.95, df = df_min, k = k),
    alpha_0.01 = SuppDists::qmaxFratio(0.99, df = df_min, k = k)
  ) %>%
  ungroup() %>%
  mutate(
    Fmax_observed = round(Fmax_observed, 4),
    alpha_0.05 = round(alpha_0.05, 4),
    alpha_0.01 = round(alpha_0.01, 4)
  ) %>%
  select(TIME, DOMAIN, Fmax_observed, alpha_0.05, alpha_0.01)

cat("\n--- Hartley ratios (DGEN) ---\n")
fmax_table_DGEN %>%
  mutate(across(where(is.numeric), ~ format(round(., 4), nsmall = 4))) %>%
  print()
```
eohi3/DA01_anova_DS.r (new file, 235 lines)

```r
library(tidyverse)
library(rstatix)
library(emmeans)
library(effectsize)
library(afex)
library(car)

options(scipen = 999)
afex::set_sum_contrasts()

setwd("/home/ladmin/Documents/DND/EOHI/eohi3")

df <- read.csv(
  "eohi3.csv",
  stringsAsFactors = FALSE,
  check.names = FALSE,
  na.strings = "NA"
)

between_vars <- c("perspective", "temporalDO")
within_vars <- c(
  "past_pref_MEAN", "past_pers_MEAN", "past_val_MEAN",
  "fut_pref_MEAN", "fut_pers_MEAN", "fut_val_MEAN"
)

missing_vars <- setdiff(c(between_vars, within_vars, "pID"), names(df))
if (length(missing_vars) > 0) {
  stop(paste("Missing required variables:", paste(missing_vars, collapse = ", ")))
}

anova_data <- df %>%
  select(pID, all_of(between_vars), all_of(within_vars)) %>%
  filter(
    !is.na(perspective), perspective != "",
    !is.na(temporalDO), temporalDO != ""
  )

long_data <- anova_data %>%
  pivot_longer(
    cols = all_of(within_vars),
    names_to = "variable",
    values_to = "MEAN_SCORE"
  ) %>%
  mutate(
    time = case_when(
      grepl("^past_", variable) ~ "past",
      grepl("^fut_", variable) ~ "fut",
      TRUE ~ NA_character_
    ),
    domain = case_when(
      grepl("_pref_MEAN$", variable) ~ "pref",
      grepl("_pers_MEAN$", variable) ~ "pers",
      grepl("_val_MEAN$", variable) ~ "val",
      TRUE ~ NA_character_
    )
  ) %>%
  mutate(
    TIME = factor(time, levels = c("past", "fut")),
    DOMAIN = factor(domain, levels = c("pref", "pers", "val")),
    perspective = factor(perspective),
    temporalDO = factor(temporalDO),
    pID = factor(pID)
  ) %>%
  select(pID, perspective, temporalDO, TIME, DOMAIN, MEAN_SCORE) %>%
  filter(!is.na(MEAN_SCORE))

desc_stats <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    mean = round(mean(MEAN_SCORE), 5),
    variance = round(var(MEAN_SCORE), 5),
    sd = round(sd(MEAN_SCORE), 5),
    median = round(median(MEAN_SCORE), 5),
    q1 = round(quantile(MEAN_SCORE, 0.25), 5),
    q3 = round(quantile(MEAN_SCORE, 0.75), 5),
    min = round(min(MEAN_SCORE), 5),
    max = round(max(MEAN_SCORE), 5),
    .groups = "drop"
  )

print(desc_stats, n = Inf)

# NOTE: long_data already drops rows with NA MEAN_SCORE, so n_missing is
# always 0 here; this table serves as a per-cell sample-size overview
missing_summary <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n_total = n(),
    n_missing = sum(is.na(MEAN_SCORE)),
    pct_missing = round(100 * n_missing / n_total, 2),
    .groups = "drop"
  )

print(missing_summary, n = Inf)

outlier_summary <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    n_outliers = sum(abs(scale(MEAN_SCORE)) > 3),
    .groups = "drop"
  )

print(outlier_summary, n = Inf)

homogeneity_between <- long_data %>%
  group_by(TIME, DOMAIN) %>%
  rstatix::levene_test(MEAN_SCORE ~ perspective * temporalDO)

print(homogeneity_between, n = Inf)

# Normality: within-subjects residuals (deviation from each participant's mean)
resid_within <- long_data %>%
  group_by(pID) %>%
  mutate(person_mean = mean(MEAN_SCORE, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(resid = MEAN_SCORE - person_mean) %>%
  pull(resid)
resid_within <- resid_within[!is.na(resid_within)]
# R's shapiro.test() allows 3 <= n <= 5000; use a random sample of 5000 if we have more
n_resid <- length(resid_within)
if (n_resid < 3L) {
  message("Too few within-subjects residuals (n < 3); skipping Shapiro-Wilk.")
} else {
  resid_for_shapiro <- if (n_resid > 5000L) {
    set.seed(1L)
    sample(resid_within, 5000L)
  } else {
    resid_within
  }
  print(shapiro.test(resid_for_shapiro))
}
# qqnorm(resid_within)
# qqline(resid_within)

aov_afex <- aov_ez(
  id = "pID",
  dv = "MEAN_SCORE",
  data = long_data,
  between = c("perspective", "temporalDO"),
  within = c("TIME", "DOMAIN"),
  type = 3
)

# ANOVA table: uncorrected and Greenhouse-Geisser
cat("\n--- ANOVA Table (Type 3, uncorrected) ---\n")
print(nice(aov_afex, correction = "none"))
cat("\n--- ANOVA Table (Type 3, Greenhouse-Geisser correction) ---\n")
print(nice(aov_afex, correction = "GG"))

# Mauchly's test of sphericity and epsilon (via car::Anova on wide data)
anova_wide <- anova_data %>%
  select(pID, perspective, temporalDO, all_of(within_vars)) %>%
  filter(if_all(all_of(within_vars), ~ !is.na(.)))
response_matrix <- as.matrix(anova_wide[, within_vars])
rm_model <- lm(response_matrix ~ perspective * temporalDO, data = anova_wide)
idata <- data.frame(
  TIME = factor(rep(c("past", "fut"), each = 3), levels = c("past", "fut")),
  DOMAIN = factor(rep(c("pref", "pers", "val"), 2), levels = c("pref", "pers", "val"))
)
rm_anova <- car::Anova(rm_model, idata = idata, idesign = ~ TIME * DOMAIN, type = 3)
rm_summary <- summary(rm_anova, multivariate = FALSE)
if (!is.null(rm_summary$sphericity.tests)) {
  cat("\nMauchly's Test of Sphericity:\n")
  print(rm_summary$sphericity.tests)
}
if (!is.null(rm_summary$epsilon)) {
  cat("\nEpsilon (GG, HF):\n")
  print(rm_summary$epsilon)
}

# POST-HOC COMPARISONS (significant effects only)

# TIME (main effect)
emm_TIME <- emmeans(aov_afex, ~ TIME)
print(pairs(emm_TIME, adjust = "bonferroni"))

# temporalDO:TIME - simple effects of TIME and of temporalDO
emm_temporalDO_TIME <- emmeans(aov_afex, ~ TIME | temporalDO)
print(pairs(emm_temporalDO_TIME, adjust = "bonferroni"))
emm_temporalDO_temporalDO <- emmeans(aov_afex, ~ temporalDO | TIME)
print(pairs(emm_temporalDO_temporalDO, adjust = "bonferroni"))

# perspective:temporalDO:TIME - ~TIME, ~perspective, ~temporalDO
emm_pt_TIME <- emmeans(aov_afex, ~ TIME | perspective + temporalDO)
print(pairs(emm_pt_TIME, adjust = "bonferroni"))
emm_pt_perspective <- emmeans(aov_afex, ~ perspective | temporalDO + TIME)
print(pairs(emm_pt_perspective, adjust = "bonferroni"))
emm_pt_temporalDO <- emmeans(aov_afex, ~ temporalDO | perspective + TIME)
print(pairs(emm_pt_temporalDO, adjust = "bonferroni"))

# perspective:DOMAIN - ~perspective and ~DOMAIN
emm_perspective_DOMAIN <- emmeans(aov_afex, ~ perspective | DOMAIN)
print(pairs(emm_perspective_DOMAIN, adjust = "bonferroni"))
emm_perspective_DOMAIN_domain <- emmeans(aov_afex, ~ DOMAIN | perspective)
print(pairs(emm_perspective_DOMAIN_domain, adjust = "bonferroni"))

# perspective:TIME:DOMAIN - ~TIME, ~perspective, ~DOMAIN
emm_pt_TIME_domain <- emmeans(aov_afex, ~ TIME | perspective + DOMAIN)
print(pairs(emm_pt_TIME_domain, adjust = "bonferroni"))
emm_pt_domain_perspective <- emmeans(aov_afex, ~ perspective | TIME + DOMAIN)
print(pairs(emm_pt_domain_perspective, adjust = "bonferroni"))
emm_pt_domain_domain <- emmeans(aov_afex, ~ DOMAIN | perspective + TIME)
print(pairs(emm_pt_domain_domain, adjust = "bonferroni"))

# perspective:temporalDO:TIME:DOMAIN - ~TIME, ~perspective, ~temporalDO
emm_ptt_TIME <- emmeans(aov_afex, ~ TIME | perspective + temporalDO + DOMAIN)
print(pairs(emm_ptt_TIME, adjust = "bonferroni"))
emm_ptt_perspective <- emmeans(aov_afex, ~ perspective | temporalDO + TIME + DOMAIN)
print(pairs(emm_ptt_perspective, adjust = "bonferroni"))
emm_ptt_temporalDO <- emmeans(aov_afex, ~ temporalDO | perspective + TIME + DOMAIN)
print(pairs(emm_ptt_temporalDO, adjust = "bonferroni"))
```
343
eohi3/datap 04 - combined vars.r
Normal file
343
eohi3/datap 04 - combined vars.r
Normal file
@ -0,0 +1,343 @@
library(dplyr)

setwd("/home/ladmin/Documents/DND/EOHI/eohi3")

# Read the data (with check.names = FALSE to preserve original column names)
# Keep empty cells as empty strings, not NA
# Only convert the literal string "NA" to NA, not empty strings
df <- read.csv("eohi3.csv", stringsAsFactors = FALSE, check.names = FALSE, na.strings = "NA")

# =============================================================================
# 1. CREATE BACKUP
# =============================================================================
# file.copy("eohi3.csv", "eohi3_2.csv", overwrite = TRUE)

# =============================================================================
# 2. DEFINE VARIABLE MAPPINGS
# =============================================================================

# Past variables mapping: [self/other][VAL/PERS/PREF]_p5_[string] -> past_[val/pers/pref]_[string]
past_mappings <- list(
  # Values (VAL)
  "past_val_trad" = c("selfVAL_p5_trad", "otherVAL_p5_trad"),
  "past_val_autonomy" = c("selfVAL_p5_autonomy", "otherVAL_p5_autonomy"),
  "past_val_personal" = c("selfVAL_p5_personal", "otherVAL_p5_personal"),
  "past_val_justice" = c("selfVAL_p5_justice", "otherVAL_p5_justice"),
  "past_val_close" = c("selfVAL_p5_close", "otherVAL_p5_close"),
  "past_val_connect" = c("selfVAL_p5_connect", "otherVAL_p5_connect"),
  "past_val_DGEN" = c("selfVAL_p5_dgen", "otherVAL_p5_dgen"),

  # Personality (PERS)
  # NOTE: the source data contains a typo'd column "otherPESR_p5_open";
  # if "otherPERS_p5_open" is reported missing below, check for that spelling.
  "past_pers_open" = c("selfPERS_p5_open", "otherPERS_p5_open"),
  "past_pers_goal" = c("selfPERS_p5_goal", "otherPERS_p5_goal"),
  "past_pers_social" = c("selfPERS_p5_social", "otherPERS_p5_social"),
  "past_pers_agree" = c("selfPERS_p5_agree", "otherPERS_p5_agree"),
  "past_pers_stress" = c("selfPERS_p5_stress", "otherPERS_p5_stress"),
  "past_pers_DGEN" = c("selfPERS_p5_dgen", "otherPERS_p5_dgen"),

  # Preferences (PREF)
  "past_pref_hobbies" = c("selfPREF_p5_hobbies", "otherPREF_p5_hobbies"),
  "past_pref_music" = c("selfPREF_p5_music", "otherPREF_p5_music"),
  "past_pref_dress" = c("selfPREF_p5_dress", "otherPREF_p5_dress"),
  "past_pref_exer" = c("selfPREF_p5_exer", "otherPREF_p5_exer"),
  "past_pref_food" = c("selfPREF_p5_food", "otherPREF_p5_food"),
  "past_pref_friends" = c("selfPREF_p5_friends", "otherPREF_p5_friends"),
  "past_pref_DGEN" = c("selfPREF_p5_dgen", "otherPREF_p5_dgen")
)

# Future variables mapping: [self/other][VAL/PERS/PREF]_f5_[string] -> fut_[val/pers/pref]_[string]
future_mappings <- list(
  # Values (VAL)
  "fut_val_trad" = c("selfVAL_f5_trad", "otherVAL_f5_trad"),
  "fut_val_autonomy" = c("selfVAL_f5_autonomy", "otherVAL_f5_autonomy"),
  "fut_val_personal" = c("selfVAL_f5_personal", "otherVAL_f5_personal"),
  "fut_val_justice" = c("selfVAL_f5_justice", "otherVAL_f5_justice"),
  "fut_val_close" = c("selfVAL_f5_close", "otherVAL_f5_close"),
  "fut_val_connect" = c("selfVAL_f5_connect", "otherVAL_f5_connect"),
  "fut_val_DGEN" = c("selfVAL_f5_dgen", "otherVAL_f5_dgen"),

  # Personality (PERS)
  "fut_pers_open" = c("selfPERS_f5_open", "otherPERS_f5_open"),
  "fut_pers_goal" = c("selfPERS_f5_goal", "otherPERS_f5_goal"),
  "fut_pers_social" = c("selfPERS_f5_social", "otherPERS_f5_social"),
  "fut_pers_agree" = c("selfPERS_f5_agree", "otherPERS_f5_agree"),
  "fut_pers_stress" = c("selfPERS_f5_stress", "otherPERS_f5_stress"),
  "fut_pers_DGEN" = c("selfPERS_f5_dgen", "otherPERS_f5_dgen"),

  # Preferences (PREF)
  "fut_pref_hobbies" = c("selfPREF_f5_hobbies", "otherPREF_f5_hobbies"),
  "fut_pref_music" = c("selfPREF_f5_music", "otherPREF_f5_music"),
  "fut_pref_dress" = c("selfPREF_f5_dress", "otherPREF_f5_dress"),
  "fut_pref_exer" = c("selfPREF_f5_exer", "otherPREF_f5_exer"),
  "fut_pref_food" = c("selfPREF_f5_food", "otherPREF_f5_food"),
  "fut_pref_friends" = c("selfPREF_f5_friends", "otherPREF_f5_friends"),
  "fut_pref_DGEN" = c("selfPREF_f5_dgen", "otherPREF_f5_dgen")
)

# =============================================================================
# 3. COMBINE VARIABLES
# =============================================================================

# Function to combine self and other variables.
# For each row, values exist in either self OR other, never both.
# NOTE: Column existence should be checked before calling this function.
combine_vars <- function(df, self_col, other_col) {
  # Safety check: if columns don't exist, stop with a clear message
  if (!self_col %in% names(df)) {
    stop(paste("ERROR: Column", self_col, "not found. This should have been caught earlier."))
  }
  if (!other_col %in% names(df)) {
    stop(paste("ERROR: Column", other_col, "not found. This should have been caught earlier."))
  }

  # Combine: use self value if not empty/NA, otherwise use other value.
  # Handles both NA and empty strings.
  result <- ifelse(
    !is.na(df[[self_col]]) & df[[self_col]] != "",
    df[[self_col]],
    ifelse(
      !is.na(df[[other_col]]) & df[[other_col]] != "",
      df[[other_col]],
      NA
    )
  )

  return(result)
}

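The self/other merging logic can be sanity-checked in isolation. A minimal sketch on hypothetical toy vectors (the names `self_x`/`other_x` are illustrative, not dataset columns) — each position is filled on exactly one side, as the dataset assumes:

```r
# Hypothetical toy columns: each row is filled in self OR other, never both
self_x  <- c("3", "", NA)
other_x <- c("", "5", "4")

# Same nested-ifelse logic as combine_vars(): prefer self when non-empty/non-NA,
# otherwise fall back to other
combined <- ifelse(
  !is.na(self_x) & self_x != "",
  self_x,
  ifelse(!is.na(other_x) & other_x != "", other_x, NA)
)

print(combined)  # "3" "5" "4"
```

Checking both `is.na()` and `!= ""` matters because the CSV is read with empty cells kept as empty strings, so "missing" can arrive in either form.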
# Apply past mappings
cat("\nCombining past variables...\n")
missing_cols <- list()
for (new_col in names(past_mappings)) {
  self_col <- past_mappings[[new_col]][1]
  other_col <- past_mappings[[new_col]][2]

  # Check if all required columns exist
  missing <- c()
  if (!new_col %in% names(df)) {
    missing <- c(missing, paste("target:", new_col))
  }
  if (!self_col %in% names(df)) {
    missing <- c(missing, paste("self:", self_col))
  }
  if (!other_col %in% names(df)) {
    missing <- c(missing, paste("other:", other_col))
  }

  if (length(missing) > 0) {
    missing_cols[[new_col]] <- missing
    warning(paste("Skipping", new_col, "- missing columns:", paste(missing, collapse = ", ")))
    next
  }

  # All columns exist, proceed with combination
  df[[new_col]] <- combine_vars(df, self_col, other_col)
  cat(paste("  Updated:", new_col, "\n"))
}

# Report any missing columns
if (length(missing_cols) > 0) {
  cat("\n⚠ Missing columns detected in PAST variables:\n")
  for (var in names(missing_cols)) {
    cat(paste("  ", var, ":", paste(missing_cols[[var]], collapse = ", "), "\n"))
  }
}

# Apply future mappings
cat("\nCombining future variables...\n")
missing_cols_future <- list()
for (new_col in names(future_mappings)) {
  self_col <- future_mappings[[new_col]][1]
  other_col <- future_mappings[[new_col]][2]

  # Check if all required columns exist
  missing <- c()
  if (!new_col %in% names(df)) {
    missing <- c(missing, paste("target:", new_col))
  }
  if (!self_col %in% names(df)) {
    missing <- c(missing, paste("self:", self_col))
  }
  if (!other_col %in% names(df)) {
    missing <- c(missing, paste("other:", other_col))
  }

  if (length(missing) > 0) {
    missing_cols_future[[new_col]] <- missing
    warning(paste("Skipping", new_col, "- missing columns:", paste(missing, collapse = ", ")))
    next
  }

  # All columns exist, proceed with combination
  df[[new_col]] <- combine_vars(df, self_col, other_col)
  cat(paste("  Updated:", new_col, "\n"))
}

# Report any missing columns
if (length(missing_cols_future) > 0) {
  cat("\n⚠ Missing columns detected in FUTURE variables:\n")
  for (var in names(missing_cols_future)) {
    cat(paste("  ", var, ":", paste(missing_cols_future[[var]], collapse = ", "), "\n"))
  }
}

# =============================================================================
# 4. VALIDATION CHECKS
# =============================================================================

cat("\n=== VALIDATION CHECKS ===\n\n")

# Check 1: Ensure no row has values in both self and other for the same variable
check_conflicts <- function(df, mappings) {
  conflicts <- data.frame()

  for (new_col in names(mappings)) {
    self_col <- mappings[[new_col]][1]
    other_col <- mappings[[new_col]][2]

    if (self_col %in% names(df) && other_col %in% names(df)) {
      # Find rows where both self and other have non-empty values
      both_filled <- !is.na(df[[self_col]]) & df[[self_col]] != "" &
                     !is.na(df[[other_col]]) & df[[other_col]] != ""

      if (any(both_filled, na.rm = TRUE)) {
        conflict_rows <- which(both_filled)
        conflicts <- rbind(conflicts, data.frame(
          variable = new_col,
          self_col = self_col,
          other_col = other_col,
          n_conflicts = length(conflict_rows),
          example_rows = paste(head(conflict_rows, 5), collapse = ", ")
        ))
      }
    }
  }

  return(conflicts)
}

past_conflicts <- check_conflicts(df, past_mappings)
future_conflicts <- check_conflicts(df, future_mappings)

if (nrow(past_conflicts) > 0) {
  cat("WARNING: Found conflicts in PAST variables (both self and other have values):\n")
  print(past_conflicts)
} else {
  cat("✓ No conflicts found in PAST variables\n")
}

if (nrow(future_conflicts) > 0) {
  cat("\nWARNING: Found conflicts in FUTURE variables (both self and other have values):\n")
  print(future_conflicts)
} else {
  cat("✓ No conflicts found in FUTURE variables\n")
}

# Check 2: Verify that combined columns have values where expected
check_coverage <- function(df, mappings) {
  coverage <- data.frame()

  for (new_col in names(mappings)) {
    self_col <- mappings[[new_col]][1]
    other_col <- mappings[[new_col]][2]

    # Check if columns exist before counting
    self_exists <- self_col %in% names(df)
    other_exists <- other_col %in% names(df)
    target_exists <- new_col %in% names(df)

    # Count non-empty values in original columns (only if they exist)
    self_count <- if (self_exists) {
      sum(!is.na(df[[self_col]]) & df[[self_col]] != "", na.rm = TRUE)
    } else {
      NA
    }

    other_count <- if (other_exists) {
      sum(!is.na(df[[other_col]]) & df[[other_col]] != "", na.rm = TRUE)
    } else {
      NA
    }

    combined_count <- if (target_exists) {
      sum(!is.na(df[[new_col]]) & df[[new_col]] != "", na.rm = TRUE)
    } else {
      NA
    }

    # Combined should equal sum of self and other (since they don't overlap)
    expected_count <- if (!is.na(self_count) && !is.na(other_count)) {
      self_count + other_count
    } else {
      NA
    }

    match <- if (!is.na(combined_count) && !is.na(expected_count)) {
      combined_count == expected_count
    } else {
      NA
    }

    coverage <- rbind(coverage, data.frame(
      variable = new_col,
      self_non_empty = self_count,
      other_non_empty = other_count,
      combined_non_empty = combined_count,
      expected_non_empty = expected_count,
      match = match
    ))
  }

  return(coverage)
}

past_coverage <- check_coverage(df, past_mappings)
future_coverage <- check_coverage(df, future_mappings)

cat("\n=== COVERAGE CHECK ===\n")
cat("\nPAST variables:\n")
print(past_coverage)

cat("\nFUTURE variables:\n")
print(future_coverage)

# Check if all coverage matches
all_past_match <- all(past_coverage$match, na.rm = TRUE)
all_future_match <- all(future_coverage$match, na.rm = TRUE)

if (all_past_match && all_future_match) {
  cat("\n✓ All combined variables have correct coverage\n")
} else {
  cat("\n⚠ Some variables may have missing coverage - check the table above\n")
}

# Check 3: Sample check - verify a few rows manually
cat("\n=== SAMPLE ROW CHECK ===\n")
sample_rows <- min(5, nrow(df))
cat(paste("Checking first", sample_rows, "rows:\n\n"))

for (i in 1:sample_rows) {
  cat(paste("Row", i, ":\n"))

  # Check one past variable
  test_var <- "past_val_trad"
  self_val <- if (past_mappings[[test_var]][1] %in% names(df)) df[i, past_mappings[[test_var]][1]] else NA
  other_val <- if (past_mappings[[test_var]][2] %in% names(df)) df[i, past_mappings[[test_var]][2]] else NA
  combined_val <- df[i, test_var]

  cat(sprintf("  %s: self=%s, other=%s, combined=%s\n",
              test_var,
              ifelse(is.na(self_val) || self_val == "", "empty", self_val),
              ifelse(is.na(other_val) || other_val == "", "empty", other_val),
              ifelse(is.na(combined_val) || combined_val == "", "empty", combined_val)))
}

# =============================================================================
# 5. SAVE UPDATED DATA
# =============================================================================

write.csv(df, "eohi3.csv", row.names = FALSE, na = "")
cat("Updated data saved to: eohi3.csv\n")
cat(paste("Total rows:", nrow(df), "\n"))
cat(paste("Total columns:", ncol(df), "\n"))
187
eohi3/datap 05 - ehi vars.r
Normal file
@ -0,0 +1,187 @@
library(dplyr)

setwd("/home/ladmin/Documents/DND/EOHI/eohi3")

# Read the data (with check.names = FALSE to preserve original column names)
# Keep empty cells as empty strings, not NA
# Only convert the literal string "NA" to NA, not empty strings
df <- read.csv("eohi3.csv", stringsAsFactors = FALSE, check.names = FALSE, na.strings = "NA")

# =============================================================================
# 1. CREATE BACKUP
# =============================================================================
file.copy("eohi3.csv", "eohi3_2.csv", overwrite = TRUE)

# =============================================================================
# 2. DEFINE VARIABLE MAPPINGS
# =============================================================================

# Target variables (excluding those ending in _MEAN)
# Each target var = past_var - fut_var
ehi_mappings <- list(
  # Preferences (PREF)
  "ehi_pref_hobbies" = c("past_pref_hobbies", "fut_pref_hobbies"),
  "ehi_pref_music" = c("past_pref_music", "fut_pref_music"),
  "ehi_pref_dress" = c("past_pref_dress", "fut_pref_dress"),
  "ehi_pref_exer" = c("past_pref_exer", "fut_pref_exer"),
  "ehi_pref_food" = c("past_pref_food", "fut_pref_food"),
  "ehi_pref_friends" = c("past_pref_friends", "fut_pref_friends"),
  "ehi_pref_DGEN" = c("past_pref_DGEN", "fut_pref_DGEN"),

  # Personality (PERS)
  "ehi_pers_open" = c("past_pers_open", "fut_pers_open"),
  "ehi_pers_goal" = c("past_pers_goal", "fut_pers_goal"),
  "ehi_pers_social" = c("past_pers_social", "fut_pers_social"),
  "ehi_pers_agree" = c("past_pers_agree", "fut_pers_agree"),
  "ehi_pers_stress" = c("past_pers_stress", "fut_pers_stress"),
  "ehi_pers_DGEN" = c("past_pers_DGEN", "fut_pers_DGEN"),

  # Values (VAL)
  "ehi_val_trad" = c("past_val_trad", "fut_val_trad"),
  "ehi_val_autonomy" = c("past_val_autonomy", "fut_val_autonomy"),
  "ehi_val_personal" = c("past_val_personal", "fut_val_personal"),
  "ehi_val_justice" = c("past_val_justice", "fut_val_justice"),
  "ehi_val_close" = c("past_val_close", "fut_val_close"),
  "ehi_val_connect" = c("past_val_connect", "fut_val_connect"),
  "ehi_val_DGEN" = c("past_val_DGEN", "fut_val_DGEN")
)

# =============================================================================
# 3. CHECK IF TARGET VARIABLES EXIST
# =============================================================================

missing_targets <- c()
for (target_var in names(ehi_mappings)) {
  if (!target_var %in% names(df)) {
    missing_targets <- c(missing_targets, target_var)
    cat(paste("⚠ Target variable not found:", target_var, "\n"))
  }
}

if (length(missing_targets) > 0) {
  cat("\nERROR: The following target variables are missing from eohi3.csv:\n")
  for (var in missing_targets) {
    cat(paste("  -", var, "\n"))
  }
  stop("Cannot proceed without target variables. Please add them to the CSV file.")
}

# =============================================================================
# 4. CALCULATE EHI VARIABLES (past - future)
# =============================================================================

missing_source_cols <- list()

for (target_var in names(ehi_mappings)) {
  past_var <- ehi_mappings[[target_var]][1]
  fut_var <- ehi_mappings[[target_var]][2]

  # Check if source columns exist
  missing <- c()
  if (!past_var %in% names(df)) {
    missing <- c(missing, past_var)
  }
  if (!fut_var %in% names(df)) {
    missing <- c(missing, fut_var)
  }

  if (length(missing) > 0) {
    missing_source_cols[[target_var]] <- missing
    warning(paste("Skipping", target_var, "- missing source columns:", paste(missing, collapse = ", ")))
    next
  }

  # Convert to numeric, handling empty strings and NA
  past_vals <- as.numeric(ifelse(df[[past_var]] == "" | is.na(df[[past_var]]), NA, df[[past_var]]))
  fut_vals <- as.numeric(ifelse(df[[fut_var]] == "" | is.na(df[[fut_var]]), NA, df[[fut_var]]))

  # Calculate difference: past - future
  ehi_vals <- past_vals - fut_vals

  # Update target column
  df[[target_var]] <- ehi_vals
  cat(paste("  Calculated:", target_var, "=", past_var, "-", fut_var, "\n"))
}

# Report any missing source columns
if (length(missing_source_cols) > 0) {
  for (var in names(missing_source_cols)) {
    cat(paste("  ", var, ":", paste(missing_source_cols[[var]], collapse = ", "), "\n"))
  }
}

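The past-minus-future difference propagates missingness: if either score is empty or NA, the resulting EHI value is NA. A minimal sketch with made-up scores (not dataset values) showing the same conversion-then-subtract steps:

```r
# Hypothetical scores; "" marks an empty CSV cell
past_raw <- c("5", "3", "", "4")
fut_raw  <- c("2", "",  "1", "4")

# Same conversion as the script: empty strings become NA before as.numeric()
past_vals <- as.numeric(ifelse(past_raw == "" | is.na(past_raw), NA, past_raw))
fut_vals  <- as.numeric(ifelse(fut_raw == "" | is.na(fut_raw), NA, fut_raw))

# EHI = past - future; NA on either side yields NA
ehi_vals <- past_vals - fut_vals
print(ehi_vals)  # 3 NA NA 0
```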
# =============================================================================
# 5. VALIDATION: CHECK 5 RANDOM ROWS
# =============================================================================

cat("\n=== VALIDATION: CHECKING 5 RANDOM ROWS ===\n\n")

# Set seed for reproducibility
set.seed(123)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
sample_rows <- sort(sample_rows)

for (i in sample_rows) {
  cat(paste("Row", i, ":\n"))

  # Check a few representative variables from each category
  test_vars <- c(
    "ehi_pref_hobbies",
    "ehi_pers_open",
    "ehi_val_trad"
  )

  for (target_var in test_vars) {
    if (target_var %in% names(ehi_mappings)) {
      past_var <- ehi_mappings[[target_var]][1]
      fut_var <- ehi_mappings[[target_var]][2]

      if (past_var %in% names(df) && fut_var %in% names(df)) {
        past_val <- df[i, past_var]
        fut_val <- df[i, fut_var]
        ehi_val <- df[i, target_var]

        # Convert to numeric for calculation check
        past_num <- as.numeric(ifelse(past_val == "" | is.na(past_val), NA, past_val))
        fut_num <- as.numeric(ifelse(fut_val == "" | is.na(fut_val), NA, fut_val))
        ehi_num <- as.numeric(ifelse(is.na(ehi_val), NA, ehi_val))

        # Calculate expected value
        expected <- if (!is.na(past_num) && !is.na(fut_num)) {
          past_num - fut_num
        } else {
          NA
        }

        # Check if calculation is correct
        match <- if (!is.na(expected) && !is.na(ehi_num)) {
          abs(expected - ehi_num) < 0.0001  # Allow for floating point precision
        } else {
          is.na(expected) && is.na(ehi_num)
        }

        cat(sprintf("  %s:\n", target_var))
        cat(sprintf("    %s = %s\n", past_var, ifelse(is.na(past_val) || past_val == "", "NA/empty", past_val)))
        cat(sprintf("    %s = %s\n", fut_var, ifelse(is.na(fut_val) || fut_val == "", "NA/empty", fut_val)))
        cat(sprintf("    %s = %s\n", target_var, ifelse(is.na(ehi_val), "NA", ehi_val)))
        cat(sprintf("    Expected: %s - %s = %s\n",
                    ifelse(is.na(past_num), "NA", past_num),
                    ifelse(is.na(fut_num), "NA", fut_num),
                    ifelse(is.na(expected), "NA", expected)))
        cat(sprintf("    Match: %s\n\n", ifelse(match, "✓", "✗ ERROR")))
      }
    }
  }
}

# =============================================================================
# 6. SAVE UPDATED DATA
# =============================================================================

cat("\n=== SAVING DATA ===\n")
write.csv(df, "eohi3.csv", row.names = FALSE, na = "")
cat("Updated data saved to: eohi3.csv\n")
cat(paste("Total rows:", nrow(df), "\n"))
cat(paste("Total columns:", ncol(df), "\n"))

225
eohi3/datap 06 - mean vars.r
Normal file
@ -0,0 +1,225 @@
library(dplyr)

setwd("/home/ladmin/Documents/DND/EOHI/eohi3")

# Read the data (with check.names = FALSE to preserve original column names)
# Keep empty cells as empty strings, not NA
# Only convert the literal string "NA" to NA, not empty strings
df <- read.csv("eohi3.csv", stringsAsFactors = FALSE, check.names = FALSE, na.strings = "NA")

# =============================================================================
# 1. CREATE BACKUP
# =============================================================================
file.copy("eohi3.csv", "eohi3_2.csv", overwrite = TRUE)

# =============================================================================
# 2. DEFINE MEAN VARIABLE MAPPINGS
# =============================================================================

mean_mappings <- list(
  # Past Preferences MEAN
  "past_pref_MEAN" = c("past_pref_hobbies", "past_pref_music", "past_pref_dress",
                       "past_pref_exer", "past_pref_food", "past_pref_friends"),

  # Future Preferences MEAN
  "fut_pref_MEAN" = c("fut_pref_hobbies", "fut_pref_music", "fut_pref_dress",
                      "fut_pref_exer", "fut_pref_food", "fut_pref_friends"),

  # Past Personality MEAN
  "past_pers_MEAN" = c("past_pers_open", "past_pers_goal", "past_pers_social",
                       "past_pers_agree", "past_pers_stress"),

  # Future Personality MEAN
  "fut_pers_MEAN" = c("fut_pers_open", "fut_pers_goal", "fut_pers_social",
                      "fut_pers_agree", "fut_pers_stress"),

  # Past Values MEAN
  "past_val_MEAN" = c("past_val_trad", "past_val_autonomy", "past_val_personal",
                      "past_val_justice", "past_val_close", "past_val_connect"),

  # Future Values MEAN
  "fut_val_MEAN" = c("fut_val_trad", "fut_val_autonomy", "fut_val_personal",
                     "fut_val_justice", "fut_val_close", "fut_val_connect"),

  # EHI Preferences MEAN
  "ehi_pref_MEAN" = c("ehi_pref_hobbies", "ehi_pref_music", "ehi_pref_dress",
                      "ehi_pref_exer", "ehi_pref_food", "ehi_pref_friends"),

  # EHI Personality MEAN
  "ehi_pers_MEAN" = c("ehi_pers_open", "ehi_pers_goal", "ehi_pers_social",
                      "ehi_pers_agree", "ehi_pers_stress"),

  # EHI Values MEAN
  "ehi_val_MEAN" = c("ehi_val_trad", "ehi_val_autonomy", "ehi_val_personal",
                     "ehi_val_justice", "ehi_val_close", "ehi_val_connect")
)

# Additional means
additional_means <- list(
  "ehiDS_mean" = c("ehi_pref_MEAN", "ehi_pers_MEAN", "ehi_val_MEAN"),
  "ehiDGEN_mean" = c("ehi_pref_DGEN", "ehi_pers_DGEN", "ehi_val_DGEN")
)

# =============================================================================
# 3. CHECK IF VARIABLES EXIST
# =============================================================================

# Check source variables for mean_mappings
missing_source_vars <- list()
for (target_var in names(mean_mappings)) {
  source_vars <- mean_mappings[[target_var]]
  missing <- setdiff(source_vars, names(df))
  if (length(missing) > 0) {
    missing_source_vars[[target_var]] <- missing
    cat(paste("⚠ Missing source variables for", target_var, ":", paste(missing, collapse = ", "), "\n"))
  }
}

# Check source variables for additional_means
missing_additional_vars <- list()
for (target_var in names(additional_means)) {
  source_vars <- additional_means[[target_var]]
  missing <- setdiff(source_vars, names(df))
  if (length(missing) > 0) {
    missing_additional_vars[[target_var]] <- missing
    cat(paste("⚠ Missing source variables for", target_var, ":", paste(missing, collapse = ", "), "\n"))
  }
}

# Check if target variables exist
expected_targets <- c(names(mean_mappings), names(additional_means))
actual_targets <- names(df)
missing_targets <- setdiff(expected_targets, actual_targets)

if (length(missing_targets) > 0) {
  cat("\nERROR: The following target variables are missing from eohi3.csv:\n")
  for (var in missing_targets) {
    cat(paste("  -", var, "\n"))
  }
  stop("Cannot proceed without target variables. Please add them to the CSV file.")
}

# =============================================================================
# 4. CALCULATE MEAN VARIABLES
# =============================================================================

# Function to calculate row means, handling NA and empty strings.
# Note: rows where all sources are missing yield NaN from rowMeans();
# is.na(NaN) is TRUE in R, so downstream NA checks still treat them as missing.
calculate_mean <- function(df, source_vars) {
  # Extract columns and convert to numeric
  cols_data <- df[, source_vars, drop = FALSE]

  # Convert to numeric matrix, treating empty strings and "NA" as NA
  numeric_matrix <- apply(cols_data, 2, function(x) {
    as.numeric(ifelse(x == "" | is.na(x) | x == "NA", NA, x))
  })

  # Calculate row means, ignoring NA values
  rowMeans(numeric_matrix, na.rm = TRUE)
}

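The NA handling above can be checked on a small made-up frame (hypothetical columns `a`/`b`, not dataset variables). The steps mirror `calculate_mean()`: blank or literal `"NA"` cells become NA, then row means ignore NA:

```r
# Hypothetical toy data with an empty cell and one fully-missing row
toy <- data.frame(a = c("2", "4", ""), b = c("4", "", ""), stringsAsFactors = FALSE)

# Same steps as calculate_mean(): blank/"NA" cells -> NA, then row means
numeric_matrix <- apply(toy, 2, function(x) {
  as.numeric(ifelse(x == "" | is.na(x) | x == "NA", NA, x))
})
means <- rowMeans(numeric_matrix, na.rm = TRUE)
print(means)  # 3 4 NaN  (all-NA rows come back as NaN, which is.na() still catches)
```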
# Calculate means for main mappings
for (target_var in names(mean_mappings)) {
  source_vars <- mean_mappings[[target_var]]

  # Check if all source variables exist
  missing <- setdiff(source_vars, names(df))
  if (length(missing) > 0) {
    warning(paste("Skipping", target_var, "- missing source variables:", paste(missing, collapse = ", ")))
    next
  }

  # Calculate mean
  df[[target_var]] <- calculate_mean(df, source_vars)
  cat(paste("  Calculated:", target_var, "from", length(source_vars), "variables\n"))
}

# Calculate additional means
for (target_var in names(additional_means)) {
  source_vars <- additional_means[[target_var]]

  # Check if all source variables exist
  missing <- setdiff(source_vars, names(df))
  if (length(missing) > 0) {
    warning(paste("Skipping", target_var, "- missing source variables:", paste(missing, collapse = ", ")))
    next
  }

  # Calculate mean
  df[[target_var]] <- calculate_mean(df, source_vars)
  cat(paste("  Calculated:", target_var, "from", length(source_vars), "variables\n"))
}

# =============================================================================
# 5. VALIDATION: CHECK 5 RANDOM ROWS
# =============================================================================

# Set seed for reproducibility
set.seed(123)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
sample_rows <- sort(sample_rows)

for (i in sample_rows) {
  cat(paste("Row", i, ":\n"))

  # Check a few representative mean variables
  test_vars <- c(
    "past_pref_MEAN",
    "ehi_pref_MEAN",
    "ehiDS_mean"
  )

  for (target_var in test_vars) {
    # Determine which mapping to use
    if (target_var %in% names(mean_mappings)) {
      source_vars <- mean_mappings[[target_var]]
    } else if (target_var %in% names(additional_means)) {
      source_vars <- additional_means[[target_var]]
    } else {
      next
    }

    # Check if all source variables exist
    if (!all(source_vars %in% names(df))) {
      next
    }

    # Get values
    source_vals <- df[i, source_vars]
    target_val <- df[i, target_var]

    # Convert to numeric for calculation
    source_nums <- as.numeric(ifelse(source_vals == "" | is.na(source_vals) | source_vals == "NA", NA, source_vals))
    target_num <- as.numeric(ifelse(is.na(target_val), NA, target_val))

    # Calculate expected mean (ignoring NA)
    expected <- mean(source_nums, na.rm = TRUE)
    if (all(is.na(source_nums))) {
      expected <- NA
    }

    # Check if calculation is correct
    match <- if (!is.na(expected) && !is.na(target_num)) {
      abs(expected - target_num) < 0.0001  # Allow for floating point precision
    } else {
      is.na(expected) && is.na(target_num)
    }

    cat(sprintf("  %s:\n", target_var))
    cat(sprintf("    Source variables: %s\n", paste(source_vars, collapse = ", ")))
    cat(sprintf("    Source values: %s\n", paste(ifelse(is.na(source_vals) | source_vals == "", "NA/empty", source_vals), collapse = ", ")))
    cat(sprintf("    %s = %s\n", target_var, ifelse(is.na(target_val), "NA", round(target_val, 4))))
    cat(sprintf("    Expected mean: %s\n", ifelse(is.na(expected), "NA", round(expected, 4))))
    cat(sprintf("    Match: %s\n\n", ifelse(match, "✓", "✗ ERROR")))
  }
}

# =============================================================================
# 6. SAVE UPDATED DATA
# =============================================================================

write.csv(df, "eohi3.csv", row.names = FALSE, na = "")
cat("Updated data saved to: eohi3.csv\n")
cat(paste("Total rows:", nrow(df), "\n"))
cat(paste("Total columns:", ncol(df), "\n"))

462
eohi3/datap 07 - scales and recodes.r
Normal file
@ -0,0 +1,462 @@
library(dplyr)

setwd("/home/ladmin/Documents/DND/EOHI/eohi3")

# Read the data (with check.names = FALSE to preserve original column names)
# Keep empty cells as empty strings, not NA
# Only convert the literal string "NA" to NA, not empty strings
df <- read.csv("eohi3.csv", stringsAsFactors = FALSE, check.names = FALSE, na.strings = "NA")

# =============================================================================
# 1. CREATE BACKUP
# =============================================================================
file.copy("eohi3.csv", "eohi3_2.csv", overwrite = TRUE)

# =============================================================================
# HELPER FUNCTION: Check variable existence and values
# =============================================================================

# Both helpers read the global data frame `df` rather than taking it as an argument.
check_vars_exist <- function(source_vars, target_vars) {
  missing_source <- setdiff(source_vars, names(df))
  missing_target <- setdiff(target_vars, names(df))

  if (length(missing_source) > 0) {
    stop(paste("Missing source variables:", paste(missing_source, collapse = ", ")))
  }
  if (length(missing_target) > 0) {
    stop(paste("Missing target variables:", paste(missing_target, collapse = ", ")))
  }
  return(TRUE)
}

check_values_exist <- function(var_name, expected_values) {
  unique_vals <- unique(df[[var_name]])
  unique_vals <- unique_vals[!is.na(unique_vals) & unique_vals != ""]
  missing_vals <- setdiff(expected_values, unique_vals)
  extra_vals <- setdiff(unique_vals, expected_values)

  if (length(missing_vals) > 0) {
    cat(paste("  ⚠ Expected values not found in", var_name, ":", paste(missing_vals, collapse = ", "), "\n"))
  }
  if (length(extra_vals) > 0) {
    cat(paste("  ⚠ Unexpected values found in", var_name, ":", paste(extra_vals, collapse = ", "), "\n"))
  }
  return(list(missing = missing_vals, extra = extra_vals))
}

# =============================================================================
# 2. RECODE other_length2 TO other_length
# =============================================================================

cat("\n=== 1. RECODING other_length2 TO other_length ===\n\n")

# Check variables exist
check_vars_exist("other_length2", "other_length")

# Check values in source
cat("Checking source variable values...\n")
length_vals <- unique(df$other_length2[!is.na(df$other_length2) & df$other_length2 != ""])
cat(paste("  Unique values in other_length2:", paste(length_vals, collapse = ", "), "\n"))

# Recode - handle "20+" as a special case first, then convert to numeric for ranges
# Convert to numeric once, suppressing warnings for non-numeric values
num_length <- suppressWarnings(as.numeric(df$other_length2))

df$other_length <- ifelse(
  is.na(df$other_length2),
  NA,
  ifelse(
    df$other_length2 == "",
    "",
    ifelse(
      df$other_length2 == "20+",
      "20+",
      ifelse(
        !is.na(num_length) & num_length >= 5 & num_length <= 9,
        "5-9",
        ifelse(
          !is.na(num_length) & num_length >= 10 & num_length <= 14,
          "10-14",
          ifelse(
            !is.na(num_length) & num_length >= 15 & num_length <= 19,
            "15-19",
            NA
          )
        )
      )
    )
  )
)

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(123)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  source_val <- df$other_length2[i]
  target_val <- df$other_length[i]
  cat(sprintf("  Row %d: other_length2 = %s -> other_length = %s\n",
              i, ifelse(is.na(source_val), "NA", ifelse(source_val == "", "empty", source_val)),
              ifelse(is.na(target_val), "NA", ifelse(target_val == "", "empty", target_val))))
}
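# Sketch (illustrative, not part of the pipeline): the nested ifelse() above
# is equivalent to a single dplyr::case_when(), where the first matching
# condition wins. Toy values only; the *_toy names are hypothetical and are
# not used anywhere else in the script.
other_length2_toy <- c(NA, "", "3", "7", "12", "17", "20+")
num_toy <- suppressWarnings(as.numeric(other_length2_toy))
other_length_toy <- case_when(
  is.na(other_length2_toy)      ~ NA_character_,
  other_length2_toy == ""       ~ "",
  other_length2_toy == "20+"    ~ "20+",
  num_toy >= 5  & num_toy <= 9  ~ "5-9",
  num_toy >= 10 & num_toy <= 14 ~ "10-14",
  num_toy >= 15 & num_toy <= 19 ~ "15-19",
  TRUE                          ~ NA_character_
)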
# =============================================================================
# 3. RECODE other_like2 TO other_like
# =============================================================================

cat("\n=== 2. RECODING other_like2 TO other_like ===\n\n")

# Check variables exist
check_vars_exist("other_like2", "other_like")

# Check expected values exist
expected_like <- c("Dislike a great deal", "Dislike somewhat", "Neither like nor dislike",
                   "Like somewhat", "Like a great deal")
check_values_exist("other_like2", expected_like)

# Recode
df$other_like <- ifelse(
  is.na(df$other_like2),
  NA,
  ifelse(
    df$other_like2 == "",
    "",
    ifelse(
      df$other_like2 == "Dislike a great deal",
      "-2",
      ifelse(
        df$other_like2 == "Dislike somewhat",
        "-1",
        ifelse(
          df$other_like2 == "Neither like nor dislike",
          "0",
          ifelse(
            df$other_like2 == "Like somewhat",
            "1",
            ifelse(
              df$other_like2 == "Like a great deal",
              "2",
              NA
            )
          )
        )
      )
    )
  )
)

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(456)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  source_val <- df$other_like2[i]
  target_val <- df$other_like[i]
  cat(sprintf("  Row %d: other_like2 = %s -> other_like = %s\n",
              i, ifelse(is.na(source_val), "NA", ifelse(source_val == "", "empty", source_val)),
              ifelse(is.na(target_val), "NA", ifelse(target_val == "", "empty", target_val))))
}
# =============================================================================
# 4. CALCULATE aot_total
# =============================================================================

cat("\n=== 3. CALCULATING aot_total ===\n\n")

# Check variables exist
aot_vars <- c("aot01", "aot02", "aot03", "aot04_r", "aot05_r", "aot06_r", "aot07_r", "aot08")
check_vars_exist(aot_vars, "aot_total")

# Reverse code aot04_r through aot07_r
reverse_vars <- c("aot04_r", "aot05_r", "aot06_r", "aot07_r")
for (var in reverse_vars) {
  df[[paste0(var, "_reversed")]] <- as.numeric(ifelse(
    df[[var]] == "" | is.na(df[[var]]),
    NA,
    as.numeric(df[[var]]) * -1
  ))
}

# Calculate mean of all 8 variables (4 reversed + 4 original)
all_aot_vars <- c("aot01", "aot02", "aot03", "aot04_r_reversed", "aot05_r_reversed",
                  "aot06_r_reversed", "aot07_r_reversed", "aot08")

# Convert to numeric matrix
aot_matrix <- df[, all_aot_vars]
aot_numeric <- apply(aot_matrix, 2, function(x) {
  as.numeric(ifelse(x == "" | is.na(x), NA, x))
})

# Calculate mean
df$aot_total <- rowMeans(aot_numeric, na.rm = TRUE)

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(789)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  aot_vals <- df[i, all_aot_vars]
  aot_nums <- as.numeric(ifelse(aot_vals == "" | is.na(aot_vals), NA, aot_vals))
  expected_mean <- mean(aot_nums, na.rm = TRUE)
  actual_mean <- df$aot_total[i]
  cat(sprintf("  Row %d: aot_total = %s (expected: %s)\n",
              i, ifelse(is.na(actual_mean), "NA", round(actual_mean, 4)),
              ifelse(is.na(expected_mean), "NA", round(expected_mean, 4))))
}
# =============================================================================
# 5. PROCESS CRT QUESTIONS
# =============================================================================

cat("\n=== 4. PROCESSING CRT QUESTIONS ===\n\n")

# Check variables exist
check_vars_exist(c("crt01", "crt02", "crt03"), c("crt_correct", "crt_int"))

# Initialize CRT variables
df$crt_correct <- 0
df$crt_int <- 0

# CRT01: "5 cents" = correct, "10 cents" = intuitive, anything else scores neither
df$crt_correct <- ifelse(df$crt01 == "5 cents", 1, df$crt_correct)
df$crt_int <- ifelse(df$crt01 == "10 cents", 1, df$crt_int)

# CRT02: "5 minutes" = correct, "100 minutes" = intuitive
df$crt_correct <- ifelse(df$crt02 == "5 minutes", df$crt_correct + 1, df$crt_correct)
df$crt_int <- ifelse(df$crt02 == "100 minutes", df$crt_int + 1, df$crt_int)

# CRT03: "47 days" = correct, "24 days" = intuitive
df$crt_correct <- ifelse(df$crt03 == "47 days", df$crt_correct + 1, df$crt_correct)
df$crt_int <- ifelse(df$crt03 == "24 days", df$crt_int + 1, df$crt_int)

# Check expected values exist
expected_crt01 <- c("5 cents", "10 cents")
expected_crt02 <- c("5 minutes", "100 minutes")
expected_crt03 <- c("47 days", "24 days")
check_values_exist("crt01", expected_crt01)
check_values_exist("crt02", expected_crt02)
check_values_exist("crt03", expected_crt03)

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(1011)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  cat(sprintf("  Row %d:\n", i))
  cat(sprintf("    crt01 = %s -> crt_correct = %d, crt_int = %d\n",
              ifelse(is.na(df$crt01[i]) || df$crt01[i] == "", "NA/empty", df$crt01[i]),
              ifelse(df$crt01[i] == "5 cents", 1, 0),
              ifelse(df$crt01[i] == "10 cents", 1, 0)))
  cat(sprintf("    crt02 = %s -> crt_correct = %d, crt_int = %d\n",
              ifelse(is.na(df$crt02[i]) || df$crt02[i] == "", "NA/empty", df$crt02[i]),
              ifelse(df$crt02[i] == "5 minutes", 1, 0),
              ifelse(df$crt02[i] == "100 minutes", 1, 0)))
  cat(sprintf("    crt03 = %s -> crt_correct = %d, crt_int = %d\n",
              ifelse(is.na(df$crt03[i]) || df$crt03[i] == "", "NA/empty", df$crt03[i]),
              ifelse(df$crt03[i] == "47 days", 1, 0),
              ifelse(df$crt03[i] == "24 days", 1, 0)))
  cat(sprintf("    Total: crt_correct = %d, crt_int = %d\n\n",
              df$crt_correct[i], df$crt_int[i]))
}
# =============================================================================
# 6. CALCULATE icar_verbal
# =============================================================================

cat("\n=== 5. CALCULATING icar_verbal ===\n\n")

# Check variables exist
verbal_vars <- c("verbal01", "verbal02", "verbal03", "verbal04", "verbal05")
check_vars_exist(verbal_vars, "icar_verbal")

# Correct answers
correct_verbal <- c("5", "8", "It's impossible to tell", "47", "Sunday")

# Calculate proportion correct
verbal_responses <- df[, verbal_vars]
correct_count <- rowSums(
  sapply(1:5, function(i) {
    verbal_responses[, i] == correct_verbal[i]
  }),
  na.rm = TRUE
)
df$icar_verbal <- correct_count / 5

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(1213)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  responses <- df[i, verbal_vars]
  correct <- sum(sapply(1:5, function(j) responses[j] == correct_verbal[j]), na.rm = TRUE)
  prop <- correct / 5
  cat(sprintf("  Row %d: Correct = %d/5, icar_verbal = %s\n",
              i, correct, round(prop, 4)))
}
# =============================================================================
# 7. CALCULATE icar_matrix
# =============================================================================

cat("\n=== 6. CALCULATING icar_matrix ===\n\n")

# Check variables exist
matrix_vars <- c("matrix01", "matrix02", "matrix03", "matrix04", "matrix05")
check_vars_exist(matrix_vars, "icar_matrix")

# Correct answers
correct_matrix <- c("D", "E", "B", "B", "D")

# Calculate proportion correct
matrix_responses <- df[, matrix_vars]
correct_count <- rowSums(
  sapply(1:5, function(i) {
    matrix_responses[, i] == correct_matrix[i]
  }),
  na.rm = TRUE
)
df$icar_matrix <- correct_count / 5

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(1415)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  responses <- df[i, matrix_vars]
  correct <- sum(sapply(1:5, function(j) responses[j] == correct_matrix[j]), na.rm = TRUE)
  prop <- correct / 5
  cat(sprintf("  Row %d: Correct = %d/5, icar_matrix = %s\n",
              i, correct, round(prop, 4)))
}
# =============================================================================
# 8. CALCULATE icar_total
# =============================================================================

cat("\n=== 7. CALCULATING icar_total ===\n\n")

# Check variables exist
check_vars_exist(c(verbal_vars, matrix_vars), "icar_total")

# Calculate proportion correct across all 10 items
all_correct <- c(correct_verbal, correct_matrix)
all_responses <- df[, c(verbal_vars, matrix_vars)]
correct_count <- rowSums(
  sapply(1:10, function(i) {
    all_responses[, i] == all_correct[i]
  }),
  na.rm = TRUE
)
df$icar_total <- correct_count / 10

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(1617)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  responses <- df[i, c(verbal_vars, matrix_vars)]
  correct <- sum(sapply(1:10, function(j) responses[j] == all_correct[j]), na.rm = TRUE)
  prop <- correct / 10
  cat(sprintf("  Row %d: Correct = %d/10, icar_total = %s\n",
              i, correct, round(prop, 4)))
}
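# Sketch (illustrative, not part of the pipeline): sections 6-8 all repeat the
# same proportion-correct pattern (compare each column to a key, sum row-wise,
# divide by item count). A hypothetical helper capturing it; prop_correct is
# not defined or used anywhere else in the script.
prop_correct <- function(responses, key) {
  # responses: data.frame of character answers; key: one correct answer per column
  hits <- sapply(seq_along(key), function(i) responses[[i]] == key[[i]])
  rowSums(hits, na.rm = TRUE) / length(key)
}
# e.g. df$icar_verbal could equivalently be prop_correct(df[, verbal_vars], correct_verbal)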
# =============================================================================
# 9. RECODE demo_sex TO sex
# =============================================================================

cat("\n=== 8. RECODING demo_sex TO sex ===\n\n")

# Check variables exist
check_vars_exist("demo_sex", "sex")

# Check values
sex_vals <- unique(df$demo_sex[!is.na(df$demo_sex) & df$demo_sex != ""])
cat(paste("  Unique values in demo_sex:", paste(sex_vals, collapse = ", "), "\n"))

# Recode: male = 0, female = 1, else = 2
df$sex <- ifelse(
  is.na(df$demo_sex) | df$demo_sex == "",
  NA,
  ifelse(
    tolower(df$demo_sex) == "male",
    0,
    ifelse(
      tolower(df$demo_sex) == "female",
      1,
      2
    )
  )
)

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(1819)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  source_val <- df$demo_sex[i]
  target_val <- df$sex[i]
  cat(sprintf("  Row %d: demo_sex = %s -> sex = %s\n",
              i, ifelse(is.na(source_val) || source_val == "", "NA/empty", source_val),
              ifelse(is.na(target_val), "NA", target_val)))
}
# =============================================================================
# 10. RECODE demo_edu TO education
# =============================================================================

cat("\n=== 9. RECODING demo_edu TO education ===\n\n")

# Check variables exist
check_vars_exist("demo_edu", "education")

# Check values
edu_vals <- unique(df$demo_edu[!is.na(df$demo_edu) & df$demo_edu != ""])
cat(paste("  Unique values in demo_edu:", paste(edu_vals, collapse = ", "), "\n"))

# Recode
df$education <- ifelse(
  is.na(df$demo_edu) | df$demo_edu == "",
  NA,
  ifelse(
    df$demo_edu %in% c("High School (or equivalent)", "Trade School"),
    "HS_TS",
    ifelse(
      df$demo_edu %in% c("College Diploma/Certificate", "University - Undergraduate"),
      "C_Ug",
      ifelse(
        df$demo_edu %in% c("University - Graduate (Masters)", "University - PhD", "Professional Degree (ex. JD/MD)"),
        "grad_prof",
        NA
      )
    )
  )
)

# Convert to ordered factor
df$education <- factor(df$education,
                       levels = c("HS_TS", "C_Ug", "grad_prof"),
                       ordered = TRUE)

# Verification check
cat("\nVerification (5 random rows):\n")
set.seed(2021)
sample_rows <- sample(1:nrow(df), min(5, nrow(df)))
for (i in sample_rows) {
  source_val <- df$demo_edu[i]
  target_val <- df$education[i]
  cat(sprintf("  Row %d: demo_edu = %s -> education = %s\n",
              i, ifelse(is.na(source_val) || source_val == "", "NA/empty", source_val),
              ifelse(is.na(target_val), "NA", as.character(target_val))))
}
# =============================================================================
# 11. SAVE UPDATED DATA
# =============================================================================
# COMMENTED OUT: Uncomment when ready to save

# write.csv(df, "eohi3.csv", row.names = FALSE, na = "")
# cat("\nUpdated data saved to: eohi3.csv\n")
# cat(paste("Total rows:", nrow(df), "\n"))
# cat(paste("Total columns:", ncol(df), "\n"))
1046
eohi3/eohi3.csv
File diff suppressed because one or more lines are too long
501
eohi3/knit/DA01_anova_DS.Rmd
Normal file
@ -0,0 +1,501 @@
---
title: "Mixed ANOVA - Domain Specific Means (DA01)"
author: ""
date: "`r Sys.Date()`"
output:
  html_document:
    toc: true
    toc_float: true
    code_folding: hide
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = TRUE)
```

# Setup

```{r libraries}
library(tidyverse)
library(rstatix)
library(emmeans)
library(effectsize)
library(afex)
library(car)

options(scipen = 999)
afex::set_sum_contrasts()
```
# Data

```{r read-data}
# Data file is in parent of knit/ (eohi3/eohi3.csv)
df <- read.csv(
  "/home/ladmin/Documents/DND/EOHI/eohi3/eohi3.csv",
  stringsAsFactors = FALSE,
  check.names = FALSE,
  na.strings = "NA"
)

between_vars <- c("perspective", "temporalDO")
within_vars <- c(
  "past_pref_MEAN", "past_pers_MEAN", "past_val_MEAN",
  "fut_pref_MEAN", "fut_pers_MEAN", "fut_val_MEAN"
)

missing_vars <- setdiff(c(between_vars, within_vars, "pID"), names(df))
if (length(missing_vars) > 0) {
  stop(paste("Missing required variables:", paste(missing_vars, collapse = ", ")))
}

anova_data <- df %>%
  select(pID, all_of(between_vars), all_of(within_vars)) %>%
  filter(
    !is.na(perspective), perspective != "",
    !is.na(temporalDO), temporalDO != ""
  )
```
# Long format

```{r long-format}
long_data <- anova_data %>%
  pivot_longer(
    cols = all_of(within_vars),
    names_to = "variable",
    values_to = "MEAN_SCORE"
  ) %>%
  mutate(
    time = case_when(
      grepl("^past_", variable) ~ "past",
      grepl("^fut_", variable) ~ "fut",
      TRUE ~ NA_character_
    ),
    domain = case_when(
      grepl("_pref_MEAN$", variable) ~ "pref",
      grepl("_pers_MEAN$", variable) ~ "pers",
      grepl("_val_MEAN$", variable) ~ "val",
      TRUE ~ NA_character_
    )
  ) %>%
  mutate(
    TIME = factor(time, levels = c("past", "fut")),
    DOMAIN = factor(domain, levels = c("pref", "pers", "val")),
    perspective = factor(perspective),
    temporalDO = factor(temporalDO),
    pID = factor(pID)
  ) %>%
  select(pID, perspective, temporalDO, TIME, DOMAIN, MEAN_SCORE) %>%
  filter(!is.na(MEAN_SCORE))
```
# Descriptive statistics

```{r desc-stats}
desc_stats <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    mean = round(mean(MEAN_SCORE), 5),
    variance = round(var(MEAN_SCORE), 5),
    sd = round(sd(MEAN_SCORE), 5),
    median = round(median(MEAN_SCORE), 5),
    q1 = round(quantile(MEAN_SCORE, 0.25), 5),
    q3 = round(quantile(MEAN_SCORE, 0.75), 5),
    min = round(min(MEAN_SCORE), 5),
    max = round(max(MEAN_SCORE), 5),
    .groups = "drop"
  )

# Show all rows and columns (no truncation)
options(tibble.width = Inf)
print(desc_stats, n = Inf)
```

Interpretations:
1. Mean and median values are similar, indicating the distributions are relatively symmetric with minimal skew. Any outliers are not extreme.
2. The highest-to-lowest group n ratio is 1.14 (139/122), an acceptable ratio for ANOVA (under 1.5).
3. The highest-to-lowest group variance ratio is 1.67 (7.93/4.74), an acceptable ratio for ANOVA (under 4).
For the sake of consistency w/ the other EHI studies, I also calculated Hartley's F-max ratio.
The conservative F-max critical value is 1.60, which is still higher than the highest observed F-max ratio of 1.53.
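Hartley's F-max mentioned above is simply the largest cell variance divided by the smallest. A minimal sketch, with toy variances standing in for the `variance` column of `desc_stats`:

```r
# Hartley's F-max: ratio of largest to smallest group variance
cell_var <- c(7.93, 4.74, 6.10, 5.20)  # toy stand-ins for desc_stats$variance
fmax <- max(cell_var) / min(cell_var)
round(fmax, 2)  # 1.67
```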
# Assumption checks

## Missing values

```{r missing}
missing_summary <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n_total = n(),
    n_missing = sum(is.na(MEAN_SCORE)),
    pct_missing = round(100 * n_missing / n_total, 2),
    .groups = "drop"
  )

print(missing_summary, n = Inf)
```

No missing values, as expected (rows with missing MEAN_SCORE were already dropped when building long_data).
## Outliers

```{r outliers}
outlier_summary <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    n_outliers = sum(abs(scale(MEAN_SCORE)) > 3),
    .groups = "drop"
  )

print(outlier_summary, n = Inf)
```

No outliers (|z| > 3) present. Good.
## Homogeneity of variance

```{r homogeneity}
homogeneity_between <- long_data %>%
  group_by(TIME, DOMAIN) %>%
  rstatix::levene_test(MEAN_SCORE ~ perspective * temporalDO)

print(homogeneity_between, n = Inf)
```

Levene's test is significant for two cells: fut-pers and fut-val.
However, variance ratios and F-max tests show that any heteroscedasticity is mild.
## Normality (within-subjects residuals)

```{r normality}
resid_within <- long_data %>%
  group_by(pID) %>%
  mutate(person_mean = mean(MEAN_SCORE, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(resid = MEAN_SCORE - person_mean) %>%
  pull(resid)
resid_within <- resid_within[!is.na(resid_within)]

n_resid <- length(resid_within)
if (n_resid < 3L) {
  message("Too few within-subjects residuals (n < 3); skipping Shapiro-Wilk.")
} else {
  resid_for_shapiro <- if (n_resid > 5000L) {
    set.seed(1L)
    sample(resid_within, 5000L)
  } else {
    resid_within
  }
  print(shapiro.test(resid_for_shapiro))
}
```

### Q-Q plot

```{r qqplot, fig.height = 4}
qqnorm(resid_within)
qqline(resid_within)
```

Shapiro-Wilk is significant, but the test is sensitive to large sample sizes.
The Q-Q plot shows that the centre of the residual distribution is approximately normal, with some tail heaviness.
ANOVA is robust to violations of normality w/ a large sample size.

Overall, the ANOVA can proceed.
# Mixed ANOVA

```{r anova}
aov_afex <- aov_ez(
  id = "pID",
  dv = "MEAN_SCORE",
  data = long_data,
  between = c("perspective", "temporalDO"),
  within = c("TIME", "DOMAIN"),
  type = 3
)

cat("\n--- ANOVA Table (Type 3, uncorrected) ---\n")
print(nice(aov_afex, correction = "none"))
cat("\n--- ANOVA Table (Type 3, Greenhouse–Geisser correction) ---\n")
print(nice(aov_afex, correction = "GG"))
```

Mauchly's test of sphericity is significant for the DOMAIN main effect and its interactions (except with TIME). Use the GG correction for these:
- DOMAIN: F(1.94, 1004.66) = 0.63, MSE = 1.21, ges < .001, p = .529
- perspective:DOMAIN: F(1.94, 1004.66) = 7.79, MSE = 1.21, ges < .001, p < .001 ***
- temporalDO:DOMAIN: F(1.94, 1004.66) = 0.76, MSE = 1.21, ges < .001, p = .466
- perspective:temporalDO:DOMAIN: F(1.94, 1004.66) = 0.17, MSE = 1.21, ges < .001, p = .837

The following main effects and interactions are significant:
- TIME: F(1, 518) = 10.05, MSE = 1.86, ges < .001, p = .002 **
- temporalDO:TIME: F(1, 518) = 9.81, MSE = 1.86, ges < .001, p = .002 **
- perspective:temporalDO:TIME: F(1, 518) = 7.91, MSE = 1.86, ges < .001, p = .005 **
- perspective:DOMAIN: F(1.94, 1004.66) = 7.79, MSE = 1.21, ges < .001, p < .001 *** (GG corrected)
- perspective:TIME:DOMAIN: F(2, 1036) = 3.58, MSE = 0.75, ges < .001, p = .028 *
- perspective:temporalDO:TIME:DOMAIN: F(2, 1036) = 3.11, MSE = 0.75, ges < .001, p = .045 *
# Mauchly and epsilon

```{r mauchly}
anova_wide <- anova_data %>%
  select(pID, perspective, temporalDO, all_of(within_vars)) %>%
  filter(if_all(all_of(within_vars), ~ !is.na(.)))
response_matrix <- as.matrix(anova_wide[, within_vars])
rm_model <- lm(response_matrix ~ perspective * temporalDO, data = anova_wide)
idata <- data.frame(
  TIME = factor(rep(c("past", "fut"), each = 3), levels = c("past", "fut")),
  DOMAIN = factor(rep(c("pref", "pers", "val"), 2), levels = c("pref", "pers", "val"))
)
rm_anova <- car::Anova(rm_model, idata = idata, idesign = ~ TIME * DOMAIN, type = 3)
rm_summary <- summary(rm_anova, multivariate = FALSE)
if (!is.null(rm_summary$sphericity.tests)) {
  cat("\nMauchly's Test of Sphericity:\n")
  print(rm_summary$sphericity.tests)
}
if (!is.null(rm_summary$epsilon)) {
  cat("\nEpsilon (GG, HF):\n")
  print(rm_summary$epsilon)
}
```
# Post-hoc comparisons

## TIME (main effect)

```{r posthoc-TIME}
emm_TIME <- emmeans(aov_afex, ~ TIME)
print(pairs(emm_TIME, adjust = "bonferroni"))
```

The pairwise comparison provides support for the EHI effect.
## temporalDO:TIME

```{r posthoc-temporalDO-TIME}
emm_temporalDO_TIME <- emmeans(aov_afex, ~ TIME | temporalDO)
print(pairs(emm_temporalDO_TIME, adjust = "bonferroni"))
```

The contrast is significant only for the past-first temporal display order.
When grouped by TIME instead of temporalDO, no contrasts are significant.
## perspective:temporalDO:TIME

```{r posthoc-pt-TIME}
emm_pt_TIME <- emmeans(aov_afex, ~ TIME | perspective + temporalDO)
print(pairs(emm_pt_TIME, adjust = "bonferroni"))
```

The EHI is significant only for the self perspective with the past-first temporal display order.

When grouped by perspective or temporalDO instead of TIME, no contrasts are significant.
## perspective:DOMAIN

```{r posthoc-perspective-DOMAIN}
emm_perspective_DOMAIN <- emmeans(aov_afex, ~ perspective | DOMAIN)
print(pairs(emm_perspective_DOMAIN, adjust = "bonferroni"))

emm_perspective_DOMAIN_domain <- emmeans(aov_afex, ~ DOMAIN | perspective)
print(pairs(emm_perspective_DOMAIN_domain, adjust = "bonferroni"))
```

Significance is driven by the change from preferences to values in the "other" perspective.
## perspective:TIME:DOMAIN

```{r posthoc-pt-DOMAIN}
emm_pt_TIME_domain <- emmeans(aov_afex, ~ TIME | perspective + DOMAIN)
print(pairs(emm_pt_TIME_domain, adjust = "bonferroni"))
```

EHI effects are present for the other perspective in the preferences domain and for the self perspective in the values domain.
The estimate is higher in the self perspective than in the other perspective.

```{r posthoc-pt-DOMAIN-domain}
emm_pt_domain_domain <- emmeans(aov_afex, ~ DOMAIN | perspective + TIME)
print(pairs(emm_pt_domain_domain, adjust = "bonferroni"))
```

Significant contrasts are driven by domain changes from preferences to values in the self vs other perspectives, in the past-oriented questions.
Trends reverse depending on perspective: values have higher estimates than preferences in the self perspective, but lower estimates than preferences in the other perspective.
## perspective:temporalDO:TIME:DOMAIN

```{r posthoc-ptt-TIME}
emm_ptt_TIME <- emmeans(aov_afex, ~ TIME | perspective + temporalDO + DOMAIN)
print(pairs(emm_ptt_TIME, adjust = "bonferroni"))
```

EHI effects are present for three contrasts:
- past - fut: estimate = 0.2806, SE = 0.118, df = 518, t = 2.380, p = .0177 (other perspective, preferences domain, past-first temporal display order)
- past - fut: estimate = 0.4358, SE = 0.138, df = 518, t = 3.156, p = .0017 (self perspective, personality domain, past-first temporal display order)
- past - fut: estimate = 0.7276, SE = 0.141, df = 518, t = 5.169, p < .0001 (self perspective, values domain, past-first temporal display order)

A reverse EHI effect is present for one contrast:
- past - fut: estimate = -0.2367, SE = 0.118, df = 518, t = -2.001, p = .0459 (self perspective, preferences domain, future-first temporal display order)

```{r posthoc-ptt-perspective}
emm_ptt_perspective <- emmeans(aov_afex, ~ perspective | temporalDO + TIME + DOMAIN)
print(pairs(emm_ptt_perspective, adjust = "bonferroni"))
```

One significant contrast:
- other - self: estimate = -0.6972, SE = 0.314, df = 518, t = -2.220, p = .0268 (values domain, past-oriented questions, past-first temporal display order)

Not really of theoretical interest, but it speaks to the perspective:TIME:DOMAIN interaction.

No significant contrasts when grouped by temporalDO instead of TIME or perspective.
## Cohen's d (significant contrasts only)

```{r cohens-d-significant}
d_data <- anova_data %>%
  mutate(
    past_mean = (past_pref_MEAN + past_pers_MEAN + past_val_MEAN) / 3,
    fut_mean = (fut_pref_MEAN + fut_pers_MEAN + fut_val_MEAN) / 3,
    pref_mean = (past_pref_MEAN + fut_pref_MEAN) / 2,
    pers_mean = (past_pers_MEAN + fut_pers_MEAN) / 2,
    val_mean = (past_val_MEAN + fut_val_MEAN) / 2
  )

cohens_d_results <- tibble(
  contrast = character(),
  condition = character(),
  d = double()
)

# TIME main: past vs fut
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "overall",
    d = suppressMessages(effectsize::cohens_d(d_data$past_mean, d_data$fut_mean, paired = TRUE)$Cohens_d)
  )
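# Sketch (illustrative): the add_row() calls in this chunk all repeat the same
# filter-then-paired-d pattern. A hypothetical paired_d helper (mirroring the
# effectsize::cohens_d(..., paired = TRUE) call used throughout) could shorten
# them; paired_d is not otherwise defined or used in this document.
paired_d <- function(x, y) {
  suppressMessages(effectsize::cohens_d(x, y, paired = TRUE)$Cohens_d)
}
# e.g. d = paired_d(d_data$past_mean, d_data$fut_mean)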
# temporalDO:TIME: past vs fut for temporalDO = past
d_past_tdo <- d_data %>% filter(temporalDO == "past")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "temporalDO = past",
    d = suppressMessages(effectsize::cohens_d(d_past_tdo$past_mean, d_past_tdo$fut_mean, paired = TRUE)$Cohens_d)
  )

# perspective:temporalDO:TIME: past vs fut for self, past temporalDO
d_self_past <- d_data %>% filter(perspective == "self", temporalDO == "past")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "self, temporalDO = past",
    d = suppressMessages(effectsize::cohens_d(d_self_past$past_mean, d_self_past$fut_mean, paired = TRUE)$Cohens_d)
  )

# perspective:DOMAIN: pref vs val for perspective = other
d_other <- d_data %>% filter(perspective == "other")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - val)",
    condition = "perspective = other",
    d = suppressMessages(effectsize::cohens_d(d_other$pref_mean, d_other$val_mean, paired = TRUE)$Cohens_d)
  )

# perspective:TIME:DOMAIN - TIME: other, pref
d_other_pref <- d_data %>% filter(perspective == "other")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "other, pref domain",
    d = suppressMessages(effectsize::cohens_d(d_other_pref$past_pref_MEAN, d_other_pref$fut_pref_MEAN, paired = TRUE)$Cohens_d)
  )

# perspective:TIME:DOMAIN - TIME: self, val
d_self_val <- d_data %>% filter(perspective == "self")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "self, val domain",
    d = suppressMessages(effectsize::cohens_d(d_self_val$past_val_MEAN, d_self_val$fut_val_MEAN, paired = TRUE)$Cohens_d)
  )

# perspective:TIME:DOMAIN - DOMAIN: other, past TIME
d_other_past <- d_data %>% filter(perspective == "other")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - val)",
    condition = "other, past TIME",
    d = suppressMessages(effectsize::cohens_d(d_other_past$past_pref_MEAN, d_other_past$past_val_MEAN, paired = TRUE)$Cohens_d)
  )

# perspective:TIME:DOMAIN - DOMAIN: self, past TIME: pref - pers
d_self_past_t <- d_data %>% filter(perspective == "self")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - pers)",
    condition = "self, past TIME",
    d = suppressMessages(effectsize::cohens_d(d_self_past_t$past_pref_MEAN, d_self_past_t$past_pers_MEAN, paired = TRUE)$Cohens_d)
  )

# perspective:TIME:DOMAIN - DOMAIN: self, past TIME: pref - val
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - val)",
    condition = "self, past TIME",
    d = suppressMessages(effectsize::cohens_d(d_self_past_t$past_pref_MEAN, d_self_past_t$past_val_MEAN, paired = TRUE)$Cohens_d)
  )

# 4-way TIME: self, future temporalDO, pref (reverse EHI)
d_self_fut_pref <- d_data %>% filter(perspective == "self", temporalDO == "future")
|
||||
cohens_d_results <- cohens_d_results %>%
|
||||
add_row(
|
||||
contrast = "TIME (past - fut) [reverse]",
|
||||
condition = "self, future temporalDO, pref domain",
|
||||
d = suppressMessages(effectsize::cohens_d(d_self_fut_pref$past_pref_MEAN, d_self_fut_pref$fut_pref_MEAN, paired = TRUE)$Cohens_d)
|
||||
)
|
||||
|
||||
# 4-way TIME: other, past temporalDO, pref
|
||||
d_other_past_pref <- d_data %>% filter(perspective == "other", temporalDO == "past")
|
||||
cohens_d_results <- cohens_d_results %>%
|
||||
add_row(
|
||||
contrast = "TIME (past - fut)",
|
||||
condition = "other, past temporalDO, pref domain",
|
||||
d = suppressMessages(effectsize::cohens_d(d_other_past_pref$past_pref_MEAN, d_other_past_pref$fut_pref_MEAN, paired = TRUE)$Cohens_d)
|
||||
)
|
||||
|
||||
# 4-way TIME: self, past temporalDO, pers
|
||||
cohens_d_results <- cohens_d_results %>%
|
||||
add_row(
|
||||
contrast = "TIME (past - fut)",
|
||||
condition = "self, past temporalDO, pers domain",
|
||||
d = suppressMessages(effectsize::cohens_d(d_self_past$past_pers_MEAN, d_self_past$fut_pers_MEAN, paired = TRUE)$Cohens_d)
|
||||
)
|
||||
|
||||
# 4-way TIME: self, past temporalDO, val
|
||||
cohens_d_results <- cohens_d_results %>%
|
||||
add_row(
|
||||
contrast = "TIME (past - fut)",
|
||||
condition = "self, past temporalDO, val domain",
|
||||
d = suppressMessages(effectsize::cohens_d(d_self_past$past_val_MEAN, d_self_past$fut_val_MEAN, paired = TRUE)$Cohens_d)
|
||||
)
|
||||
|
||||
# 4-way perspective: past temporalDO, past TIME, val domain (other - self, between-subjects)
|
||||
d_ptt_val <- d_data %>%
|
||||
filter(temporalDO == "past") %>%
|
||||
select(perspective, past_val_MEAN)
|
||||
d_other_ptt <- d_ptt_val %>% filter(perspective == "other") %>% pull(past_val_MEAN)
|
||||
d_self_ptt <- d_ptt_val %>% filter(perspective == "self") %>% pull(past_val_MEAN)
|
||||
cohens_d_results <- cohens_d_results %>%
|
||||
add_row(
|
||||
contrast = "perspective (other - self)",
|
||||
condition = "past temporalDO, past TIME, val domain",
|
||||
d = suppressMessages(effectsize::cohens_d(d_other_ptt, d_self_ptt, paired = FALSE)$Cohens_d)
|
||||
)
|
||||
|
||||
cohens_d_results %>%
|
||||
mutate(d = round(d, 3)) %>%
|
||||
print(n = Inf)
|
||||
```
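
Each paired contrast in this chunk reports the same quantity: a standardized mean of within-person difference scores. With `paired = TRUE`, `effectsize::cohens_d` divides the mean difference by the standard deviation of the differences:

$$
d_z = \frac{\bar{D}}{s_D}, \qquad D_i = x_i - y_i, \qquad s_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(D_i - \bar{D}\right)^2}
$$

The final perspective contrast uses `paired = FALSE`, so it instead standardizes by the pooled standard deviation of the two independent groups.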
2535	eohi3/knit/DA01_anova_DS.html	(new file; diff suppressed because one or more lines are too long)
2345	eohi3/knit/DA02_anova_DGEN.html	(new file; diff suppressed because one or more lines are too long)
434	eohi3/knit/DA02_anova_DGEN.rmd	(new file)
@ -0,0 +1,434 @@
---
title: "Mixed ANOVA - Domain General Vars"
author: ""
date: "`r Sys.Date()`"
output:
  html_document:
    toc: true
    toc_float: true
    code_folding: hide
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = TRUE)
```

# Setup

```{r libraries}
library(tidyverse)
library(rstatix)
library(emmeans)
library(effectsize)
library(afex)
library(car)

options(scipen = 999)
afex::set_sum_contrasts()
```

# Data

```{r read-data}
# Data file is in parent of knit/ (eohi3/eohi3.csv)
df <- read.csv(
  "/home/ladmin/Documents/DND/EOHI/eohi3/eohi3.csv",
  stringsAsFactors = FALSE,
  check.names = FALSE,
  na.strings = "NA"
)

between_vars <- c("perspective", "temporalDO")
within_vars <- c(
  "past_pref_DGEN", "past_pers_DGEN", "past_val_DGEN",
  "fut_pref_DGEN", "fut_pers_DGEN", "fut_val_DGEN"
)

missing_vars <- setdiff(c(between_vars, within_vars, "pID"), names(df))
if (length(missing_vars) > 0) {
  stop(paste("Missing required variables:", paste(missing_vars, collapse = ", ")))
}

anova_data <- df %>%
  select(pID, all_of(between_vars), all_of(within_vars)) %>%
  filter(
    !is.na(perspective), perspective != "",
    !is.na(temporalDO), temporalDO != ""
  )
```

# Long format

```{r long-format}
long_data <- anova_data %>%
  pivot_longer(
    cols = all_of(within_vars),
    names_to = "variable",
    values_to = "DGEN_SCORE"
  ) %>%
  mutate(
    time = case_when(
      grepl("^past_", variable) ~ "past",
      grepl("^fut_", variable) ~ "fut",
      TRUE ~ NA_character_
    ),
    domain = case_when(
      grepl("_pref_DGEN$", variable) ~ "pref",
      grepl("_pers_DGEN$", variable) ~ "pers",
      grepl("_val_DGEN$", variable) ~ "val",
      TRUE ~ NA_character_
    )
  ) %>%
  mutate(
    TIME = factor(time, levels = c("past", "fut")),
    DOMAIN = factor(domain, levels = c("pref", "pers", "val")),
    perspective = factor(perspective),
    temporalDO = factor(temporalDO),
    pID = factor(pID)
  ) %>%
  select(pID, perspective, temporalDO, TIME, DOMAIN, DGEN_SCORE) %>%
  filter(!is.na(DGEN_SCORE))
```

# Descriptive statistics

```{r desc-stats}
desc_stats <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    mean = round(mean(DGEN_SCORE), 5),
    variance = round(var(DGEN_SCORE), 5),
    sd = round(sd(DGEN_SCORE), 5),
    median = round(median(DGEN_SCORE), 5),
    q1 = round(quantile(DGEN_SCORE, 0.25), 5),
    q3 = round(quantile(DGEN_SCORE, 0.75), 5),
    min = round(min(DGEN_SCORE), 5),
    max = round(max(DGEN_SCORE), 5),
    .groups = "drop"
  )

# Show all rows and columns (no truncation)
options(tibble.width = Inf)
print(desc_stats, n = Inf)
```

Interpretation:

1. Means and medians are similar across cells, with slightly more variation than in the domain-specific ANOVA.
2. The ratio of the largest to smallest group n is 1.14 (139/122), acceptable for ANOVA (under 1.5).
3. The ratio of the largest to smallest group variance is 1.40 (9.32/6.65), acceptable for ANOVA (under 4).

For consistency with the other EHI studies, I also calculated Hartley's F-max ratio.
The conservative F-max critical value is 1.60 (the same as in the DS ANOVA, since the number of groups and group sizes are unchanged), which is still above the highest observed F-max ratio of 1.28.
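
As a reminder of the statistic itself (not new output): Hartley's F-max is simply the ratio of the largest to the smallest cell variance, compared against a critical value determined by the number of groups and the per-group degrees of freedom:

$$
F_{\max} = \frac{s^2_{\max}}{s^2_{\min}}
$$

With the observed maximum of 1.28 below the critical value of 1.60, homogeneity of variance is tenable.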

# Assumption checks

## Missing values

```{r missing}
missing_summary <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n_total = n(),
    n_missing = sum(is.na(DGEN_SCORE)),
    pct_missing = round(100 * n_missing / n_total, 2),
    .groups = "drop"
  )

print(missing_summary, n = Inf)
```

No missing values, as expected.

## Outliers

```{r outliers}
outlier_summary <- long_data %>%
  group_by(perspective, temporalDO, TIME, DOMAIN) %>%
  summarise(
    n = n(),
    n_outliers = sum(abs(scale(DGEN_SCORE)) > 3),
    .groups = "drop"
  )

print(outlier_summary, n = Inf)
```

No outliers present. Good. Same as in the DS ANOVA.

## Homogeneity of variance

```{r homogeneity}
homogeneity_between <- long_data %>%
  group_by(TIME, DOMAIN) %>%
  rstatix::levene_test(DGEN_SCORE ~ perspective * temporalDO)

print(homogeneity_between, n = Inf)
```

Levene's test is significant for one cell only: past-val. However, the variance ratios and F-max tests show that any heteroscedasticity is mild.

## Normality (within-subjects residuals)

```{r normality}
resid_within <- long_data %>%
  group_by(pID) %>%
  mutate(person_mean = mean(DGEN_SCORE, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(resid = DGEN_SCORE - person_mean) %>%
  pull(resid)
resid_within <- resid_within[!is.na(resid_within)]

n_resid <- length(resid_within)
if (n_resid < 3L) {
  message("Too few within-subjects residuals (n < 3); skipping Shapiro-Wilk.")
} else {
  # Shapiro-Wilk is limited to 5000 observations; subsample if needed
  resid_for_shapiro <- if (n_resid > 5000L) {
    set.seed(1L)
    sample(resid_within, 5000L)
  } else {
    resid_within
  }
  print(shapiro.test(resid_for_shapiro))
}
```

### Q-Q plot

```{r qqplot, fig.height = 4}
qqnorm(resid_within)
qqline(resid_within)
```

The Shapiro-Wilk test is significant, but it is sensitive to large sample sizes.
The Q-Q plot shows that residuals near the centre of the distribution are approximately normal, with some deviation in the tails.
ANOVA is robust to violations of normality with a large sample size.

Overall, the ANOVA can proceed.

# Mixed ANOVA

```{r anova}
aov_afex <- aov_ez(
  id = "pID",
  dv = "DGEN_SCORE",
  data = long_data,
  between = c("perspective", "temporalDO"),
  within = c("TIME", "DOMAIN"),
  type = 3,
  anova_table = list(correction = "none")
)

print(aov_afex)
```

Mauchly's test of sphericity is not significant. Using uncorrected values for interpretation and analysis.

Significant main effects and interactions:

   Effect                              df       MSE   F         ges    p
4  TIME                                1, 518   3.11  8.39 **   .001   .004
8  DOMAIN                              2, 1036  2.13  7.85 ***  .001   <.001
10 temporalDO:DOMAIN                   2, 1036  2.13  5.00 **   <.001  .007
15 perspective:temporalDO:TIME:DOMAIN  2, 1036  1.52  3.12 *    <.001  .045

# Mauchly and epsilon

```{r mauchly}
anova_wide <- anova_data %>%
  select(pID, perspective, temporalDO, all_of(within_vars)) %>%
  filter(if_all(all_of(within_vars), ~ !is.na(.)))
response_matrix <- as.matrix(anova_wide[, within_vars])
rm_model <- lm(response_matrix ~ perspective * temporalDO, data = anova_wide)
idata <- data.frame(
  TIME = factor(rep(c("past", "fut"), each = 3), levels = c("past", "fut")),
  DOMAIN = factor(rep(c("pref", "pers", "val"), 2), levels = c("pref", "pers", "val"))
)
rm_anova <- car::Anova(rm_model, idata = idata, idesign = ~ TIME * DOMAIN, type = 3)
rm_summary <- summary(rm_anova, multivariate = FALSE)
if (!is.null(rm_summary$sphericity.tests)) {
  print(rm_summary$sphericity.tests)
}
# pval.adjustments holds the GG/HF epsilons and corrected p-values
if (!is.null(rm_summary$pval.adjustments)) {
  print(rm_summary$pval.adjustments)
}
```
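
For context on the epsilon output (a general reminder, not results from this dataset): when sphericity is violated, the Greenhouse-Geisser or Huynh-Feldt estimate $\hat{\varepsilon}$ shrinks both degrees of freedom of the within-subjects F test, and it is bounded between the lower-bound correction $1/(k-1)$ for $k$ repeated-measures levels and 1 (perfect sphericity):

$$
F\!\left(\hat{\varepsilon}\, df_1,\ \hat{\varepsilon}\, df_2\right), \qquad \frac{1}{k-1} \le \hat{\varepsilon} \le 1
$$

Since Mauchly's test was not significant here, the uncorrected degrees of freedom are retained.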

# Post hoc comparisons

## TIME (main effect)

```{r posthoc-TIME}
emm_TIME <- emmeans(aov_afex, ~ TIME)
print(pairs(emm_TIME, adjust = "none"))
```

Supports the presence of an EHI effect.

## DOMAIN (main effect)

```{r posthoc-domain}
emm_DOMAIN <- emmeans(aov_afex, ~ DOMAIN)
print(pairs(emm_DOMAIN, adjust = "tukey"))
```

Only the preference vs. values contrast is significant.

## temporalDO:DOMAIN

```{r posthoc-temporaldo-domain}
emmeans(aov_afex, pairwise ~ temporalDO | DOMAIN, adjust = "none")$contrasts
emmeans(aov_afex, pairwise ~ DOMAIN | temporalDO, adjust = "tukey")$contrasts
```

When grouped by domain, no contrasts are significant.

When grouped by temporalDO, some contrasts are significant:

Future-first temporal display order:
contrast     estimate  SE      df   t.ratio  p.value
pref - pers  0.25065   0.0892  518  2.810    0.0142

Past-first temporal display order:
contrast     estimate  SE      df   t.ratio  p.value
pref - val   0.33129   0.0895  518  3.702    0.0007
pers - val   0.32478   0.0921  518  3.527    0.0013

## perspective:temporalDO:TIME:DOMAIN

### contrasts for TIME grouped by perspective, temporalDO, and DOMAIN

```{r posthoc-fourway}
emm_fourway <- emmeans(aov_afex, pairwise ~ TIME | perspective * temporalDO * DOMAIN, adjust = "tukey")
print(emm_fourway$contrasts)
```

Significant contrasts:

contrast    estimate  SE     df   t.ratio  p.value
past - fut  0.5285    0.179  518  2.957    0.0032  (self-perspective, personality domain, past-first temporal display order)
past - fut  0.5366    0.187  518  2.863    0.0044  (self-perspective, values domain, past-first temporal display order)

### contrasts for DOMAIN grouped by perspective, TIME, and temporalDO

```{r posthoc-fourway2}
emm_fourway2 <- emmeans(aov_afex, pairwise ~ DOMAIN | perspective * TIME * temporalDO, adjust = "tukey")
print(emm_fourway2$contrasts)
```

Significant contrasts:

contrast    estimate  SE     df   t.ratio  p.value
pref - val  0.6259    0.166  518  3.778    0.0005  (other-perspective, past-directed questions, past-first temporal display order)
pers - val  0.4892    0.160  518  3.056    0.0066  (other-perspective, past-directed questions, past-first temporal display order)
pref - val  0.4309    0.168  518  2.559    0.0290  (self-perspective, future-directed questions, past-first temporal display order)

## Cohen's d (significant contrasts only)

```{r cohens-d-significant}
d_data <- anova_data %>%
  mutate(
    past_mean = (past_pref_DGEN + past_pers_DGEN + past_val_DGEN) / 3,
    fut_mean = (fut_pref_DGEN + fut_pers_DGEN + fut_val_DGEN) / 3,
    pref_mean = (past_pref_DGEN + fut_pref_DGEN) / 2,
    pers_mean = (past_pers_DGEN + fut_pers_DGEN) / 2,
    val_mean = (past_val_DGEN + fut_val_DGEN) / 2
  )

cohens_d_results <- tibble(
  contrast = character(),
  condition = character(),
  d = double()
)

# TIME main: past vs fut
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "overall",
    d = suppressMessages(effectsize::cohens_d(d_data$past_mean, d_data$fut_mean, paired = TRUE)$Cohens_d)
  )

# DOMAIN main: pref vs val
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - val)",
    condition = "overall",
    d = suppressMessages(effectsize::cohens_d(d_data$pref_mean, d_data$val_mean, paired = TRUE)$Cohens_d)
  )

# temporalDO:DOMAIN - future: pref vs pers
d_fut <- d_data %>% filter(temporalDO == "future")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - pers)",
    condition = "temporalDO = future",
    d = suppressMessages(effectsize::cohens_d(d_fut$pref_mean, d_fut$pers_mean, paired = TRUE)$Cohens_d)
  )

# temporalDO:DOMAIN - past: pref vs val, pers vs val
d_past <- d_data %>% filter(temporalDO == "past")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - val)",
    condition = "temporalDO = past",
    d = suppressMessages(effectsize::cohens_d(d_past$pref_mean, d_past$val_mean, paired = TRUE)$Cohens_d)
  ) %>%
  add_row(
    contrast = "DOMAIN (pers - val)",
    condition = "temporalDO = past",
    d = suppressMessages(effectsize::cohens_d(d_past$pers_mean, d_past$val_mean, paired = TRUE)$Cohens_d)
  )

# 4-way TIME: self, past temporalDO, pers
d_self_past <- d_data %>% filter(perspective == "self", temporalDO == "past")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "self, past temporalDO, pers domain",
    d = suppressMessages(effectsize::cohens_d(d_self_past$past_pers_DGEN, d_self_past$fut_pers_DGEN, paired = TRUE)$Cohens_d)
  )

# 4-way TIME: self, past temporalDO, val
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "TIME (past - fut)",
    condition = "self, past temporalDO, val domain",
    d = suppressMessages(effectsize::cohens_d(d_self_past$past_val_DGEN, d_self_past$fut_val_DGEN, paired = TRUE)$Cohens_d)
  )

# 4-way DOMAIN: other, past TIME, past temporalDO - pref vs val
d_other_past_tpast <- d_data %>% filter(perspective == "other", temporalDO == "past")
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - val)",
    condition = "other, past TIME, past temporalDO",
    d = suppressMessages(effectsize::cohens_d(d_other_past_tpast$past_pref_DGEN, d_other_past_tpast$past_val_DGEN, paired = TRUE)$Cohens_d)
  )

# 4-way DOMAIN: other, past TIME, past temporalDO - pers vs val
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pers - val)",
    condition = "other, past TIME, past temporalDO",
    d = suppressMessages(effectsize::cohens_d(d_other_past_tpast$past_pers_DGEN, d_other_past_tpast$past_val_DGEN, paired = TRUE)$Cohens_d)
  )

# 4-way DOMAIN: self, fut TIME, past temporalDO - pref vs val
# (same filter as d_self_past above; the future TIME comes from the column choice)
cohens_d_results <- cohens_d_results %>%
  add_row(
    contrast = "DOMAIN (pref - val)",
    condition = "self, fut TIME, past temporalDO",
    d = suppressMessages(effectsize::cohens_d(d_self_past$fut_pref_DGEN, d_self_past$fut_val_DGEN, paired = TRUE)$Cohens_d)
  )

cohens_d_results %>%
  mutate(d = round(d, 3)) %>%
  print(n = Inf)
```

| Size   | d   | Interpretation  |
|--------|-----|-----------------|
| Small  | 0.2 | Weak effect     |
| Medium | 0.5 | Moderate effect |
| Large  | 0.8 | Strong effect   |