================================================================================
EOHI2 DATA PROCESSING PIPELINE - VARIABLE CREATION DOCUMENTATION
================================================================================

This README documents the complete data processing pipeline for eohi2.csv.
All processing scripts should be run in the order listed below.

Source File: eohi2.csv
Processing Scripts: dataP 01 through dataP 14

================================================================================
SCRIPT 01: dataP 01 - recode and combine past & future vars.r
================================================================================

PURPOSE:
  Combines responses from two survey versions (01 and 02) and recodes Likert
  scale text responses to numeric values for past and future time periods.

VARIABLES CREATED: 60 total (15 items × 4 time periods)

SOURCE COLUMNS:
  - Set A: 01past5PrefItem_1 through 01fut10ValItem_5 (60 columns)
  - Set B: 02past5PrefItem_1 through 02fut10ValItem_5 (60 columns)

TARGET VARIABLES:
  Past 5 Years (15 variables):
    - past_5_pref_read, past_5_pref_music, past_5_pref_TV, past_5_pref_nap, 
      past_5_pref_travel
    - past_5_pers_extravert, past_5_pers_critical, past_5_pers_dependable,
      past_5_pers_anxious, past_5_pers_complex
    - past_5_val_obey, past_5_val_trad, past_5_val_opinion, 
      past_5_val_performance, past_5_val_justice

  Past 10 Years (15 variables):
    - past_10_pref_read, past_10_pref_music, past_10_pref_TV, past_10_pref_nap,
      past_10_pref_travel
    - past_10_pers_extravert, past_10_pers_critical, past_10_pers_dependable,
      past_10_pers_anxious, past_10_pers_complex
    - past_10_val_obey, past_10_val_trad, past_10_val_opinion,
      past_10_val_performance, past_10_val_justice

  Future 5 Years (15 variables):
    - fut_5_pref_read, fut_5_pref_music, fut_5_pref_TV, fut_5_pref_nap,
      fut_5_pref_travel
    - fut_5_pers_extravert, fut_5_pers_critical, fut_5_pers_dependable,
      fut_5_pers_anxious, fut_5_pers_complex
    - fut_5_val_obey, fut_5_val_trad, fut_5_val_opinion,
      fut_5_val_performance, fut_5_val_justice

  Future 10 Years (15 variables):
    - fut_10_pref_read, fut_10_pref_music, fut_10_pref_TV, fut_10_pref_nap,
      fut_10_pref_travel
    - fut_10_pers_extravert, fut_10_pers_critical, fut_10_pers_dependable,
      fut_10_pers_anxious, fut_10_pers_complex
    - fut_10_val_obey, fut_10_val_trad, fut_10_val_opinion,
      fut_10_val_performance, fut_10_val_justice

TRANSFORMATION LOGIC:
  Step 1: Combine responses from Set A (01) and Set B (02)
          - If Set A has a value, use Set A
          - If Set A is empty, use Set B
  
  Step 2: Recode text responses to numeric values:
          "Strongly Disagree"            → -3
          "Disagree"                     → -2
          "Somewhat Disagree"            → -1
          "Neither Agree nor Disagree"   →  0
          "Somewhat Agree"               →  1
          "Agree"                        →  2
          "Strongly Agree"               →  3
          Empty/Missing                  → NA
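
ILLUSTRATIVE SKETCH:
  The two steps above could be written roughly as follows in base R. This is
  a sketch, not the script itself: the helper names (likert_map, combine_sets,
  recode_likert), the toy data, and the pairing of PrefItem_1 with
  past_5_pref_read are assumptions made for illustration.

```r
# Text-to-number lookup taken from the recode table above.
likert_map <- c(
  "Strongly Disagree"          = -3,
  "Disagree"                   = -2,
  "Somewhat Disagree"          = -1,
  "Neither Agree nor Disagree" =  0,
  "Somewhat Agree"             =  1,
  "Agree"                      =  2,
  "Strongly Agree"             =  3
)

# Step 1 (hypothetical helper): use Set A unless it is empty or NA.
combine_sets <- function(a, b) {
  a_empty <- is.na(a) | trimws(a) == ""
  ifelse(a_empty, b, a)
}

# Step 2 (hypothetical helper): unmatched or empty text becomes NA.
recode_likert <- function(x) {
  unname(likert_map[trimws(x)])
}

# Toy data; assumes Item_1 of the Pref block is the "read" item.
raw <- data.frame(
  `01past5PrefItem_1` = c("Agree", "", NA, ""),
  `02past5PrefItem_1` = c("", "Strongly Disagree", "Somewhat Agree", ""),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

raw$past_5_pref_read <- recode_likert(
  combine_sets(raw[["01past5PrefItem_1"]], raw[["02past5PrefItem_1"]])
)
print(raw$past_5_pref_read)
```

  In this toy data the fourth respondent is empty in both sets, so the result
  is NA.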

ITEM DOMAINS:
  - Preferences (pref): Reading, Music, TV, Nap, Travel
  - Personality (pers): Extravert, Critical, Dependable, Anxious, Complex
  - Values (val): Obey, Tradition, Opinion, Performance, Justice


================================================================================
SCRIPT 02: dataP 02 - recode present VARS.r
================================================================================

PURPOSE:
  Recodes present-time Likert scale text responses to numeric values.

VARIABLES CREATED: 15 total

SOURCE COLUMNS:
  - prePrefItem_1 through prePrefItem_5 (5 columns)
  - prePersItem_1 through prePersItem_5 (5 columns)
  - preValItem_1 through preValItem_5 (5 columns)

TARGET VARIABLES:
  Present Time (15 variables):
    - present_pref_read, present_pref_music, present_pref_tv, present_pref_nap,
      present_pref_travel
    - present_pers_extravert, present_pers_critical, present_pers_dependable,
      present_pers_anxious, present_pers_complex
    - present_val_obey, present_val_trad, present_val_opinion,
      present_val_performance, present_val_justice

TRANSFORMATION LOGIC:
  Recode text responses to numeric values:
    "Strongly Disagree"            → -3
    "Disagree"                     → -2
    "Somewhat Disagree"            → -1
    "Neither Agree nor Disagree"   →  0
    "Somewhat Agree"               →  1
    "Agree"                        →  2
    "Strongly Agree"               →  3
    Empty/Missing                  → NA

SPECIAL NOTE:
  Present time uses "present_pref_tv" (lowercase) while past/future use
  "past_5_pref_TV" (uppercase). This is intentional and preserved from the
  original data structure.


================================================================================
SCRIPT 03: dataP 03 - recode DGEN vars.r
================================================================================

PURPOSE:
  Combines DGEN (domain-general) responses from two survey versions (01 and 02).
  These are single-item measures for each domain/time combination.
  NO RECODING - just copies numeric values as-is.

VARIABLES CREATED: 12 total (3 domains × 4 time periods)

SOURCE COLUMNS:
  - Set A: 01past5PrefDGEN_1, 01past5PersDGEN_1, 01past5ValDGEN_1, etc.
  - Set B: 02past5PrefDGEN_1, 02past5PersDGEN_1, 02past5ValDGEN_1, etc.

TARGET VARIABLES:
  - DGEN_past_5_Pref, DGEN_past_5_Pers, DGEN_past_5_Val
  - DGEN_past_10_Pref, DGEN_past_10_Pers, DGEN_past_10_Val
  - DGEN_fut_5_Pref, DGEN_fut_5_Pers, DGEN_fut_5_Val
  - DGEN_fut_10_Pref, DGEN_fut_10_Pers, DGEN_fut_10_Val

TRANSFORMATION LOGIC:
  - If Set A (01) has a value, use Set A
  - If Set A is empty, use Set B (02)
  - NO RECODING: Values are copied directly as numeric
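
ILLUSTRATIVE SKETCH:
  One way to express the combine-without-recoding rule is a lookup table that
  pairs each target with its Set A and Set B columns. The table name
  (dgen_lookup) and the toy values are assumptions; only the past-5 column
  names given in this README are listed, and the remaining rows would have to
  be taken from the raw file.

```r
# Hypothetical lookup: target variable -> raw Set A / Set B column.
dgen_lookup <- data.frame(
  target = c("DGEN_past_5_Pref", "DGEN_past_5_Pers", "DGEN_past_5_Val"),
  set_a  = c("01past5PrefDGEN_1", "01past5PersDGEN_1", "01past5ValDGEN_1"),
  set_b  = c("02past5PrefDGEN_1", "02past5PersDGEN_1", "02past5ValDGEN_1"),
  stringsAsFactors = FALSE
)

# Toy data with arbitrary numeric values (the real response scale may differ).
raw <- data.frame(
  `01past5PrefDGEN_1` = c(4, NA), `02past5PrefDGEN_1` = c(NA, 7),
  `01past5PersDGEN_1` = c(2, NA), `02past5PersDGEN_1` = c(NA, 5),
  `01past5ValDGEN_1`  = c(6, NA), `02past5ValDGEN_1`  = c(NA_real_, NA_real_),
  check.names = FALSE
)

for (i in seq_len(nrow(dgen_lookup))) {
  a <- raw[[dgen_lookup$set_a[i]]]
  b <- raw[[dgen_lookup$set_b[i]]]
  # Copy values as-is: Set A when present, otherwise Set B. No recoding.
  raw[[dgen_lookup$target[i]]] <- ifelse(is.na(a), b, a)
}
print(raw[, dgen_lookup$target])
```

  Keeping the column pairs in a table makes irregular raw names (see SPECIAL
  NOTES) a matter of editing one row rather than the loop.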

SPECIAL NOTES:
  - Future columns in raw data use "_8" suffix for Pref/Pers items
  - Future Val columns use "ValuesDGEN" spelling in Set A, "ValDGEN" in Set B


================================================================================
SCRIPT 04: dataP 04 - DGEN means.r
================================================================================

PURPOSE:
  Calculates mean DGEN scores by averaging the three domain scores (Preferences,
  Personality, Values) for each time period.

VARIABLES CREATED: 4 total (1 per time period)

SOURCE COLUMNS:
  - DGEN_past_5_Pref, DGEN_past_5_Pers, DGEN_past_5_Val
  - DGEN_past_10_Pref, DGEN_past_10_Pers, DGEN_past_10_Val
  - DGEN_fut_5_Pref, DGEN_fut_5_Pers, DGEN_fut_5_Val
  - DGEN_fut_10_Pref, DGEN_fut_10_Pers, DGEN_fut_10_Val

TARGET VARIABLES:
  - DGEN_past_5_mean
  - DGEN_past_10_mean
  - DGEN_fut_5_mean
  - DGEN_fut_10_mean

TRANSFORMATION LOGIC:
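  Each mean = mean(Pref, Pers, Val), i.e. (Pref + Pers + Val) / 3 when all
  three scores are present
  - NA values are excluded from calculation (na.rm = TRUE), so the divisor is
    the number of non-missing scores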
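
ILLUSTRATIVE SKETCH:
  The effect of na.rm = TRUE on the divisor can be seen with a small example.
  This assumes the mean is computed row-wise with rowMeans(); the actual
  script may use a different function. The toy values are arbitrary.

```r
dgen <- data.frame(
  DGEN_past_5_Pref = c(2, 4, NA),
  DGEN_past_5_Pers = c(4, NA, NA),
  DGEN_past_5_Val  = c(6, 8, NA)
)

cols <- c("DGEN_past_5_Pref", "DGEN_past_5_Pers", "DGEN_past_5_Val")

# Row 1: (2 + 4 + 6) / 3; row 2: (4 + 8) / 2; row 3: no scores at all.
dgen$DGEN_past_5_mean <- rowMeans(dgen[, cols], na.rm = TRUE)
print(dgen$DGEN_past_5_mean)
```

  Note that rowMeans() with na.rm = TRUE returns NaN, not NA, for a row in
  which every score is missing.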


================================================================================
SCRIPT 05: dataP 05 - recode scales VARS.r
================================================================================

PURPOSE:
  Processes two cognitive scales:
  1. AOT (Actively Open-minded Thinking): 8-item scale with reverse coding
  2. CRT (Cognitive Reflection Test): 3-item test with correct/intuitive scoring

VARIABLES CREATED: 3 total

SOURCE COLUMNS:
  AOT Scale:
    - aot_1, aot_2, aot_3, aot_4, aot_5, aot_6, aot_7, aot_8

  CRT Test:
    - crt_1, crt_2, crt_3

TARGET VARIABLES:
  - aot_total     (mean of 8 items with reverse coding)
  - crt_correct   (proportion of correct answers)
  - crt_int       (proportion of intuitive, i.e. common incorrect, answers)

TRANSFORMATION LOGIC:

  AOT Scale (aot_total):
    1. Items 4, 5, 6, 7 are reverse coded by multiplying by -1
    2. Calculate mean of all 8 items (with reverse coding applied)
    3. Original source values are NOT modified in the dataframe
    4. NA values excluded from calculation (na.rm = TRUE)
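
  Illustrative sketch (aot_total):
    A minimal version of the four steps above, assuming the AOT responses are
    already numeric and centred at zero (so that multiplying by -1 mirrors a
    response). The toy values and the intermediate object name (scored) are
    assumptions.

```r
aot_items   <- paste0("aot_", 1:8)
reverse_ids <- paste0("aot_", 4:7)

d <- data.frame(
  aot_1 = c(2, 1),  aot_2 = c(3, NA), aot_3 = c(1, 0),   aot_4 = c(-2, -1),
  aot_5 = c(-3, 2), aot_6 = c(-1, 0), aot_7 = c(-2, NA), aot_8 = c(2, 3)
)

scored <- d[, aot_items]                              # work on a copy
scored[, reverse_ids] <- -1 * scored[, reverse_ids]   # reverse items 4-7
d$aot_total <- rowMeans(scored, na.rm = TRUE)         # mean of available items

print(d$aot_total)
print(d$aot_4)   # source column is left untouched
```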

  CRT Correct (crt_correct):
    Correct answers:
      - crt_1: "5 cents"
      - crt_2: "5 minutes"
      - crt_3: "47 days"
    Calculation: (Number of correct answers) / (Number of non-missing answers)

  CRT Intuitive (crt_int):
    Intuitive (common incorrect) answers:
      - crt_1: "10 cents"
      - crt_2: "100 minutes"
      - crt_3: "24 days"
    Calculation: (Number of intuitive answers) / (Number of non-missing answers)

SPECIAL NOTES:
  - CRT scoring is case-insensitive and trims whitespace
  - Both CRT scores are proportions (0.00 to 1.00)
  - Empty/missing CRT responses are excluded from denominator
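
ILLUSTRATIVE SKETCH:
  The CRT scoring rules above (case-insensitive match, trimmed whitespace,
  missing answers dropped from the denominator) could be sketched as follows.
  The helper name (score_crt), the key vectors, and the toy responses are
  assumptions; the answer strings are the ones listed in this README.

```r
key_correct   <- c(crt_1 = "5 cents",  crt_2 = "5 minutes",   crt_3 = "47 days")
key_intuitive <- c(crt_1 = "10 cents", crt_2 = "100 minutes", crt_3 = "24 days")

# Hypothetical helper: proportion of non-missing answers that match the key.
score_crt <- function(d, key) {
  hits <- matrix(NA, nrow = nrow(d), ncol = length(key))
  for (j in seq_along(key)) {
    resp <- tolower(trimws(d[[names(key)[j]]]))
    resp[!is.na(resp) & resp == ""] <- NA     # empty counts as missing
    hits[, j] <- resp == tolower(key[[j]])    # NA stays NA
  }
  # Numerator: matches; denominator: non-missing answers.
  rowSums(hits, na.rm = TRUE) / rowSums(!is.na(hits))
}

d <- data.frame(
  crt_1 = c("5 cents", "10 Cents ", ""),
  crt_2 = c("5 minutes", "100 minutes", NA),
  crt_3 = c("24 days", "", NA),
  stringsAsFactors = FALSE
)
d$crt_correct <- score_crt(d, key_correct)
d$crt_int     <- score_crt(d, key_intuitive)
print(d[, c("crt_correct", "crt_int")])
```

  In this sketch a respondent with no CRT answers at all gets NaN (0 / 0);
  how the actual script treats that case is not stated here.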


================================================================================
SCRIPT 06: dataP 06 - time interval differences.r
================================================================================

PURPOSE:
  Calculates absolute differences between time intervals to measure perceived
  change across time periods for all 15 items.

VARIABLES CREATED: 90 total (6 difference types × 15 items)

SOURCE COLUMNS:
  - present_pref_read through present_val_justice (15 columns)
  - past_5_pref_read through past_5_val_justice (15 columns)
  - past_10_pref_read through past_10_val_justice (15 columns)
  - fut_5_pref_read through fut_5_val_justice (15 columns)
  - fut_10_pref_read through fut_10_val_justice (15 columns)

TARGET VARIABLES (by difference type):

  NPast_5 (Present vs Past 5 years) - 15 variables:
    Formula: |present - past_5|
    - NPast_5_pref_read, NPast_5_pref_music, NPast_5_pref_TV, NPast_5_pref_nap,
      NPast_5_pref_travel
    - NPast_5_pers_extravert, NPast_5_pers_critical, NPast_5_pers_dependable,
      NPast_5_pers_anxious, NPast_5_pers_complex
    - NPast_5_val_obey, NPast_5_val_trad, NPast_5_val_opinion,
      NPast_5_val_performance, NPast_5_val_justice

  NPast_10 (Present vs Past 10 years) - 15 variables:
    Formula: |present - past_10|
    - NPast_10_pref_read, NPast_10_pref_music, NPast_10_pref_TV, 
      NPast_10_pref_nap, NPast_10_pref_travel
    - NPast_10_pers_extravert, NPast_10_pers_critical, NPast_10_pers_dependable,
      NPast_10_pers_anxious, NPast_10_pers_complex
    - NPast_10_val_obey, NPast_10_val_trad, NPast_10_val_opinion,
      NPast_10_val_performance, NPast_10_val_justice

  NFut_5 (Present vs Future 5 years) - 15 variables:
    Formula: |present - fut_5|
    - NFut_5_pref_read, NFut_5_pref_music, NFut_5_pref_TV, NFut_5_pref_nap,
      NFut_5_pref_travel
    - NFut_5_pers_extravert, NFut_5_pers_critical, NFut_5_pers_dependable,
      NFut_5_pers_anxious, NFut_5_pers_complex
    - NFut_5_val_obey, NFut_5_val_trad, NFut_5_val_opinion,
      NFut_5_val_performance, NFut_5_val_justice

  NFut_10 (Present vs Future 10 years) - 15 variables:
    Formula: |present - fut_10|
    - NFut_10_pref_read, NFut_10_pref_music, NFut_10_pref_TV, NFut_10_pref_nap,
      NFut_10_pref_travel
    - NFut_10_pers_extravert, NFut_10_pers_critical, NFut_10_pers_dependable,
      NFut_10_pers_anxious, NFut_10_pers_complex
    - NFut_10_val_obey, NFut_10_val_trad, NFut_10_val_opinion,
      NFut_10_val_performance, NFut_10_val_justice

  5.10past (Past 5 vs Past 10 years) - 15 variables:
    Formula: |past_5 - past_10|
    - 5.10past_pref_read, 5.10past_pref_music, 5.10past_pref_TV, 
      5.10past_pref_nap, 5.10past_pref_travel
    - 5.10past_pers_extravert, 5.10past_pers_critical, 5.10past_pers_dependable,
      5.10past_pers_anxious, 5.10past_pers_complex
    - 5.10past_val_obey, 5.10past_val_trad, 5.10past_val_opinion,
      5.10past_val_performance, 5.10past_val_justice

  5.10fut (Future 5 vs Future 10 years) - 15 variables:
    Formula: |fut_5 - fut_10|
    - 5.10fut_pref_read, 5.10fut_pref_music, 5.10fut_pref_TV, 5.10fut_pref_nap,
      5.10fut_pref_travel
    - 5.10fut_pers_extravert, 5.10fut_pers_critical, 5.10fut_pers_dependable,
      5.10fut_pers_anxious, 5.10fut_pers_complex
    - 5.10fut_val_obey, 5.10fut_val_trad, 5.10fut_val_opinion,
      5.10fut_val_performance, 5.10fut_val_justice

TRANSFORMATION LOGIC:
  All calculations use absolute differences:
    - NPast_5: |present_[item] - past_5_[item]|
    - NPast_10: |present_[item] - past_10_[item]|
    - NFut_5: |present_[item] - fut_5_[item]|
    - NFut_10: |present_[item] - fut_10_[item]|
    - 5.10past: |past_5_[item] - past_10_[item]|
    - 5.10fut: |fut_5_[item] - fut_10_[item]|

  Result: Non-negative values (0 = no change) representing magnitude of change
  Missing values in either source column result in NA

SPECIAL NOTES:
  - Present time uses "pref_tv" (lowercase) while past/future use "pref_TV"
    (uppercase), so script handles this naming inconsistency
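  - All values are absolute differences (non-negative)
  - Later sections (Scripts 09 and 11) list the 5.10past and 5.10fut columns as
    X5.10past_* and X5.10fut_*, with the "X" prefix that R adds to column
    names starting with a digit (see Script 08 SPECIAL NOTES)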
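
ILLUSTRATIVE SKETCH:
  The NPast_5 calculation, including the tv/TV naming difference, might look
  like this for a subset of items. The substitution used to find the present
  column is an assumption about how the inconsistency could be handled, and
  the toy values are arbitrary.

```r
items <- c("pref_read", "pref_TV", "val_justice")   # subset of the 15 items

d <- data.frame(
  present_pref_read   = c(1, 0),  past_5_pref_read   = c(3, 0),
  present_pref_tv     = c(-2, 3), past_5_pref_TV     = c(1, 3),
  present_val_justice = c(2, NA), past_5_val_justice = c(-1, 2)
)

for (item in items) {
  # Present columns use "pref_tv"; past/future columns use "pref_TV".
  present_item <- sub("pref_TV", "pref_tv", item, fixed = TRUE)
  present_col  <- paste0("present_", present_item)
  past_col     <- paste0("past_5_", item)
  d[[paste0("NPast_5_", item)]] <- abs(d[[present_col]] - d[[past_col]])
}
print(d[, paste0("NPast_5_", items)])
```

  The second respondent shows both edge cases: identical ratings give 0, and
  a missing present rating gives NA.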


================================================================================
SCRIPT 07: dataP 07 - domain means.r
================================================================================

PURPOSE:
  Calculates domain-level means by averaging the 5 items within each domain
  (Preferences, Personality, Values) for each of the 6 time interval difference
  types.

VARIABLES CREATED: 18 total (6 time intervals × 3 domains)

SOURCE COLUMNS:
  - NPast_5_pref_read through NPast_5_val_justice (15 columns)
  - NPast_10_pref_read through NPast_10_val_justice (15 columns)
  - NFut_5_pref_read through NFut_5_val_justice (15 columns)
  - NFut_10_pref_read through NFut_10_val_justice (15 columns)
  - 5.10past_pref_read through 5.10past_val_justice (15 columns)
  - 5.10fut_pref_read through 5.10fut_val_justice (15 columns)
  Total: 90 difference columns (created in Script 06)

TARGET VARIABLES:
  NPast_5 Domain Means (3 variables):
    - NPast_5_pref_MEAN  (mean of 5 preference items)
    - NPast_5_pers_MEAN  (mean of 5 personality items)
    - NPast_5_val_MEAN   (mean of 5 values items)

  NPast_10 Domain Means (3 variables):
    - NPast_10_pref_MEAN
    - NPast_10_pers_MEAN
    - NPast_10_val_MEAN

  NFut_5 Domain Means (3 variables):
    - NFut_5_pref_MEAN
    - NFut_5_pers_MEAN
    - NFut_5_val_MEAN

  NFut_10 Domain Means (3 variables):
    - NFut_10_pref_MEAN
    - NFut_10_pers_MEAN
    - NFut_10_val_MEAN

  5.10past Domain Means (3 variables):
    - 5.10past_pref_MEAN
    - 5.10past_pers_MEAN
    - 5.10past_val_MEAN

  5.10fut Domain Means (3 variables):
    - 5.10fut_pref_MEAN
    - 5.10fut_pers_MEAN
    - 5.10fut_val_MEAN

TRANSFORMATION LOGIC:
  Each domain mean = average of 5 items within that domain

  Example for NPast_5_pref_MEAN:
    = mean(NPast_5_pref_read, NPast_5_pref_music, NPast_5_pref_TV, 
           NPast_5_pref_nap, NPast_5_pref_travel)

  Example for NFut_10_pers_MEAN:
    = mean(NFut_10_pers_extravert, NFut_10_pers_critical, 
           NFut_10_pers_dependable, NFut_10_pers_anxious, 
           NFut_10_pers_complex)

  NA values excluded from calculation (na.rm = TRUE)
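
ILLUSTRATIVE SKETCH:
  Because the difference columns share an [interval]_[domain]_[item] naming
  pattern, each domain mean can be computed by selecting columns on their
  prefix. The helper name (domain_mean) and the toy data are assumptions; the
  actual script may list the columns explicitly instead.

```r
# Toy data: the five NPast_5 preference items for two respondents.
items <- c("pref_read", "pref_music", "pref_TV", "pref_nap", "pref_travel")
d <- data.frame(matrix(c(1, 0, 2, 3, NA,
                         0, 0, 1, NA, NA), nrow = 2, byrow = TRUE))
names(d) <- paste0("NPast_5_", items)

# Hypothetical helper: average every item column for one interval/domain.
domain_mean <- function(data, interval, domain) {
  prefix <- paste0(interval, "_", domain, "_")
  cols   <- names(data)[startsWith(names(data), prefix)]
  cols   <- setdiff(cols, paste0(prefix, "MEAN"))   # skip an existing mean
  rowMeans(data[, cols, drop = FALSE], na.rm = TRUE)
}

d$NPast_5_pref_MEAN <- domain_mean(d, "NPast_5", "pref")
print(d$NPast_5_pref_MEAN)
```

  startsWith() is used rather than a regular expression so that the "." in
  names such as 5.10past is matched literally.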

PURPOSE OF DOMAIN MEANS:
  - Provides higher-level summary of perceived change by domain
  - Reduces item-level noise by aggregating across related items
  - Enables domain-level comparisons across time intervals
  - Parallel to Script 04 (DGEN means) but for difference scores instead of
    raw DGEN ratings

SPECIAL NOTES:
  - This script depends on Script 06 being run first
  - Creates domain-level aggregates of absolute difference scores
  - All means are averages of non-negative values (absolute differences)


================================================================================
SCRIPT 08: dataP 08 - DGEN 510 vars.r
================================================================================

PURPOSE:
  Calculates absolute differences between 5-year and 10-year DGEN ratings for
  both Past and Future time directions. These variables measure the perceived
  difference in domain-general change between the two time intervals.

VARIABLES CREATED: 6 total (3 domains × 2 time directions)

SOURCE COLUMNS:
  - DGEN_past_5_Pref, DGEN_past_5_Pers, DGEN_past_5_Val
  - DGEN_past_10_Pref, DGEN_past_10_Pers, DGEN_past_10_Val
  - DGEN_fut_5_Pref, DGEN_fut_5_Pers, DGEN_fut_5_Val
  - DGEN_fut_10_Pref, DGEN_fut_10_Pers, DGEN_fut_10_Val
  Total: 12 DGEN columns (created in Script 03)

TARGET VARIABLES:
  Past Direction (3 variables):
    - X5_10DGEN_past_pref  (|DGEN_past_5_Pref - DGEN_past_10_Pref|)
    - X5_10DGEN_past_pers  (|DGEN_past_5_Pers - DGEN_past_10_Pers|)
    - X5_10DGEN_past_val   (|DGEN_past_5_Val - DGEN_past_10_Val|)

  Future Direction (3 variables):
    - X5_10DGEN_fut_pref   (|DGEN_fut_5_Pref - DGEN_fut_10_Pref|)
    - X5_10DGEN_fut_pers   (|DGEN_fut_5_Pers - DGEN_fut_10_Pers|)
    - X5_10DGEN_fut_val    (|DGEN_fut_5_Val - DGEN_fut_10_Val|)

TRANSFORMATION LOGIC:
  Formula: |DGEN_5 - DGEN_10|
  
  All calculations use absolute differences:
    - Past Preferences: |DGEN_past_5_Pref - DGEN_past_10_Pref|
    - Past Personality: |DGEN_past_5_Pers - DGEN_past_10_Pers|
    - Past Values: |DGEN_past_5_Val - DGEN_past_10_Val|
    - Future Preferences: |DGEN_fut_5_Pref - DGEN_fut_10_Pref|
    - Future Personality: |DGEN_fut_5_Pers - DGEN_fut_10_Pers|
    - Future Values: |DGEN_fut_5_Val - DGEN_fut_10_Val|
  
  Result: Non-negative values (0 = no difference) representing magnitude of
  difference
  Missing values in either source column result in NA

SPECIAL NOTES:
  - Variable names use "X" prefix because R prepends it to column names
    starting with a digit, e.g. when read.csv() is called with its default
    check.names = TRUE (5_10 becomes X5_10)
  - This script depends on Script 03 being run first
  - Measures interval effects (5-year vs 10-year) within each time direction
    (past, future)
  - Parallel to Script 06's 5.10past and 5.10fut variables but for DGEN scores
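
ILLUSTRATIVE SKETCH:
  The "X" prefix can be reproduced with base R's make.names(), which is the
  rule read.csv() applies when check.names = TRUE (its default). The CSV text
  below is a made-up two-column example, not an excerpt of eohi2.csv.

```r
raw_names <- c("5_10DGEN_past_pref", "5.10past_pref_read", "NPast_5_pref_read")
print(make.names(raw_names))
# Names starting with a digit gain an "X"; the others are unchanged.

csv_text <- "5_10DGEN_past_pref,NPast_5_pref_read\n1,2\n"
print(names(read.csv(text = csv_text)))                       # with prefix
print(names(read.csv(text = csv_text, check.names = FALSE)))  # names kept
```

  Reading with check.names = FALSE keeps the original names, but they then
  need backticks or [[ ]] to be referenced in R code.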


================================================================================
SCRIPT 09: dataP 09 - interval x direction means.r
================================================================================

PURPOSE:
  Calculates comprehensive mean scores by averaging item-level differences
  across intervals and directions. Creates both narrow-scope means (single
  time interval) and broad-scope global means (combining multiple intervals).

VARIABLES CREATED: 11 total (6 narrow-scope + 5 global-scope)

SOURCE COLUMNS:
  All 90 difference variables created in Script 06:
    - NPast_5_[domain]_[item] (15 variables)
    - NPast_10_[domain]_[item] (15 variables)
    - NFut_5_[domain]_[item] (15 variables)
    - NFut_10_[domain]_[item] (15 variables)
    - X5.10past_[domain]_[item] (15 variables)
    - X5.10fut_[domain]_[item] (15 variables)

TARGET VARIABLES:

  Narrow-Scope Means (15 source items each):
    - NPast_5_mean      (mean across all 15 NPast_5 items)
    - NPast_10_mean     (mean across all 15 NPast_10 items)
    - NFut_5_mean       (mean across all 15 NFut_5 items)
    - NFut_10_mean      (mean across all 15 NFut_10 items)
    - X5.10past_mean    (mean across all 15 X5.10past items)
    - X5.10fut_mean     (mean across all 15 X5.10fut items)

  Global-Scope Means (30 source items each):
    - NPast_global_mean     (NPast_5 + NPast_10: all past intervals)
    - NFut_global_mean      (NFut_5 + NFut_10: all future intervals)
    - X5.10_global_mean     (X5.10past + X5.10fut: all 5-vs-10 intervals)
    - N5_global_mean        (NPast_5 + NFut_5: all 5-year intervals)
    - N10_global_mean       (NPast_10 + NFut_10: all 10-year intervals)

TRANSFORMATION LOGIC:

  Narrow-Scope Means (15 items each):
    Each mean averages all 15 difference items within one time interval
    
    Example for NPast_5_mean:
      = mean(NPast_5_pref_read, NPast_5_pref_music, NPast_5_pref_TV, 
             NPast_5_pref_nap, NPast_5_pref_travel,
             NPast_5_pers_extravert, NPast_5_pers_critical, 
             NPast_5_pers_dependable, NPast_5_pers_anxious, 
             NPast_5_pers_complex,
             NPast_5_val_obey, NPast_5_val_trad, NPast_5_val_opinion,
             NPast_5_val_performance, NPast_5_val_justice)

  Global-Scope Means (30 items each):
    Each mean averages 30 difference items across two related intervals
    
    Example for NPast_global_mean:
      = mean(all 15 NPast_5 items + all 15 NPast_10 items)
      Represents overall perceived change from present to any past timepoint
    
    Example for N5_global_mean:
      = mean(all 15 NPast_5 items + all 15 NFut_5 items)
      Represents overall perceived change at 5-year interval regardless of 
      direction

  NA values excluded from calculation (na.rm = TRUE)

PURPOSE OF INTERVAL × DIRECTION MEANS:
  - Narrow-scope means: Single-interval summaries across all domains and items
  - Global-scope means: Cross-interval summaries for testing:
      * Direction effects (past vs future)
      * Interval effects (5-year vs 10-year)
      * Combined temporal distance effects
  - Enables comprehensive analysis of temporal self-perception patterns
  - Reduces item-level and domain-level noise through broad aggregation

QUALITY ASSURANCE:
  - Script includes automated QA checks for first 5 rows
  - Manually recalculates each mean and verifies against stored values
  - Prints TRUE/FALSE match status for each variable
  - Ensures calculation accuracy before further analysis
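
ILLUSTRATIVE SKETCH:
  A QA check of the kind described above might recompute each mean for the
  first rows and compare it with the stored value. The helper name (qa_check),
  the three-item subset, and the toy data are assumptions; the script's own
  checks may be structured differently.

```r
npast5  <- paste0("NPast_5_",  c("pref_read", "pers_anxious", "val_obey"))
npast10 <- paste0("NPast_10_", c("pref_read", "pers_anxious", "val_obey"))

d <- data.frame(matrix(c(1,  0, 2, 3, 1, NA,
                         0,  0, 0, 2, 2, 2,
                         NA, 1, 1, 0, 4, 1), nrow = 3, byrow = TRUE))
names(d) <- c(npast5, npast10)

# Stored values: one narrow-scope and one global-scope mean.
d$NPast_5_mean      <- rowMeans(d[, npast5], na.rm = TRUE)
d$NPast_global_mean <- rowMeans(d[, c(npast5, npast10)], na.rm = TRUE)

# Hypothetical helper: recompute by hand for the first n rows and compare.
qa_check <- function(data, target, sources, n = 5) {
  rows   <- seq_len(min(n, nrow(data)))
  manual <- apply(data[rows, sources, drop = FALSE], 1, mean, na.rm = TRUE)
  isTRUE(all.equal(unname(manual), data[[target]][rows]))
}

print(qa_check(d, "NPast_5_mean", npast5))
print(qa_check(d, "NPast_global_mean", c(npast5, npast10)))
```

  all.equal() is used instead of == so that floating-point rounding does not
  produce a spurious FALSE.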

SPECIAL NOTES:
  - This script depends on Script 06 being run first
  - All means are averages of absolute difference scores (non-negative)
  - Global means provide the broadest temporal self-perception summaries
  - Naming convention uses "_global_mean" for 30-item means and plain "_mean"
    for 15-item means


================================================================================
SCRIPT 10: dataP 10 - DGEN mean vars.r
================================================================================

PURPOSE:
  Calculates mean DGEN scores by averaging across different time combinations.
  Creates means for Past, Future, and interval-based (5-year, 10-year) groupings.

VARIABLES CREATED: 6 total

SOURCE COLUMNS:
  - DGEN_past_5_Pref, DGEN_past_5_Pers, DGEN_past_5_Val
  - DGEN_past_10_Pref, DGEN_past_10_Pers, DGEN_past_10_Val
  - DGEN_fut_5_Pref, DGEN_fut_5_Pers, DGEN_fut_5_Val
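  - DGEN_fut_10_Pref, DGEN_fut_10_Pers, DGEN_fut_10_Val
  - DGEN_past_5_mean, DGEN_past_10_mean, DGEN_fut_5_mean, DGEN_fut_10_mean
    (created in Script 04; used by the direction- and interval-based means)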

TARGET VARIABLES:
  Direction-Based Means (2 variables):
    - DGEN_past_mean  (mean of past_5_mean and past_10_mean)
    - DGEN_fut_mean   (mean of fut_5_mean and fut_10_mean)
  
  Interval-Based Means (2 variables):
    - DGEN_5_mean     (mean of past_5_mean and fut_5_mean)
    - DGEN_10_mean    (mean of past_10_mean and fut_10_mean)
  
  Domain-Based Means (2 variables):
    - DGEN_pref_mean  (mean across all 4 time periods for Preferences)
    - DGEN_pers_mean  (mean across all 4 time periods for Personality)

TRANSFORMATION LOGIC:
  Direction-based:
    - DGEN_past_mean = mean(DGEN_past_5_mean, DGEN_past_10_mean)
    - DGEN_fut_mean = mean(DGEN_fut_5_mean, DGEN_fut_10_mean)
  
  Interval-based:
    - DGEN_5_mean = mean(DGEN_past_5_mean, DGEN_fut_5_mean)
    - DGEN_10_mean = mean(DGEN_past_10_mean, DGEN_fut_10_mean)
  
  Domain-based:
    - DGEN_pref_mean = mean across all 4 Pref scores
    - DGEN_pers_mean = mean across all 4 Pers scores
  
  NA values excluded from calculation (na.rm = TRUE)
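
ILLUSTRATIVE SKETCH:
  The direction- and interval-based formulas above average two period means
  rather than the six underlying scores. With missing data the two approaches
  can differ, as this one-row toy example shows. It assumes rowMeans() is
  used; the "pooled" value is computed for comparison only and is not a
  pipeline variable.

```r
d <- data.frame(
  DGEN_past_5_Pref  = 1, DGEN_past_5_Pers  = NA_real_,
  DGEN_past_5_Val   = NA_real_,
  DGEN_past_10_Pref = 3, DGEN_past_10_Pers = 3, DGEN_past_10_Val = 3
)
p5  <- c("DGEN_past_5_Pref",  "DGEN_past_5_Pers",  "DGEN_past_5_Val")
p10 <- c("DGEN_past_10_Pref", "DGEN_past_10_Pers", "DGEN_past_10_Val")

d$DGEN_past_5_mean  <- rowMeans(d[, p5],  na.rm = TRUE)   # 1
d$DGEN_past_10_mean <- rowMeans(d[, p10], na.rm = TRUE)   # 3

# As documented: mean of the two period means.
d$DGEN_past_mean <- rowMeans(
  d[, c("DGEN_past_5_mean", "DGEN_past_10_mean")], na.rm = TRUE
)

# For comparison only: pooled mean over the four non-missing scores.
pooled <- rowMeans(d[, c(p5, p10)], na.rm = TRUE)

print(c(mean_of_means = d$DGEN_past_mean, pooled = pooled))
```

  The mean of means weights each time period equally (2 here), whereas the
  pooled mean weights each available score equally (2.5 here).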


================================================================================
SCRIPT 11: dataP 11 - CORRECT ehi vars.r
================================================================================

PURPOSE:
  Creates End of History Illusion (EHI) variables by subtracting each item's
  future difference score (NFut) from its past difference score (NPast) for
  each of three time intervals.
  Formula: NPast - NFut (positive values indicate greater past change than
  future change)

VARIABLES CREATED: 45 total (15 items × 3 time intervals)

SOURCE COLUMNS:
  5-year intervals:
    - NPast_5_pref_read through NPast_5_val_justice (15 columns)
    - NFut_5_pref_read through NFut_5_val_justice (15 columns)
  
  10-year intervals:
    - NPast_10_pref_read through NPast_10_val_justice (15 columns)
    - NFut_10_pref_read through NFut_10_val_justice (15 columns)
  
  5-10 year change:
    - X5.10past_pref_read through X5.10past_val_justice (15 columns)
    - X5.10fut_pref_read through X5.10fut_val_justice (15 columns)

TARGET VARIABLES:
  5-Year EHI Variables (15 variables):
    - ehi5_pref_read, ehi5_pref_music, ehi5_pref_TV, ehi5_pref_nap, 
      ehi5_pref_travel
    - ehi5_pers_extravert, ehi5_pers_critical, ehi5_pers_dependable,
      ehi5_pers_anxious, ehi5_pers_complex
    - ehi5_val_obey, ehi5_val_trad, ehi5_val_opinion, ehi5_val_performance,
      ehi5_val_justice
  
  10-Year EHI Variables (15 variables):
    - ehi10_pref_read, ehi10_pref_music, ehi10_pref_TV, ehi10_pref_nap,
      ehi10_pref_travel
    - ehi10_pers_extravert, ehi10_pers_critical, ehi10_pers_dependable,
      ehi10_pers_anxious, ehi10_pers_complex
    - ehi10_val_obey, ehi10_val_trad, ehi10_val_opinion, ehi10_val_performance,
      ehi10_val_justice
  
  5-10 Year Change EHI Variables (15 variables):
    - ehi5.10_pref_read, ehi5.10_pref_music, ehi5.10_pref_TV, ehi5.10_pref_nap,
      ehi5.10_pref_travel
    - ehi5.10_pers_extravert, ehi5.10_pers_critical, ehi5.10_pers_dependable,
      ehi5.10_pers_anxious, ehi5.10_pers_complex
    - ehi5.10_val_obey, ehi5.10_val_trad, ehi5.10_val_opinion,
      ehi5.10_val_performance, ehi5.10_val_justice

TRANSFORMATION LOGIC:
  Formula: NPast - NFut
  
  All calculations use signed differences:
    - ehi5_[item] = NPast_5_[item] - NFut_5_[item]
    - ehi10_[item] = NPast_10_[item] - NFut_10_[item]
    - ehi5.10_[item] = X5.10past_[item] - X5.10fut_[item]
  
  Result: Positive = greater past change, Negative = greater future change
  Missing values in either source column result in NA
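
ILLUSTRATIVE SKETCH:
  The three signed-difference formulas above can be driven by a small table
  of column prefixes. The table name (prefixes), the two-item subset, and the
  toy values are assumptions made for illustration.

```r
# Target prefix -> source prefixes for the past and future difference scores.
prefixes <- list(
  "ehi5_"    = c(past = "NPast_5_",   fut = "NFut_5_"),
  "ehi10_"   = c(past = "NPast_10_",  fut = "NFut_10_"),
  "ehi5.10_" = c(past = "X5.10past_", fut = "X5.10fut_")
)
items <- c("pref_read", "val_justice")   # subset of the 15 items

d <- data.frame(
  NPast_5_pref_read     = c(2, 0),  NFut_5_pref_read     = c(1, 1),
  NPast_5_val_justice   = c(0, NA), NFut_5_val_justice   = c(2, 1),
  NPast_10_pref_read    = c(3, 1),  NFut_10_pref_read    = c(1, 1),
  NPast_10_val_justice  = c(2, 2),  NFut_10_val_justice  = c(0, 3),
  X5.10past_pref_read   = c(1, 1),  X5.10fut_pref_read   = c(0, 2),
  X5.10past_val_justice = c(2, 0),  X5.10fut_val_justice = c(2, 0)
)

for (target in names(prefixes)) {
  for (item in items) {
    past <- d[[paste0(prefixes[[target]][["past"]], item)]]
    fut  <- d[[paste0(prefixes[[target]][["fut"]],  item)]]
    d[[paste0(target, item)]] <- past - fut   # signed, not absolute
  }
}
print(d[, grep("^ehi", names(d))])
```

  Unlike the Script 06 differences, these values keep their sign, so negative
  results are expected whenever the future difference is the larger one.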

QUALITY ASSURANCE:
  - Comprehensive QA checks for all 45 variables across all rows
  - First 5 rows displayed with detailed calculations showing source values,
    computed differences, and stored values
  - Pass/Fail status for each variable reported


================================================================================
SCRIPT 12: dataP 12 - CORRECT DGEN ehi vars.r
================================================================================

PURPOSE:
  Creates domain-general EHI variables by calculating differences between Past
  and Future DGEN responses. These are the domain-general counterparts of
  Script 11's domain-specific EHI variables.

VARIABLES CREATED: 6 total (3 domains × 2 time intervals)

SOURCE COLUMNS:
  - DGEN_past_5_Pref, DGEN_past_5_Pers, DGEN_past_5_Val
  - DGEN_past_10_Pref, DGEN_past_10_Pers, DGEN_past_10_Val
  - DGEN_fut_5_Pref, DGEN_fut_5_Pers, DGEN_fut_5_Val
  - DGEN_fut_10_Pref, DGEN_fut_10_Pers, DGEN_fut_10_Val

TARGET VARIABLES:
  5-Year DGEN EHI (3 variables):
    - ehiDGEN_5_Pref
    - ehiDGEN_5_Pers
    - ehiDGEN_5_Val
  
  10-Year DGEN EHI (3 variables):
    - ehiDGEN_10_Pref
    - ehiDGEN_10_Pers
    - ehiDGEN_10_Val

TRANSFORMATION LOGIC:
  Formula: DGEN_past - DGEN_fut
  
  All calculations use signed differences:
    - ehiDGEN_5_Pref = DGEN_past_5_Pref - DGEN_fut_5_Pref
    - ehiDGEN_5_Pers = DGEN_past_5_Pers - DGEN_fut_5_Pers
    - ehiDGEN_5_Val = DGEN_past_5_Val - DGEN_fut_5_Val
    - ehiDGEN_10_Pref = DGEN_past_10_Pref - DGEN_fut_10_Pref
    - ehiDGEN_10_Pers = DGEN_past_10_Pers - DGEN_fut_10_Pers
    - ehiDGEN_10_Val = DGEN_past_10_Val - DGEN_fut_10_Val
  
  Result: Positive = greater past change, Negative = greater future change

QUALITY ASSURANCE:
  - QA checks for all 6 variables across all rows
  - First 5 rows displayed with detailed calculations
  - Pass/Fail status for each variable reported


================================================================================
SCRIPT 13: datap 13 - ehi domain specific means.r
================================================================================

PURPOSE:
  Calculates domain-level mean EHI scores by averaging the 5 items within each
  domain (Preferences, Personality, Values) for each time interval.

VARIABLES CREATED: 9 total (3 domains × 3 time intervals)

SOURCE COLUMNS:
  - ehi5_pref_read through ehi5_val_justice (15 columns)
  - ehi10_pref_read through ehi10_val_justice (15 columns)
  - ehi5.10_pref_read through ehi5.10_val_justice (15 columns)

TARGET VARIABLES:
  5-Year Domain Means (3 variables):
    - ehi5_pref_MEAN  (mean of 5 preference items)
    - ehi5_pers_MEAN  (mean of 5 personality items)
    - ehi5_val_MEAN   (mean of 5 values items)
  
  10-Year Domain Means (3 variables):
    - ehi10_pref_MEAN
    - ehi10_pers_MEAN
    - ehi10_val_MEAN
  
  5-10 Year Change Domain Means (3 variables):
    - ehi5.10_pref_MEAN
    - ehi5.10_pers_MEAN
    - ehi5.10_val_MEAN

TRANSFORMATION LOGIC:
  Each domain mean = average of 5 items within that domain
  
  Example for ehi5_pref_MEAN:
    = mean(ehi5_pref_read, ehi5_pref_music, ehi5_pref_TV, 
           ehi5_pref_nap, ehi5_pref_travel)
  
  NA values excluded from calculation (na.rm = TRUE)

QUALITY ASSURANCE:
  - Comprehensive QA for all 9 variables across all rows
  - First 5 rows displayed for multiple domain means
  - Pass/Fail status for each variable


================================================================================
SCRIPT 14: datap 14 - all ehi global means.r
================================================================================

PURPOSE:
  Calculates global EHI means by averaging domain-level means. Creates the
  highest-level summary scores for EHI across both domain-general and
  domain-specific measures.

VARIABLES CREATED: 5 total

SOURCE COLUMNS:
  - ehiDGEN_5_Pref, ehiDGEN_5_Pers, ehiDGEN_5_Val
  - ehiDGEN_10_Pref, ehiDGEN_10_Pers, ehiDGEN_10_Val
  - ehi5_pref_MEAN, ehi5_pers_MEAN, ehi5_val_MEAN
  - ehi10_pref_MEAN, ehi10_pers_MEAN, ehi10_val_MEAN
  - ehi5.10_pref_MEAN, ehi5.10_pers_MEAN, ehi5.10_val_MEAN

TARGET VARIABLES:
  DGEN Global Means (2 variables):
    - ehiDGEN_5_mean   (mean of 3 DGEN domains for 5-year)
    - ehiDGEN_10_mean  (mean of 3 DGEN domains for 10-year)
  
  Domain-Specific Global Means (3 variables):
    - ehi5_global_mean      (mean of 3 domain means for 5-year)
    - ehi10_global_mean     (mean of 3 domain means for 10-year)
    - ehi5.10_global_mean   (mean of 3 domain means for 5-10 change)

TRANSFORMATION LOGIC:
  Each global mean = average of 3 domain-level scores
  
  Example for ehiDGEN_5_mean:
    = mean(ehiDGEN_5_Pref, ehiDGEN_5_Pers, ehiDGEN_5_Val)
  
  Example for ehi5_global_mean:
    = mean(ehi5_pref_MEAN, ehi5_pers_MEAN, ehi5_val_MEAN)
  
  NA values excluded from calculation (na.rm = TRUE)

QUALITY ASSURANCE:
  - QA for all 5 global means across all rows
  - First 5 rows displayed with detailed calculations
  - Values shown with 5 decimal precision
  - Pass/Fail status for each variable


================================================================================
SUMMARY OF ALL CREATED VARIABLES
================================================================================

Total Variables Created: 290

By Script:
  - Script 01: 60 variables (past/future recoded items)
  - Script 02: 15 variables (present recoded items)
  - Script 03: 12 variables (DGEN domain scores)
  - Script 04:  4 variables (DGEN time period means)
  - Script 05:  3 variables (AOT & CRT scales)
  - Script 06: 90 variables (time interval differences)
  - Script 07: 18 variables (domain means for differences)
  - Script 08:  6 variables (DGEN 5-vs-10 differences)
  - Script 09: 11 variables (interval × direction means)
  - Script 10:  6 variables (DGEN combined means)
  - Script 11: 45 variables (domain-specific EHI scores)
  - Script 12:  6 variables (DGEN EHI scores)
  - Script 13:  9 variables (EHI domain means)
  - Script 14:  5 variables (EHI global means)
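
  The per-script counts can be totalled with a short R check. The vector
  `counts` below is simply transcribed from the list above; the object
  names are illustrative.

```r
# Variable counts per script, transcribed from the "By Script" list.
counts <- c(
  "01" = 60, "02" = 15, "03" = 12, "04" = 4,  "05" = 3,
  "06" = 90, "07" = 18, "08" = 6,  "09" = 11, "10" = 6,
  "11" = 45, "12" = 6,  "13" = 9,  "14" = 5
)

total <- sum(counts)
cat("Total variables across scripts:", total, "\n")
```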

By Category:
  - Time Period Items (75 total):
      * Present: 15 items
      * Past 5: 15 items
      * Past 10: 15 items
      * Future 5: 15 items
      * Future 10: 15 items

  - DGEN Variables (28 total):
      * Domain scores: 12 (3 domains × 4 time periods)
      * Time period means: 4 (1 per time period)
      * 5-vs-10 differences: 6 (3 domains × 2 directions)
      * Combined means: 6 (past, future, interval-based, domain-based)

  - Cognitive Scales (3 total):
      * AOT total
      * CRT correct
      * CRT intuitive

  - Time Differences (90 total):
      * NPast_5: 15 differences
      * NPast_10: 15 differences
      * NFut_5: 15 differences
      * NFut_10: 15 differences
      * 5.10past: 15 differences
      * 5.10fut: 15 differences

  - Domain Means for Differences (18 total):
      * NPast_5: 3 domain means
      * NPast_10: 3 domain means
      * NFut_5: 3 domain means
      * NFut_10: 3 domain means
      * 5.10past: 3 domain means
      * 5.10fut: 3 domain means

  - Interval × Direction Means (11 total):
      * Narrow-scope means: 6 (NPast_5, NPast_10, NFut_5, NFut_10, 
                                X5.10past, X5.10fut)
      * Global-scope means: 5 (NPast_global, NFut_global, X5.10_global,
                                N5_global, N10_global)

  - EHI Variables (65 total):
      * Domain-specific EHI: 45 (15 items × 3 time intervals)
      * DGEN EHI: 6 (3 domains × 2 time intervals)
      * Domain means: 9 (3 domains × 3 time intervals)
      * Global means: 5 (2 DGEN + 3 domain-specific)


================================================================================
DATA PROCESSING NOTES
================================================================================

1. PROCESSING ORDER:
   Scripts MUST be run in numerical order (01 → 14) as later scripts depend
   on variables created by earlier scripts.
   
   Key Dependencies:
   - Script 03 required before Script 04, 08, 10, 12 (DGEN scores)
   - Script 04 required before Script 10 (DGEN time period means)
   - Script 06 required before Script 07, 09, 11 (time interval differences)
   - Script 11 required before Script 13 (domain-specific EHI items)
   - Script 12 required before Script 14 (DGEN EHI scores)
   - Script 13 required before Script 14 (EHI domain means)
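
   The dependency list above can be encoded and checked against a proposed
   run order. This is a sketch only: `deps` is transcribed from the list,
   while the helper `order_is_valid()` is hypothetical and not part of the
   pipeline.

```r
# Dependency map transcribed from "Key Dependencies":
# each name is a script, each value the scripts that must run before it.
deps <- list(
  "04" = c("03"),
  "07" = c("06"),
  "08" = c("03"),
  "09" = c("06"),
  "10" = c("03", "04"),
  "11" = c("06"),
  "12" = c("03"),
  "13" = c("11"),
  "14" = c("12", "13")
)

# Hypothetical helper: TRUE if every prerequisite appears earlier in run_order.
order_is_valid <- function(run_order, deps) {
  for (script in names(deps)) {
    pos <- match(script, run_order)
    if (is.na(pos)) next  # script not scheduled, nothing to check
    prereq_pos <- match(deps[[script]], run_order)
    if (any(is.na(prereq_pos)) || any(prereq_pos > pos)) return(FALSE)
  }
  TRUE
}

run_order <- sprintf("%02d", 1:14)      # "01", "02", ..., "14"
order_is_valid(run_order, deps)         # numerical order
order_is_valid(rev(run_order), deps)    # reversed order violates the map
```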

2. SURVEY VERSION HANDLING:
   - Two survey versions (01 and 02) were used
   - Scripts 01 and 03 combine these versions
   - When both versions contain a response, the version 01 value is used
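
   Combining the two versions with version 01 taking priority could be
   sketched as below. The vectors `v01` and `v02` are hypothetical stand-ins
   for a pair of source columns, and treating empty strings as missing is an
   assumption made for the example.

```r
# Hypothetical responses for one item from each survey version.
v01 <- c("Agree", "", NA, "Disagree")
v02 <- c("", "Agree", NA, "Agree")

# Treat empty strings as missing so both kinds of blank behave the same.
v01[!is.na(v01) & v01 == ""] <- NA
v02[!is.na(v02) & v02 == ""] <- NA

# Use version 01 where it has a response; otherwise fall back to version 02.
combined <- ifelse(!is.na(v01), v01, v02)
print(combined)
# [1] "Agree"    "Agree"    NA         "Disagree"
```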

3. MISSING DATA:
   - Empty cells and NA values are preserved throughout processing
   - Means are computed with na.rm = TRUE, so missing values are excluded
     rather than propagated
   - Difference calculations result in NA if either source value is missing
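
   The contrast between means and differences can be seen in a few lines
   of R. The vectors `present` and `past_5` are made-up values, not pipeline
   columns.

```r
# Hypothetical item scores for three respondents.
present <- c(4, NA, 2)
past_5  <- c(3, 5, NA)

# A mean with na.rm = TRUE ignores the missing value.
m <- mean(c(4, 3, NA), na.rm = TRUE)
print(m)
# [1] 3.5

# A difference is NA whenever either operand is NA.
diff_5 <- present - past_5
print(diff_5)
# [1]  1 NA NA
```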

4. QUALITY ASSURANCE:
   - Each script includes QA checks with row verification
   - Manual calculation checks confirm proper transformations
   - Column existence checks prevent errors from missing source data
   - Scripts 09-14 include comprehensive QA with first 5 rows displayed
   - All EHI scripts (11-14) verify calculations against stored values
   - Pass/Fail status reported for all variables in QA-enabled scripts
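
   A column existence check of the kind mentioned above might look like
   this. It is a sketch: the toy data frame and the `required` vector are
   illustrative, and the scripts may report missing columns differently.

```r
# Hypothetical data frame that lacks one of the expected source columns.
df <- data.frame(ehi5_pref_MEAN = 1, ehi5_pers_MEAN = 2)

required <- c("ehi5_pref_MEAN", "ehi5_pers_MEAN", "ehi5_val_MEAN")

# Columns that are required but absent from the data.
missing_cols <- setdiff(required, names(df))

if (length(missing_cols) > 0) {
  message("Missing source columns: ", paste(missing_cols, collapse = ", "))
} else {
  message("All source columns present")
}
```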

5. FILE SAVING:
   - Most scripts, including Scripts 08 and 09, save directly to eohi2.csv
   - Scripts 04, 06, and 07 have their write commands commented out for
     review; if left commented out, dependent scripts (see Key Dependencies)
     may not find the columns they need in eohi2.csv
   - Each script overwrites existing target columns if present
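
   The overwrite-and-save behaviour can be illustrated as follows. This
   sketch writes to a temporary file rather than eohi2.csv, and the column
   values are invented; it only shows that assigning to an existing column
   name replaces that column instead of adding a duplicate.

```r
out_file <- tempfile(fileext = ".csv")   # stand-in for eohi2.csv

# Simulate a file that already contains the target column from an earlier run.
old <- data.frame(id = 1:3, ehi5_global_mean = c(9, 9, 9))
write.csv(old, out_file, row.names = FALSE)

df <- read.csv(out_file)
df$ehi5_global_mean <- c(0.1, 0.2, 0.3)  # replaces the existing column

# In a review-mode script this line would be commented out:
write.csv(df, out_file, row.names = FALSE)

saved <- read.csv(out_file)
print(names(saved))
```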

6. SPECIAL NAMING CONVENTIONS:
   - "pref_tv" vs "pref_TV" inconsistency maintained from source data
   - DGEN variables use underscores (DGEN_past_5_Pref)
   - Difference variables use descriptive prefixes (NPast_5_, 5.10past_)
   - "X" prefix added to variables starting with numbers (X5.10past_mean)
   - Global means use "_global_" to distinguish from narrow-scope means
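
   The "X" prefix matches what R does by default when it makes column names
   syntactically valid. The sketch below assumes that this default name
   checking is the source of the prefix, which the scripts do not state;
   the CSV text is a made-up two-column example.

```r
# make.names() prepends "X" to names that start with a digit.
fixed <- make.names("5.10past_mean")
print(fixed)
# [1] "X5.10past_mean"

# read.csv() applies the same rule unless check.names = FALSE.
csv_text <- "5.10past_mean,NPast_5_mean\n0.5,0.2\n"
checked   <- names(read.csv(text = csv_text))
unchecked <- names(read.csv(text = csv_text, check.names = FALSE))
print(checked)
print(unchecked)
```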


================================================================================
ITEM REFERENCE GUIDE
================================================================================

15 Core Items (Used across all time periods):

PREFERENCES (5 items):
  1. pref_read    - Reading preferences
  2. pref_music   - Music preferences
  3. pref_TV/tv   - TV watching preferences (note case variation)
  4. pref_nap     - Napping preferences
  5. pref_travel  - Travel preferences

PERSONALITY (5 items):
  6. pers_extravert   - Extraverted personality
  7. pers_critical    - Critical personality
  8. pers_dependable  - Dependable personality
  9. pers_anxious     - Anxious personality
 10. pers_complex     - Complex personality

VALUES (5 items):
 11. val_obey         - Value of obedience
 12. val_trad         - Value of tradition
 13. val_opinion      - Value of expressing opinions
 14. val_performance  - Value of performance
 15. val_justice      - Value of justice


================================================================================
END OF DOCUMENTATION
================================================================================
Last Updated: October 8, 2025