diff --git a/docs/PORTRAIT_DETECTION_PLAN.md b/docs/PORTRAIT_DETECTION_PLAN.md new file mode 100644 index 0000000..60b2887 --- /dev/null +++ b/docs/PORTRAIT_DETECTION_PLAN.md @@ -0,0 +1,1480 @@ +# Portrait/Profile Face Detection Plan + +**Version:** 1.0 +**Created:** November 2025 +**Status:** Planning Phase + +--- + +## Executive Summary + +This plan outlines the implementation of automatic face pose detection using RetinaFace directly (not via DeepFace) to identify and mark faces based on their pose/orientation. The system will detect multiple pose modes including profile (yaw), looking up/down (pitch), tilted faces (roll), and their combinations. This enables intelligent filtering in auto-match and other features. + +**Key Benefits:** +- Automatic detection of face pose (yaw, pitch, roll) without user input +- Ability to filter faces by pose mode in auto-match (profile, looking up, tilted, etc.) +- Better face matching accuracy by excluding low-quality or extreme-angle views +- Enhanced user experience with automatic pose classification +- Support for multiple pose modes: profile, looking up/down, tilted, extreme angles + +--- + +## Current State Analysis + +### Current Implementation + +**Face Detection Method:** +- Uses DeepFace library which wraps RetinaFace +- `DeepFace.represent()` provides: `facial_area`, `face_confidence`, `embedding` +- No access to facial landmarks or pose information + +**Data Stored:** +- Face bounding box: `{x, y, w, h}` +- Detection confidence: `face_confidence` +- Face encoding: 512-dimensional embedding +- Quality score: calculated from image properties +- No pose/angle information stored + +**Database Schema:** +```sql +CREATE TABLE faces ( + id INTEGER PRIMARY KEY, + photo_id INTEGER, + person_id INTEGER, + encoding BLOB, + location TEXT, -- JSON: {"x": x, "y": y, "w": w, "h": h} + confidence REAL, + quality_score REAL, + is_primary_encoding BOOLEAN, + detector_backend TEXT, + model_name TEXT, + face_confidence REAL, + exif_orientation INTEGER + -- NO is_portrait field +) +``` + +### Limitations + +1. **No Landmark Access:** DeepFace doesn't expose RetinaFace landmarks +2. **No Pose Estimation:** Cannot calculate yaw, pitch, roll angles +3. **No Profile Classification:** Cannot automatically identify profile faces +4. **Manual Filtering Required:** Users cannot filter profile faces in auto-match + +--- + +## Requirements + +### Functional Requirements + +1. **Automatic Pose Detection:** + - Detect face pose angles (yaw, pitch, roll) during processing + - Classify faces into pose modes: frontal, profile, looking up, looking down, tilted, extreme angles + - Store pose information in database + - No user intervention required + +2. **Pose Mode Classification:** + - **Yaw (left/right):** frontal, profile_left, profile_right, extreme_yaw + - **Pitch (up/down):** looking_up, looking_down, extreme_pitch + - **Roll (tilted):** tilted_left, tilted_right, extreme_roll + - **Combined modes:** e.g., profile_left_looking_up, tilted_profile_right + - Threshold-based classification using pose angles + +3. **Filtering Support:** + - Filter faces by pose mode in auto-match (exclude profile, looking up, tilted, etc.) + - Multiple filter options: exclude profile, exclude extreme angles, exclude specific modes + - Optional filtering in other features (search, identify) + +4. **Clean Database:** + - Starting with fresh database - no migration needed + - All faces will have pose data from the start + +### Technical Requirements + +1. **RetinaFace Direct Integration:** + - Use RetinaFace library directly (not via DeepFace) + - Extract facial landmarks (5 points: eyes, nose, mouth corners) + - Calculate all pose angles (yaw, pitch, roll) from landmarks + +2. **Performance:** + - Minimal performance impact (RetinaFace is already used by DeepFace) + - Reuse existing face detection results where possible + - Angle calculations are fast (< 1ms per face) + +3. **Accuracy:** + - Pose detection accuracy > 90% for clear frontal/profile views + - Handle edge cases (slight angles, extreme angles, occlusions) + - Robust to lighting and image quality variations + +4. **Pose Modes Supported:** + - **Yaw:** Frontal (|yaw| < 30°), Profile Left (yaw < -30°), Profile Right (yaw > 30°), Extreme Yaw (|yaw| > 60°) + - **Pitch:** Level (|pitch| < 20°), Looking Up (pitch > 20°), Looking Down (pitch < -20°), Extreme Pitch (|pitch| > 45°) + - **Roll:** Upright (|roll| < 15°), Tilted Left (roll < -15°), Tilted Right (roll > 15°), Extreme Roll (|roll| > 45°) + +--- + +## Technical Approach + +### Architecture Overview + +``` +┌─────────────────────────────────────────────────────────┐ +│ Face Processing │ +├─────────────────────────────────────────────────────────┤ +│ │ +│ 1. Use RetinaFace directly for face detection │ +│ └─> Returns: bounding box, landmarks, confidence │ +│ │ +│ 2. Calculate pose angles from landmarks │ +│ └─> Yaw (left/right rotation) │ +│ └─> Pitch (up/down tilt) │ +│ └─> Roll (rotation around face axis) │ +│ │ +│ 3. Calculate all pose angles from landmarks │ +│ └─> Yaw (left/right rotation): -90° to +90° │ +│ └─> Pitch (up/down tilt): -90° to +90° │ +│ └─> Roll (rotation around face): -90° to +90° │ +│ │ +│ 4. Classify face pose modes │ +│ └─> Yaw: frontal, profile_left, profile_right │ +│ └─> Pitch: level, looking_up, looking_down │ +│ └─> Roll: upright, tilted_left, tilted_right │ +│ └─> Combined: profile_left_looking_up, etc. │ +│ │ +│ 5. Still use DeepFace for encoding generation │ +│ └─> RetinaFace: detection + landmarks │ +│ └─> DeepFace: encoding generation (ArcFace) │ +│ │ +│ 6. Store pose information in database │ +│ └─> pose_mode TEXT (e.g., "frontal", "profile_left")│ +│ └─> yaw_angle, pitch_angle, roll_angle REAL │ +│ │ +└─────────────────────────────────────────────────────────┘ +``` + +### Pose Estimation from Landmarks + +**RetinaFace Landmarks (5 points):** +- Left eye: `(x1, y1)` +- Right eye: `(x2, y2)` +- Nose: `(x3, y3)` +- Left mouth corner: `(x4, y4)` +- Right mouth corner: `(x5, y5)` + +**Yaw Angle Calculation (Left/Right Rotation):** +```python +# Calculate yaw from eye and nose positions +left_eye = landmarks['left_eye'] +right_eye = landmarks['right_eye'] +nose = landmarks['nose'] + +# Eye midpoint +eye_mid_x = (left_eye[0] + right_eye[0]) / 2 +eye_mid_y = (left_eye[1] + right_eye[1]) / 2 + +# Horizontal offset from nose to eye midpoint +horizontal_offset = nose[0] - eye_mid_x +face_width = abs(right_eye[0] - left_eye[0]) + +# Yaw angle (degrees) +yaw_angle = atan2(horizontal_offset, face_width) * 180 / π +# Negative: face turned left (right profile visible) +# Positive: face turned right (left profile visible) +``` + +**Pitch Angle Calculation (Up/Down Tilt):** +```python +# Calculate pitch from eye and mouth positions +left_eye = landmarks['left_eye'] +right_eye = landmarks['right_eye'] +left_mouth = landmarks['left_mouth'] +right_mouth = landmarks['right_mouth'] +nose = landmarks['nose'] + +# Eye midpoint +eye_mid_y = (left_eye[1] + right_eye[1]) / 2 +# Mouth midpoint +mouth_mid_y = (left_mouth[1] + right_mouth[1]) / 2 +# Nose vertical position +nose_y = nose[1] + +# Expected nose position (between eyes and mouth) +expected_nose_y = eye_mid_y + (mouth_mid_y - eye_mid_y) * 0.6 +face_height = abs(mouth_mid_y - eye_mid_y) + +# Vertical offset +vertical_offset = nose_y - expected_nose_y + +# Pitch angle (degrees) +pitch_angle = atan2(vertical_offset, face_height) * 180 / π +# Positive: looking up +# Negative: looking down +``` + +**Roll Angle Calculation (Rotation Around Face Axis):** +```python +# Calculate roll from eye positions +left_eye = landmarks['left_eye'] +right_eye = landmarks['right_eye'] + +# Calculate angle of eye line +dx = right_eye[0] - left_eye[0] +dy = right_eye[1] - left_eye[1] + +# Roll angle (degrees) +roll_angle = atan2(dy, dx) * 180 / π +# Positive: tilted right (clockwise) +# Negative: tilted left (counterclockwise) +``` + +**Combined Pose Mode Classification:** +```python +# Classify pose mode based on all three angles +def classify_pose_mode(yaw, pitch, roll): + """Classify face pose mode from angles""" + + # Yaw classification + if abs(yaw) < 30: + yaw_mode = "frontal" + elif yaw < -30: + yaw_mode = "profile_right" + elif yaw > 30: + yaw_mode = "profile_left" + else: + yaw_mode = "slight_yaw" + + # Pitch classification + if abs(pitch) < 20: + pitch_mode = "level" + elif pitch > 20: + pitch_mode = "looking_up" + elif pitch < -20: + pitch_mode = "looking_down" + else: + pitch_mode = "slight_pitch" + + # Roll classification + if abs(roll) < 15: + roll_mode = "upright" + elif roll > 15: + roll_mode = "tilted_right" + elif roll < -15: + roll_mode = "tilted_left" + else: + roll_mode = "slight_roll" + + # Combine modes + if yaw_mode == "frontal" and pitch_mode == "level" and roll_mode == "upright": + return "frontal" + else: + return f"{yaw_mode}_{pitch_mode}_{roll_mode}" +``` + +--- + +## Implementation Plan + +### Phase 1: Database Schema Updates + +#### Step 1.1: Add Pose Fields to Database + +**Desktop Database (`src/core/database.py`):** +```python +# Add to faces table +ALTER TABLE faces ADD COLUMN pose_mode TEXT DEFAULT 'frontal'; -- e.g., 'frontal', 'profile_left', 'looking_up', etc. +ALTER TABLE faces ADD COLUMN yaw_angle REAL DEFAULT NULL; -- Yaw angle in degrees +ALTER TABLE faces ADD COLUMN pitch_angle REAL DEFAULT NULL; -- Pitch angle in degrees +ALTER TABLE faces ADD COLUMN roll_angle REAL DEFAULT NULL; -- Roll angle in degrees +``` + +**Web Database (Alembic Migration):** +```python +# Create new Alembic migration +alembic revision -m "add_pose_detection_to_faces" + +# Migration file: alembic/versions/YYYYMMDD_add_pose_to_faces.py +def upgrade(): + # Add pose fields + op.add_column('faces', sa.Column('pose_mode', sa.String(50), + nullable=False, server_default='frontal')) + op.add_column('faces', sa.Column('yaw_angle', sa.Numeric(), nullable=True)) + op.add_column('faces', sa.Column('pitch_angle', sa.Numeric(), nullable=True)) + op.add_column('faces', sa.Column('roll_angle', sa.Numeric(), nullable=True)) + + # Create indices + op.create_index('ix_faces_pose_mode', 'faces', ['pose_mode']) + +def downgrade(): + op.drop_index('ix_faces_pose_mode', table_name='faces') + op.drop_column('faces', 'roll_angle') + op.drop_column('faces', 'pitch_angle') + op.drop_column('faces', 'yaw_angle') + op.drop_column('faces', 'pose_mode') +``` + +**SQLAlchemy Model (`src/web/db/models.py`):** +```python +class Face(Base): + # ... existing fields ... + pose_mode = Column(String(50), default='frontal', nullable=False, index=True) # e.g., 'frontal', 'profile_left' + yaw_angle = Column(Numeric, nullable=True) # Yaw angle in degrees + pitch_angle = Column(Numeric, nullable=True) # Pitch angle in degrees + roll_angle = Column(Numeric, nullable=True) # Roll angle in degrees +``` + +#### Step 1.2: Update Database Methods + +**`src/core/database.py` - `add_face()` method:** +```python +def add_face(self, photo_id: int, encoding: bytes, location: str, + confidence: float = 0.0, quality_score: float = 0.0, + person_id: Optional[int] = None, + detector_backend: str = 'retinaface', + model_name: str = 'ArcFace', + face_confidence: float = 0.0, + pose_mode: str = 'frontal', # Pose mode classification + yaw_angle: Optional[float] = None, # Yaw angle in degrees + pitch_angle: Optional[float] = None, # Pitch angle in degrees + roll_angle: Optional[float] = None) -> int: # Roll angle in degrees + """Add face to database with pose detection""" + cursor.execute(''' + INSERT INTO faces (photo_id, person_id, encoding, location, + confidence, quality_score, is_primary_encoding, + detector_backend, model_name, face_confidence, + pose_mode, yaw_angle, pitch_angle, roll_angle) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) + ''', (photo_id, person_id, encoding, location, confidence, + quality_score, False, detector_backend, model_name, + face_confidence, pose_mode, yaw_angle, pitch_angle, roll_angle)) + return cursor.lastrowid +``` + +--- + +### Phase 2: RetinaFace Direct Integration + +#### Step 2.1: Install/Verify RetinaFace Library + +**Check if RetinaFace is available:** +```python +try: + from retinaface import RetinaFace + RETINAFACE_AVAILABLE = True +except ImportError: + RETINAFACE_AVAILABLE = False + # RetinaFace is typically installed with DeepFace + # If not, install: pip install retina-face +``` + +**Update `requirements.txt`:** +``` +retina-face>=0.0.13 # Already included, but verify version +``` + +#### Step 2.2: Create Pose Detection Utility + +**New file: `src/utils/pose_detection.py`** + +```python +"""Face pose detection (yaw, pitch, roll) using RetinaFace landmarks""" + +import numpy as np +from math import atan2, degrees, pi +from typing import Dict, Tuple, Optional, List + +try: + from retinaface import RetinaFace + RETINAFACE_AVAILABLE = True +except ImportError: + RETINAFACE_AVAILABLE = False + RetinaFace = None + + +class PoseDetector: + """Detect face pose (yaw, pitch, roll) using RetinaFace landmarks""" + + # Thresholds for pose detection (in degrees) + PROFILE_YAW_THRESHOLD = 30.0 # Faces with |yaw| >= 30° are considered profile + EXTREME_YAW_THRESHOLD = 60.0 # Faces with |yaw| >= 60° are extreme profile + + PITCH_THRESHOLD = 20.0 # Faces with |pitch| >= 20° are looking up/down + EXTREME_PITCH_THRESHOLD = 45.0 # Faces with |pitch| >= 45° are extreme + + ROLL_THRESHOLD = 15.0 # Faces with |roll| >= 15° are tilted + EXTREME_ROLL_THRESHOLD = 45.0 # Faces with |roll| >= 45° are extreme + + def __init__(self, + yaw_threshold: float = None, + pitch_threshold: float = None, + roll_threshold: float = None): + """Initialize pose detector + + Args: + yaw_threshold: Yaw angle threshold for profile detection (degrees) + Default: 30.0 + pitch_threshold: Pitch angle threshold for up/down detection (degrees) + Default: 20.0 + roll_threshold: Roll angle threshold for tilt detection (degrees) + Default: 15.0 + """ + if not RETINAFACE_AVAILABLE: + raise RuntimeError("RetinaFace not available") + + self.yaw_threshold = yaw_threshold or self.PROFILE_YAW_THRESHOLD + self.pitch_threshold = pitch_threshold or self.PITCH_THRESHOLD + self.roll_threshold = roll_threshold or self.ROLL_THRESHOLD + + @staticmethod + def detect_faces_with_landmarks(img_path: str) -> Dict: + """Detect faces using RetinaFace directly + + Returns: + Dictionary with face keys and landmark data: + { + 'face_1': { + 'facial_area': {'x': x, 'y': y, 'w': w, 'h': h}, + 'landmarks': { + 'left_eye': (x, y), + 'right_eye': (x, y), + 'nose': (x, y), + 'left_mouth': (x, y), + 'right_mouth': (x, y) + }, + 'confidence': 0.95 + } + } + """ + if not RETINAFACE_AVAILABLE: + return {} + + faces = RetinaFace.detect_faces(img_path) + return faces + + @staticmethod + def calculate_yaw_from_landmarks(landmarks: Dict) -> Optional[float]: + """Calculate yaw angle from facial landmarks + + Args: + landmarks: Dictionary with landmark positions: + { + 'left_eye': (x, y), + 'right_eye': (x, y), + 'nose': (x, y), + 'left_mouth': (x, y), + 'right_mouth': (x, y) + } + + Returns: + Yaw angle in degrees (-90 to +90): + - Negative: face turned left (right profile) + - Positive: face turned right (left profile) + - Zero: frontal face + - None: if landmarks invalid + """ + if not landmarks: + return None + + left_eye = landmarks.get('left_eye') + right_eye = landmarks.get('right_eye') + nose = landmarks.get('nose') + + if not all([left_eye, right_eye, nose]): + return None + + # Calculate eye midpoint + eye_mid_x = (left_eye[0] + right_eye[0]) / 2 + eye_mid_y = (left_eye[1] + right_eye[1]) / 2 + + # Calculate horizontal distance from nose to eye midpoint + nose_x = nose[0] + eye_midpoint_x = eye_mid_x + + # Calculate face width (eye distance) + face_width = abs(right_eye[0] - left_eye[0]) + + if face_width == 0: + return None + + # Calculate horizontal offset + horizontal_offset = nose_x - eye_midpoint_x + + # Calculate yaw angle using atan2 + # Normalize by face width to get angle + yaw_radians = atan2(horizontal_offset, face_width) + yaw_degrees = degrees(yaw_radians) + + return yaw_degrees + + @staticmethod + def calculate_pitch_from_landmarks(landmarks: Dict) -> Optional[float]: + """Calculate pitch angle from facial landmarks (up/down tilt) + + Args: + landmarks: Dictionary with landmark positions + + Returns: + Pitch angle in degrees (-90 to +90): + - Positive: looking up + - Negative: looking down + - None: if landmarks invalid + """ + if not landmarks: + return None + + left_eye = landmarks.get('left_eye') + right_eye = landmarks.get('right_eye') + left_mouth = landmarks.get('left_mouth') + right_mouth = landmarks.get('right_mouth') + nose = landmarks.get('nose') + + if not all([left_eye, right_eye, left_mouth, right_mouth, nose]): + return None + + # Eye midpoint + eye_mid_y = (left_eye[1] + right_eye[1]) / 2 + # Mouth midpoint + mouth_mid_y = (left_mouth[1] + right_mouth[1]) / 2 + # Nose vertical position + nose_y = nose[1] + + # Expected nose position (typically 60% down from eyes to mouth) + expected_nose_y = eye_mid_y + (mouth_mid_y - eye_mid_y) * 0.6 + face_height = abs(mouth_mid_y - eye_mid_y) + + if face_height == 0: + return None + + # Vertical offset from expected position + vertical_offset = nose_y - expected_nose_y + + # Calculate pitch angle + pitch_radians = atan2(vertical_offset, face_height) + pitch_degrees = degrees(pitch_radians) + + return pitch_degrees + + @staticmethod + def calculate_roll_from_landmarks(landmarks: Dict) -> Optional[float]: + """Calculate roll angle from facial landmarks (rotation around face axis) + + Args: + landmarks: Dictionary with landmark positions + + Returns: + Roll angle in degrees (-90 to +90): + - Positive: tilted right (clockwise) + - Negative: tilted left (counterclockwise) + - None: if landmarks invalid + """ + if not landmarks: + return None + + left_eye = landmarks.get('left_eye') + right_eye = landmarks.get('right_eye') + + if not all([left_eye, right_eye]): + return None + + # Calculate angle of eye line + dx = right_eye[0] - left_eye[0] + dy = right_eye[1] - left_eye[1] + + if dx == 0: + return 90.0 if dy > 0 else -90.0 # Vertical line + + # Roll angle + roll_radians = atan2(dy, dx) + roll_degrees = degrees(roll_radians) + + return roll_degrees + + @staticmethod + def classify_pose_mode(yaw: Optional[float], + pitch: Optional[float], + roll: Optional[float]) -> str: + """Classify face pose mode from all three angles + + Args: + yaw: Yaw angle in degrees + pitch: Pitch angle in degrees + roll: Roll angle in degrees + + Returns: + Pose mode classification string: + - 'frontal': frontal, level, upright + - 'profile_left', 'profile_right': profile views + - 'looking_up', 'looking_down': pitch variations + - 'tilted_left', 'tilted_right': roll variations + - Combined modes: e.g., 'profile_left_looking_up' + """ + # Default to frontal if angles unknown + if yaw is None: + yaw = 0.0 + if pitch is None: + pitch = 0.0 + if roll is None: + roll = 0.0 + + # Yaw classification + abs_yaw = abs(yaw) + if abs_yaw < 30.0: + yaw_mode = "frontal" + elif yaw < -30.0: + yaw_mode = "profile_right" + elif yaw > 30.0: + yaw_mode = "profile_left" + else: + yaw_mode = "slight_yaw" + + # Pitch classification + abs_pitch = abs(pitch) + if abs_pitch < 20.0: + pitch_mode = "level" + elif pitch > 20.0: + pitch_mode = "looking_up" + elif pitch < -20.0: + pitch_mode = "looking_down" + else: + pitch_mode = "slight_pitch" + + # Roll classification + abs_roll = abs(roll) + if abs_roll < 15.0: + roll_mode = "upright" + elif roll > 15.0: + roll_mode = "tilted_right" + elif roll < -15.0: + roll_mode = "tilted_left" + else: + roll_mode = "slight_roll" + + # Combine modes - simple case first + if yaw_mode == "frontal" and pitch_mode == "level" and roll_mode == "upright": + return "frontal" + + # Build combined mode string + modes = [] + if yaw_mode != "frontal": + modes.append(yaw_mode) + if pitch_mode != "level": + modes.append(pitch_mode) + if roll_mode != "upright": + modes.append(roll_mode) + + return "_".join(modes) if modes else "frontal" + + def detect_pose_faces(self, img_path: str) -> List[Dict]: + """Detect all faces and classify pose status (all angles) + + Args: + img_path: Path to image file + + Returns: + List of face dictionaries with pose information: + [{ + 'facial_area': {...}, + 'landmarks': {...}, + 'confidence': 0.95, + 'yaw_angle': -45.2, + 'pitch_angle': 10.5, + 'roll_angle': -5.2, + 'pose_mode': 'profile_right_level_upright' + }, ...] + """ + faces = self.detect_faces_with_landmarks(img_path) + + results = [] + for face_key, face_data in faces.items(): + landmarks = face_data.get('landmarks', {}) + + # Calculate all three angles + yaw_angle = self.calculate_yaw_from_landmarks(landmarks) + pitch_angle = self.calculate_pitch_from_landmarks(landmarks) + roll_angle = self.calculate_roll_from_landmarks(landmarks) + + # Classify pose mode + pose_mode = self.classify_pose_mode(yaw_angle, pitch_angle, roll_angle) + + result = { + 'facial_area': face_data.get('facial_area', {}), + 'landmarks': landmarks, + 'confidence': face_data.get('confidence', 0.0), + 'yaw_angle': yaw_angle, + 'pitch_angle': pitch_angle, + 'roll_angle': roll_angle, + 'pose_mode': pose_mode + } + results.append(result) + + return results +``` + +--- + +### Phase 3: Integrate Portrait Detection into Face Processing + +**Important: Backward Compatibility Requirement** +- All pose detection must have graceful fallback to defaults (`frontal`, `None` angles) +- If RetinaFace is unavailable or fails, use defaults and continue processing +- Do not fail face processing if pose detection fails +- See "Backward Compatibility & Graceful Degradation" section for details + +#### Step 3.1: Update Face Processing Pipeline + +**File: `src/core/face_processing.py`** + +**Changes:** +1. Import portrait detection utility +2. Use RetinaFace for detection + landmarks (with graceful fallback) +3. Use DeepFace for encoding generation +4. Store portrait status in database (with defaults if unavailable) + +**Modified `process_faces()` method:** + +```python +from src.utils.pose_detection import PoseDetector, RETINAFACE_AVAILABLE + +class FaceProcessor: + def __init__(self, ...): + # ... existing initialization ... + self.pose_detector = None + if RETINAFACE_AVAILABLE: + try: + self.pose_detector = PoseDetector() + except Exception as e: + print(f"⚠️ Pose detection not available: {e}") + + def process_faces(self, ...): + """Process faces with portrait detection""" + # ... existing code ... + + for photo_id, photo_path, filename in unprocessed_photos: + # Step 1: Use RetinaFace directly for detection + landmarks + pose_faces = [] + if self.pose_detector: + try: + pose_faces = self.pose_detector.detect_pose_faces(photo_path) + except Exception as e: + print(f"⚠️ Pose detection failed for {filename}: {e}") + pose_faces = [] + + # Step 2: Use DeepFace for encoding generation + results = DeepFace.represent( + img_path=face_detection_path, + model_name=self.model_name, + detector_backend=self.detector_backend, + enforce_detection=DEEPFACE_ENFORCE_DETECTION, + align=DEEPFACE_ALIGN_FACES + ) + + # Step 3: Match RetinaFace results with DeepFace results + # Match by facial_area position + for i, deepface_result in enumerate(results): + facial_area = deepface_result.get('facial_area', {}) + + # Find matching RetinaFace result + pose_info = self._find_matching_pose_info( + facial_area, pose_faces + ) + + pose_mode = pose_info.get('pose_mode', 'frontal') + yaw_angle = pose_info.get('yaw_angle') + pitch_angle = pose_info.get('pitch_angle') + roll_angle = pose_info.get('roll_angle') + + # Store face with pose information + face_id = self.db.add_face( + photo_id=photo_id, + encoding=encoding.tobytes(), + location=location_str, + confidence=0.0, + quality_score=quality_score, + person_id=None, + detector_backend=self.detector_backend, + model_name=self.model_name, + face_confidence=face_confidence, + pose_mode=pose_mode, + yaw_angle=yaw_angle, + pitch_angle=pitch_angle, + roll_angle=roll_angle + ) + + def _find_matching_pose_info(self, facial_area: Dict, + pose_faces: List[Dict]) -> Dict: + """Match DeepFace result with RetinaFace pose detection result + + Args: + facial_area: DeepFace facial_area {'x': x, 'y': y, 'w': w, 'h': h} + pose_faces: List of RetinaFace detection results with pose info + + Returns: + Dictionary with pose information, or defaults + """ + # Match by bounding box overlap + # Simple approach: find closest match by center point + if not pose_faces: + return { + 'pose_mode': 'frontal', + 'yaw_angle': None, + 'pitch_angle': None, + 'roll_angle': None + } + + deepface_center_x = facial_area.get('x', 0) + facial_area.get('w', 0) / 2 + deepface_center_y = facial_area.get('y', 0) + facial_area.get('h', 0) / 2 + + best_match = None + min_distance = float('inf') + + for pose_face in pose_faces: + pose_area = pose_face.get('facial_area', {}) + pose_center_x = (pose_area.get('x', 0) + + pose_area.get('w', 0) / 2) + pose_center_y = (pose_area.get('y', 0) + + pose_area.get('h', 0) / 2) + + # Calculate distance between centers + distance = ((deepface_center_x - pose_center_x) ** 2 + + (deepface_center_y - pose_center_y) ** 2) ** 0.5 + + if distance < min_distance: + min_distance = distance + best_match = pose_face + + # If match is close enough (within 50 pixels), use it + if best_match and min_distance < 50: + return { + 'pose_mode': best_match.get('pose_mode', 'frontal'), + 'yaw_angle': best_match.get('yaw_angle'), + 'pitch_angle': best_match.get('pitch_angle'), + 'roll_angle': best_match.get('roll_angle') + } + + return { + 'pose_mode': 'frontal', + 'yaw_angle': None, + 'pitch_angle': None, + 'roll_angle': None + } +``` + +#### Step 3.2: Update Web Face Service + +**File: `src/web/services/face_service.py`** + +Similar changes to integrate portrait detection in web service: + +```python +from src.utils.pose_detection import PoseDetector, RETINAFACE_AVAILABLE + +def process_photo_faces(...): + """Process faces with pose detection""" + # ... existing code ... + + # Step 1: Detect faces with RetinaFace for landmarks + pose_detector = None + pose_faces = [] + if RETINAFACE_AVAILABLE: + try: + pose_detector = PoseDetector() + pose_faces = pose_detector.detect_pose_faces(photo_path) + except Exception as e: + print(f"[FaceService] Pose detection failed: {e}") + + # Step 2: Use DeepFace for encoding + results = DeepFace.represent(...) + + # Step 3: Match and store + for idx, result in enumerate(results): + # ... existing processing ... + + # Match pose info + pose_info = _find_matching_pose_info( + facial_area, pose_faces + ) + + # Store face + face = Face( + # ... existing fields ... + pose_mode=pose_info.get('pose_mode', 'frontal'), + yaw_angle=pose_info.get('yaw_angle'), + pitch_angle=pose_info.get('pitch_angle'), + roll_angle=pose_info.get('roll_angle') + ) +``` + +--- + +### Phase 4: Update Auto-Match Filtering + +#### Step 4.1: Add Portrait Filter to Auto-Match + +**File: `src/web/services/face_service.py` - `find_auto_match_matches()`** + +```python +def find_auto_match_matches( + db: Session, + tolerance: float = 0.6, + exclude_portraits: bool = True, # NEW parameter +) -> List[Tuple[int, int, Face, List[Tuple[Face, float, float]]]]: + """Find auto-match matches with optional portrait filtering""" + + # ... existing code to get identified faces ... + + # For each person, find similar faces + for person_id, reference_face in person_faces.items(): + # Find similar faces + similar_faces = find_similar_faces( + db, reference_face.id, limit=100, tolerance=tolerance + ) + + # Filter out portraits/extreme angles if requested + if exclude_portraits: + similar_faces = [ + (face, distance, confidence_pct) + for face, distance, confidence_pct in similar_faces + if face.pose_mode == 'frontal' # Filter non-frontal faces + ] + + # ... rest of matching logic ... +``` + +#### Step 4.2: Update Auto-Match API + +**File: `src/web/api/faces.py`** + +```python +@router.post("/auto-match", response_model=AutoMatchResponse) +def auto_match_faces( + request: AutoMatchRequest, + db: Session = Depends(get_db), +) -> AutoMatchResponse: + """Auto-match with portrait filtering option""" + + # Get exclude_portraits from request (default: True) + exclude_portraits = getattr(request, 'exclude_portraits', True) + + matches = find_auto_match_matches( + db, + tolerance=request.tolerance, + exclude_portraits=exclude_portraits, # NEW + ) + # ... rest of logic ... +``` + +**File: `src/web/schemas/faces.py`** + +```python +class AutoMatchRequest(BaseModel): + tolerance: float = Field(0.6, ge=0.0, le=1.0) + exclude_portraits: bool = Field(True, description="Exclude portrait/profile faces from matching") # NEW + exclude_pose_modes: List[str] = Field([], description="Exclude specific pose modes (e.g., ['looking_up', 'tilted'])") # NEW + exclude_extreme_angles: bool = Field(True, description="Exclude extreme angle faces (|yaw|>60°, |pitch|>45°, |roll|>45°)") # NEW +``` + +#### Step 4.3: Update Frontend Auto-Match UI + +**File: `frontend/src/pages/AutoMatch.tsx`** + +```typescript +// Add checkbox for excluding portraits +const [excludePortraits, setExcludePortraits] = useState(true) + +const startAutoMatch = async () => { + // ... + const response = await facesApi.autoMatch({ + tolerance, + exclude_portraits: excludePortraits // NEW + }) + // ... +} + +// Add UI control +
+ +
+``` + +--- + +### Phase 5: Update Identify Panel Filtering + +**File: `src/web/services/face_service.py` - `find_similar_faces()`** + +```python +def find_similar_faces( + db: Session, + face_id: int, + limit: int = 20, + tolerance: float = 0.6, + exclude_portraits: bool = False, # NEW: optional filtering + exclude_pose_modes: List[str] = None, # NEW: exclude specific pose modes + exclude_extreme_angles: bool = False, # NEW: exclude extreme angles +) -> List[Tuple[Face, float, float]]: + """Find similar faces with optional pose filtering""" + + # ... existing matching logic ... + + # Filter by pose if requested + if exclude_portraits or exclude_pose_modes or exclude_extreme_angles: + filtered_matches = [] + for face, distance, confidence_pct in matches: + # Exclude portraits (non-frontal faces) + if exclude_portraits and face.pose_mode != 'frontal': + continue + + # Exclude specific pose modes + if exclude_pose_modes and face.pose_mode in exclude_pose_modes: + continue + + # Exclude extreme angles + if exclude_extreme_angles: + yaw = abs(face.yaw_angle) if face.yaw_angle else 0 + pitch = abs(face.pitch_angle) if face.pitch_angle else 0 + roll = abs(face.roll_angle) if face.roll_angle else 0 + if yaw > 60 or pitch > 45 or roll > 45: + continue + + filtered_matches.append((face, distance, confidence_pct)) + + matches = filtered_matches + + return matches[:limit] +``` + +--- + +### Phase 6: Testing Strategy + +#### Unit Tests + +**New file: `tests/test_pose_detection.py`** + +```python +import pytest +from src.utils.pose_detection import ( + PoseDetector, + calculate_yaw_from_landmarks, + calculate_pitch_from_landmarks, + calculate_roll_from_landmarks, + classify_pose_mode +) + +def test_pose_detector_initialization(): + """Test pose detector can be initialized""" + detector = PoseDetector() + assert detector.yaw_threshold == 30.0 + assert detector.pitch_threshold == 20.0 + assert detector.roll_threshold == 15.0 + +def test_yaw_calculation(): + """Test yaw angle calculation from landmarks""" + # Frontal face landmarks + frontal_landmarks = { + 'left_eye': (100, 100), + 'right_eye': (200, 100), + 'nose': (150, 150), + 'left_mouth': (120, 200), + 'right_mouth': (180, 200) + } + yaw = calculate_yaw_from_landmarks(frontal_landmarks) + assert abs(yaw) < 30.0, "Frontal face should have low yaw" + + # Profile face landmarks (face turned right) + profile_landmarks = { + 'left_eye': (150, 100), + 'right_eye': (200, 100), + 'nose': (180, 150), # Nose shifted right + 'left_mouth': (160, 200), + 'right_mouth': (190, 200) + } + yaw = calculate_yaw_from_landmarks(profile_landmarks) + assert abs(yaw) >= 30.0, "Profile face should have high yaw" + +def test_pitch_calculation(): + """Test pitch angle calculation from landmarks""" + # Level face landmarks + level_landmarks = { + 'left_eye': (100, 100), + 'right_eye': (200, 100), + 'nose': (150, 150), # Normal nose position + 'left_mouth': (120, 200), + 'right_mouth': (180, 200) + } + pitch = calculate_pitch_from_landmarks(level_landmarks) + assert abs(pitch) < 20.0, "Level face should have low pitch" + + # Looking up landmarks (nose higher) + looking_up_landmarks = { + 'left_eye': (100, 100), + 'right_eye': (200, 100), + 'nose': (150, 120), # Nose higher than expected + 'left_mouth': (120, 200), + 'right_mouth': (180, 200) + } + pitch = calculate_pitch_from_landmarks(looking_up_landmarks) + assert pitch > 20.0, "Looking up should have positive pitch" + +def test_roll_calculation(): + """Test roll angle calculation from landmarks""" + # Upright face landmarks + upright_landmarks = { + 'left_eye': (100, 100), + 'right_eye': (200, 100), # Eyes level + 'nose': (150, 150), + 'left_mouth': (120, 200), + 'right_mouth': (180, 200) + } + roll = calculate_roll_from_landmarks(upright_landmarks) + assert abs(roll) < 15.0, "Upright face should have low roll" + + # Tilted face landmarks + tilted_landmarks = { + 'left_eye': (100, 100), + 'right_eye': (200, 120), # Right eye lower (tilted right) + 'nose': (150, 150), + 'left_mouth': (120, 200), + 'right_mouth': (180, 200) + } + roll = calculate_roll_from_landmarks(tilted_landmarks) + assert abs(roll) >= 15.0, "Tilted face should have high roll" + +def test_pose_mode_classification(): + """Test pose mode classification""" + # Frontal face + mode = classify_pose_mode(10.0, 5.0, 3.0) + assert mode == 'frontal', "Should classify as frontal" + + # Profile left + mode = classify_pose_mode(45.0, 5.0, 3.0) + assert 'profile_left' in mode, "Should classify as profile_left" + + # Looking up + mode = classify_pose_mode(10.0, 30.0, 3.0) + assert 'looking_up' in mode, "Should classify as looking_up" + + # Tilted right + mode = classify_pose_mode(10.0, 5.0, 25.0) + assert 'tilted_right' in mode, "Should classify as tilted_right" + + # Combined mode + mode = classify_pose_mode(45.0, 30.0, 25.0) + assert 'profile_left' in mode and 'looking_up' in mode, "Should have combined mode" + +``` + +#### Integration Tests + +**Test pose detection in face processing pipeline:** +1. Process test images with frontal faces → `pose_mode = 'frontal'` +2. Process test images with profile faces → `pose_mode = 'profile_left'` or `'profile_right'` +3. Process test images with looking up faces → `pose_mode = 'looking_up'` +4. Process test images with tilted faces → `pose_mode = 'tilted_left'` or `'tilted_right'` +5. Process test images with combined poses → `pose_mode = 'profile_left_looking_up'`, etc. +6. Verify pose information (pose_mode, angles) is stored correctly +7. Test auto-match filtering excludes portraits, extreme angles, and specific pose modes + +--- + + +--- + +## Implementation Checklist + +### Phase 1: Database Schema +- [ ] Create Alembic migration for pose fields (`pose_mode`, `yaw_angle`, `pitch_angle`, `roll_angle`) +- [ ] Update desktop database schema (`src/core/database.py`) +- [ ] Update SQLAlchemy model (`src/web/db/models.py`) +- [ ] Update `DatabaseManager.add_face()` method signature +- [ ] Run migration on test database +- [ ] Verify schema changes + +### Phase 2: Pose Detection Utility +- [ ] Create `src/utils/pose_detection.py` +- [ ] Implement `PoseDetector` class +- [ ] Implement landmark-based yaw calculation +- [ ] Implement landmark-based pitch calculation +- [ ] Implement landmark-based roll calculation +- [ ] Implement pose mode classification logic +- [ ] Write unit tests for pose detection (yaw, pitch, roll) +- [ ] Test with sample images (frontal, profile, looking up/down, tilted, extreme) + +### Phase 3: Face Processing Integration +- [ ] Update `src/core/face_processing.py` to use RetinaFace directly +- [ ] Integrate pose detection into processing pipeline **with graceful fallback** +- [ ] Implement face matching logic (RetinaFace ↔ DeepFace) **with defaults if matching fails** +- [ ] Update `src/web/services/face_service.py` **with graceful fallback** +- [ ] Test processing with mixed pose faces (frontal, profile, looking up/down, tilted) +- [ ] Verify pose information in database (pose_mode, angles) +- [ ] **Test backward compatibility: verify processing continues if RetinaFace unavailable** +- [ ] **Test error handling: verify processing continues if pose detection fails** + +### Phase 4: Auto-Match Filtering +- [ ] Add pose filtering parameters to auto-match functions (`exclude_portraits`, `exclude_pose_modes`, `exclude_extreme_angles`) +- [ ] Update auto-match API endpoint +- [ ] Update auto-match schema +- [ ] Update frontend auto-match UI with pose filtering options +- [ ] Test auto-match with various pose filtering options enabled/disabled + +### Phase 5: Identify Panel Filtering +- [ ] Add optional pose filtering to similar faces +- [ ] Update identify API with pose filtering options (optional) +- [ ] Test identify panel with pose filtering + +### Phase 6: Testing +- [ ] Write unit tests for pose detection (yaw, pitch, roll) +- [ ] Write integration tests for face processing with pose detection +- [ ] Write tests for auto-match filtering (all pose modes) +- [ ] Test with real-world images (frontal, profile, looking up/down, tilted, extreme) +- [ ] Performance testing (ensure minimal overhead) +- [ ] Accuracy testing (verify > 90% correct pose classification) +- [ ] **Backward compatibility testing: test with existing databases (add columns, verify queries work)** +- [ ] **Graceful degradation testing: test with RetinaFace unavailable (should use defaults)** +- [ ] **Error handling testing: test with RetinaFace errors (should use defaults, not fail)** +- [ ] **Verify existing `add_face()` calls still work without pose parameters** +- [ ] **Verify face matching still works without pose data** + +### Phase 7: Documentation +- [ ] Update README with pose detection feature +- [ ] Document pose modes and filtering options +- [ ] Update API documentation with pose filtering parameters +- [ ] Create migration guide (if needed) +- [ ] Document pose mode classifications and thresholds + +--- + +## Performance Considerations + +### Expected Overhead + +1. **Additional RetinaFace Call:** + - RetinaFace is already used by DeepFace internally + - Direct call adds ~10-50ms per image (depending on image size) + - Can be optimized by caching results + +2. **Landmark Processing:** + - Yaw calculation is very fast (< 1ms per face) + - Negligible performance impact + +3. **Database:** + - New pose_mode field: text string (50 chars max, ~50 bytes per face) + - Optional yaw_angle, pitch_angle, roll_angle: 8 bytes each if stored + - Index on `pose_mode` for fast filtering + +### Optimization Strategies + +1. **Cache RetinaFace Results:** + - Store RetinaFace detection results temporarily + - Reuse for both DeepFace and pose detection + +2. **Parallel Processing:** + - Run RetinaFace and DeepFace in parallel (if possible) + - Combine results afterwards + +3. **Lazy Evaluation:** + - Only run pose detection if explicitly requested + - Make it optional via configuration + +--- + +## Configuration Options + +### Add to `src/core/config.py`: + +```python +# Pose Detection Settings +ENABLE_POSE_DETECTION = True # Enable/disable pose detection +POSE_YAW_THRESHOLD = 30.0 # Yaw angle threshold for profile detection (degrees) +POSE_PITCH_THRESHOLD = 20.0 # Pitch angle threshold for up/down detection (degrees) +POSE_ROLL_THRESHOLD = 15.0 # Roll angle threshold for tilt detection (degrees) +STORE_POSE_ANGLES = True # Store yaw/pitch/roll angles in database (optional) +EXCLUDE_NON_FRONTAL_IN_AUTOMATCH = True # Default auto-match behavior (exclude non-frontal faces) +EXCLUDE_EXTREME_ANGLES_IN_AUTOMATCH = True # Exclude extreme angles by default +``` + +--- + +## Success Criteria + +1. ✅ Face poses (yaw, pitch, roll) are automatically detected during processing +2. ✅ Pose information (pose_mode, angles) is stored in database for all faces +3. ✅ Auto-match can filter faces by pose mode (profile, looking up/down, tilted, extreme angles) +4. ✅ Performance impact is minimal (< 10% processing time increase) +5. ✅ Accuracy: > 90% correct classification of pose modes (frontal, profile, looking up/down, tilted) +6. ✅ Support for combined pose modes (e.g., profile_left_looking_up_tilted_right) +7. ✅ Clean database implementation - all faces have pose data from the start + +--- + +## Risks and Mitigation + +### Risk 1: False Positives/Negatives +- **Risk:** Profile detection may misclassify some faces +- **Mitigation:** Tune threshold based on testing, allow manual override + +### Risk 2: Performance Impact +- **Risk:** Additional RetinaFace call slows processing +- **Mitigation:** Optimize by caching results, make it optional + +### Risk 3: RetinaFace Dependency +- **Risk:** RetinaFace may not be available or may fail +- **Mitigation:** Graceful fallback to default (pose_mode = 'frontal') + +### Risk 4: Matching Accuracy +- **Risk:** Matching RetinaFace and DeepFace results may be inaccurate +- **Mitigation:** Use bounding box overlap for matching, test thoroughly + +### Risk 5: Clean Database Requirements +- **Risk:** Database will be wiped - all existing data will be lost +- **Mitigation:** This is intentional - plan assumes fresh database start + +--- + +## Backward Compatibility & Graceful Degradation + +### Make Pose Detection Optional with Graceful Fallback + +To ensure existing functionality continues to work without disruption, pose detection must be implemented with graceful fallback mechanisms: + +#### 1. **RetinaFace Availability Check** +- Check if RetinaFace is available before attempting pose detection +- If RetinaFace is not available, skip pose detection and use defaults +- Log warnings but do not fail face processing + +```python +# In face_processing.py +if RETINAFACE_AVAILABLE: + try: + pose_faces = self.pose_detector.detect_pose_faces(photo_path) + except Exception as e: + print(f"⚠️ Pose detection failed: {e}, using defaults") + pose_faces = [] # Fallback to defaults +else: + pose_faces = [] # RetinaFace not available, use defaults +``` + +#### 2. **Default Values for Missing Pose Data** +- If pose detection fails or is unavailable, use safe defaults: + - `pose_mode = 'frontal'` (assumes frontal face) + - `yaw_angle = None` + - `pitch_angle = None` + - `roll_angle = None` +- All existing faces without pose data will default to `'frontal'` +- New faces processed without pose detection will also use defaults + +#### 3. **Database Schema Defaults** +- All new columns have default values: + - `pose_mode TEXT DEFAULT 'frontal'` (NOT NULL with default) + - `yaw_angle REAL DEFAULT NULL` (nullable) + - `pitch_angle REAL DEFAULT NULL` (nullable) + - `roll_angle REAL DEFAULT NULL` (nullable) +- Existing queries will continue to work (NULL values for angles are acceptable) +- Existing faces will automatically get `pose_mode = 'frontal'` when schema is updated + +#### 4. **Method Signature Compatibility** +- `add_face()` method signature adds new optional parameters with defaults: + ```python + def add_face(self, ..., + pose_mode: str = 'frontal', # Default value + yaw_angle: Optional[float] = None, # Optional + pitch_angle: Optional[float] = None, # Optional + roll_angle: Optional[float] = None) # Optional + ``` +- All existing calls to `add_face()` will continue to work without modification +- New parameters are optional and backward compatible + +#### 5. **Error Handling in Face Processing** +- If RetinaFace detection fails for a specific photo: + - Log the error but continue processing + - Use default pose values (`frontal`, `None` angles) + - Do not fail the entire photo processing +- If pose matching between RetinaFace and DeepFace fails: + - Use default pose values + - Log warning but continue processing + +#### 6. **Configuration Flag (Optional)** +- Add configuration option to enable/disable pose detection: + ```python + # In config.py + ENABLE_POSE_DETECTION = True # Can be disabled if needed + ``` +- If disabled, skip RetinaFace calls entirely and use defaults +- Allows users to disable feature if experiencing issues + +#### 7. **Graceful Degradation Benefits** +- **Existing functionality preserved:** All current features continue to work +- **No breaking changes:** Database queries, face matching, auto-match all work +- **Progressive enhancement:** Pose detection adds value when available, but doesn't break when unavailable +- **Performance fallback:** If RetinaFace is slow or unavailable, processing continues without pose data + +#### 8. **Testing Backward Compatibility** +- Test with existing databases (add columns, verify queries still work) +- Test with RetinaFace unavailable (should use defaults) +- Test with RetinaFace errors (should use defaults, not fail) +- Verify existing `add_face()` calls still work +- Verify face matching still works without pose data + +--- + +## Future Enhancements + +1. **Landmark Visualization:** + - Show landmarks in UI for debugging + - Visualize pose angles (yaw, pitch, roll) + +2. **Advanced Filtering:** + - Filter by specific angle ranges (e.g., yaw between -45° and 45°) + - Filter by individual pose modes (e.g., only profile_left, exclude looking_up) + - Custom pose mode combinations + +3. **Quality-Based Filtering:** + - Combine pose information with quality score + - Filter low-quality faces with extreme angles + - Prefer frontal faces for matching + +4. **Pose Estimation Refinement:** + - Use more sophisticated algorithms (e.g., 3D face model fitting) + - Improve accuracy for edge cases + - Handle occluded faces better + +5. **UI Enhancements:** + - Display pose information in face details + - Visual indicators for pose modes (icons, colors) + - Pose-based photo organization + +--- + +## Timeline Estimate + +- **Phase 1 (Database):** 1-2 days +- **Phase 2 (Pose Detection Utility):** 3-4 days (includes yaw, pitch, roll calculations) +- **Phase 3 (Face Processing Integration):** 3-4 days +- **Phase 4 (Auto-Match Filtering):** 2-3 days (includes all pose filtering options) +- **Phase 5 (Identify Panel):** 1-2 days +- **Phase 6 (Testing):** 3-4 days (testing all pose modes) +- **Phase 7 (Documentation):** 1-2 days + +**Total Estimate:** 14-21 days + +--- + +## Conclusion + +This plan provides a comprehensive approach to implementing automatic face pose detection (yaw, pitch, roll) using RetinaFace directly. The implementation will enable automatic classification of face poses into multiple modes (frontal, profile, looking up/down, tilted, and combinations) and intelligent filtering in auto-match and other features. + +**Key Features:** +- **Multiple Pose Modes:** Detects yaw (profile), pitch (looking up/down), and roll (tilted) angles +- **Combined Classifications:** Supports combined pose modes (e.g., profile_left_looking_up_tilted_right) +- **Flexible Filtering:** Multiple filtering options (exclude portraits, exclude specific pose modes, exclude extreme angles) +- **Clean Database Design:** All faces have pose data from the start - no migration needed +- **Performance Optimized:** Minimal overhead with efficient angle calculations + +The phased approach ensures incremental progress with testing at each stage, minimizing risk and allowing for adjustments based on real-world testing results. This comprehensive pose detection system will significantly improve face matching accuracy and user experience by intelligently filtering out low-quality or difficult-to-match face poses. + diff --git a/src/core/database.py b/src/core/database.py index 5dfe2d7..0b1ed7b 100644 --- a/src/core/database.py +++ b/src/core/database.py @@ -74,7 +74,7 @@ class DatabaseManager: ) ''') - # Faces table (updated for DeepFace) + # Faces table (updated for DeepFace and pose detection) cursor.execute(''' CREATE TABLE IF NOT EXISTS faces ( id INTEGER PRIMARY KEY AUTOINCREMENT, @@ -89,11 +89,36 @@ class DatabaseManager: model_name TEXT DEFAULT 'ArcFace', face_confidence REAL DEFAULT 0.0, exif_orientation INTEGER DEFAULT NULL, + pose_mode TEXT DEFAULT 'frontal', + yaw_angle REAL DEFAULT NULL, + pitch_angle REAL DEFAULT NULL, + roll_angle REAL DEFAULT NULL, FOREIGN KEY (photo_id) REFERENCES photos (id), FOREIGN KEY (person_id) REFERENCES people (id) ) ''') + # Add pose fields if they don't exist (for existing databases) + try: + cursor.execute('ALTER TABLE faces ADD COLUMN pose_mode TEXT DEFAULT "frontal"') + except sqlite3.OperationalError: + pass # Column already exists + + try: + cursor.execute('ALTER TABLE faces ADD COLUMN yaw_angle REAL DEFAULT NULL') + except sqlite3.OperationalError: + pass # Column already exists + + try: + cursor.execute('ALTER TABLE faces ADD COLUMN pitch_angle REAL DEFAULT NULL') + except sqlite3.OperationalError: + pass # Column already exists + + try: + cursor.execute('ALTER TABLE faces ADD COLUMN roll_angle REAL DEFAULT NULL') + except sqlite3.OperationalError: + pass # Column already exists + # Person encodings table for multiple encodings per person (updated for DeepFace) cursor.execute(''' CREATE TABLE IF NOT EXISTS person_encodings ( @@ -143,6 +168,7 @@ class DatabaseManager: cursor.execute('CREATE INDEX IF NOT EXISTS idx_person_encodings_quality ON person_encodings(quality_score)') cursor.execute('CREATE INDEX IF NOT EXISTS idx_photos_date_taken ON photos(date_taken)') cursor.execute('CREATE INDEX IF NOT EXISTS idx_photos_date_added ON photos(date_added)') + cursor.execute('CREATE INDEX IF NOT EXISTS idx_faces_pose_mode ON faces(pose_mode)') @@ -233,7 +259,11 @@ class DatabaseManager: detector_backend: str = 'retinaface', model_name: str = 'ArcFace', face_confidence: float = 0.0, - exif_orientation: Optional[int] = None) -> int: + exif_orientation: Optional[int] = None, + pose_mode: str = 'frontal', + yaw_angle: Optional[float] = None, + pitch_angle: Optional[float] = None, + roll_angle: Optional[float] = None) -> int: """Add a face to the database and return its ID Args: @@ -247,6 +277,10 @@ class DatabaseManager: model_name: DeepFace model used (ArcFace, Facenet, etc.) face_confidence: Confidence from DeepFace detector exif_orientation: EXIF orientation value (1-8) for coordinate transformation + pose_mode: Pose mode classification (e.g., 'frontal', 'profile_left', 'looking_up') + yaw_angle: Yaw angle in degrees (left/right rotation) + pitch_angle: Pitch angle in degrees (up/down tilt) + roll_angle: Roll angle in degrees (rotation around face axis) Returns: Face ID @@ -255,10 +289,12 @@ class DatabaseManager: cursor = conn.cursor() cursor.execute(''' INSERT INTO faces (photo_id, person_id, encoding, location, confidence, - quality_score, detector_backend, model_name, face_confidence, exif_orientation) - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?) + quality_score, detector_backend, model_name, face_confidence, + exif_orientation, pose_mode, yaw_angle, pitch_angle, roll_angle) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) ''', (photo_id, person_id, encoding, location, confidence, quality_score, - detector_backend, model_name, face_confidence, exif_orientation)) + detector_backend, model_name, face_confidence, exif_orientation, + pose_mode, yaw_angle, pitch_angle, roll_angle)) return cursor.lastrowid def update_face_person(self, face_id: int, person_id: Optional[int]): diff --git a/src/core/face_processing.py b/src/core/face_processing.py index 394e849..32fc81a 100644 --- a/src/core/face_processing.py +++ b/src/core/face_processing.py @@ -35,6 +35,7 @@ from src.core.config import ( ) from src.core.database import DatabaseManager from src.utils.exif_utils import EXIFOrientationHandler +from src.utils.pose_detection import PoseDetector, RETINAFACE_AVAILABLE class FaceProcessor: @@ -59,6 +60,21 @@ class FaceProcessor: self._face_encoding_cache = {} self._image_cache = {} + # Initialize pose detector with graceful fallback + self.pose_detector = None + if RETINAFACE_AVAILABLE: + try: + self.pose_detector = PoseDetector() + if self.verbose >= 2: + print(f" Pose detection: enabled") + except Exception as e: + if self.verbose >= 1: + print(f"⚠️ Pose detection not available: {e}") + self.pose_detector = None + else: + if self.verbose >= 2: + print(f" Pose detection: RetinaFace not available") + if self.verbose >= 2: print(f"🔧 FaceProcessor initialized:") print(f" Detector: {self.detector_backend}") @@ -176,7 +192,19 @@ class FaceProcessor: # Use original image if no correction needed face_detection_path = photo_path - # Use DeepFace.represent() to get face detection and encodings + # Step 1: Use RetinaFace directly for detection + landmarks (with graceful fallback) + pose_faces = [] + if self.pose_detector: + try: + pose_faces = self.pose_detector.detect_pose_faces(face_detection_path) + if self.verbose >= 2 and pose_faces: + print(f" 📐 Pose detection: found {len(pose_faces)} faces with pose data") + except Exception as e: + if self.verbose >= 1: + print(f"⚠️ Pose detection failed for {filename}: {e}, using defaults") + pose_faces = [] + + # Step 2: Use DeepFace for encoding generation deepface_start_time = time.time() results = DeepFace.represent( img_path=face_detection_path, @@ -256,7 +284,14 @@ class FaceProcessor: image_np = np.array(image) quality_score = self._calculate_face_quality_score(image_np, face_location_dict) - # Store in database with DeepFace format and EXIF orientation + # Step 3: Match RetinaFace results with DeepFace results + pose_info = self._find_matching_pose_info(facial_area, pose_faces) + pose_mode = pose_info.get('pose_mode', 'frontal') + yaw_angle = pose_info.get('yaw_angle') + pitch_angle = pose_info.get('pitch_angle') + roll_angle = pose_info.get('roll_angle') + + # Store in database with DeepFace format, EXIF orientation, and pose data self.db.add_face( photo_id=photo_id, encoding=embedding.tobytes(), @@ -267,7 +302,11 @@ class FaceProcessor: detector_backend=self.detector_backend, model_name=self.model_name, face_confidence=face_confidence, - exif_orientation=exif_orientation + exif_orientation=exif_orientation, + pose_mode=pose_mode, + yaw_angle=yaw_angle, + pitch_angle=pitch_angle, + roll_angle=roll_angle ) if self.verbose >= 3: @@ -521,6 +560,78 @@ class FaceProcessor: print(f"⚠️ Error calculating face quality: {e}") return 0.5 # Default medium quality on error + def _find_matching_pose_info(self, facial_area: Dict, + pose_faces: List[Dict]) -> Dict: + """Match DeepFace result with RetinaFace pose detection result + + Args: + facial_area: DeepFace facial_area {'x': x, 'y': y, 'w': w, 'h': h} + pose_faces: List of RetinaFace detection results with pose info + + Returns: + Dictionary with pose information, or defaults + """ + # Match by bounding box overlap + # Simple approach: find closest match by center point + if not pose_faces: + return { + 'pose_mode': 'frontal', + 'yaw_angle': None, + 'pitch_angle': None, + 'roll_angle': None + } + + deepface_center_x = facial_area.get('x', 0) + facial_area.get('w', 0) / 2 + deepface_center_y = facial_area.get('y', 0) + facial_area.get('h', 0) / 2 + + best_match = None + min_distance = float('inf') + + for pose_face in pose_faces: + pose_area = pose_face.get('facial_area', {}) + + # Handle both dict and list formats (for robustness) + if isinstance(pose_area, list) and len(pose_area) >= 4: + # Convert list [x, y, w, h] to dict format + pose_area = { + 'x': pose_area[0], + 'y': pose_area[1], + 'w': pose_area[2], + 'h': pose_area[3] + } + elif not isinstance(pose_area, dict): + # Skip if not dict or list + continue + + pose_center_x = (pose_area.get('x', 0) + + pose_area.get('w', 0) / 2) + pose_center_y = (pose_area.get('y', 0) + + pose_area.get('h', 0) / 2) + + # Calculate distance between centers + distance = ((deepface_center_x - pose_center_x) ** 2 + + (deepface_center_y - pose_center_y) ** 2) ** 0.5 + + if distance < min_distance: + min_distance = distance + best_match = pose_face + + # If match is close enough (within 50 pixels), use it + if best_match and min_distance < 50: + return { + 'pose_mode': best_match.get('pose_mode', 'frontal'), + 'yaw_angle': best_match.get('yaw_angle'), + 'pitch_angle': best_match.get('pitch_angle'), + 'roll_angle': best_match.get('roll_angle') + } + + return { + 'pose_mode': 'frontal', + 'yaw_angle': None, + 'pitch_angle': None, + 'roll_angle': None + } + def _extract_face_crop(self, photo_path: str, location: dict, face_id: int) -> str: """Extract and save individual face crop for identification with EXIF orientation correction""" try: diff --git a/src/utils/pose_detection.py b/src/utils/pose_detection.py new file mode 100644 index 0000000..07c5cfe --- /dev/null +++ b/src/utils/pose_detection.py @@ -0,0 +1,349 @@ +"""Face pose detection (yaw, pitch, roll) using RetinaFace landmarks""" + +import numpy as np +from math import atan2, degrees +from typing import Dict, Tuple, Optional, List + +try: + from retinaface import RetinaFace + RETINAFACE_AVAILABLE = True +except ImportError: + RETINAFACE_AVAILABLE = False + RetinaFace = None + + +class PoseDetector: + """Detect face pose (yaw, pitch, roll) using RetinaFace landmarks""" + + # Thresholds for pose detection (in degrees) + PROFILE_YAW_THRESHOLD = 30.0 # Faces with |yaw| >= 30° are considered profile + EXTREME_YAW_THRESHOLD = 60.0 # Faces with |yaw| >= 60° are extreme profile + + PITCH_THRESHOLD = 20.0 # Faces with |pitch| >= 20° are looking up/down + EXTREME_PITCH_THRESHOLD = 45.0 # Faces with |pitch| >= 45° are extreme + + ROLL_THRESHOLD = 15.0 # Faces with |roll| >= 15° are tilted + EXTREME_ROLL_THRESHOLD = 45.0 # Faces with |roll| >= 45° are extreme + + def __init__(self, + yaw_threshold: float = None, + pitch_threshold: float = None, + roll_threshold: float = None): + """Initialize pose detector + + Args: + yaw_threshold: Yaw angle threshold for profile detection (degrees) + Default: 30.0 + pitch_threshold: Pitch angle threshold for up/down detection (degrees) + Default: 20.0 + roll_threshold: Roll angle threshold for tilt detection (degrees) + Default: 15.0 + """ + if not RETINAFACE_AVAILABLE: + raise RuntimeError("RetinaFace not available") + + self.yaw_threshold = yaw_threshold or self.PROFILE_YAW_THRESHOLD + self.pitch_threshold = pitch_threshold or self.PITCH_THRESHOLD + self.roll_threshold = roll_threshold or self.ROLL_THRESHOLD + + @staticmethod + def detect_faces_with_landmarks(img_path: str) -> Dict: + """Detect faces using RetinaFace directly + + Returns: + Dictionary with face keys and landmark data: + { + 'face_1': { + 'facial_area': {'x': x, 'y': y, 'w': w, 'h': h}, + 'landmarks': { + 'left_eye': (x, y), + 'right_eye': (x, y), + 'nose': (x, y), + 'left_mouth': (x, y), + 'right_mouth': (x, y) + }, + 'confidence': 0.95 + } + } + """ + if not RETINAFACE_AVAILABLE: + return {} + + faces = RetinaFace.detect_faces(img_path) + return faces + + @staticmethod + def calculate_yaw_from_landmarks(landmarks: Dict) -> Optional[float]: + """Calculate yaw angle from facial landmarks + + Args: + landmarks: Dictionary with landmark positions: + { + 'left_eye': (x, y), + 'right_eye': (x, y), + 'nose': (x, y), + 'left_mouth': (x, y), + 'right_mouth': (x, y) + } + + Returns: + Yaw angle in degrees (-90 to +90): + - Negative: face turned left (right profile) + - Positive: face turned right (left profile) + - Zero: frontal face + - None: if landmarks invalid + """ + if not landmarks: + return None + + left_eye = landmarks.get('left_eye') + right_eye = landmarks.get('right_eye') + nose = landmarks.get('nose') + + if not all([left_eye, right_eye, nose]): + return None + + # Calculate eye midpoint + eye_mid_x = (left_eye[0] + right_eye[0]) / 2 + eye_mid_y = (left_eye[1] + right_eye[1]) / 2 + + # Calculate horizontal distance from nose to eye midpoint + nose_x = nose[0] + eye_midpoint_x = eye_mid_x + + # Calculate face width (eye distance) + face_width = abs(right_eye[0] - left_eye[0]) + + if face_width == 0: + return None + + # Calculate horizontal offset + horizontal_offset = nose_x - eye_midpoint_x + + # Calculate yaw angle using atan2 + # Normalize by face width to get angle + yaw_radians = atan2(horizontal_offset, face_width) + yaw_degrees = degrees(yaw_radians) + + return yaw_degrees + + @staticmethod + def calculate_pitch_from_landmarks(landmarks: Dict) -> Optional[float]: + """Calculate pitch angle from facial landmarks (up/down tilt) + + Args: + landmarks: Dictionary with landmark positions + + Returns: + Pitch angle in degrees (-90 to +90): + - Positive: looking up + - Negative: looking down + - None: if landmarks invalid + """ + if not landmarks: + return None + + left_eye = landmarks.get('left_eye') + right_eye = landmarks.get('right_eye') + left_mouth = landmarks.get('left_mouth') + right_mouth = landmarks.get('right_mouth') + nose = landmarks.get('nose') + + if not all([left_eye, right_eye, left_mouth, right_mouth, nose]): + return None + + # Eye midpoint + eye_mid_y = (left_eye[1] + right_eye[1]) / 2 + # Mouth midpoint + mouth_mid_y = (left_mouth[1] + right_mouth[1]) / 2 + # Nose vertical position + nose_y = nose[1] + + # Expected nose position (typically 60% down from eyes to mouth) + expected_nose_y = eye_mid_y + (mouth_mid_y - eye_mid_y) * 0.6 + face_height = abs(mouth_mid_y - eye_mid_y) + + if face_height == 0: + return None + + # Vertical offset from expected position + vertical_offset = nose_y - expected_nose_y + + # Calculate pitch angle + pitch_radians = atan2(vertical_offset, face_height) + pitch_degrees = degrees(pitch_radians) + + return pitch_degrees + + @staticmethod + def calculate_roll_from_landmarks(landmarks: Dict) -> Optional[float]: + """Calculate roll angle from facial landmarks (rotation around face axis) + + Args: + landmarks: Dictionary with landmark positions + + Returns: + Roll angle in degrees (-90 to +90): + - Positive: tilted right (clockwise) + - Negative: tilted left (counterclockwise) + - None: if landmarks invalid + """ + if not landmarks: + return None + + left_eye = landmarks.get('left_eye') + right_eye = landmarks.get('right_eye') + + if not all([left_eye, right_eye]): + return None + + # Calculate angle of eye line + dx = right_eye[0] - left_eye[0] + dy = right_eye[1] - left_eye[1] + + if dx == 0: + return 90.0 if dy > 0 else -90.0 # Vertical line + + # Roll angle + roll_radians = atan2(dy, dx) + roll_degrees = degrees(roll_radians) + + return roll_degrees + + @staticmethod + def classify_pose_mode(yaw: Optional[float], + pitch: Optional[float], + roll: Optional[float]) -> str: + """Classify face pose mode from all three angles + + Args: + yaw: Yaw angle in degrees + pitch: Pitch angle in degrees + roll: Roll angle in degrees + + Returns: + Pose mode classification string: + - 'frontal': frontal, level, upright + - 'profile_left', 'profile_right': profile views + - 'looking_up', 'looking_down': pitch variations + - 'tilted_left', 'tilted_right': roll variations + - Combined modes: e.g., 'profile_left_looking_up' + """ + # Default to frontal if angles unknown + if yaw is None: + yaw = 0.0 + if pitch is None: + pitch = 0.0 + if roll is None: + roll = 0.0 + + # Yaw classification + abs_yaw = abs(yaw) + if abs_yaw < 30.0: + yaw_mode = "frontal" + elif yaw < -30.0: + yaw_mode = "profile_right" + elif yaw > 30.0: + yaw_mode = "profile_left" + else: + yaw_mode = "slight_yaw" + + # Pitch classification + abs_pitch = abs(pitch) + if abs_pitch < 20.0: + pitch_mode = "level" + elif pitch > 20.0: + pitch_mode = "looking_up" + elif pitch < -20.0: + pitch_mode = "looking_down" + else: + pitch_mode = "slight_pitch" + + # Roll classification + abs_roll = abs(roll) + if abs_roll < 15.0: + roll_mode = "upright" + elif roll > 15.0: + roll_mode = "tilted_right" + elif roll < -15.0: + roll_mode = "tilted_left" + else: + roll_mode = "slight_roll" + + # Combine modes - simple case first + if yaw_mode == "frontal" and pitch_mode == "level" and roll_mode == "upright": + return "frontal" + + # Build combined mode string + modes = [] + if yaw_mode != "frontal": + modes.append(yaw_mode) + if pitch_mode != "level": + modes.append(pitch_mode) + if roll_mode != "upright": + modes.append(roll_mode) + + return "_".join(modes) if modes else "frontal" + + def detect_pose_faces(self, img_path: str) -> List[Dict]: + """Detect all faces and classify pose status (all angles) + + Args: + img_path: Path to image file + + Returns: + List of face dictionaries with pose information: + [{ + 'facial_area': {'x': x, 'y': y, 'w': w, 'h': h}, + 'landmarks': {...}, + 'confidence': 0.95, + 'yaw_angle': -45.2, + 'pitch_angle': 10.5, + 'roll_angle': -5.2, + 'pose_mode': 'profile_right_level_upright' + }, ...] + """ + faces = self.detect_faces_with_landmarks(img_path) + + results = [] + for face_key, face_data in faces.items(): + landmarks = face_data.get('landmarks', {}) + + # Calculate all three angles + yaw_angle = self.calculate_yaw_from_landmarks(landmarks) + pitch_angle = self.calculate_pitch_from_landmarks(landmarks) + roll_angle = self.calculate_roll_from_landmarks(landmarks) + + # Classify pose mode + pose_mode = self.classify_pose_mode(yaw_angle, pitch_angle, roll_angle) + + # Normalize facial_area format (RetinaFace returns list [x, y, w, h] or dict) + facial_area_raw = face_data.get('facial_area', {}) + if isinstance(facial_area_raw, list) and len(facial_area_raw) >= 4: + # Convert list [x, y, w, h] to dict format + facial_area = { + 'x': facial_area_raw[0], + 'y': facial_area_raw[1], + 'w': facial_area_raw[2], + 'h': facial_area_raw[3] + } + elif isinstance(facial_area_raw, dict): + # Already in dict format + facial_area = facial_area_raw + else: + # Default to empty dict + facial_area = {} + + result = { + 'facial_area': facial_area, + 'landmarks': landmarks, + 'confidence': face_data.get('confidence', 0.0), + 'yaw_angle': yaw_angle, + 'pitch_angle': pitch_angle, + 'roll_angle': roll_angle, + 'pose_mode': pose_mode + } + results.append(result) + + return results + diff --git a/src/web/app.py b/src/web/app.py index 6832c97..af4f12d 100644 --- a/src/web/app.py +++ b/src/web/app.py @@ -21,6 +21,8 @@ from src.web.api.version import router as version_router from src.web.settings import APP_TITLE, APP_VERSION from src.web.db.base import Base, engine from src.web.db.session import database_url +# Import models to ensure they're registered with Base.metadata +from src.web.db import models # noqa: F401 # Global worker process (will be set in lifespan) _worker_process: subprocess.Popen | None = None diff --git a/src/web/db/models.py b/src/web/db/models.py index 27d0648..b7fa0e3 100644 --- a/src/web/db/models.py +++ b/src/web/db/models.py @@ -94,6 +94,10 @@ class Face(Base): model_name = Column(Text, default="ArcFace", nullable=False) face_confidence = Column(Numeric, default=0.0, nullable=False) exif_orientation = Column(Integer, nullable=True) + pose_mode = Column(Text, default="frontal", nullable=False, index=True) + yaw_angle = Column(Numeric, nullable=True) + pitch_angle = Column(Numeric, nullable=True) + roll_angle = Column(Numeric, nullable=True) photo = relationship("Photo", back_populates="faces") person = relationship("Person", back_populates="faces") @@ -105,6 +109,7 @@ class Face(Base): Index("idx_faces_person_id", "person_id"), Index("idx_faces_photo_id", "photo_id"), Index("idx_faces_quality", "quality_score"), + Index("idx_faces_pose_mode", "pose_mode"), ) diff --git a/src/web/services/face_service.py b/src/web/services/face_service.py index ab2496e..cb33a99 100644 --- a/src/web/services/face_service.py +++ b/src/web/services/face_service.py @@ -28,6 +28,7 @@ from src.core.config import ( MAX_FACE_SIZE, ) from src.utils.exif_utils import EXIFOrientationHandler +from src.utils.pose_detection import PoseDetector, RETINAFACE_AVAILABLE from src.web.db.models import Face, Photo, Person @@ -326,8 +327,21 @@ def process_photo_faces( else: face_detection_path = photo_path + # Step 1: Use RetinaFace directly for detection + landmarks (with graceful fallback) + pose_faces = [] + pose_detector = None + if RETINAFACE_AVAILABLE: + try: + pose_detector = PoseDetector() + pose_faces = pose_detector.detect_pose_faces(face_detection_path) + if pose_faces: + print(f"[FaceService] Pose detection: found {len(pose_faces)} faces with pose data") + except Exception as e: + print(f"[FaceService] ⚠️ Pose detection failed for {photo.filename}: {e}, using defaults") + pose_faces = [] + try: - # Use DeepFace to detect faces and compute embeddings + # Step 2: Use DeepFace for encoding generation # Note: First call may take time to download/initialize models print(f"[DeepFace] Processing {photo.filename} with {detector_backend}/{model_name}...") results = DeepFace.represent( @@ -437,6 +451,13 @@ def process_photo_faces( # Convert from 0-100 to 0.0-1.0 for database (desktop stores REAL) quality_score = quality_score_int / 100.0 + # Step 3: Match RetinaFace results with DeepFace results + pose_info = _find_matching_pose_info(facial_area, pose_faces) + pose_mode = pose_info.get('pose_mode', 'frontal') + yaw_angle = pose_info.get('yaw_angle') + pitch_angle = pose_info.get('pitch_angle') + roll_angle = pose_info.get('roll_angle') + # Store face in database - match desktop schema exactly # Desktop: confidence REAL DEFAULT 0.0 (legacy), face_confidence REAL (actual) # Desktop: quality_score REAL DEFAULT 0.0 (0.0-1.0 range) @@ -452,6 +473,10 @@ def process_photo_faces( model_name=model_name, face_confidence=face_confidence, # REAL in 0.0-1.0 range exif_orientation=exif_orientation, + pose_mode=pose_mode, + yaw_angle=yaw_angle, + pitch_angle=pitch_angle, + roll_angle=roll_angle, ) db.add(face) @@ -486,6 +511,78 @@ def process_photo_faces( raise Exception(f"Error processing faces in {photo.filename}: {str(e)}") +def _find_matching_pose_info(facial_area: Dict, pose_faces: List[Dict]) -> Dict: + """Match DeepFace result with RetinaFace pose detection result + + Args: + facial_area: DeepFace facial_area {'x': x, 'y': y, 'w': w, 'h': h} + pose_faces: List of RetinaFace detection results with pose info + + Returns: + Dictionary with pose information, or defaults + """ + # Match by bounding box overlap + # Simple approach: find closest match by center point + if not pose_faces: + return { + 'pose_mode': 'frontal', + 'yaw_angle': None, + 'pitch_angle': None, + 'roll_angle': None + } + + deepface_center_x = facial_area.get('x', 0) + facial_area.get('w', 0) / 2 + deepface_center_y = facial_area.get('y', 0) + facial_area.get('h', 0) / 2 + + best_match = None + min_distance = float('inf') + + for pose_face in pose_faces: + pose_area = pose_face.get('facial_area', {}) + + # Handle both dict and list formats (for robustness) + if isinstance(pose_area, list) and len(pose_area) >= 4: + # Convert list [x, y, w, h] to dict format + pose_area = { + 'x': pose_area[0], + 'y': pose_area[1], + 'w': pose_area[2], + 'h': pose_area[3] + } + elif not isinstance(pose_area, dict): + # Skip if not dict or list + continue + + pose_center_x = (pose_area.get('x', 0) + + pose_area.get('w', 0) / 2) + pose_center_y = (pose_area.get('y', 0) + + pose_area.get('h', 0) / 2) + + # Calculate distance between centers + distance = ((deepface_center_x - pose_center_x) ** 2 + + (deepface_center_y - pose_center_y) ** 2) ** 0.5 + + if distance < min_distance: + min_distance = distance + best_match = pose_face + + # If match is close enough (within 50 pixels), use it + if best_match and min_distance < 50: + return { + 'pose_mode': best_match.get('pose_mode', 'frontal'), + 'yaw_angle': best_match.get('yaw_angle'), + 'pitch_angle': best_match.get('pitch_angle'), + 'roll_angle': best_match.get('roll_angle') + } + + return { + 'pose_mode': 'frontal', + 'yaw_angle': None, + 'pitch_angle': None, + 'roll_angle': None + } + + def process_unprocessed_photos( db: Session, batch_size: Optional[int] = None,