# Auto-Match Load Performance Analysis

## Summary
The Auto-Match page loads significantly more slowly than the Identify page because it lacks the performance optimizations that Identify uses. Auto-Match fetches all data upfront on every load with no caching, while Identify uses sessionStorage caching and lazy loading.

## Identify Page Optimizations (Current)

### 1. **SessionStorage Caching**
- **State Caching**: Caches faces, current index, similar faces, and form data in sessionStorage
- **Settings Caching**: Caches filter settings (pageSize, minQuality, sortBy, etc.)
- **Restoration**: On mount, restores cached state instead of making API calls
- **Implementation**: 
  - `STATE_KEY = 'identify_state'` - stores faces, currentIdx, similar, faceFormData, selectedSimilar
  - `SETTINGS_KEY = 'identify_settings'` - stores filter settings
  - Only loads fresh data if no cached state exists
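
The restore-or-fetch pattern described above could look roughly like the sketch below. It is not the code in `Identify.tsx`: the `IdentifyState` shape is a trimmed-down assumption, and `KeyValueStore` is a hypothetical interface that `window.sessionStorage` satisfies structurally, used here so the sketch also runs outside a browser.

```typescript
// Structural subset of the Web Storage API; window.sessionStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical, simplified state shape (the real state also holds
// similar faces, form data, and selections).
interface IdentifyState {
  faces: { id: number }[];
  currentIdx: number;
}

const STATE_KEY = "identify_state";

// Persist the state; caching is best-effort, so storage errors are ignored.
function saveState(store: KeyValueStore, state: IdentifyState): void {
  try {
    store.setItem(STATE_KEY, JSON.stringify(state));
  } catch {
    // Quota exceeded or storage unavailable: fall back to fetching next time.
  }
}

// Return the cached state, or null if nothing usable is stored.
function restoreState(store: KeyValueStore): IdentifyState | null {
  const raw = store.getItem(STATE_KEY);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as IdentifyState;
  } catch {
    store.removeItem(STATE_KEY); // drop corrupt entries so the next load refetches
    return null;
  }
}
```

In the browser, `restoreState(window.sessionStorage)` would run on mount, and the API would be called only when it returns `null`.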

### 2. **Lazy Loading**
- **Similar Faces**: Loaded on demand rather than during the initial page load. They are fetched only when:
  - `compareEnabled` is true
  - The current face changes
- **Images**: Uses lazy loading for similar face images (`loading="lazy"`)

### 3. **Image Preloading**
- Preloads next/previous face images in background
- Uses `new Image()` to preload without blocking UI
- Delayed by 100ms to avoid blocking current image load

### 4. **Batch Operations**
- Uses `batchSimilarity` endpoint for unique faces filtering
- Single API call instead of multiple individual calls

### 5. **Progressive State Management**
- Uses refs to track restoration state
- Prevents unnecessary reloads during state restoration
- Only triggers API calls when actually needed

## Auto-Match Page (Current - No Optimizations)

### 1. **No Caching**
- **No sessionStorage**: Always makes fresh API calls on mount
- **No state restoration**: Always starts from scratch
- **No settings persistence**: Tolerance and other settings reset on page reload

### 2. **Eager Loading**
- **All Data Upfront**: Loads ALL people and ALL matches in a single API call
- **No Lazy Loading**: All match data loaded even if user never views it
- **No Progressive Loading**: Everything must be loaded before UI is usable

### 3. **No Image Preloading**
- Images load on-demand as user navigates
- No preloading of next/previous person images

### 4. **Large API Response**
- Backend returns complete dataset:
  - All identified people
  - All matches for each person
  - All face metadata (photo info, locations, quality scores, etc.)
- Response size can be very large (hundreds of KB to MB) depending on:
  - Number of identified people
  - Number of matches per person
  - Amount of metadata per match

### 5. **Backend Processing**
The `find_auto_match_matches` function:
- Queries all identified faces (one per person, quality >= 0.3)
- For EACH person, calls `find_similar_faces` to find matches
- This means N additional database queries (where N = number of people), on top of the initial query for identified faces
- All processing happens synchronously before response is sent

## Performance Comparison

### Identify Page Load Flow
```
1. Check sessionStorage for cached state
2. If cached: Restore state (instant, no API call)
3. If not cached: Load faces (paginated, ~50 faces)
4. Load similar faces only when face changes (lazy)
5. Preload next/previous images (background)
```

### Auto-Match Page Load Flow
```
1. Always call API (no cache check)
2. Backend processes ALL people:
   - Query all identified faces
   - For each person: query similar faces
   - Build complete response with all matches
3. Wait for complete response (can be large)
4. Render all data at once
```

## Key Differences

| Feature | Identify | Auto-Match |
|---------|----------|------------|
| **Caching** | ✅ sessionStorage | ❌ None |
| **State Restoration** | ✅ Yes | ❌ No |
| **Lazy Loading** | ✅ Similar faces only | ❌ All data upfront |
| **Image Preloading** | ✅ Next/prev faces | ❌ None |
| **Pagination** | ✅ Yes (page_size) | ❌ No (all at once) |
| **Progressive Loading** | ✅ Yes | ❌ No |
| **API Call Size** | Small (paginated) | Large (all data) |
| **Backend Queries** | 1-2 queries | N+1 queries (N = people) |

## Why Auto-Match is Slower

1. **No Caching**: Every page load requires a full API call
2. **Large Response**: All people + all matches in single response
3. **N+1 Query Problem**: Backend makes one query per person to find matches
4. **Synchronous Processing**: All processing happens before response
5. **No Lazy Loading**: All match data loaded even if never viewed

## Potential Optimizations for Auto-Match

### 1. **Add SessionStorage Caching** (High Impact)
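- Cache the people list and matches in sessionStorage
- Restore on mount instead of making an API call
- Similar to the Identify page approach
- Browsers typically cap sessionStorage at about 5 MB per origin, so a very large response may need trimming before it can be cached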
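
One possible shape for such a cache is sketched below. Everything here is an assumption for illustration: the `AUTO_MATCH_KEY` name, the `AutoMatchData` shape, and the injected `fetchFresh` callback standing in for the existing API call. The cached entry records the tolerance it was fetched with, so a changed tolerance counts as a miss.

```typescript
// Structural subset of the Web Storage API; window.sessionStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical, simplified response shape.
interface AutoMatchData {
  people: { personId: number; matchCount: number }[];
}

interface CachedAutoMatch {
  tolerance: number;
  data: AutoMatchData;
}

const AUTO_MATCH_KEY = "auto_match_state"; // hypothetical key name

// Return cached data only if it was fetched with the same tolerance.
function readAutoMatchCache(store: KeyValueStore, tolerance: number): AutoMatchData | null {
  const raw = store.getItem(AUTO_MATCH_KEY);
  if (raw === null) return null;
  try {
    const cached = JSON.parse(raw) as CachedAutoMatch;
    return cached.tolerance === tolerance ? cached.data : null;
  } catch {
    return null; // unreadable entry: treat as a miss
  }
}

function writeAutoMatchCache(store: KeyValueStore, tolerance: number, data: AutoMatchData): void {
  try {
    const entry: CachedAutoMatch = { tolerance, data };
    store.setItem(AUTO_MATCH_KEY, JSON.stringify(entry));
  } catch {
    // Quota exceeded (large responses) or storage disabled: skip caching.
  }
}

// Cache-first load: hit the API only on a miss, then store the result.
async function loadAutoMatchCached(
  store: KeyValueStore,
  tolerance: number,
  fetchFresh: (tolerance: number) => Promise<AutoMatchData>,
): Promise<AutoMatchData> {
  const cached = readAutoMatchCache(store, tolerance);
  if (cached !== null) return cached;
  const fresh = await fetchFresh(tolerance);
  writeAutoMatchCache(store, tolerance, fresh);
  return fresh;
}
```

The cache would also need to be cleared whenever the user accepts or rejects matches, since the stored data would otherwise go stale.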

### 2. **Lazy Load Matches** (High Impact)
- Load people list first
- Load matches for current person only
- Load matches for next person in background
- Similar to how Identify loads similar faces
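
The steps above can be sketched as a small loader that fetches matches per person on demand and warms the next person in the background. This assumes a per-person matches endpoint exists or would be added; `fetchMatches`, `Match`, and `LazyMatchLoader` are hypothetical names, not part of the current codebase.

```typescript
// Hypothetical match shape and per-person fetch function.
type Match = { faceId: number; similarity: number };
type FetchMatches = (personId: number) => Promise<Match[]>;

class LazyMatchLoader {
  // One promise per person, so repeated requests share a single fetch.
  private readonly inFlight = new Map<number, Promise<Match[]>>();
  private readonly fetchMatches: FetchMatches;

  constructor(fetchMatches: FetchMatches) {
    this.fetchMatches = fetchMatches;
  }

  // Fetch matches for one person, reusing an earlier request if present.
  load(personId: number): Promise<Match[]> {
    let pending = this.inFlight.get(personId);
    if (pending === undefined) {
      pending = this.fetchMatches(personId);
      this.inFlight.set(personId, pending);
      // Forget failed requests so a later call can retry.
      pending.catch(() => {
        this.inFlight.delete(personId);
      });
    }
    return pending;
  }

  // Load the current person's matches and prefetch the next person's.
  loadCurrent(personIds: number[], currentIdx: number): Promise<Match[]> {
    const currentId = personIds[currentIdx];
    if (currentId === undefined) return Promise.resolve([]);
    const current = this.load(currentId);
    const nextId = personIds[currentIdx + 1];
    if (nextId !== undefined) void this.load(nextId); // background warm-up
    return current;
  }
}
```

With this shape, navigating to the next person usually resolves from the already-started request, and each person's matches are fetched at most once per page visit.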

### 3. **Pagination** (Medium Impact)
- Paginate people list (e.g., 20 people per page)
- Load matches only for visible people
- Reduces initial response size
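
To reduce response size, the slicing would have to happen on the backend so that only one page is serialized. The backend is Python, so the TypeScript sketch below only models the page arithmetic; `paginate` and `PageSlice` are illustrative names, and the default of 20 mirrors the example page size above.

```typescript
// One page of results plus the metadata a client needs to navigate.
interface PageSlice<T> {
  items: T[];
  page: number; // 1-based, clamped to the valid range
  totalPages: number;
}

// Return the requested page of `all`, clamping out-of-range page numbers.
function paginate<T>(all: T[], page: number, pageSize = 20): PageSlice<T> {
  if (pageSize <= 0) throw new RangeError("pageSize must be positive");
  const totalPages = Math.max(1, Math.ceil(all.length / pageSize));
  const clamped = Math.min(Math.max(1, page), totalPages);
  const start = (clamped - 1) * pageSize;
  return { items: all.slice(start, start + pageSize), page: clamped, totalPages };
}
```

For 45 people and a page size of 20, this yields three pages, the last one holding five people.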

### 4. **Backend Optimization** (High Impact)
- Batch similarity queries instead of N+1 pattern
- Use `calculate_batch_similarities` for all people at once
- Cache results if tolerance hasn't changed
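
The difference between the two query patterns can be modeled as follows. This is a language-neutral illustration written in TypeScript, not the Python service code: `findSimilar` and `findSimilarBatch` are hypothetical stand-ins, and the actual signature of `calculate_batch_similarities` is not assumed here.

```typescript
// Hypothetical row returned by a similarity lookup.
type Similarity = { personId: number; faceId: number; distance: number };

// N+1 shape: one lookup per person.
function matchesPerPerson(
  personIds: number[],
  findSimilar: (personId: number) => Similarity[],
): Map<number, Similarity[]> {
  const out = new Map<number, Similarity[]>();
  for (const id of personIds) {
    out.set(id, findSimilar(id)); // one call per person
  }
  return out;
}

// Batched shape: one lookup for everyone, then group the rows in memory.
function matchesBatched(
  personIds: number[],
  findSimilarBatch: (personIds: number[]) => Similarity[],
): Map<number, Similarity[]> {
  const out = new Map<number, Similarity[]>();
  for (const id of personIds) out.set(id, []);
  for (const row of findSimilarBatch(personIds)) { // single call
    out.get(row.personId)?.push(row);
  }
  return out;
}
```

Both functions produce the same grouping; the batched version trades N round trips for one larger query plus an in-memory grouping pass.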

### 5. **Image Preloading** (Low Impact)
- Preload reference face images for next/previous people
- Preload match images for current person

### 6. **Progressive Rendering** (Medium Impact)
- Show people list immediately
- Load matches progressively as user navigates
- Show loading indicators for matches

## Code Locations

### Identify Page
- **Frontend**: `frontend/src/pages/Identify.tsx`
  - Lines 42-45: SessionStorage keys
  - Lines 272-347: State restoration logic
  - Lines 349-399: State saving logic
  - Lines 496-527: Image preloading
  - Lines 258-270: Lazy loading of similar faces

### Auto-Match Page
- **Frontend**: `frontend/src/pages/AutoMatch.tsx`
  - Lines 35-71: `loadAutoMatch` function (always calls API)
  - Lines 74-77: Auto-load on mount (no cache check)

### Backend
- **API Endpoint**: `src/web/api/faces.py` (lines 539-702)
- **Service Function**: `src/web/services/face_service.py` (lines 1736-1846)
  - `find_auto_match_matches`: Processes all people synchronously

## Recommendations

1. **Immediate**: Add sessionStorage caching (similar to Identify)
2. **High Priority**: Implement lazy loading of matches
3. **Medium Priority**: Optimize backend to use batch queries
4. **Low Priority**: Add image preloading

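The biggest win would likely be adding sessionStorage caching, which would make subsequent page loads within the same browser tab near-instant (like Identify). Caching does not speed up the first load, which still depends on lazy loading and the backend changes.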