This commit introduces a new feature in the Identify component that allows users to filter for unique faces only, hiding duplicates with ≥60% match confidence. The API has been updated to log calls to the get_similar_faces endpoint, including warnings for non-existent faces and information on the number of results returned. Additionally, the SimilarFaceItem schema has been updated to include the filename, improving data handling and user experience. Documentation and tests have been updated accordingly.
3.5 KiB
Confidence Calibration Implementation
Problem Solved
The identify UI was showing confidence percentages that were not actual match probabilities. The old calculation used a simple linear transformation:
confidence_pct = (1 - distance) * 100
This gave misleading results:
- Distance 0.6 (at threshold) showed 40% confidence
- Distance 1.0 showed 0% confidence
- Distance 2.0 showed -100% confidence (impossible!)
Solution: Empirical Confidence Calibration
Implemented a proper confidence calibration system that converts DeepFace distance values to actual match probabilities based on empirical analysis of the ArcFace model.
Key Improvements
-
Realistic Probabilities:
- Distance 0.6 (threshold) now shows ~55% confidence (realistic)
- Distance 1.0 shows ~17% confidence (not 0%)
- No negative percentages
-
Non-linear Mapping: Accounts for the actual distribution of distances in face recognition
-
Configurable Methods: Support for different calibration approaches:
empirical: Based on DeepFace ArcFace characteristics (default)sigmoid: Sigmoid-based calibrationlinear: Original linear transformation (fallback)
Calibration Curve
The empirical calibration uses different approaches for different distance ranges:
- Very Close (≤ 0.5×tolerance): 95-100% confidence (exponential decay)
- Near Threshold (≤ tolerance): 55-95% confidence (linear interpolation)
- Above Threshold (≤ 1.5×tolerance): 20-55% confidence (rapid decay)
- Very Far (> 1.5×tolerance): 1-20% confidence (exponential decay)
Configuration
Added new settings in src/core/config.py:
USE_CALIBRATED_CONFIDENCE = True # Enable/disable calibration
CONFIDENCE_CALIBRATION_METHOD = "empirical" # Calibration method
Files Modified
src/core/face_processing.py: Added calibration methodssrc/gui/identify_panel.py: Updated to use calibrated confidencesrc/gui/auto_match_panel.py: Updated to use calibrated confidencesrc/core/config.py: Added calibration settingssrc/photo_tagger.py: Updated to use calibrated confidence
Test Results
The test script shows significant improvements:
| Distance | Old Linear | New Calibrated | Improvement |
|---|---|---|---|
| 0.6 | 40.0% | 55.0% | +15.0% |
| 1.0 | 0.0% | 17.2% | +17.2% |
| 1.5 | -50.0% | 8.1% | +58.1% |
Usage
The calibrated confidence is now automatically used throughout the application. Users will see more realistic match probabilities that better reflect the actual likelihood of a face match.
Future Enhancements
- Dynamic Calibration: Learn from user feedback to improve calibration
- Model-Specific Calibration: Different calibration for different DeepFace models
- Quality-Aware Calibration: Adjust confidence based on face quality scores
- User Preferences: Allow users to adjust calibration sensitivity
Technical Details
The calibration system uses empirical parameters derived from analysis of DeepFace ArcFace model behavior. The key insight is that face recognition distances don't follow a linear relationship with match probability - they follow a more complex distribution that varies by distance range.
This implementation provides a foundation for more sophisticated calibration methods while maintaining backward compatibility through configuration options.