tanyar09 817e95337f feat: Update documentation and API for face identification and people management

This commit enhances the README with detailed instructions on the automatic database initialization and schema compatibility between the web and desktop versions. It also introduces new API endpoints for managing unidentified faces and people, including listing, creating, and identifying faces. The schemas for these operations have been updated to reflect the new data structures. Additionally, tests have been added to ensure the functionality of the new API features, improving overall coverage and reliability.

2025-11-03 12:49:48 -05:00


PunimTag Web

Modern Photo Management and Facial Recognition System

A fast, simple, and modern web application for organizing and tagging photos using state-of-the-art DeepFace AI with the ArcFace recognition model.

🎯 Features

🌐 Web-Based: Modern React frontend with FastAPI backend
🔥 DeepFace AI: State-of-the-art face detection with RetinaFace and ArcFace models
🎯 Superior Accuracy: 512-dimensional embeddings (4x the dimensionality of the 128-dimensional encodings used by face_recognition)
⚙️ Multiple Detectors: Choose from RetinaFace, MTCNN, OpenCV, or SSD detectors
🎨 Flexible Models: Select ArcFace, Facenet, Facenet512, or VGG-Face recognition models
👤 Person Identification: Identify and tag people across your photo collection
🤖 Smart Auto-Matching: Intelligent face matching with quality scoring and cosine similarity
🔍 Advanced Search: Search by people, dates, tags, and folders
🏷️ Tag Management: Organize photos with hierarchical tags
⚡ Batch Processing: Process thousands of photos efficiently
🔒 Privacy-First: All data stored locally, no cloud dependencies

🚀 Quick Start

Prerequisites

Python 3.12 or higher
Node.js 18+ and npm
Virtual environment (recommended)

Installation

# Clone the repository
git clone <repository-url>
cd punimtag

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Python dependencies
pip install -r requirements.txt

# Install frontend dependencies
cd frontend
npm install
cd ..

Database Setup

Automatic Initialization: The database and all tables are automatically created on first startup. No manual migration is needed!

The web application will:

Create the database file at data/punimtag.db (SQLite default) if it doesn't exist
Create all required tables with the correct schema on startup
Match the desktop version schema exactly for compatibility

Manual Setup (Optional): If you need to reset the database or create it manually:

source venv/bin/activate
export PYTHONPATH=/home/ladmin/Code/punimtag  # Replace with the path to your clone
# Recreate all tables from models
python scripts/recreate_tables_web.py

PostgreSQL (Production): Set the DATABASE_URL environment variable:

export DATABASE_URL=postgresql+psycopg2://user:password@host:port/database

Database Schema: The web version uses the exact same schema as the desktop version for full compatibility:

photos - Photo metadata (path, filename, date_taken, processed)
people - Person records (first_name, last_name, middle_name, maiden_name, date_of_birth)
faces - Face detections (encoding, location, quality_score, face_confidence, exif_orientation)
person_encodings - Person face encodings for matching
tags - Tag definitions
phototaglinkage - Photo-tag relationships (with linkage_type)
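To confirm that automatic initialization produced the tables listed above, a small check along these lines may help. It is only a sketch: the table names come from this README, while the `missing_tables` helper is hypothetical and not part of the PunimTag codebase. It also assumes the default SQLite backend.

```python
import sqlite3

# Table names as listed in this README; the helper below is hypothetical.
EXPECTED_TABLES = {
    "photos",
    "people",
    "faces",
    "person_encodings",
    "tags",
    "phototaglinkage",
}


def missing_tables(conn: sqlite3.Connection) -> set[str]:
    """Return the expected tables that are absent from the connected database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    present = {name for (name,) in rows}
    return EXPECTED_TABLES - present
```

Against a real installation, the connection would typically be opened with `sqlite3.connect("data/punimtag.db")`; an empty result means every expected table exists.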

Running the Application

Prerequisites:

Redis must be installed and running (for background jobs)

Install Redis (if not installed):

# On Ubuntu/Debian:
sudo apt update && sudo apt install -y redis-server
sudo systemctl start redis-server
sudo systemctl enable redis-server  # Auto-start on boot

# On macOS with Homebrew:
brew install redis
brew services start redis

# Verify Redis is running:
redis-cli ping  # Should respond with "PONG"

Terminal 1 - Redis (start it if installed but not running):

# On Linux:
sudo systemctl start redis-server

# Or run directly:
redis-server

Terminal 2 - Backend API (automatically starts RQ worker):

cd /home/ladmin/Code/punimtag  # Replace with the path to your clone
source venv/bin/activate
export PYTHONPATH=/home/ladmin/Code/punimtag  # Same path as above
uvicorn src.web.app:app --host 127.0.0.1 --port 8000

You should see:

✅ RQ worker started in background subprocess (PID: ...)
INFO:     Started server process
INFO:     Uvicorn running on http://127.0.0.1:8000

Note: The RQ worker automatically starts in a background subprocess when the API starts. You'll see a confirmation message with the worker PID. If Redis isn't running, you'll see a warning message.
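The note above describes the worker being launched in a background subprocess under a unique name. One way this could be done is sketched below; it is not the project's actual startup code. The queue name `default`, the Redis URL, and the `punimtag-worker-` prefix are assumptions made for illustration. The `rq worker` command with its `--name` and `--url` options is the standard RQ CLI.

```python
import subprocess
import uuid


def build_worker_command(
    queue: str = "default",                        # assumed queue name
    redis_url: str = "redis://localhost:6379/0",   # assumed Redis location
) -> list[str]:
    """Build an `rq worker` command line with a unique worker name."""
    # A random suffix keeps a restarted worker from clashing with a stale
    # registration of the previous one (hypothetical naming scheme).
    name = f"punimtag-worker-{uuid.uuid4().hex[:8]}"
    return ["rq", "worker", "--name", name, "--url", redis_url, queue]


def start_worker(queue: str = "default") -> subprocess.Popen:
    """Launch the worker as a background subprocess and return its handle."""
    return subprocess.Popen(build_worker_command(queue))
```

The returned `Popen` object exposes the PID via `.pid`, which is the kind of value shown in the startup message.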

Terminal 3 - Frontend:

cd /home/ladmin/Code/punimtag/frontend  # Replace with the path to your clone
npm run dev

Then open your browser to http://localhost:3000

Default Login:

Username: admin
Password: admin

Note:

The database and tables are automatically created on first startup - no manual setup needed!
The RQ worker starts automatically in a background subprocess when the API server starts
Make sure Redis is running first, or the worker won't start
Worker names are unique to avoid conflicts when restarting
Photo uploads are stored in data/uploads (configurable via PHOTO_STORAGE_DIR env var)
DeepFace models download automatically on first use (can take 5-10 minutes)
If port 8000 is in use, find the process with lsof -i :8000 and stop it with kill <PID>, or run pkill -f "uvicorn.*app"
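For a quick check of whether port 8000 is already taken before starting the API, a sketch like the following could be used in place of `lsof`. The `port_in_use` helper is hypothetical and only reports whether something accepts TCP connections on the port; it does not identify the process.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(8000)` before launching uvicorn gives an early hint that a previous server is still running.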

📖 Documentation

Architecture: System design and technical details
Web Migration Plan: Detailed migration roadmap
Phase 1 Status: Phase 1 implementation status
Phase 1 Checklist: Complete Phase 1 checklist

Phase 2 Features:

Photo import via folder scan or file upload
Background processing with progress tracking
Real-time job status updates (SSE)
Duplicate detection by checksum
EXIF metadata extraction
DeepFace face detection and recognition pipeline
Configurable detectors (RetinaFace, MTCNN, OpenCV, SSD)
Configurable models (ArcFace, Facenet, Facenet512, VGG-Face)
Process tab UI for face processing
Job cancellation support
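Duplicate detection by checksum, mentioned in the list above, can be illustrated as follows. This is a minimal sketch, assuming SHA-256 as the hash; the README does not say which algorithm the import service uses, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()  # assumed algorithm, not confirmed by the project
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths: list[Path]) -> dict[str, list[Path]]:
    """Group paths by checksum and keep only groups with more than one file."""
    groups: dict[str, list[Path]] = {}
    for path in paths:
        groups.setdefault(file_checksum(path), []).append(path)
    return {checksum: ps for checksum, ps in groups.items() if len(ps) > 1}
```

Hashing file contents rather than comparing names means a photo copied into a second folder under a different filename is still recognized as the same image.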

🏗️ Project Structure

punimtag/
├── src/                    # Source code
│   ├── web/               # Web backend
│   │   ├── api/           # API routers
│   │   ├── db/            # Database models and session
│   │   ├── schemas/       # Pydantic models
│   │   └── services/      # Business logic services
│   └── core/              # Legacy desktop business logic
├── frontend/               # React frontend
│   ├── src/
│   │   ├── api/           # API client
│   │   ├── components/    # React components
│   │   ├── context/       # React contexts (Auth)
│   │   ├── hooks/         # Custom hooks
│   │   └── pages/         # Page components
│   └── package.json
├── tests/                  # Test suite
├── docs/                   # Documentation
├── data/                   # Application data (database, images)
├── alembic/                # Database migrations
└── deploy/                 # Docker deployment configs

📊 Current Status

Phase 1: Foundations ✅ COMPLETE

Backend:

✅ FastAPI application with CORS middleware
✅ Health, version, and metrics endpoints
✅ JWT authentication (login, refresh, user info)
✅ Job management endpoints (RQ/Redis integration)
✅ SQLAlchemy models for all entities
✅ Alembic migrations configured and applied
✅ Database initialized (SQLite default, PostgreSQL supported)
✅ RQ worker auto-start (starts automatically with API server)

Frontend:

✅ React + Vite + TypeScript setup
✅ Tailwind CSS configured
✅ Authentication flow with login page
✅ Protected routes with auth context
✅ Navigation layout (left sidebar + top bar)
✅ All page routes (Dashboard, Scan, Process, Search, Identify, Auto-Match, Tags, Settings)

Database:

✅ All tables created automatically on startup: photos, faces, people, person_encodings, tags, phototaglinkage
✅ Schema matches desktop version exactly for full compatibility
✅ Indices configured for performance
✅ SQLite database at data/punimtag.db (auto-created if missing)

Phase 2: Image Ingestion & Processing ✅ COMPLETE

Backend:

✅ Photo import service with checksum computation
✅ EXIF date extraction and image metadata
✅ Folder scanning with recursive option
✅ File upload support
✅ Background job processing with RQ
✅ Real-time job progress via SSE (Server-Sent Events)
✅ Duplicate detection (by path and checksum)
✅ Photo storage configuration (PHOTO_STORAGE_DIR)
✅ DeepFace pipeline integration
✅ Face detection (RetinaFace, MTCNN, OpenCV, SSD)
✅ Face embeddings computation (ArcFace, Facenet, Facenet512, VGG-Face)
✅ Face processing service with configurable detectors/models
✅ EXIF orientation handling
✅ Face quality scoring and validation
✅ Batch processing with progress tracking
✅ Job cancellation support

Frontend:

✅ Scan tab UI with folder selection
✅ Drag-and-drop file upload
✅ Recursive scan toggle
✅ Real-time job progress with progress bar
✅ Job status monitoring (SSE integration)
✅ Results display (added/existing counts)
✅ Error handling and user feedback
✅ Process tab UI with configuration controls
✅ Detector/model selection dropdowns
✅ Batch size configuration
✅ Start/Stop processing controls
✅ Processing progress display with photo count
✅ Results summary (faces detected, faces stored)
✅ Job cancellation support

Worker:

✅ RQ worker auto-starts with API server
✅ Unique worker names to avoid conflicts
✅ Graceful shutdown handling
✅ String-based function paths for reliable serialization
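RQ accepts a dotted string path in place of a function object when enqueueing a job, so the API process never has to pickle the callable itself. How such a path can be turned back into a function is sketched below. This illustrates the idea and is not RQ's own implementation; `resolve_function` is a hypothetical helper.

```python
import importlib
from typing import Callable


def resolve_function(dotted_path: str) -> Callable:
    """Import the module part of 'package.module.func' and return the function."""
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```

Because only the string travels through Redis, the worker imports the function fresh in its own process, which avoids serialization failures tied to how the API process loaded the module.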

Next: Phase 3 - Identify Workflow & Auto-Match

Identify workflow UI
Auto-match engine with similarity thresholds
Unidentified faces management
Person creation and linking
Batch identification support

🔧 Configuration

Database

SQLite (Default for Development):

# Default location: data/punimtag.db
# No configuration needed

PostgreSQL (Production):

export DATABASE_URL=postgresql+psycopg2://user:password@host:port/database

Environment Variables

# Database (optional, defaults to SQLite)
DATABASE_URL=sqlite:///data/punimtag.db

# JWT Secrets (change in production!)
SECRET_KEY=your-secret-key-here

# Single-user credentials (change in production!)
ADMIN_USERNAME=admin
ADMIN_PASSWORD=admin

# Photo storage directory (default: data/uploads)
PHOTO_STORAGE_DIR=data/uploads
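How these variables might be read with their documented defaults is sketched below. The `Settings` class and `load_settings` function are hypothetical names, not the project's actual configuration module. The fallback for `SECRET_KEY` simply reuses the placeholder shown above and is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    secret_key: str
    admin_username: str
    admin_password: str
    photo_storage_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from the environment, falling back to README defaults."""
    return Settings(
        database_url=env.get("DATABASE_URL", "sqlite:///data/punimtag.db"),
        # Assumed fallback: the placeholder from this README, not a real secret.
        secret_key=env.get("SECRET_KEY", "your-secret-key-here"),
        admin_username=env.get("ADMIN_USERNAME", "admin"),
        admin_password=env.get("ADMIN_PASSWORD", "admin"),
        photo_storage_dir=env.get("PHOTO_STORAGE_DIR", "data/uploads"),
    )
```

Accepting the environment as a parameter keeps the function easy to test: `load_settings({"PHOTO_STORAGE_DIR": "/mnt/photos"})` overrides one value and leaves the rest at their defaults.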

🧪 Testing

# Backend tests (to be implemented)
cd /home/ladmin/Code/punimtag  # Replace with the path to your clone
source venv/bin/activate
export PYTHONPATH=/home/ladmin/Code/punimtag  # Same path as above
pytest tests/

# Frontend tests (to be implemented)
cd frontend
npm test

🗺️ Roadmap

✅ Phase 1: Foundations (Complete)

FastAPI backend scaffold
React frontend scaffold
Authentication system
Database setup
Basic API endpoints

✅ Phase 2: Image Ingestion & Processing (Complete)

✅ Photo import (folder scan and file upload)
✅ Background job processing with RQ
✅ Real-time progress tracking via SSE
✅ Scan tab UI implementation
✅ Duplicate detection and metadata extraction
✅ DeepFace face detection and processing pipeline
✅ Process tab UI with configuration controls
✅ Configurable detectors and models
✅ Face processing with progress tracking
✅ Job cancellation support

🔄 Phase 3: Identify Workflow & Auto-Match (In Progress)

Identify workflow UI
Auto-match engine with similarity thresholds
Unidentified faces management
Person creation and linking

📋 Phase 4: Search & Tags

Search endpoints with filters
Tag management UI
Virtualized photo grid
Advanced filtering

🎨 Phase 5: Polish & Release

Performance optimization
Accessibility improvements
Production deployment
Documentation

🏗️ Architecture

Backend:

Framework: FastAPI (Python 3.12+)
Database: SQLite (dev), PostgreSQL (production)
ORM: SQLAlchemy 2.0
Migrations: Alembic
Jobs: Redis + RQ
Auth: JWT (python-jose)

Frontend:

Framework: React 18 + TypeScript
Build Tool: Vite
Styling: Tailwind CSS
State: React Query + Context API
Routing: React Router

Deployment:

Docker Compose for local development
Containerized services for production

📦 Dependencies

Backend:

fastapi==0.115.0
uvicorn[standard]==0.30.6
pydantic==2.9.1
SQLAlchemy==2.0.36
alembic==1.13.2
python-jose[cryptography]==3.3.0
redis==5.0.8
rq==1.16.2
deepface>=0.0.79
tensorflow>=2.13.0

Frontend:

react==18.2.0
react-router-dom==6.20.0
@tanstack/react-query==5.8.4
axios==1.6.2
tailwindcss==3.3.5

🔒 Security

JWT-based authentication with refresh tokens
Password hashing (not yet implemented; passwords are currently compared in plain text, fix before production)
CORS configured for development (restrict in production)
SQL injection prevention via SQLAlchemy ORM
Input validation via Pydantic schemas

⚠️ Note: Default credentials (admin/admin) are for development only. Change in production!

🐛 Known Limitations

Single-user mode only (multi-user support planned)
SQLite for development (PostgreSQL recommended for production)
No password hashing yet (plain text comparison - fix before production)
GPU acceleration not yet implemented
Large databases (>50K photos) may require optimization
DeepFace model downloads on first use (can take 5-10 minutes)
Face processing is CPU-intensive (GPU support planned for future)

📝 License

[Add your license here]

👥 Authors

PunimTag Development Team

🙏 Acknowledgments

DeepFace library by Sefik Ilkin Serengil - Modern face recognition framework
ArcFace - Additive Angular Margin Loss for Deep Face Recognition
RetinaFace - State-of-the-art face detection
TensorFlow, React, FastAPI, and all open-source contributors

📧 Support

For questions or issues:

Check documentation in docs/
See Phase 1 Checklist for implementation status
Review Migration Plan for roadmap

Made with ❤️ for photo enthusiasts

For the desktop version, see README_DESKTOP.md
