AI upscaling of images and videos
Key technologies and architecture:
- AI models: Integrated Real-ESRGAN (x2/x4/x8, normal/anime modes) for upscaling and GFPGAN (v1.4) for face restoration.
- Model management: Implemented a singleton ModelManager with an OrderedDict-based LRU cache for efficient model loading and switching.
- GPU optimization: GPUCapabilities class automatically detects VRAM and compute capability, then dynamically adjusts tile_size, toggles half precision (FP16), and enables TF32 on Ampere/Ada GPUs.
- Multithreading: Image processing runs in a QThread (UpscaleWorker) so the GUI is never blocked.
- Video processing: Implemented a reader/processor/writer pipeline on threading.Thread and queue.Queue for parallel frame processing. Calls ffmpeg via subprocess for demultiplexing, preserving audio tracks, and final assembly.
- Filters: Modular FilterPipeline for applying a chain of filters (CLAHE, Bilateral denoise, Dehaze, Canny edge sharpening, etc.).
- Memory optimization: MemoryOptimizedProcessor handles very large images by processing them in tiles.
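
The model-management item above can be illustrated with a minimal sketch of a singleton manager backed by an OrderedDict LRU cache. This is not the project's actual code: the method names get and loaded_keys, the capacity parameter, and the loader callback are assumptions made for illustration, and real model loading (Real-ESRGAN, GFPGAN) is replaced by an arbitrary callable.

```python
from collections import OrderedDict
from threading import Lock
from typing import Any, Callable, Hashable, List


class ModelManager:
    """Process-wide singleton that keeps at most `capacity` models loaded."""

    _instance = None
    _instance_lock = Lock()

    def __new__(cls, capacity: int = 2):
        # Only the first call creates the instance; later calls return it
        # unchanged, so a different `capacity` passed later is ignored.
        with cls._instance_lock:
            if cls._instance is None:
                inst = super().__new__(cls)
                inst._capacity = capacity
                inst._cache = OrderedDict()
                inst._lock = Lock()
                cls._instance = inst
        return cls._instance

    def get(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached model for `key`, loading it on a cache miss."""
        with self._lock:
            if key in self._cache:
                # Cache hit: mark as most recently used.
                self._cache.move_to_end(key)
                return self._cache[key]
            model = loader()  # hypothetical loader, e.g. reads weights from disk
            self._cache[key] = model
            if len(self._cache) > self._capacity:
                # Evict the least recently used entry (front of the dict).
                self._cache.popitem(last=False)
            return model

    def loaded_keys(self) -> List[Hashable]:
        """Keys currently cached, least recently used first."""
        with self._lock:
            return list(self._cache)
```

Keeping the cache small matters here because each entry would hold model weights in VRAM; in a real implementation the eviction step would presumably also release GPU memory, which this sketch does not attempt.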
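
The reader/processor/writer arrangement from the video-processing item can be sketched with three threads connected by bounded queues. The function name run_pipeline, the _STOP sentinel, and the queue_size parameter are hypothetical; frames are any iterable and the "writer" simply collects results in a list, whereas the real pipeline would decode with a video reader and hand frames to ffmpeg.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel marking the end of the frame stream


def run_pipeline(frames: Iterable[Any],
                 process: Callable[[Any], Any],
                 queue_size: int = 8) -> List[Any]:
    """Run reader, processor and writer stages concurrently; return written frames."""
    to_process: queue.Queue = queue.Queue(maxsize=queue_size)
    to_write: queue.Queue = queue.Queue(maxsize=queue_size)
    written: List[Any] = []
    errors: List[BaseException] = []

    def reader() -> None:
        try:
            for frame in frames:
                to_process.put(frame)  # blocks when the processor falls behind
        except Exception as exc:
            errors.append(exc)
        finally:
            to_process.put(_STOP)

    def processor() -> None:
        while True:
            frame = to_process.get()
            if frame is _STOP:
                break
            if errors:
                continue  # after a failure, keep draining so the reader can finish
            try:
                to_write.put(process(frame))
            except Exception as exc:
                errors.append(exc)
        to_write.put(_STOP)

    def writer() -> None:
        while True:
            frame = to_write.get()
            if frame is _STOP:
                break
            written.append(frame)  # stand-in for encoding or writing to disk

    threads = [threading.Thread(target=stage, name=stage.__name__)
               for stage in (reader, processor, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise errors[0]
    return written
```

The bounded queues provide backpressure, so decoded frames cannot pile up in RAM faster than the upscaler consumes them, and a single processor thread keeps frames in their original order.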
Functionality:
- Batch image processing (including drag-and-drop).
- Live preview with a SplitView widget (before/after comparison).
- Preset management (via QSettings).
- Localization (JSON, Translator) and theme switching (QSS).
- Real-time monitoring of VRAM/RAM (psutil, pyqtgraph).
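
As an illustration of the JSON-based localization item, here is a minimal sketch of a translator that loads a flat key-to-string JSON file and falls back to the key when a translation is missing. The flat file layout, the from_json and tr method names, and the placeholder syntax are assumptions, not the project's actual Translator interface.

```python
import json
from typing import Dict, Optional


class Translator:
    """Looks up UI strings by key, falling back to the key itself."""

    def __init__(self, strings: Optional[Dict[str, str]] = None):
        self._strings: Dict[str, str] = dict(strings or {})

    @classmethod
    def from_json(cls, path: str) -> "Translator":
        """Load a flat {"key": "translated text"} mapping from a JSON file."""
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
        if not isinstance(data, dict):
            raise ValueError("translation file must contain a JSON object")
        return cls({str(k): str(v) for k, v in data.items()})

    def tr(self, key: str, **params: object) -> str:
        """Return the translation for `key`, filling {name} placeholders."""
        template = self._strings.get(key, key)
        return template.format(**params) if params else template
```

Falling back to the key keeps the interface usable when a language file is incomplete, and makes missing entries easy to spot on screen.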