ProbaV Processed Dataset

The ProbaV dataset is the origin point of the entire LWIR MFSR project. It is the dataset that the PIUnet architecture was originally designed for, and its directory format became the canonical data interchange format used across every tool in the pipeline.

What ProbaV Is

Proba-V is an ESA (European Space Agency) vegetation-monitoring satellite launched in 2013. The PROBA-V Super Resolution competition, hosted on ESA's Kelvins platform in 2018-2019, challenged participants to fuse multiple 300m-resolution images of the same scene into a single 100m-resolution reconstruction -- a 3x super-resolution factor.

The competition provided two spectral bands (RED and NIR), each containing hundreds of image sets. Each image set consists of:

  • One high-resolution (HR) reference image at 384x384 pixels, 16-bit unsigned integer (I;16 mode PNG)
  • Multiple low-resolution (LR) frames at 128x128 pixels (3x smaller), also 16-bit
  • Quality masks (QM) for each LR frame (binary masks indicating valid pixels)
  • A status mask (SM) for the HR reference (binary, same resolution as HR)

The 3x factor between LR and HR is the defining characteristic: 128 * 3 = 384. This 3x ratio became hard-wired into the PIUnet architecture and all downstream LWIR tooling.
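
The hard-wired 3x relationship can be stated as a one-line invariant. The helper below is purely illustrative (it is not part of the PIUnet codebase); the sizes are the ones documented above:

```python
# Sketch: the 3x LR->HR geometry that PIUnet and the LWIR tooling assume.
# `check_imgset_geometry` is a hypothetical helper, not a PIUnet function.

SCALE = 3  # fixed upsampling factor inherited by all downstream tooling

def check_imgset_geometry(lr_size, hr_size, scale=SCALE):
    """Return True if the HR size is exactly `scale` times the LR size."""
    return all(h == l * scale for l, h in zip(lr_size, hr_size))

# ProbaV: 128x128 LR frames against a 384x384 HR reference.
assert check_imgset_geometry((128, 128), (384, 384))
# The LWIR small-tile variant keeps the same factor: 64 -> 192.
assert check_imgset_geometry((64, 64), (192, 192))
```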

Location and Structure

Path: /home/geoff/projects/ceres/superrez/probav_data_processed/

Not a git repository. Total disk usage: ~6.5 GB.

probav_data_processed/
  norm.csv                    # Per-imgset normalization values (1449 entries)
  failed_registrations.json   # 44 LR images that failed SuperGlue registration
  train/
    NIR/                      # 566 image sets (imgset0594 - imgset1159)
    RED/                      # 594 image sets (imgset0000 - imgset0593)
    probav_summary.json       # Dataset statistics and shift metadata
    corrections.json          # 14 outlier shift corrections
    outliers.json             # 14 detected outlier shifts
    suspicious_shifts.json    # 9968 suspicious shift comparisons (DNN vs original)
    All_shift_histograms.png  # Visualization of shift distributions
    corrected_All_shift_histograms.png
    NIR_shift_histograms.png
    RED_shift_histograms.png
    shift_correction_comparison.png
  test/
    NIR/                      # 144 image sets
    RED/                      # 146 image sets

Per-Imgset Contents (Training)

Each imgset####/ directory under train/ contains:

File                        Size              Description
HR.png                      384x384, uint16   High-resolution ground truth reference
HR_downscaled.png           128x128, uint16   HR bicubic-downsampled to LR resolution (added during processing)
SM.png                      384x384, bool     Status mask for HR image
SM_downscaled.png           128x128           Downscaled status mask
LR000.png ... LR0XX.png     128x128, uint16   Original LR frames (~9-21 per set)
LR000_aligned.png ...       128x128, uint16   LR frames re-registered to HR (added Feb-Mar 2025)
QM000.png ... QM0XX.png     128x128, bool     Quality masks for original LR frames
QM000_aligned.png ...       128x128, bool     Quality masks warped to match aligned LR frames
clearance.npy               float32 array     Per-frame clearance scores (number of valid pixels)
sharpness.npy               float32 array     Per-frame sharpness scores (Laplacian variance)
probav_shifts.json          JSON              Registration shift metadata per LR frame
registration_results.json   JSON              Full registration results (large, e.g. ~3 MB in RED/imgset0001)

Test imgsets contain only LR frames, QM masks, and clearance.npy -- no HR ground truth or SM (consistent with the competition format where test HR was withheld).

Image Properties

HR:  384x384, uint16, typical range ~2700-15000
LR:  128x128, uint16, typical range ~4400-15200
QM:  128x128, boolean (valid pixel mask)
SM:  384x384, boolean (valid pixel mask)

The pixel values are satellite reflectance measurements, not temperature. The 16-bit range is much wider than typical 8-bit imagery, and the normalization constants used in the original PIUnet code reflect this: mu=7433.6436, sigma=2353.0723 (from piunet/piunet/data/datasets.py, lines 35-36).
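
The standardization these constants imply can be sketched as follows. The function name is illustrative; only the mu/sigma values come from datasets.py:

```python
import numpy as np

# Sketch of the standardization PIUnet applies to ProbaV pixels, using the
# mu/sigma constants quoted above. `normalize` is an illustrative name,
# not the actual datasets.py API.
MU, SIGMA = 7433.6436, 2353.0723

def normalize(img_u16):
    """Map uint16 reflectance counts to roughly zero-mean, unit-variance floats."""
    return (img_u16.astype(np.float32) - MU) / SIGMA

lr = np.full((128, 128), 7434, dtype=np.uint16)  # near the dataset mean
z = normalize(lr)
# values near the dataset mean land near zero after standardization
assert abs(float(z.mean())) < 0.01
```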

Processing History

Original Raw Data (Sep 2018)

The original competition data lives at piunet/Dataset/probav_data/ with subdirectories train/, val/, and test/. The raw data files carry timestamps from Sep 18, 2018 (the original distribution date). This raw data was also copied to probav_data_processed/ as the baseline.

The original PIUnet preprocessing pipeline (piunet/piunet/data/preprocess_probav.py) loads this raw data, performs phase-correlation-based registration via register_dataset(), selects the best T=9 frames per image set based on a clearance threshold of 85%, and saves the result as large .npy arrays:

Dataset/
  X_RED_train.npy, X_NIR_train.npy     # LR frames (N, 128, 128, 9)
  y_RED_train.npy, y_NIR_train.npy     # HR references (N, 384, 384, 1)
  y_RED_train_masks.npy, ...           # HR masks
  X_RED_val.npy, X_NIR_val.npy         # Validation split
  X_RED_test.npy, X_NIR_test.npy       # Test (no HR)
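
The clearance-based frame selection described above can be sketched as follows. This is a minimal reconstruction of the idea (keep the T=9 frames with the highest clearance, subject to an 85% threshold), not the actual preprocess_probav.py code; all names are illustrative:

```python
import numpy as np

# Sketch: select the best T=9 LR frames per image set by clearance
# (fraction of valid QM pixels), with an 85% clearance threshold.
T, CLEARANCE_THRESHOLD = 9, 0.85

def select_best_frames(quality_masks, t=T, threshold=CLEARANCE_THRESHOLD):
    """quality_masks: (N, H, W) boolean array, one mask per LR frame.
    Returns indices of up to `t` frames, best clearance first."""
    clearance = quality_masks.reshape(len(quality_masks), -1).mean(axis=1)
    keep = np.where(clearance >= threshold)[0]
    # rank surviving frames by clearance, best first, and take up to t
    ranked = keep[np.argsort(-clearance[keep])]
    return ranked[:t]

masks = np.ones((12, 128, 128), dtype=bool)
masks[0, :64] = False          # frame 0 has only 50% valid pixels
chosen = select_best_frames(masks)
assert 0 not in chosen and len(chosen) == 9
```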

Enhanced Registration (Jan-Mar 2025)

The probav_data_processed/ directory represents a second-pass enhancement done in early 2025, adding improved registration of LR frames to the HR reference using SuperGlue feature matching (via lwir-align/tests/process_probav_folder.py). This produced the *_aligned.png files in each imgset.

The processing pipeline for this enhanced registration:

  1. SuperGlue registration (process_probav_folder.py): Each LR frame was registered to HR_downscaled.png using SuperGlue with CLAHE preprocessing. Homographies were validated for scale (0.9-1.1), rotation (<3 deg), and translation (<5 px). Failed registrations were logged to failed_registrations.json (44 failures total, all in RED band).

  2. Shift analysis (Mar 2025): Sub-pixel shifts were computed and analyzed. Per probav_summary.json:

     • 1160 image sets total, 22,294 individual LR images processed
     • Mean shift magnitude: 0.576 pixels (LR scale) before correction, 0.279 after
     • 14 outliers detected (shift magnitude > 1.33 pixels) and corrected

  3. DNN-based shift refinement: The suspicious_shifts.json contains 9,968 entries comparing original grid-based shifts against DNN-estimated shifts, with quality metrics (SSIM, NCC, mutual information).

  4. Clearance and sharpness scores: Computed by HighRes-net/src/save_clearance.py, stored as per-frame .npy arrays. Clearance counts valid pixels per QM mask; sharpness uses Laplacian variance.
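
The homography validation bounds from step 1 (scale 0.9-1.1, rotation < 3 deg, translation < 5 px) can be sketched as below. The similarity-style decomposition is a standard approximation for illustration, not the exact process_probav_folder.py logic:

```python
import numpy as np

# Sketch: accept a registration homography only if its scale, rotation,
# and translation fall inside the documented bounds. Decomposition assumes
# the warp is close to a similarity transform.

def homography_is_valid(H, scale_lo=0.9, scale_hi=1.1,
                        max_rot_deg=3.0, max_trans_px=5.0):
    a, b = H[0, 0], H[0, 1]
    scale = float(np.hypot(a, b))                # similarity-scale estimate
    rot = float(np.degrees(np.arctan2(-b, a)))   # in-plane rotation estimate
    tx, ty = float(H[0, 2]), float(H[1, 2])
    return (scale_lo <= scale <= scale_hi
            and abs(rot) <= max_rot_deg
            and max(abs(tx), abs(ty)) <= max_trans_px)

# A near-identity warp with a 2 px shift passes...
ok = homography_is_valid(np.array([[1.0, 0.0, 2.0],
                                   [0.0, 1.0, 0.0],
                                   [0.0, 0.0, 1.0]]))
# ...while a 10 px shift would be rejected and logged as a failure.
bad = homography_is_valid(np.array([[1.0, 0.0, 10.0],
                                    [0.0, 1.0, 0.0],
                                    [0.0, 0.0, 1.0]]))
assert ok and not bad
```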

Some RED imgsets (e.g., imgset0000, imgset0001) also contain extra analysis artifacts: FFT visualizations (*_fft_vis.png), frequency analysis plots, RAFT-based registrations, and debug visualizations from experimental registration methods.
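
The two per-frame scores stored in clearance.npy and sharpness.npy can be sketched as below. save_clearance.py's exact implementation may differ; this version uses a plain 4-neighbour Laplacian for illustration:

```python
import numpy as np

# Sketch: clearance counts valid QM pixels; sharpness is the variance of
# a Laplacian response over the frame.

def clearance_score(qm):
    """Number of valid pixels in a boolean quality mask."""
    return int(qm.sum())

def sharpness_score(img):
    """Variance of the discrete 4-neighbour Laplacian of an image."""
    f = img.astype(np.float64)
    lap = (-4.0 * f[1:-1, 1:-1]
           + f[:-2, 1:-1] + f[2:, 1:-1]
           + f[1:-1, :-2] + f[1:-1, 2:])
    return float(lap.var())

qm = np.ones((128, 128), dtype=bool)
assert clearance_score(qm) == 128 * 128
# a flat image has zero Laplacian variance; any texture raises it
flat = np.full((16, 16), 5000, dtype=np.uint16)
assert sharpness_score(flat) == 0.0
```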

The "ProbaV Format" as Data Interchange Standard

The ProbaV directory layout -- one folder per scene, with HR.png, LR###.png, QM###.png, SM.png -- became the canonical data format for the entire LWIR MFSR project. This was a pragmatic decision: since PIUnet was already designed to consume this layout, the LWIR data pipeline was built to produce it.

The LWIR tile validator (LWIR Tile Validator) exports validated LWIR tiles in ProbaV format via ProbaVExporter (lwir_tile_validator/src/frontend/export/probav_exporter.py). The export writes:

  • tile_####/HR.png -- LA mosaic patch (192x192 or 384x384)
  • tile_####/SM.png -- Status mask
  • tile_####/LR###.png -- HA frames warped to mosaic coordinates (64x64 or 128x128)
  • tile_####/QM###.png -- Quality masks
  • tile_####/metadata.json -- Tile provenance and quality data

The PIUnet dataset loader for LWIR (piunet/piunet/data/datasets_probav_format.py) loads these exports directly. The class LWIRProbaVDataset looks for tile_* directories and reads LR###.png, HR.png, SM.png, and QM###.png files, exactly mirroring the ProbaV structure but with different naming (tile_ prefix instead of imgset prefix).
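
The tile_* discovery pattern described above can be sketched as follows. The real LWIRProbaVDataset does more (normalization, tensor conversion, best-reference selection); this shows only the directory and file lookup, with an illustrative function name:

```python
from pathlib import Path

# Sketch: index a ProbaV-format export by finding tile_* directories and
# their HR/SM/LR/QM files. Not the actual LWIRProbaVDataset code.

def index_lwir_tiles(root):
    """Map each tile directory name to its HR, SM, and sorted LR/QM paths."""
    tiles = {}
    for tile_dir in sorted(Path(root).glob("tile_*")):
        tiles[tile_dir.name] = {
            "hr": tile_dir / "HR.png",
            "sm": tile_dir / "SM.png",
            "lr": sorted(tile_dir.glob("LR[0-9][0-9][0-9].png")),
            "qm": sorted(tile_dir.glob("QM[0-9][0-9][0-9].png")),
        }
    return tiles

# An empty or missing root simply yields no tiles.
assert index_lwir_tiles("/nonexistent") == {}
```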

Format Comparison: ProbaV vs LWIR

Property           ProbaV (Original)         LWIR (Adapted)
LR resolution      128x128                   64x64 or 128x128
HR resolution      384x384                   192x192 or 384x384
Scale factor       3x                        3x (preserved)
Pixel type         uint16 (reflectance)      uint16 (temperature counts)
Normalization      mu=7433, sigma=2353       mu=31952, sigma=543
Frames per scene   9-30+ (variable)          8-9 (from flight overlap)
Bands              RED, NIR                  Single (LWIR, 8-14 um)
Directory naming   imgset####                tile_####
Extra files        norm.csv, clearance.npy   metadata.json, best_reference_frame
Source             Satellite repeat passes   Aircraft altitude pairs

The 3x scale factor was inherited from ProbaV and applied to the LWIR problem by choosing tile sizes that maintained it: 64x64 LR / 192x192 HR and 128x128 LR / 384x384 HR. This was a design choice, not a physical property of the LWIR sensor.

Relationship to PIUnet Training

PIUnet was trained on ProbaV before being adapted for LWIR:

  1. Original ProbaV training: The pretrained NIR model weights (piunet/pretrained_weights/model_weights_best.pt) were trained on ProbaV NIR data. The model learned satellite imagery super-resolution using the ProbaVDatasetTrain class.

  2. Transfer to LWIR: The same PIUnet architecture was retrained on LWIR data using LWIRProbaVDataset, which loads LWIR tiles exported in ProbaV format. The architecture remained identical (3x upsampling, TEFA attention blocks, TERN alignment module) -- only the data and normalization constants changed.

  3. The normalization gap: ProbaV mu/sigma (7433/2353) vs LWIR mu/sigma (31952/543) reveals a fundamental difference. ProbaV satellite reflectance has wide dynamic range with high variance. LWIR thermal data has narrow dynamic range clustered around ambient temperature, with very low contrast -- this mismatch contributed to PIUnet's poor LWIR performance (see Bicubic Gap).
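
One way to make the gap concrete is to compare the coefficient of variation (sigma/mu) implied by the two sets of constants quoted above, both taken from the comparison table:

```python
# The normalization gap, quantified: sigma/mu (coefficient of variation)
# is roughly 19x smaller for LWIR than for ProbaV, which captures the
# "narrow dynamic range" problem in a single number.

probav_mu, probav_sigma = 7433.6436, 2353.0723
lwir_mu, lwir_sigma = 31952.0, 543.0

probav_cv = probav_sigma / probav_mu   # ~0.317
lwir_cv = lwir_sigma / lwir_mu         # ~0.017
assert probav_cv / lwir_cv > 18        # order-of-magnitude contrast gap
```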

Relationship to Other Wiki Pages

Key Insight: Why ProbaV Format Persisted

The decision to maintain ProbaV's directory layout throughout the LWIR pipeline was driven by engineering pragmatism: it avoided rewriting the data loading code. But it also locked in assumptions about the data -- 3x scale factor, per-scene directory organization, quality mask conventions -- that may not be optimal for LWIR aerial imagery. The LWIR data has fundamentally different characteristics (narrow dynamic range, consistent viewpoint geometry, temporal correlation between frames) that ProbaV's satellite-repeat-pass format does not capture. Future architectures (see Data Collection v2) may benefit from a format designed specifically for aerial MFSR.