CNN Image Super-Resolution (4× Upscaling)

Course: CSCI E-89 (Deep Learning), Harvard Extension School, Fall 2024

Code: github.com/nthapaliya/cnn-image-upscaling

Overview

Single-image super-resolution (SISR) is the task of recovering a plausible high-resolution image from a low-resolution input. This project trains and compares three CNN architectures at 4× upscaling on the FFHQ dataset (70,000 high-quality face images), measuring output quality with PSNR and SSIM.

Faces provide a structured benchmark domain where quality degradation is perceptually obvious and metrics are well-calibrated.

Architectures

Model	Key idea
SRCNN	Pioneering 3-layer super-resolution CNN (Dong et al., 2014)
ESPCN	Sub-pixel convolution (pixel shuffle) for efficient upscaling (Shi et al., 2016)
EDSR	Removes batch norm for more stable training at depth (Lim et al., 2017)

Evaluation

Both metrics computed on the luminance channel (Y of YCbCr), matching standard practice:

PSNR — Peak Signal-to-Noise Ratio (higher is better, measured in dB)
SSIM — Structural Similarity Index (higher is better, 0–1)

Dataset

FFHQ — 70,000 high-quality PNG face images at 1024×1024. Downloaded via Kaggle. Low-resolution training inputs created by bicubic downsampling (4× reduction). 65,000 train / 5,000 test split.

Stack

TensorFlow 2.x, Keras, NumPy, Matplotlib, Kaggle API, uv

Overview#

Architectures#

Evaluation#

Dataset#

Stack#

Overview

Architectures

Evaluation

Dataset

Stack