Context-Aware Image Captioning with Scene Graphs
End-to-end image captioning pipeline conditioned on structured scene graph triples. BLEU-4 +11.4% over an image-only baseline.
End-to-end image captioning pipeline conditioned on structured scene graph triples. BLEU-4 +11.4% over an image-only baseline.
Fine-tuning a 3B instruct model with QLoRA to produce strict JSON from free-form text — on a single consumer GPU. Exact-match accuracy 0.258 → 0.624.
Rigorous evaluation of a 4-agent discussion pipeline on 7,661 benchmark questions. The framework decreased accuracy on all three benchmarks — an instructive negative result.
Comparison of three CNN architectures for 4× single-image super-resolution on the FFHQ dataset. Evaluated with PSNR and SSIM.