To assess method performance, we collected 150 photobombed images and created corresponding binary (logical) masks highlighting the unwanted elements. A custom-edited ground-truth image was then produced for each photobombed image using tools such as Adobe Photoshop, Microsoft Paint, and Cleanup Pictures. The images were sourced manually from online platforms including Facebook, Getty Images, Bored Panda, Adobe Stock, Shutterstock, and Pinterest. They were annotated with MATLAB's region-of-interest functionality to craft the logical masks, using a loop to automate the process.
MATLAB's freehand ROI function was used to outline each unwanted region iteratively until a binary mask covering all of them was obtained. Annotation averaged roughly 50–60 seconds per image, varying with the number of objects. The resulting photobombed images and masks were subsequently used to test different inpainting methods for removing the undesirable regions.
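The annotation pipeline itself is MATLAB-based. As a rough illustration only, a Python/OpenCV analogue of that loop might look like the sketch below; the window name, key bindings, and file paths are hypothetical, and click-dragged polygon outlines stand in for MATLAB's freehand tool.

```python
import cv2
import numpy as np

# Hypothetical Python/OpenCV analogue of the MATLAB freehand-ROI loop
# described above; the study itself used MATLAB's ROI tools.
def annotate_mask(image_path, mask_path):
    img = cv2.imread(image_path)
    mask = np.zeros(img.shape[:2], dtype=np.uint8)
    points = []

    def on_mouse(event, x, y, flags, param):
        # Record outline points while the left button is pressed or dragged.
        if event == cv2.EVENT_LBUTTONDOWN or (
            event == cv2.EVENT_MOUSEMOVE and flags & cv2.EVENT_FLAG_LBUTTON
        ):
            points.append((x, y))

    cv2.namedWindow("annotate")
    cv2.setMouseCallback("annotate", on_mouse)
    while True:
        vis = img.copy()
        if points:
            cv2.polylines(vis, [np.array(points, np.int32)], False, (0, 255, 0), 2)
        cv2.imshow("annotate", vis)
        key = cv2.waitKey(20) & 0xFF
        if key == ord("n") and points:   # close current region, start next object
            cv2.fillPoly(mask, [np.array(points, np.int32)], 255)
            points.clear()
        elif key == ord("q"):            # finish this image
            if points:
                cv2.fillPoly(mask, [np.array(points, np.int32)], 255)
            break
    cv2.destroyAllWindows()
    cv2.imwrite(mask_path, mask)         # binary (logical) mask, 255 = remove

annotate_mask("photobombed_001.jpg", "mask_001.png")
```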
The photobombed image and its corresponding mask serve as input to the different inpainting techniques, each of which attempts to reconstruct the image. Six methods are compared to identify the most effective approach: Exemplar-Based Image Inpainting (EBII), Coherence Transport (CT), Fast Marching (FM), Fluid Dynamics (FD), Gated Convolution (GC), and Resolution-robust Large Mask Inpainting with Fourier Convolutions (LaMa). Each method receives the photobombed image and its mask, removes the unwanted elements, and restores the underlying scene. EBII and CT are executed in MATLAB, while FM and FD are implemented with OpenCV. GC uses a pre-trained model from GitHub, and LaMa is run with the authors' publicly released code. These techniques are applied to the testing set to generate images free of photobombing, which are then compared to the ground truth using the metrics detailed in the following section.
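Of the six methods, the two OpenCV baselines are the most direct to reproduce. A minimal sketch with placeholder file names is shown below; `cv2.inpaint` implements both Telea's fast-marching method (FM) and the Navier-Stokes, fluid-dynamics formulation (FD).

```python
import cv2

# Minimal sketch of the two OpenCV baselines; file names are placeholders.
img = cv2.imread("photobombed.jpg")
# The mask must be single-channel, with nonzero pixels marking the fill region.
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)

# FM: Telea's fast-marching method; 3 is the inpainting radius in pixels.
fm_result = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)
# FD: Navier-Stokes (fluid-dynamics) based inpainting.
fd_result = cv2.inpaint(img, mask, 3, cv2.INPAINT_NS)

cv2.imwrite("fm_result.png", fm_result)
cv2.imwrite("fd_result.png", fd_result)
```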
To evaluate the outcomes of all techniques, the images produced by photobombing removal are compared against the meticulously edited reference images. Several assessment criteria gauge the efficacy of the removal results: the Fréchet Inception Distance (FID), the Structural Similarity Index (SSIM), and the Peak Signal-to-Noise Ratio (PSNR), each comparing the reconstructed images with the custom-edited references. In addition, we introduce a Texture-based Similarity Index (TSI) built on the concept of Local Binary Patterns (LBP) [22].
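FID is computed over the full set of outputs with an Inception network (e.g., via the pytorch-fid package), whereas SSIM, PSNR, and the texture comparison operate per image pair. The sketch below uses scikit-image; since the exact TSI formula is not reproduced here, the LBP-histogram intersection is only an illustrative stand-in for the paper's index, and the file paths are placeholders.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt = cv2.imread("ground_truth.png")    # custom-edited reference (placeholder path)
pred = cv2.imread("lama_result.png")   # inpainted output (placeholder path)

gt_gray = cv2.cvtColor(gt, cv2.COLOR_BGR2GRAY)
pred_gray = cv2.cvtColor(pred, cv2.COLOR_BGR2GRAY)

ssim = structural_similarity(gt_gray, pred_gray)
psnr = peak_signal_noise_ratio(gt, pred)

def lbp_hist(gray, p=8, r=1):
    # Uniform LBP yields p + 2 distinct codes; normalize to a probability histogram.
    codes = local_binary_pattern(gray, p, r, method="uniform")
    hist, _ = np.histogram(codes, bins=p + 2, range=(0, p + 2), density=True)
    return hist

# Histogram intersection in [0, 1]; 1 means identical texture statistics.
# This is only an illustrative texture similarity, not the paper's TSI formula.
tsi_like = np.minimum(lbp_hist(gt_gray), lbp_hist(pred_gray)).sum()

print(f"SSIM={ssim:.4f}  PSNR={psnr:.2f} dB  LBP-similarity={tsi_like:.4f}")
```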
The evaluation of the inpainting techniques on our dataset focuses on their ability to remove photobombed regions while preserving overall image quality. FID, SSIM, and PSNR are used to score each method against the ground truth, alongside the newly introduced Texture-based Similarity Index (TSI). The performance analysis reveals LaMa as the most effective method, with the best FID, SSIM, and PSNR scores and the highest average rank based on TSI. LaMa's success is attributed to its large effective receptive field, obtained through fast Fourier convolutions (FFCs), and its dedicated loss function. In contrast, GC lags behind due to its inability to reconstruct masked regions beyond trivial pixel detail. LaMa's superiority is also evident in the graphical presentation of the results.
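The average rank based on TSI can be read as ranking the six methods on each image and averaging those ranks over the dataset. A small illustration with placeholder scores (150 images, as in our dataset):

```python
import numpy as np

# Illustration of the average-rank statistic, assuming `tsi` holds one TSI
# score per (image, method), higher being better. Scores here are placeholders.
methods = ["EBII", "CT", "FM", "FD", "GC", "LaMa"]
tsi = np.random.rand(150, len(methods))

# Rank methods within each image (rank 1 = best TSI), then average over images.
order = np.argsort(-tsi, axis=1)
ranks = np.empty_like(order)
rows = np.arange(tsi.shape[0])[:, None]
ranks[rows, order] = np.arange(1, len(methods) + 1)
for m, r in zip(methods, ranks.mean(axis=0)):
    print(f"{m}: average rank {r:.2f}")
```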
We also examine the efficacy of the inpainting methods in terms of FID score as a function of the percentage of the image region to be inpainted. All methods perform well in the 0-10% range, with LaMa consistently outperforming the others across all percentages. By contrast, the performance of EBII drops in certain ranges, while FM remains average throughout. FID scores rise as the inpainting region grows, and LaMa's advantage holds across this entire spectrum.
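The percentage of the region to be inpainted follows directly from the binary mask, so bucketing images into the 0-10%, 10-20%, ... ranges takes only a few lines (the mask path is a placeholder):

```python
import cv2
import numpy as np

# Fraction of the image covered by the mask, expressed as a percentage.
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
pct = 100.0 * np.count_nonzero(mask) / mask.size
bucket = int(min(pct, 99.9) // 10) * 10   # floor of the 10%-wide bucket
print(f"inpainting region: {pct:.1f}% -> bucket {bucket}-{bucket + 10}%")
```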
Removing photobombing, i.e., the unwanted and distracting elements that intrude into photographs, is a challenging problem of clear practical value, since people want their pictures free of uninvited guests. Our pipeline starts from photobombed images, for which we outline the regions to be removed; each image-mask pair is then passed to inpainting models that fill in the missing content, and the outputs are evaluated against custom-edited ground truth using quantitative metrics. Our experiments show that these models are effective at removing photobombing, and we regard these results as a promising starting point.
In future work, we plan to expand our toolkit and dataset to achieve even stronger results.
This research was supported by the National Science Foundation (NSF) under Grant 2025234.
Sai Pavan Kumar Prakya, Madamanchi Manju Venkata Sainath, Vatsa S. Patel, Samah Saeed Baraheem & Tam V. Nguyen.
Department of Computer Science, University of Dayton, Dayton, OH, 45469, USA