Learning Document Graphs with Attention for Image Manipulation Detection
Published in ICPRAI International Conference on Pattern Recognition and Artificial Intelligence, 2022
Recommended citation: Hailey James, Otkrist Gupta, and Dan Raviv. "Learning Document Graphs with Attention for Image Manipulation Detection." ICPRAI 2022 https://link.springer.com/chapter/10.1007/978-3-031-09037-0_22
Detecting manipulations in images is becoming increasingly important for combating misinformation and forgery. While recent advances in computer vision have led to improved methods for detecting spliced images, most state-of-the-art methods fail when applied to images containing mostly text, such as images of documents. We propose a deep-learning method for detecting manipulations in images of documents which leverages the unique structured nature of these images in comparison with those of natural scenes. Specifically, we re-frame the classic image splice detection problem as a node classification problem, in which Optical Character Recognition (OCR) bounding boxes form nodes and edges are added according to a text-specific distance heuristic. We propose a system composed of a Variational Autoencoder (VAE)-based embedding algorithm and a graph neural network with attention, trained end-to-end for robust manipulation detection. Our proposed model outperforms both a state-of-the-art image splice detection method and a document-specific method.
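To make the graph formulation concrete, the following is a minimal sketch of how OCR bounding boxes could become nodes connected by a text-aware distance. It is not the paper's implementation: the `Box` structure, the `text_distance` heuristic (vertical offsets weighted more heavily than horizontal ones, so boxes on the same text line count as closer), the `line_weight` value, and the k-nearest-neighbor rule are all assumptions chosen for illustration. The abstract only states that edges follow a text-specific distance heuristic.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned OCR bounding box (hypothetical structure)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def text_distance(a, b, line_weight=3.0):
    """Assumed heuristic: penalize vertical offsets more than horizontal
    ones, so boxes on the same text line are treated as closer."""
    (ax, ay), (bx, by) = a.center, b.center
    return hypot(ax - bx, line_weight * (ay - by))


def build_edges(boxes, k=2):
    """Connect each node to its k nearest boxes under text_distance.
    Returns a sorted list of undirected edges (i, j) with i < j."""
    edges = set()
    for i, a in enumerate(boxes):
        others = [j for j in range(len(boxes)) if j != i]
        others.sort(key=lambda j: text_distance(a, boxes[j]))
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Three words on one line and one word far below them.
boxes = [
    Box(0, 0, 40, 10),
    Box(50, 0, 40, 10),
    Box(110, 0, 40, 10),
    Box(0, 100, 40, 10),
]
edges = build_edges(boxes, k=1)
print(edges)
```

In a full pipeline, each node would additionally carry an embedding of its image crop (VAE-based, per the abstract), and a graph neural network with attention would classify each node as manipulated or not. Those stages are omitted here.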