Detecting manipulations in images is becoming increasingly important for combating misinformation and forgery. While recent advances in computer vision have led to improved methods for detecting spliced images, most state-of-the-art methods fail when applied to images containing mostly text, such as images of documents. We propose a deep-learning method for detecting manipulations in images of documents that leverages the structured nature of these images in comparison with those of natural scenes. Specifically, we reframe the classic image splice detection problem as a node classification problem, in which Optical Character Recognition (OCR) bounding boxes form nodes and edges are added according to a text-specific distance heuristic. We propose a system composed of a Variational Autoencoder (VAE)-based embedding algorithm and a graph neural network with attention, trained end-to-end for robust manipulation detection. Our proposed model outperforms both a state-of-the-art image splice detection method and a document-specific method. Recommended citation: Hailey James, Otkrist Gupta, and Dan Raviv. “Learning Document Graphs with Attention for Image Manipulation Detection.” preprint.
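
To make the graph formulation in the abstract concrete, the following is a minimal sketch of turning OCR bounding boxes into nodes and connecting them with a distance heuristic. The abstract does not specify the heuristic, so everything here is an assumption for illustration: the `Box` type, the `text_distance` function, the `line_weight` penalty on vertical gaps, and the `threshold` value are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """An OCR bounding box: top-left corner (x, y), width w, height h, in pixels."""
    x: float
    y: float
    w: float
    h: float


def gap(a_start, a_len, b_start, b_len):
    """Distance between two 1-D intervals; 0 if they overlap or touch."""
    return max(0.0, max(a_start, b_start) - min(a_start + a_len, b_start + b_len))


def text_distance(a, b, line_weight=3.0):
    """Hypothetical text-aware distance (an assumption, not the paper's heuristic).

    Vertical gaps are weighted more heavily than horizontal ones, so boxes on
    the same text line count as closer than boxes on different lines.
    """
    dx = gap(a.x, a.w, b.x, b.w)
    dy = gap(a.y, a.h, b.y, b.h)
    return dx + line_weight * dy


def build_edges(boxes, threshold=20.0):
    """Return undirected edges (i, j), i < j, between boxes within the threshold."""
    return [
        (i, j)
        for i, j in combinations(range(len(boxes)), 2)
        if text_distance(boxes[i], boxes[j]) <= threshold
    ]


# Two words on one line and a third word far below them.
boxes = [Box(0, 0, 50, 10), Box(55, 0, 40, 10), Box(0, 100, 50, 10)]
edges = build_edges(boxes)
```

In this toy layout only the two boxes that share a line are connected. In a full pipeline, each node would additionally carry an embedding of its image crop (VAE-based in the abstract), and a graph neural network with attention would classify each node as manipulated or not.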