Academy & Industry Research Collaboration Center (AIRCC)

Volume 12, Number 02, January 2022

Detection Datasets: Forged Characters for Passport and Driving Licence


Teerath Kumar¹, Muhammad Turab², Shahnawaz Talpur², Rob Brennan¹ and Malika Bendechache¹, ¹Dublin City University, Ireland, ²Mehran University of Engineering and Technology, Pakistan


Detecting forged characters in personal documents such as passports and driving licences is an important and challenging task in digital image forensics, because forged information on such documents can be used for fraud, including theft and robbery. Deep learning models for detection tasks such as forged character detection are data hungry, yet datasets of forged characters on personal documents are very difficult to obtain: information privacy restricts sharing, available data is often unlabelled and hard to get labelled, and existing work is evaluated on private datasets with limited access. To address these issues, we propose a new algorithm that generates two new datasets, forged characters detection on passport (FCD-P) and forged characters detection on driving licence (FCD-D). To the best of our knowledge, we are the first to release these datasets. The proposed algorithm first reads a plain image, then forges it by either randomly changing the position of a randomly chosen character or randomly adding a small amount of noise, and at the same time records the bounding boxes of the forged characters. To reflect real-world conditions, we carefully apply multiple data augmentations to the cards. Each dataset consists of 15,000 images, each of size 950 x 550. Our algorithm code, FCD-P and FCD-D are publicly available.
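The forging step described above can be illustrated with a minimal sketch. This is not the released algorithm: the function name `forge_character`, its parameters (`max_shift`, `noise_amp`, `background`), the representation of a card as a nested list of grayscale pixel values, the (x, y, width, height) box format, the 50/50 choice between the two forgeries, and the reading of 950 x 550 as width x height are all assumptions made for illustration. Applying the noise to the selected character's region is likewise one possible interpretation.

```python
import random


def forge_character(image, char_boxes, rng, max_shift=3, noise_amp=20, background=255):
    """Forge one randomly chosen character and return (forged, box, kind).

    Hypothetical helper: image is a list of rows of 0-255 grayscale values,
    char_boxes is a list of (x, y, w, h) character boxes.
    """
    forged = [row[:] for row in image]  # leave the plain image untouched
    x, y, w, h = rng.choice(char_boxes)
    patch = [row[x:x + w] for row in image[y:y + h]]

    if rng.random() < 0.5:
        # Position forgery: move the character by a small non-zero offset.
        dx = dy = 0
        while dx == 0 and dy == 0:
            dx = rng.randint(-max_shift, max_shift)
            dy = rng.randint(-max_shift, max_shift)
        # Clamp so the moved character stays inside the card.
        nx = min(max(x + dx, 0), len(image[0]) - w)
        ny = min(max(y + dy, 0), len(image) - h)
        for r in range(h):
            forged[y + r][x:x + w] = [background] * w  # assumes a uniform background
        for r in range(h):
            forged[ny + r][nx:nx + w] = patch[r]
        return forged, (nx, ny, w, h), "shift"

    # Noise forgery: perturb every pixel of the character region slightly.
    for r in range(h):
        for c in range(w):
            noisy = patch[r][c] + rng.randint(-noise_amp, noise_amp)
            forged[y + r][x + c] = min(max(noisy, 0), 255)
    return forged, (x, y, w, h), "noise"


# Toy card: white canvas with two dark rectangles standing in for characters.
image = [[255] * 950 for _ in range(550)]
boxes = [(100, 200, 20, 30), (130, 200, 20, 30)]
for bx, by, bw, bh in boxes:
    for r in range(by, by + bh):
        image[r][bx:bx + bw] = [0] * bw

forged, box, kind = forge_character(image, boxes, random.Random(0))
```

The returned `box` is the label a detector would be trained on. In this sketch it marks the character's new location after a shift, which is only one possible labelling convention.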


Character detection dataset, Deep learning forgery, Forged character detection.