Cheap, easy access to peripheral devices such as printers and scanners, together with their wide use, has contributed greatly to the amount of printed information generated today. Advertisements, currencies, books, newspapers, magazines, contracts, and product packaging all involve some printing technology. As staffless and cashless stores are adopted in big cities, supermarkets and shops will offer only printed data (such as QR codes) for purchases and interaction with customers, making printing and scanning technologies crucial in the near future. Despite this growing availability of printed information, the lack of regulation and of forensic procedures for this medium has allowed counterfeiters and other criminals to exploit the technology for illicit purposes. For example, printed documents that serve as evidence in criminal investigations, such as those related to corruption and money laundering, can be found in a suspect's house; fake currency can be printed and distributed in a neighborhood, harming the local economy; domestic or international terrorist plans can be found in a facility; pedophiles can print and distribute child pornography to evade the monitoring that security agencies perform over the Internet; and impostors can forge badges to gain access to restricted areas, undermining the organization and security of events. Moreover, modern printing and scanning technologies have made counterfeiting easier and more profitable than ever, as counterfeiters can faithfully copy and print the packaging of fake products to resemble the original. This problem has led the International Chamber of Commerce to warn of €3.7 trillion in losses due to counterfeiting and piracy, with 5.4 million jobs at risk by 2022. Product counterfeiting also has a significant impact on health: according to the World Health Organization, up to half of malaria medications could be fake.
In the PrintOut project, we aim to tackle the above-mentioned problems through research on Computer Vision and Machine Learning solutions for printed document forensics. The project addresses the following research problems:
(i) lack of low-cost procedures that use precise statistical models to perform robust Digital Image Forensics on printed documents;
(ii) lack of comprehensive training data for machine learning models;
(iii) open-set classification, in which samples from unknown classes must be handled; and
(iv) security threats such as adversarial attacks.