CORDIS - EU research results

ToothPic, a large-scale camera identification system based on compressed fingerprints

Periodic Reporting for period 1 - ToothPic (ToothPic, a large-scale camera identification system based on compressed fingerprints)

Reporting period: 2015-09-01 to 2017-02-28

Managing large databases of photos is a growing problem with sizeable social and financial implications. More than 1 billion Facebook users upload in excess of 350 million new pictures per day and have uploaded 250 billion overall, not to mention other social media sites and the pictures stored on smartphones or hard drives that are never shared; these figures are growing steadily. It is therefore not surprising that it is very difficult to track down a wide range of improper uses of photos, such as exploiting them for commercial purposes, infringing copyright, or acquiring photos with unethical or illegal content. Solving this problem requires the ability to manage large-scale photo databases effectively; this capability is highly desirable for several types of users and has many potential applications with substantial economic and legal impact. At the same time, it is a very challenging problem with few feasible technical solutions.

Camera identification is a key technology for solving this problem, because it makes it possible to link a photo to the device that took it. Specifically, every digital imaging sensor leaves its own unique fingerprint in all the pictures it takes, and this fingerprint can be detected and compared against a database of known fingerprints. Despite its numerous appealing applications, however, camera identification has so far been underutilized. State-of-the-art techniques are limited to a small scale because of the complexity of fingerprint matching and the storage required by the fingerprint database. As a consequence, only a few individuals have been able to benefit from this technology, and the applications that are most promising from the marketing and social standpoints have remained infeasible.

Based on camera fingerprint compression and search technology developed during an ERC starting grant, the ToothPic project has developed a proof of concept of a large-scale camera-based image search engine. The system is queried with the fingerprint of the optical sensor of an imaging device, extracted from its pictures, and returns all the pictures in the database that were acquired by that device. The search engine is built on a NoSQL/MapReduce distributed database capable of performing in-memory persistent operations and of efficiently exploiting fast SSD storage. Its distributed computing features allow the storage and computational load to be spread across several workstations, which streamlines deployment of the system in a large data center.
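The query flow described above can be illustrated with a toy sketch. The report does not describe how the fingerprints are compressed or matched, so everything here is an assumption made for illustration: the sketch compresses each real-valued fingerprint to one bit per random projection and matches by Hamming distance against a fixed threshold. The function names (`make_projections`, `compress`, `hamming`, `search`), the dimensions, the noise levels and the threshold are all hypothetical, and the "fingerprints" are synthetic Gaussian vectors rather than real sensor noise.

```python
import random


def make_projections(n_bits, n_dims, rng):
    """Draw one random Gaussian projection vector per output bit."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_dims)] for _ in range(n_bits)]


def compress(fingerprint, projections):
    """Keep only the sign of each random projection (1 bit per projection)."""
    return [
        1 if sum(p * x for p, x in zip(row, fingerprint)) >= 0.0 else 0
        for row in projections
    ]


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def search(query_bits, database, max_distance):
    """Return the ids of all stored images within max_distance of the query."""
    return sorted(
        image_id
        for image_id, bits in database.items()
        if hamming(query_bits, bits) <= max_distance
    )


# Synthetic data: illustrative values only, not taken from the report.
rng = random.Random(0)
N_DIMS, N_BITS = 128, 256
projections = make_projections(N_BITS, N_DIMS, rng)

# Hypothetical camera fingerprint.
camera = [rng.gauss(0.0, 1.0) for _ in range(N_DIMS)]

database = {}
for i in range(3):
    # Images from the query camera: fingerprint plus image-specific noise.
    residual = [k + rng.gauss(0.0, 0.5) for k in camera]
    database[f"same_{i}"] = compress(residual, projections)
for i in range(3):
    # Images from other cameras: unrelated noise.
    residual = [rng.gauss(0.0, 1.0) for _ in range(N_DIMS)]
    database[f"other_{i}"] = compress(residual, projections)

matches = search(compress(camera, projections), database, max_distance=75)
print(matches)
```

Storing bits rather than floating-point values is one way a fingerprint database could be made smaller and faster to scan; whether the ToothPic system works this way is not stated in the report.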

The system has been tested on a large database of pictures, created from about 26 million publicly available images downloaded from Flickr in a fully unsupervised way. To measure the accuracy of the system in the image retrieval task, the database has been augmented with the well-known "Dresden Image database", which includes a set of natural and flatfield pictures, each labelled with the camera that took it. The system performance has been measured using the following metrics: Precision, i.e. the percentage of retrieved photos that were actually taken by the selected camera; Recall, i.e. the percentage of the natural photos of the selected camera that were retrieved by the search engine; Execution time, i.e. the time taken by the system to complete a query. The database of over 26 million images has been queried with each of the 53 cameras of the Dresden database, obtaining a median precision of 98.9%, a median recall of 93.9% and a median execution time of 2 minutes 29 seconds. The precision and recall results agree with those published in the scientific literature, where the same labelled database was used as a reference. Moreover, the performance is close to what could be obtained with uncompressed fingerprints. These results have been validated by a panel of independent experts, chosen from academia and industry for their top-level scientific reputation, as well as by prospective users.
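As a minimal sketch of how the precision and recall defined above can be computed per query and then summarized by their median, consider the following. The function name `precision_recall`, the camera names and the image ids are hypothetical toy values, not data from the project.

```python
from statistics import median


def precision_recall(retrieved, relevant):
    """Precision and recall of one query, given sets of image ids.

    retrieved: ids returned by the search engine for the selected camera.
    relevant:  ids of the labelled photos actually taken by that camera.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy queries for three hypothetical cameras: (retrieved ids, relevant ids).
queries = {
    "camera_A": ({"a1", "a2", "a3", "x1"}, {"a1", "a2", "a3", "a4", "a5"}),
    "camera_B": ({"b1", "b2"}, {"b1", "b2"}),
    "camera_C": ({"c1", "x2"}, {"c1", "c2", "c3", "c4"}),
}

results = {name: precision_recall(r, g) for name, (r, g) in queries.items()}
median_precision = median(p for p, _ in results.values())
median_recall = median(r for _, r in results.values())

print(results["camera_A"])              # (0.75, 0.6)
print(median_precision, median_recall)  # 0.75 0.6
```

Reporting the median over all queried cameras, as the text does, makes the summary less sensitive to a few cameras with unusually poor or unusually good results than a mean would be.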

The system performance is suitable for many applications requiring large-scale analysis of image databases, particularly in image forensics and social media. The system targets the market for forensic investigation software, allowing investigators to perform camera identification on a large scale. Law-enforcement and intelligence agencies, as well as private professionals, need to deal with an enormous number of photos linked to ongoing investigations on a daily basis. The system makes it possible to retrieve all the images acquired by a given device from a collection of millions of pictures. The project outcome is a software package with an intuitive graphical user interface for performing search by camera. The system is currently being tested on real forensic use cases in collaboration with the local police in Torino, Italy. The search engine addresses a niche market characterized by a relatively small number of potential customers with medium or large budgets, such as the forensic labs of law-enforcement or intelligence agencies all over the world. Overall, we estimate about 10,000 potential customers in Europe and 30,000 in the US. This is a conservative estimate that does not account for the worldwide market, in particular the Asian one. In Italy, there are about 1,000 potential targets, including law-enforcement units and private technical consultants for court cases. Other applications are also being considered, e.g. in the field of social media.

Based on the successful results of the proof-of-concept activity, a start-up company named ToothPic was founded in December 2016. The company is currently incubated by I3P, the start-up incubator of Politecnico di Torino, and has received the qualification of "spin-off of Politecnico di Torino". More information about the company and its products and applications can be found at www.toothpic.eu.