Information management and open source intelligence for anti-fraud policy


Specific Objectives

- To support the European Commission's Work Programmes in the fight against fraud;
- To establish early warning (detecting new fraud trends as early as possible);
- To develop an open source intelligence and analysis capacity, with a view to offering services in this area;
- To support collaboration and information sharing requirements inside networks of contacts;
- To raise awareness and disseminate JRC know-how and lessons learnt.

Planned Deliverables

Specific deliverables to DGs:

- Software facilities for discovering relevant information, filtering, classifying, pre-processing and/or consolidating data (for selected data sources). Software facilities for data analysis and data visualization including navigation. Structures for knowledge management and sharing;
- Methodologies for cleaning and analysing data;
- Project management, project monitoring and quality assessments;
- Technology watch;
- Specific consultancy on an ad-hoc basis;
- Organization of workshops, establishment of networks, training.

As a result of this work, the following will be produced, organized or compiled:

- Relevant data resources worth analysing (including subject-matter based data warehouses);
- Software applications and tools (including copyrights);
- Technical reports and other publications (most reports are expected to be confidential);
- Workshops involving the use of science and technology for/by the anti-fraud community;
- Ad-hoc advice to policy makers or their partners including technology watch;
- Technical project management;
- Quality assessments;
- Presentations in conferences, specialized forums and other publications.

Specific Deliverables for 2002:

1. (Validation of OLAF's external communications) Following satisfactory completion of a feasibility study, we expect work to continue with the detailed analysis phase;

2. (Container traffic monitoring) A pilot exercise will be launched in collaboration with OLAF and selected Member State maritime intelligence agencies. The system's database will be populated with container movements for 5 carrier companies. Steps will also be taken to improve the reliability and maintainability of the Contraffic application. Data analysis related to container traffic monitoring will continue. The objective of this work will be to identify and analyse exceptions to recurrent patterns and bring these to the attention of domain specialists for validation, refinement of the analysis and further investigation;
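
The idea of flagging exceptions to recurrent patterns can be illustrated with a minimal sketch. It is not the Contraffic implementation: the record layout (`container`, `route`), the function name `flag_unusual_routes` and the `min_share` threshold are all assumptions made for illustration, and the sample movements are invented. A movement is flagged when its route is rarely used between the same start and end ports.

```python
from collections import Counter


def flag_unusual_routes(movements, min_share=0.1):
    """Return movements whose exact route accounts for less than
    `min_share` of all movements between the same start and end ports.

    `movements` is a list of dicts with a hypothetical layout:
    {"container": str, "route": [start_port, ..., end_port]}.
    """
    route_counts = Counter()  # how often each full route occurs
    pair_totals = Counter()   # how many movements per (start, end) pair
    for m in movements:
        route = tuple(m["route"])
        route_counts[route] += 1
        pair_totals[(route[0], route[-1])] += 1

    flagged = []
    for m in movements:
        route = tuple(m["route"])
        share = route_counts[route] / pair_totals[(route[0], route[-1])]
        if share < min_share:
            flagged.append(m)
    return flagged


# Invented sample data: 19 movements on a common route, 1 on a rare one.
movements = [
    {"container": f"C{i}", "route": ["Shanghai", "Singapore", "Rotterdam"]}
    for i in range(19)
] + [{"container": "C99", "route": ["Shanghai", "Dubai", "Malta", "Rotterdam"]}]

unusual = flag_unusual_routes(movements)
print([m["container"] for m in unusual])
```

Flagged movements are candidates only. As the text notes, they would be passed to domain specialists for validation rather than treated as findings.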

3. (Data warehousing) A basic sector-independent data warehouse application for import-export data for products in the COMEXT database will be completed. Experimentation with a number of data mining tools at the end-use tools layer of the data warehousing architecture will continue. Candidates for analysis are Net map (of Alta) for network analysis, and data mining software from SAS (Enterprise Miner) and BO (Business Miner). Finally, the current data warehousing architecture will be web-enabled. The aim of this activity is to satisfy ad hoc requests for data mining services from parties involved in the fight against fraud;

4. (Language technologies)
- Cross-lingual keyword assignment and cross-lingual document similarity calculation: The objective is to improve our prototype software for assigning descriptors of the multilingual thesaurus EUROVOC to documents. The overall aim of this activity is to give cross-language access to information found in large multilingual collections of text documents;
- Intelligent Document Retrieval and Analysis (IDoRA): The aim is to further develop the software prototype already developed for the OSILIA project by improving its functionality (including its suitability for new areas of interest), its maintainability, and its usability. An email alert for newly arrived relevant documents will be incorporated;
- Name recognition: Other work in this area may include the automatic recognition of products and product groups according to the Integrated Customs Tariff Code nomenclature (TARIC), as well as work towards the automatic recognition of other named entities (persons, companies and locations);
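
The cross-lingual document similarity mentioned in this item can be sketched as follows. This is one plausible approach, not necessarily the one used in the prototype: each document, whatever its language, is represented by weighted thesaurus descriptors, and two documents are compared with cosine similarity over those language-independent descriptors. The descriptor identifiers (`D001`, ...) and the weights are placeholders invented for illustration, not real EUROVOC codes.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse descriptor vectors,
    each given as a dict mapping descriptor id -> weight."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical descriptor assignments for three documents in different
# languages. Because descriptors are language-independent, the source
# language of each text no longer matters at this stage.
doc_en = {"D001": 0.8, "D002": 0.6}
doc_de = {"D001": 0.6, "D002": 0.8}
doc_fr = {"D003": 1.0}

print(round(cosine_similarity(doc_en, doc_de), 2))
print(round(cosine_similarity(doc_en, doc_fr), 2))
```

In this sketch the English and German documents share both descriptors and score close to 1, while the French document shares none and scores 0.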

5. (Global audit management system for DG REGIO) We shall be supporting DG REGIO's financial management department in deploying an audit management system for all Structural Funds (SYSAUDIT). Our assistance may take the form of guidance and other advice, or of project monitoring and quality evaluation. Our experience in analysing text collections could be used to identify additional functionalities;

6. (Technology assessment) Ad hoc requests for technology assessment and proactive technology watch activities will be continued;

7. (Fraud control workshop) A two-day workshop will be organized in Ispra bringing together scientists and other specialists in fraud control. The aim is to raise awareness and promote prevention.

Summary of 2001 Deliverables: 31/12/2001

1. (Validation of OLAF's external communications) A feasibility study was launched on behalf of OLAF's IT department, aimed at specifying a general data and workflow architecture for validating different types of (structured) incoming communications. A requirements document was produced and is being discussed with the customer;

2. (Container traffic monitoring) Development of the software prototype Contraffic continued. Its database was tested with a few million container movements and a web interface to the system was implemented. The back-end IT infrastructure was considerably reinforced to prepare for hosting a pilot exercise. Exploratory data analysis and knowledge discovery activities were carried out to evaluate the data quality and identify recurrent patterns (e.g. frequent travel routes, including likely stopovers between start and end ports);

3. (Data warehousing) The data warehouse for the textiles sector was web-enabled and the data warehousing infrastructure was significantly strengthened to host production-grade applications;

4. (Language technologies) The software prototype OSILIA, which automatically gathers, analyses and stores textual information from the web, was evaluated for its suitability in other contexts and benefited from some improvements. Collaborations with academic and private organizations active in language technology were established or developed further (e.g. the Universities of Munich and Barcelona, the German company Sail Labs and the Belgian company DMP). Contacts with potential new customers for our work were further developed (Swedish Parliament, European Parliament, a Cabinet, some intelligence agencies);

5. (DG REGIO) Contacts with DG REGIO financial management were developed further with a view to identifying issues and areas where JRC scientific and technical know-how could bring clear benefits;

6. (Technology assessment) Software products were tested for their suitability for projects or following explicit requests (e.g. web technologies like XHTML, XPath and XSLT, or the Copernic 2001 Pro search engine);

7. (Dissemination) Project work was presented in specialized intelligence forums, academic forums and international refereed conferences. The planned joint workshop on fraud control had to be postponed due to the restructuring of the co-sponsor (OLAF).

Summary of the project

This project is primarily intended to improve the capacity to control fraud through the development of methodologies, software, and the delivery of services in the following areas: strategic-level information analysis, early warning, open source intelligence, technology assessment and dissemination of results.
More specifically, JRC will develop an integrated set of facilities to support a range of open source and analysis projects; this will amount to a "laboratory" for data acquisition, preparation and mining. JRC will develop a geographical information system facility for monitoring and early warning purposes; this system will be capable of handling a number of different geo-referenced datasets of particular interest for customs. JRC will assess relevant communication technologies to implement safe, secure, distributed and open communication networks linking partner administrations involved in the fight against fraud. Finally, JRC will disseminate relevant knowledge to authorized partners and will provide training and specific advice upon request.


For the EU budget to continue to finance EU policies, sound financial management and mechanisms to combat fraud are indispensable. At the same time, public administrations and services like insurance, health care, or finance, are all seriously threatened by fraud today.
While, traditionally, law enforcement has concentrated on reactive instruments, like investigating cases, this project puts the emphasis on fraud control in the form of proactive methods whose objective is to lower the overall level of fraud on payments and revenues.

