CORDIS - EU research results
Content archived on 2024-06-18

Small Area Methods for Poverty and Living Condition Estimates

Final Report Summary - SAMPLE (Small area methods for poverty and living condition estimates)

Executive summary:

It is well known that, in order to ensure a good allocation of public funds and to guarantee the rights of the final users of statistics (government, research institutes and citizens), statistical data on monetary and supplementary poverty indicators have to be timely and effective. Effectiveness of statistical data is a function of their spatial relevance and accuracy.

Nevertheless, official data often refer only to wider domains (e.g. NUTS 2 level) and, in some cases, the finer the required spatial detail (LAU 1 and 2 level), the less accurate the estimate. Local governments need accurate data referring to local areas and / or small domains in order to:
1) ensure monitoring of poverty and inequality;
2) focus on special targets consisting of segments of population at higher risk of poverty (elusive population);
3) appreciate the multidimensional nature of poverty and inequality with attention to the non monetary aspects of it (social exclusion and deprivation);
4) measure the subjective aspects of poverty as they are perceived by local groups and population.

In this context, the aim of the SAMPLE project (please see online) is to identify and develop new indicators and models that will help the understanding of inequality and poverty, with special attention to social exclusion and deprivation. Furthermore, SAMPLE also aims at developing models and implementing procedures for estimating these indicators and their corresponding accuracy measures at the level of non-planned domains.

These goals have been achieved:
i) by combining data from national surveys with data from local administrative databases. In particular, local government agencies (LGAs) often have rich administrative data, which can be used for monitoring actions aiming at tackling situations of social exclusion, vulnerability and deprivation. Such data include information on claimants of unemployment benefit and benefits from other social security programs;
ii) with the involvement of stakeholders and non-governmental organisations (NGOs) which represent people experiencing poverty and act to prevent poverty. Data from the local stakeholders, who are in direct contact with people experiencing poverty, played a crucial role in identifying and developing new poverty indicators.

Project context and objectives:

The Lisbon European Council (March 2000), assisted by the indications of the Nice European Council (December 2000) and of the Gothenburg Council (2001), agreed to put in place a European Union (EU) strategy aiming at making a decisive impact on the eradication of poverty in the EU countries by the year 2010 and also declared 2010 the year of the struggle against poverty. The estimation and dissemination of poverty, inequality and living condition indicators is therefore now a topic of high interest. Such indicators can greatly assist in monitoring living conditions and in guiding the implementation of policies that aim at improving living conditions in the EU Member States. Given the growing social, demographic and economic problems, the research community, policy makers and practitioners place great emphasis on the development of efficient, effective and reliable indicators and on the collection of high quality data on living conditions, not only at national level but also at regional and lower geographical levels.

Within this broader context, the SAMPLE project covers very relevant topics for the statistical knowledge of the European Community members:
i) It utilises the most widely used indicators of poverty, such as head count ratio, average poverty gap, Sen index, and of inequality, such as Gini index, Atkinson index and the Theil's entropy measures. The project pays attention also to the multidimensional nature of the phenomenon and to its non-monetary aspects, as indicated by the fuzzy monetary index and fuzzy supplementary index (Lemmi and Betti, 2006).
ii) It offers knowledge of the local distribution of poverty and inequality as this is measured by new and traditional indicators and by their measures of statistical accuracy.
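As a minimal illustration of one of the inequality indicators listed above, the Gini index can be computed from a vector of incomes with the standard sorted-rank formula. The sketch below is unweighted and assumes non-negative incomes; the function name `gini_index` is illustrative and is not taken from the project's SAS or R programs.

```python
import numpy as np

def gini_index(income):
    """Unweighted Gini index via the sorted-rank formula.

    Assumes non-negative incomes with a positive total. Survey weights,
    which a real survey application would need, are deliberately ignored.
    """
    y = np.sort(np.asarray(income, dtype=float))
    n = y.size
    ranks = np.arange(1, n + 1)  # rank 1 = poorest unit
    return float(2.0 * np.sum(ranks * y) / (n * np.sum(y)) - (n + 1.0) / n)
```

With this formula, perfectly equal incomes give 0, and a population of n units in which a single unit holds all the income gives (n - 1) / n.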

i) The multidimensional nature of wellbeing and poverty is a widely recognised fact, not only by the international scientific community, but also by many official statistical agencies and international institutions. This fact implies a more complete and realistic vision of this phenomenon and also an increased complexity at both the conceptual and analytical levels. Such a complexity determines the need for tools of analysis and the availability of statistical data that have to be also adequate, complete and reliable. Several theoretical contributions emphasise the importance of the multi-dimensionality, such as the social exclusion theory (Lenoir, 1974; Silver, 1994) and the functioning and capability approach (Sen, 1985). Although some differences have been outlined, the two approaches are strictly intertwined. Regarding the methodology utilised in a multivariate setting, there are essentially two relevant types of approaches. The first consists of theoretical constructions that are based on opportune and consistent logic models of reference, such as the approach that derives from the mathematical theory of fuzzy sets. This technique has been effectively developed also at the dynamic level, allowing the measurement of both persistent and transitory poverty (Cheli and Lemmi, 1995; Lemmi and Betti, 2006). The second approach refers to multivariate statistical techniques (discriminant analysis, factor, cluster, multiple correspondences etc.) and aims to aggregate the 'disperse' information contained in a multiplicity of poverty indicators so as to analyse it within a space of reduced dimension. Both types of approaches have a common goal: to derive meaningful indicators and measures from the basic statistical information.
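The idea behind the fuzzy-set approach can be sketched as follows: instead of a poor / non-poor dichotomy, each unit receives a degree of membership in the set of the poor, between 0 and 1, that decreases with its position in the income distribution. The sketch below uses one simple, assumed form of membership function (the share of richer units raised to a power `alpha`); it is a simplification for illustration and not the project's exact definition of the fuzzy monetary indicator. The name `fuzzy_monetary_membership` and the parameter `alpha` are hypothetical.

```python
import numpy as np

def fuzzy_monetary_membership(income, alpha=1.0):
    """Degree of membership in the set of the poor, in [0, 1].

    Assumed illustrative form: (share of other units that are strictly
    richer) ** alpha. The poorest unit gets 1 and the richest gets 0.
    Requires at least two units; survey weights are ignored.
    """
    y = np.asarray(income, dtype=float)
    n = y.size
    # For each unit i, count the units j with y[j] > y[i].
    richer = (y[None, :] > y[:, None]).sum(axis=1)
    share_richer = richer / (n - 1)
    return share_richer ** alpha
```

Larger values of `alpha` concentrate high membership on the poorest units; averaging the memberships gives an overall fuzzy poverty measure for the population.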

ii) It is well known that in order to ensure a good allocation of public funds and to guarantee the rights of final users of the statistics (government, research institutes and citizens), statistical data have to be timely and effective. Effectiveness of statistical data is a function of their spatial relevance and accuracy. Often official data are referred only to wider domains (e.g. NUTS 2 level) and, sometimes, the finer is the required spatial detail, the less accurate is the estimate. Local, national and European governments need accurate data referred to local areas and / or small domains (LAU1 and LAU2 levels) in order to:
1. ensure monitoring of poverty and inequality;
2. focus their policies on segments of population at higher risk of poverty, some of them especially elusive;
3. appreciate the multidimensional nature of poverty and inequality with attention to the non-monetary aspects of it, such as social exclusion, vulnerability and deprivation;
4. measure the subjective aspects of poverty as they are perceived by local groups and populations.

Drawing from these premises, the SAMPLE project's aims have been achieved via the following detailed objectives:
I. identification of new indicators of poverty and social exclusion at LAU1 and LAU2 level;
II. development of new models for estimating the alternative indicators at local level. This will be achieved via the integration of data from surveys of living conditions in the EU (EU-SILC) with administrative data from local government agencies;
III. development of new methods to estimate indicators at local level that will make efficient use of the available multiple data sources;
IV. definition and implementation of practices that will involve stakeholders in the process of producing and interpreting the alternative poverty and social exclusion indicators.

Project results:
On the scientific level, the SAMPLE partners have been working towards three different objectives:

1) The identification and development of new poverty indicators
One of the main targets of the SAMPLE project was to identify and develop new indicators and models for poverty and inequality with attention to social exclusion, vulnerability and deprivation. In order to meet these aims, it was necessary to recognise the mechanisms and the determinants of poverty and inequality and to translate them into effective indicators at regional and at local levels. The first stage of the project was concentrated on both reviewing the thematic scientific literature on poverty indicators and on developing new multidimensional, fuzzy measures of poverty and measures of changes in poverty.

On the one hand, the partners of work package (WP) 1 prepared a manuscript reviewing the scientific literature on poverty indicators. The following topics have been intensively studied and reviewed: poverty indicators in the fuzzy and non-fuzzy approaches, pooled estimates of indicators, and poverty and inequality measures for regional and local governments. The final version of the literature review was completed by the end of February 2009 and was made available to the other partners for cross-reading.

On the other hand, the WP1 partners worked at developing re-sampling methods for variance estimation for multidimensional measures of poverty with particular attention to the Jackknife repeated replication (JRR) and bootstrap methods. The working group was also deeply involved in the development of SAS and R programs for poverty measures and variance estimation to be distributed at EU level. In this context, the main objective was the integration of the classical indicators of poverty with the definition of fuzzy monetary and supplementary indicators.
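The re-sampling idea can be illustrated with a naive bootstrap for the standard error of the head count ratio. This is only a sketch under strong assumptions: simple random sampling, no weights and a fixed poverty line, whereas an EU-SILC application would resample primary sampling units within strata and carry the survey weights. The function names are illustrative and do not come from the project's SAS or R programs.

```python
import numpy as np

def head_count_ratio(income, poverty_line):
    """Share of units whose income falls below the poverty line."""
    income = np.asarray(income, dtype=float)
    return float(np.mean(income < poverty_line))

def bootstrap_se(income, poverty_line, n_replicates=500, seed=0):
    """Naive bootstrap standard error of the head count ratio.

    Assumes simple random sampling and a fixed poverty line; the
    complex design of a real survey is not reproduced here.
    """
    income = np.asarray(income, dtype=float)
    rng = np.random.default_rng(seed)
    n = income.size
    replicates = np.empty(n_replicates)
    for b in range(n_replicates):
        # Draw n units with replacement and recompute the indicator.
        resample = income[rng.integers(0, n, size=n)]
        replicates[b] = head_count_ratio(resample, poverty_line)
    return float(replicates.std(ddof=1))
```

The standard deviation of the replicate estimates serves as the standard error; JRR follows the same logic but forms replicates by dropping one primary sampling unit at a time instead of resampling.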

Another objective of WP1 has been pooled estimates of indicators, i.e. the construction of poverty measures at local level from several waves and the comparison between different EU-SILC waves results with focus on the local longitudinal changes. Methodological aspects, in particular concerning cumulation over space and time from repeated multi-country surveys, have been provided taking illustrations from European social surveys and simple models have been developed to illustrate the effect on variance of pooling over correlated samples.
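The effect of pooling on variance can be illustrated with a simple textbook case, which is an assumption made here for illustration and not necessarily the model used in the project: T wave-specific estimates with a common variance V and a common pairwise correlation ρ, averaged with equal weights.

```latex
\operatorname{Var}\!\left(\frac{1}{T}\sum_{t=1}^{T}\hat{\theta}_t\right)
  = \frac{V}{T}\,\bigl[\,1 + (T-1)\,\rho\,\bigr]
```

With independent waves (ρ = 0) the variance falls to V / T, while with perfectly correlated waves (ρ = 1) pooling brings no gain. Overlapping panel samples lie between these two extremes.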

Results have been provided for Poland, Czech Republic and Italy, providing an improvement in sampling precision.

In more detail, concerning the development of innovative methodology, the WP1 partners focused on developing new multidimensional and fuzzy measures of poverty and re-sampling methods for variance estimation for these measures. They proposed alternative definitions of deprivation indicators for the education, labour market and health condition dimensions. The new multidimensional approach to poverty measurement introduced by CRIDIRE (Integrated Fuzzy and Relative approach - IFR) was expanded with new fuzzy measures of the depth of relative poverty (monetary poverty) and deprivation (non-monetary poverty). Concerning the re-sampling methods for variance estimation, JRR and bootstrap methods were adopted and implemented. SAS and R programs were written to apply the JRR method of variance estimation to poverty indicators, and a SAS program was written to apply the bootstrap method. Original SAS and R programs for calculating the new fuzzy and non-fuzzy poverty measures were also written.

Practical calculations using the developed methodology and programs were carried out to compare the degree of poverty and deprivation using the fuzzy approach. These were applied to EU-SILC country data for 2004, 2005, 2006 and 2007, then to the Polish regions in 2007 and to the Polish and Italian regions in 2008. The analyses were based on EU-SILC micro data (2007 and 2008). The estimation results (fuzzy measures with their standard errors) show that poverty in Poland and Italy has many dimensions and that measuring it solely through monetary variables is highly insufficient.

In addition, the group proposed an alternative way of assessing changes in poverty by applying the developed IFR approach. The proposed methodology ensures that measured changes in poverty reflect not only changes in the distribution of household incomes (in the monetary poverty dimension) and in the distribution of the values of deprivation symptoms (in the non-monetary poverty dimensions), but also changes in the levels of real household income and in the values of deprivation symptoms. Moreover, a re-sampling method for estimating standard errors (the bootstrap method) was developed. Original SAS programs were written for assessing changes in poverty using the new (multidimensional and fuzzy) and classical (unidimensional and non-fuzzy) measures. A SAS program was also written to apply the bootstrap method of standard error estimation to the analysis of changes in poverty. The developed methodology and programs were employed to assess the changes in poverty in Poland for the years 2005-2008.

Concerning pooled estimates of indicators, the work addresses some statistical aspects relating to improving the sampling precision of such indicators for subnational regions in EU countries, in particular through the cumulation of data over rounds of regularly repeated national surveys. EU-SILC data have been used for this purpose. A standard integrated design has been adopted by nearly all EU countries. It involves a rotational panel in which a new sample of households and persons is introduced each year to replace one quarter of the existing sample. Persons enumerated in each new sample are followed-up in the survey for four years. The design yields each year a cross-sectional sample, as well as longitudinal samples of various durations. Two types of measures can be so constructed at the regional level by aggregating information on individual elementary units: average measures such as totals, means, rates and proportions constructed by aggregating or averaging individual values; and distributional measures, such as measures of variation or dispersion among households and persons in the region. Average measures are often more easily constructed or are available from alternative sources. Distributional measures tend to be more complex and are less readily available from sources other than complex surveys; at the same time, such measures are more pertinent to the analysis of poverty and social exclusion. An important point to note is that, more than at the national level, many measures of averages can also serve as indicators of disparity and deprivation when seen in the regional context: the dispersion of regional means is of direct relevance in the identification of geographical disparity. Survey data such as from EU-SILC can be used in different forms or manners to construct regional indicators. Results for Poland, Czech Republic and Italy have been elaborated.
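The rotational design described above can be sketched in a few lines. The sketch is a simplification that assumes one new panel enters each year and stays for four years, ignoring the start-up phase of the survey; the function names are illustrative.

```python
def panels_in_sample(year, first_year, n_rotation_groups=4):
    """Entry years of the panels interviewed in `year`.

    Simplified assumption: one new panel enters every year from
    `first_year` onwards and is followed for `n_rotation_groups` years.
    """
    earliest = max(first_year, year - n_rotation_groups + 1)
    return list(range(earliest, year + 1))

def overlap_share(lag, n_rotation_groups=4):
    """Share of the sample common to two cross-sections `lag` years apart."""
    return max(0.0, 1.0 - lag / n_rotation_groups)
```

Under these assumptions, two consecutive cross-sections share three quarters of their sample, and cross-sections four or more years apart share none. This overlap is the source of the positive correlation between wave-specific estimates that has to be taken into account when cumulating data over time.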

Another crucial part of the SAMPLE project was the active involvement of third sector organisations in the proposal of new indicators.

PP-UROPS acquired some of the non-profit organisations databases and integrated them with administrative archives, retrieving with a special survey (Delphi method) their point of view on poverty and social exclusion and on the usefulness of poverty indicators. All data and indicators were considered referring to a multidimensional approach.

In this context, task 1.4 was dedicated to the feedback from local stakeholders on indicators for local government. The partners involved conducted a survey among local stakeholders in order to learn their point of view on poverty and social exclusion and on the usefulness of poverty indicators. As for the selection of stakeholders, the WP1 partners involved in this task decided to include all institutional and non-institutional organisations carrying out actions against poverty in a multidimensional sense (not only against extreme poverty, but also against social exclusion). Organisations not acting directly, but having a particular viewpoint on this phenomenon, were involved too. At the end of this process, 690 stakeholders were selected.

The instrument used for the survey was an online questionnaire. The respondents were 252 stakeholders, of which 42.2 % were associations, 37.2 % public administrations, 10.3 % parishes and Caritas centres, 9.3 % social cooperatives and 1 % informal groups. The questions proposed in the survey to local stakeholders covered the following fields:
- the information system used;
- their opinion about classic poverty indicators;
- their opinion about poverty situation;
- their proposals about new poverty indicators;
- their opinion about the creation of a poverty observatory.

As regards the poverty indicators, 68.2 % of local stakeholders consider that indicators are very useful for the planning of social policies, and 42.1 % think that they are also very useful for the implementation of their activities. They also suggest new indicators to monitor poverty: debt (which is the most relevant indicator), the quality of food and the quality of housing. They also emphasise the difficulties in paying utility bills due to low incomes and the phenomenon of job insecurity. Another suggested indicator is the capability of access to services, i.e. the knowledge and usability of services directed to citizens in distress.

Other indicators are those created by the Tuscany Regional Network of Social Observatories to build the Health Profile at local level: demographic profile, economic profile, health condition, elderly, families and youngsters, immigration, disability, mental health, dependences. Within the SAMPLE project, we created a link with this regional activity and we selected all the indicators that have a connection with the phenomena of poverty and social exclusion (most of all).

The methodology used to conduct the interviews was the Delphi method, i.e. a systematic, interactive method which relies on a panel of experts. The experts were asked to answer the questionnaires twice. After each round, a facilitator or 'administrator' provided an anonymous summary of the results of the previous round and of the reasons given for the answers. In this way, the experts were encouraged to revise their first answers taking into consideration the replies of other members of their panel.

The Delphi method relies on the belief that during this process the range of the answers decreases and the group converges towards the 'correct' answer.

2) Local agreements for data access and the treatment of administrative data

As for the availability of data, the partners involved in WP1, WP2 and WP3 made a major effort, on the one hand, to establish agreements with national statistical offices and, on the other hand, to activate and consolidate contacts with important local public agencies in order to gain access to their local administrative databases.

In particular, the partners of WP1 have established agreements with their national statistical offices: the Polish GUS gave the project (all consortium members) access to the Polish LFS, Household Budget Survey (HBS) and EU-SILC micro-data. Moreover, the University of Siena gave the project access to EU-SILC micro-data. Finally, in the first and second semester, the Province of Pisa supported Istat in the sampling design and selection of 650 households to be interviewed, in contacts with managers of municipal statistical offices, and in information and dissemination about the EU-SILC oversampling.

As for the partners involved in WP2, considerable effort was devoted to obtaining real data from the statistical offices of Italy and Spain. They mainly obtained survey data from the Spanish and Italian EU-SILC for the years 2004-2006. Other statistical sources were employed to complete the data files of auxiliary variables.

As is well known, in EU countries there is a great lack of statistical information on income and living conditions below the NUTS 2 level. Administrative data sources can be exploited to obtain estimates of living conditions and poverty indicators. However, some of them produce biased estimates because they refer only to people eligible for a service (e.g. medical care, pensions and enrolment in social programs). Therefore, the production of periodical and low-cost estimates based on administrative data requires a first correction step. Once corrected, these estimates are very useful to support local social policies by aiding the monitoring of poverty and social exclusion.

Specifically, the aims of WP3 have been:
1. the definition of living conditions indicators based on administrative data;
2. the collection and exploration of administrative and NGO databases locally available in some focus areas;
3. the definition of standard operating procedures and models that could be applied 'to weight' indicators obtained from administrative databases in order to correct for the potential self-selection bias in administrative data;
4. signing agreements and wide partnerships with local authorities to accomplish these objectives and to set up an observation system supporting local policies.

Concerning the first point:
a) we integrated the SAMPLE project into a process, coordinated by the Tuscany region, of selecting indicators useful for planning. At the end of this process, we arrived at a list of 250 indicators useful to assess the health and social status of the population: demographic profile, health determinants, health status, essential levels of health care, social and health care assistance;
b) we asked our local stakeholders (through a survey) for their judgment on the relevance of poverty indicators, especially the European indicators, and for their proposals for new indicators.

Concerning the second point:
The SAMPLE consortium got access to the following databases concerning the Province of Pisa:
- the entire job centre database (IDOL);
- part of the revenue agency database (SIATEL).

As regards the third sector, the consortium obtained access to the Caritas database (MIROD).

Starting from these databases, a set of indicators was calculated to analyse the living conditions of two population subgroups: the 'socially integrated' group, composed mainly of people with a house and / or a job, and the 'socially marginalised' group, composed mainly of Italian homeless people and migrants. For the first group, a set of indicators concerning income and unemployment was calculated. For the second group, it was possible to depict the profile of the typical user of Caritas services. It is worth stressing the essential contribution of this last source in covering a segment of the population generally not covered by official statistics.

Concerning the third point:
We have built an integrated database for the year 2008 having PI-SILC (the EU-SILC oversampling for the Province of Pisa) as the core dataset and the linked administrative data sets (IDOL and SIATEL) as satellites for in depth analysis on specific aspects (labour, income, taxes). Thus, for a set of individuals (i.e. the PI-SILC units linked to the IDOL and SIATEL databases), information is available from each of the integrated data sources (see deliverable 11). For the year 2009, we have collected and analysed data coming from the PI-SILC and a set of administrative databases not usually used for statistical purposes.

Due to confidentiality issues, the consortium could actually access less than the expected amount of administrative data sets. This changed partially the objectives of the project in two respects. On the one hand, it reduced considerably the quantity of information especially on pensioners. On the other hand, it did not allow for running a proper integration procedure between the PI-SILC and the administrative databases (SIATEL and IDOL).

It is worth reminding that the final outcome of task 3.3 was the building of a model for estimating administrative indicators corrected for self selection bias. The method (see deliverable 9) relied on the possibility of linking the PI-SILC and the administrative databases at individual level.

As expected, the attempt to match the PI-SILC and MIROD data resulted in very few matched units given the different populations covered by the two data sources. As a consequence, the Caritas data were discarded and only the SIATEL and IDOL databases were used in the integration procedures.

Due to confidentiality issues, some demographic variables were removed from PI-SILC, IDOL and MIROD, making the linking procedure more difficult. Thus, the results of the integration process were not completely satisfactory. Partially changing the objective of task 3.3, we decided to focus on the matched dataset in order to:
i) gross up administrative-based indicators using the PI-SILC sample weights;
ii) analyse the administrative-based indicators taking into account the characteristics of the households to which the individuals belong (see deliverable 11 for methods and results).
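Grossing up an indicator with sample weights can be sketched as below. This is a generic weighted estimator, shown under the assumption that each matched unit carries a 0/1 administrative indicator and a survey weight stating how many population units it represents; the function names are illustrative and the actual procedure is the one documented in deliverable 11.

```python
import numpy as np

def grossed_up_total(indicator, weights):
    """Estimated population count of units with the characteristic."""
    indicator = np.asarray(indicator, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * indicator))

def weighted_share(indicator, weights):
    """Estimated population share: weighted total over the sum of weights."""
    weights = np.asarray(weights, dtype=float)
    return grossed_up_total(indicator, weights) / float(np.sum(weights))
```

For example, three matched units with indicator values 1, 0, 1 and weights 100, 200, 300 yield an estimated total of 400 and an estimated share of 400 / 600.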

Concerning the fourth point:
Within the SAMPLE project, PP-UROPS created a first local observatory on poverty, vulnerability and social exclusion. The main achievements are:
- a formal agreement between the Province of Pisa and three local Caritas agencies for a permanent monitoring of poverty and social exclusion, with access to the data of Caritas counselling centres. Caritas is one of the most important international organisations implementing actions to combat poverty;
- the involvement of 252 stakeholders (institutions and third sector organisations) to create a local network of qualified 'antennas' on the territory. The main instruments for their involvement have been the above-mentioned survey using the Delphi method (please see online; see D10) and the participation in the development of the project software.

3) The development of innovative methodology

The contribution of the WP2 partners to the SAMPLE project is framed within the small area estimation of the FGT poverty measures, introduced by Foster, Greer and Thorbecke (1984). This is done using extra information coming from:
a. spatial dependence in the areas: we proposed to borrow strength from space using spatial Fay-Herriot models, which are area level models in which the area effects follow a simultaneously autoregressive (SAR) process.
b. unit level data: this kind of data contains very relevant information that can be used to find good small area estimators of FGT poverty measures through the use of unit level models such as the nested error linear regression model of Battese, Harter and Fuller (1988).
c. spatio-temporal correlation: we proposed to find some area level spatio-temporal model which helped to borrow strength from time and space.
d. time dependence in the areas: we proposed to borrow strength from time using area-level linear mixed models with time-dependent random effects.
e. time dependence in the areas and population groups: we proposed to borrow strength from time within population groups (mainly, sex groups) by using area-level linear mixed models with time-dependent random effects. Here the time dependency may vary from group to group.
f. time dependence in the individuals: we proposed to borrow strength from time using unit-level linear mixed models with time-dependent random effects. Further, we proposed to take advantage of the hierarchical structure of the population by employing unit-level models for territories (units) with a low level of aggregation.
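For reference, the FGT family of measures for an area can be written as the average of ((z - y) / z) ** alpha over the poor units, with non-poor units contributing zero, where y is the welfare variable and z the poverty line. The sketch below is a direct, unweighted computation on fully observed data, not one of the small area estimators developed in WP2; the function name is illustrative.

```python
import numpy as np

def fgt_measure(income, poverty_line, alpha):
    """Unweighted FGT poverty measure of order `alpha`.

    alpha = 0 gives the head count ratio (poverty incidence),
    alpha = 1 the poverty gap, alpha = 2 the poverty severity.
    """
    y = np.asarray(income, dtype=float)
    poor = y < poverty_line
    # Relative shortfall from the line, set to zero for the non-poor.
    gap = np.clip((poverty_line - y) / poverty_line, 0.0, None)
    contributions = np.where(poor, gap ** alpha, 0.0)
    return float(contributions.mean())
```

In small areas the sample is usually too small for this direct computation to be reliable, which is what motivates borrowing strength from space, time and auxiliary data as listed above.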

The objectives of WP2 are:
- development of methodology for the estimation of FGT poverty measures in small areas using the information described in (a)-(g);
- development of methodology for the approximation of the mean squared errors of the obtained small area estimators of FGT poverty measures;
- carrying out simulation studies to analyse the performance of the developed methodology and compare it with existing procedures;
- applying the obtained results to Spanish and Italian data from the Survey on Income and Living conditions to estimate FGT measures in Spanish and / or Italian areas (regions, provinces, municipalities);
- development of free software to apply all the proposed procedures, including complete software documentation (R routines).

Within WP2, the following science and technology (S&T) results have been developed:
- Methodology for small area estimation of FGT poverty measures using spatial Fay-Herriot models, together with methods for approximation of mean squared errors of small area estimators based on parametric and nonparametric bootstrap. Free software with appropriate documentation has been created.
- Methodology for small area estimation of FGT poverty measures using unit level models through the empirical best / Bayes (EB) method with Monte Carlo approximation. A bootstrap procedure has been proposed for the estimation of the mean squared errors of small area estimators based on EB method. A faster version of the method is also developed. Free software with appropriate documentation has been created.
- Methodology for small area estimation of FGT poverty measures using area level spatio-temporal models, together with a parametric bootstrap method for approximation of mean squared errors of small area estimators obtained from that model. Free software with appropriate documentation has been created.
- Methodology for small area estimation of FGT poverty measures using time-dependent linear mixed models, together with methods for approximation of mean squared errors of small area estimators based on parametric and nonparametric bootstrap. Free software with appropriate documentation has been created.
- Methodology for small area estimation of FGT poverty measures using partitioned time-dependent linear mixed models, together with methods for approximation of mean squared errors of small area estimators based on parametric and nonparametric bootstrap. Free software with appropriate documentation has been created.
- Methodology for small area estimation of FGT poverty measures using unit level linear mixed models, together with methods for approximation of mean squared errors of small area estimators based on parametric and nonparametric bootstrap. Free software with appropriate documentation has been created.
- Methodology for small area estimation of the income distribution function and of FGT and / or Fuzzy poverty measures using linear, robust, nonparametric and geographically weighted M-quantile models, together with analytical or bootstrap estimators of mean squared error of small area estimators. Free software with appropriate documentation has been created.
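To give a flavour of the area level approach, the sketch below implements the predictor of the basic (non-spatial) Fay-Herriot model for a known variance of the area effects. It is not the project's spatial or spatio-temporal model and not its software: the SAR structure, the estimation of the variance component and the bootstrap mean squared error are all omitted, and the function and parameter names are assumptions made for illustration.

```python
import numpy as np

def fay_herriot_blup(direct, sampling_var, X, sigma2_u):
    """Predictor under the basic Fay-Herriot model with known sigma2_u.

    direct       : direct survey estimates, one per area
    sampling_var : known sampling variances of the direct estimates
    X            : matrix of area level auxiliary variables
    sigma2_u     : variance of the area random effects (assumed known)
    """
    y = np.asarray(direct, dtype=float)
    psi = np.asarray(sampling_var, dtype=float)
    X = np.asarray(X, dtype=float)
    total_var = sigma2_u + psi
    w = 1.0 / total_var
    # Weighted least squares estimate of the regression coefficients.
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    # Shrinkage factor: close to 1 when the direct estimate is precise.
    gamma = sigma2_u / total_var
    return gamma * y + (1.0 - gamma) * (X @ beta)
```

Each prediction is a weighted average of the direct estimate and the regression-synthetic estimate, leaning on the model for areas whose direct estimates have a large sampling variance.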

Applying small area methodologies to the Italian survey EU-SILC 2007: some results

DSMAE and SOTON / MANCHESTER have developed small area methodologies based mainly on M-quantile models. In particular, they proposed linear, robust, nonparametric and geographically weighted M-quantile regression models. These methodologies have been applied to obtain estimates at provincial and municipal levels (LAU1 - LAU2), using data from the Italian survey EU-SILC 2007. They focused on three Italian regions: Lombardia, in the north of Italy, Toscana, in central Italy, and Campania, in southern Italy. The choice of these three regions, out of the 20 existing regions in Italy, is motivated by the geographical differences characterising the Italian territory. In particular, DSMAE and SOTON / MANCHESTER have investigated the so-called 'north-south' divide characterising the Italian territory, since each of the three regions can be considered as representative of the corresponding geographical area of Italy (northern, central and southern / insular). The main result is the higher incidence of poverty in the provinces of Campania. For this region the estimates of the incidence of poverty, i.e. the percentage of households below the poverty line (EUR 9504, corresponding to 60 % of the national median income), are between 25 % and 44 %. By contrast, in Lombardia and in Toscana the incidence of poverty ranges between 10 % and 19 %, and between 11 % and 26 %, respectively. Moreover, the estimated mean equivalised income also suggests a gap between these three Italian regions and their provinces.
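The poverty line used here is relative: it is set at 60 % of the national median income, and the incidence of poverty is the share of households below it. A minimal unweighted sketch follows, with illustrative function names; the actual estimates use equivalised income, survey weights and the small area models described above.

```python
import numpy as np

def relative_poverty_line(national_income, fraction=0.6):
    """Poverty line as a fraction of the national median income."""
    return fraction * float(np.median(np.asarray(national_income, dtype=float)))

def poverty_incidence(area_income, poverty_line):
    """Share of households in an area with income below the line."""
    area_income = np.asarray(area_income, dtype=float)
    return float(np.mean(area_income < poverty_line))
```

Note that the line is computed once on the national distribution and then applied to each province, which is why areas can be compared on a common footing.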

Applying small area methodologies to the Spanish surveys EU-SILC 2004-6: some results

The small area methodologies developed by UC3M and UMH have been applied to data from the Spanish surveys EU-SILC 2004-6, focusing on the whole set of the Spanish provinces. The aim of the applications is to estimate poverty indicators by province and sex. A wide variety of procedures have been applied, including empirical best prediction (EBP) and empirical best linear unbiased prediction (EBLUP) methods based on time, space and spatio-temporal mixed models. From the socioeconomic point of view, UC3M and UMH have investigated the so-called 'North-East to South-West' division characterising the Spanish territory. From the mapping of poverty indicators at province level, the Spanish regions could be classified into three subsets. The first subset contains the regions with the lowest poverty incidence, like Cataluña, Aragón, Navarra, País Vasco, Cantabria and Baleares. The second subset contains the regions in an intermediate position, like Galicia, La Rioja, Castilla León, Asturias, Comunidad Valenciana and Madrid. Finally, the third subset contains the regions with higher poverty incidence, like Andalucía, Extremadura, Murcia, Castilla La Mancha, Canarias, Ceuta and Melilla.
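To give an idea of how a linear predictor of the EBLUP type combines a direct estimate with a model-based one, the sketch below uses the simplest area-level setting: an intercept-only model without the temporal or spatial effects used in the project. The between-area variance `sigma_u2` is treated as given, whereas an EBLUP would estimate it from the data. All names and numbers are assumptions made for illustration.

```python
def shrinkage_estimates(direct, sampling_var, sigma_u2):
    """Composite estimates under an intercept-only area-level model.

    direct:       direct estimates, one per area
    sampling_var: sampling variances of the direct estimates (same order)
    sigma_u2:     between-area variance, assumed known in this sketch
    Returns (beta_hat, estimates).
    """
    if sigma_u2 < 0:
        raise ValueError("sigma_u2 must be non-negative")
    if len(direct) != len(sampling_var):
        raise ValueError("direct and sampling_var must have the same length")
    if any(sigma_u2 + psi <= 0 for psi in sampling_var):
        raise ValueError("sigma_u2 + sampling variance must be positive")

    # Weighted least squares estimate of the common mean
    weights = [1.0 / (sigma_u2 + psi) for psi in sampling_var]
    beta_hat = sum(w * y for w, y in zip(weights, direct)) / sum(weights)

    estimates = []
    for y, psi in zip(direct, sampling_var):
        # gamma close to 1: precise direct estimate, little shrinkage
        gamma = sigma_u2 / (sigma_u2 + psi)
        estimates.append(gamma * y + (1.0 - gamma) * beta_hat)
    return beta_hat, estimates


# Toy poverty rates for three hypothetical areas (invented values)
beta_hat, smoothed = shrinkage_estimates(
    direct=[0.10, 0.20, 0.30],
    sampling_var=[0.01, 0.01, 0.01],
    sigma_u2=0.01,
)
```

Areas with large sampling variance are pulled more strongly towards the common mean, which is how such predictors borrow strength across areas.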

The 2008 EU-SILC oversampling for the Italian Province of Pisa: some results

DSMAE estimated the mean and median equivalised household income, the rate of households declaring themselves unable to face unexpected financial expenses and other poverty indicators using data coming from the EU-SILC 2008 oversampling for the Province of Pisa. All these estimates are computed at provincial level, that is for the Province of Pisa, and also at a finer geographical level, namely for the five 'health societies' of the Province of Pisa.

The mean equivalised household income in the Province of Pisa was EUR 18 820 in 2007. In terms of income quantiles, 20 % of the households in the Province had an equivalised income under EUR 11 000, 50 % under EUR 16 707 and 80 % under EUR 23 576. As regards some of the main household characteristics, in the Province of Pisa a higher level of education of the head of household corresponds to a higher level of household income, both in terms of mean and percentile estimates. In terms of gender, if the head is male, the estimated mean household income is significantly higher, around EUR 19 500, compared with the nearly EUR 15 500 estimated for households where the head is female.
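The equivalised income figures above rescale household income by household composition. The report does not state which scale was used; the sketch below assumes the modified OECD scale conventionally applied to EU-SILC data (1.0 for the first member aged 14 or over, 0.5 for each further member aged 14 or over, 0.3 for each child under 14). The function name and the example household are hypothetical.

```python
def equivalised_income(household_income, ages):
    """Household income divided by the modified OECD equivalence scale.

    ages: list with the age of every household member.
    Assumes at least one member is aged 14 or over.
    """
    n_14_plus = sum(1 for age in ages if age >= 14)
    n_under_14 = len(ages) - n_14_plus
    if n_14_plus == 0:
        raise ValueError("household needs at least one member aged 14 or over")
    # 1.0 for the first adult, 0.5 per further adult, 0.3 per child
    scale = 1.0 + 0.5 * (n_14_plus - 1) + 0.3 * n_under_14
    return household_income / scale


# Hypothetical household: two adults and two children (scale = 2.1)
example = equivalised_income(42000, [40, 38, 10, 5])
```

Each member of the household is then assigned the same equivalised income, so that households of different size and composition can be compared against a single poverty line.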

The head count ratio (HCR) or at-risk-of-poverty-rate of the Province of Pisa is equal to 15.8 %. The HCRs estimated for four out of the five health societies in the Province are very similar to the provincial estimate, even if characterised by higher estimated standard errors.

The main result emerging from the oversampling direct estimates is that the economic, poverty and social benefits indicators of the five health societies of the Province of Pisa are characterised by a certain variability. Thus, computing the direct estimates only at provincial level would mask the important differences emerging when repeating the analysis at a more detailed geographical level.

Another interesting result is the variability between the different computed indicators. In particular, areas characterised by low mean and median household income estimates can also be characterised by low estimated discomfort indicators. That is, direct income estimates and indicators of perceived economic discomfort can give different indications of the poverty and living conditions in a given area. Thus, it is important to always consider both types of indicators when analysing the areas of interest.

4) A data-centred social application

Within SAMPLE, SR developed web-based software implementing one of the core parts of the project (please refer to D 23 for a detailed description of the software). In fact, the main goal of WP 4 was to develop an application, whose objectives would be:
- to give a real-world application of SAMPLE results;
- to support policy-making and implementation at the local level;
- to contribute to the public awareness by creating information-rich, easy to use, easy-to-understand graphics (unveiling the meaning of data without hiding their complexity);
- to share knowledge and improve local capabilities.

The software is meant to provide local policy makers with robust, disaggregated and up-to-date indicators for knowledge-based planning. This web application (portal) intends to be the entry point for social inclusion activities in the Province of Pisa. With this in mind, we involved all local stakeholders in the application design from the beginning, asking them for suggestions about the desired software layout and functionalities.

At the end of the project, the software will be used by the local observatory on poverty and social exclusion to monitor Laeken indicators and other relevant selected social indicators at province (LAU1) and municipality (LAU2) level. The aim is to improve the knowledge of the phenomena and the local social policies, increasing participation and collaboration between all local stakeholders involved in fighting and preventing social exclusion.

The software allows raw administrative data to be stored and updated, and most indicators to be processed and calculated automatically. The core engine of the software uses some advanced R functions and SAE algorithms, developed in WP 2, which use administrative and survey data to estimate living condition variables (such as income, consumption, savings, poverty, housing, etc.) and improve the quality of the estimates, allowing the calculation of poverty indicators. Correction algorithms, developed in WP 3 and based on EU-SILC samples, will be used to minimise the bias in the estimates from administrative sources.

This software package has three main modules:
- database module;
- computing algorithms;
- graphical reports and indicators.

Data management and reporting

The first module of the application, and the core of the system, is the data and reporting module.

A slick web interface allows administrative users to easily load raw statistical data (EU-SILC, administrative data and others) and to apply R engine calculations by selecting a few input parameters. We implemented the methodologies and the R routines developed in WP 3 for data linkage between EU-SILC oversampling data and administrative databases. We will obtain two kinds of indicators:
1) indicators from EU-SILC survey;
2) unbiased indicators from administrative data.
The calculated data (indicators) will be saved in a MySQL database. This database will then be exposed as a web service with a Google wire protocol compliant data source, available for everyone to query.
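The idea of saving calculated indicators in a database and serving them to clients can be sketched as follows. This is not the SAMPLE portal code: it uses Python's standard `sqlite3` module as a stand-in for MySQL, returns plain JSON rather than implementing the Google wire protocol, and the table layout and function names are assumptions. The two example values are the provincial figures quoted earlier in this report.

```python
import json
import sqlite3


def create_store():
    """Create an in-memory database with a hypothetical indicators table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE indicators ("
        " area TEXT NOT NULL,"
        " indicator TEXT NOT NULL,"
        " value REAL NOT NULL,"
        " source TEXT NOT NULL,"
        " PRIMARY KEY (area, indicator, source))"
    )
    return conn


def save_indicator(conn, area, indicator, value, source):
    """Insert an indicator value, replacing any previous value for the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO indicators VALUES (?, ?, ?, ?)",
        (area, indicator, value, source),
    )
    conn.commit()


def indicators_as_json(conn, area):
    """Return all indicators stored for an area as a JSON string."""
    rows = conn.execute(
        "SELECT indicator, value, source FROM indicators"
        " WHERE area = ? ORDER BY indicator, source",
        (area,),
    ).fetchall()
    return json.dumps(
        [{"indicator": i, "value": v, "source": s} for i, v, s in rows]
    )


conn = create_store()
save_indicator(conn, "Province of Pisa", "head_count_ratio_pct", 15.8, "EU-SILC")
save_indicator(conn, "Province of Pisa", "mean_equivalised_income_eur", 18820, "EU-SILC")
payload = indicators_as_json(conn, "Province of Pisa")
```

Keeping the data source in the key lets survey-based and administrative-based versions of the same indicator sit side by side, mirroring the two kinds of indicators listed above.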

As for data visualisation, we again used Google tools to reach the widest possible audience: the Google visualisation APIs allow interactive charts, graphs or other graphics to be embedded easily in any webpage, even by non-skilled and non-technical users.

Using the available visualisation applications, the general public will be able to create reports and dashboards based on the SAMPLE database.

On the front-end, our portal application itself uses these technologies: we show our data in interactive graphic visualisations made by using the same visualisation APIs.

Social networking tools

The application is based upon a social networking tool, in order to build upon the involvement and collaboration of local stakeholders. That is why we first asked whether they were interested in taking an active role in the building of the web portal.

Our platform of choice is the ELGG framework (please see online), which comes with advanced user management and administration, social networking features, cross-site tagging, a powerful ACL, internationalisation support, and more.

Website registered users will be able to:
- create their own interest groups and blogs inside the portal;
- communicate with micro-blogging tools (à la Twitter);
- share documentation with team (group) mates;
- upload publicly available documentation;
- discuss and comment (almost) every piece of content (news, topics, graphics and so on).

On these foundations, a second version of the platform could allow users to create their own data mash-ups directly inside the application. Even in this first version, people will be able to use third-party visualisation tools like those mentioned above.

News and documentation

The third module (front-end) disseminates and shares the data and the indicators by creating information-rich, easy to use, easy-to-understand graphics (unveiling the meaning of data without hiding their complexity). The front-end is empowered by social tools (blog, feed, personal profiles, etc.) aiming at boosting public awareness and active participation among local stakeholders (please see online).

Apart from data gathered within the SAMPLE project, the application is a repository for collecting various documentation, other statistics and news of stakeholders' interest.

Most information will be open to the general public for viewing and downloading:
- statistics and indicators;
- official programming acts and regulations;
- useful links and resources;
- registered users' 'open' discussions.
It is open to debate whether to let people view everything as guests, or to require a simple registration (only an email address) to enter the portal. This is one of the choices we will make according to our stakeholders' opinion.

Potential impact:

One of the main targets of the project was the development and study of new indicators of poverty and living conditions alongside existing indicators. The newly defined multidimensional and fuzzy measures of poverty and deprivation offer new knowledge to research on the measurement of these phenomena and therefore contribute to changes in the standard ways of measuring poverty and deprivation.

The tools developed by WP 1 partners, i.e. the WSE team, in cooperation with CRIDIRE and CES-GUS, can assist in identifying problematic areas of different poverty and deprivation dimensions and thus in implementing area specific policies. The deliverables of the WSE team and the partners' work provide the regional and local governmental agencies with reliable indications about the actual dimensions of poverty and deprivation and will therefore have an impact on their routine activities.

Concerning variance estimation of poverty indexes, Professor Verma took part in the '2010 International conference on comparative EU statistics on income and living condition' (Warsaw, 25-26 March 2010), where he presented two papers co-authored with Professor Gianni Betti: 'Sampling and non-sampling errors in EU-SILC' and 'Robustness of some EU-SILC based indicators at regional level'.

Results concerning the fuzzy multidimensional approach to poverty have been presented at the conference 'New indicators and models for inequality and poverty with attention to social exclusion, vulnerability and deprivation' in Elche (June 2009), at the NTTS Eurostat Conference 2010, at the Siena SAMPLE Seminar (October 2010) and at the NTTS Eurostat Conference 2011.

Results concerning pooled estimates of indicators have been presented at the following conferences and seminars:
- conference 'Social challenges under demographic change and economic transformation' (Warsaw, December 2008);
- workshop 'Metodi quantitativi per l'analisi delle condizioni di vita: nuove concettualizzazioni, stime statistiche e procedure operative' (Modena, 30 January 2009);
- 45th Scientific Meeting of the Italian Statistical Society (Padua, 16-18 June 2010);
- ITACOSM 2009 Siena Conference;
- Siena SAMPLE Seminar (October 2010);
- NTTS Eurostat Conference 2011.

1. Tomasz Panek presented the multidimensional assessment of poverty in Poland by voivodships in 2007 during a press conference for the mass media (TV, radio, newspapers), organised by the central statistical office (December 2009). The empirical results presented were based on the methodology worked out within the SAMPLE project.
2. WSE participated in the organisation of an open scientific seminar (March 2010), during which the SAMPLE project objectives and the stage reached by the international consortium in carrying out the project were presented. Researchers from universities, staff members of both the central statistical office and regional offices, as well as representatives of the Ministry of Labour and Social Policy, the Chancellery of the Council of Ministers and the Mazovian Centre for Social Policy took part in the seminar. Moreover, the WSE and CES-GUS team (J. Kordos - WSE, A. Zieba-Pietrzak - WSE, R. Wieczorkowski - GUS) presented the paper entitled 'Comparison of two bootstrap methods of standard error estimation for some poverty measures'.
3. Tomasz Panek presented the paper 'Multidimensional analysis of poverty in Poland in the period 2005-2008' at the seminar on poverty in Poland organised by the central statistical office (February 2011). The published text of the presentation was distributed to central governmental bodies, local governmental bodies and research centres.

Finally, from the policymakers' point of view, the deliverables of the project will have an impact on the routine activities of the local governmental agencies and will provide NGOs with fresh and reliable indications about the actual consistency and dimensions of poverty at a useful local level. This impact (on local governmental agencies and NGOs) should be realised in a general manner: in other words, the European perspective used in the project ensures that the day-to-day activities of these agencies will follow a European standard in monitoring poverty instead of local best practices.

Among the outputs of WP 1, closely related to WP 3 (task 3.4), it is crucial to highlight another important SAMPLE output with strong potential political and societal impact: the design of the observation system to monitor poverty and social exclusion. In this context, task 1.4 was dedicated to feedback from local stakeholders on indicators for local government. The partners involved conducted a survey among local stakeholders in order to learn their point of view on poverty and social exclusion and on the usefulness of poverty indicators. The most important results of the survey are:
i) the activation of an important local network of associations, public administrations, parishes, counselling centres that are involved in local actions against poverty; and
ii) the creation of a mailing list of more than 200 stakeholders that want to share a web space, documents and good practices on actions against poverty.
The results of the survey are disseminated by the Province of Pisa website (please see online).

Furthermore, within the SAMPLE project and in collaboration with the Tuscany Region, a set of indicators has been developed which will be used by health societies for their planning of social policies.

Now the next challenges to pursue are the following:
1. To disseminate our experience at regional level. The Tuscany Region has created a coordination group for actions against poverty and the Province of Pisa will use this experience to disseminate the SAMPLE results. One of the aims of this regional group is to open up to European experiences and the exchange of good practices.
2. To animate the local network. The Province of Pisa intends to better involve the network of qualified stakeholders created within the SAMPLE project.
3. Finally, PP-UROPS and SR are willing to disseminate to other local policymakers the data, the indicators and the software to monitor the Laeken indicators and other relevant selected social indicators at province and municipality level.

On this level, the potential impact directly related to WP 3, can be summarised as follows:
- a deeper knowledge of poverty and living conditions in the Province of Pisa at sub provincial level;
- a better knowledge of the accessed administrative databases for statistical research purposes. The acquired experience could be useful for building more harmonised and statistical-oriented databases;
- standardisation of data collection and data dissemination;
- development of new services based on public data and crowdsourcing;
- improvement of policies through a more evidence based social planning;
- enhancing local network capabilities of understanding and acting against social exclusion;
- involvement of citizens in social policy planning;
- citizens - local authorities interactions;
- promotion of the new model of politics: government 2.0;
- more transparency and accountability;
- better use of public resources;
- better policy impact evaluation;
- more trust in public local authorities.

Furthermore, the tools developed in this project will assist in identifying problematic areas and thus in implementing local specific policies. In this respect, the participation of research institutions from new and older EU Member States enabled the use of different data sources for examining the robustness and applicability of the methodology.

In this context, the potential impact of the contribution of WP 2 to society is huge: the measurement of poverty is a compulsory step before trying to administer policies to reduce regional inequality and to enhance the economic convergence of European regions. The developed methodology has been applied to European data, but it could potentially be applied in practically any country in the world with valid statistical data. Moreover, some of the developed methodology is so general that it has potential use in other fields such as health, allowing practically any non-linear measure to be estimated in small areas. The EB method developed within the SAMPLE project has been shown to improve significantly on existing procedures that were well established at the World Bank.

As the distribution of direct estimates of non linear parameters is often close to normality, the introduced area-level models, developed within the SAMPLE project, are widely applicable.

The methodology developed by the consortium could be applied to other areas of social research. The small area estimation techniques that this project has developed can be employed, for example, in estimating educational inequalities at finer levels of geography. This can be achieved by integrating data from sample surveys that collect information on educational outcomes with administrative data such as the Pupil Level Annual School Census (PLASC), which is a census of students in the United Kingdom (UK). Another area of potential application is in estimating health related outcomes, also at finer geographical levels. The project will further advance the understanding of the record linkage processes required for integrating administrative with survey data and of the problems associated with administrative data, such as self-selection bias. The knowledge gained during this project will then be transferable to other areas of social research that require integration of multiple data sources. Last but not least, the tools that this project will develop - software, questionnaires, good practice guides - can then be employed for studying other social phenomena.

The methodology produced by the project has been tested in depth in the Pisa Province (Italy). It can be extended outside this region because:
a) the small area models that will be developed are general, i.e. they are not restricted to data from a specific European country;
b) the flow-chart of the procedure used to (i) find registers and administrative sources on poverty and deprivation, (ii) ensure logical integrity of data, (iii) handle changes of definitions and of the meaning of variables, (iv) integrate the administrative data with the data from EU-SILC and (v) identify and correct any self-selection bias, is general and not restricted to data from a specific European country.
Depending on the availability of the data, the geographical level at which the output will be produced might change from country to country.

There are additional developmental benefits that may result from the project implementation, including spin-off and demonstration effects, which are likely to influence good practice in LGAs.

If we do not obtain the expected survey and administrative data, we could overcome the problem by using confidential data in a safe setting, something in which the team has extensive experience. For example, the partner from the University of Manchester is currently using small area micro-data in a safe setting in the Office for National Statistics Headquarters in London. If small area micro-data cannot be released, we can apply the same procedure by coming to an agreement with the respective national statistical institute, NGOs and LGAs.

WP4 has developed web-based software implementing some of the main deliverables of the project. It will provide local policy makers with robust, disaggregated and up-to-date indicators for knowledge-based planning. This web application (portal) is meant to be the entry point for social inclusion activities in the Province of Pisa. With this in mind, we involved all local stakeholders in the application design from the beginning, asking them for suggestions about the desired software layout and functionalities.

The software will be used by the local observatory on poverty and social exclusion to monitor Laeken indicators and other relevant selected social indicators at province (LAU1) and municipality (LAU2) level. The aim is to improve the knowledge of the phenomena and the local social policies, increasing participation and collaboration between all local stakeholders involved in fighting and preventing social exclusion. The front-end disseminates the data and the indicators through information-rich, easy to use, easy-to-understand graphics (unveiling the meaning of data without hiding their complexity). It is empowered by social tools (blog, feed, personal profiles, etc.) aiming at boosting public awareness and active participation among local stakeholders.

The long-term outcome of WP4 is to improve the understanding of the social exclusion processes at a local level by all people working in the field of social policy making. This web application (portal) will be the entry point for social inclusion activities in the Province of Pisa.

The target of WP6 has been to ensure that all project results are appropriately disseminated, and that they are disseminated within the research network in a meaningful way. This has been pursued first of all by encouraging discussions between public actors, and by presenting results in ways that are useful to policymakers at various levels (that is, national, regional and subregional level).

Concerning our website, at this moment (13 April 2011) we get to 14 159 contacts with a mean.

We disseminated SAMPLE results in all the countries of the SAMPLE partners and also at European level. We also participated in the campaign of the '2010 European year for combating poverty and social exclusion'.

Particularly in Italy, we obtained good results in involving institutions and third sector organisations. Within the SAMPLE project we created a social network of more than 250 stakeholders that are actively involved in policies against poverty and social exclusion.

In Italy, we also disseminated our experience at regional level. The Tuscany Region has created a coordination group for poverty activities and we will use this experience to disseminate our results.

The SAMPLE dissemination phase started during the project kick-off meeting, which took place in Pisa on 15-16 May 2008. On this occasion, the PP-PC organised a local press conference and presented a draft of the SAMPLE communication plan to the consortium general assembly.

The first dissemination activity was the drafting of a dissemination plan (deliverable 3), which summarised the main tools and procedures for maximising the impact of the project's dissemination activities. In particular, it sets out in detail the phases of dissemination, an analysis of the target groups to be involved in project activities, the general rules to be followed by the whole partnership, the communication tools, the expected outputs and the reporting activity.

For this last point, a template for reporting dissemination activities was prepared, to be used by all partners to monitor local events or presentations of the SAMPLE project at other important specific meetings and conferences.

Our main tools have been products like the brochure, the website, journal articles, reports and papers, and activities like project meetings, participation in conferences and academic courses.

A) Brochure
Printing and distribution of brochures in English and Italian.

B) Website
Website management and updating. The website was developed during the life of the project with the contribution of some partners. The language is English. It has an intranet section and an external area. We used the intranet section for communication between partners, file sharing, the internal agenda, etc. Each partner could edit news and events and upload documents, reports and all materials produced in internal meetings. In the external area, we published all the documents about the project, the presentation of the SAMPLE partners, all the deliverables, news, the agenda of SAMPLE meetings, the PowerPoint presentations, etc. We also added many useful European links and many useful links from partners' countries: some partners sent us a list of contacts (reviews, website list) at both institutional and non-institutional level.

C) Project meetings and participation in conferences
In the first semester of the project, we prepared a calendar of internal project meetings and a calendar of external meetings in which partners would participate and present the objectives and first results of the project.

We held six project meetings:
1. kick-off meeting (15-16 May 2008);
2. Brussels meeting (17-18 February 2009);
3. Elche (2-3 July 2009);
4. Warsaw (24 March 2010);
5. Siena SAMPLE meeting (4-6 October 2010);
6. Brussels (24 February 2011).

In some cases, project meetings were inserted or linked to important international meetings:
- 'New techniques and technologies for statistics' (NTTS 2009), Brussels 2009;
- 'SAE2009 conference on small area estimation', Elche 2009;
- 'Forum of inequalities' (Siena);
- 'New techniques and technologies for statistics' (NTTS 2011), Brussels 2011.

D) Press conferences
We held two press conferences: the first was in Italy, at the beginning of the project (kick-off meeting), the second was in Elche (July 2009). This was the occasion to define some dissemination tools, such as the drafting of articles for the press and the website, and of a scheme for the organisational activities and the contents to be produced, to be followed for the subsequent activities in the calendar.

E) Communication in specialised press
Publication of an article for the SCOOP project.

F) The SSH policy brief
We published an SSH policy brief. As the term policy brief implies, this form of publication describes project results (especially political results) for an audience of non-experts. The five parts of our policy brief were:
1) introduction: a presentation of the main aims of the project;
2) scientific evidence and analysis: this section included the most important policy-relevant information our project has produced: empirical data and cogent analysis. In other words: new knowledge;
3) policy implications and recommendations: we underlined here the policy relevance of our findings and we articulated recommendations for policymakers based on our findings;
4) research parameters: here we described the scientific and methodological approach in simple language;
5) project identity: information about SAMPLE partners.
All the partners contributed to the drafting of the policy brief and made a great effort to identify and highlight the policy implications of the SAMPLE project. The policy brief has been distributed at the NTTS conference in Brussels and also by mail.

Prof. Monica Pratesi, Università di Pisa - Dipartimento di Statistica e Matematica Applicata all'Economia (UNIPI-DSMAE)

For more information:

Consortium details and contacts:
- Università di Siena - Centro Interdipartimentale di Ricerca sulla Distribuzione del Reddito (CRIDIRE), Italy
Prof. Achille Lemmi, via mail
- Cathie Marsh Centre for Census and Survey Research (CCSR), University of Manchester, UK / School of Social Sciences, University of Southampton (SOTON), UK
Dr Nikos Tzavidis, via mail
- Departamento de Estadística, Universidad Carlos III de Madrid (UC3M), Spain
Dr Isabel Molina, via mail
- Centro de Investigación Operativa, Universidad Miguel Hernandez de Elche (UMH), Spain
Prof. Domingo Morales, via mail
- Warsaw School of Economics (WSE), Poland
Prof. Tomasz Panek, via mail
- Provincia di Pisa - U.O. Studi e Ricerche - Osservatorio per le Politiche Sociali - Ufficio Politiche Comunitarie (PP), Italy
Dr Paolo Prosperini, via mail
Dr Claudio Rognini, via mail
- Simurg Ricerche (SR), Italy
Dr Moreno Toigo, via mail
- Central Statistical Office (GUS, Główny Urząd Statystyczny), Poland
Dr Mariusz Kraj, via mail