CORDIS - EU research results

Building a Tool to Evaluate and Improve Health Investments in Screening and Diagnosis of disease

Final Report Summary - HISCREENDIAG (Building a tool to evaluate and improve health investments in screening and diagnosis of disease)

Executive summary:

In the context of the development, implementation and diffusion of Health Technology Assessment (HTA) in the European Union, innovative genetic diagnostic and screening technologies have received little attention so far.

As a contribution to filling this gap, the present study reviews the state of the art of the HTA literature and discusses the actual decision-making processes in the field of genetic technologies. The project then set up a common set of procedures and criteria for the evaluation of health investments related to screening and diagnosis of disease across Europe, with a special focus on gene technologies and the specific HTA issues related to this type of technology.

A toolkit for health technology assessment of genetic diagnostics and screening tests was developed. The toolkit is built around three main elements:

a) Ten criteria for decision-making
b) Quality assessment
c) The process of HTA

Project Context and Objectives:

Technological innovation has yielded truly remarkable advances in healthcare during the last decades and, as a result, it has contributed to improving healthcare delivery and patient outcomes. All this has meant that health systems have developed at different speeds, and with differing degrees of complexity throughout the twentieth century, reflecting the diverse political and social conditions in each country. Notwithstanding their diversity, all systems share a common reason for their existence, namely the improvement of health for their entire populations. To attain this objective a health system undertakes a series of functions, most notably, the financing and delivering of health services (Velasco-Garrido and Busse, 2005).

HTA uses evidence-based medicine techniques and is based on sets of guidelines that describe which methods should be used (see http://www.ispor.org online). In some areas, however, there is still the opportunity for further development of HTA methods, criteria and decision making processes. In the case of genetic screening and diagnostic technologies, and at the start of this project, there was a complete absence of guidance on appropriate HTA methods, criteria or process. Some suggestions, in the form of a handful of pilot studies, have been made but practical and real world applications had not been developed. This assessment gap appeared in one of the technological areas which will require substantial health investments over the next decade. This potential for advancement in technologies and increased expenditure suggest there is a strong need to review health investments for genetic diagnostics and screening technologies to assess their effectiveness and impact and develop a set of procedures and criteria. Such processes and criteria should aim to evaluate how well such investments meet the needs of the population by maximising health gain and using resources effectively.

There is a paucity of information on the quality, effectiveness, efficiency and accessibility of healthcare services among Member States. There is also a substantial lack of good quality data about the impact of health investments and healthcare performance in Europe. Moreover, it is not clear to what extent recent decisions on health investments have been optimal, in terms of maximising health gain for the resources spent. Public funders sometimes evaluate the impact of their health investments, but anecdotal evidence suggests there is a lack of standardised procedures and criteria, particularly in the case of genetic diagnostics and screening technologies (JRC-IPTS, 2006; Annemans et al, 2007).

This tool:

- allows for a better alignment of procedures and criteria between all (old and new) member states;
- serves as a guide for sharing processes and methods, identifying areas for synergism and coordination to avoid duplication of efforts in the EU; and
- improves future investments in terms of increasing evidence on the effectiveness and cost-effectiveness in the screening and diagnosis of diseases.

As explained, this project specifically focuses on health investment decisions for genetic tests for diagnosis and screening. It will be shown that several aspects are valid for HTA in general, but that several specific aspects need to be addressed in the case of genetic tests.
For reasons of uniformity, it is useful to identify a common set of procedures and criteria to develop a tool for decision-making procedures on health investments for genetic tests across European member states. It is also acknowledged that health care decisions on genetic tests can be expected on several decision levels (e.g. country, regional, hospital). For the purpose of this report, genetic test technologies were classified using the following four categories:

1. Genetic tests in population-based screening programs, defined as tests aimed at individuals (children or adults) without clinical signs of disease who belong to a population-defined (sub)group (e.g. age, race/ethnicity, gender); prenatal screening for Down syndrome can serve as an example here;
2. Genetic testing for diagnostic/predictive purposes on an individual basis, defined as diagnostic testing (for confirmation of clinical symptoms) or predictive testing (pre-symptomatic but with an existing risk of developing a condition because of family history) for predisposition to either a common disease or a rare disease (rare disease being defined as occurring in fewer than 1 in 2,500 of the population); screening for hereditary breast cancer may serve as the example here;
3. Genetic testing for carrier status, defined as testing whether a person is a 'carrier' of an inherited (recessive or X-linked) disorder, which does not affect the person but could eventually affect his/her relatives; testing for cystic fibrosis is an example; and
4. Genetic testing to assess the response to specific therapies (pharmacogenetics), defined as the study of inter-individual variations in DNA sequence related to drug response; drug response can be in terms of efficacy/effectiveness and/or safety (e.g. adverse drug reactions); the variation in drug response could be due to inherited mutations in enzymes involved in drug absorption, metabolism, distribution or excretion; a distinction between common and rare disease can be made again; examples may be testing for potential response to Herceptin in breast cancer and for adverse effects in azathioprine therapy.

Project Results:

2.1 A General Perspective on Health Technology Assessment for Genetic Screening and Diagnostics in the European Union: A Comprehensive Review of the Literature (WP1)

European countries have been spending an increasingly large percentage of their Gross Domestic Product on healthcare. Much of this spending can be attributed to the ever-increasing development, use and pricing of new healthcare technologies, estimated to account for almost half of the costs. However, this increased expenditure has not always translated into improved population health outcomes. As healthcare spending is expected to continue to rise, in combination with growing variation in medical practice patterns and poor quality outcomes, there is an increasing demand for more robust and timely information to improve healthcare decision-making. The increase in healthcare expenditures with sometimes no or limited gain in health outcomes indicates a need for evidence-based information on the benefits, costs, and risks associated with different health technologies. This is particularly relevant in the field of (genetic) screening and diagnosis, where health technology assessment (HTA) is still underused. HTA is a tool that aims to provide all stakeholders involved in healthcare organization with evidence to make informed and scientific decisions. If used effectively, HTA can improve the quality of care. The central aim of using HTA is to get the greatest value (i.e. health outcome) for the invested healthcare spending.

Genetic information has substantial clinical and economic implications and such information might be useful to a wide range of stakeholders such as healthcare decision-makers, clinicians, patients and their families. Genetic testing, and associated genetics services, are a complex intervention that can be used to predict the risk of developing a certain condition, facilitate more rapid and accurate diagnosis of genetic conditions, and by leading to earlier or more precise interventions, potentially prevent disease, prolong life, and promote health (Grosse et al, 2008; Payne et al, 2008). Genetic-based technologies offer the prospect of information to guide clinical decision-making but will also have impacts on the use of healthcare resources. Therefore, assessment of the clinical utility of genetic testing requires a process to value and weight different types of outcomes (Grosse and Khoury, 2006; Burke et al, 2002).

2.2 Health Investments in the European Union (WP2)

The next phase of this project aimed to obtain an overview of how Health Investments (HI) in the field of screening and diagnosis are currently being conducted across Europe (WP2) using several approaches:

List of health investments (HI) in the field of screening and diagnosis

A list of genetic tests useful for screening and diagnosis purposes used in the EU27 was completed. However, because of limitations in the available data sources, this project was unable to obtain an actual and global map of the utilisation of genetic tests in each country, and produced an aggregated list instead. Information was gathered from the following genetics websites: http://www.orpha.net, http://www.ncbi.nlm.nih.gov/sites/GeneTest and http://www.eddnal.com, and from the web sites of different laboratories in the EU. Furthermore, published articles were collected together with opinions from experts. More specifically, based on Goddard et al (2003), eight inherited diseases were selected: cystic fibrosis, congenital hypothyroidism, congenital adrenal hyperplasia, Duchenne muscular dystrophy, phenylketonuria (PKU), hypercholesterolemia, hemochromatosis and fragile X syndrome. For these eight inherited diseases there are screening programmes (mainly newborn screening) in place, and the type of genetic test, its application and the countries (laboratories) in which it is performed can be described. The information was obtained from the database at http://www.genetest.org.

Creation of a database of HI responsible bodies

To fulfil this second objective we first performed a literature review and a search to identify the responsible health bodies in each EU27 country, their addresses and some responsible persons we could contact. The database was created using the existing information from the European Commission (Eurostat) as baseline data and was updated with information obtained from the Ministries of Health at Member State level.

We then developed a questionnaire to: (1) identify the health bodies responsible for decisions on the adoption of health investment technologies related to genetic testing across Europe and (2) identify the methods applied to inform decision making.

We identified 120 health bodies, which were sent questionnaires by e-mail in autumn 2009. Two reminders were also sent to potential responders. Pilot questionnaires were sent out to ensure that the design of the questionnaire was adequate and that it was feasible to answer the questions. The response rate was disappointingly low (about 10%). The results are included in Annex A. In summary, we observed high heterogeneity in the way decisions are made in the respondent countries. There are several potential levels of decision making, and some decisions are taken almost without any specific approval or control (for example, a hospital laboratory simply decides to implement a technique whose results are used by a given specialist within that centre). Depending on the structure of the health system, there are special agencies dealing with regulatory issues for genetic testing approval and reimbursement. Another finding is that economic assessments are rarely used. However, there was some evidence that economic assessments were used in population screening.

2.3 Existing Procedures and Criteria of Health Investment Decision-making (WP3)
To ensure that the final recommendations of the project match with the context of current decision-making about investments in health, a detailed overview of procedures about novel diagnostic and screening technologies and payers in the EU member states was produced (WP3). A suitable research method needed to be defined to apply the framework proposed by Rogowski and colleagues (Rogowski, Hartz et al. 2008). The selected method needed to account for the complexity and variability of reimbursement decision-making, different degrees of transparency and potentially biased answers from respondents. As not all existing decision-making processes in all member states could be assessed, suitable case studies for investigation had to be identified in a context where documentation on coverage of genetic tests usually is not available. The main objective of the WP3 was to achieve an overview on existing procedures and criteria for examined case studies and countries.

Activities

Three major steps were conducted for achievement of the WP objectives:

1. Exploration

Data collection: To define a suitable method for analysis and data extraction, semi-structured interviews with third-party payers and related experts on past decision processes in the area of cancer prevention in three countries (Austria, Sweden, Lithuania) were conducted to develop a structured scheme for analysis of the steps of decision-making (Fischer, Leidl et al. 2011). Collecting data in this exploratory survey was very time consuming. Therefore, we developed a web-based structured questionnaire to generate more responses. The survey made use of the developed scheme and was implemented in the software EFS Survey 6.0 (Unipark GmbH, Cologne, Germany). The surveying instrument and questionnaire are described in detail in Annex 3.1;

Selection of respondents: To be able to cover at least 50% of the whole EU population, among these at least 50% of the population in the new member states, health care payers and respective respondents to the email survey were identified by means of an exploratory survey of the literature and the MISSOC information system of the EU member states.

Selection of case studies: As the focus of the whole project was put on genetic testing, we aimed at selecting at least two case studies according to the definition of genetic screening and testing services used in this project. For this purpose, exploratory analyses to identify past decisions in a selection of countries were conducted to ensure a number of decision processes were included in the survey. These results showed that decisions on the level of health care systems could be identified for only very few genetic technologies. The case of newborn screening technologies was selected as focus of the analysis because only here, explicit decisions have been made recently in a number of countries for expansion of programs (Bodamer, Hoffmann et al. 2007; Loeber 2007). A second, complementary survey was conducted to include other areas of genetic testing into the analysis.

2. Email survey

The initial contact with experts involved in decision-making on newborn screening (NBS) was made using the International Society for Neonatal Screening (ISNS). Further contacts were identified via the respondent for each examined country or the list of participants of the ISNS Regional meeting in Prague in April 2009. The survey was conducted between August and December 2009. Unless they opted out, respondents were assisted by phone while answering the questionnaire.

The second survey aimed to cover the breadth of the different types of genetic tests by contacting all third-party payers in major EU countries via email. However, this survey did not succeed in providing a detailed overview of decision processes, as the contacted institutions were reluctant to supply the information requested.

3. Validation

To improve validity and reliability of data, a Delphi procedure was used for those decisions on newborn screening where two respondents per country provided information. Therefore, no further academic subcontractors were needed. To develop key characteristics of procedures and criteria, findings from both surveys were discussed among members of the project consortium.

Steps of coverage decision-making

For the steps that have been defined in the framework from Rogowski and colleagues, the following characteristics have been identified:

Health care payer: Decision processes can only be identified clearly if the health care payer and the respective decision-making committee are known. Genetic testing services were predominantly funded by taxes or statutory health insurance. In about one third of the cases, the decision was made by a separate committee.

Trigger of the process: Decision processes depend on the pathway of clinical care a genetic test is embedded in. For many decisions on predictive and diagnostic genetic testing, no formal decision process appeared to be triggered so that decisions are made on the level of single physicians. These local decisions are difficult to observe and potentially unstructured. For pharmacogenetic tests, decisions are likely not to be specific for the tests but rather are made as a complement to the decision about the use of the drug. Population-based genetic screening was often decided on a national level. Technologies for genetic testing were often selected by explicit criteria.

Methods of assessment: A minority of decisions made use of a formal HTA, predominantly for deciding on NBS technologies. The use of a systematic review was often indicated but could not be validated. The assessment of costs was predominantly based on a cost estimate rather than a formal cost-effectiveness analysis.

Criteria for appraisal: No clear pattern of appraisal aspects relevant for the decision outcome was identified. The effectiveness in terms of the health gain from testing, the severity of the disease and the availability of a treatment for the disease were reported to be most relevant for appraisal. Aspects related to costs were less relevant and lobbying activities were reported to have minor to no relevance.

Reimbursement: The vast majority of the coverage decisions were positive. It is therefore likely that a relevant part of decision making is informal and takes place before a formal and observable decision process has started. Different types of reimbursement were reported, of which reimbursement per test, per insured person or by an annual budget were the most frequent. The amount of reimbursement per single decision option was difficult or impossible to identify.

Rules for service provision: In about one fifth of decisions no further reporting to the payer was required after the decision was made. However, most payers required specific information about screening.

Formal and informal participation: Diverse participation of stakeholders was reported by the respondents in both surveys. Compared with pharmaceuticals, the industry seemed to have been less engaged. Instead, service providers (laboratories) had the biggest influence. Overall, stakeholders often participated by voting on the final outcome of the decision.

Publication of decision and supplementary information: Overall, the transparency of decision processes was limited. Most importantly, the decision outcome was not reported in all decisions (22%). Stakeholder comments, an HTA report and the rationale for the assessment question from scoping were reported in less than 20% of decisions. No decision was identified where all documents were provided according to the specified scheme. Also, information on decisions was difficult to validate via web- or document-searches.

2.4 Methodology (toolkit) to assess the impact of HIs (WP4)

A full toolkit for health technology assessment (HTA) of genetic screening and diagnostic tests was developed (WP4).

HTA uses evidence-based medicine techniques and is based on sets of guidelines for the appropriate methods that should be used (see http://www.ispor.org online) and is often used in the framework of reimbursement decisions and funding recommendations. In theory, application of the toolkit should lead to evidence-based reimbursement decisions (Netherlands, Belgium, Sweden, etcetera) or guidance on use (NICE and SMC in the UK). Within the toolkit, the process provides procedures and a set of requirements to provide suppliers of technologies with well-defined pathways to follow for market access and reimbursement. In many western economies the pathways for manufacturers to get new drugs on the market are well defined and include clear steps and decision criteria. In addition, particularly with centralised decision making, timelines generally exist in the technology assessment process, which means that the evaluation should be finished within an adequate amount of time for both market access and reimbursement decisions. In many western economies such procedures exist for outpatient and inpatient drugs, with processes and criteria and with specific requirements on how to report evidence with regard to some of the decision criteria (for example, guidelines for pharmacoeconomic analysis providing methodological standards in that area). Notably, however, no such procedures and criteria were identified for genetic screening and diagnostics.

Key questions are:

1. Is there direct evidence that the test reduces morbidity and/or mortality, or improves quality of life (QOL)?
2. What is the prevalence of disease in the target group? Can a high-risk group be reliably identified?
3. Can the test accurately detect the target condition? What are the sensitivity and specificity of the test? Is there significant variation between examiners in how the test is performed? In actual testing programs, how much earlier are patients identified and treated?
4. Does treatment reduce the incidence of the intermediate outcome? Does treatment work under ideal, clinical trial conditions? How do the efficacy and effectiveness of treatments compare in community settings?
5. Is the intermediate outcome reliably associated with reduced morbidity and/or mortality?
6. Does treatment improve health outcomes for people diagnosed clinically? How similar are people diagnosed clinically to those diagnosed by screening? Are there reasons to expect people diagnosed by screening to have even better health outcomes than those diagnosed clinically?
7. Does testing result in adverse effects? Is the test acceptable to patients? What are the potential harms, and how often do they occur?
8. Does treatment result in adverse effects?

The research questions are directly related to the choice of outcomes to be measured. With regard to clinical consequences, different levels of outcomes can be considered.

We propose an HTA toolkit built around 3 main elements:
a) Ten criteria for decision-making
b) Quality assessment
c) Process of HTA

a) Criteria for decision making

In our current approach, we define criteria as referring to the various items to be considered within the whole toolkit to assess a new diagnostic/screening test. As said above, in addition, the toolkit comprises a set of processes, which define the steps to follow in an HTA and state who performs these (national, local or meso level decision-makers/advisors) specific procedures. Furthermore, there should also be consideration of methodological standards and, ideally, even guidelines for conducting the component parts within the toolkit.

Previously, the EUnetHTA Core Model for HTA was developed within an EU-project (see http://www.eunethta.eu online). The model employs 10 domains of criteria within a procedure that can be applied for assessing a new screening test or diagnostic (which may be a genetic test, for example) as well. In particular these could be summarized as:

1. Current use of the technology (dissemination so far)
2. Epidemiology of relevant disease(s)
3. The exact technology and its characteristics
4. Safety/toxicity
5. Accuracy
6. Effectiveness/efficacy
7. Costs and economic evaluation
8. Ethical aspects
9. Organisational aspects
10. Psychosocial

1. Current use

It is obviously important to assess the current use of the new technology and the environment in which it is/might be adopted. The environment relates to alternatives such as other tests, for example older or less advanced technologies, or general drug treatments irrespective of screening/test outcomes. Current use may be limited if reimbursement is not (yet) adequately arranged or even market access is still in the process of being acquired.

2. Epidemiology and management of the health condition

Secondly, the epidemiology and burden of the disease for which the technology is intended should be analyzed. Aspects to be considered involve prevalence of disease, symptoms, natural progression, influence of risk factors, mortality and life-years lost, expected potential impact of screening or diagnosis, current treatment practice, clinical guidelines in the disease area, alternative treatment(s) (comparator(s)), positioning and decisions on market access and reimbursement in other countries, and competitor drugs. Two key data items are required in the field of genetic testing: the number of people to be tested, and the expected number among them who have the characteristic of interest. Uncertainty about the former hampers budget impact analyses; uncertainty about the latter affects not only budget impact but also the false negative and false positive rates of the test.
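
To illustrate how these two quantities interact, the following minimal Python sketch computes the expected numbers of true and false test results for a tested group. The function name and all input values (100,000 people tested, a prevalence of 1 or 2 per 1,000, 95% sensitivity, 99% specificity) are hypothetical and are not taken from the project data.

```python
def expected_screening_outcomes(n_tested, prevalence, sensitivity, specificity):
    """Expected counts of true/false positives and negatives in a tested group."""
    affected = n_tested * prevalence      # expected number with the characteristic
    unaffected = n_tested - affected
    true_pos = affected * sensitivity
    false_neg = affected - true_pos
    true_neg = unaffected * specificity
    false_pos = unaffected - true_neg
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
    }


# Hypothetical inputs for illustration only (not taken from the report).
low = expected_screening_outcomes(100_000, 0.001, 0.95, 0.99)
high = expected_screening_outcomes(100_000, 0.002, 0.95, 0.99)

for label, result in (("prevalence 1/1,000", low), ("prevalence 2/1,000", high)):
    print(label, {k: round(v, 1) for k, v in result.items()})
```

Under these assumed inputs, doubling the prevalence doubles the expected number of false negatives while the number of false positives barely changes, which shows why uncertainty about the number of affected people feeds directly into the expected error counts.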

3. The exact technology and its characteristics

At this stage various types of questions have to be asked and answered. These questions relate to the technology itself, but also to material and immaterial requirements for its use. For example, are investments in specific equipment needed, and does the technology require highly skilled staff to operate it?

4. Safety

Safety may refer to 1°. the test itself, 2°. the information resulting from the test, and 3°. the (wrongful) treatment based on the test. For most diagnostics and screening tests, genetic testing technologies for example, the second aspect might be the most important. In such cases a wrong or misleading diagnosis might lead to non-optimal treatment or even mistreatment. A correct diagnosis will enable optimal treatment; however, genetic knowledge may affect other areas of the life of the individual and his or her family. For the third aspect, sensitivity, specificity, and positive and negative predictive values are crucial (see below). This makes the safety issue more complex than with traditional technologies and adds an extra source of uncertainty.

5. Accuracy

The basic question regarding diagnostic accuracy is whether the diagnostic or screening technology correctly distinguishes diseased/at-risk populations from non-diseased/low-risk populations. Often, sensitivity and specificity are used for this purpose; alternatively, predictive values can be estimated and reported.

The following definitions may be used:

- Accuracy = proportion of subjects that the diagnostic/test correctly identifies as positive/negative;
- Sensitivity = probability of a positive diagnostic/test result in persons with disease/risk;
- Specificity = probability of a negative diagnostic/test result in persons without disease/risk;
- False positive rate = probability of a positive diagnostic/test result in persons without disease/risk;
- False negative rate = probability of a negative diagnostic/test result in persons with disease/risk;
- Predictive value = proportion of persons with risk present/absent in those with a positive/negative result.
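
As a minimal sketch of how these definitions translate into calculations, the Python function below derives each measure from the four cells of a 2x2 table (true positives, false positives, false negatives, true negatives). The function name and the example counts are hypothetical and serve only to illustrate the definitions.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of test result versus true status."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
    }


# Hypothetical counts: 100 people with the condition, 900 without.
m = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and the false negative rate sum to one, as do specificity and the false positive rate, whereas the predictive values also depend on how common the condition is in the tested group.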

6. Clinical effectiveness

Rather than evidence on accuracy only, one might argue that evidence of actual changes in medical practice and improvements in patient outcomes is required in diagnose-and-treat or screen-and-treat settings, potentially showing statistically significant improvements in serious morbidity and mortality, or at least in surrogate markers. Trials that combine the diagnostic/test technology with subsequent health-care interventions to demonstrate clinical utility are, however, still scarce.

7. Costs and Economic evaluation

Economic evaluation has become paramount in the last decades to help priority setting in health care; i.e. to spend health-care budgets optimally to ensure the highest health gains for limited resources. It is undertaken to inform health-care decision makers with the explicit goal to enhance rational decision making. Guidelines are used as explicit tools for designing, executing and judging economic evaluations. Health-economic/pharmacoeconomic guidelines exist for various countries all over the world (see http://www.ispor.org online), and also several checklists are available for assessing the quality of such studies. Setting the methodological standards, most guidelines require the transparent presentation of cost-effectiveness planes, cost-effectiveness acceptability curves, value-of-information analysis, uncertainty analysis using both sensitivity and scenario analysis, and explicit probabilistic sensitivity analysis reporting averages and credibility intervals surrounding the cost-effectiveness estimates (Barton et al). These guidelines and quality requirements will be addressed in detail in the next section of this report.
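
The following minimal Python sketch illustrates two of the quantities mentioned above: an incremental cost-effectiveness ratio (ICER) and points on a cost-effectiveness acceptability curve obtained from a simple probabilistic sensitivity analysis. All numbers, the normal distributions, the independence of cost and QALY differences, and the function names are assumptions made purely for illustration; they do not come from the report or from any specific guideline.

```python
import random


def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


def prob_cost_effective(samples, threshold):
    """Share of simulated (delta_cost, delta_qaly) pairs with positive net monetary benefit."""
    favourable = sum(1 for dc, dq in samples if threshold * dq - dc > 0)
    return favourable / len(samples)


# Hypothetical base case: the new test costs 2,000 more and yields 0.10 extra QALYs.
base_case = icer(2000, 0.10)

# Assumed uncertainty: independent normal distributions around the base case.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [(rng.gauss(2000, 500), rng.gauss(0.10, 0.03)) for _ in range(5000)]

# Points on an acceptability curve for several willingness-to-pay thresholds.
ceac = {t: prob_cost_effective(samples, t) for t in (10_000, 20_000, 30_000, 50_000)}

print(f"Base-case ICER: {base_case:.0f} per QALY")
for threshold, probability in ceac.items():
    print(f"threshold {threshold}: P(cost-effective) = {probability:.2f}")
```

Plotting the probabilities against the thresholds gives the acceptability curve; a full analysis would draw the samples from a decision model with distributions fitted to data rather than from the assumed normals used here.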

8. Ethical aspects

Ethical analysis should be considered a separate domain in HTA, and a structured analysis of the situations both with and without the new diagnostic/testing technology should be undertaken. Genetic testing in particular is prone to ethical issues, and ethics is generally considered a major concern here. For example, knowledge of genetic information may on the one hand enhance optimal treatment possibilities; on the other hand, it may reveal future prospects that one may not necessarily want to know yet, and it may have an impact beyond health care (for example, on life insurance).

9. Organisational aspects

Organisational aspects are often neglected in HTA, with the focus on the technology only rather than considering the environment for implementation as well. Notably, neglect of organisational issues may seriously endanger the achievement of optimal health gains while controlling costs. Issues relating to this aspect may be whether we want to implement screening in an opportunistic or a systematic way, whether we think a diagnostic is most suitable for an outpatient or an inpatient setting, etcetera. Essentially, two questions are pertinent here: who do we target, and how do we target them? Considering a screening program, we could think of a specific subgroup registered with a GP and target them either opportunistically, when they visit for whatever reason, or systematically, by sending everyone within a specific age band a letter of invitation for testing. Obviously, the two approaches will also have differing economic impacts (Welte et al).

Drugs are generally developed, registered and used in clear-cut patient populations, and the environments for application are often inherent to the technology considered (intravenous chemotherapy in hospital, general analgesics in the outpatient setting). In diagnosis/screening this may not be the case. Populations considered may even comprise (major parts of) the general population at large or groups with elevated risk a priori. So, the question 'who do we target' is often less straightforward to answer, let alone how to target them.

10. Psychosocial and Legal aspects

This part of the HTA covers the broader life areas of those involved in the application of the new technology. These may be patients, but family, friends, employers and employees also play a role. Impacts on leisure time should also be considered here if they are not taken into account explicitly in the economic evaluation, which is certainly the case if the third-party payer perspective was chosen for the economic analysis (as with NICE in the UK). Notably, one issue in health economics relates to the QALY impacts of technologies beyond the index case. Such QALY impacts could fall on partners of those diseased, parents of sick children, and sexual contacts in the area of STIs.

Broader areas to be considered again comprise ethical issues, distrust, dilemmas, stigmatization or even humiliation, taboos and legal aspects. The latter pose the general constraints for use of the technology. Legal issues should be sorted out a priori, before recommendations on its use and organisation can be considered. Legislation at both the national and the supra-national (for example, EU) level plays a role here. Important in this respect may be the EU website on legislation on medical devices (see http://ec.europa.eu/enterprise/medical_devices online).


b) Quality assessment

Next, we will consider in more detail which tools, methodological standards and guidelines exist for assessing the above sketched recommended criteria and the quality of the available data. In this respect, our focus will be on health and economics; i.e. in particular specific aspects listed above.

Notably, this will, for example, concern CoI and assessment of adequate comparator (criterion 1; current use), budget impact analysis (criterion 2; epidemiology), cost consequences of the technology under consideration (criterion 3; technology and characteristics), cost and QALY impacts of toxicity/safety and efficacy/effectiveness characteristics of the technology (criteria 4 and 6; safety/toxicity and efficacy/effectiveness), health-economic consequences of (in)accuracy of the technology (criterion 5; accuracy) and actual economic evaluation (criterion 7). Again, in specifying criteria into quality assessment, technologies regarding screening tests and diagnostics involve various specificities that were analyzed before (Vegter et al 2008) and are re-visited below. This will clarify in which aspects HTAs of screening tests and diagnostics might differ from HTAs in other areas.

Assessment Tools

Regarding tools to judge and structure the health-economic parts of HTAs, guidelines for health-economic or pharmacoeconomic evaluations can be extremely helpful. As an illustration, we list the Dutch guidelines for economic evaluation as one such set of tools. This is one of many sets of guidelines that are now available all over the world and that can be consulted and compared at http://www.ispor.org. Generally, these sets of guidelines are quite comparable between countries and institutions (for example, the Dutch guidelines resemble the Belgian ones, and those of the National Institute of Clinical Excellence [NICE] resemble those of the Scottish Medicines Consortium). Yet notable differences exist: for example, where the first Dutch guideline prescribes the societal perspective, NICE generally recommends adopting the third-party payer (NHS) perspective. The Dutch guidelines consist of a set of 11 guidelines with several subheadings specified (Hoomans et al 2010).

Specificities for quality requirements when assessing Screening Tests and Diagnostics

We developed a set of guidelines that focus on the particular aspects and characteristics associated with genetic screening and diagnostics, inspired by earlier papers. The result is an overview of guidelines that should always be carefully assessed when economically evaluating pharmacogenetic and pharmacogenomic screening tests.

1. Disease under study

The epidemiological aspects of the disease and complications should be given much more attention than is done on average in a pharmacoeconomic assessment of a new drug. This is because there is double epidemiology: the prevalence of the condition to be tested, and the prevalence and characteristics of those to be tested. If both are not clearly described, and if the known information on Number Needed to Test to identify one person (NNT) is not provided, several flaws may occur in the analysis.
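The 'double epidemiology' point can be illustrated with a small calculation. The sketch below is an assumption-laden illustration, not a method given in the report: it takes the number needed to test as the reciprocal of the probability that one tested person is a correctly identified case, and the prevalence and sensitivity figures are hypothetical.

```python
def number_needed_to_test(prevalence: float, sensitivity: float = 1.0) -> float:
    """Expected number of people tested per true case identified.

    Assumption (not from the report): each tested person is a true case
    with probability `prevalence`, and a true case is detected with
    probability `sensitivity`.
    """
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")
    return 1.0 / (prevalence * sensitivity)


# Hypothetical figures: the same test offered to two different tested populations.
nnt_general = number_needed_to_test(prevalence=0.01, sensitivity=0.95)
nnt_high_risk = number_needed_to_test(prevalence=0.10, sensitivity=0.95)
print(f"general population: {nnt_general:.1f}, high-risk group: {nnt_high_risk:.1f}")
```

The same test yields a very different number needed to test depending on who is tested, which is why both the condition and the tested population need to be described.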

Points of importance and comments:

- Disease under study: Determine prevalence of the underlying condition and of those to be tested.
- Comparators: Describe all types of adverse events associated with the testing strategy; consider more than two comparators in total.
- Association genotype-phenotype: The authors should focus on providing new insights and conflicting results, which occur more in this field.
- Predictive values: Explain how the link between analytical validity and clinical utility was established, based on which sources, and justify the choices made.

2. Description of the comparators.

A test strategy is to be compared to a no test strategy. It should be acknowledged that in a no test strategy, 2 options often occur: treat all and treat no one (best supportive care). This means that the comparison is not head to head but one strategy as compared to a mixture of others. A second point of attention here is side-effects. These are often lacking in recent economic evaluations of drugs, but are crucial here, because of the three types of adverse events mentioned earlier in this report. It should also be acknowledged that the technology used to test the presence of a biomarker is not the same as the biomarker itself. Both need to be explained in detail.

3. Association genotype-phenotype

We created a separate guideline for this issue because conflicting results on the association between genotype and phenotype were identified in the literature. In general, for screening tests and diagnostics, information may not be optimal, and RCTs and meta-analyses may still be lacking. It is therefore essential that all evidence is gathered, including conflicting results, case-control studies and large observational cohort studies. Efforts must be made to perform a meta-analysis or at least an evidence synthesis. Special care should be taken with respect to allele frequencies, since percentages may differ between different (ethnic) populations. In the area of screening tests and diagnostics, inferences from one set of populations to another may be much less straightforward than for drug treatments. That is, a careful evidence synthesis is required, without deleting any (potentially conflicting) information.

4. Predictive values

The major challenge in the area of screening tests and diagnostics – that differs from the area of drug treatments – is how to integrate sensitivity, specificity and predictive values adequately in the health-economic model. For specific studies measuring the above characteristics various caveats regarding its interpretation for use in daily clinical practice exist. This again relates to the difference between efficacy and effectiveness or analytical validity versus clinical validity and clinical utility in this case. Consistently, as mentioned in Chapter I, high analytical validity (ability to detect a clinical marker) reflects good performance in laboratory/research circumstances with high ability to detect, whereas clinical validity (ability to identify a disease/condition/response) reflects high performance in daily use and would correspond with high clinical utility (improvement in clinical management and outcomes). For example, one should consider whether those persons receiving the diagnostic/test are indeed similar to those studied in the clinical studies; i.e. is the clinical trial group representative for populations of relevance, what is the prevalence of the condition in the target population (see 1.)? Is the reference test used to assess the performance of the index technology indeed gold standard and highly likely to correctly identify disease/risk? In case there is no trial based evidence on the clinical utility, how was it then derived?
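How predictive values depend on the prevalence in the target population can be shown with Bayes' rule. The following is a minimal sketch; the function name and the example figures are hypothetical and are not taken from the report.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied in a population with the given prevalence."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")

    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    if true_pos + false_pos == 0.0 or true_neg + false_neg == 0.0:
        raise ValueError("predictive value undefined: no positive or no negative results")

    ppv = true_pos / (true_pos + false_pos)  # P(condition | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no condition | negative test)
    return ppv, npv


# Same hypothetical test, two target populations with different prevalence.
ppv_trial, npv_trial = predictive_values(0.90, 0.90, prevalence=0.30)
ppv_practice, npv_practice = predictive_values(0.90, 0.90, prevalence=0.10)
```

With identical sensitivity and specificity, the positive predictive value falls when the test is applied to a population with lower prevalence than the study population, which is one reason to check whether the trial group is representative.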

5. Decision model

Another difference for tests and diagnostics compared to drug treatments involves the requirement of an explicit analysis of all the advantages of knowing the test/diagnostic outcomes. This analysis should ideally lead to a valid design of a decision-analytic model involving the options 'screen/test' versus 'no screening/no testing'. It is recommended to structure the tree in such a way that the true prevalence of the condition searched for is modelled first, so that the test results can be easily reported in terms of sensitivity and specificity.
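A tree that models true prevalence first and test results second can be sketched as follows. This is an illustrative structure under stated assumptions, not the project's model: the payoffs, costs and test characteristics are hypothetical, and the 'no test' comparator is taken to be 'treat no one'.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoff:
    cost: float   # downstream cost per person (EUR, hypothetical)
    qalys: float  # downstream QALYs per person (hypothetical)


def branch_probabilities(prevalence, sensitivity, specificity):
    """First split on true disease status, then on the test result."""
    return {
        "TP": prevalence * sensitivity,
        "FN": prevalence * (1 - sensitivity),
        "FP": (1 - prevalence) * (1 - specificity),
        "TN": (1 - prevalence) * specificity,
    }


def expected_payoff(prevalence, sensitivity, specificity, test_cost, payoffs):
    """Roll the tree back to an expected cost and expected QALYs per person."""
    probs = branch_probabilities(prevalence, sensitivity, specificity)
    cost = test_cost + sum(p * payoffs[k].cost for k, p in probs.items())
    qalys = sum(p * payoffs[k].qalys for k, p in probs.items())
    return Payoff(cost, qalys)


# Hypothetical payoffs: positives are treated, negatives are not.
PAYOFFS = {
    "TP": Payoff(5000.0, 8.0),   # diseased, treated
    "FN": Payoff(2000.0, 5.0),   # diseased, untreated
    "FP": Payoff(5000.0, 9.5),   # healthy, treated unnecessarily
    "TN": Payoff(0.0, 10.0),     # healthy, untreated
}

test = expected_payoff(0.10, 0.90, 0.95, test_cost=100.0, payoffs=PAYOFFS)
# "No test, treat no one" behaves like a free test that is always negative.
no_test = expected_payoff(0.10, 0.0, 1.0, test_cost=0.0, payoffs=PAYOFFS)
icer = (test.cost - no_test.cost) / (test.qalys - no_test.qalys)
```

Because disease status is the first split, sensitivity and specificity enter the tree directly as conditional branch probabilities, and a 'treat all' comparator could be represented in the same way with sensitivity 1 and specificity 0.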

6. Type of analysis

The type of economic analysis should be chosen based on the aim of the study and the available data. A cost-utility analysis (CUA) is preferred, but is sometimes not practicable. Formal cost-effectiveness analyses (CEAs) present a good second option, especially when using life years gained (LYGs) as the outcome measure. Even when QALYs are used, the analysts should always report LYGs (in absolute terms and weighted for utilities) separately in the analysis if the condition concerned is life threatening. In addition, a full cost-benefit analysis, in which the value of information/knowing is also included, should be acceptable. In such a case, the QALY is directly translated into monetary terms (e.g. 1 QALY = 40,000 EUR) and the value of information is added on top of this.
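The cost-benefit calculation described here can be written out as a short sketch. The 40,000 EUR per QALY figure is the example given in the text; the function name, the incremental QALYs and costs, and the monetary value attached to knowing are hypothetical.

```python
def net_monetary_benefit(
    delta_qalys: float,
    delta_cost: float,
    value_per_qaly: float = 40_000.0,  # example value from the text (EUR per QALY)
    value_of_knowing: float = 0.0,     # hypothetical monetary value of the information itself
) -> float:
    """Monetised health gain plus value of knowing, minus incremental cost (EUR)."""
    return delta_qalys * value_per_qaly + value_of_knowing - delta_cost


# Hypothetical incremental results of testing versus no testing, per person.
without_information_value = net_monetary_benefit(delta_qalys=0.05, delta_cost=1500.0)
with_information_value = net_monetary_benefit(
    delta_qalys=0.05, delta_cost=1500.0, value_of_knowing=200.0
)
```

A positive result indicates that, at the chosen monetary value per QALY, the benefits of the testing strategy exceed its additional costs.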

7. Study perspective

A third-party payer perspective is often acceptable, as adopting a societal perspective may be impossible or impractical in the field of screening and diagnosis. Indeed, given the multiple 'bridges' to be made (between analytical and clinical validity, between the latter and clinical utility, between the latter and QALYs,...) and given the absence of trial data regarding the full picture, additional assumptions on productivity would yet add another element of uncertainty. However, if possible, a societal perspective should also be adopted. Yet, in practice, we see much more hospital perspectives with the analysis of (genetic) tests. This is not surprising given what we stated before regarding local decision making. It should be stressed here that these analyses from a hospital perspective should follow the same guidelines and scrutiny as those from a societal perspective.

8. Analysis of costs.

Costs of screening tests and diagnostics, medications and all relevant events, such as adverse reactions to tests, diagnostics and related drug treatments, should be carefully assessed. In contrast to the cost of a drug, the cost of a test seems to be much more variable between settings and countries. This is explained by the multiple components that make up the final cost. Special focus on transparency in this regard is required (see also further). Different assumptions regarding a new test-treatment combination could be made: what if the test is reimbursed but the treatment is not, what if the reverse holds, and so on. This provides payers with more insight into the economic impact of their coverage decisions.

9. Time horizon

The time horizon of an analysis should be sufficient to capture all the differential costs and effects between the testing and non-testing strategies, including adequate discounting using country-specific rates. Moreover, the time horizon should be flexible so that the reader can better understand both the short-term and long-term impact of knowing the genetic profile.
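Discounting over a flexible time horizon can be sketched as below. The discount rate, the cost stream and the convention that the first year is undiscounted are assumptions made for illustration; country-specific guidelines would supply the actual rate and convention.

```python
def present_value(annual_values, rate):
    """Discount a stream of yearly values; year 0 is left undiscounted (assumed convention)."""
    return sum(v / (1.0 + rate) ** t for t, v in enumerate(annual_values))


def discounted_by_horizon(annual_values, rate, horizons):
    """Present value of the same stream truncated at several time horizons (in years)."""
    return {h: present_value(annual_values[:h], rate) for h in horizons}


# Hypothetical incremental costs of testing: an upfront cost, then yearly savings.
incremental_costs = [500.0] + [-40.0] * 19
by_horizon = discounted_by_horizon(incremental_costs, rate=0.04, horizons=(1, 5, 20))
for years, value in by_horizon.items():
    print(f"{years:>2} years: {value:8.2f} EUR")
```

Reporting the result at several horizons shows how a strategy that looks costly in the short term may look different once later effects are included.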

10. Sensitivity analysis (SA)

In SA, notably some extra parameters enter with inherent uncertainties to be investigated. In particular, uncertainty around test performances should be included in the SA, as well as the epidemiology of the underlying characteristics tested in the population. Also special emphasis should be placed on the uncertainties in bridging to clinical utility. In general, one could state that a robust and systematic approach to quantifying the uncertainties in the model is required that takes account of not only parameter uncertainty but also considers the potential impact of structural and methodological uncertainties. This means that a single probabilistic sensitivity analysis (PSA) is unlikely to represent the true uncertainty in a model based evaluation of a genetic test and it will be necessary to support PSA with scenario and structural sensitivity analyses.
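Combining a probabilistic sensitivity analysis with scenario analysis might look like the sketch below. Everything in it is an assumption made for illustration: the simplified incremental model, the beta distributions for prevalence and test performance, and the parameter values are hypothetical and are not taken from the report.

```python
import random


def incremental_nmb(prev, sens, spec, wtp=40_000.0, test_cost=100.0,
                    qaly_gain_tp=0.5, qaly_loss_fp=0.05, treat_cost=2_000.0):
    """Net monetary benefit of 'test and treat positives' versus 'treat no one'."""
    true_pos = prev * sens
    false_pos = (1 - prev) * (1 - spec)
    delta_qalys = true_pos * qaly_gain_tp - false_pos * qaly_loss_fp
    delta_cost = test_cost + (true_pos + false_pos) * treat_cost
    return delta_qalys * wtp - delta_cost


def run_psa(n_draws=5000, seed=1, **scenario):
    """Share of parameter draws in which testing has a positive net benefit."""
    rng = random.Random(seed)  # fixed seed so scenarios share the same draws
    favourable = 0
    for _ in range(n_draws):
        prev = rng.betavariate(10, 90)   # epidemiology of the tested population
        sens = rng.betavariate(90, 10)   # test performance
        spec = rng.betavariate(95, 5)
        if incremental_nmb(prev, sens, spec, **scenario) > 0:
            favourable += 1
    return favourable / n_draws


# Parameter uncertainty under the base-case structure.
base_case = run_psa()
# Scenario analysis: a weaker bridge from test result to clinical utility.
low_utility = run_psa(qaly_gain_tp=0.1)
print(f"P(testing worthwhile): base {base_case:.2f}, low-utility scenario {low_utility:.2f}")
```

Re-running the same probabilistic analysis under alternative structural assumptions is one simple way to show that a single PSA does not capture all of the uncertainty.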

c) Process

Procedures taken to arrive at a decision on health care investments comprise a sequence of steps. This general approach involves steps that apply to all types of genetic tests. The procedure can be initiated by the manufacturer (e.g. for commercial tests), by clinicians (e.g. lab tests for in-hospital use covered from the hospital budget) or by HTA bodies. During the process, initiators (e.g. manufacturers) should be involved in all stages (hearings, presentations etc.), but the HTA should primarily be conducted by one single party, whether this is a manufacturer-independent body (e.g. NICE) or a health care payer.

If all these requirements for the genetic test under consideration are found to be positive, the next step is to evaluate the clinical utility of the test. Relevant questions here are:

1. Does the genetic test make a difference in clinical management and outcomes?
2. Does the genetic test add to optimal health care (e.g. dependent on epidemiology of the relevant disease and standard care)?

If both these questions are answered with 'YES', it is important to identify the exact need for the test before starting to go through the proposed algorithm. Here, the following options 'reflecting' the need of having access to the specific test are possible:

1. High need; extremely high risk of high disease burden or death.
2. In between low/high need; lower risk of death, risk of major permanent dysfunction or major symptomatic condition.
3. Low need; minor symptomatic condition and factors indicating risk of future problems.

Clarification of purpose: Genetic testing services have many potential purposes, target populations, and roles in the clinical context. One test may do many things and may do them exceptionally well or poorly, depending on who exactly is served and what exactly is done before, during, and consequent to the test. However, if the definition of the testing service and its purpose cannot be described clearly, evaluation cannot proceed and coverage cannot be justified.

Research protocols: Research evidence is central to evaluation and coverage decision-making, but assessment information may be uncertain or incomplete. It has been argued that anyone receiving an unproven health service is de facto a subject in an experiment and deserves the protection of research protocols requiring ethical treatment of human subjects (Spodick, 1984). Similar options might be considered for genetic tests of compelling worth and promising, but yet unproved effectiveness, or uncertain effects.

Periodic re-evaluation: New and evolving technologies are moving targets for assessment. Evidence on effectiveness, cost-effectiveness, and other criteria obsolesces quickly with changes in technology, target populations, disease interventions, and practice styles. Because TA takes time, evaluations may be outdated as they become available. For these reasons, genetic testing services may require periodic re-evaluation, and evaluations should model the effects of potential developments (for instance, changes in effectiveness, cost, target population, clinical management) subsequent to the evaluation and coverage decision.

Interventions into personal and family impacts: Most genetic testing services include auxiliary interventions to control undesirable side effects of testing on individuals and their families. Evaluators should identify such side effects, and policy makers could condition coverage on servicing to ameliorate negative effects outside the scope of the testing service's main purpose. The majority of health systems now strive to integrate multidisciplinary health services around patient needs, and genetic testing should follow this trend.

Interventions into societal impacts: Genetic tests have potential social effects beyond impacts on the tested individual and relatives. Social effects should be studied and understood as diligently as individual and family impacts. Incremental coverage decisions contribute to the overall social impact of genetic testing, and evaluators should keep abreast of the evolving scholarship in the legal and social sciences. Innovative policy making models (for instance, a national and widely representative commission devoted to the assessment, control, and provision of genetic technology), may help orient 'one-off' decisions toward a collective vision (Hoedemaekers, 2000; Government of Ontario, 2002).

Clinical practice protocols: Clinical practice protocols include published practice guidelines as well as providers' unpublished, institutional protocols. Clinical guidelines may be particularly appropriate where genetic testing practices vary substantially across providers, and these variations alter the service's value according to any of the six evaluative criteria described in domain one (Giacomini et al, 2000).

Ethics protocols: Ethics protocols intervene into some of genetic testing services' potentially negative additional effects. Conventional clinical ethics apply to genetic testing, but in many cases need interpretation and adaptation. Therefore, advisory bodies have highlighted a rigorous consent process as a criterion for genetic test coverage. As such, decision-makers are challenged to develop new ethical codes addressing consent and privacy related to genetic testing services (Goel and Group, 2001).

Regulation: Regulation serves two objectives. Firstly, it can reinforce selected policy tools (clinical guidelines, adjunct interventions, mandatory research protocols, and protections) by legally requiring them. Secondly, regulations could ban genetic testing services whose impacts are harmful or unacceptable (Caulfield et al, 2001; Government of Ontario, 2002).

Priority setting: Priority setting involves weighing

Coverage decisions about genetic tests frequently appear to be made outside of the scope of national decision making bodies, presumably on a local decision making level. Characteristics of the decision processes differed strongly by the type of care genetic testing is embedded in (eg, public health programs, drug treatment, care by genetics specialists) and no 'genetics specific' coverage decision process could be identified. Whether and where this indicates the need for a more 'genetics-specific' decision support tool or, whether and where, this indicates that the existing tools for coverage decision making are applicable also in the field of genetics needed to be addressed by the other work packages.

The toolkit was presented to a group of European HTA experts in order to try to reach a consensus.

There was complete agreement that the availability of evidence should be considered before embarking on a HTA of a genetic test. Some participants suggested that a minimum requirement for the level of evidence available should be used. Two participants clarified that the level of evidence available may not preclude HTA if the assessment includes evaluation of the expected value of perfect information with a focus on specific parameters that need further evidence. In this situation it may be feasible to use a HTA to inform a 'coverage with evidence development' decision.
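The expected value of perfect information mentioned by the participants can be illustrated with a minimal sketch. It assumes that a probabilistic analysis has already produced net monetary benefit values per strategy for each parameter draw; the function name and the two example draws are hypothetical.

```python
def expected_value_of_perfect_information(nmb_draws):
    """EVPI per person from simulated net monetary benefits.

    `nmb_draws` is a list with one dict per parameter draw, mapping each
    strategy name to its net monetary benefit in that draw.
    """
    if not nmb_draws:
        raise ValueError("at least one draw is required")
    strategies = list(nmb_draws[0])
    n = len(nmb_draws)
    # Best strategy chosen once, on average results (current information).
    mean_nmb = {s: sum(draw[s] for draw in nmb_draws) / n for s in strategies}
    value_current_information = max(mean_nmb.values())
    # Best strategy chosen separately in every draw (perfect information).
    value_perfect_information = sum(max(draw.values()) for draw in nmb_draws) / n
    return value_perfect_information - value_current_information


# Two hypothetical draws: testing is worthwhile in one and harmful in the other.
draws = [
    {"test": 100.0, "no_test": 0.0},
    {"test": -50.0, "no_test": 0.0},
]
evpi = expected_value_of_perfect_information(draws)
```

A large value relative to the cost of further research would support collecting more evidence before, or alongside, a coverage decision.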

The participants agreed that a set of explicit criteria was needed to prioritise the HTA of genetic tests. A number of criteria were suggested as useful when deciding how to prioritise genetic tests for HTA, but some participants observed that it may not yet be feasible to select which criteria should be used, and that further work may be needed as the HTA of genetic testing evolves with time and experience. The list of prioritisation criteria included:

- potential for improved patient benefits, including health gains but also, importantly for some genetic tests, non-health gains such as providing information for reproductive decision making;
- cost and potential budget impact of the technology;
- extent of evidence available for making a decision, or, making recommendations for future research required;
- level of uncertainty in the data;
- risk of harm from not conducting a HTA;
- potential for the HTA to make an impact, such as meeting 'unmet need' and adding value to the healthcare system and patients;
- availability of other similar tests;
- and specifically for pharmacogenetics, the cost of the drug to be prescribed or the severity of the potential side effects being avoided by targeting prescription.

2.5 Towards consensus on the HTA of genetic tests for screening and diagnosis of disease (WP5)

After developing the toolkit, the next step was to explore the extent of consensus about it. The primary objective of work package 5 (WP5) was to explore the extent of consensus amongst expert health economists working in European Member States about the theoretical and practical relevance of the health investment tool developed in WP4. The health investment tool comprises three key components:

(a) The Criteria. The criteria to use in a health technology assessment (HTA) of a genetic screening or diagnostic test to inform whether it is appropriate to invest in the technology and fund the technology using part of the budget allocated to the healthcare system. Ten criteria were selected using the findings from the EUnetHTA Core Model for HTA, which was an EU-funded project.

(b) Quality assessment. Each of the ten criteria in a HTA should ideally be quality assessed using appropriate methodological standards to critically appraise the quality and reporting of the evidence. The focus to date has been on developing methodological standards for conducting and reporting economic evaluations. However, it is potentially feasible to generate methodological standards for each aspect of a HTA.

(c) The process of health technology assessment. Health technology assessment involves a series of steps, and decisions to be made, that together form the process of HTA. Examples of the process of HTA include: timing of the assessment (pre or post marketing); topic selection; defining the research questions; evidence collection and synthesis; preparation of the report; dissemination of the report.

Two activities were performed in order to fulfil the objective for WP5:

i) Exploring views on the health technology assessment of genetic tests for screening and diagnosis of disease. An e-Delphi survey
WP4 identified ten criteria that have been previously used in HTA of technologies. There is no evidence to support whether these ten criteria are also directly relevant to conducting HTA of genetic technologies to inform reimbursement decisions. This study aimed to understand health economists' views about the criteria and quality assessment guidelines we should use to conduct (national) HTA of genetic tests. A two-round Delphi process used four classifications of genetic test as examples to understand the relevance of the health investment tool to different types of genetic tests.

A total of 90 respondents logged onto round 1 of the survey. Of these, 67 went on to complete the survey, and 26 of those 67 (39%) agreed to participate in a second survey. In round 2, all 26 respondents who had agreed to participate logged onto the survey, and 25 of them completed it in full. In both rounds of the Delphi survey, the majority of respondents were in the 23 to 34 and 35 to 44 year age ranges and stated academic research and teaching as their primary role. The vast majority of respondents reported the UK as their primary country of employment, reflecting the high number of UK health economists in the database of health economists.

ii) The Health Technology Assessment of genetic tests and other technologies: identifying similarities and differences. A Workshop

A workshop was arranged to formulate a consensus view on whether the HTA of genetic testing and screening should be any different to the HTA of other health technologies. The workshop invited expert participants to generate a consensus view on the HTA of genetic testing. The key question the workshop participants were asked to focus on was: Is the HTA of genetic tests any different to the HTA of other health technologies? The workshop was structured around four exercises:

Exercise 1: Criteria and quality assessment guidelines
Exercise 2: The Relevance of Criteria for Different Levels of Decision Making
Exercise 3: The process of HTA for genetic tests
Exercise 4: Towards a consensus view

The twelve participants were divided into five groups, each with two or three members. KP facilitated the exercises with IJ. The workshop was audio recorded, with the permission of the participants, and a note-taker also recorded the key issues and questions raised throughout the day. The participants completed each of the four exercises in turn, spending about 45 minutes on each exercise. The workshop used five examples of genetic tests as case studies, and each of the five groups was allocated one type of genetic test to focus on for each of the first three exercises. The participants were asked to consider all five examples of genetic tests for the fourth exercise.

Before starting exercise 1 the participants were asked to consider the question:
Is the HTA of genetic tests any different to the HTA of other health technologies? This question generated an interesting discussion about the nature of the 'other technologies'. Before the workshop commenced, one or two of the participants indicated that the HTA of genetic testing and screening is different to the HTA of medicines but may have some close analogies, and shared challenges, with the HTA of diagnostic interventions. In addition, participants pointed out that the concept of 'how different' was important. There are likely to be subtle differences between any HTA of any technology, but it is the degree of difference that has an impact on the process, design and conduct of the HTA which needs to be clarified.

2.6 Assessment of proposal in the EU context (WP6)

Building on the achievements and findings of the previous WPs, WP6 considers the policy implications of assessing health innovations in the field of genetic screening and diagnosis from an EU perspective.

This section has been organised in two parts.

1. Review of the policy debate on genetic diagnostics in Europe.

Using a translational research framework, we discuss the state-of-the art and the barriers that have limited so far the adoption and integration of genetic diagnostics in Europe.

The exploration of the policy framework includes the identification and analysis of the barriers, problems or issues raised in literature and by experts that deal with the introduction of genomics-based products in clinical practice.

In this part of the WP6 study we also described the European institutional positions, with a specific focus on the debate on the Directive 98/79/EC on in vitro diagnostic devices which establishes the framework for diagnostics regulation in the EU. Collection of data of analytical and clinical validity appears to be negatively affected by the lack of systematic pre-market review of genetic diagnostic products. The IVD Directive does not require most of these tests to undergo review, because their great majority is classified as low risk. The exception is a small number of blood screening tests considered as high risk and other tests considered as moderate risk.

2. Assessment of the proposed toolkit

Based on the information provided by previous WPs, and in the light of the policy debate highlighted as part of our WP research, we evaluated the methodology developed in the project.

Three main areas have been considered:

- Improvement of current decision-making processes
- Benefits for patients and population
- Implementation at national and international level

On the first point, several improvements have been identified: harmonization and standardization of assessment methodologies (to the extent possible), transparency in the evaluation, and, more generally, the possibility of introducing regulation in a market which is basically made up of home-brew, unregulated tests and technologies (especially in the EU).

The adoption of a standardized and comprehensive assessment methodology would create incentives for private investment in genetic diagnostics, and this would, in turn, boost the emergence of new technologies. The inclusion of several criteria (clinical utility, accuracy, effectiveness) is particularly important, as it captures the specificities of diagnostic technologies and is essential to account for the quality of the test, one of the most serious issues in the sector.

The assessment would open up the possibility of including genetic diagnostics in the public health system, with important benefits in terms of equity of access to these technologies for patients and the population.

Potential Impact:
European countries have been spending an increasing percentage of their Gross Domestic Product on healthcare. Much of this spending can be attributed to the ever-increasing development, use and pricing of new healthcare technologies, estimated to account for almost half of the costs; however, it has not always translated into correspondingly improved population health outcomes. As healthcare spending is estimated to continue to rise, in combination with growing variation in medical practice patterns and poor quality outcomes, there is an increasing demand for better information to improve healthcare decision-making. This increase in healthcare expenditure, with sometimes no or limited gain in health outcomes, indicates a need for evidence-based information on the benefits, costs, and risks associated with different health technologies, in particular in the field of (genetic) screening and diagnosis, where this kind of assessment is still underused. In this regard, HTA is a tool that may provide all stakeholders involved in healthcare organization with the information to make informed, evidence-based decisions. If used effectively, these efforts can improve the quality of care and contain healthcare costs. The central aim of these efforts is to get the greatest value (i.e. health outcome) for the healthcare spending invested.

Without high-quality evidence, the uptake and diffusion of technologies is likely to be influenced by a range of social, financial and institutional factors. This may not produce optimum health outcomes or efficient use of limited resources. HTA is a significant aid to evidence-based decision-making, but it must address the challenges of delivering timely and relevant information that reflects adequately the dynamics of technology and the healthcare system in order to provide the information needed for effective decision-making and priority-setting (Sorenson et al, 2008).
