Revision as of 17:15, 18 January 2015

The purpose of this page is to create a reference to describe our methodology for assigning levels of evidence to regimens.

Important note: Our intent is not to provide clinical decision support. Rather, our goal is to faithfully reproduce the findings of clinical trials. Efficacy and toxicity information, in particular, is sometimes presented by authors in a confusing or ambiguous manner. As such, we try to illustrate ambiguities when they occur, and we take no responsibility for your decision to choose a particular treatment regimen. Please read our disclaimer for further information.

A bit of background: We have taken a simplified approach to information visualization, based on a three-color "traffic signal" metric. In order to account for color-blindness, text is also included within each colored box. The colors we use are:

Green box text Yellow box text Red box text

See the sections below for a discussion of the various metrics we use.

Evidence

Generally, a regimen should be evaluated in a randomized fashion with an adequate patient sample to be considered a "green" regimen. We have defined adequate as 20 or more patients per arm. Randomized studies with fewer than 20 patients per arm, and non-randomized studies with 20 or more patients enrolled, are considered to be "yellow" regimens. Finally, case reports, retrospective series, and non-randomized studies with fewer than 20 patients enrolled are considered to be "red" regimens. Of course, there are finer gradations of the quality of evidence, so this simplified scheme should be taken with a grain of salt.

Evidence is thus reported using one of the three labels:

Strong evidence

Moderate evidence

Weak evidence
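
One reading of the thresholds above can be sketched in Python. This is an illustration only, not code used by this site: the names `Design`, `ADEQUATE`, and `evidence_label` are hypothetical, and the sketch assumes that a non-randomized study with 20 or more patients enrolled is "yellow".

```python
from enum import Enum


class Design(Enum):
    """Hypothetical study-design categories, named for this sketch only."""
    RANDOMIZED = "randomized"
    NON_RANDOMIZED = "non-randomized"
    RETROSPECTIVE = "retrospective series"
    CASE_REPORT = "case report"


ADEQUATE = 20  # "adequate" sample size as defined above


def evidence_label(design: Design, patients: int) -> str:
    """Return 'green', 'yellow', or 'red'.

    `patients` is the number of patients per arm for randomized trials
    and the number enrolled for every other design.
    """
    if design in (Design.CASE_REPORT, Design.RETROSPECTIVE):
        return "red"
    if design is Design.RANDOMIZED:
        return "green" if patients >= ADEQUATE else "yellow"
    # Non-randomized: assumed yellow at 20 or more enrolled, red below that.
    return "yellow" if patients >= ADEQUATE else "red"
```

For example, `evidence_label(Design.RANDOMIZED, 35)` gives `"green"`, while `evidence_label(Design.NON_RANDOMIZED, 12)` gives `"red"`.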

Frequently asked questions

Q: If a randomized trial has more than two arms, will they all be labeled the same?
A: No, it depends on how many patients are in each arm of the trial. For arms that have 20 or more patients, the label is green. For arms with fewer than 20, the label is yellow.
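
A minimal sketch of per-arm labeling for a randomized trial, using the 20-or-more threshold from the definition of "adequate" in the Evidence section. The function name `label_arms` and the trial data are hypothetical.

```python
ADEQUATE = 20  # patients per arm, as defined in the Evidence section


def label_arms(arm_sizes: dict[str, int]) -> dict[str, str]:
    """Label each arm of a randomized trial by its own sample size."""
    return {
        arm: "green" if n >= ADEQUATE else "yellow"
        for arm, n in arm_sizes.items()
    }


# Hypothetical three-arm trial; arm names and sizes are made up.
print(label_arms({"A": 45, "B": 20, "C": 12}))
# prints {'A': 'green', 'B': 'green', 'C': 'yellow'}
```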

Efficacy

Defined generally, efficacy is the presence or absence of a positive effect on the study population. Efficacy can be reported using anything from a weak surrogate measure (e.g. response rate) to a direct measure such as overall survival. Currently, we label efficacy not by the quality of the outcome measure but by comparative efficacy. As such, we currently report efficacy only for randomized trials. Many non-randomized trials report efficacy compared to historical controls; however, in the rapidly developing field of oncology, this approach is rife with bias, so we do not report on comparisons to historical controls at this time.

Efficacy is thus reported using one of the three labels:

Increased efficacy

Equivalent efficacy

Decreased efficacy

Frequently asked questions

Q: How do we choose to label efficacy when multiple outcomes are reported?
A: Often, a trial will report on multiple outcomes, such as overall response rate, progression-free survival, and overall survival. In this case, we generally look to the primary endpoint, as defined in the published methods. However, if a secondary endpoint shows differential efficacy and is less of a surrogate than the primary endpoint, we will label by that secondary endpoint.

Q: How do you distinguish between a failed trial and a formal equivalence or non-inferiority study?
A: We do not currently distinguish these; both of them would be labeled yellow. We are looking into ways to convey this information appropriately.

Q: Do you have a hierarchy of surrogacy?
A: Yes, this is the hierarchy that we use to determine the strength of an outcome measure:

Strong outcomes

Overall survival

Intermediate outcomes

Disease-free interval (DFI)
Disease-free survival (DFS)
Event-free survival (EFS)
Progression-free survival (PFS)
Time to next treatment (TTNT)

Weak outcomes

Overall response rate (ORR)
Response rate (RR)
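
The hierarchy above, combined with the earlier answer on multiple outcomes, can be sketched as follows. This is an assumption-laden illustration: the numeric tiers, the `Endpoint` class, the `labeling_endpoint` function, the abbreviation "OS" for overall survival, and the tie-break among several qualifying secondary endpoints are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass

# Strength tiers from the hierarchy above; a higher number is less surrogate.
# "OS" (overall survival) is an abbreviation assumed for this sketch.
STRENGTH = {
    "OS": 3,
    "DFI": 2, "DFS": 2, "EFS": 2, "PFS": 2, "TTNT": 2,
    "ORR": 1, "RR": 1,
}


@dataclass
class Endpoint:
    name: str           # abbreviation, e.g. "PFS"
    primary: bool       # True for the primary endpoint in the published methods
    differential: bool  # True if the arms differ on this endpoint


def labeling_endpoint(endpoints: list[Endpoint]) -> Endpoint:
    """Pick the endpoint to label by: the primary endpoint, unless a
    stronger secondary endpoint shows differential efficacy."""
    primary = next(e for e in endpoints if e.primary)
    stronger = [
        e for e in endpoints
        if not e.primary
        and e.differential
        and STRENGTH[e.name] > STRENGTH[primary.name]
    ]
    # If several qualify, take the strongest (a tie-break assumed here).
    return max(stronger, key=lambda e: STRENGTH[e.name], default=primary)
```

With a primary endpoint of ORR and a secondary endpoint of overall survival that differs between arms, this sketch selects overall survival; with a primary endpoint of PFS and a differing secondary ORR, it keeps PFS.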

Q: Do you consider quality of life (QoL) measures in efficacy?
A: Very few RCTs report on QoL measures, and as such we do not currently include them in the consideration. This may change in the future.

Toxicity

Defined generally, toxicity is the presence or absence of a negative effect (harm) on the study population. This is often also referred to as safety. As with efficacy, we only report comparative toxicity.

Toxicity is thus reported using one of the three labels:

Decreased toxicity

Equivalent toxicity

Increased toxicity
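
Note that the color direction is reversed between the two comparative metrics: increased efficacy is green, while increased toxicity is red. A small lookup makes this explicit. The names `LABEL_COLOR` and `comparative_label` are hypothetical, and the color assignments follow the label boxes in the page markup.

```python
# Direction-to-color mapping for each comparative metric.
LABEL_COLOR = {
    "efficacy": {"increased": "green", "equivalent": "yellow", "decreased": "red"},
    "toxicity": {"decreased": "green", "equivalent": "yellow", "increased": "red"},
}


def comparative_label(metric: str, direction: str) -> tuple[str, str]:
    """Return (label text, color) for a comparative result."""
    color = LABEL_COLOR[metric][direction]
    return f"{direction.capitalize()} {metric}", color
```

For example, `comparative_label("toxicity", "increased")` returns `("Increased toxicity", "red")`.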

Frequently asked questions

Q: Are you basing the label on the reported CTCAE measures?
A: CTCAE measures are extremely valuable in that they are structured and thus reproducible. However, it is often hard to compare them directly. For example, if one regimen has grade 4 lab-based toxicity and the other has grade 2 gastrointestinal toxicity, which is the more toxic? In general, we plan to use the authors' interpretation of overall toxicity and tolerability when labeling.

Q: Do you plan to incorporate patient-reported outcomes?
A: As shown in numerous publications, patient reports of toxicity are more accurate than clinician assessments. However, they have not until recently been standardized. Now that the PRO-CTCAE is available, we expect to see more of these in the future and will incorporate them into the toxicity assessment.

Example code (for contributors)

Case report Red label code: Case report

Phase II Yellow label code: Phase II

Phase III Green label code: Phase III
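
Contributors who prefer to generate the label markup rather than copy it by hand could use something like the sketch below. It is not an official tool: the function name `label_span` and its `tooltip` parameter are hypothetical, while the background colors and style properties are copied from the span markup used on this page.

```python
from html import escape

# Background colors taken from the page's own markup.
BACKGROUND = {"green": "#00CD00", "yellow": "#EEEE00", "red": "#ff0000"}


def label_span(text: str, color: str, tooltip: str = "tooltip") -> str:
    """Build the inline <span> used for a colored label box."""
    style = (
        f"background:{BACKGROUND[color]}; "
        "padding:3px 6px 3px 6px; "
        "border-color:black; "
        "border-width:2px; "
        "border-style:solid;"
    )
    return f'<span title="{escape(tooltip)}" style="{style}">{escape(text)}</span>'


print(label_span("Phase III", "green"))
```

The text and tooltip are passed through `html.escape` so that characters such as `<` or `"` in a label do not break the markup.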

@@ Line 1: / Line 1: @@
 The purpose of this page is to create a reference to describe our methodology for assigning levels of evidence to regimens.
-'''A bit of background:''' We have taken a simplified approach to information visualization, based on a three-color "traffic signal" metric. Generally, a regimen should be evaluated in a randomized fashion with an adequate patient sample to be considered a "green" regimen. We have defined adequate as 20 or more patients per arm. Non-randomized studies and randomized studies with fewer than 20 patients per arm are considered to be "yellow" regimens. Finally, case reports or non-randomized studies with fewer than 20 patients enrolled are considered to be "red" regimens. Of course, there are finer gradations of the quality of evidence so this simplified scheme should be taken with a grain of salt.
+'''Important note:''' Our intent is not to provide clinical decision support. Rather, our goal is to faithfully reproduce findings of clinical trials. Efficacy and toxicity information, in particular, is sometimes presented by authors in a confusing or ambivalent manner. As such, we try to illustrate ambiguities when they happen, and take no responsibility for your decision to choose a particular treatment regimen. Please read our [[HemOnc.org_-_A_Hematology_Oncology_Wiki:General_disclaimer|disclaimer]] for further information.
-We are currently collaborating on this list, which is a work in progress.  Criteria that may be considered:
+'''A bit of background:''' We have taken a simplified approach to information visualization, based on a three-color "traffic signal" metric. In order to account for color-blindness, text is also included within each colored box. The colors we use are:
+<span title="tooltip"
+style="background:#00CD00;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Green box text</span>
+<span title="tooltip"
+style="background:#EEEE00;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Yellow box text</span>
+<span title="tooltip"
+style="background:#ff0000;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Red box text</span>
+See the sections below for a discussion of the various metrics we use.
+{{TOC limit|limit=2}}
+=Evidence=
+Generally, a regimen should be evaluated in a randomized fashion with an adequate patient sample to be considered a "green" regimen. We have defined adequate as 20 or more patients per arm. Non-randomized studies and randomized studies with fewer than 20 patients per arm are considered to be "yellow" regimens. Finally, case reports, retrospective series, and non-randomized studies with fewer than 20 patients enrolled are considered to be "red" regimens. Of course, there are finer gradations of the quality of evidence so this simplified scheme should be taken with a grain of salt.
+Evidence is thus reported using one of the three labels:
+<span title="tooltip"
+style="background:#00CD00;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Strong evidence</span>
+<br><br>
+<span title="tooltip"
+style="background:#EEEE00;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Moderate evidence</span>
+<br><br>
+<span title="tooltip"
+style="background:#ff0000;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Weak evidence</span>
+==Frequently asked questions==
+'''Q:''' If a randomized trial has more than two arms, will they all be labeled the same?
+<br>
+'''A:''' No, it depends on how many patients are in each arm of the trial. For arms that have more than 20 patients, the label is green. For arms with fewer than 20, the label is yellow.
+<!-- We are currently collaborating on this list, which is a work in progress.  Criteria that may be considered:
 *Phase: I, II, III, IV
@@ Line 15: / Line 72: @@
 *Compared against: placebo, standard of care
 *Superiority, non-inferiority, equivalence
-*Outcomes:
+-->
-**Overall survival
-**Progression-free survival (PFS), disease-free survival (DFS)
+=Efficacy=
-**Quality of life/toxicity
+Defined generally, efficacy is the presence or absence of a positive effect on the study population. Efficacy can be reported ranging from a weak surrogate measure (e.g. response rate) to direct measure of overall survival. Currently, we are not labeling efficacy by the quality of the outcome measure, but rather by comparative efficacy. As such, we only currently report efficacy for randomized trials; many non-randomized trials report efficacy compared to historical controls. However, in the rapidly developing field of oncology, this approach is rife with bias and as such we do not report on comparison to historical controls, at this time.
-**RECIST: Complete response, partial response, stable disease
-**Complete/major molecular response (CMR, MMR), complete/major cytogenetic response (CCR/CCyR, MCR/MCyR), complete/partial hematologic response (CHR, PHR)
+Efficacy is thus reported using one of the three labels:
-*Number of sites (single, regional, national, international)
-*Number of experimental arms
+<span title="tooltip"
+style="background:#00CD00;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Increased efficacy</span>
+<br><br>
+<span title="tooltip"
+style="background:#EEEE00;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Equivalent efficacy</span>
+<br><br>
+<span title="tooltip"
+style="background:#ff0000;
+padding:3px 6px 3px 6px;
+border-color:black;
+border-width:2px;
+border-style:solid;">Decreased efficacy</span>
+==Frequently asked questions==
+'''Q:''' How do we choose to label efficacy when multiple outcomes are reported?
+<br>
+'''A:''' Often, a trial will report on multiple outcomes, such as overall response rate, progression-free survival, and overall survival. In this case, we generally look to the PRIMARY endpoint, as defined in the published methods. However, if a secondary endpoint shows differential efficacy and is less "surrogate" than the primary endpoint, we will label by that endpoint.
+'''Q:''' How do you distinguish between a failed trial and a formal non-equivalence study?
+<br>
+'''A:''' We do not currently distinguish these; both of them would be labeled yellow. We are looking into ways to convey this information appropriately.
+'''Q:''' Do you have a hierarchy of surrogacy?
+<br>
+'''A:''' Yes, this is the hierarchy that we use to determine the strength of an outcome measure:
+===Strong outcomes===
+*Overall survival
+===Intermediate outcomes===
+*Disease-free interval (DFI)
+*Disease-free survival (DFS)
+*Event-free survival (EFS)
+*Progression-free survival (PFS)
+*Time to next treatment (TTNT)
+===Weak outcomes===
+*Overall response rate (ORR)
+*Response rate (RR)
+'''Q:''' Do you consider quality of life (QoL) measures in efficacy?
+<br>
+'''A:''' Very few RCTs report on QoL measures, and as such we do not currently include them in the consideration. This may change in the future.
+=Toxicity=
+Defined generally, toxicity is the presence or absence of a negative effect (harm) on the study population. This is often also referred to as safety. As with efficacy, we only report comparative toxicity.
+Toxicity is thus reported using one of the three labels:
-Colors to be used:
 <span title="tooltip"
 style="background:#00CD00;
@@ Line 30: / Line 142: @@
 border-color:black;
 border-width:2px;
-border-style:solid;">Green box text</span>
+border-style:solid;">Decreased toxicity</span>
+<br><br>
 <span title="tooltip"
 style="background:#EEEE00;
@@ Line 36: / Line 149: @@
 border-color:black;
 border-width:2px;
-border-style:solid;">Yellow box text</span>
+border-style:solid;">Equivalent toxicity</span>
+<br><br>
 <span title="tooltip"
 style="background:#ff0000;
@@ Line 42: / Line 156: @@
 border-color:black;
 border-width:2px;
-border-style:solid;">Red box text</span>
+border-style:solid;">Increased toxicity</span>
+==Frequently asked questions==
+'''Q:''' Are you basing the label on the reported CTCAE measures?
+<br>
+'''A:''' CTCAE measures are extremely valuable in that they are structured and thus reproducible. However, it is often hard to compare them directly. For example, if one regimen has grade 4 lab-based toxicity and the other has grade 2 gastrointestinal toxicity, which is the more toxic? In general, we plan to use the authors' interpretation of overall toxicity and tolerability when labeling.
+'''Q:''' Do you plan to incorporate patient-reported outcomes?
+<br>
+'''A:''' As shown in numerous publications, patient reports of toxicity are more accurate than clinician assessments. However, they have not until recently been standardized. Now that the [https://wiki.nci.nih.gov/pages/viewpage.action;jsessionid=0CD2068195B764B2954DF251DE62A812?pageId=10857328 PRO-CTCAE] is available, we expect to see more of these in the future and will incorporate them into the toxicity assessment.
-==Examples==
+=Example code (for contributors)=
 <span
 style="background:#ff0000;
@@ Line 84: / Line 208: @@
 border-width:2px;
 border-style:solid;">Phase III</span></nowiki>
+<!--
+**RECIST: Complete response, partial response, stable disease
+**Complete/major molecular response (CMR, MMR), complete/major cytogenetic response (CCR/CCyR, MCR/MCyR), complete/partial hematologic response (CHR, PHR)
+*Number of sites (single, regional, national, international)
+*Number of experimental arms
+-->

Difference between revisions of "Levels of Evidence"
