EURURO Vol. 71 No. 5

of the evidence, and the authors concluded that there was

an urgent need to conduct a large, robust, multicenter RCT

to address these shortcomings. Pickard et al

[8]

published

the results of such an RCT in 1167 patients and found no

evidence that either tamsulosin or nifedipine increased the

rate of spontaneous stone passage compared to placebo. The

results were consistent across subgroup and sensitivity

analyses.

We compare the RCT by Pickard et al

[8]

to the MA with

the most studies, by Seitz et al

[36]

, to explore and discuss

discordant findings. Most RCTs included in the Seitz MA

were small and recruited from a single center; only six of 35

(17%) recruited more than 100 patients. The majority had

low internal validity and only one RCT reported allocation

concealment. As small RCTs may report larger effect sizes

compared to larger RCTs, an MA of small RCTs can lead to

biased estimates of treatment effects

[39]

. Seitz et al also

found evidence of publication bias, which can lead to

overestimation of treatment effects and compromise the

validity of the MA findings

[40].

Seitz et al

[36]

found evidence of clinical heterogeneity

concerning the patient inclusion criteria, stone characteristics,

intervention, treatment in the control group, and

outcome measurement. In the MA, the primary outcome of

being stone-free was inconsistently defined, assessed using

different imaging modalities, and measured at a variety of

time points. In the RCT of Pickard et al

[8],

the primary

outcome was any need for further intervention within 4 wk

of randomization, which is compared here to being stone-free.

In the control group of the Pickard RCT, 80% of patients

were stone-free, whereas in the Seitz review, stone-free

rates ranged from 4% to 78%, which highlights the potential

impact of heterogeneity in the studies included.
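To illustrate why the control-group rate matters, the following minimal Python sketch computes the largest risk ratio that any treatment could show for a given control-group stone-free rate, given that the treated-group rate cannot exceed 100%. The function name `max_risk_ratio` is an illustrative assumption and does not come from the studies discussed; the three rates are those quoted above.

```python
def max_risk_ratio(control_rate: float) -> float:
    """Upper bound on the risk ratio (treated / control) for a beneficial outcome.

    The treated-group rate cannot exceed 1.0, so the ratio is capped
    at 1 / control_rate. Hypothetical helper for illustration only.
    """
    if not 0.0 < control_rate <= 1.0:
        raise ValueError("control_rate must be in (0, 1]")
    return 1.0 / control_rate


# Control-group stone-free rates quoted in the text:
# 80% (Pickard RCT) and the 4% to 78% range reported in the Seitz review.
for rate in (0.80, 0.78, 0.04):
    print(f"control rate {rate:.0%}: maximum possible risk ratio "
          f"{max_risk_ratio(rate):.2f}")
```

Under this simple bound, a trial with an 80% stone-free rate in the control arm cannot show a risk ratio above 1.25, whereas trials with low control rates leave room for much larger ratios.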

With contrasting primary outcomes and different baseline

event rates in the control groups, it is not surprising that

the RCT and MA reported discordant findings. The choice of

primary outcome is clearly of paramount importance in any

trial. Heterogeneity in the conduct, design, and reporting of

trials in this MA makes pooled treatment effects difficult, if

not impossible, to interpret.
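As a rough illustration of how heterogeneity is usually quantified, the sketch below pools study effects with inverse-variance (fixed-effect) weights and reports Cochran's Q and the I² statistic. It is a minimal sketch and not the method used in the Seitz MA. The function name `pooled_fixed_effect` and all input values are hypothetical and were chosen only to show the calculation.

```python
import math


def pooled_fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2.

    effects:   per-study estimates on an additive scale (e.g. log risk ratios)
    variances: the corresponding sampling variances
    Returns (pooled_effect, Q, I2), with I2 expressed as a fraction in [0, 1).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2


# HYPOTHETICAL log risk ratios and variances, not taken from any cited study.
log_rr = [math.log(1.05), math.log(1.60), math.log(2.40), math.log(1.10)]
var = [0.01, 0.04, 0.09, 0.02]

pooled, q, i2 = pooled_fixed_effect(log_rr, var)
print(f"pooled RR {math.exp(pooled):.2f}, Q = {q:.2f}, I^2 = {i2:.0%}")
```

A high I² indicates that most of the observed variation reflects real differences between studies instead of chance, which is the situation in which a single pooled estimate becomes hard to interpret.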

5.2. Partial versus radical nephrectomy

In a European Organization for Research and Treatment of

Cancer (EORTC) RCT involving 541 patients with a solitary

T1–2N0M0 renal tumor of ≤5 cm, 21 patients progressed,

nine after radical nephrectomy (RN) and 12 after partial

nephrectomy (PN). An intent-to-treat analysis found an

overall survival (OS) advantage in favor of RN (hazard ratio

[HR] 1.5,

p = 0.03); however, only 12 of the 117 deaths were

due to kidney cancer, four after RN and eight after PN

[10].

Subsequently, Kim et al

[9]

published an SR and MA

including some 41 000 patients and found statistically

significant improvements in both OS (HR 0.81,

p < 0.001)

and disease-specific survival (HR 0.71,

p < 0.001), but this

time in favor of PN

[9].

How can this discordance be

explained?

The Kim MA has a number of limitations. First, the

38 trials included were mostly retrospective, single-center

studies. The only RCT was the EORTC study. No information

was provided about the distribution of follow-up or patient

characteristics by treatment group (T category when

T1,

tumor size, tumor grade, cell type, or renal function).

Consequently, the differences in survival observed may not

be directly due to differences in treatment efficacy. In

addition, it is not clear to which patients the results can be

generalized. Lastly, there was significant heterogeneity in

the size of the treatment effect across the studies, so the

overall estimate of the HR is not meaningful. Nevertheless,

the EORTC RCT also had limitations and should be

interpreted with caution: 55 patients crossed over to the

other randomized treatment, 140 patients were clinically or

pathologically ineligible, and there were few cancer-related

events.

The MA found that PN was associated with a lower risk of

severe chronic kidney disease (CKD); however, the EORTC

study only found lower incidence of at least moderate renal

dysfunction, not of advanced kidney disease or renal failure,

and this was not associated with a corresponding difference

in survival

[41]

. The studies in the MA did not always specify

the status of the contralateral kidney, whereas in the EORTC

study the contralateral kidney had to be normal.

Critical information regarding the biases of the studies

included in the SR was not made explicit, since a Grading of

Recommendations, Assessment, Development, and Evaluation

(GRADE) approach

[42]

was not used to assess the

quality of the evidence. The quality of the studies in the SR

and the heterogeneity of the results call into question the

validity of the conclusions of the MA, which should thus be

viewed with skepticism. In the same year, another SR

suggested that localized renal cell carcinomas are best

managed by PN where technically feasible. However, the

evidence base had significant limitations owing to studies of

low methodological quality and high RoB

[43].

Further nonrandomized studies have found improved

survival after PN

[44,45]

and a reduction in the risk of

cardiovascular events

[46]

relative to RN; however, patients

chosen for PN had a higher baseline likelihood of long-term

survival

[47,48]

. In another study, only patients with stage II

CKD had a lower risk of developing significant renal

impairment after PN

[49]

. More recently, an SR and MA

of 21 nonrandomized comparative studies in patients with

clinical T1b and T2 renal tumors found better tumor control

and survival for PN compared to RN

[50]

, but the SR is

subject to the same biases as the Kim MA.

Taking into account all the available efficacy data and a

perceived advantage in renal function, the 2016 EAU

Guidelines recommend, with several exceptions, that

localized renal cancers are better managed by PN than RN.

Discussion

It is generally accepted that a high-quality SR of RCTs and an

associated MA can provide a higher LE than a single RCT

addressing the same question

[2]

. This can be problematic,

however, when the results of the MA are in direct conflict

with the RCT, making it difficult for guideline organizations

to interpret the evidence and issue recommendations.

European Urology 71 (2017) 811–819
