Traditionally, phase II cancer trials test a binary endpoint formed by dichotomising the continuous change in tumour size. Testing the continuous endpoint directly provides considerable gains in power, but also raises several statistical issues. One such issue arises when complete responses, i.e. complete tumour removal, are observed in multiple patients, which is a problem when normality is assumed. Using simulated data and a recently published phase II trial, we investigate how the choice of test affects the operating characteristics of the trial. We propose parametric tests based on the censored normal distribution and compare them with the t-test and the Wilcoxon non-parametric test. The censored normal distribution fits the real dataset well, but simulations indicate that the type I error rate of the tests based on it is inflated and that their power is only slightly higher than that of the t-test. The Wilcoxon test has a deflated type I error rate. For two-arm designs, the differences between the tests are much smaller. We conclude that the t-test is suitable for use when complete responses are present, although positively skewed data can result in the non-parametric test having higher power.
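
To illustrate the censored normal model referred to above, the following is a minimal Python sketch of its log-likelihood. It is not the authors' implementation. It assumes the endpoint is the proportional change in tumour size, so that a complete response is recorded at the lower bound of -1; the function name `censored_normal_loglik` and the `lower` parameter are hypothetical and chosen only for this example.

```python
import math


def censored_normal_loglik(y, mu, sigma, lower=-1.0):
    """Log-likelihood of a normal(mu, sigma) sample left-censored at `lower`.

    Assumption for this sketch: values at (or below) `lower` are complete
    responses, i.e. censored observations of a latent normal variable.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Standardised censoring limit, shared by all censored observations.
    z_low = (lower - mu) / sigma
    total = 0.0
    for value in y:
        if value <= lower:
            # Censored: contributes log P(Y <= lower) = log Phi(z_low),
            # with Phi(x) = 0.5 * erfc(-x / sqrt(2)).
            total += math.log(0.5 * math.erfc(-z_low / math.sqrt(2.0)))
        else:
            # Fully observed: contributes the normal log-density.
            z = (value - mu) / sigma
            total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return total
```

A test built on this model could, for example, compare the log-likelihood maximised over both parameters with the log-likelihood maximised under the null value of the mean; the maximisation step is omitted here.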