Presenting, analysing & discussing the results


CLINICAL TRIALS

MICHAEL KIRK-SMITH

The purpose of clinical evaluation is to identify changes due to a treatment. The Results section is where these changes are presented as data summarised in tables, graphs and plots, so that any changes and their extent can be easily seen. Statistical tests are used to confirm that these changes are due to treatment and not due to chance. The choice of test depends on factors such as the measurement scales and data distribution and these are outlined. Finally, the meaning of the results and their ramifications are given in the Discussion section.

The main purpose of a clinical evaluation is to find out whether a therapy or treatment causes a change in the condition of the patients when compared to no treatment at all or compared to other treatments. The design of the study ensures that one can be sure that any change (or lack of change) was due to the treatment alone and not to other factors. The Results section, in contrast, is where any change and its extent is reported. Being sure that a change has actually happened or not is the key issue to be addressed in the Results section. There are two aspects to identifying change: first, visual inspection of the results and, secondly, statistical analysis to confirm the evidence of the visual inspection. There should be no interpretation of the data in the Results section; it should just contain results, nothing more. What is interesting in the results or what the results might mean has no place here. These go into the Discussion section.

It is important to note that if the design or measures are flawed, then no amount of sophisticated statistical analysis can sort out these problems. On the other hand, if the Aims, measures and design of a clinical evaluation are well planned, then the analysis of results is likely to be straightforward. If data have been collected and one does not know what to do with them, then the study is effectively being planned and designed after it has been done - not a good idea. To avoid this, it is strongly recommended that the results section, with mock graphs and tables, etc., be drawn up at the planning stage.

The first step in identifying whether there has been a change due to the treatment is to summarise and display the results in a form that will allow an easy visual inspection of any changes.

As well as using tables, good ways of doing this are by using a graph (Fig.1) or an error-bar plot (Fig.2; the vertical lines represent the spread of the scores, see later), in which the effects of the treatment measures are contrasted with the non-treatment measures. Spreadsheets, graphics programmes and statistical packages can be used to produce professional quality tables, graphs and plots. These should be accompanied by short descriptive texts drawing out the main points, e.g., "Table 1 shows the average mood scores in the treatment groups, with Group A having the highest scores (average = 74)".

A subsidiary purpose of a clinical evaluation may be to see the influence of other factors on the effects of the treatment, e.g. the age, weight or sex of the patient, or a variation in the conditions, e.g. how the treatment was given. These factors may be measured as a continuous range of numbers (like age and weight), or might be discrete classes, e.g. male or female, smoking and non-smoking, with and without expectation. For example, if it is thought that the effect of the treatment might be affected by a continuous range, e.g. patients' ages, then the change due to the treatment for each patient's age would be plotted. If a line can be drawn through the points (Fig.3), then the treatment effects and age go up and down together, or are said to be "correlated". If the influencing factors can be classified, then the average scores for patients in each group can be displayed as numbers in a table or presented graphically in an error-bar plot (see Fig.2).

If an obvious visible difference or change is seen between the control and treatment groups, or the baseline and intervention periods, when the data are plotted, then there is likely to be a real difference or change. However, the change may not always be obvious, or people might disagree about whether there is a real change. For example, in a multi-subject design, a treatment group of 24 patients might have an average improvement of 42 (say, on a pain scale), with lower and upper scores in the group of 38 and 46. The control group of another 24 patients might have an improvement of 38 with scores between 34 and 42. Although the average scores differ by four points, it might be argued that another control group, being different people, might just as well have had a score of 42 like the treatment group. So maybe the difference between the treatment and control groups is just due to random variations in the particular people selected for each group.

Similarly, in a single case example, a patient's average baseline temperature might be 94°C, varying between 92 and 98°C. Then during treatment sessions the average decreases to 93°C, varying between 93 and 97°C. One could argue that the small average change might have happened anyway, since the temperature is varying over a wide range. Again, the first step is to display the results.

The question in both these cases is whether the average score during treatment is really different from the average score with no treatment, because the spreads of scores making up both averages overlap and are taken over a limited number of patients or time. This spread of scores about each average is often summarised and calculated as the "standard deviation" and given as the vertical bars in Fig.2. So, if the spread of scores within each group is wide, might the difference between the two groups also be due to this same natural variation? Statistical tests help sort out this question by calculating the chance (i.e., probability) of whether the average difference between the treatment and control groups is more likely to be due to the treatment or due to the random differences one might expect between any two samples of patients.
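The summary figures described above (group average, standard deviation and range) can be worked out in a few lines. The following is only a minimal sketch in Python using the standard library. The two lists of scores are invented for illustration, chosen so that the averages are 42 and 38 as in the pain scale example, and the function names summarise and ranges_overlap are hypothetical helpers, not part of any package.

```python
from statistics import mean, stdev

# Hypothetical pain-improvement scores, invented for illustration only.
# They are chosen so the group averages are 42 and 38, as in the text.
treatment = [38, 40, 41, 42, 42, 43, 44, 46]
control = [34, 36, 37, 38, 38, 39, 40, 42]


def summarise(scores):
    """Return the figures one might put in a results table for one group."""
    return {
        "n": len(scores),
        "mean": mean(scores),
        "sd": stdev(scores),  # sample standard deviation (n - 1 divisor)
        "lowest": min(scores),
        "highest": max(scores),
    }


def ranges_overlap(a, b):
    """True if the lowest-to-highest ranges of the two groups overlap."""
    return min(a) <= max(b) and min(b) <= max(a)


for name, scores in (("Treatment", treatment), ("Control", control)):
    s = summarise(scores)
    print(f"{name}: n={s['n']} mean={s['mean']:.1f} sd={s['sd']:.2f} "
          f"range={s['lowest']}-{s['highest']}")

print("Ranges overlap:", ranges_overlap(treatment, control))
```

The standard deviation here is the figure that would be drawn as the vertical bar around each average in an error-bar plot. Overlapping ranges, as in this invented example, are exactly the situation in which visual inspection alone may not settle whether the difference is real.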

For example, a statistical test carried out on the difference between the 42 and 38 averages in pain scores might show that, with this spread of scores within each group, there is less than a 1 in 20 chance (usually written as p < 0.05) of there being this difference in averages if two randomly selected control groups were compared. It is conventionally accepted that a difference with less than a 1 in 20 chance of arising in this way (called a "significant difference") is probably due to the treatment and not due to chance differences between the two groups. Similarly, if there is less than a 1 in 100 chance that the difference between the treatment and control group averages is due to chance, then this is called "highly significant", and one can be even more certain that the difference is due to the treatment. However, it is important to note that even if a result is significantly (or reliably) different it may not be clinically useful, since a very reliable but small improvement due to a complicated treatment may be more trouble than it is worth.

Statistical tests are used in a similar way to assess whether two factors are actually correlated (i.e., going up and down together) or whether the association is due to chance. The degree of correlation (how close the points are to the line in Figure 3) is calculated as a 'Correlation Coefficient', and the probability of getting this value by chance alone is determined by the number of points (or pairs) used in the calculation, so that 'significant' and 'highly significant' correlations can be confirmed as in differences between groups.

The best way to select an appropriate statistical test is to get advice from a statistician or research advisor by presenting them with the aims, measures, design and results sections already drawn up before any data are collected (e.g. in the format described in the first article of this series). This will allow them to assess quickly which statistics are most appropriate. General issues to do with choice of tests will be considered here. Standard textbooks on statistical analysis should be consulted for detailed explanations about individual tests and how they should be used. Table 1 gives examples of some common tests used to determine changes and differences in data. The headings on this table will now be explained.

The 'shape' of the data and the type of measurement scale used determine whether a "parametric" or "non-parametric" statistical test is used. Parametric tests (tests in Section C, Table 1) require that data give a bell-shaped curve (or a 'normal distribution') when they are displayed. Also, the steps or intervals on the measurement scales must be equally spaced. These are called interval and ratio scales and cover measures such as blood pressure, rash area and temperature, as well as measures which can be counted, e.g., the number of cigarettes smoked by each person on a treatment programme. If the data are not "bell-shaped" when displayed and small samples are involved, a common situation in small scale clinical trials, then a non-parametric test (tests in Sections A and B, Table 1) might be considered.

Non-parametric tests are also used when interval or ratio scales are not being used, i.e. for nominal or ordinal measurements. Nominal measures are those that are counted or classified into different groups or categories, e.g., the numbers of patients who are 'yes' or 'no', or 'red', 'blue' or 'black'. Ordinal measures cover data that can be ordered or ranked in order of magnitude, e.g. "High", "Middle" and "Low", and where the steps between scale points are not equal. Psychological measures are often ordinal, e.g. scales such as 'strongly agree - agree - not sure - disagree - strongly disagree'. Psychological scales with numbers might be best regarded as ordinal as well, since the distances between points on a scale may not be equal, e.g., on a 1-10 "relaxation" scale we may not be sure that the distance between 2 and 3 is the same in psychological terms as between 9 and 10.

'Related' refers to whether the samples of data used are matched in some way, i.e., by using a patient as their own control (e.g., before and after treatment) or by pairing two subjects who are as alike as possible and then allocating them to different treatment groups. 'Unrelated' refers to when patients are allocated randomly to treatment and control groups.

Statistical tests are usually calculated on computer, although they can also be done by hand or calculator. A spreadsheet can be used to calculate statistics (Soper and Lee 1990), and most PCs have spreadsheet software, but calculations need to be double checked for accuracy (e.g. in case of the spreadsheet rounding up figures or having the wrong formulae).
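As an illustration of what such a calculation on computer involves, the idea of "the chance of this difference arising between two randomly selected groups" can be estimated directly by repeatedly regrouping the scores at random. This is a permutation test, used here only as a sketch of the reasoning; it is not claimed to be one of the tests listed in Table 1. The scores are invented, and the function name permutation_p_value, the number of shuffles and the seed are all assumptions made for the example.

```python
import random
from statistics import mean

# Hypothetical improvement scores, invented for illustration only.
treatment = [38, 40, 41, 42, 42, 43, 44, 46]
control = [34, 36, 37, 38, 38, 39, 40, 42]


def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """Estimate how often a random regrouping of the pooled scores gives a
    difference in averages at least as large as the one observed."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        new_a, new_b = pooled[:len(a)], pooled[len(a):]
        if abs(mean(new_a) - mean(new_b)) >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles


p = permutation_p_value(treatment, control)
print("Observed difference in averages:", mean(treatment) - mean(control))
print(f"Estimated chance of a difference this large: {p:.3f}")
if p < 0.05:
    print("Less than a 1 in 20 chance: conventionally 'significant'")
else:
    print("Not significant at the 1 in 20 level")
```

Running a simple calculation of this kind alongside a spreadsheet or package result is one way of double checking that the figures agree in broad terms. An estimate from random shuffling will vary slightly with the seed and the number of shuffles, so it should not be expected to match a textbook test exactly.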

The simplest approach is to use statistical packages. Most educational establishments now have large and easy to use statistical packages, e.g. Minitab, SPSS (e.g. Bryman and Kramer 1990), Systat, Unistat and StatXact. These all have non-parametric and parametric tests and good 'help' facilities (StatXact concentrates on non-parametric tests). There are also smaller, more limited (but cheaper or free) statistical programmes, many of which are available as 'public domain' programmes.

The Results section is followed by the Discussion section (which can also include 'Conclusions' and 'Recommendations'). This gives your interpretation of the results and their ramifications. Detailed results, e.g., numbers, should not be repeated, but should be mentioned in summary form, e.g., "The unexpectedly high mood ratings in Group A might be due to....".

Typical points that might be covered in a Discussion are:

• Interpretation of the results in terms of past work mentioned in the Introduction, e.g. where it agrees or differs from previous findings and ideas and why this should be.
• Drawing attention to interesting or surprising aspects of the results, and their implications.
• Stating the limitations of the study.
• Giving clinical recommendations that arise from the results.
• Suggesting future research that should be done.

This concludes the series of articles on planning clinical evaluations. The purpose of the series is to give an insight into good practice in conducting research and into the issues underlying it rather than giving comprehensive guidance. It is hoped that the articles will encourage readers to consider evaluating their treatments and to explore research methods and issues more deeply.

NOTE: Dr Kirk-Smith will be pleased to collaborate with readers who are interested in evaluating their treatments and also to discuss any aspects of this series with readers. He can be contacted at the University of Ulster at Jordanstown on the Tel & email numbers printed on the title page.

REFERENCES

Bryman, A. and Kramer, D. (1990) Quantitative data analysis for social sciences. Routledge, London.

Kratochwill, T.R. (1992) Single case research design and analysis. Lawrence Erlbaum, Hillside, NJ.

Pilcher, D.M. (1990) Data analysis for the helping professions. Sage Pubs., London.

Siegel, S. and Castellan, N.J. (1988) Nonparametric statistics for the behavioural sciences. McGraw-Hill International Editions, New York.

Soper, J.B. and Lee, M.P. (1990) Statistics with Lotus 123 (2nd Ed). Chartwell-Bratt Ltd., Sweden.

Yarnold, P.R. (1992) Statistical analysis for single case designs. In Bryant, F. et al. (Eds.) Methodological issues in applied social psychology. Plenum Press, N.Y.