Dot plots

Contributed by: Mat Soukup (email: Mat.Soukup@fda.hhs.gov)

Type of Data: continuous

Type of Analysis: univariate

Description and Purpose:

The dotplot is a useful way to plot a quantitative variable alongside labels that identify each measurement (e.g., when looking at adverse event rates, each rate is labeled with its adverse event). It is an alternative to displays such as bar charts and pie charts, with the advantage that it can show data whose baseline is not zero, because a dot encodes its value by position rather than by length or area.

Comparison of Multiple Graphing Techniques

To compare several graphical displays, suppose we want to show the disposition of subjects at the end of a trial. Table 1 lists each disposition category along with the percentage of subjects in it.

Table 1: Disposition

Disposition	Percentage
Completed Study	75.0%
Adverse Event	7.0%
Lack of Efficacy	8.0%
Withdrew Consent	6.0%
Other	4.0%

Figure 1 depicts the data from Table 1 using three approaches: dotplot, pie chart, and barplot. The example shows the limitation of the pie chart: it is difficult to judge the numerical differences among the categories of subjects who did not complete the study. Both the dotplot and the barplot show the magnitude of the differences. However, the dotplot has two advantages over the barplot: the disposition labels are written horizontally, and the only ink in the plotting region marks the quantity of interest, in this case the percentage.

Figure 1: Disposition Graphical Displays

Figure 1
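
The comparison can also be tried quickly without the hand-built panels. The following is only a minimal sketch, assuming the Table 1 percentages; it swaps in base R's dotchart(), pie(), and barplot() for the custom layout and the plotrix pie chart used in the code at the end of this page, and the object name disp is illustrative.

```r
# Disposition percentages taken from Table 1, sorted from smallest to largest
disp <- c("Completed Study" = 75, "Adverse Event" = 7, "Lack of Efficacy" = 8,
          "Withdrew Consent" = 6, "Other" = 4)
disp <- sort(disp)

# Three displays side by side; the wide left margin leaves room for the bar labels
op <- par(mfrow = c(1, 3), mar = c(5, 9, 4, 1))
dotchart(disp, pch = 16, xlab = "Percent", main = "Dotplot")
pie(disp, main = "Pie Chart")
barplot(disp, horiz = TRUE, las = 1, xlab = "Percent", main = "Barplot")
par(op)
```

Because the values are sorted before plotting, the largest category appears at the top of the dotplot and of the barplot.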

Application of Dotplots to Adverse Event Data

As an illustrative example, we can plot the percentage of subjects who experience an adverse event (using MedDRA preferred terms) in a clinical trial (Figure 2). In this example, the preferred terms are listed in alphabetical order.

Figure 2: Percentage of subjects reporting an adverse event (MedDRA preferred term)

Figure 2

To aid visualization, the preferred terms can be ordered by their percentages (Figure 3). This plot makes it easy to see that PRURITUS and APPLICATION SITE PRURITUS are reported most frequently, which takes closer inspection to see in Figure 2.

Figure 3: Percentage of subjects reporting an adverse event (MedDRA preferred term) sorted by frequency

Figure 3
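
The effect of sorting can be sketched on a small example. The percentages below are invented purely for illustration (they are not the CDISC pilot results), and the names ae, p_alpha, and p_sorted are hypothetical; the sketch assumes the lattice package is installed.

```r
library(lattice)

# Illustrative percentages only, invented for this sketch
ae <- data.frame(
  PT  = factor(c("DIARRHOEA", "DIZZINESS", "ERYTHEMA", "PRURITUS", "RASH")),
  PCT = c(5, 13, 17, 30, 11)
)

# Alphabetical order: the factor levels are plotted as they are
p_alpha <- dotplot(PT ~ PCT, data = ae, pch = 16, col = "black",
                   xlab = "Percent Reporting Event", main = "Alphabetical")

# reorder() re-levels PT by PCT, so the most frequent term is plotted on top
p_sorted <- dotplot(reorder(PT, PCT) ~ PCT, data = ae, pch = 16, col = "black",
                    xlab = "Percent Reporting Event", main = "Sorted")

# Draw the two versions side by side
print(p_alpha,  split = c(1, 1, 2, 1), more = TRUE)
print(p_sorted, split = c(2, 1, 2, 1))
```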

Beyond the basic dotplot shown in Figures 2 and 3, additional information can be incorporated into the display. For example, we could sort the preferred terms (PTs) according to the system organ classification (SOC) (Figure 4). In this figure, color distinguishes the SOCs, and within each SOC the terms are plotted in descending order of frequency. Note that the color could be suppressed and the graphic printed in black and white, since the SOC is also printed next to each term in the graphic.

Figure 4: Percentage of subjects reporting an adverse event (MedDRA preferred term) sorted by frequency and system organ classification

Figure 4
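
The two-level sort can be sketched as follows. This is a simplified lattice version rather than the three-panel base graphics layout used in the code at the end of this page; the percentages are invented for illustration, and the names ae and p are hypothetical.

```r
library(lattice)

# Illustrative percentages only, invented for this sketch
ae <- data.frame(
  PT  = c("PRURITUS", "ERYTHEMA", "RASH", "DIZZINESS", "HEADACHE", "DIARRHOEA"),
  SOC = factor(c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS",
                 "SKIN AND SUBCUTANEOUS TISSUE DISORDERS",
                 "SKIN AND SUBCUTANEOUS TISSUE DISORDERS",
                 "NERVOUS SYSTEM DISORDERS",
                 "NERVOUS SYSTEM DISORDERS",
                 "GASTROINTESTINAL DISORDERS")),
  PCT = c(30, 17, 11, 13, 6, 5)
)

# Sort by SOC first, then by percentage within each SOC
ae <- ae[order(ae$SOC, ae$PCT), ]

# Fix the plotting order: factor levels are drawn from bottom to top
ae$PT <- factor(ae$PT, levels = ae$PT)

# One color per SOC; the key names each SOC so the plot also works in grayscale
p <- dotplot(PT ~ PCT, groups = SOC, data = ae, pch = 16,
             xlab = "Percent Reporting Event",
             auto.key = list(columns = 1))
print(p)
```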

The dotplot also allows different plotting characters to represent a grouping variable (i.e., a superposition of plotting symbols). In clinical trials, the grouping would often be the treatment assignment. Figure 5 is similar in appearance to Figure 3 but adds information for the other treatment arms.

Figure 5: Percentage of subjects reporting an adverse event (MedDRA preferred term) sorted by frequency and grouped by treatment assignment

Figure 5
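
A compact way to superpose groups in lattice is the groups argument. The sketch below assumes three hypothetical arm labels and invented percentages; it sets the plotting symbols through par.settings so that the automatically generated key uses the same symbols as the plot, which is a simpler alternative to building the key by hand.

```r
library(lattice)

# Illustrative percentages only, invented for this sketch.
# expand.grid() varies PT fastest, so each row of numbers below is one arm.
ae <- expand.grid(
  PT  = c("PRURITUS", "ERYTHEMA", "DIZZINESS", "RASH"),
  TRT = c("Placebo", "Low Dose", "High Dose")
)
ae$PCT <- c( 8,  9,  2,  5,   # Placebo
            22, 14,  8, 13,   # Low Dose
            26, 15, 11,  9)   # High Dose

# groups = TRT superposes one symbol per arm on each preferred-term row;
# reorder() sorts the terms by their mean percentage across arms
p <- dotplot(reorder(PT, PCT) ~ PCT, groups = TRT, data = ae,
             xlab = "Percent Reporting Event",
             par.settings = list(superpose.symbol = list(pch = 1:3, col = "black")),
             auto.key = list(columns = 3, title = "Treatment"))
print(p)
```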

To incorporate still more information into the dotplot, the display can be paneled on an additional categorical variable. Figure 6 shows such a plot, in which the preferred terms are paneled by the investigator’s determination of whether the event was related to treatment. The related and not related percentages in this figure are simulated for illustration.

Figure 6: Percentage of subjects reporting an adverse event (MedDRA preferred term) sorted by frequency and grouped by treatment assignment with panels for treatment relationship

Figure 6
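
In lattice, paneling is expressed with a conditioning variable after the | in the formula. The following minimal sketch uses invented percentages and hypothetical arm labels; the names ae and p are illustrative.

```r
library(lattice)

# Illustrative percentages only, invented for this sketch.
# expand.grid() varies PT fastest, then TRT, then REL.
ae <- expand.grid(
  PT  = c("PRURITUS", "ERYTHEMA", "DIZZINESS"),
  TRT = c("Placebo", "Low Dose", "High Dose"),
  REL = c("Related", "Not Related")
)
ae$PCT <- c(6, 5, 2,  14, 9, 6,  18, 10, 8,   # Related
            3, 4, 1,   5, 4, 3,   6,  4, 3)   # Not Related

# "| REL" draws one panel per level of REL; groups = TRT superposes the arms
p <- dotplot(reorder(PT, PCT) ~ PCT | REL, groups = TRT, data = ae,
             xlab = "Percent Reporting Event",
             par.settings = list(superpose.symbol = list(pch = 1:3, col = "black")),
             auto.key = list(columns = 3, title = "Treatment"))
print(p)
```

Both panels share the same term ordering and the same x-axis scale by default, which keeps the related and not related percentages directly comparable.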

Reference: Cleveland’s “The Elements of Graphing Data” contains an extensive discussion on the use and application of dotplots. This was particularly helpful in devising an approach for presenting the information.

Code (R 10.1):

%CODE{lang="java"}%

# Coding Information
# All figures were created using R version 10.1.

# Required Libraries
library(lattice)
library(SASxport)
library(vcd)        # for color scheme
library(colorspace) # provides rainbow_hcl(); recent vcd versions no longer attach it
library(plotrix)    # for pie chart

# Data Preparation Code for ADAE CDISC Pilot Data set
adae <- read.xport("C:/Research/CDISC-ADaMPilot/900171/m5/datasets/CDISCPILOT01/analysis/ADAE.xPT",
                   names.tolower=TRUE)
adsl <- read.xport("C:/Research/CDISC-ADaMPilot/900171/m5/datasets/CDISCPILOT01/analysis/ADSL.xPT",
                   names.tolower=TRUE)
dat <- data.frame(ID=adae$usubjid,PT=adae$aedecod,TRT=adae$trtp)

# For each subject get only the unique PT
uid <- unique(adae$usubjid)
pts <- NULL
trts <- NULL
for(i in seq_along(uid)){
  sdat <- subset(dat, ID%in%uid[i])
  upt <- unique(sdat$PT)
  npt <- length(upt)
  rows <- NULL
  for(k in 1:npt){
    ww <- which(sdat$PT%in%upt[k])
    rows[k] <- ww[1]  # keep the first record of this PT for the subject
  }
  pts[i] <- list(sdat$PT[rows])
  trts[i] <- list(sdat$TRT[rows])
}
dat1 <- data.frame(PT=unlist(pts), TRT=unlist(trts))

# Count subjects per PT and treatment, dropping PTs that never occur
ss <- with(dat1, table(PT, TRT))
sums <- apply(ss, 1, sum)
ww <- which(sums>0)
xPT <- ss[ww,]

# Get percents
pPT <- NULL
for(j in 1:dim(xPT)[1]){
  pPT[j] <- list(100*xPT[j,]/c(84,84,84)) #N/group come from ADSL (hard-coded here)
}
matP <- round(do.call('rbind', pPT),1)

# Keep only PTs whose percentages, summed across treatment groups, exceed 5
sump <- apply(matP, 1, sum)
wsum <- which(sump > 5.0)
matXp <- xPT[wsum,]
matPp <- matP[wsum,]
rownames(matPp) <- rownames(matXp)
colnames(matPp) <- colnames(matXp)

# Long format: one row per PT and treatment.
# stringsAsFactors=TRUE keeps PT and TRT as factors (no longer the default in R >= 4.0),
# which the later calls to levels(plotdat$TRT) rely on.
plotdat <- data.frame(PT = rep(rownames(matPp), 3),
                      TRT = rep(colnames(matPp), each=dim(matPp)[1]),
                      PCT = c(matPp[,1], matPp[,2], matPp[,3]),
                      stringsAsFactors=TRUE)

# Setting up Graphical Parameters
hclcolors <- rainbow_hcl(4, l=50, start=50, c=75)[c(3,2,4,1)]
colorpalette <- c(hclcolors[1:3], 'grey45')

new.back <- trellis.par.get("background")
new.back$col <- "white"
newcol <- trellis.par.get("superpose.symbol")
newcol$col <- colorpalette
new.pan <- trellis.par.get("strip.background")
new.pan$col <- c('gray90','white')

trellis.par.set("background", new.back)
trellis.par.set("superpose.symbol", newcol)
trellis.par.set("strip.background",new.pan)

# Figure 1 Code
advdat <- data.frame(REAS=c('Completed Study','Adverse Event',
                            'Lack of Efficacy', 'Withdrew Consent', 'Other'),
                     PCT=c(75, 7, 8, 6, 4))
# order() returns the row indices that sort by percentage;
# rank() returns ranks and would not sort the rows when used as an index
rr <- order(advdat$PCT)
mylayout <- layout(matrix(c(1,2,3,4), byrow=TRUE, ncol=4),
                   widths=c(1/9,2/9,1/3, 1/3),
                   heights=c(3/4,3/4,3/4,3/4))

# Panel 1: disposition labels for the dotplot
par(mar=c(5,1,7,0))
plot(x=c(0,1), y=c(1,dim(advdat)[1]), axes=FALSE, ann=FALSE, type='n', xlab='',
     ylab='')
text(x=1, y=1:dim(advdat)[1], advdat$REAS[rr], adj=c(1,NA), cex=.9,
     col='black')

# Panel 2: dotplot
par(mar=c(5,0,7,1))
plot(x=advdat$PCT[rr], y=1:dim(advdat)[1], type='n', pch=16, axes=FALSE,
     xlab='Percent')
abline(h=1:dim(advdat)[1], col='grey80')
box()
points(x=advdat$PCT[rr], y=1:dim(advdat)[1], pch=16, col='black')
axis(1)
mtext('Dotplot', 3, line=1, font=2)
#dotplot(reorder(REAS,PCT) ~ PCT, data=advdat)

# Panel 3: pie chart
plot(1:5,type="n",axes=FALSE)
box()
floating.pie(3,3,advdat$PCT, col=c('red','blue','green4','orange','violet'))
mtext('Pie Chart', 3, line=1, font=2)
legend(x=1, y=5, legend=advdat$REAS, fill=c('red','blue','green4','orange','violet'),
       text.col=c('red','blue','green4','orange','violet'))

# Panel 4: barplot
par(mar=c(5,2,7,1))
barplot(advdat$PCT[rr], names.arg=advdat$REAS[rr], horiz=TRUE, xlab='Percent')
mtext('Barplot', 3, line=1, font=2)

# Figure 2 Code
dathigh <- subset(plotdat, TRT%in%'Xanomeline High Dose')
dotplot(PT ~ PCT, groups=TRT, data=dathigh,
        xlab="Percent Reporting Event",
        pch=16, col='black')

# Figure 3 Code
dotplot(reorder(PT,PCT) ~ PCT, groups=TRT, data=dathigh,
        xlab="Percent Reporting Event",
        pch=16, col='black')

# Figure 4 Code
# SOC for each preferred term, in the row order of dathigh
dathigh$SOC <- c('GASTROINTESTINAL DISORDERS', 'PSYCHIATRIC DISORDERS',
                 'GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS',
                 'CARDIAC DISORDERS','CARDIAC DISORDERS',
                 'MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS',
                 'SKIN AND SUBCUTANEOUS TISSUE DISORDERS',
                 'PSYCHIATRIC DISORDERS',
                 'RESPIRATORY, THORACIC AND MEDIASTINAL DISORDERS',
                 'GASTROINTESTINAL DISORDERS', 'NERVOUS SYSTEM DISORDERS',
                 'INVESTIGATIONS', 'SKIN AND SUBCUTANEOUS TISSUE DISORDERS',
                 'GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS',
                 'NERVOUS SYSTEM DISORDERS', 'SKIN AND SUBCUTANEOUS TISSUE DISORDERS',
                 'CARDIAC DISORDERS', 'RESPIRATORY, THORACIC AND MEDIASTINAL DISORDERS',
                 'INFECTIONS AND INFESTATIONS', 'GASTROINTESTINAL DISORDERS',
                 'GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS',
                 'SKIN AND SUBCUTANEOUS TISSUE DISORDERS',
                 'SKIN AND SUBCUTANEOUS TISSUE DISORDERS','CARDIAC DISORDERS',
                 'SKIN AND SUBCUTANEOUS TISSUE DISORDERS','NERVOUS SYSTEM DISORDERS',
                 'NERVOUS SYSTEM DISORDERS',
                 'INFECTIONS AND INFESTATIONS','GASTROINTESTINAL DISORDERS')

sortmat <- function (Mat, Sort) {
  m <- do.call("order", as.data.frame(Mat[, Sort]))
  Mat[m, ]
} # Used to sort a matrix

# Sort by SOC (column 4), then by percentage (column 3) within SOC
dat2 <- sortmat(dathigh, c(4,3))
tt <- table(dat2$SOC)
plotcol <- rainbow(length(tt))
dat2$color <- rep(plotcol, tt)  # one color per SOC

mylayout <- layout(matrix(c(1,2,3), byrow=TRUE, ncol=3),
                   widths=c(2/7,3/7, 2/7))

# Panel 1: preferred term labels
par(mar=c(5,1,2,0))
plot(x=c(0,1), y=c(1,dim(dat2)[1]), axes=FALSE, ann=FALSE, type='n', xlab='', ylab='')
text(x=1, y=1:dim(dat2)[1], dat2$PT, adj=c(1,NA), cex=.9,
     col=dat2$color)

# Panel 2: dotplot
par(mar=c(5,0,2,2))
plot(x=dat2$PCT, y=1:dim(dat2)[1], type='n', pch=16, axes=FALSE, xlab='Percent Reporting Event')
abline(h=1:dim(dat2)[1], col='grey80')
box()
points(x=dat2$PCT, y=1:dim(dat2)[1], pch=16, col=dat2$color)
axis(1)

# Panel 3: SOC labels
par(mar=c(5,0,2,1))
plot(x=c(0,1), y=c(1,dim(dat2)[1]), axes=FALSE, ann=FALSE, type='n', xlab='', ylab='')
text(x=0, y=1:dim(dat2)[1], dat2$SOC, adj=c(0,NA), cex=.85,
     col=dat2$color)

# Figure 5 Code
dotplot(reorder(PT,PCT) ~ PCT, groups=TRT, data=plotdat,
        xlab="Percent Reporting Event",
        pch=1:3,
        key=list(
          points=list(
            col=trellis.par.get("superpose.symbol")$col[1:3],
            pch=1:3),
          text=list(
            lab=levels(plotdat$TRT),
            col=trellis.par.get('superpose.symbol')$col[1:3]),
          columns=3, title='Treatment'))

# Figure 6 Code
# Simulate Data: randomly split each percentage into related and not related parts
set.seed(133)
# Vectorized form of the original loop; runif() draws one value per row
notrelated <- plotdat$PCT - runif(dim(plotdat)[1], 0, plotdat$PCT)
related <- plotdat$PCT - notrelated
plotdat2 <- data.frame(PT=rep(plotdat$PT, 2),
                       TRT=rep(plotdat$TRT, 2),
                       REL=factor(rep(c('Related','Not Related'), each=dim(plotdat)[1]),
                                  levels=c('Related','Not Related')),
                       PCT=c(related, notrelated))

dotplot(reorder(PT,PCT) ~ PCT|REL, groups=TRT, data=plotdat2,
        xlab="Percent Reporting Event",
        pch=1:3,
        key=list(
          points=list(
            col=trellis.par.get("superpose.symbol")$col[1:3],
            pch=1:3),
          text=list(
            lab=levels(plotdat$TRT),
            col=trellis.par.get('superpose.symbol')$col[1:3]),
          columns=3, title='Treatment'))

%ENDCODE%