# Empirical Cumulative Distribution Plots in MATLAB®

How to make Empirical Cumulative Distribution Plots in MATLAB® with Plotly.

## Compute Empirical Cumulative Distribution Function

Compute the Kaplan-Meier estimate of the cumulative distribution function (cdf) for simulated survival data.

Generate 15 survival times from a Weibull distribution with scale parameter 3 and shape parameter 1.

rng('default')  % for reproducibility
failuretime = random('wbl',3,1,15,1);


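Compute the Kaplan-Meier estimate of the cdf for the survival data. ecdf returns the estimated cdf values f and the points x at which they are evaluated; display them side by side.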

rng('default')  % for reproducibility
failuretime = random('wbl',3,1,15,1);

[f,x] = ecdf(failuretime);
[f,x]

ans =

0    0.0895
0.0667    0.0895
0.1333    0.1072
0.2000    0.1303
0.2667    0.1313
0.3333    0.2718
0.4000    0.2968
0.4667    0.6147
0.5333    0.6684
0.6000    1.3749
0.6667    1.8106
0.7333    2.1685
0.8000    3.8350
0.8667    5.5428
0.9333    6.1910
1.0000    6.9825
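
With no censoring, the Kaplan-Meier estimate reduces to the ordinary empirical cdf: the fraction of observations less than or equal to each distinct value. The following is a minimal sketch of that calculation in base MATLAB, not a description of how ecdf is implemented. The sample values and the names sample, xManual, and fManual are hypothetical and chosen only for illustration.

```matlab
% Hypothetical sample (not the Weibull data above); one tie is included on purpose.
sample = [2.1; 0.4; 3.3; 0.4; 1.7];

n = numel(sample);
[xs, ~, idx] = unique(sample);     % sorted distinct values and group index per observation
counts = accumarray(idx, 1);       % number of observations at each distinct value
fManual = cumsum(counts) / n;      % fraction of observations <= each distinct value

% The output shown above starts with f = 0 at the smallest value; mirror that layout.
xManual = [xs(1); xs];
fManual = [0; fManual];

disp([fManual, xManual])
```

The first row repeats the smallest observation with f = 0, which matches the layout of the [f,x] output above, where 0.0895 appears twice.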


Plot the estimated cdf.

rng('default')  % for reproducibility
failuretime = random('wbl',3,1,15,1);

[f,x] = ecdf(failuretime);

ecdf(failuretime)

fig2plotly(gcf);


## Empirical Hazard Function of Right-Censored Data

Compute and plot the cumulative hazard function of simulated right-censored survival data.

Generate 100 failure times from a Birnbaum-Saunders distribution with parameters 0.3 and 1.

rng('default')  % For reproducibility
failuretime = random('birnbaumsaunders',0.3,1,100,1);


Assume that the study ends at time 0.9. Create a logical vector that marks every simulated failure time greater than 0.9 as censored.

rng('default')  % For reproducibility
failuretime = random('birnbaumsaunders',0.3,1,100,1);

T = 0.9;
cens = (failuretime>T);
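
The censoring flag can be checked on a handful of values. This is a small sketch with hypothetical times (the vector times and the names isCensored and observed are not part of the original example). The observed vector is an added assumption: it shows what would typically be recorded in a study that stops at T, whereas the example above passes the simulated times to ecdf unchanged.

```matlab
% Hypothetical failure times, chosen only for illustration.
times = [0.45; 1.20; 0.88; 0.95; 0.60];
T = 0.9;                          % assumed end of study, as in the example above

isCensored = times > T;           % true where the failure would occur after the study ends
observed = min(times, T);         % assumption: the time that would be recorded by the end of study

fprintf('%d of %d observations are censored\n', sum(isCensored), numel(isCensored));
```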


Plot the empirical cumulative hazard function for the data, with confidence bounds.

rng('default')  % For reproducibility
failuretime = random('birnbaumsaunders',0.3,1,100,1);

T = 0.9;
cens = (failuretime>T);

ecdf(failuretime,'Function','cumulative hazard', ...
'Censoring',cens,'Bounds','on');

fig2plotly(gcf);
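
For reference, the cumulative hazard plotted here is related to the survivor and distribution functions by the standard theoretical identity below. The symbols h, H, S, and F are notation introduced for this note; the identity describes the population quantities and makes no claim about which estimator ecdf applies to the censored sample.

```latex
H(t) = \int_0^t h(u)\,du = -\log S(t), \qquad S(t) = 1 - F(t)
```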


## Compare Empirical Cumulative Distribution Function (CDF) with Known CDF

Generate right-censored survival data and compare the empirical cumulative distribution function (cdf) with the known cdf.

Generate failure times from an exponential distribution with mean failure time of 15.

rng('default')  % For reproducibility
y = exprnd(15,75,1);


Generate drop-out times from an exponential distribution with mean drop-out time of 30.

rng('default')  % For reproducibility
y = exprnd(15,75,1);

d = exprnd(30,75,1);


Compute the observed times. Each observed time is the minimum of the generated failure time and the drop-out time.

rng('default')  % For reproducibility
y = exprnd(15,75,1);

d = exprnd(30,75,1);

t = min(y,d);


Create a logical array that indicates generated failure times that are larger than the drop-out times. The data for which this is true are censored.

rng('default')  % For reproducibility
y = exprnd(15,75,1);

d = exprnd(30,75,1);

t = min(y,d);

censored = (y>d);
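
A few hand-picked values make the two steps above easier to follow. This is only an illustrative sketch: the five subjects and the names yDemo, dDemo, tDemo, and censDemo are hypothetical and are not part of the simulated data.

```matlab
% Hypothetical failure and drop-out times for five subjects (illustration only).
yDemo = [12; 40;  7; 25;  3];     % time at which each subject would fail
dDemo = [30; 18; 22; 25; 10];     % time at which each subject would drop out

tDemo = min(yDemo, dDemo);        % observed time: whichever event comes first
censDemo = yDemo > dDemo;         % censored when drop-out happens before failure

disp(table(yDemo, dDemo, tDemo, censDemo))
```

Only the second subject is censored, because its drop-out time (18) precedes its failure time (40). The fourth subject, with equal times, counts as an observed failure under the strict inequality.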


Compute the empirical cdf f at the points x, together with its lower and upper confidence bounds flo and fup. The default confidence level is 95%.

rng('default')  % For reproducibility
y = exprnd(15,75,1);

d = exprnd(30,75,1);

t = min(y,d);

censored = (y>d);

[f,x,flo,fup] = ecdf(t,'Censoring',censored);
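
To show what a Kaplan-Meier (product-limit) estimate does with censored observations, here is a hand-rolled sketch in base MATLAB. It is a textbook version of the estimator under stated assumptions, not ecdf's implementation, and it omits the confidence bounds. The data and the names tObs, isCens, eventTimes, S, and F are hypothetical.

```matlab
% Hypothetical observed times and censoring flags (true = censored).
tObs   = [3; 5; 5; 8; 10; 12];
isCens = [false; false; true; false; true; false];

eventTimes = unique(tObs(~isCens));      % distinct times with at least one failure
S = ones(size(eventTimes));              % survivor estimate at each event time
surv = 1;
for k = 1:numel(eventTimes)
    atRisk = sum(tObs >= eventTimes(k));              % subjects still under observation
    deaths = sum(tObs == eventTimes(k) & ~isCens);    % failures at this time
    surv = surv * (1 - deaths / atRisk);              % product-limit update
    S(k) = surv;
end
F = 1 - S;                               % cdf estimate is one minus the survivor estimate

disp([eventTimes, F])
```

Censored subjects never trigger a step in the estimate, but they stay in the at-risk count until their censoring time, which is why they still influence the size of later steps.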


Plot the cdf and confidence bounds.

rng('default')  % For reproducibility
y = exprnd(15,75,1);

d = exprnd(30,75,1);

t = min(y,d);

censored = (y>d);

[f,x,flo,fup] = ecdf(t,'Censoring',censored);

figure()
ecdf(t,'Censoring',censored,'Bounds','on');
hold on

fig2plotly(gcf);


Superimpose a plot of the known population cdf.

rng('default')  % For reproducibility
y = exprnd(15,75,1);

d = exprnd(30,75,1);

t = min(y,d);

censored = (y>d);

[f,x,flo,fup] = ecdf(t,'Censoring',censored);

figure()
ecdf(t,'Censoring',censored,'Bounds','on');
hold on

xx = 0:.1:max(t);
yy = 1-exp(-xx/15);
plot(xx,yy,'g-','LineWidth',2)
axis([0 50 0 1])
legend('Empirical','LCB','UCB','Population', ...
'Location','southeast')
hold off

fig2plotly(gcf);
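
The population curve uses the closed-form exponential cdf, 1-exp(-x/15), for a mean of 15. As a quick sanity check, the sketch below compares that closed form with a numerical integral of the exponential density, using base MATLAB only. The names mu, pdfFun, xq, closedForm, and numeric are introduced here for illustration.

```matlab
mu = 15;                                   % mean failure time used in the example above
pdfFun = @(u) exp(-u ./ mu) ./ mu;         % exponential density with mean mu

xq = [5 15 30];                            % a few arbitrary evaluation points
closedForm = 1 - exp(-xq ./ mu);           % same expression as yy in the plot code
numeric = arrayfun(@(b) integral(pdfFun, 0, b), xq);   % integrate the density from 0 to each point

disp(max(abs(closedForm - numeric)))       % should be close to zero
```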


## Empirical Survivor Function with Confidence Bounds

Generate survival data and plot the empirical survivor function with 99% confidence bounds.

Generate 100 lifetimes from a Weibull distribution with scale parameter 100 and shape parameter 2.

rng('default')  % For reproducibility
R = wblrnd(100,2,100,1);


Plot the survivor function for the data with 99% confidence bounds.

rng('default')  % For reproducibility
R = wblrnd(100,2,100,1);

ecdf(R,'Function','survivor','Alpha',0.01,'Bounds','on')
hold on

fig2plotly(gcf);


Superimpose the survivor function of the known Weibull distribution (scale 100, shape 2) on the empirical plot.

rng('default')  % For reproducibility
R = wblrnd(100,2,100,1);

ecdf(R,'Function','survivor','Alpha',0.01,'Bounds','on')
hold on

x = 1:1:250;
wblsurv = 1-cdf('weibull',x,100,2);
plot(x,wblsurv,'g-','LineWidth',2)
legend('Empirical','LCB','UCB','Population', ...
'Location','northeast')
hold off

fig2plotly(gcf);


The survivor function based on the actual distribution is within the confidence bounds.
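
The population curve can also be written in closed form. For a Weibull distribution with scale a and shape b, the survivor function is exp(-(x/a)^b), which equals the expression 1-cdf('weibull',x,100,2) used above. The following sketch evaluates it with base MATLAB only; the names a, b, xGrid, and survClosed are introduced here for illustration.

```matlab
a = 100;                                  % scale parameter from the example above
b = 2;                                    % shape parameter from the example above

xGrid = 1:250;                            % same grid as x in the plot code
survClosed = exp(-(xGrid ./ a) .^ b);     % closed-form Weibull survivor function

% At x equal to the scale parameter, the survivor function is exp(-1) for any shape.
disp(survClosed(xGrid == a))
```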