Simple Linear Regression
This example shows how to perform simple linear regression using the accidents dataset. The example also shows you how to calculate the coefficient of determination R² to evaluate the regressions. The accidents dataset contains data for fatal traffic accidents in U.S. states.
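Linear regression models the relation between a dependent, or response, variable y and one or more independent, or predictor, variables x1,...,xn. Simple linear regression considers only one independent variable, using the relation
y = β0 + β1x + ϵ,
where β0 is the y-intercept, β1 is the slope (or regression coefficient), and ϵ is the error term. For a set of n observations, this can be written in matrix form as
Y = XB,
where Y is the column vector of observed responses, X is a matrix whose first column is all ones and whose second column holds the observed values of x, and B = [β0; β1].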
From the dataset accidents, load accident data in y and state population data in x. Find the linear regression relation y=β1x between the accidents in a state and the population of a state using the \ operator. The \ operator performs a least-squares regression.
load accidents
x = hwydata(:,14); % Population of states
y = hwydata(:,4);  % Accidents per state
format long
b1 = x\y
b1 = 1.372716735564871e-04
b1 is the slope or regression coefficient. The linear relation is y=β1x=0.0001372x.
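For a single column predictor with no intercept, the least-squares slope has the closed form β1 = (x'y)/(x'x), which is the value the \ operator returns in this case. The following is a minimal sketch on made-up data; xs, ys, bSlash, and bClosed are hypothetical names, and the values are not taken from the accidents dataset.

```matlab
% Hypothetical synthetic data; not taken from the accidents dataset.
xs = [1; 2; 3; 4; 5];
ys = [2.1; 3.9; 6.2; 7.8; 10.1];

% Least-squares slope through the origin via the backslash operator.
bSlash = xs\ys;

% Same slope from the closed-form normal equation (x'y)/(x'x).
bClosed = (xs'*ys)/(xs'*xs);

% The two estimates should agree up to floating-point rounding.
fprintf('backslash: %.6f, closed form: %.6f\n', bSlash, bClosed);
```

The backslash form is usually preferable in practice because it avoids forming x'x explicitly.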
Calculate the accidents per state yCalc1 from x using the relation. Visualize the regression by plotting the actual values y and the calculated values yCalc1.
yCalc1 = b1*x;
scatter(x,y)
hold on
plot(x,yCalc1)
xlabel('Population of state')
ylabel('Fatal traffic accidents per state')
title('Linear Regression Relation Between Accidents & Population')
grid on
fig2plotly()
Improve the fit by including a y-intercept β0 in your model as y=β0+β1x. Calculate β0 by padding x with a column of ones and using the \ operator.
X = [ones(length(x),1) x];
b = X\y
b = 2×1
10² ×
   1.427120171726538
   0.000001256394274
This result represents the relation y=β0+β1x=142.7120+0.0001256x.
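As a cross-check on the padding approach, a first-degree polyfit should give the same intercept and slope, returned in the opposite order. This is a sketch on made-up data, assuming only base MATLAB; xs, ys, Xs, bs, and p are hypothetical names and the values are not from the accidents dataset.

```matlab
% Hypothetical synthetic data; not taken from the accidents dataset.
xs = [1; 2; 3; 4; 5];
ys = [2.1; 3.9; 6.2; 7.8; 10.1];

% Design matrix: a column of ones for the intercept, then the predictor.
Xs = [ones(length(xs),1) xs];
bs = Xs\ys;            % bs(1) is the intercept, bs(2) is the slope

% polyfit returns coefficients in descending powers: [slope intercept].
p = polyfit(xs,ys,1);

% Compare the two estimates; both differences should be near zero.
disp([bs(1) - p(2), bs(2) - p(1)])
```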
Visualize the relation by plotting it on the same figure.
yCalc2 = X*b;
plot(x,yCalc2,'--')
legend('Data','Slope','Slope & Intercept','Location','best');
fig2plotly()
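The introduction mentions the coefficient of determination R² as a way to evaluate a fit. One common definition is R² = 1 - SSres/SStot, where SSres is the sum of squared residuals and SStot is the sum of squared deviations of y from its mean. The following is a minimal sketch of that formula on made-up data; all variable names here are hypothetical and the values are not from the accidents dataset.

```matlab
% Hypothetical synthetic data; not taken from the accidents dataset.
xs = [1; 2; 3; 4; 5];
ys = [2.1; 3.9; 6.2; 7.8; 10.1];

% Fit with an intercept, as in the slope-and-intercept model.
Xs = [ones(length(xs),1) xs];
ysCalc = Xs*(Xs\ys);

% R^2 = 1 - (residual sum of squares)/(total sum of squares).
ssRes = sum((ys - ysCalc).^2);
ssTot = sum((ys - mean(ys)).^2);
Rsq = 1 - ssRes/ssTot
```

Applying the same three lines to y with yCalc1 and with yCalc2 would give one R² value per model, so the two fits can be compared.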