Microsoft Office Tutorials and References

In Depth Information

**Chapter 12: Statistical Regression**

CHAPTER
12

Statistical Regression

Statistical regression is the relation between selected values of x and observed

as that is beyond

the scope of this topic. My purpose is to explain how to use Excel to calculate the

parameters of a regression line. In the Excel file Chapter 12 under the sheet named

Regression, you will find a set of data that will be used to demonstrate this feature in

Excel. The data (sales over time) is going to create an almost perfectly straight line

when you draw it on XY coordinates.

values of y. I am not going to explain or define

“

regression

”

Month

3

4

5

6

7

8

9

10

11

12

Sales

9

15

21

24

32

35

39

45

48

54

USING THE SCATTER CHART IN EXCEL

You can use the chart option of Excel to reveal or demonstrate the relationship

between the two data sets. We assume, in this example, that the Sales is our Y var-

iable and that the Month is our X variable. To create a chart, select the Month and

Sales data B1:C11 (including the header) and activate the Chart Wizard. When you

select, under the Insert ribbon, the very first scatter chart as shown in Figure 12.1, it

will result in the XY scatter chart.

To better visualize, you can add a regression line to the chart. Right-click on any

of the points in the chart and select Add Trend line in the small drop-down menu.

Check the last two boxes on the Format Trend line menu to display the trend line

s

equation and R
2
(R-squared) value. See Figure 12.2 for the menu and the results of

programming the line

’

s equation and R
2

’

(R-squared) value. The equation of the

regression line is reversed.

The regression line is y

4.8727x.

The slope is 4.8727 and the intercept is

¼

4.3455

þ

4.3455. R
2
is 0.9945.

This Add Trend line feature of Excel allows you to find linear and other rela-

tionships between data sets without using other calculations and functions. Once you

find a relationship you can, at a single glance, analyze whether the data and the

relationship between the variables is meaningful or not. I will further demonstrate in

other examples how understanding these relationships can be an advantage.

What do the results mean? The intercept is

4.35. Theoretically, when x was 0

(at time zero) sales were

4.35. It does not make sense; however, it does work for the

equation of the range of x between 3 and 12.