
A Forecast of Non-Farm Employment in the United States

4th Quarter 2013 to 3rd Quarter 2015

By
Marius Mihai & Chaoyun Liu
University of New Orleans

December 2013

Acknowledgments
This report was prepared by Marius M. Mihai and Chaoyun Liu, Graduate Research Assistants at the
University of New Orleans (UNO), as a final project for the Time Series Analysis class. Thanks and recognition are
given to Dr. Tumulesh Solanky for his valuable lessons, which allowed us to complete this analysis.

Introduction
Total non-farm payroll employment in the United States (US) is one of the most important leading economic
indicators reported by the Bureau of Labor Statistics (BLS) on a monthly basis. Monitoring this time series is
imperative when assessing the strength of the national economy. Forecasting such a series is also crucial,
because it provides a good overview of how the job market will perform in future time periods. This
information is very important not only to the general public, but also to all decision makers in a business
environment.
For simplicity, throughout the report we use "US non-farm payroll employment" and "US employment"
interchangeably. The selected data are seasonally adjusted and go back to 1980. The analysis and forecast
are done on a quarterly basis. A total of 135 data points were collected.

A. Preliminary Analysis
1. Time Series Plot
Total non-farm employment in the United States grew from 1980 to 2013. The time series plot shows a clear
positive trend with four significant downturns, which correspond to major recessions in the United States.
The first major drop in economic activity visible in the plot was the early 1980s recession, which stemmed
from the 1979 energy crisis. That crisis originated from the Iranian Revolution and led to sharp increases in
oil prices around the world. Another drop in the number of US jobs occurred in the early 1990s, which also
coincided with a period of economic turmoil nationwide. The third and fourth declines in the number of jobs
at the national level resulted from the recession in the early 2000s and the Great Recession of 2009.

[Figure: Quarterly US NonFarm Employment 1980-2013. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands).]

2. Autocorrelation Plot and Autocorrelation Check for White Noise
Before modeling the data, we have to verify that our figures indeed follow a time series pattern. The
autocorrelation function (ACF) plot is a decaying function, but it decays too slowly for the series to be
considered stationary. Therefore, the data in its original form is non-stationary. The results of the
autocorrelation check for white noise are presented in the table below. The p-values are small, so non-zero
autocorrelations are present in the data. The time series structure assumption is verified, and US non-farm
employment can be modeled as a time series.

Autocorrelation Check for White Noise

To Lag   Chi-Square   DF   Pr > Chi-Square
6          739.63      6   <0.0001
12       1,323.63     12   <0.0001
18       1,754.74     18   <0.0001
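The chi-square values in a white-noise check of this kind are usually Ljung-Box statistics. The report does not say which statistic its software printed, so the following is only a minimal sketch under that assumption. The function names and the toy trending series are hypothetical and are not the BLS data.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(x)
    r = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k)
                             for k, rk in enumerate(r, start=1))


# Hypothetical toy series with a strong upward trend (illustration only).
trend = [100.0 + 2.0 * t for t in range(40)]
q6 = ljung_box_q(trend, 6)

# Under the white-noise hypothesis Q is approximately chi-square with
# 6 degrees of freedom; the 5% critical value is about 12.59.
print("reject white noise:", q6 > 12.59)
```

A strongly trending series such as the toy one produces a very large Q, which is the same qualitative pattern as the small p-values in the table.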

3. Stationarity and First Difference

The first-difference plot shows the transformed data. After taking the first difference, the data appears to
be mean-stationary. The variance-stationary condition is also met, except for the troughs in the early 1980s,
the early 1990s, the early 2000s, and 2009. All of these low points correspond to the recessions discussed at
the beginning of this section.

The ACF plot also confirms the stationary pattern. Again, the sudden declines in the number of US non-farm
jobs can be attributed to the US recessions discussed above. The autocorrelation check for white noise has
small p-values, so the differenced data can be modeled as a time series process.

[Figure: First-Difference US NonFarm Employment 1980-2013. x-axis: Date; y-axis: change in US non-farm employment, -3000 to 2000 (in thousands).]

Autocorrelation Check for White Noise

To Lag   Chi-Square   DF   Pr > Chi-Square
6        242.33        6   <0.0001
12       259.02       12   <0.0001
18       290.40       18   <0.0001
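The first-difference transformation used in this step can be sketched in a few lines. The function name and the sample levels below are hypothetical illustrations, not BLS figures.

```python
def first_difference(levels):
    """Return w_t = y_t - y_(t-1); the result has one fewer point than the input."""
    return [curr - prev for prev, curr in zip(levels, levels[1:])]


# Hypothetical quarterly levels in thousands of jobs (illustration only).
levels = [90000, 90500, 90300, 91100]
changes = first_difference(levels)
print(changes)  # [500, -200, 800]
```

Each differenced value is the quarter-over-quarter change in jobs, so a series of 135 levels yields 134 changes.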

B. Model Identification

The original data contains 135 data points, ending in the third quarter of 2013. For model identification we
used only the first 121 of these data points, so the cutoff is the first quarter of 2010.

From there, we plotted the time series and then its first difference. The differenced data looks stationary,
so no other differencing or transformation was applied. The ACF and PACF (partial autocorrelation) plots
accompany the time series plot. While the ACF is a declining function, the PACF appears significant at lags
1, 2, 3, 4, 5, 7, 11 and 20, with the first four lags being the most noticeable.

After identifying the important spikes in the PACF plot, various models were fitted in order to identify the
underlying process applicable to our data. For each attempted model, a graph of the forecasted values over
the actual values is plotted. In addition, the AIC, SBC and standard deviation of the errors are compared in
order to identify the best model. Last but not least, the autocorrelation check of residuals is presented in
order to confirm the validity of each model.

We identified six time series models that represented a good fit for our data. At the end of the section,
these are compared and one is selected as the underlying process for the US non-farm employment data.

[Figure: Quarterly US NonFarm Employment 1980-2010. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands).]
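The PACF spikes used for identification can be obtained from the autocorrelations with the Durbin-Levinson recursion. The sketch below is one standard way to do this and is not necessarily how the report's software computes it. The function names are hypothetical, and the 1.96/sqrt(n) band is the usual large-sample approximation.

```python
import math


def pacf_from_acf(r):
    """Durbin-Levinson recursion.

    r holds autocorrelations r_1..r_m; the result holds the partial
    autocorrelations phi_11..phi_mm.
    """
    pacf = []
    prev = []  # coefficients phi_(k-1),1 .. phi_(k-1),(k-1)
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            curr = [phi_kk]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            curr = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            curr.append(phi_kk)
        pacf.append(phi_kk)
        prev = curr
    return pacf


def significant_lags(pacf, n):
    """Lags whose partial autocorrelation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(n)
    return [lag for lag, value in enumerate(pacf, start=1) if abs(value) > bound]


# Theoretical ACF of an AR(1) process with coefficient 0.5: r_k = 0.5 ** k.
acf_ar1 = [0.5 ** k for k in range(1, 6)]
pacf_ar1 = pacf_from_acf(acf_ar1)
print(significant_lags(pacf_ar1, 120))  # [1]
```

For a pure AR(1) process the PACF cuts off after lag 1, which is why isolated PACF spikes are read as candidate autoregressive lags.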

1. AR non-zero parameters at lags 1, 2 and 11

Conditional Least Squares Estimation

Parameter   Estimate    Std. Error   t Value   Pr > |t|   Lag
MU          216.85      132.03        1.64     0.1032     0
AR1,1       1.12847     0.08994      12.55     <0.0001    1
AR1,2       -0.26920    0.09000      -2.99     0.0034     2
AR1,3       -0.05109    0.05580      -0.92     0.3617     11

AIC = 1,691.964
SBC = 1,703.014
Std. Deviation (Error) = 332.32

The first model identified was an autoregressive model that relates the present value to its values one,
two, and eleven quarters earlier. The time series plot shows a good fit of the forecasted values. Although
the predicted values are a little lower than the actuals, this model is nonetheless a strong candidate in
our selection process.

Other statistics also confirm the validity of this model. Of all the parameters, two are highly
significant: the one at lag 1 and the one at lag 2. The residuals show no time series pattern.

Autocorrelation Check of Residuals

To Lag   Chi-Square   DF   Pr > Chi-Square
6        5.53          3   0.1367
12       6.98          9   0.6391
18       9.74         15   0.8358

[Figure: Quarterly US NonFarm Employment 1980-2013. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands); series: US_employment, Forecast for US_employment2.]
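The t values in the estimation table for the first model are the estimates divided by their standard errors. The short check below recomputes them from the reported figures. The cutoff of 2.0 is only a rule of thumb for a two-sided 5% test with this many observations, and the variable names are hypothetical.

```python
# (estimate, standard error) pairs copied from the table for the first model.
model_1 = {
    "MU": (216.85, 132.03),
    "AR1,1": (1.12847, 0.08994),
    "AR1,2": (-0.26920, 0.09000),
    "AR1,3": (-0.05109, 0.05580),
}

# t value = estimate / standard error, rounded as in the report.
t_values = {name: round(est / se, 2) for name, (est, se) in model_1.items()}

# Rule-of-thumb screen: |t| > 2 is roughly significant at the 5% level.
significant = [name for name, t in t_values.items() if abs(t) > 2.0]

print(t_values)
print(significant)
```

The two parameters that pass the screen are the ones at lags 1 and 2, in line with the discussion of this model.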

2. AR non-zero parameters at lags 1 and 4

Conditional Least Squares Estimation

Parameter   Estimate    Std. Error   t Value   Pr > |t|   Lag
MU          165.299     134.994       1.22     0.2232     0
AR1,1       0.989       0.05055      19.56     <0.0001    1
AR1,2       -0.16310    0.05496      -2.97     0.0036     4

AIC = 1,690.613
SBC = 1,698.975
Std. Deviation (Error) = 665.271

The second model identified was an autoregressive model that relates the current value to its previous
value and its value four quarters earlier. Again, the time series plot shows a relatively good fit of the
forecasted values. However, towards the end of the forecast, the predicted values level off while the actual
values continue on an upward trend. Compared to the previous model, this one has a slightly lower AIC and
SBC, and both autoregressive terms are significant. However, the standard deviation of the errors is much
higher than in the previous model. Thus, the predicted values for this second time series structure are not
as accurate. As above, the residuals show no time series pattern.

Autocorrelation Check of Residuals

To Lag   Chi-Square   DF   Pr > Chi-Square
6        5.66          4   0.2263
12       7.5          10   0.6776
18       10.3         16   0.8508

[Figure: Quarterly US NonFarm Employment 1980-2013. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands); series: US_employment, Forecast for US_employment2.]

3. AR non-zero parameters at lags 1 and 7

Conditional Least Squares Estimation

Parameter   Estimate    Std. Error   t Value   Pr > |t|   Lag
MU          218.67      134.896       1.62     0.1077     0
AR1,1       0.90776     0.04293      21.15     <0.0001    1
AR1,2       -0.09385    0.05672      -1.65     0.1007     7

AIC = 1,697.128
SBC = 1,705.49
Std. Deviation (Error) = 181.33

The third model attempted was an autoregressive model that relates the current value to its previous value
and its value seven quarters earlier. In the time series plot, the forecasted values follow the actual
numbers very accurately. As a result, the standard deviation of the errors is significantly lower than in
the previous models. Although the AIC and SBC are slightly higher, this AR model represents the best fit so
far. The lag 1 term is highly significant, while the lag 7 term is only marginally significant, at roughly
the 10% level (p = 0.1007). The autocorrelation check of residuals indicates that there might be some
significant correlations up to lag 6 (p = 0.0341); beyond that, the residuals show no time series pattern.

Autocorrelation Check of Residuals

To Lag   Chi-Square   DF   Pr > Chi-Square
6        13.63         6   0.0341
12       15.41        12   0.2195
18       17.46        18   0.4919

[Figure: Quarterly US NonFarm Employment 1980-2013. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands); series: US_employment, Forecast for US_employment2.]
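Written out, the AR model with lags 1 and 7 might take the form below. This is a sketch under two assumptions that the report does not state explicitly: that the model is fitted to the first-differenced series, and that the reported AR estimates follow the sign convention $(1 - \phi_1 B - \phi_7 B^7)$. The symbols $Y_t$, $W_t$, $\mu$, $\phi_1$, $\phi_7$ and $a_t$ are notation introduced here for illustration.

```latex
W_t = Y_t - Y_{t-1}, \qquad
W_t - \mu = \phi_1 \,(W_{t-1} - \mu) + \phi_7 \,(W_{t-7} - \mu) + a_t,
\qquad
\hat{\mu} = 218.67, \quad \hat{\phi}_1 = 0.90776, \quad \hat{\phi}_7 = -0.09385
```

Here $Y_t$ is the employment level in quarter $t$, $W_t$ is its quarterly change, and $a_t$ is the error term. Under this reading, MU is the average quarterly change in employment.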

4. AR non-zero parameters at lags 1, 2 and 7

Conditional Least Squares Estimation

Parameter   Estimate    Std. Error   t Value   Pr > |t|   Lag
MU          188.78246   140.16943     1.35     0.1807     0
AR1,1       1.12788     0.09041      12.48     <.0001     1
AR1,2       -0.25276    0.09294      -2.72     0.0075     2
AR1,3       -0.04967    0.05665      -0.88     0.3824     7

AIC = 1,691.721
SBC = 1,702.871
Std. Deviation (Error) = 274.107

This model is an autoregressive model that relates the present value to its values one, two, and seven
quarters earlier. The forecast fits the data very well at first, but then the forecast line falls below the
actual values, as shown in the graph: the model predicts a slower increase in employment than actually
occurred. The conditional least squares estimation shows that the parameters at lags 1 and 2 are
significant. The AIC and SBC of this model are 1,692 and 1,703, and the standard deviation of the error is
274.

Autocorrelation Check of Residuals

To Lag   Chi-Square   DF   Pr > Chi-Square
6        5.17          3   0.1594
12       6.91          9   0.6465
18       9.54         15   0.8476

[Figure: Quarterly US NonFarm Employment 1980-2013. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands); series: US_employment, Forecast for US_employment2.]

5. AR non-zero parameters at lags 1, 5 and 7

Conditional Least Squares Estimation

Parameter   Estimate    Std. Error   t Value   Pr > |t|   Lag
MU          205.89901   136.14991     1.51     0.1332     0
AR1,1       0.93087     0.04723      19.71     <.0001     1
AR1,2       -0.08619    0.07806      -1.10     0.2718     5
AR1,3       -0.03007    0.07863      -0.38     0.7029     7

AIC = 1,697.878
SBC = 1,709.028
Std. Deviation (Error) = 281.2292

This model is an autoregressive model that relates the present value to its values one, five, and seven
quarters earlier. The forecast provides a good fit, as in the previous models. However, the forecast line
rises slightly above the actuals and then moves back down. The conditional least squares estimation shows
that only the parameter at lag 1 is significant. The AIC and SBC of this model are 1,698 and 1,709, and the
standard deviation of the error is 281.

Autocorrelation Check of Residuals

To Lag   Chi-Square   DF   Pr > Chi-Square
6        10.22         3   0.0168
12       11.60         9   0.2367
18       14.17        15   0.5127

[Figure: Quarterly US NonFarm Employment 1980-2013. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands); series: US_employment, Forecast for US_employment2.]

6. AR non-zero parameters at lags 1, 3, 7 and 11

Conditional Least Squares Estimation

Parameter   Estimate    Std. Error   t Value   Pr > |t|   Lag
MU          220.25644   124.89119     1.76     0.0805     0
AR1,1       1.01007     0.06207      16.27     <.0001     1
AR1,2       -0.16063    0.06788      -2.37     0.0196     3
AR1,3       -0.02278    0.06589      -0.35     0.7302     7
AR1,4       -0.03926    0.06165      -0.64     0.5255     11

AIC = 1,695.23
SBC = 1,709.168
Std. Deviation (Error) = 277.0325

This model is an autoregressive model that relates the present value to its values one, three, seven, and
eleven quarters earlier. The forecast made by this model fits the data very well. The conditional least
squares estimation shows that the parameters at lags 1 and 3 are significant. The AIC and SBC of this model
are 1,695 and 1,709, and the standard deviation of the error is 277.

Autocorrelation Check of Residuals

To Lag   Chi-Square   DF   Pr > Chi-Square
6        6.60          2   0.0369
12       8.60          8   0.3773
18       11.24        14   0.6669
24       14.90        20   0.7821

[Figure: Quarterly US NonFarm Employment 1980-2013. x-axis: Date; y-axis: number of jobs, 80000 to 140000 (in thousands); series: US_employment, Forecast for US_employment2.]

C. Model Selection
AIC, SBC and standard deviation were plotted and compared in order to make the best model selection. As can
be seen, the AICs and SBCs were not very different, which made it hard to decide based solely on those two
criteria. Thus, more weight was given to the standard deviation of the errors. AR 1, 7 was selected over the
other models because it had the minimum deviation from the actual values.

[Figure: AIC, SBC, St. Deviation Comparison (bar chart). Values as reported for each model in Section B:]

Model         AIC         SBC         St. Dev.
AR 1,2,11     1,691.964   1,703.014   332
AR 1,4        1,690.613   1,698.975   665
AR 1,7        1,697.128   1,705.49    181
AR 1,2,7      1,691.721   1,702.871   274
AR 1,5,7      1,697.878   1,709.028   281
AR 1,3,7,11   1,695.23    1,709.168   277

AR 1, 7 was then applied to all data points, and the following results were obtained:

Conditional Least Squares Estimation

Parameter   Estimate   Std. Error   t Value   Pr > |t|   Lag
MU          213.22     130.59        1.63     0.1049     0
AR1,1       0.89       0.04         20.87     <.0001     1
AR1,2       -0.07      0.04         -1.56     0.1211     7

From the parameter estimates, it appears that only the lag 1 parameter was significant. Although the
parameter at lag 7 was not significant, its p-value (0.1211) was not extraordinarily high, so the evidence
against it was not strong. The autocorrelation check for residuals was also good: the p-values were all
above 0.05 and did not indicate any time series pattern in the residuals.

Autocorrelation Check for Residuals

To Lag   Chi-Square   DF   Pr > Chi-Square
6        11.79         6   0.0669
12       15.25        12   0.2280
18       17.30        18   0.5025
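The comparison behind the model selection can be reproduced from the statistics reported for the six candidate models. The sketch below simply ranks them on each criterion. The dictionary layout and helper name are hypothetical, and the numbers are copied from the model summaries in Section B.

```python
# AIC, SBC and error standard deviation as reported for each candidate model.
models = {
    "AR 1,2,11":   {"aic": 1691.964, "sbc": 1703.014, "sd": 332.32},
    "AR 1,4":      {"aic": 1690.613, "sbc": 1698.975, "sd": 665.271},
    "AR 1,7":      {"aic": 1697.128, "sbc": 1705.49,  "sd": 181.33},
    "AR 1,2,7":    {"aic": 1691.721, "sbc": 1702.871, "sd": 274.107},
    "AR 1,5,7":    {"aic": 1697.878, "sbc": 1709.028, "sd": 281.2292},
    "AR 1,3,7,11": {"aic": 1695.23,  "sbc": 1709.168, "sd": 277.0325},
}


def best_by(criterion):
    """Name of the model with the smallest value of the given criterion."""
    return min(models, key=lambda name: models[name][criterion])


aic_values = [stats["aic"] for stats in models.values()]
aic_spread = max(aic_values) - min(aic_values)

print("lowest AIC:", best_by("aic"))
print("lowest SBC:", best_by("sbc"))
print("lowest std. deviation:", best_by("sd"))
print("AIC spread across models:", round(aic_spread, 3))
```

The AIC values differ by only about 7 points across the six models, while the standard deviations range from 181 to 665. This is why the standard deviation carried more weight in the choice of AR 1, 7.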

[Figure: Quarterly Forecast US NonFarm Employment 2013 Q4-2015 Q3. x-axis: Date (01JAN1980 to 01JAN2020); y-axis: number of jobs, 80000 to 150000 (in thousands); series: US_employment, Forecast for US_employment, Upper 95% Confidence Limit, Lower 95% Confidence Limit.]

The forecast plot for the next 8 quarters is shown above, along with its 95% confidence limits. According to
our model, US non-farm employment will continue to increase over the next two years. In addition, the 95%
confidence limits were assumed to represent various scenarios in the US economy. Thus, the upper 95% limit
represents the number of non-farm jobs amid a potential boom in the national economy, while the lower 95%
limit shows the number of jobs in case another recession happens. Otherwise, with everything else assumed
to be equal, US non-farm employment over the next two years should follow the optimal forecast represented
by the red line in the plot. The table below presents the forecast, upper 95% and lower 95% values generated
by our model.
Date      Forecast (million)    Lower 95 (million)    Upper 95 (million)
          Optimal conditions    Economic Recession    Economic Boom
2013 Q4   136.6                 136.0                 137.1
2014 Q1   136.9                 135.7                 138.1
2014 Q2   137.3                 135.4                 139.2
2014 Q3   137.6                 134.9                 140.3
2014 Q4   137.9                 134.3                 141.4
2015 Q1   138.1                 133.8                 142.4
2015 Q2   138.3                 133.1                 143.5
2015 Q3   138.5                 132.5                 144.5
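The widening of the 95% limits with the forecast horizon follows from the psi weights of the model. The sketch below assumes the AR 1, 7 model is applied to first differences and that the limits are computed in the standard way, as forecast plus or minus 1.96 times the forecast standard error. It uses the rounded estimates 0.89 and -0.07 from the full-sample fit. The error standard deviation of that fit is not reported, so only the multipliers of it are computed, and the function names are hypothetical.

```python
import math


def expand_with_difference(ar):
    """Coefficients c_k of (1 - sum c_k B^k) = (1 - sum phi_k B^k)(1 - B).

    ar maps lag -> phi_k.
    """
    c = {1: 1.0}  # contribution of the (1 - B) differencing factor
    for lag, phi in ar.items():
        c[lag] = c.get(lag, 0.0) + phi
        c[lag + 1] = c.get(lag + 1, 0.0) - phi
    return c


def psi_weights(c, count):
    """First `count` psi weights: psi_0 = 1, psi_j = sum_k c_k * psi_(j-k)."""
    psi = [1.0]
    for j in range(1, count):
        psi.append(sum(coef * psi[j - lag]
                       for lag, coef in c.items() if lag <= j))
    return psi


def se_multipliers(psi):
    """sqrt(psi_0^2 + ... + psi_(h-1)^2) for horizons h = 1, 2, ..."""
    total = 0.0
    out = []
    for weight in psi:
        total += weight * weight
        out.append(math.sqrt(total))
    return out


ar_17 = {1: 0.89, 7: -0.07}  # rounded estimates from the full-sample fit
psi = psi_weights(expand_with_difference(ar_17), 8)
multipliers = se_multipliers(psi)
for horizon, m in enumerate(multipliers, start=1):
    # 95% limit at this horizon: forecast +/- 1.96 * sigma * m
    print(horizon, round(m, 2))
```

Because the series is differenced, the psi weights do not die out, so the multipliers keep growing with the horizon. This matches the steadily widening gap between the lower and upper limits in the table.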

D. Conclusions and Limitations


The most recent data indicate that, as of the third quarter of 2013, US non-farm employment totaled about
136.2 million jobs. Based on the results of our model, the US economy is projected to reach about 138.5
million non-farm jobs by the end of the third quarter of 2015. That is an increase of about 2.3 million jobs,
or 1.7%, under optimal growth conditions and assuming no extraordinary events distort the national
economy.
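The projected increase can be checked directly from the two figures quoted in this section. The variable names below are hypothetical.

```python
current_jobs = 136.2   # million jobs, third quarter of 2013
forecast_jobs = 138.5  # million jobs, forecast for third quarter of 2015

increase = forecast_jobs - current_jobs
percent_increase = increase / current_jobs * 100

print(round(increase, 1), "million jobs")
print(round(percent_increase, 1), "percent")
```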
Limitation: The BLS (Bureau of Labor Statistics) revises this data relatively often. Our forecasted values would
change should the BLS update the US non-farm employment figures. Thus, our analysis is solely based on the
figures released for the month of November 2013, and should be revised if the BLS provides updates to our
current numbers.

E. Data Sources
Our primary data source was the Bureau of Labor Statistics. The link and the Series ID from which this data
can be downloaded are provided below.
http://data.bls.gov/cgi-bin/srgate
Series ID: CES0000000001
