Get started
Organizing do-files
Combining data
Label variables
Import data
GET STARTED
Also, use the Internet for help! You can search for code
written by someone else. For example, Stata does not
have a built-in command to calculate the Gini index. To find
user-written commands, type
. net search gini
ORGANIZING DO-FILES
For reproducibility of your results!
Idea: original data is safe and you can always get back to
the raw data if something goes wrong. The new data and
new variables are created in do-files. Working directly in
Stata is useful to explore the data but as soon as it produces
something important you should write it in a do-file.
Always create logs in the do-file. Each do-file should have a log file
where all results are saved in text.
Separate do-files that create data from do-files that analyze data.
crdata1.do
crdata2.do
andata1.do
andata2.do
etc.
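Set the working directory first, so that files can be opened without
the full path (the quotes are needed because the path contains a space):
. cd "Z:/pathname/Econometrics 1"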
. use mydata2
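A minimal sketch of the crdata/andata split described above. It assumes the auto dataset that ships with Stata stands in for your raw data, and the variable lprice and the log file names are made up for illustration; only mydata2 comes from the slide.

```stata
* ---- crdata1.do: creates data (hypothetical example) ----
capture log close
log using crdata1.log, replace text
* The raw data is only read, never overwritten
sysuse auto, clear
* New variables are created in the do-file, not interactively
generate lprice = ln(price)
label variable lprice "Log of price"
save mydata2, replace
log close

* ---- andata1.do: analyzes data (hypothetical example) ----
capture log close
log using andata1.log, replace text
use mydata2, clear
summarize lprice
log close
```

If something goes wrong, rerunning crdata1.do rebuilds mydata2 from the untouched raw data.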
COMBINING DATA
Data                              wide   ->   long
Number of obs.                       4   ->   12
Number of variables                  7   ->
j variable (3 values)                    ->   year
xij variables:
      wage2005 wage2006 wage2007         ->   wage

The long data, listed by observation number, unit identifier, year and wage:

  1.   1   2005   94
  2.   1   2006   96
  3.   1   2007   98
  4.   2   2005   75
  5.   2   2006   79
 ---------------------
  6.   2   2007   77
  7.   3   2005   70
  8.   3   2006   69
  9.   3   2007   70
 10.   4   2005   10
 ---------------------
 11.   4   2006   10
 12.   4   2007   10
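The output above is what reshape reports when wide data (one wage column per year) is turned into long data (one row per unit and year). A small sketch of the command that would produce this kind of output, assuming the unit identifier is called id (the slide does not show its name) and typing in the first three units by hand:

```stata
* Hypothetical wide data: one row per unit, one wage column per year
clear
input id wage2005 wage2006 wage2007
1 94 96 98
2 75 79 77
3 70 69 70
end

* Wide -> long: the year is taken from the suffix of the wage variables
reshape long wage, i(id) j(year)
list, sepby(id)

* Long -> wide reverses the operation
reshape wide wage, i(id) j(year)
```

Here i() names the variable that identifies a unit and j() names the new variable that holds the suffix.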
LABEL VARIABLES
IMPORT DATA
http://wps.pearsoned.co.uk/ema_ge_stock_ie_3/193/49605/12699039.cw/index.html
Work from the do-file. Write clear to remove any old
dataset from memory before loading new data.
Use insheet using [filename], [options] to import a .csv file. If
the delimiter is a semicolon, add the option delimiter(";").
Save the result as a Stata file (.dta).
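* Close any log left open by an earlier run, then start a new one
capture log close
log using mycps.log, replace text
set more off
* Remove any data in memory before importing
clear
insheet using cps92_08.csv, delimiter(";")
* List the first 49 observations as a quick check
li if _n<50
* ahe uses a decimal comma, so convert it from string to numeric
destring ahe, dpcomma replace
save mycps.dta, replace
log close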
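In Stata 13 and later, import delimited replaces insheet. A self-contained sketch of the same steps, assuming a tiny made-up semicolon-delimited file written to a temporary location (the values and the year column are invented; ahe and age are the variable names used in these slides):

```stata
* Write a small semicolon-delimited text file with decimal commas
tempfile demo
tempname fh
file open `fh' using "`demo'", write text replace
file write `fh' "year;ahe;age" _n
file write `fh' "1992;11,5;29" _n
file write `fh' "2008;18,25;30" _n
file close `fh'

* Import it: first row holds the variable names, ";" separates the columns
import delimited using "`demo'", delimiters(";") varnames(1) clear

* ahe is read as a string because of the decimal comma; convert it
destring ahe, dpcomma replace
list
```

With your own data, the file name after using would be cps92_08.csv as in the do-file above.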
ABBREVIATIONS IN STATA
EXAMINE DATASET
You may also want to use the Data Editor in the Stata menu
to browse through your data in a spreadsheet:
Data > Data Editor, or
. br
Tip: use help in Stata to find all the options for each
command. For example, type help list
. tab bachelor
gives a frequency table; tab followed by two variable names gives a
crosstabulation
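A short sketch of the commands typically used to examine a dataset. It assumes the auto dataset that ships with Stata in place of the CPS data, so the variable names (price, mpg, foreign, rep78, make) are not the ones from the slides:

```stata
sysuse auto, clear

* Variable names, storage types and labels
describe

* Number of observations, mean, standard deviation, min and max
summarize price mpg

* One variable: frequency table
tabulate foreign

* Two variables: crosstabulation
tabulate foreign rep78

* Look at the first five observations of selected variables
list make price mpg in 1/5
```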
CORRELATION
REGRESSION ANALYSIS
What does this result tell us about the effect of age on the hourly wage
rate?
T-value: is the coefficient statistically significant?
P-value: at what significance level?
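To make the link between the coefficient, the t-value and the p-value concrete, here is a sketch that recomputes both from the stored results. It assumes the auto dataset that ships with Stata (price regressed on mpg) rather than the wage regression on the slide; the scalar names are made up.

```stata
sysuse auto, clear
regress price mpg

* t-value = coefficient / standard error
scalar t_mpg = _b[mpg] / _se[mpg]

* Two-sided p-value from the t distribution with the residual degrees of freedom
scalar p_mpg = 2 * ttail(e(df_r), abs(scalar(t_mpg)))

display "t = " scalar(t_mpg) "   p = " scalar(p_mpg)
```

The coefficient is statistically significant at the 5% level when the p-value is below 0.05, and at the 1% level when it is below 0.01.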
. est table
. est table, b se t stats(N r2 F)
To display more result statistics, add the desired ones as options after
table, as in the second command (coefficients, standard errors, t-values,
and the scalars N, R-squared and F).
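est table is most useful for comparing several models side by side. A sketch, assuming the auto dataset that ships with Stata and the made-up model names m1 and m2:

```stata
sysuse auto, clear

* Fit two models and store each set of results under a name
regress price mpg
estimates store m1

regress price mpg weight
estimates store m2

* One column per model, with coefficients, standard errors and t-values,
* followed by the number of observations, R-squared and the F statistic
estimates table m1 m2, b se t stats(N r2 F)
```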
PREDICTIONS
Calculates the predicted hourly wage from our regression and
stores it as yhat.
Calculates the residuals and stores them as a new variable.
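A sketch of how predicted values and residuals are usually obtained with predict after a regression. It assumes the auto dataset that ships with Stata; yhat is the name used on the slide, while ehat is a placeholder name for the residuals.

```stata
sysuse auto, clear
regress price mpg

* Fitted values (the linear prediction is the default)
predict double yhat

* Residuals: observed value minus fitted value
predict double ehat, residuals

* Check: every observed value equals fitted value plus residual
assert reldif(price, yhat + ehat) < 1e-10

summarize yhat ehat
```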
. ttest ahe==15
Tests whether the mean of a specified variable (ahe) is equal to a
certain hypothesized value (15)
. ttest ahe==15, level(99)
The confidence level is 95% by default; the level() option changes it,
here to 99%
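A sketch of what ttest computes, assuming the auto dataset that ships with Stata and a made-up hypothesized mean of 20 for mpg:

```stata
sysuse auto, clear

* One-sample t test of H0: mean(mpg) = 20
ttest mpg == 20
display "t = " r(t) "   two-sided p = " r(p)

* The same t statistic by hand: (mean - hypothesized value) / standard error
quietly summarize mpg
display "t by hand = " (r(mean) - 20) / (r(sd) / sqrt(r(N)))

* Same test, reported with a 99% confidence interval
ttest mpg == 20, level(99)
```

Changing level() only changes the reported confidence interval; the t statistic and p-values stay the same.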
GRAPHING DATA
There are many options for graphing. Type help twoway and
find out. For example,
. twoway scatter ahe age, ms(o) mc(red)
changes the marker symbol to o and the marker color to red
. twoway lfit ahe age
draws the fitted regression line of ahe on age, to see any relationship more clearly
. twoway lfit ahe age, lc(blue)
changes the line color of the fitted line to blue
. twoway (lfit ahe age) (scatter ahe age), ti("Hourly wage vs Age")
fitted line and scatter plot in the same graph
. twoway (lfit ahe age, lw(0.5) lc(blue)) (scatter ahe age, ms(o) mc(red)), ti("Hourly wage vs Age") xline(30, lw(1) lc(black))
adds a vertical line at age 30 (xline) with line width 1 and line color black
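The same kind of overlaid graph can be tried without the wage data. This sketch assumes the auto dataset that ships with Stata and writes the option names out in full (msymbol, mcolor, lcolor and title are the long forms of ms, mc, lc and ti); the graph name demo_fit and the line at 20 are made up.

```stata
sysuse auto, clear

* Fitted line and scatter plot in one graph, with a vertical reference line
twoway (lfit price mpg, lcolor(blue))                 ///
       (scatter price mpg, msymbol(o) mcolor(red)),   ///
       title("Price vs mileage")                      ///
       xline(20, lcolor(black))                       ///
       name(demo_fit, replace)
```

The /// at the end of a line continues the command on the next line, which works in do-files but not in the Command window.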