Matthew Beauregard
March 16, 2005
1 Important note
SAS is a very large, very complex piece of software. Enterprise Miner, while making up only a small part
of the SAS system, is itself very large and complex. Until you understand what you are doing, follow these
instructions carefully and in order. It is very possible to make a mistake from which you cannot recover,
except by building your system all over again.
Note to FIT Linux lab users: apart from not following the instructions, the two most common reasons
for strange errors in SAS are that you are out of free space on the network storage, or that you cancelled
the VMware login box after starting Windows. To check your free space, ssh to charlie and type quota
-v. Also, a helpdesk technician from Technical Services alleges that the Desktop is temporary local storage,
not network storage, so you might try using that as working space.
2 Startup
1. Download the datafiles from http://www-staff.it.uts.edu.au/~mbeaureg/topics/prediction_in_sas/data
and extract the contents.
2. Run The SAS System and choose Solutions → Analysis → Enterprise Miner.
3. Choose File → New → Project, name it NN and click Create.
4. Right-click the empty pane on the right, choose Add Node and add an Input Data Source, a Data
Partition, a Replacement and a Neural Network. Arrange these left to right.
5. Hover your mouse cursor over the right edge of the Input Data Source until it becomes crosshairs.
Drag a connecting arrow to the left edge of the Data Partition. Repeat to connect the other nodes in
a line.
6. In the Explorer window (left), double-click Libraries, then right-click an empty area. Choose New.
7. Enter TUTORIAL as the name and click Browse. Navigate into the folder you expanded from the zipfile
and click OK, then OK again.
8. Double-click Input Data Source then click Select. Choose the TUTORIAL library and ORGANICS data.
Click OK.
1. If the Input Data Source window is not open, double-click that node.
2. Choose the Variables tab and right-click input beside ORGYN. Choose Set Model Role → target.
Close the window. Save changes.
3. Double-click the Data Partition node. Set the percentages to 60% train, 20% validation and 20% test.
Close the window. Save changes.
4. Double-click the Neural Network node. Choose the Basic tab. Set the Runtime limit to 10 minutes.
5. Click the triangle beside Multilayer Perceptron. Choose the hidden neurons preset for Moderate noise
data. Choose OK, close the window, save changes. Call your model NN.
6. Ensure that the Neural Network node is selected (dotted outline) and choose Actions → Run.
Enterprise Miner will traverse all the preprocessing nodes before displaying a training/validation
performance graph. Training will cease after about 15 iterations because the model becomes perfect. Once
calculation is complete, view the results.
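To make the 60% / 20% / 20% split set in the Data Partition node more concrete, here is a minimal Python sketch of a simple random partition. It is an illustration only: the function name partition, the seed value and the use of plain shuffling are assumptions, and Enterprise Miner's own partitioning method may differ.

```python
import random


def partition(rows, fractions=(0.6, 0.2, 0.2), seed=12345):
    """Shuffle rows and split them into train, validation and test lists.

    Hypothetical helper for illustration; not Enterprise Miner's algorithm.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(rows)
    # A fixed seed makes the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    # Whatever remains after train and validation becomes the test set.
    test = shuffled[n_train + n_valid:]
    return train, valid, test


train, valid, test = partition(range(1000))
print(len(train), len(valid), len(test))
```

The network is fitted on the training rows only. The validation rows are used to monitor performance during training, and the test rows are held back for a final check.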
3.1 Questions
1. From the Tables tab, what is the misclassification rate on validation data?
2. Examine the training graph on the Plot tab. Is there any difference between performance on the
training and validation data sets?
3. What features in the data might lead to this perfect performance?
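For reference, the misclassification rate asked about in question 1 is the fraction of cases whose predicted class differs from the actual class. The Python sketch below shows that calculation on a made-up set of eight cases. The function name and the data are hypothetical and are not taken from the ORGANICS data.

```python
def misclassification_rate(actual, predicted):
    """Return the fraction of cases where the prediction is wrong."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Made-up target values and predictions for eight cases (1 = yes, 0 = no).
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]

rate = misclassification_rate(actual, predicted)
print(rate)  # 2 of the 8 cases are wrong, so this prints 0.25
```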
4.1 Questions
1. Describe the training performance plot. What do you think would have happened if we allowed training
to continue?
5. Click on the tree’s label in the diagram and rename it to 4way Tree.
6. Run your network from the Assessment node. Examine the results.
7. Highlight all the models and choose Tools → Lift Chart.
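As background for reading a lift chart, the Python sketch below computes cumulative lift in its usual general sense: the response rate among the highest-scored cases divided by the response rate over all cases. The function name, the scores and the outcomes are made up for illustration, and the exact quantities Enterprise Miner plots may be defined differently.

```python
def cumulative_lift(actual, scores, fraction=0.1):
    """Lift of the top `fraction` of cases when ranked by model score.

    actual holds 1 for a responder and 0 otherwise; scores holds the
    model's predicted probability for each case (hypothetical inputs).
    """
    if len(actual) != len(scores) or not actual:
        raise ValueError("actual and scores must be non-empty and equal in length")
    base_rate = sum(actual) / len(actual)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no responders")
    # Rank cases from the highest score to the lowest.
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top_rate = sum(a for _, a in ranked[:n_top]) / n_top
    return top_rate / base_rate


# Ten made-up cases, already listed from highest to lowest score.
actual = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

lift_top_20 = cumulative_lift(actual, scores, fraction=0.2)
lift_all = cumulative_lift(actual, scores, fraction=1.0)
print(lift_top_20, lift_all)
```

A lift above 1 for the top-scored cases means the model ranks responders ahead of non-responders better than random selection would. At 100% of the cases the lift is always 1.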
2. Locate your table using the Explorer pane on the left, right-click it and choose Export. Follow the
wizard.