Lab Assignment 2: some additional comments

posted Aug 31, 2012, 3:01 PM by Unknown user   [ updated Aug 31, 2012, 5:38 PM ]
Question 1:
 
There is an error in Example 6.3 of the textbook. I have uploaded an updated version of the assignment; please see it for further details.
 
Question 3: I found that the best approach was not to use rattle first.
Instead, load the csv file in the R console, apply cut and ordered to the
interest rate variable, keep only the minimal number of input variables,
and filter the set down to 36 month loans.
#e.g.
>loan<-read.csv("loan_file.csv")
#Then create a new list
>loanSS36 <- list()
# now just add the attributes and target variable to the new list, rather than
# using loan and setting all the many other unused input variables to NULL
>loanSS36$"interest.rate" <- loan$"interest.rate"

I saved this in a csv file using
 
>write.csv(loanSS36, "loanSS36.csv")
Then I loaded this csv file into rattle.
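
The cut/ordered and 36 month filtering steps are not shown in the snippet above, so here is a minimal sketch of what they might look like. It runs on a small made-up data frame; the loan.length column name, the toy values and the break points are all assumptions, so substitute the real names and ranges from your own file.

```r
# Toy stand-in for the loan data. The values and the "loan.length"
# column name are made up for illustration only.
loan <- data.frame(
  interest.rate = c(6.5, 9.2, 12.8, 15.1, 18.4, 7.9),
  loan.length   = c(36, 36, 60, 36, 60, 36)
)

# Keep only the 36 month loans.
loan36 <- loan[loan$loan.length == 36, ]

# Bin the numeric rate into ranges with cut(); these breaks are arbitrary.
rate.bins <- cut(loan36$interest.rate, breaks = c(0, 8, 12, 16, 25))

# Turn the bins into an ordered factor so the low-to-high order is kept.
loanSS36 <- data.frame(interest.rate = ordered(rate.bins))

str(loanSS36)
```

This sketch builds a data frame directly instead of a list; either should work with write.csv as long as every column has the same length.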
 
If rpart is still stalling, then you might want to edit the csv file (in your favorite editor) and create a sample dataset using the first 1000 rows. Now work with the smaller dataset until you have everything running correctly in R.
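
If you would rather not edit the file by hand, the same 1000 row sample can probably be taken in R itself. The sketch below writes a made-up stand-in file to a temporary location so that it runs on its own; the file contents and the loanSmall name are assumptions, and you would point read.csv at your real csv file instead.

```r
# Stand-in file so the sketch is self-contained; use your real csv instead.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5000, interest.rate = runif(5000, 5, 25)),
          tmp, row.names = FALSE)

# nrows tells read.csv to stop after the first 1000 data rows.
loanSmall <- read.csv(tmp, nrows = 1000)

# If the full data frame is already loaded, head(loan, 1000) does the same job.
nrow(loanSmall)
```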
 
Remember that a categorical variable with v possible values has 2^v-2 non-empty proper subsets that can define a split. Credit.grade has 35 possible values (v=35).
That's 34359738366 different combinations!
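
As a quick check of that arithmetic, the count can be reproduced in the R console. The variable names here are just for illustration. Note that each subset and its complement describe the same two-way split, so the number of distinct binary splits is half the subset count, i.e. 2^(v-1)-1.

```r
v <- 35

# Non-empty proper subsets of the v levels (excludes the empty and full sets).
subsets <- 2^v - 2

# Each subset/complement pair is one binary split, so halve the count.
splits <- 2^(v - 1) - 1

format(subsets, scientific = FALSE)
format(splits, scientific = FALSE)
```

Either way, the search space is far too large to enumerate quickly, which is why trimming the inputs and working on a sample helps.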
 
 