I am taking the Machine Learning course on Coursera being taught by Andrew Ng. It is turning out to be useful so far, and he has presented the material clearly. Itâ€™s a nice introduction to the Machine Learning/Computer Science language, since I come from a statistics background.

I learned about gradient descent today for simple linear regression. The following is my code in R and I compare it to the lm function in base R .
I am using the Prostate dataset from the lasso2 package. The model I am fitting is:

#prostate cancer data set library ( lasso2 )

`## R Package to solve regression problems while imposing` ## an L1 constraint on the parameters. Based on S-plus Release 2.1 ## Copyright (C) 1998, 1999 ## Justin Lokhorst <jlokhors@stats.adelaide.edu.au> ## Berwin A. Turlach <bturlach@stats.adelaide.edu.au> ## Bill Venables <wvenable@stats.adelaide.edu.au> ## ## Copyright (C) 2002 ## Martin Maechler <maechler@stat.math.ethz.ch>

data ( Prostate ) # hypothesis hypothesis <- function ( x , theta0 , theta1 ){ h <- theta0 + theta1 * x return ( h ) } # Jacobian deriv <- function ( x , y , theta0 , theta1 ){ dt0 <- ( length ( x )) ^ ( -1 ) * sum (( hypothesis ( x , theta0 , theta1 ) - y )) dt1 <- ( length ( x )) ^ ( -1 ) * t ( x ) %*% ( hypothesis ( x , theta0 , theta1 ) - y ) return ( c ( dt0 , dt1 )) } theta <- c ( 0 , 0 ) alpha <- 0.5 X <- Prostate $ lcavol Y <- Prostate $ lpsa i = 1 # theta.star <- deriv ( Prostate $ lcavol , Prostate $ lpsa , theta [ 1 ], theta [ 2 ]) # set convergence threshold threshold <- 1e-7 # logical to check if threshold has been achieved continue = TRUE while ( continue ){ theta [ 1 ] <- theta.star [ 1 ] - alpha * deriv ( x = X , y = Y , theta.star [ 1 ], theta.star [ 2 ])[ 1 ] theta [ 2 ] <- theta.star [ 2 ] - alpha * deriv ( x = X , y = Y , theta.star [ 1 ], theta.star [ 2 ])[ 2 ] continue <- ( abs (( theta.star - theta )[ 1 ]) > threshold & abs (( theta.star - theta )[ 2 ]) > threshold ) theta.star [ 1 ] <- theta [ 1 ] theta.star [ 2 ] <- theta [ 2 ] i = i +1 } # number of iterations i

`## [1] 214`

# beta0 and beta1 theta.star

`## [1] 1.5072975 0.7193205`

# compare to lm fit <- lm ( lpsa ~ lcavol , data = Prostate ) summary ( fit )

`## ` ## Call: ## lm(formula = lpsa ~ lcavol, data = Prostate) ## ## Residuals: ## Min 1Q Median 3Q Max ## -1.67624 -0.41648 0.09859 0.50709 1.89672 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 1.50730 0.12194 12.36 <2e-16 *** ## lcavol 0.71932 0.06819 10.55 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.7875 on 95 degrees of freedom ## Multiple R-squared: 0.5394, Adjusted R-squared: 0.5346 ## F-statistic: 111.3 on 1 and 95 DF, p-value: < 2.2e-16