Maximum Likelihood

This section is based on the R programming wikibook

Introduction

Maximum likelihood estimation is, at heart, an optimisation problem: write down the log likelihood function and maximise it with an optimisation routine. Sometimes you also need to supply the score (the first derivative of the log likelihood) and/or the Hessian (the matrix of second derivatives of the log likelihood).

One dimension

If there is only one parameter, then the log likelihood can be optimised using the optimize function.

Example 1 - Type 1 Pareto distribution

Note that in this example the minimum value is treated as known and is not estimated. Therefore this is a one-dimensional problem.

The rpareto1 function from the actuar package is used to generate a random vector from a type 1 Pareto distribution with shape equal to 1 and minimum value equal to 500. The dpareto1 function, also from the actuar package, is used with the option log = TRUE to write the log likelihood. Finally, optimize is called with maximum = TRUE, and lower and upper bounds for the parameter are supplied via the interval option.

First, install the actuar package.

install.packages("actuar")
library(actuar)
y <- rpareto1(1000, shape = 1, min = 500)
ll <- function(mu, x)
{
    # log likelihood of the shape parameter; the minimum is treated
    # as known and supplied via min(x)
    sum(dpareto1(x, mu[1], min = min(x), log = TRUE))
}
optimize(f = ll, x = y, interval = c(0,10), maximum = TRUE)
$maximum
1.03889736541982
$objective
-8139.16372823865
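When the minimum is known, the shape parameter of a type 1 Pareto distribution also has a closed-form MLE, which makes a useful check on the numerical result. The sketch below simulates the sample by inverse transform sampling rather than with rpareto1, so it runs without actuar; the seed and tolerance are illustrative.

```r
set.seed(1)
m <- 500
# Inverse transform sampling: if u ~ Uniform(0, 1), then m / u^(1/shape)
# follows a type 1 Pareto distribution (here shape = 1, minimum = m)
y <- m / runif(1000)^(1 / 1)
# Closed-form MLE of the shape when the minimum m is known:
shape_hat <- length(y) / sum(log(y / m))
shape_hat  # should be close to the true shape of 1
```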

Exercise 1

Find out more about the optimize function.

help(optimize)

Exercise 2

How could you use the Curry function from the functional package to help organise your functions and parameters for use with optimize?

Exercise 3

Demonstrate your solution to Exercise 2.
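One possible approach (a sketch, not the only answer): Curry performs partial application, fixing the data argument so that the objective passed to optimize depends only on the parameter. The exponential log likelihood below is purely illustrative, and a minimal stand-in for Curry is defined in case the functional package is not installed.

```r
# Use functional::Curry if available; otherwise define a minimal
# stand-in that performs the same partial application:
if (requireNamespace("functional", quietly = TRUE)) {
  Curry <- functional::Curry
} else {
  Curry <- function(f, ...) {
    fixed <- list(...)
    function(...) do.call(f, c(list(...), fixed))
  }
}

set.seed(1)
y <- rexp(100, rate = 2)  # illustrative data
ll <- function(mu, x) sum(dexp(x, rate = mu, log = TRUE))
# Fix the data argument once; optimize then sees a one-argument objective:
fit <- optimize(f = Curry(ll, x = y), interval = c(0.01, 10), maximum = TRUE)
fit$maximum  # should be close to 1 / mean(y), the analytic MLE of the rate
```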

Multiple dimensions

For optimising more than one parameter, use the optim function.

Example 2 - Beta distribution

y <- rbeta(1000,2,2)
loglik <- function(mu, x)
{
    # negative log likelihood, since optim minimises by default
    sum(-dbeta(x, mu[1], mu[2], log = TRUE))
}
out <- optim(par = c(1, 1), fn = loglik, x = y, method = "L-BFGS-B", lower = c(0, 0))
print(out)
$par
[1] 2.019864 1.976904

$value
[1] -124.9543

$counts
function gradient 
       9        9 

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

Note that the runtime of the optimiser grows with the dimension \(d\) of the problem (i.e. the number of parameters): when no derivative function is supplied, optim approximates the gradient by finite differences, which alone requires of the order of \(d\) extra evaluations of the objective per iteration, and more iterations are typically needed in higher dimensions. Supplying a function that calculates the gradient of the objective can therefore greatly reduce this growth in runtime. The optimx package provides a unified interface to optimisers that can exploit an efficiently computed gradient.
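As a sketch of how a supplied gradient helps with optim itself, the gr argument accepts an analytic gradient function; for the beta negative log likelihood used in Example 2 the gradient has a closed form in terms of the digamma function. The seed and tolerances below are illustrative.

```r
set.seed(1)
y <- rbeta(1000, 2, 2)
negll <- function(mu, x) -sum(dbeta(x, mu[1], mu[2], log = TRUE))
# Analytic gradient of the negative log likelihood of a Beta(a, b) sample:
#   d/da : n * (digamma(a) - digamma(a + b)) - sum(log(x))
#   d/db : n * (digamma(b) - digamma(a + b)) - sum(log(1 - x))
gradient <- function(mu, x) {
  n <- length(x)
  c(n * (digamma(mu[1]) - digamma(sum(mu))) - sum(log(x)),
    n * (digamma(mu[2]) - digamma(sum(mu))) - sum(log(1 - x)))
}
out <- optim(par = c(1, 1), fn = negll, gr = gradient, x = y,
             method = "L-BFGS-B", lower = c(0.01, 0.01))
out$par  # should be close to the true values c(2, 2)
```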

Exercise 4

Find details of the optimx package and install it.