Creates an sgld (stochastic gradient Langevin dynamics) object which can be passed to sgmcmcStep to simulate one step of SGLD at a time for the posterior defined by logLik and logPrior. This allows the user to code the loop themselves, as in many standard TensorFlow procedures (such as optimization), which means they do not need to store the chain at each iteration. This is useful when the full chain needs a lot of memory.
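To make "one step of SGLD" concrete, the following is a minimal base R sketch of the standard SGLD update for a Normal model with unknown location, known unit scale and a flat prior. The helper `sgldStepSketch` and the accumulator `postMean` are hypothetical names introduced for illustration; they are not part of the sgmcmc package, and the package's internal TensorFlow implementation may differ in detail.

```r
# Hypothetical helper (not an sgmcmc function): one SGLD update for a
# Normal(theta, 1) likelihood with a flat prior.
sgldStepSketch = function(theta, x, stepsize, minibatchSize) {
    N = length(x)
    # Subsample a minibatch without replacement
    batch = sample(x, minibatchSize)
    # Unbiased estimate of the log posterior gradient:
    # d/dtheta sum(log N(x_i | theta, 1)) = sum(x_i - theta), rescaled by N / n
    gradEst = (N / minibatchSize) * sum(batch - theta)
    # Langevin update: half-stepsize gradient move plus N(0, stepsize) noise
    theta + 0.5 * stepsize * gradEst + rnorm(1, mean = 0, sd = sqrt(stepsize))
}

set.seed(1)
x = rnorm(1000, mean = 2)
theta = 0
nIters = 5000
postMean = 0
for (i in 1:nIters) {
    theta = sgldStepSketch(theta, x, stepsize = 1e-4, minibatchSize = 10)
    # Accumulate the posterior mean on the fly instead of storing the chain
    postMean = postMean + theta / nIters
}
```

Only the current value of `theta` and the running average are kept in memory, which is the storage saving that the step-by-step interface is designed to allow.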
sgldSetup(logLik, dataset, params, stepsize, logPrior = NULL, minibatchSize = 0.01, seed = NULL)
Argument | Description |
---|---|
logLik | function which takes parameters and dataset (list of TensorFlow variables and placeholders respectively) as input. It should return a TensorFlow expression which defines the log likelihood of the model. |
dataset | list of numeric R arrays which defines the datasets for the problem. The names in the list should correspond to those referred to in the logLik and logPrior functions |
params | list of numeric R arrays which define the starting point of each parameter. The names in the list should correspond to those referred to in the logLik and logPrior functions |
stepsize | list of numeric values corresponding to the SGLD stepsizes for each parameter. The names in the list should correspond to those in params. Alternatively, specify a single numeric value to use that stepsize for all parameters. |
logPrior | optional. Default uninformative improper prior. Function which takes parameters (list of TensorFlow variables) as input. The function should return a TensorFlow tensor which defines the log prior of the model. |
minibatchSize | optional. Default 0.01. Numeric or integer value that specifies amount of dataset to use at each iteration either as proportion of dataset size (if between 0 and 1) or actual magnitude (if an integer). |
seed | optional. Default NULL. Numeric seed for random number generation. The default does not declare a seed for the TensorFlow session. |
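The conventions for minibatchSize and stepsize described in the table can be illustrated with a short base R sketch. Both `resolveMinibatchSize` and `expandStepsize` are hypothetical helpers written for this illustration and are not functions exported by the package; in particular, the rounding used for proportions is an assumption.

```r
# Hypothetical helper: turn minibatchSize into a number of observations.
resolveMinibatchSize = function(minibatchSize, N) {
    if (minibatchSize > 0 && minibatchSize < 1) {
        # Interpreted as a proportion of the dataset (rounding is an assumption)
        n = max(1L, as.integer(round(minibatchSize * N)))
    } else {
        # Interpreted as an actual number of observations
        n = as.integer(minibatchSize)
    }
    # A minibatch cannot be larger than the dataset
    min(n, as.integer(N))
}

# Hypothetical helper: expand a single stepsize to a named list, one per parameter.
expandStepsize = function(stepsize, params) {
    if (is.list(stepsize)) {
        # Already a list: reorder to match the parameter names
        return(stepsize[names(params)])
    }
    setNames(rep(list(stepsize), length(params)), names(params))
}

resolveMinibatchSize(0.01, 1000)   # 10 observations per step
resolveMinibatchSize(100, 1000)    # 100 observations per step
expandStepsize(1e-4, list("theta" = 0, "sigma" = 1))
```

With the default minibatchSize of 0.01, a dataset of 1000 observations would therefore use 10 observations per iteration under these assumptions.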
The function returns an 'sgld' object, which is used to pass the required information about the current model to the sgmcmcStep function. The function sgmcmcStep runs one step of SGLD. The sgld object has the following attributes:
list of tf$Variables with the same names as the params list passed to sgldSetup. This is the object passed to the logLik and logPrior functions you declared to calculate the log posterior gradient estimate.
a tensor that estimates the log posterior given the current placeholders and params (the placeholders hold the minibatches of data).
dataset size.
dataset as passed to sgldSetup.
minibatchSize as passed to sgldSetup.
list of tf$placeholder objects with the same names as dataset, used to feed minibatches of data to sgmcmcStep. These are the objects that get fed to the dataset argument of the logLik and logPrior functions you declared.
list of stepsizes as passed to sgldSetup.
a list of TensorFlow steps that are evaluated by sgmcmcStep.
# NOT RUN {
# Simulate from a Normal Distribution, unknown location and known scale with uninformative prior
# Run sgmcmc step by step and calculate estimate of location on the fly to reduce storage
dataset = list("x" = rnorm(1000))
params = list("theta" = 0)
logLik = function(params, dataset) {
    distn = tf$distributions$Normal(params$theta, 1)
    return(tf$reduce_sum(distn$log_prob(dataset$x)))
}
stepsize = list("theta" = 1e-4)
sgld = sgldSetup(logLik, dataset, params, stepsize)
nIters = 10^4L
# Initialize location estimate
locEstimate = 0
# Initialise TensorFlow session
sess = initSess(sgld)
for ( i in 1:nIters ) {
    sgmcmcStep(sgld, sess)
    locEstimate = locEstimate + 1 / nIters * getParams(sgld, sess)$theta
}
# For more examples see vignettes
# }