Selecting a Backend and Datatype

Before building models with ProbFlow, you’ll want to decide which backend to use, and what default datatype to use.

Setting the Backend

What I mean by “backend” is the system which performs the automatic differentiation required to fit models with stochastic variational inference. ProbFlow currently supports two backends: TensorFlow and PyTorch. TensorFlow is the default backend, but you can set which backend to use:

import probflow as pf

pf.set_backend('pytorch') #or 'tensorflow'

You can see which backend is currently being used by:

pf.get_backend()

ProbFlow will only use operations specific to the backend you’ve chosen, and you can only use operations from your chosen backend when specifying your models via ProbFlow.

Setting the Datatype

You can also set the default datatype ProbFlow uses for creating the variable tensors. This datatype much match the datatype of the data you’re fitting. The default datatype is tf.dtypes.float32 when TensorFlow is the backend, and torch.float32 when PyTorch is the backend.

You can see which is the current default datatype with:

pf.get_datatype()

And you can set the default datatype with pf.set_datatype. For example, to instead use double precision with the TensorFlow backend:

pf.set_datatype(tf.dtypes.float64)

Personal opinion warning!

I’d gently recommend sticking to the default float32 datatype. Variational inference is super noisy as is, so do we really need all that extra precision? Single precision is also a lot faster on most GPUs. If your data is of a different type, just cast it with (for numpy arrays and pandas DataFrames) .astype('float32').