Logistic Regression

TLDR

class LogisticRegression(pf.CategoricalModel):

    def __init__(self, d):
        self.w = pf.Parameter([d, 1]) #weights
        self.b = pf.Parameter([1, 1]) #bias

    def __call__(self, x):
        return pf.Bernoulli(x @ self.w() + self.b())

model = LogisticRegression(x.shape[1])

or simply

model = pf.LogisticRegression(x.shape[1])

and then

model.fit(x, y)

In the last example, both \(x\) and \(y\) were continuous variables (their values ranged from \(-\infty\) to \(\infty\)). What if our output variable is binary? That is, suppose the output variable can take only one of two values, 0 or 1, and so we need a classification model.

Let’s create a dataset with 3 continuous features and a binary target variable (2 classes):

# Imports
import probflow as pf
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
randn = lambda *x: np.random.randn(*x).astype('float32')

# Settings
N = 1000 #number of datapoints
D = 3    #number of features

# Generate data
x = randn(N, D)
w = np.array([[2.], [0.1], [-2.]]).astype('float32')
noise = randn(N, 1)
y = np.round(1./(1.+np.exp(-(x@w+noise))))

# Plot it
for i in range(D):
    plt.subplot(1, D, i+1)
    sns.violinplot(x=y[:, 0], y=x[:, i])
[Figure: violin plots of each of the 3 features, split by target class]

Building a Logistic Regression Manually

A logistic regression is a model where our output variable is categorical. It’s basically the same thing as a linear regression, except we pipe the linearly predicted value through a nonlinear function to get the probability of the output class:

\[p(y=1) = f( \mathbf{x}^\top \mathbf{w} + b )\]

where \(f\) is usually the logistic (sigmoid) function, or a softmax when there are more than 2 classes.
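As a quick numerical illustration (using the np imported above), the logistic function squashes any real-valued score into a probability between 0 and 1:

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

logistic(np.array([-5., 0., 5.]))  # ~ [0.007, 0.5, 0.993]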

If our target variable has only 2 possible classes, we can model this using a Bernoulli distribution:

\[y \sim \text{Bernoulli}( f( \mathbf{x}^\top \mathbf{w} + b ) )\]
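Here the Bernoulli distribution simply assigns probability \(p = f(\mathbf{x}^\top \mathbf{w} + b)\) to class 1 and \(1-p\) to class 0:

\[p(y) = p^y (1-p)^{1-y}, \quad y \in \{0, 1\}\]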

To create this model in ProbFlow, we’ll create a class which inherits CategoricalModel, because the target variable is categorical (either 0 or 1). Again, in the __init__ method we define the parameters of the model, and in the __call__ method we compute probabilistic predictions given the parameters and the input data:

class LogisticRegression(pf.CategoricalModel):

    def __init__(self, dims):
        self.w = pf.Parameter([dims, 1], name='Weights')
        self.b = pf.Parameter([1, 1], name='Bias')

    def __call__(self, x):
        return pf.Bernoulli(x @ self.w() + self.b())

Note that by default, the Bernoulli distribution treats its inputs as logits (that is, it passes the inputs through a sigmoid function to get the output class probabilities). To force it to treat the inputs as raw probability values, use the probs keyword argument to the Bernoulli constructor.
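For example, here’s a sketch of the same model written that way, applying the sigmoid ourselves and passing the result via probs (this assumes the TensorFlow backend; with the PyTorch backend you’d use torch.sigmoid instead):

import tensorflow as tf

class LogisticRegressionProbs(pf.CategoricalModel):

    def __init__(self, dims):
        self.w = pf.Parameter([dims, 1], name='Weights')
        self.b = pf.Parameter([1, 1], name='Bias')

    def __call__(self, x):
        # compute the class probabilities explicitly instead of passing logits
        probs = tf.math.sigmoid(x @ self.w() + self.b())
        return pf.Bernoulli(probs=probs)

Both versions define the same model; we’ll stick with the logits version below.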

Then we can instantiate our model class,

model = LogisticRegression(D)

And fit it to the data!

model.fit(x, y, lr=0.01)
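As a quick sanity check, we can compare the model’s predictions for a few datapoints against the true labels (the predict method is covered in more detail below):

model.predict(x[:5, :])  # model predictions for the first 5 datapoints
y[:5]                    # true labels, for comparison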

Now we can plot the posterior distributions for the weights and the bias, and can see that the model recovered the values we used to generate the data:

model.posterior_plot(ci=0.9)
[Figure: posterior distributions of the Weights and Bias parameters]
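To check this numerically as well as visually, one option (a sketch using ProbFlow’s posterior_mean method) is to compare the posterior means of the parameters against the values used to generate the data:

model.posterior_mean('Weights')  # should be close to the true w = [[2.], [0.1], [-2.]]
model.posterior_mean('Bias')     # should be close to 0, since no bias was added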

Using the LogisticRegression module

An even easier way to do a logistic regression with ProbFlow is to use the pre-built LogisticRegression model:

model = pf.LogisticRegression(D)
model.fit(x, y, lr=0.01)
model.posterior_plot(ci=0.9)
[Figure: posterior distributions of the Weights and Bias parameters]

Multinomial Logistic Regression

The LogisticRegression model can also handle a target variable \(y\) with more than two classes (that is, a multinomial logistic regression). Let’s generate some data with 4 features, where the target has 3 possible classes:

# Settings
N = 1000 #number of datapoints
D = 4    #number of features
K = 3    #number of target classes

# Generate data
x = randn(N, D)
w = randn(D, K)
noise = randn(N, K)  # per-class noise (a single value shared across classes wouldn't change the argmax)
y = np.argmax(x@w+noise, axis=1).astype('float32')

# Plot it
for i in range(D):
    plt.subplot(2, 2, i+1)
    sns.violinplot(x=y, y=x[:, i])
[Figure: violin plots of each of the 4 features, split by target class]

The k keyword argument to LogisticRegression sets the number of classes of the dependent variable.

model = pf.LogisticRegression(D, k=K)
model.fit(x, y, lr=0.01, epochs=200)
model.posterior_plot()
[Figure: posterior distributions of the multinomial logistic regression parameters]

And we can predict the target class given the features:

>>> model.predict(x[:5, :])
array([0, 0, 1, 2, 1], dtype=int32)

Or even compute the posterior predictive probability of each target class for a test datapoint:

x_test = randn(1, D)
model.pred_dist_plot(x_test)
[Figure: posterior predictive distribution over the 3 target classes for the test datapoint]
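To get these probabilities as numbers rather than a plot, one option (a sketch using ProbFlow’s predictive_sample method, which draws samples from the posterior predictive distribution) is to estimate them from samples:

samples = model.predictive_sample(x_test, n=1000)   # sampled class labels for x_test
probs = [np.mean(samples == k) for k in range(K)]   # estimated probability of each class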