[1]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

Tutorial A2: Predictions

When one has a machine learning model, a natural next step is to make predictions with that model. Accordingly, tangermeme implements general-purpose functions for making predictions from PyTorch models in a memory-efficient manner, regardless of how many inputs or outputs the model has. These functions can be used by themselves, but their primary purpose is to serve as building blocks for more complicated analysis functions.

Although making predictions using a model is conceptually simple, there are several technical issues with doing so efficiently in practice. First, when your data is too big to fit in GPU memory, you cannot simply move it all over at once and so you must make predictions in a batched manner. Here, a small number of examples are moved from the CPU to the GPU, predictions are made, and the results are moved back to the CPU. The batch size can be tuned to the largest number of examples that fit in GPU memory. Second, when a model has multiple outputs, making predictions in a batched manner yields a set of tensors for each batch. These tensors must be concatenated across batches, output by output, so that the final result has the same shape as if all examples had been run through the model at the same time. Third, some models have multiple inputs, and so the prediction function must be able to handle an optional set of additional arguments.
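
The three issues above can be made concrete with a minimal sketch of a batched prediction loop. This is an illustration written for this tutorial, not tangermeme's actual implementation: the function name batched_predict_sketch and the TwoHeads model are hypothetical, and the loop assumes that every input tensor shares the same first (batch) dimension.

```python
import torch

def batched_predict_sketch(model, X, args=(), batch_size=32, device="cpu"):
    """Illustrative batched prediction loop; NOT tangermeme's implementation."""
    model = model.to(device).eval()
    batches = []

    with torch.no_grad():
        for start in range(0, X.shape[0], batch_size):
            end = start + batch_size
            # Move only the current slice of each input to the device.
            X_batch = X[start:end].to(device)
            args_batch = tuple(a[start:end].to(device) for a in args)

            y = model(X_batch, *args_batch)
            # Treat a single-output model as a one-element tuple.
            if not isinstance(y, (tuple, list)):
                y = (y,)
            # Move results back to the CPU before the next batch.
            batches.append(tuple(y_.cpu() for y_ in y))

    # Concatenate each output separately along the batch dimension.
    outputs = tuple(torch.cat(parts, dim=0) for parts in zip(*batches))
    return outputs[0] if len(outputs) == 1 else outputs


class TwoHeads(torch.nn.Module):
    """Hypothetical multi-output model used only for this sketch."""
    def __init__(self, length=10):
        super().__init__()
        self.head_a = torch.nn.Linear(length * 4, 3)
        self.head_b = torch.nn.Linear(length * 4, 1)

    def forward(self, X):
        X = X.reshape(X.shape[0], -1)
        return self.head_a(X), self.head_b(X)


torch.manual_seed(0)
X_demo = torch.randn(7, 4, 10)
two_heads = TwoHeads()

# Seven examples in batches of three: the last batch holds one example.
out_a, out_b = batched_predict_sketch(two_heads, X_demo, batch_size=3)
```

Each output comes back with one row per example (seven rows here), even though the model only ever saw three examples at a time.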

Predict

The simplest function that implements these ideas is predict, which takes in a model, data, and optional additional arguments, and makes batched predictions on the data given the model. The function can be run on GPUs, CPUs, or any other devices that work with PyTorch.

To demonstrate it, let’s use a model that takes its inputs and flattens them before feeding them into a dense layer to make three predictions per example. The forward function takes in two optional arguments: alpha, which gets added to the predictions, and beta, which multiplies the predictions (but not alpha). By default, these are set such that the predictions are returned without modification.

[2]:
import torch

class FlattenDense(torch.nn.Module):
    def __init__(self, length=10):
        super(FlattenDense, self).__init__()
        # Four channels (one per nucleotide) per position, three outputs.
        self.dense = torch.nn.Linear(length*4, 3)

    def forward(self, X, alpha=0, beta=1):
        # Flatten each example to a vector before the dense layer.
        X = X.reshape(X.shape[0], -1)
        return self.dense(X) * beta + alpha

Now let’s generate some random sequences and see what the model output looks like for them. We call torch.manual_seed so that the random initialization of the torch.nn.Linear layer is reproducible. As a side note, even though we are not training the model here, the usage doesn’t change based on whether the model is randomly initialized or trained.

[3]:
from tangermeme.utils import random_one_hot
torch.manual_seed(0)

X = random_one_hot((5, 4, 10), random_state=0).float()
model = FlattenDense()

y = model(X)
y
[3]:
tensor([[-0.3154, -0.1625, -0.3183],
        [-0.0866,  0.5461, -0.0244],
        [ 0.3089, -0.2828, -0.1485],
        [ 0.1671, -0.1341, -0.3094],
        [-0.0627,  0.0088,  0.3471]], grad_fn=<AddBackward0>)

Calling the model directly is easy enough for a small model and a small amount of data. Let’s try using the built-in predict function with different batch sizes to demonstrate how one would make batched predictions.

[4]:
from tangermeme.predict import predict

y0 = predict(model, X, batch_size=2)
y0
[4]:
tensor([[-0.3154, -0.1625, -0.3183],
        [-0.0866,  0.5461, -0.0244],
        [ 0.3089, -0.2828, -0.1485],
        [ 0.1671, -0.1341, -0.3094],
        [-0.0627,  0.0088,  0.3471]])
[5]:
y0 = predict(model, X, batch_size=100)
y0
[5]:
tensor([[-0.3154, -0.1625, -0.3183],
        [-0.0866,  0.5461, -0.0244],
        [ 0.3089, -0.2828, -0.1485],
        [ 0.1671, -0.1341, -0.3094],
        [-0.0627,  0.0088,  0.3471]])

Note that the tensor no longer has grad_fn=<AddBackward0>, meaning that gradients were not calculated or stored. Specifically, the prediction loop is wrapped in torch.no_grad. By default, this function will move each batch to the GPU. However, it doesn’t have to. You can pass device='cpu' to have the predictions be made on the CPU.

[6]:
y0 = predict(model, X, device='cpu')
y0
[6]:
tensor([[-0.3154, -0.1625, -0.3183],
        [-0.0866,  0.5461, -0.0244],
        [ 0.3089, -0.2828, -0.1485],
        [ 0.1671, -0.1341, -0.3094],
        [-0.0627,  0.0088,  0.3471]])
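
To see what torch.no_grad changes, here is a small sketch using only PyTorch. The layer and input below are hypothetical stand-ins for a real model and real data; the last line shows one common way to fall back to the CPU when no CUDA device is visible.

```python
import torch

# Hypothetical stand-in for a trained model; any torch.nn.Module would do.
torch.manual_seed(0)
layer = torch.nn.Linear(40, 3)
X_flat = torch.randn(5, 40)

# A plain forward pass records the autograd graph.
y_grad = layer(X_flat)

# Inside torch.no_grad no graph is recorded, which saves memory.
with torch.no_grad():
    y_nograd = layer(X_flat)

print(y_grad.requires_grad, y_grad.grad_fn is not None)   # True True
print(y_nograd.requires_grad, y_nograd.grad_fn is None)   # False True

# Fall back to the CPU when no CUDA device is visible.
device = "cuda" if torch.cuda.is_available() else "cpu"
```

The resulting device string can then be passed along, as in predict(model, X, device=device), so the same notebook runs on machines with and without a GPU.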

Next, let’s consider the setting where you want to pass additional arguments into the forward function because the model is multi-input. Remember that our model can optionally take in alpha and beta parameters. All we have to do is pass in a tuple of args to the predict function where each element in args is a tensor containing values for one of the inputs to the model.

Let’s start off by looking at just passing in alpha to the model.

[7]:
torch.manual_seed(0)
alpha = torch.randn(5, 1)

y + alpha
[7]:
tensor([[ 1.2256,  1.3785,  1.2227],
        [-0.3800,  0.2527, -0.3178],
        [-1.8699, -2.4616, -2.3273],
        [ 0.7355,  0.4344,  0.2591],
        [-1.1472, -1.0757, -0.7374]], grad_fn=<AddBackward0>)
[8]:
predict(model, X, args=(alpha,))
[8]:
tensor([[ 1.2256,  1.3785,  1.2227],
        [-0.3800,  0.2527, -0.3178],
        [-1.8699, -2.4616, -2.3273],
        [ 0.7355,  0.4344,  0.2591],
        [-1.1472, -1.0757, -0.7374]])

Now, let’s try passing in both alpha and beta.

[9]:
torch.manual_seed(1)
beta = torch.randn(5, 1)

y * beta + alpha
[9]:
tensor([[ 1.3324,  1.4336,  1.3305],
        [-0.3165, -0.1477, -0.2999],
        [-2.1597, -2.1962, -2.1879],
        [ 0.6723,  0.4851,  0.3762],
        [-1.0562, -1.0885, -1.2414]], grad_fn=<AddBackward0>)
[10]:
predict(model, X, args=(alpha, beta))
[10]:
tensor([[ 1.3324,  1.4336,  1.3305],
        [-0.3165, -0.1477, -0.2999],
        [-2.1597, -2.1962, -2.1879],
        [ 0.6723,  0.4851,  0.3762],
        [-1.0562, -1.0885, -1.2414]])

This implementation is flexible: it makes no assumptions about the shape of the additional arguments, except that their first (batch) dimension matches that of the data. We can therefore pass in bigger tensors without having to modify the code.

[11]:
torch.manual_seed(0)
alpha = torch.randn(5, 3)

y + alpha
[11]:
tensor([[ 1.2256, -0.4559, -2.4970],
        [ 0.4818, -0.5384, -1.4230],
        [ 0.7123,  0.5552, -0.8678],
        [-0.2362, -0.7307, -0.1273],
        [-0.9193,  1.1094, -0.7241]], grad_fn=<AddBackward0>)
[12]:
predict(model, X, args=(alpha,))
[12]:
tensor([[ 1.2256, -0.4559, -2.4970],
        [ 0.4818, -0.5384, -1.4230],
        [ 0.7123,  0.5552, -0.8678],
        [-0.2362, -0.7307, -0.1273],
        [-0.9193,  1.1094, -0.7241]])
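
The reason both shapes work is ordinary PyTorch broadcasting inside the model's forward function. The following sketch uses random tensors as stand-ins for the predictions and offsets (the variable names are illustrative and not part of tangermeme).

```python
import torch

torch.manual_seed(0)
y_demo = torch.randn(5, 3)              # stand-in for model predictions

alpha_per_example = torch.randn(5, 1)   # one offset per example
alpha_per_output = torch.randn(5, 3)    # one offset per example and output

# Shape (5, 1) is broadcast across the three outputs of each example.
shifted_a = y_demo + alpha_per_example
# Shape (5, 3) is added element-wise with no broadcasting needed.
shifted_b = y_demo + alpha_per_output

print(shifted_a.shape, shifted_b.shape)
# torch.Size([5, 3]) torch.Size([5, 3])

# In the first case every output in a row is shifted by the same amount.
row_shift = shifted_a - y_demo
```

In both cases the first dimension is 5, so slicing the offsets batch by batch alongside the data keeps each example paired with its own values.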

This means that if you have a model with one input that is biological sequence and another input that is something more complicated, such as an image, you can easily pass both into the model. The additional inputs just need to be passed in the same order as the arguments of the forward function.
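
As a rough illustration of such a model, the sketch below defines a hypothetical SequenceAndImage module whose forward function takes a sequence tensor followed by an image tensor. The class, its layer sizes, and the random inputs are all assumptions made for this example, and the model is called directly so that the block depends only on PyTorch.

```python
import torch

class SequenceAndImage(torch.nn.Module):
    """Hypothetical two-input model: a sequence tensor plus an image."""
    def __init__(self, length=10, image_size=8):
        super().__init__()
        self.seq_dense = torch.nn.Linear(length * 4, 3)
        self.img_dense = torch.nn.Linear(image_size * image_size, 3)

    # The image is the first argument after X, so it must come first
    # in any tuple of additional arguments.
    def forward(self, X, image):
        X = X.reshape(X.shape[0], -1)
        image = image.reshape(image.shape[0], -1)
        return self.seq_dense(X) + self.img_dense(image)


torch.manual_seed(0)
seq_model = SequenceAndImage()
X_seq = torch.randn(5, 4, 10)   # random values standing in for one-hot sequences
images = torch.randn(5, 8, 8)   # one image per example

# Additional inputs are collected in a tuple and unpacked in order.
extra_args = (images,)
with torch.no_grad():
    y_multi = seq_model(X_seq, *extra_args)
```

Following the args pattern used throughout this tutorial, the corresponding tangermeme call would be predict(seq_model, X_seq, args=(images,)).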