
Heads

Configuration Classes

pytorch_tabular.models.common.heads.LinearHeadConfig dataclass

A model class for the Linear Head configuration; it serves as a template and as documentation. The models take a dictionary as input, but an exception is raised if the dictionary contains keys that are not defined in this class.

PARAMETER DESCRIPTION
layers

Hyphen-separated number of layers and units in the classification/regression head, e.g. 32-64-32. Default is just a mapping from input dimension to output dimension

TYPE: str DEFAULT: field(default='', metadata={'help': 'Hyphen-separated number of layers and units in the classification/regression head, e.g. 32-64-32. Default is just a mapping from input dimension to output dimension'})

activation

The activation type in the classification head. Any of the default activations in PyTorch, such as ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity

TYPE: str DEFAULT: field(default='ReLU', metadata={'help': 'The activation type in the classification head. Any of the default activations in PyTorch, such as ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity'})

dropout

Probability of a classification element being zeroed.

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Probability of a classification element being zeroed.'})

use_batch_norm

Flag to include a BatchNorm layer after each Linear Layer+DropOut

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Flag to include a BatchNorm layer after each Linear Layer+DropOut'})

initialization

Initialization scheme for the linear layers. Defaults to kaiming. Choices are: [kaiming,xavier,random].

TYPE: str DEFAULT: field(default='kaiming', metadata={'help': 'Initialization scheme for the linear layers. Defaults to `kaiming`', 'choices': ['kaiming', 'xavier', 'random']})
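
The rejection of unknown keys described above can be illustrated with a plain standard-library dataclass. The sketch below is a hypothetical stand-in: `LinearHeadConfigSketch` is not part of pytorch_tabular, it only mirrors the documented fields and defaults, and the library's own validation may raise a different exception type.

```python
from dataclasses import dataclass, asdict


@dataclass
class LinearHeadConfigSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    layers: str = ""
    activation: str = "ReLU"
    dropout: float = 0.0
    use_batch_norm: bool = False
    initialization: str = "kaiming"


# Known keys override the defaults; every other field keeps its default.
head_config = asdict(LinearHeadConfigSketch(layers="32-64-32", dropout=0.1))

# A key that is not a field of the dataclass is rejected at construction time.
try:
    LinearHeadConfigSketch(**{"layers": "32", "n_heads": 4})
    unknown_key_rejected = False
except TypeError:
    unknown_key_rejected = True
```

Here `n_heads` is a made-up key chosen only to trigger the rejection.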

pytorch_tabular.models.common.heads.MixtureDensityHeadConfig dataclass

MixtureDensityHead configuration

PARAMETER DESCRIPTION
num_gaussian

Number of Gaussian Distributions in the mixture model. Defaults to 1

TYPE: int DEFAULT: field(default=1, metadata={'help': 'Number of Gaussian Distributions in the mixture model. Defaults to 1'})

sigma_bias_flag

Whether to have a bias term in the sigma layer. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Whether to have a bias term in the sigma layer. Defaults to False'})

mu_bias_init

To initialize the bias parameter of the mu layer to predefined cluster centers. Should be a list with the same length as number of gaussians in the mixture model. It is highly recommended to set the parameter to combat mode collapse. Defaults to None

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'To initialize the bias parameter of the mu layer to predefined cluster centers. Should be a list with the same length as number of gaussians in the mixture model. It is highly recommended to set the parameter to combat mode collapse. Defaults to None'})

weight_regularization

Whether to apply L1 or L2 Norm to the MDN layers. Defaults to L2. Choices are: [1,2].

TYPE: Optional[int] DEFAULT: field(default=2, metadata={'help': 'Whether to apply L1 or L2 Norm to the MDN layers. Defaults to L2', 'choices': [1, 2]})

lambda_sigma

The regularization constant for weight regularization of sigma layer. Defaults to 0.1

TYPE: Optional[float] DEFAULT: field(default=0.1, metadata={'help': 'The regularization constant for weight regularization of sigma layer. Defaults to 0.1'})

lambda_pi

The regularization constant for weight regularization of pi layer. Defaults to 0.1

TYPE: Optional[float] DEFAULT: field(default=0.1, metadata={'help': 'The regularization constant for weight regularization of pi layer. Defaults to 0.1'})

lambda_mu

The regularization constant for weight regularization of mu layer. Defaults to 0

TYPE: Optional[float] DEFAULT: field(default=0, metadata={'help': 'The regularization constant for weight regularization of mu layer. Defaults to 0'})

softmax_temperature

The temperature to be used in the gumbel softmax of the mixing coefficients. Values less than one lead to sharper transitions between the multiple components. Defaults to 1

TYPE: Optional[float] DEFAULT: field(default=1, metadata={'help': 'The temperature to be used in the gumbel softmax of the mixing coefficients. Values less than one lead to sharper transitions between the multiple components. Defaults to 1'})

n_samples

Number of samples to draw from the posterior to get prediction. Defaults to 100

TYPE: int DEFAULT: field(default=100, metadata={'help': 'Number of samples to draw from the posterior to get prediction. Defaults to 100'})

central_tendency

Which measure to use to get the point prediction. Defaults to mean. Choices are: [mean,median].

TYPE: str DEFAULT: field(default='mean', metadata={'help': 'Which measure to use to get the point prediction. Defaults to mean', 'choices': ['mean', 'median']})

speedup_training

Turning on this parameter does away with sampling during training which speeds up training, but also doesn't give you visibility on train metrics. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': "Turning on this parameter does away with sampling during training which speeds up training, but also doesn't give you visibility on train metrics. Defaults to False"})

log_debug_plot

Turning on this parameter plots histograms of the mu, sigma, and pi layers in addition to the logits (if log_logits is turned on in the experiment config). Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Turning on this parameter plots histograms of the mu, sigma, and pi layers in addition to the logits (if log_logits is turned on in the experiment config). Defaults to False'})

input_dim

The input dimensions to the head. This will be automatically filled in while initializing from the backbone.output_dim

TYPE: int DEFAULT: field(default=None, metadata={'help': 'The input dimensions to the head. This will be automatically filled in while initializing from the `backbone.output_dim`'})
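
To see why a `softmax_temperature` below one sharpens the mixing coefficients, the following sketch applies a plain temperature-scaled softmax to some made-up logits. It is a simplification under stated assumptions: it omits the Gumbel noise mentioned in the parameter description, and `softmax_with_temperature` is a hypothetical helper, not a library function.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Temperature-scaled softmax (no Gumbel noise; illustration only)."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 2.0, 3.0]  # made-up mixing logits for three components
soft = softmax_with_temperature(logits, temperature=1.0)
sharp = softmax_with_temperature(logits, temperature=0.5)
```

With the lower temperature, more of the probability mass concentrates on the component with the largest logit, so `max(sharp)` exceeds `max(soft)`.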

Head Classes

pytorch_tabular.models.common.heads.LinearHead(in_units, output_dim, config, **kwargs)

Bases: Head

Source code in src/pytorch_tabular/models/common/heads/blocks.py
def __init__(self, in_units: int, output_dim: int, config, **kwargs):
    # Linear Layers
    _layers = []
    _curr_units = in_units
    for units in config.layers.split("-"):
        try:
            int(units)
        except ValueError:
            if units == "":
                continue
            else:
                raise ValueError(f"Invalid units {units} in layers {config.layers}")
        _layers.extend(
            _linear_dropout_bn(
                config.activation,
                config.initialization,
                config.use_batch_norm,
                _curr_units,
                int(units),
                config.dropout,
            )
        )
        _curr_units = int(units)
    # Appending Final Output
    _layers.append(nn.Linear(_curr_units, output_dim))
    linear_layers = nn.Sequential(*_layers)
    _initialize_layers(config.activation, config.initialization, linear_layers)
    super().__init__(
        layers=linear_layers,
        config_template=head_config.LinearHeadConfig,
    )
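
The way the constructor above turns the `layers` string into a stack of linear layers can be traced without PyTorch. The sketch below is a minimal, hypothetical helper (`parse_layer_spec` is not part of the library) that follows the same loop but returns only the input and output width of each linear layer instead of building modules.

```python
def parse_layer_spec(layers, in_units, output_dim):
    """Return the (in_features, out_features) pair of each Linear layer."""
    shapes = []
    curr_units = in_units
    for units in layers.split("-"):
        if units == "":
            continue  # an empty spec (the default) adds no hidden layers
        try:
            width = int(units)
        except ValueError:
            raise ValueError(f"Invalid units {units} in layers {layers}") from None
        shapes.append((curr_units, width))
        curr_units = width
    shapes.append((curr_units, output_dim))  # final projection to the output
    return shapes


default_head = parse_layer_spec("", in_units=16, output_dim=1)
deep_head = parse_layer_spec("32-64-32", in_units=16, output_dim=1)
```

With the default empty string, the head reduces to a single mapping from the input dimension to the output dimension, which matches the description of the `layers` parameter.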

pytorch_tabular.models.common.heads.MixtureDensityHead(config, **kwargs)

Bases: nn.Module

Source code in src/pytorch_tabular/models/common/heads/blocks.py
def __init__(self, config: DictConfig, **kwargs):
    self.hparams = config
    super().__init__()
    self._build_network()

gaussian_probability(sigma, mu, target, log=False)

Returns the probability of target given MoG parameters sigma and mu.

PARAMETER DESCRIPTION
sigma

The standard deviation of the Gaussians. B is the batch size, G is the number of Gaussians, and O is the number of dimensions per Gaussian.

TYPE: BxGxO

mu

The means of the Gaussians. B is the batch size, G is the number of Gaussians, and O is the number of dimensions per Gaussian.

TYPE: BxGxO

target

A batch of targets. B is the batch size and I is the number of input dimensions.

TYPE: BxI

RETURNS DESCRIPTION
probabilities

The probability of each point under the distribution at the corresponding sigma/mu index.

TYPE: BxG

Source code in src/pytorch_tabular/models/common/heads/blocks.py
def gaussian_probability(self, sigma, mu, target, log=False):
    """Returns the probability of `target` given MoG parameters `sigma` and `mu`.

    Arguments:
        sigma (BxGxO): The standard deviation of the Gaussians. B is the batch
            size, G is the number of Gaussians, and O is the number of
            dimensions per Gaussian.
        mu (BxGxO): The means of the Gaussians. B is the batch size, G is the
            number of Gaussians, and O is the number of dimensions per Gaussian.
        target (BxI): A batch of targets. B is the batch size and I is the number of
            input dimensions.
    Returns:
        probabilities (BxG): The probability of each point under the
            distribution at the corresponding sigma/mu index.
    """
    target = target.expand_as(sigma)
    if log:
        ret = -torch.log(sigma) - 0.5 * LOG2PI - 0.5 * torch.pow((target - mu) / sigma, 2)
    else:
        ret = (ONEOVERSQRT2PI / sigma) * torch.exp(-0.5 * ((target - mu) / sigma) ** 2)
    return ret  # torch.prod(ret, 2)
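
The two branches above compute the same Gaussian density, once directly and once in log space. The scalar sketch below checks that numerically using only the standard library. It assumes that `LOG2PI` is log(2π) and `ONEOVERSQRT2PI` is 1/sqrt(2π); the listing does not show their definitions, so both values are assumptions here, as are the helper names.

```python
import math

# Assumed values of the constants used in the listing above.
LOG2PI = math.log(2 * math.pi)
ONEOVERSQRT2PI = 1.0 / math.sqrt(2 * math.pi)


def gaussian_pdf(sigma, mu, target):
    """Density of N(mu, sigma^2) at target (the log=False branch)."""
    return (ONEOVERSQRT2PI / sigma) * math.exp(-0.5 * ((target - mu) / sigma) ** 2)


def gaussian_log_pdf(sigma, mu, target):
    """Log-density of N(mu, sigma^2) at target (the log=True branch)."""
    return -math.log(sigma) - 0.5 * LOG2PI - 0.5 * ((target - mu) / sigma) ** 2


density = gaussian_pdf(sigma=2.0, mu=1.0, target=3.0)
log_density = gaussian_log_pdf(sigma=2.0, mu=1.0, target=3.0)
```

Exponentiating `log_density` recovers `density` up to floating-point error. The log form is usually preferred in a loss because it avoids underflow when the target lies far from the mean.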

sample(pi, sigma, mu)

Draw samples from a MoG.

Source code in src/pytorch_tabular/models/common/heads/blocks.py
def sample(self, pi, sigma, mu):
    """Draw samples from a MoG."""
    categorical = Categorical(pi)
    pis = categorical.sample().unsqueeze(1)
    sample = Variable(sigma.data.new(sigma.size(0), 1).normal_())
    # Gathering from the n Gaussian Distribution based on sampled indices
    sample = sample * sigma.gather(1, pis) + mu.gather(1, pis)
    return sample
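
The sampling step above can be sketched for a single one-dimensional mixture with the standard library: pick a component according to the mixing coefficients, then scale and shift standard-normal noise by that component's sigma and mu. The helper `sample_mog` and the mixture parameters are made up for illustration. Reducing the draws with a mean or a median is an assumption about how `n_samples` and `central_tendency` are combined, based only on their descriptions.

```python
import random
import statistics


def sample_mog(pi, sigma, mu, rng):
    """Draw one sample from a 1-D mixture of Gaussians."""
    # Pick a component index according to the mixing coefficients ...
    k = rng.choices(range(len(pi)), weights=pi, k=1)[0]
    # ... then scale and shift standard-normal noise by that component.
    return rng.gauss(0.0, 1.0) * sigma[k] + mu[k]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
pi, sigma, mu = [0.3, 0.7], [0.5, 0.5], [0.0, 10.0]  # made-up mixture
draws = [sample_mog(pi, sigma, mu, rng) for _ in range(20000)]

mean_prediction = statistics.fmean(draws)     # analogous to central_tendency="mean"
median_prediction = statistics.median(draws)  # analogous to central_tendency="median"
```

For a well-separated bimodal mixture like this one, the mean lands between the two modes while the median stays inside the heavier mode, which is one reason the choice of central tendency matters.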