Self-Supervised Models
Configuration Classes
pytorch_tabular.ssl_models.DenoisingAutoEncoderConfig
dataclass
Bases: SSLModelConfig
DeNoising AutoEncoder configuration.
PARAMETER | DESCRIPTION |
---|---|
noise_strategy | Defines the kind of noise introduced into the samples. |
noise_probabilities | Dict of per-feature probabilities of corrupting the input features with swap/zero noise. Keys should be feature names; any feature missing from the dict uses default_noise_probability. Defaults to an empty dict(). |
default_noise_probability | Default probability of corrupting an input feature with swap/zero noise, used for features for which noise_probabilities does not define a probability. Defaults to 0.8. |
loss_type_weights | Weights for the loss function, in the order [binary, categorical, numerical]. If None, default weights are computed with a formula; e.g., the default weight for binary is n_binary/n_features. Defaults to None. |
mask_loss_weight | Weight for the loss function for the masked features. Defaults to 1.0. |
max_onehot_cardinality | Maximum cardinality of one-hot encoded categorical features. Any categorical feature with cardinality > max_onehot_cardinality is embedded in a learned embedding space; the others are converted to a one-hot representation. If set to 0, the embedding strategy is used for all categorical features. Defaults to 4. |
encoder_config | The config of the encoder to be used for the model. Should be one of the model configs defined in PyTorch Tabular. |
decoder_config | The config of the decoder to be used for the model. Should be one of the model configs defined in PyTorch Tabular. Defaults to nn.Identity. |
embedding_dims | The dimensions of the embedding for each categorical column, as a list of tuples (cardinality, embedding_dim). If left empty, they are inferred from the cardinality x of each categorical column using the rule min(50, (x + 1) // 2). |
embedding_dropout | Dropout to be applied to the categorical embedding. Defaults to 0.1. |
batch_norm_continuous_input | If True, the continuous layer is normalized by passing it through a BatchNorm layer. DEPRECATED - Use head and head_config instead. |
learning_rate | The learning rate of the model. Defaults to 1e-3. |
seed | The seed for reproducibility. Defaults to 42. |
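The defaults described in the table can be illustrated with a small, library-independent sketch. It is not the PyTorch Tabular implementation: the helper names (`resolve_noise_probability`, `default_embedding_dim`, `categorical_encoding`, `corrupt_rows`), the row-of-dicts data layout, and the exact swap and mask semantics are assumptions made for illustration. Only the rules themselves (fallback to default_noise_probability, min(50, (x + 1) // 2), and the max_onehot_cardinality threshold) come from the parameter descriptions.

```python
import random


def resolve_noise_probability(feature, noise_probabilities, default_noise_probability=0.8):
    """Per-feature probability, falling back to the default when the key is missing."""
    return noise_probabilities.get(feature, default_noise_probability)


def default_embedding_dim(cardinality):
    """Inferred embedding dimension: min(50, (x + 1) // 2)."""
    return min(50, (cardinality + 1) // 2)


def categorical_encoding(cardinality, max_onehot_cardinality=4):
    """Return 'embedding' when cardinality exceeds the threshold, else 'onehot'.

    A threshold of 0 therefore sends every categorical feature to an embedding.
    """
    return "embedding" if cardinality > max_onehot_cardinality else "onehot"


def corrupt_rows(rows, noise_strategy, noise_probabilities=None,
                 default_noise_probability=0.8, seed=42):
    """Corrupt a list of {feature: value} rows with zero or swap noise.

    Assumed semantics (hypothetical, for illustration only):
      - 'zero' replaces a selected value with 0;
      - 'swap' replaces it with the same feature's value from a random row.
    Returns the corrupted rows and a mask marking which cells were selected.
    """
    if noise_strategy not in ("zero", "swap"):
        raise ValueError(f"unknown noise_strategy: {noise_strategy!r}")
    noise_probabilities = noise_probabilities or {}
    rng = random.Random(seed)
    corrupted, masks = [], []
    for row in rows:
        new_row, mask = {}, {}
        for feature, value in row.items():
            p = resolve_noise_probability(
                feature, noise_probabilities, default_noise_probability
            )
            selected = rng.random() < p
            if selected and noise_strategy == "zero":
                new_row[feature] = 0
            elif selected:
                new_row[feature] = rng.choice(rows)[feature]
            else:
                new_row[feature] = value
            mask[feature] = selected
        corrupted.append(new_row)
        masks.append(mask)
    return corrupted, masks


rows = [{"age": 31, "income": 52000}, {"age": 47, "income": 81000}]
# "age" is never corrupted; "income" falls back to the default probability.
noisy, mask = corrupt_rows(rows, "swap", noise_probabilities={"age": 0.0})
```

In this sketch the mask records which cells were selected for corruption, which is the kind of signal a mask-prediction loss (weighted by mask_loss_weight) could be trained against; how the library actually builds its mask is not stated in this section.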
Model Classes
pytorch_tabular.ssl_models.DenoisingAutoEncoderModel(config, **kwargs)
Bases: SSLBaseModel
Source code in src/pytorch_tabular/ssl_models/dae/dae.py
Base Model Class
pytorch_tabular.ssl_models.SSLBaseModel(config, mode='pretrain', encoder=None, decoder=None, custom_optimizer=None, custom_optimizer_params={}, **kwargs)
Bases: pl.LightningModule