Supervised Models

Configuration Classes

pytorch_tabular.models.AutoIntConfig dataclass

Bases: ModelConfig

AutomaticFeatureInteraction (AutoInt) configuration

PARAMETER DESCRIPTION
attn_embed_dim

The number of hidden units in the Multi-Headed Attention layers. Defaults to 32

TYPE: int DEFAULT: field(default=32, metadata={'help': 'The number of hidden units in the Multi-Headed Attention layers. Defaults to 32'})

num_heads

The number of heads in the Multi-Headed Attention layer. Defaults to 2

TYPE: int DEFAULT: field(default=2, metadata={'help': 'The number of heads in the Multi-Headed Attention layer. Defaults to 2'})

num_attn_blocks

The number of stacked Multi-Headed Attention layers. Defaults to 3

TYPE: int DEFAULT: field(default=3, metadata={'help': 'The number of layers of stacked Multi-Headed Attention layers. Defaults to 2'})

attn_dropouts

Dropout between layers of Multi-Headed Attention Layers. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout between layers of Multi-Headed Attention Layers. Defaults to 0.0'})

has_residuals

Flag to add a residual connection from the embedded output to the attention layer output. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'Flag to have a residual connect from enbedded output to attention layer output. Defaults to True'})

embedding_dim

The dimensions of the embedding for continuous and categorical columns. Defaults to 16

TYPE: int DEFAULT: field(default=16, metadata={'help': 'The dimensions of the embedding for continuous and categorical columns. Defaults to 16'})

embedding_initialization

Initialization scheme for the embedding layers. Defaults to kaiming_uniform. Choices are: [kaiming_uniform,kaiming_normal].

TYPE: Optional[str] DEFAULT: field(default='kaiming_uniform', metadata={'help': 'Initialization scheme for the embedding layers. Defaults to `kaiming`', 'choices': ['kaiming_uniform', 'kaiming_normal']})

embedding_bias

Flag to turn on Embedding Bias. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'Flag to turn on Embedding Bias. Defaults to True'})

share_embedding

The flag turns on shared embeddings in the input embedding process. The key idea here is to have an embedding for the feature as a whole along with embeddings of each unique value of that column. For more details refer to Appendix A of the TabTransformer paper. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'The flag turns on shared embeddings in the input embedding process. The key idea here is to have an embedding for the feature as a whole along with embeddings of each unique values of that column. For more details refer to Appendix A of the TabTransformer paper. Defaults to False'})

share_embedding_strategy

There are two strategies for adding shared embeddings. 1. add - A separate embedding for the feature is added to the embedding of the unique values of the feature. 2. fraction - A fraction of the input embedding is reserved for the shared embedding of the feature. Defaults to fraction. Choices are: [add,fraction].

TYPE: Optional[str] DEFAULT: field(default='fraction', metadata={'help': 'There are two strategies in adding shared embeddings. 1. `add` - A separate embedding for the feature is added to the embedding of the unique values of the feature. 2. `fraction` - A fraction of the input embedding is reserved for the shared embedding of the feature. Defaults to fraction.', 'choices': ['add', 'fraction']})

shared_embedding_fraction

Fraction of the input_embed_dim to be reserved by the shared embedding. Should be less than one. Defaults to 0.25

TYPE: float DEFAULT: field(default=0.25, metadata={'help': 'Fraction of the input_embed_dim to be reserved by the shared embedding. Should be less than one. Defaults to 0.25'})

deep_layers

Flag to enable a deep MLP layer before the Multi-Headed Attention layer. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Flag to enable a deep MLP layer before the Multi-Headed Attention layer. Defaults to False'})

layers

Hyphen-separated number of layers and units in the deep MLP. Defaults to 128-64-32

TYPE: str DEFAULT: field(default='128-64-32', metadata={'help': 'Hyphen-separated number of layers and units in the deep MLP. Defaults to 128-64-32'})

activation

The activation type in the deep MLP. Any of the default activations in PyTorch like ReLU, TanH, LeakyReLU, etc. (https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity). Defaults to ReLU

TYPE: str DEFAULT: field(default='ReLU', metadata={'help': 'The activation type in the deep MLP. The default activaion in PyTorch like ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity. Defaults to ReLU'})

use_batch_norm

Flag to include a BatchNorm layer after each Linear Layer + Dropout in the deep MLP. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Flag to include a BatchNorm layer after each Linear Layer+DropOut in the deep MLP. Defaults to False'})

initialization

Initialization scheme for the linear layers in the deep MLP. Defaults to kaiming. Choices are: [kaiming,xavier,random].

TYPE: str DEFAULT: field(default='kaiming', metadata={'help': 'Initialization scheme for the linear layers in the deep MLP. Defaults to `kaiming`', 'choices': ['kaiming', 'xavier', 'random']})

dropout

Probability of an element to be zeroed in the deep MLP. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'probability of an element to be zeroed in the deep MLP. Defaults to 0.0'})

attention_pooling

If True, will combine the attention outputs of each block for final prediction. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'If True, will combine the attention outputs of each block for final prediction. Defaults to False'})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; it is mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well and for simplicity we are only using multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
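
Below is a minimal sketch of wiring this config into a TabularModel. The column names and the DataConfig/TrainerConfig values are placeholders, not canonical defaults:

```python
from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, OptimizerConfig, TrainerConfig
from pytorch_tabular.models import AutoIntConfig

# Placeholder schema -- replace with your dataset's columns.
data_config = DataConfig(
    target=["target"],
    continuous_cols=["num_a", "num_b"],
    categorical_cols=["cat_a"],
)
model_config = AutoIntConfig(
    task="classification",
    attn_embed_dim=32,  # hidden units in the Multi-Headed Attention layers
    num_heads=2,
    num_attn_blocks=3,  # number of stacked attention layers
)
tabular_model = TabularModel(
    data_config=data_config,
    model_config=model_config,
    optimizer_config=OptimizerConfig(),
    trainer_config=TrainerConfig(max_epochs=10),
)
# tabular_model.fit(train=train_df)  # train_df: a pandas DataFrame with the columns above
```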

pytorch_tabular.models.CategoryEmbeddingModelConfig dataclass

Bases: ModelConfig

CategoryEmbeddingModel configuration

PARAMETER DESCRIPTION
layers

Hyphen-separated number of layers and units in the classification head, e.g. 32-64-32. Defaults to 128-64-32

TYPE: str DEFAULT: field(default='128-64-32', metadata={'help': 'Hyphen-separated number of layers and units in the classification head. eg. 32-64-32. Defaults to 128-64-32'})

activation

The activation type in the classification head. Any of the default activations in PyTorch like ReLU, TanH, LeakyReLU, etc. (https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity). Defaults to ReLU

TYPE: str DEFAULT: field(default='ReLU', metadata={'help': 'The activation type in the classification head. The default activaion in PyTorch like ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity. Defaults to ReLU'})

use_batch_norm

Flag to include a BatchNorm layer after each Linear Layer + Dropout. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Flag to include a BatchNorm layer after each Linear Layer+DropOut. Defaults to False'})

initialization

Initialization scheme for the linear layers. Defaults to kaiming. Choices are: [kaiming,xavier,random].

TYPE: str DEFAULT: field(default='kaiming', metadata={'help': 'Initialization scheme for the linear layers. Defaults to `kaiming`', 'choices': ['kaiming', 'xavier', 'random']})

dropout

Probability of a classification element to be zeroed. This is added to each linear layer. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'probability of an classification element to be zeroed. This is added to each linear layer. Defaults to 0.0'})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; it is mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well and for simplicity we are only using multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
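
As a quick sketch (the values are illustrative, not recommendations), the MLP shape and regularization come straight from these fields:

```python
from pytorch_tabular.models import CategoryEmbeddingModelConfig

model_config = CategoryEmbeddingModelConfig(
    task="regression",
    layers="128-64-32",      # three linear layers with 128, 64 and 32 units
    activation="LeakyReLU",  # any torch.nn activation class name
    use_batch_norm=True,     # BatchNorm after each Linear Layer + Dropout
    dropout=0.1,
)
# Pass to TabularModel exactly as in the AutoIntConfig example above.
```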

pytorch_tabular.models.FTTransformerConfig dataclass

Bases: ModelConfig

FT Transformer configuration

PARAMETER DESCRIPTION
input_embed_dim

The embedding dimension for the input categorical features. Defaults to 32

TYPE: int DEFAULT: field(default=32, metadata={'help': 'The embedding dimension for the input categorical features. Defaults to 32'})

embedding_initialization

Initialization scheme for the embedding layers. Defaults to kaiming_uniform. Choices are: [kaiming_uniform,kaiming_normal].

TYPE: Optional[str] DEFAULT: field(default='kaiming_uniform', metadata={'help': 'Initialization scheme for the embedding layers. Defaults to `kaiming`', 'choices': ['kaiming_uniform', 'kaiming_normal']})

embedding_bias

Flag to turn on Embedding Bias. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'Flag to turn on Embedding Bias. Defaults to True'})

share_embedding

The flag turns on shared embeddings in the input embedding process. The key idea here is to have an embedding for the feature as a whole along with embeddings of each unique value of that column. For more details refer to Appendix A of the TabTransformer paper. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'The flag turns on shared embeddings in the input embedding process. The key idea here is to have an embedding for the feature as a whole along with embeddings of each unique values of that column. For more details refer to Appendix A of the TabTransformer paper. Defaults to False'})

share_embedding_strategy

There are two strategies for adding shared embeddings. 1. add - A separate embedding for the feature is added to the embedding of the unique values of the feature. 2. fraction - A fraction of the input embedding is reserved for the shared embedding of the feature. Defaults to fraction. Choices are: [add,fraction].

TYPE: Optional[str] DEFAULT: field(default='fraction', metadata={'help': 'There are two strategies in adding shared embeddings. 1. `add` - A separate embedding for the feature is added to the embedding of the unique values of the feature. 2. `fraction` - A fraction of the input embedding is reserved for the shared embedding of the feature. Defaults to fraction.', 'choices': ['add', 'fraction']})

shared_embedding_fraction

Fraction of the input_embed_dim to be reserved by the shared embedding. Should be less than one. Defaults to 0.25

TYPE: float DEFAULT: field(default=0.25, metadata={'help': 'Fraction of the input_embed_dim to be reserved by the shared embedding. Should be less than one. Defaults to 0.25'})

attn_feature_importance

Flag to save the attention weights so that feature importance can be computed. If you are facing memory issues, turn this off to avoid saving the attention weights. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If you are facing memory issues, you can turn off feature importance which will not save the attention weights. Defaults to True'})

num_heads

The number of heads in the Multi-Headed Attention layer. Defaults to 8

TYPE: int DEFAULT: field(default=8, metadata={'help': 'The number of heads in the Multi-Headed Attention layer. Defaults to 8'})

num_attn_blocks

The number of layers of stacked Multi-Headed Attention layers. Defaults to 6

TYPE: int DEFAULT: field(default=6, metadata={'help': 'The number of layers of stacked Multi-Headed Attention layers. Defaults to 6'})

transformer_head_dim

The number of hidden units in the Multi-Headed Attention layers. Defaults to None, in which case it is the same as input_dim.

TYPE: Optional[int] DEFAULT: field(default=None, metadata={'help': 'The number of hidden units in the Multi-Headed Attention layers. Defaults to None and will be same as input_dim.'})

attn_dropout

Dropout to be applied after the Multi-Headed Attention. Defaults to 0.1

TYPE: float DEFAULT: field(default=0.1, metadata={'help': 'Dropout to be applied after Multi headed Attention. Defaults to 0.1'})

add_norm_dropout

Dropout to be applied in the AddNorm Layer. Defaults to 0.1

TYPE: float DEFAULT: field(default=0.1, metadata={'help': 'Dropout to be applied in the AddNorm Layer. Defaults to 0.1'})

ff_dropout

Dropout to be applied in the Positionwise FeedForward Network. Defaults to 0.1

TYPE: float DEFAULT: field(default=0.1, metadata={'help': 'Dropout to be applied in the Positionwise FeedForward Network. Defaults to 0.1'})

ff_hidden_multiplier

Multiplier by which the Positionwise FF layer scales the input. Defaults to 4

TYPE: int DEFAULT: field(default=4, metadata={'help': 'Multiple by which the Positionwise FF layer scales the input. Defaults to 4'})

transformer_activation

The activation type in the transformer feed forward layers. In addition to the default activations in PyTorch like ReLU, TanH, LeakyReLU, etc. (https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity), GEGLU, ReGLU and SwiGLU are also implemented (https://arxiv.org/pdf/2002.05202.pdf). Defaults to GEGLU

TYPE: str DEFAULT: field(default='GEGLU', metadata={'help': 'The activation type in the transformer feed forward layers. In addition to the default activation in PyTorch like ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity, GEGLU, ReGLU and SwiGLU are also implemented (https://arxiv.org/pdf/2002.05202.pdf). Defaults to GEGLU'})

out_ff_layers

DEPRECATED: Hyphen-separated number of layers and units in the deep MLP. Defaults to 128-64-32

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: Hyphen-separated number of layers and units in the deep MLP. Defaults to 128-64-32'})

out_ff_activation

DEPRECATED: The activation type in the deep MLP. Any of the default activations in PyTorch like ReLU, TanH, LeakyReLU, etc. (https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity). Defaults to ReLU

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: The activation type in the deep MLP. The default activaion in PyTorch like ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity. Defaults to ReLU'})

out_ff_dropout

DEPRECATED: Probability of a classification element to be zeroed in the deep MLP. Defaults to 0.0

TYPE: Optional[float] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: probability of an classification element to be zeroed in the deep MLP. Defaults to 0.0'})

out_ff_initialization

DEPRECATED: Initialization scheme for the linear layers. Defaults to kaiming. Choices are: [None,kaiming,xavier,random].

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: Initialization scheme for the linear layers. Defaults to `kaiming`', 'choices': [None, 'kaiming', 'xavier', 'random']})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; it is mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well and for simplicity we are only using multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
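
A minimal sketch of a typical FT Transformer setup; the values simply restate the defaults documented above:

```python
from pytorch_tabular.models import FTTransformerConfig

model_config = FTTransformerConfig(
    task="classification",
    input_embed_dim=32,            # embedding width of each feature token
    num_heads=8,                   # attention heads per block
    num_attn_blocks=6,             # stacked attention blocks
    transformer_activation="GEGLU",
    attn_feature_importance=True,  # keep attention weights; turn off to save memory
)
# Pass to TabularModel exactly as in the AutoIntConfig example above.
```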

pytorch_tabular.models.GANDALFConfig dataclass

Bases: ModelConfig

Gated Adaptive Network for Deep Automated Learning of Features (GANDALF) Config.

PARAMETER DESCRIPTION
gflu_stages

Number of stages in the feature abstraction (GFLU) layer. Defaults to 6

TYPE: int DEFAULT: field(default=6, metadata={'help': 'Number of layers in the feature abstraction layer. Defaults to 6'})

gflu_dropout

Dropout rate for the feature abstraction layer. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout rate for the feature abstraction layer. Defaults to 0.0'})

gflu_feature_init_sparsity

Only valid for t-softmax. The percentage of features to be selected in each GFLU stage. This is only the initialization; it may change during learning. Defaults to 0.3

TYPE: float DEFAULT: field(default=0.3, metadata={'help': 'Only valid for t-softmax. The perecentge of features to be selected in each GFLU stage. This is just initialized and during learning it may change'})

learnable_sparsity

Only valid for t-softmax. If True, the sparsity parameters will be learned. If False, the sparsity parameters will be fixed to the initial values specified in gflu_feature_init_sparsity and tree_feature_init_sparsity. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'Only valid for t-softmax. If True, the sparsity parameters will be learned.If False, the sparsity parameters will be fixed to the initial values specified in `gflu_feature_init_sparsity` and `tree_feature_init_sparsity`'})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; it is mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well and for simplicity we are only using multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
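
A hedged sketch of a GANDALF config; the values are illustrative rather than tuned recommendations:

```python
from pytorch_tabular.models import GANDALFConfig

model_config = GANDALFConfig(
    task="classification",
    gflu_stages=6,                   # depth of the feature abstraction layer
    gflu_dropout=0.1,
    gflu_feature_init_sparsity=0.3,  # start by selecting ~30% of features per stage
    learnable_sparsity=True,         # let the sparsity parameters adapt during training
)
# Pass to TabularModel exactly as in the AutoIntConfig example above.
```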

pytorch_tabular.models.GatedAdditiveTreeEnsembleConfig dataclass

Bases: ModelConfig

Gated Additive Tree Ensemble Config.

PARAMETER DESCRIPTION
gflu_stages

Number of stages in the feature abstraction (GFLU) layer. Defaults to 6

TYPE: int DEFAULT: field(default=6, metadata={'help': 'Number of layers in the feature abstraction layer. Defaults to 6'})

gflu_dropout

Dropout rate for the feature abstraction layer. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout rate for the feature abstraction layer. Defaults to 0.0'})

tree_depth

Depth of the tree. Defaults to 4

TYPE: int DEFAULT: field(default=4, metadata={'help': 'Depth of the tree. Defaults to 5'})

num_trees

Number of trees to use in the ensemble. Defaults to 10

TYPE: int DEFAULT: field(default=10, metadata={'help': 'Number of trees to use in the ensemble. Defaults to 20'})

binning_activation

The binning function to use. Defaults to sparsemoid. Choices are: [entmoid,sparsemoid,sigmoid].

TYPE: str DEFAULT: field(default='sparsemoid', metadata={'help': 'The binning function to use. Defaults to entmoid. Defaults to entmoid', 'choices': ['entmoid', 'sparsemoid', 'sigmoid']})

feature_mask_function

The feature mask function to use. Defaults to t-softmax. Choices are: [entmax,sparsemax,softmax,t-softmax].

TYPE: str DEFAULT: field(default='t-softmax', metadata={'help': 'The feature mask function to use. Defaults to entmax', 'choices': ['entmax', 'sparsemax', 'softmax', 't-softmax']})

tree_dropout

Probability of dropout in the tree binning transformation. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'probability of dropout in tree binning transformation. Defaults to 0.0'})

chain_trees

If True, the trees are chained together, which is analogous to boosting; if False, they are kept in parallel, which is analogous to bagging. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will chain the trees together. Synonymous to boosting (chaining trees) or bagging (parallel trees). Defaults to True'})

tree_wise_attention

If True, we will use tree-wise attention to combine trees. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will use tree wise attention to combine trees. Defaults to True'})

tree_wise_attention_dropout

Probability of dropout in the tree-wise attention layer. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'probability of dropout in the tree wise attention layer. Defaults to 0.0'})

share_head_weights

If True, we will share the weights between the heads. Defaults to True

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will share the weights between the heads. Defaults to True'})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; it is mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well and for simplicity we are only using multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
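
A minimal sketch of a Gated Additive Tree Ensemble config, restating the field defaults documented above:

```python
from pytorch_tabular.models import GatedAdditiveTreeEnsembleConfig

model_config = GatedAdditiveTreeEnsembleConfig(
    task="regression",
    tree_depth=4,
    num_trees=10,
    chain_trees=True,                  # boosting-like chaining; False gives bagging-like parallel trees
    binning_activation="sparsemoid",
    feature_mask_function="t-softmax",
)
# Pass to TabularModel exactly as in the AutoIntConfig example above.
```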

pytorch_tabular.models.MDNConfig dataclass

Bases: ModelConfig

MDN (Mixture Density Network) configuration

PARAMETER DESCRIPTION
backbone_config_class

The config class for defining the Backbone. The config class should be a valid module path from models. e.g. FTTransformerConfig

TYPE: str DEFAULT: field(default=None, metadata={'help': 'The config class for defining the Backbone. The config class should be a valid module path from `models`. e.g. `FTTransformerConfig`'})

backbone_config_params

The dict of config parameters for defining the Backbone.

TYPE: Dict DEFAULT: field(default=None, metadata={'help': 'The dict of config parameters for defining the Backbone.'})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; it is mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

TYPE: str DEFAULT: field(init=False, default='MixtureDensityHead')

head_config

The config for defining the Mixture Density Network Head

TYPE: Dict DEFAULT: field(default=None, metadata={'help': 'The config for defining the Mixed Density Network Head'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well and for simplicity we are only using multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
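
A sketch of composing an MDN on top of a backbone. The backbone_config_params values are illustrative, and num_gaussian is assumed to be the MixtureDensityHead option controlling the number of mixture components:

```python
from pytorch_tabular.models import MDNConfig

model_config = MDNConfig(
    task="regression",
    backbone_config_class="CategoryEmbeddingModelConfig",  # any ModelConfig class name from `models`
    backbone_config_params={"task": "backbone", "layers": "128-64-32"},  # illustrative values
    head_config={"num_gaussian": 3},  # assumed option: number of mixture components
)
# Pass to TabularModel exactly as in the AutoIntConfig example above.
```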

pytorch_tabular.models.NodeConfig dataclass

Bases: ModelConfig

NODE (Neural Oblivious Decision Ensembles) configuration

PARAMETER DESCRIPTION
num_layers

Number of Oblivious Decision Tree Layers in the Dense Architecture

TYPE: int DEFAULT: field(default=1, metadata={'help': 'Number of Oblivious Decision Tree Layers in the Dense Architecture'})

num_trees

Number of Oblivious Decision Trees in each layer

TYPE: int DEFAULT: field(default=2048, metadata={'help': 'Number of Oblivious Decision Trees in each layer'})

additional_tree_output_dim

The additional output dimensions which are only used to pass information through the different layers of the architecture. Only the first output_dim outputs will be used for prediction

TYPE: int DEFAULT: field(default=3, metadata={'help': 'The additional output dimensions which is only used to pass through different layers of the architectures. Only the first output_dim outputs will be used for prediction'})

depth

The depth of the individual Oblivious Decision Trees

TYPE: int DEFAULT: field(default=6, metadata={'help': 'The depth of the individual Oblivious Decision Trees'})

choice_function

Generates a sparse probability distribution to be used as feature weights (aka soft feature selection). Choices are: [entmax15,sparsemax].

TYPE: str DEFAULT: field(default='entmax15', metadata={'help': 'Generates a sparse probability distribution to be used as feature weights(aka, soft feature selection)', 'choices': ['entmax15', 'sparsemax']})

bin_function

Generates a sparse probability distribution to be used as tree leaf weights. Choices are: [entmoid15,sparsemoid].

TYPE: str DEFAULT: field(default='entmoid15', metadata={'help': 'Generates a sparse probability distribution to be used as tree leaf weights', 'choices': ['entmoid15', 'sparsemoid']})

max_features

If not None, sets a max limit on the number of features to be carried forward from layer to layer in the Dense Architecture

TYPE: Optional[int] DEFAULT: field(default=None, metadata={'help': 'If not None, sets a max limit on the number of features to be carried forward from layer to layer in the Dense Architecture'})

input_dropout

Dropout to be applied to the inputs between layers of the Dense Architecture

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the inputs between layers of the Dense Architecture'})

initialize_response

Initializing the response variable in the Oblivious Decision Trees. By default, it is a standard normal distribution. Choices are: [normal,uniform].

TYPE: str DEFAULT: field(default='normal', metadata={'help': 'Initializing the response variable in the Oblivious Decision Trees. By default, it is a standard normal distribution', 'choices': ['normal', 'uniform']})

initialize_selection_logits

Initializing the feature selector. By default, it is a uniform distribution across the features. Choices are: [uniform,normal].

TYPE: str DEFAULT: field(default='uniform', metadata={'help': 'Initializing the feature selector. By default is a uniform distribution across the features', 'choices': ['uniform', 'normal']})

threshold_init_beta

Used in the data-aware initialization of thresholds, where each threshold is initialized randomly (via a beta distribution) to a feature value in the first batch. A threshold is initialized to the q-th quantile of the data points, where q ~ Beta(threshold_init_beta, threshold_init_beta). If this parameter is set to 1, initial thresholds will have the same distribution as the data points. If greater than 1 (e.g. 10), thresholds will be closer to the median data value. If less than 1 (e.g. 0.1), thresholds will approach the min/max data values.

TYPE: float DEFAULT: field(default=1.0, metadata={'help': '\n Used in the Data-aware initialization of thresholds where the threshold is initialized randomly\n (with a beta distribution) to feature values in the first batch.\n It initializes threshold to a q-th quantile of data points.\n where q ~ Beta(:threshold_init_beta:, :threshold_init_beta:)\n If this param is set to 1, initial thresholds will have the same distribution as data points\n If greater than 1 (e.g. 10), thresholds will be closer to median data value\n If less than 1 (e.g. 0.1), thresholds will approach min/max data values.\n '})

threshold_init_cutoff

Used in the data-aware initialization of the scales in the scaling ODTs. The scales are initialized so that all the samples in the first batch belong to the linear region of the entmoid/sparsemoid (bin-selectors) and thereby have non-zero gradients. This is the threshold log-temperature initializer, in (0, inf). By default (1.0), log-temperatures are initialized so that all bin selectors end up in the linear region of the sparse-sigmoid; the temperatures are then scaled by this parameter. Setting this value > 1.0 will result in some margin between data points and the sparse-sigmoid cutoff value. Setting this value < 1.0 will cause a (1 - value) fraction of data points to end up in the flat sparse-sigmoid region; for instance, threshold_init_cutoff = 0.9 will set 10% of points equal to 0.0 or 1.0. All points will be between (0.5 - 0.5 / threshold_init_cutoff) and (0.5 + 0.5 / threshold_init_cutoff).

TYPE: float DEFAULT: field(default=1.0, metadata={'help': '\n Used in the Data-aware initialization of scales(used in the scaling ODTs).\n It is initialized in such a way that all the samples in the first batch belong to the linear\n region of the entmoid/sparsemoid(bin-selectors) and thereby have non-zero gradients\n Threshold log-temperatures initializer, in (0, inf)\n By default(1.0), log-temperatures are initialized in such a way that all bin selectors\n end up in the linear region of sparse-sigmoid. The temperatures are then scaled by this parameter.\n Setting this value > 1.0 will result in some margin between data points and sparse-sigmoid cutoff value\n Setting this value < 1.0 will cause (1 - value) part of data points to end up in flat sparse-sigmoid\n region. For instance, threshold_init_cutoff = 0.9 will set 10% points equal to 0.0 or 1.0\n Setting this value > 1.0 will result in a margin between data points and sparse-sigmoid cutoff value\n All points will be between (0.5 - 0.5 / threshold_init_cutoff) and (0.5 + 0.5 / threshold_init_cutoff)\n '})

cat_embedding_dropout

DEPRECATED: Please use embedding_dropout instead. Probability of an embedding element to be zeroed.

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'DEPRECATED: Please use `embedding_dropout` instead. probability of an embedding element to be zeroed.'})

embed_categorical

Flag to embed categorical columns using an Embedding Layer. If turned off, the categorical columns are encoded using LeaveOneOutEncoder. This is DEPRECATED and will always be True from the next release.

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Flag to embed categorical columns using an Embedding Layer. If turned off, the categorical columns are encoded using LeaveOneOutEncoder. This is DEPRECATED and will always be `True` from next release.'})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, the dimension is inferred from the cardinality x of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure of what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well, and for simplicity we only use multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for regression problems. If left empty, no restrictions are applied

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
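
To make the data-aware threshold initialization above concrete, here is a minimal sketch (not the library's internal code; the function name and shapes are illustrative assumptions) of drawing one threshold per feature as a Beta-distributed quantile of the first batch:

import torch

def init_thresholds(x: torch.Tensor, threshold_init_beta: float = 1.0) -> torch.Tensor:
    # x: (batch, n_features) feature values from the first batch.
    # Each threshold is the q-th quantile of its feature, with
    # q ~ Beta(b, b): b = 1 spreads thresholds like the data,
    # b >> 1 pulls them towards the median, b << 1 towards min/max.
    b = torch.tensor(threshold_init_beta)
    q = torch.distributions.Beta(b, b).sample((x.shape[1],))
    return torch.stack([torch.quantile(x[:, j], q[j]) for j in range(x.shape[1])])

# thresholds = init_thresholds(first_batch_features, threshold_init_beta=10.0)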

pytorch_tabular.models.TabNetModelConfig dataclass

Bases: ModelConfig

TabNet model configuration

PARAMETER DESCRIPTION
n_d

Dimension of the prediction layer (usually between 4 and 64)

TYPE: int DEFAULT: field(default=8, metadata={'help': 'Dimension of the prediction layer (usually between 4 and 64)'})

n_a

Dimension of the attention layer (usually between 4 and 64)

TYPE: int DEFAULT: field(default=8, metadata={'help': 'Dimension of the attention layer (usually between 4 and 64)'})

n_steps

Number of successive steps in the network (usually between 3 and 10)

TYPE: int DEFAULT: field(default=3, metadata={'help': 'Number of sucessive steps in the newtork (usually betwenn 3 and 10)'})

gamma

Float above 1, scaling factor for attention updates (usually between 1.0 and 2.0)

TYPE: float DEFAULT: field(default=1.3, metadata={'help': 'Float above 1, scaling factor for attention updates (usually betwenn 1.0 to 2.0)'})

n_independent

Number of independent GLU layers in each GLU block (default 2)

TYPE: int DEFAULT: field(default=2, metadata={'help': 'Number of independent GLU layer in each GLU block (default 2)'})

n_shared

Number of shared GLU layers in each GLU block (default 2)

TYPE: int DEFAULT: field(default=2, metadata={'help': 'Number of independent GLU layer in each GLU block (default 2)'})

virtual_batch_size

Batch size for Ghost Batch Normalization

TYPE: int DEFAULT: field(default=128, metadata={'help': 'Batch size for Ghost Batch Normalization'})

mask_type

Either 'sparsemax' or 'entmax': the masking function to use. Choices are: [sparsemax,entmax].

TYPE: str DEFAULT: field(default='sparsemax', metadata={'help': "Either 'sparsemax' or 'entmax' : this is the masking function to use", 'choices': ['sparsemax', 'entmax']})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, the dimension is inferred from the cardinality x of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure of what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well, and for simplicity we only use multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for regression problems. If left empty, no restrictions are applied

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
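
A minimal usage sketch for TabNetModelConfig in the standard TabularModel workflow (column names and hyper-parameter values below are placeholders, not recommendations):

from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, OptimizerConfig, TrainerConfig
from pytorch_tabular.models import TabNetModelConfig

# Replace the column names with those of your own dataframe.
data_config = DataConfig(
    target=["target"],
    continuous_cols=["num_a", "num_b"],
    categorical_cols=["cat_a"],
)
model_config = TabNetModelConfig(
    task="classification",
    n_d=16,              # width of the prediction layer
    n_a=16,              # width of the attention layer
    n_steps=4,           # number of successive steps
    mask_type="entmax",
)
tabular_model = TabularModel(
    data_config=data_config,
    model_config=model_config,
    optimizer_config=OptimizerConfig(),
    trainer_config=TrainerConfig(max_epochs=10),
)
# tabular_model.fit(train=train_df)  # train_df: a pandas DataFrame with the columns above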

pytorch_tabular.models.TabTransformerConfig dataclass

Bases: ModelConfig

Tab Transformer configuration

PARAMETER DESCRIPTION
input_embed_dim

The embedding dimension for the input categorical features. Defaults to 32

TYPE: int DEFAULT: field(default=32, metadata={'help': 'The embedding dimension for the input categorical features. Defaults to 32'})

embedding_initialization

Initialization scheme for the embedding layers. Defaults to kaiming_uniform. Choices are: [kaiming_uniform,kaiming_normal].

TYPE: Optional[str] DEFAULT: field(default='kaiming_uniform', metadata={'help': 'Initialization scheme for the embedding layers. Defaults to `kaiming`', 'choices': ['kaiming_uniform', 'kaiming_normal']})

embedding_bias

Flag to turn on Embedding Bias. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'Flag to turn on Embedding Bias. Defaults to False'})

share_embedding

The flag turns on shared embeddings in the input embedding process. The key idea here is to have an embedding for the feature as a whole, along with embeddings of each unique value of that column. For more details refer to Appendix A of the TabTransformer paper. Defaults to False

TYPE: bool DEFAULT: field(default=False, metadata={'help': 'The flag turns on shared embeddings in the input embedding process. The key idea here is to have an embedding for the feature as a whole along with embeddings of each unique values of that column. For more details refer to Appendix A of the TabTransformer paper. Defaults to False'})

share_embedding_strategy

There are two strategies in adding shared embeddings. 1. add - A separate embedding for the feature is added to the embedding of the unique values of the feature. 2. fraction - A fraction of the input embedding is reserved for the shared embedding of the feature. Defaults to fraction. Choices are: [add,fraction].

TYPE: Optional[str] DEFAULT: field(default='fraction', metadata={'help': 'There are two strategies in adding shared embeddings. 1. `add` - A separate embedding for the feature is added to the embedding of the unique values of the feature. 2. `fraction` - A fraction of the input embedding is reserved for the shared embedding of the feature. Defaults to fraction.', 'choices': ['add', 'fraction']})

shared_embedding_fraction

Fraction of the input_embed_dim to be reserved by the shared embedding. Should be less than one. Defaults to 0.25

TYPE: float DEFAULT: field(default=0.25, metadata={'help': 'Fraction of the input_embed_dim to be reserved by the shared embedding. Should be less than one. Defaults to 0.25'})

num_heads

The number of heads in the Multi-Headed Attention layer. Defaults to 8

TYPE: int DEFAULT: field(default=8, metadata={'help': 'The number of heads in the Multi-Headed Attention layer. Defaults to 8'})

num_attn_blocks

The number of layers of stacked Multi-Headed Attention layers. Defaults to 6

TYPE: int DEFAULT: field(default=6, metadata={'help': 'The number of layers of stacked Multi-Headed Attention layers. Defaults to 6'})

transformer_head_dim

The number of hidden units in the Multi-Headed Attention layers. Defaults to None, in which case it is the same as input_dim.

TYPE: Optional[int] DEFAULT: field(default=None, metadata={'help': 'The number of hidden units in the Multi-Headed Attention layers. Defaults to None and will be same as input_dim.'})

attn_dropout

Dropout to be applied after Multi headed Attention. Defaults to 0.1

TYPE: float DEFAULT: field(default=0.1, metadata={'help': 'Dropout to be applied after Multi headed Attention. Defaults to 0.1'})

add_norm_dropout

Dropout to be applied in the AddNorm Layer. Defaults to 0.1

TYPE: float DEFAULT: field(default=0.1, metadata={'help': 'Dropout to be applied in the AddNorm Layer. Defaults to 0.1'})

ff_dropout

Dropout to be applied in the Positionwise FeedForward Network. Defaults to 0.1

TYPE: float DEFAULT: field(default=0.1, metadata={'help': 'Dropout to be applied in the Positionwise FeedForward Network. Defaults to 0.1'})

ff_hidden_multiplier

Multiplier by which the position-wise feed-forward layer scales the input. Defaults to 4

TYPE: int DEFAULT: field(default=4, metadata={'help': 'Multiple by which the Positionwise FF layer scales the input. Defaults to 4'})

transformer_activation

The activation type in the transformer feed-forward layers. In addition to the default activations in PyTorch like ReLU, TanH, LeakyReLU, etc. (https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity), GEGLU, ReGLU and SwiGLU are also implemented (https://arxiv.org/pdf/2002.05202.pdf). Defaults to GEGLU

TYPE: str DEFAULT: field(default='GEGLU', metadata={'help': 'The activation type in the transformer feed forward layers. In addition to the default activation in PyTorch like ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity, GEGLU, ReGLU and SwiGLU are also implemented(https://arxiv.org/pdf/2002.05202.pdf). Defaults to GEGLU'})

out_ff_layers

DEPRECATED: Hyphen-separated number of layers and units in the deep MLP. Defaults to 128-64-32

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: Hyphen-separated number of layers and units in the deep MLP. Defaults to 128-64-32'})

out_ff_activation

DEPRECATED: The activation type in the deep MLP. The default activations in PyTorch like ReLU, TanH, LeakyReLU, etc. (https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity). Defaults to ReLU

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: The activation type in the deep MLP. The default activaion in PyTorch like ReLU, TanH, LeakyReLU, etc. https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity. Defaults to ReLU'})

out_ff_dropout

DEPRECATED: Probability of a classification element to be zeroed in the deep MLP. Defaults to 0.0

TYPE: Optional[float] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: probability of an classification element to be zeroed in the deep MLP. Defaults to 0.0'})

out_ff_initialization

DEPRECATED: Initialization scheme for the linear layers. Defaults to kaiming. Choices are: [None,kaiming,xavier,random].

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'DEPRECATED: Initialization scheme for the linear layers. Defaults to `kaiming`', 'choices': [None, 'kaiming', 'xavier', 'random']})

task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, the dimension is inferred from the cardinality x of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure of what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well, and for simplicity we only use multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for regression problems. If left empty, no restrictions are applied

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
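
The same TabularModel workflow applies to TabTransformerConfig; a hedged sketch focusing on the shared-embedding options documented above (values are illustrative, not recommendations):

from pytorch_tabular.models import TabTransformerConfig

model_config = TabTransformerConfig(
    task="regression",
    input_embed_dim=32,
    num_heads=8,
    num_attn_blocks=6,
    share_embedding=True,                  # add a per-feature shared embedding
    share_embedding_strategy="fraction",   # reserve a slice of input_embed_dim
    shared_embedding_fraction=0.25,        # fraction reserved for the shared part
)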

pytorch_tabular.config.ModelConfig dataclass

Base Model configuration

PARAMETER DESCRIPTION
task

Specify whether the problem is regression or classification. backbone is a task which considers the model as a backbone to generate features; mostly used internally for SSL and related tasks. Choices are: [regression,classification,backbone].

TYPE: str DEFAULT: field(metadata={'help': 'Specify whether the problem is regression or classification. `backbone` is a task which considers the model as a backbone to generate features. Mostly used internally for SSL and related tasks.', 'choices': ['regression', 'classification', 'backbone']})

head

The head to be used for the model. Should be one of the heads defined in pytorch_tabular.models.common.heads. Defaults to LinearHead. Choices are: [None,LinearHead,MixtureDensityHead].

TYPE: Optional[str] DEFAULT: field(default='LinearHead', metadata={'help': 'The head to be used for the model. Should be one of the heads defined in `pytorch_tabular.models.common.heads`. Defaults to LinearHead', 'choices': [None, 'LinearHead', 'MixtureDensityHead']})

head_config

The config as a dict which defines the head. If left empty, will be initialized as default linear head.

TYPE: Optional[Dict] DEFAULT: field(default_factory=lambda : {'layers': ''}, metadata={'help': 'The config as a dict which defines the head. If left empty, will be initialized as default linear head.'})

embedding_dims

The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, the dimension is inferred from the cardinality x of the categorical column using the rule min(50, (x + 1) // 2)

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The dimensions of the embedding for each categorical column as a list of tuples (cardinality, embedding_dim). If left empty, will infer using the cardinality of the categorical column using the rule min(50, (x + 1) // 2)'})

embedding_dropout

Dropout to be applied to the Categorical Embedding. Defaults to 0.0

TYPE: float DEFAULT: field(default=0.0, metadata={'help': 'Dropout to be applied to the Categorical Embedding. Defaults to 0.0'})

batch_norm_continuous_input

If True, we will normalize the continuous layer by passing it through a BatchNorm layer.

TYPE: bool DEFAULT: field(default=True, metadata={'help': 'If True, we will normalize the continuous layer by passing it through a BatchNorm layer.'})

learning_rate

The learning rate of the model. Defaults to 1e-3.

TYPE: float DEFAULT: field(default=0.001, metadata={'help': 'The learning rate of the model. Defaults to 1e-3.'})

loss

The loss function to be applied. By default, it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure of what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification

TYPE: Optional[str] DEFAULT: field(default=None, metadata={'help': 'The loss function to be applied. By Default it is MSELoss for regression and CrossEntropyLoss for classification. Unless you are sure what you are doing, leave it at MSELoss or L1Loss for regression and CrossEntropyLoss for classification'})

metrics

The list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in torchmetrics. By default, it is accuracy for classification and mean_squared_error for regression

TYPE: Optional[List[str]] DEFAULT: field(default=None, metadata={'help': 'the list of metrics you need to track during training. The metrics should be one of the functional metrics implemented in ``torchmetrics``. To use your own metric, please use the `metric` param in the `fit` method By default, it is accuracy if classification and mean_squared_error for regression'})

metrics_prob_input

A mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be the same as the number of metrics. Defaults to None.

TYPE: Optional[bool] DEFAULT: field(default=None, metadata={'help': 'Is a mandatory parameter for classification metrics defined in the config. This defines whether the input to the metric function is the probability or the class. Length should be same as the number of metrics. Defaults to None.'})

metrics_params

The parameters to be passed to the metrics function. task is forced to be multiclass because the multiclass version can handle binary as well, and for simplicity we only use multiclass.

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The parameters to be passed to the metrics function. `task` is forced to be `multiclass`` because the multiclass version can handle binary as well and for simplicity we are only using `multiclass`.'})

target_range

The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for regression problems. If left empty, no restrictions are applied

TYPE: Optional[List] DEFAULT: field(default=None, metadata={'help': 'The range in which we should limit the output variable. Currently ignored for multi-target regression. Typically used for Regression problems. If left empty, will not apply any restrictions'})

seed

The seed for reproducibility. Defaults to 42

TYPE: int DEFAULT: field(default=42, metadata={'help': 'The seed for reproducibility. Defaults to 42'})
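
Because every model config inherits these fields from ModelConfig, the common knobs can be set on any concrete config. A hedged sketch using CategoryEmbeddingModelConfig (the head_config keys shown are assumptions based on the default linear head):

from pytorch_tabular.models import CategoryEmbeddingModelConfig

model_config = CategoryEmbeddingModelConfig(
    task="regression",
    head="LinearHead",
    head_config={"layers": "64-32", "dropout": 0.1},  # assumed LinearHead options
    target_range=[(0.0, 100.0)],  # one (min, max) tuple per target
    learning_rate=1e-3,
    seed=42,
)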

Model Classes

pytorch_tabular.models.AutoIntModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/autoint/autoint.py
def __init__(self, config: DictConfig, **kwargs):
    super().__init__(config, **kwargs)

pytorch_tabular.models.CategoryEmbeddingModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/category_embedding/category_embedding_model.py
def __init__(self, config: DictConfig, **kwargs):
    super().__init__(config, **kwargs)

pytorch_tabular.models.FTTransformerModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/ft_transformer/ft_transformer.py
def __init__(self, config: DictConfig, **kwargs):
    super().__init__(config, **kwargs)

pytorch_tabular.models.GANDALFModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/gandalf/gandalf.py
def __init__(self, config: DictConfig, **kwargs):
    super().__init__(config, **kwargs)

pytorch_tabular.models.GatedAdditiveTreeEnsembleModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/gate/gate_model.py
def __init__(self, config: DictConfig, **kwargs):
    super().__init__(config, **kwargs)

pytorch_tabular.models.MDNModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/mixture_density/mdn.py
def __init__(self, config: DictConfig, **kwargs):
    assert "inferred_config" in kwargs, "inferred_config not found in initialization arguments"
    self.inferred_config = kwargs["inferred_config"]
    assert config.task == "regression", "MDN is only implemented for Regression"
    super().__init__(config, **kwargs)
    assert self.hparams.output_dim == 1, "MDN is not implemented for multi-targets"
    if config.target_range is not None:
        logger.warning("MDN does not use target range. Ignoring it.")

pytorch_tabular.models.NODEModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/node/node_model.py
def __init__(self, config: DictConfig, **kwargs):
    super().__init__(config, **kwargs)

data_aware_initialization(datamodule)

Performs data-aware initialization for NODE.

Source code in src/pytorch_tabular/models/node/node_model.py
def data_aware_initialization(self, datamodule):
    """Performs data-aware initialization for NODE."""
    logger.info("Data Aware Initialization of NODE using a forward pass with 2000 batch size....")
    # Need a big batch to initialize properly
    alt_loader = datamodule.train_dataloader(batch_size=self.hparams.data_aware_init_batch_size)
    batch = next(iter(alt_loader))
    for k, v in batch.items():
        if isinstance(v, list) and (len(v) == 0):
            # Skipping empty list
            continue
        # batch[k] = v.to("cpu" if self.config.gpu == 0 else "cuda")
        batch[k] = v.to(self.device)

    # single forward pass to initialize the ODST
    with torch.no_grad():
        self(batch)

pytorch_tabular.models.TabNetModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/tabnet/tabnet_model.py
def __init__(self, config: DictConfig, **kwargs):
    assert config.task in [
        "regression",
        "classification",
    ], "TabNet is only implemented for Regression and Classification"
    super().__init__(config, **kwargs)

pytorch_tabular.models.TabTransformerModel(config, **kwargs)

Bases: BaseModel

Source code in src/pytorch_tabular/models/tab_transformer/tab_transformer.py
def __init__(self, config: DictConfig, **kwargs):
    super().__init__(config, **kwargs)

Base Model Class

pytorch_tabular.models.BaseModel(config, custom_loss=None, custom_metrics=None, custom_metrics_prob_inputs=None, custom_optimizer=None, custom_optimizer_params={}, **kwargs)

Bases: pl.LightningModule

Base Model for PyTorch Tabular.

PARAMETER DESCRIPTION
config

The configuration for the model.

TYPE: DictConfig

custom_loss

A custom loss function. Defaults to None.

TYPE: Optional[torch.nn.Module] DEFAULT: None

custom_metrics

A list of custom metrics. Defaults to None.

TYPE: Optional[List[Callable]] DEFAULT: None

custom_metrics_prob_inputs

A list of boolean values indicating whether the metric requires probability inputs. Defaults to None.

TYPE: Optional[List[bool]] DEFAULT: None

custom_optimizer

A custom optimizer. Defaults to None.

TYPE: Optional[torch.optim.Optimizer] DEFAULT: None

custom_optimizer_params

A dictionary of custom optimizer parameters. Defaults to {}.

TYPE: Dict DEFAULT: {}

kwargs

Additional keyword arguments.

TYPE: Dict DEFAULT: {}

Source code in src/pytorch_tabular/models/base_model.py
def __init__(
    self,
    config: DictConfig,
    custom_loss: Optional[torch.nn.Module] = None,
    custom_metrics: Optional[List[Callable]] = None,
    custom_metrics_prob_inputs: Optional[List[bool]] = None,
    custom_optimizer: Optional[torch.optim.Optimizer] = None,
    custom_optimizer_params: Dict = {},
    **kwargs,
):
    """Base Model for PyTorch Tabular.

    Args:
        config (DictConfig): The configuration for the model.
        custom_loss (Optional[torch.nn.Module], optional): A custom loss function. Defaults to None.
        custom_metrics (Optional[List[Callable]], optional): A list of custom metrics. Defaults to None.
        custom_metrics_prob_inputs (Optional[List[bool]], optional): A list of boolean values indicating whether the
            metric requires probability inputs. Defaults to None.
        custom_optimizer (Optional[torch.optim.Optimizer], optional): A custom optimizer. Defaults to None.
        custom_optimizer_params (Dict, optional): A dictionary of custom optimizer parameters. Defaults to {}.
        kwargs (Dict, optional): Additional keyword arguments.
    """
    super().__init__()
    assert "inferred_config" in kwargs, "inferred_config not found in initialization arguments"
    inferred_config = kwargs["inferred_config"]
    # Merging the config and inferred config
    config = safe_merge_config(config, inferred_config)
    self.custom_loss = custom_loss
    self.custom_metrics = custom_metrics
    self.custom_metrics_prob_inputs = custom_metrics_prob_inputs
    self.custom_optimizer = custom_optimizer
    self.custom_optimizer_params = custom_optimizer_params
    self.kwargs = kwargs
    # Updating config with custom parameters for experiment tracking
    if self.custom_loss is not None:
        config.loss = str(self.custom_loss)
    if self.custom_metrics is not None:
        # Adding metrics to config for hparams logging and tracking
        config.metrics = []
        config.metrics_params = []
        for metric in self.custom_metrics:
            if isinstance(metric, partial):
                # extracting func names from partial functions
                config.metrics.append(metric.func.__name__)
                config.metrics_params.append(metric.keywords)
            else:
                config.metrics.append(metric.__name__)
                config.metrics_params.append(vars(metric))
        if config.task == "classification":
            config.metrics_prob_input = self.custom_metrics_prob_inputs
    # Updating default metrics in config
    elif config.task == "classification":
        # Adding metric_params to config for classification task
        for i, mp in enumerate(config.metrics_params):
            # For classification task, output_dim == number of classses
            config.metrics_params[i]["task"] = mp.get("task", "multiclass")
            config.metrics_params[i]["num_classes"] = mp.get("num_classes", inferred_config.output_dim)
            if config.metrics[i] in (
                "accuracy",
                "precision",
                "recall",
                "precision_recall",
                "specificity",
                "f1_score",
                "fbeta_score",
            ):
                config.metrics_params[i]["top_k"] = mp.get("top_k", 1)

    if self.custom_optimizer is not None:
        config.optimizer = str(self.custom_optimizer.__class__.__name__)
    if len(self.custom_optimizer_params) > 0:
        config.optimizer_params = self.custom_optimizer_params
    self.save_hyperparameters(config)
    # The concatenated output dim of the embedding layer
    self._build_network()
    self._setup_loss()
    self._setup_metrics()
    self._check_and_verify()
    self.do_log_logits = (
        hasattr(self.hparams, "log_logits") and self.hparams.log_logits and self.hparams.log_target == "wandb"
    )
    if not WANDB_INSTALLED:
        self.do_log_logits = False
        warnings.warn(
            "Wandb is not installed. Please install wandb to log logits. "
            "You can install wandb using pip install wandb or install PyTorch Tabular"
            " using pip install pytorch-tabular[all]"
        )
    if not PLOTLY_INSTALLED:
        self.do_log_logits = False
        warnings.warn(
            "Plotly is not installed. Please install plotly to log logits. "
            "You can install plotly using pip install plotly or install PyTorch Tabular"
            " using pip install pytorch-tabular[all]"
        )
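
A hedged sketch of supplying the custom_* arguments in practice: the assumption here is that TabularModel.fit forwards its loss, metrics and metrics_prob_inputs arguments to BaseModel as custom_loss, custom_metrics and custom_metrics_prob_inputs (verify the argument names against your installed version):

import torch
from functools import partial
from torchmetrics.functional import f1_score

tabular_model.fit(
    train=train_df,  # a pandas DataFrame
    loss=torch.nn.CrossEntropyLoss(),
    metrics=[partial(f1_score, task="multiclass", num_classes=3)],
    metrics_prob_inputs=[True],  # f1_score here receives probabilities
)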

apply_output_sigmoid_scaling(y_hat)

Applies sigmoid scaling to the output of the model if the task is regression and the target range is defined.

PARAMETER DESCRIPTION
y_hat

The output of the model

TYPE: torch.Tensor

RETURNS DESCRIPTION
torch.Tensor

torch.Tensor: The output of the model with sigmoid scaling applied

Source code in src/pytorch_tabular/models/base_model.py
def apply_output_sigmoid_scaling(self, y_hat: torch.Tensor) -> torch.Tensor:
    """Applies sigmoid scaling to the output of the model if the task is regression and the target range is defined.

    Args:
        y_hat (torch.Tensor): The output of the model

    Returns:
        torch.Tensor: The output of the model with sigmoid scaling applied
    """
    if (self.hparams.task == "regression") and (self.hparams.target_range is not None):
        for i in range(self.hparams.output_dim):
            y_min, y_max = self.hparams.target_range[i]
            y_hat[:, i] = y_min + nn.Sigmoid()(y_hat[:, i]) * (y_max - y_min)
    return y_hat
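
Numerically, the scaling squashes each logit into its (y_min, y_max) interval; a quick standalone illustration with a single target range:

import torch

y_min, y_max = 0.0, 100.0
logits = torch.tensor([-4.0, 0.0, 4.0])
scaled = y_min + torch.sigmoid(logits) * (y_max - y_min)
# tensor([ 1.7986, 50.0000, 98.2014]) -- every output lies in (0, 100)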

calculate_loss(output, y, tag)

Calculates the loss for the model.

PARAMETER DESCRIPTION
output

The output dictionary from the model

TYPE: Dict

y

The target tensor

TYPE: torch.Tensor

tag

The tag to use for logging

TYPE: str

RETURNS DESCRIPTION
torch.Tensor

torch.Tensor: The loss value

Source code in src/pytorch_tabular/models/base_model.py
def calculate_loss(self, output: Dict, y: torch.Tensor, tag: str) -> torch.Tensor:
    """Calculates the loss for the model.

    Args:
        output (Dict): The output dictionary from the model
        y (torch.Tensor): The target tensor
        tag (str): The tag to use for logging

    Returns:
        torch.Tensor: The loss value
    """
    y_hat = output["logits"]
    reg_terms = [k for k, v in output.items() if "regularization" in k]
    reg_loss = 0
    for t in reg_terms:
        # Log only if non-zero
        if output[t] != 0:
            reg_loss += output[t]
            self.log(
                f"{tag}_{t}_loss",
                output[t],
                on_epoch=True,
                on_step=False,
                logger=True,
                prog_bar=False,
            )
    if self.hparams.task == "regression":
        computed_loss = reg_loss
        for i in range(self.hparams.output_dim):
            _loss = self.loss(y_hat[:, i], y[:, i])
            computed_loss += _loss
            if self.hparams.output_dim > 1:
                self.log(
                    f"{tag}_loss_{i}",
                    _loss,
                    on_epoch=True,
                    on_step=False,
                    logger=True,
                    prog_bar=False,
                )
    else:
        # TODO loss fails with batch size of 1?
        computed_loss = self.loss(y_hat.squeeze(), y.squeeze()) + reg_loss
    self.log(
        f"{tag}_loss",
        computed_loss,
        on_epoch=(tag in ["valid", "test"]),
        on_step=(tag == "train"),
        # on_step=False,
        logger=True,
        prog_bar=True,
    )
    return computed_loss

calculate_metrics(y, y_hat, tag)

Calculates the metrics for the model.

PARAMETER DESCRIPTION
y

The target tensor

TYPE: torch.Tensor

y_hat

The predicted tensor

TYPE: torch.Tensor

tag

The tag to use for logging

TYPE: str

RETURNS DESCRIPTION
List[torch.Tensor]

List[torch.Tensor]: The list of metric values

Source code in src/pytorch_tabular/models/base_model.py
def calculate_metrics(self, y: torch.Tensor, y_hat: torch.Tensor, tag: str) -> List[torch.Tensor]:
    """Calculates the metrics for the model.

    Args:
        y (torch.Tensor): The target tensor

        y_hat (torch.Tensor): The predicted tensor

        tag (str): The tag to use for logging

    Returns:
        List[torch.Tensor]: The list of metric values
    """
    metrics = []
    for metric, metric_str, prob_inp, metric_params in zip(
        self.metrics,
        self.hparams.metrics,
        self.hparams.metrics_prob_input,
        self.hparams.metrics_params,
    ):
        if self.hparams.task == "regression":
            _metrics = []
            for i in range(self.hparams.output_dim):
                name = metric.func.__name__ if isinstance(metric, partial) else metric.__name__
                if name == torchmetrics.functional.mean_squared_log_error.__name__:
                    # MSLE should only be used in strictly positive targets. It is undefined otherwise
                    _metric = metric(
                        torch.clamp(y_hat[:, i], min=0),
                        torch.clamp(y[:, i], min=0),
                        **metric_params,
                    )
                else:
                    _metric = metric(y_hat[:, i], y[:, i], **metric_params)
                if self.hparams.output_dim > 1:
                    self.log(
                        f"{tag}_{metric_str}_{i}",
                        _metric,
                        on_epoch=True,
                        on_step=False,
                        logger=True,
                        prog_bar=False,
                    )
                _metrics.append(_metric)
            avg_metric = torch.stack(_metrics, dim=0).sum()
        else:
            y_hat = nn.Softmax(dim=-1)(y_hat.squeeze())
            if prob_inp:
                avg_metric = metric(y_hat, y.squeeze(), **metric_params)
            else:
                avg_metric = metric(torch.argmax(y_hat, dim=-1), y.squeeze(), **metric_params)
        metrics.append(avg_metric)
        self.log(
            f"{tag}_{metric_str}",
            avg_metric,
            on_epoch=True,
            on_step=False,
            logger=True,
            prog_bar=True,
        )
    return metrics
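
The prob_inp flag decides whether a classification metric sees softmax probabilities or argmax class indices; a standalone illustration (the metric choices here are examples, not the library defaults):

import torch
from torchmetrics.functional import accuracy, auroc

y_hat = torch.softmax(torch.randn(4, 3), dim=-1)  # probabilities, as in the code above
y = torch.tensor([0, 2, 1, 2])

# metrics_prob_input=True -> the metric consumes probabilities (e.g. AUROC)
auroc(y_hat, y, task="multiclass", num_classes=3)
# metrics_prob_input=False -> the metric consumes class indices (e.g. accuracy)
accuracy(torch.argmax(y_hat, dim=-1), y, task="multiclass", num_classes=3)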

compute_head(backbone_features)

Computes the head of the model.

PARAMETER DESCRIPTION
backbone_features

The backbone features

TYPE: Tensor

RETURNS DESCRIPTION
Dict[str, Any]

The output of the model

Source code in src/pytorch_tabular/models/base_model.py
def compute_head(self, backbone_features: Tensor) -> Dict[str, Any]:
    """Computes the head of the model.

    Args:
        backbone_features (Tensor): The backbone features

    Returns:
        The output of the model
    """
    y_hat = self.head(backbone_features)
    y_hat = self.apply_output_sigmoid_scaling(y_hat)
    return self.pack_output(y_hat, backbone_features)

data_aware_initialization(datamodule)

Performs data-aware initialization of the model when defined.

Source code in src/pytorch_tabular/models/base_model.py
def data_aware_initialization(self, datamodule):
    """Performs data-aware initialization of the model when defined."""
    pass

extract_embedding()

Extracts the embedding of the model.

This is used in CategoricalEmbeddingTransformer

Source code in src/pytorch_tabular/models/base_model.py
def extract_embedding(self):
    """Extracts the embedding of the model.

    This is used in `CategoricalEmbeddingTransformer`
    """
    if self.hparams.categorical_dim > 0:
        if not isinstance(self.embedding_layer, PreEncoded1dLayer):
            return self.embedding_layer.cat_embedding_layers
        else:
            raise ValueError(
                "Cannot extract embedding for PreEncoded1dLayer. Please use a different embedding layer."
            )
    else:
        raise ValueError(
            "Model has been trained with no categorical feature and therefore can't be used"
            " as a Categorical Encoder"
        )

feature_importance()

Returns a dataframe with feature importance for the model.

Source code in src/pytorch_tabular/models/base_model.py
def feature_importance(self) -> pd.DataFrame:
    """Returns a dataframe with feature importance for the model."""
    if hasattr(self.backbone, "feature_importance_"):
        imp = self.backbone.feature_importance_
        n_feat = len(self.hparams.categorical_cols + self.hparams.continuous_cols)
        if self.hparams.categorical_dim > 0:
            if imp.shape[0] != n_feat:
                # Combining Cat Embedded Dimensions to a single one by averaging
                wt = []
                norm = []
                ft_idx = 0
                for _, embd_dim in self.hparams.embedding_dims:
                    wt.extend([ft_idx] * embd_dim)
                    norm.append(embd_dim)
                    ft_idx += 1
                for _ in self.hparams.continuous_cols:
                    wt.extend([ft_idx])
                    norm.append(1)
                    ft_idx += 1
                imp = np.bincount(wt, weights=imp) / np.array(norm)
            else:
                # For models like FTTransformer, we dont need to do anything
                # It takes categorical and continuous as individual 2-D features
                pass
        importance_df = pd.DataFrame(
            {
                "Features": self.hparams.categorical_cols + self.hparams.continuous_cols,
                "importance": imp,
            }
        )
        return importance_df
    else:
        raise ValueError("Feature Importance unavailable for this model.")
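
For backbones that expose feature_importance_ (e.g. GANDALF or GATE), the returned dataframe can be inspected directly after training; a hedged usage sketch:

# tabular_model is a trained TabularModel whose backbone supports importances
imp_df = tabular_model.model.feature_importance()
print(imp_df.sort_values("importance", ascending=False).head())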

forward(x)

The forward pass of the model.

PARAMETER DESCRIPTION
x

The input of the model with 'continuous' and 'categorical' keys

TYPE: Dict

Source code in src/pytorch_tabular/models/base_model.py
def forward(self, x: Dict) -> Dict[str, Any]:
    """The forward pass of the model.

    Args:
        x (Dict): The input of the model with 'continuous' and 'categorical' keys
    """
    x = self.embed_input(x)
    x = self.compute_backbone(x)
    return self.compute_head(x)

pack_output(y_hat, backbone_features)

Packs the output of the model.

PARAMETER DESCRIPTION
y_hat

The output of the model

TYPE: torch.Tensor

backbone_features

The backbone features

TYPE: torch.tensor

RETURNS DESCRIPTION
Dict[str, Any]

The packed output of the model

Source code in src/pytorch_tabular/models/base_model.py
def pack_output(self, y_hat: torch.Tensor, backbone_features: torch.tensor) -> Dict[str, Any]:
    """Packs the output of the model.

    Args:
        y_hat (torch.Tensor): The output of the model

        backbone_features (torch.tensor): The backbone features

    Returns:
        The packed output of the model
    """
    # if self.head is the Identity function it means that we cannot extract backbone features,
    # because the model cannot be divide in backbone and head (i.e. TabNet)
    if type(self.head) == nn.Identity:
        return {"logits": y_hat}
    return {"logits": y_hat, "backbone_features": backbone_features}

predict(x, ret_model_output=False)

Predicts the output of the model.

PARAMETER DESCRIPTION
x

The input of the model with 'continuous' and 'categorical' keys

TYPE: Dict

ret_model_output

If True, the method returns the output of the model

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
Union[torch.Tensor, Tuple[torch.Tensor, Dict]]

The output of the model

Source code in src/pytorch_tabular/models/base_model.py
def predict(self, x: Dict, ret_model_output: bool = False) -> Union[torch.Tensor, Tuple[torch.Tensor, Dict]]:
    """Predicts the output of the model.

    Args:
        x (Dict): The input of the model with 'continuous' and 'categorical' keys

        ret_model_output (bool): If True, the method returns the output of the model

    Returns:
        The output of the model
    """
    assert self.hparams.task != "ssl", "It's not allowed to use the method predict in case of ssl task"
    ret_value = self.forward(x)
    if ret_model_output:
        return ret_value.get("logits"), ret_value
    return ret_value.get("logits")
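
In practice TabularModel.predict builds the input dict from a pandas DataFrame; a hand-rolled batch for calling this method directly might look like the sketch below (shapes assume two continuous and one categorical column with five categories):

import torch

batch = {
    "continuous": torch.randn(8, 2),             # float features
    "categorical": torch.randint(0, 5, (8, 1)),  # integer category indices
}
with torch.no_grad():
    logits = tabular_model.model.predict(batch)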