torchfry.networks

Module contents

Neural network implementations using the Fastfood and Random Kitchen Sink layers.

class torchfry.networks.LeNet(features=1024, projection_layer=FastfoodLayer, proj_args={})

Bases: Module

LeNet-based model similar to Deep Fried Convnets. This model replaces the traditional fully connected (FC) layer with a random feature layer (e.g., FastfoodLayer, RKSLayer), followed by batch normalization and ReLU activation, and ends with an FC linear layer for classification.

Network architecture is as follows:

Conv -> ReLU -> MaxPool -> Conv -> ReLU -> MaxPool -> Flatten ->

Fastfood/RKS -> BN -> ReLU -> FC Linear (Output)

Parameters:
  • features (int) – Number of features for the projection layer.

  • projection_layer (nn.Module) – The type of projection layer to use in hidden layers.

  • proj_args (dict) – Additional arguments to pass to the projection layer (e.g., scale, device, learnable flags, etc.).

Notes

This model is programmed to run on the CIFAR-10 dataset.
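
As a rough illustration of how a 32 × 32 CIFAR-10 image shrinks through the two Conv -> ReLU -> MaxPool stages before Flatten, the sketch below traces the spatial size with the standard convolution output formula. The kernel sizes, strides, and channel count are assumptions borrowed from the classic LeNet-5 design; they are not stated in this documentation and may differ from the torchfry implementation.

```python
def conv2d_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of one spatial dimension after a conv or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed hyperparameters (classic LeNet-5 values, NOT taken from torchfry).
CONV_KERNEL = 5          # 5x5 convolution, stride 1, no padding
POOL_KERNEL = 2          # 2x2 max-pooling, stride 2
LAST_CONV_CHANNELS = 16  # hypothetical channel count of the second Conv


def lenet_flatten_size(side: int = 32) -> int:
    """Number of values per image after Flatten, under the assumptions above."""
    for _ in range(2):  # two Conv -> ReLU -> MaxPool stages
        side = conv2d_out(side, CONV_KERNEL)
        side = conv2d_out(side, POOL_KERNEL, stride=POOL_KERNEL)
    return LAST_CONV_CHANNELS * side * side


print(lenet_flatten_size(32))
```

Under these assumptions the side length goes 32, 28, 14, 10, 5, so the projection layer would receive 16 × 5 × 5 values per image.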

forward(x)

Forward pass through the network.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, 3, 32, 32).

Returns:

Output logits tensor of shape (batch_size, classes), representing raw classification scores.

Return type:

torch.Tensor

Notes

The input corresponds to CIFAR-10 images, which have \(3\) color channels and a size of \(32 \times 32\) pixels.
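
Because forward returns raw logits rather than probabilities, a softmax is usually applied afterwards when probabilities are needed. The following is a minimal pure-Python sketch of that post-processing step; the logit values are made up for illustration and the helper names are hypothetical.

```python
import math


def softmax(logits):
    """Convert one row of raw scores into probabilities (numerically stable)."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(batch_logits):
    """Return the index of the largest logit for each row of the batch."""
    return [max(range(len(row)), key=row.__getitem__) for row in batch_logits]


# Hypothetical logits for a batch of 2 samples and 3 classes.
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
probabilities = [softmax(row) for row in batch_logits]
print(predict(batch_logits))
```

With PyTorch tensors the same steps are typically written as torch.softmax(logits, dim=1) and logits.argmax(dim=1). Note that torch.nn.CrossEntropyLoss expects the raw logits, so no softmax is needed during training.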

class torchfry.networks.MLP(features, classes, widths, layer=FastfoodLayer, proj_args={})

Bases: Module

Multi-layer perceptron (MLP) model that replaces the stacked FC layers with random feature layers (e.g., FastfoodLayer, RKSLayer), each followed by batch normalization and ReLU activation, and ends with an FC linear layer for classification.

Network architecture is as follows:

(Fastfood/RKS -> BN -> ReLU) * n -> FC Linear (Output)

Where n is the desired number of stacked projection layers.

Parameters:
  • features (int) – Number of features for the projection layer.

  • classes (int) – Number of output classes for classification.

  • widths (list of int) – List containing the widths (number of neurons) for each hidden layer.

  • layer (nn.Module) – The type of projection layer to use in hidden layers.

  • proj_args (dict) – Additional arguments to pass to the projection layer (e.g., scale, device, learnable flags, etc.).

Notes

This model is primarily run on the Fashion MNIST dataset, but supports CIFAR-10 as well.
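
To make the relationship between features, widths, and classes concrete, the sketch below computes the flattened input size for both datasets and the (input, output) dimensions of each stacked layer. It assumes that each layer's output width feeds the next layer, which is the usual MLP convention but is not confirmed here for torchfry; the helper names and the example widths are hypothetical.

```python
def flat_features(channels: int, height: int, width: int) -> int:
    """Number of values per image once it is flattened to a vector."""
    return channels * height * width


def layer_dims(features, widths, classes):
    """Assumed chaining: each layer's output width is the next layer's input."""
    sizes = [features, *widths]
    hidden = list(zip(sizes[:-1], sizes[1:]))  # one pair per projection layer
    output = (sizes[-1], classes)              # final FC linear layer
    return hidden, output


fashion_mnist = flat_features(1, 28, 28)  # grayscale 28x28 images
cifar10 = flat_features(3, 32, 32)        # RGB 32x32 images

hidden, output = layer_dims(fashion_mnist, [1024, 512], 10)
print(fashion_mnist, cifar10)
print(hidden, output)
```

Here n, the number of stacked projection layers, equals len(widths).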

forward(x)

Forward pass through the network.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, features).

Returns:

Output logits tensor of shape (batch_size, classes), representing raw classification scores.

Return type:

torch.Tensor

class torchfry.networks.VGG(input_shape=(3, 32, 32), features=512, classes=10, projection_layer=FastfoodLayer, proj_args={})

Bases: Module

VGG-based model built on the pre-trained VGG-16 with batch normalization (VGG-16BN) from the Visual Geometry Group. The original network has 13 convolutional layers, each followed by batch normalization and ReLU activation, with 5 max-pooling layers interleaved between them. These are followed by 2 FC linear layers, each with ReLU and dropout, and a final FC linear layer for classification.

Network architecture is as follows:

(Conv -> BN -> ReLU -> Conv -> BN -> ReLU -> MaxPool) * 2 ->

(Conv -> BN -> ReLU -> Conv -> BN -> ReLU -> Conv -> BN -> ReLU -> MaxPool) * 3 ->

Flatten -> (FC Linear -> ReLU -> Dropout) * 2 -> FC Linear (Output)

In this implementation, the first two FC linear layers are replaced with one of the random feature layers (Fastfood/RKS):

... -> (Fastfood/RKS -> ReLU -> Dropout) * 2 -> FC Linear (Output)

Parameters:
  • input_shape (tuple of int) – Shape of the input images in (channels, height, width) format.

  • features (int) – Number of features for the classifier layers.

  • classes (int) – Number of output classes for classification.

  • projection_layer (nn.Module) – The type of projection layer to use in hidden layers.

  • proj_args (dict) – Additional arguments to pass to the projection layer (e.g., scale, device, learnable flags, etc.).

Notes

This model is programmed to run on the CIFAR-10 dataset.
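
The layer counts in the description can be checked directly from the block layout, and the same layout shows why a 32 × 32 input ends up with 512 flattened values. This sketch assumes the standard VGG-16 settings (size-preserving 3 × 3 convolutions, 2 × 2 stride-2 max-pooling, 512 channels in the last block); whether torchfry applies any additional pooling before Flatten is not stated in this documentation.

```python
# Block layout from the architecture above: (convs per block, repeats).
BLOCKS = [(2, 2), (3, 3)]

conv_layers = sum(convs * repeats for convs, repeats in BLOCKS)
pool_layers = sum(repeats for _, repeats in BLOCKS)


def flatten_size(input_shape=(3, 32, 32), last_channels=512):
    """Flattened size after the conv stack, under standard VGG-16 assumptions.

    Convolutions are assumed to preserve height and width; each max-pool
    halves them. last_channels=512 is the standard VGG-16 value.
    """
    _, height, width = input_shape
    for _ in range(pool_layers):
        height //= 2
        width //= 2
    return last_channels * height * width


print(conv_layers, pool_layers, flatten_size())
```

For the default input_shape of (3, 32, 32), five halvings reduce the spatial size to 1 × 1, so the flattened size matches the default features=512.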

forward(x: torch.Tensor)

Forward pass through the network.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, channels, height, width).

Returns:

Output logits tensor of shape (batch_size, classes), representing raw classification scores.

Return type:

torch.Tensor