
Convolutional Neural Networks for Signal Processing: From Deep Learning Architectures to Very Deep Learning Architectures - AlexNet (Part 3)

AlexNet

 
 

AlexNet is one of the very deep learning architectures. It introduced several components, such as the ReLU activation function and the Dropout layer, which is used to regularise the network. AlexNet also popularised training on the GPU instead of the CPU; one of the conclusions of the paper is that the depth of the network produces so many parameters that training on the CPU alone is impractical, so training should be performed on the GPU. Another difference between AlexNet and other very deep learning architectures is that its performance depends heavily on the number of parameters and on the available computational resources, as it requires a tremendous amount of data and preprocessing. Several papers report higher performance of AlexNet compared with other very deep learning architectures such as YOLO, VGGNet and DenseNet (Eldem et al., 2023; Lu et al., 2019; Mirchandani et al., 2023).

AlexNet's applications are not limited to images; it is also used for epilepsy detection, the identification and classification of seizure activity in the brain using electroencephalography (EEG). EEG measures the electrical activity of the brain and is used to identify abnormal patterns by capturing time-frequency features; the original EEG signal is converted into a time-frequency image that AlexNet then classifies.
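To make the later code examples concrete, here is a minimal PyTorch sketch of an AlexNet-style network showing where the ReLU activations, the Dropout layers and the overlapping max-pooling sit. The layer sizes follow the original ImageNet configuration, while the num_classes and dropout_rate arguments are illustrative assumptions (the same names are reused in the hyperparameter-tuning example below).

import torch
import torch.nn as nn

class AlexNet(nn.Module):
    """AlexNet-style CNN with ReLU activations and Dropout regularisation."""
    def __init__(self, num_classes=10, dropout_rate=0.5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),  # overlapping pooling (window 3, stride 2)
            nn.Conv2d(64, 192, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(dropout_rate),
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
            nn.Dropout(dropout_rate),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)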

 

1. Dropout Layer

 

A Dropout layer is one of the regularisation techniques used to prevent over-fitting; it can improve generalisation by randomly deactivating (dropping) a fraction of the neurons during training.

from tensorflow.keras.layers import Dropout

model.add(Dropout(0.5))
# You can set the dropout rate here; in this example 50% of the neurons are dropped during training
 

Dropout is most commonly used in the convolutional and fully connected layers; the dropout rate is a hyperparameter that needs to be fine-tuned. Automatic hyperparameter-tuning libraries are available for this, such as Optuna, which uses Bayesian-style optimisation for efficient hyperparameter tuning.

import optuna
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Define the objective function for Optuna
def hypertuning(trial):
    # Suggest hyperparameters
    learning_rate = trial.suggest_float("learning_rate", 0.0001, 0.1, log=True)
    dropout_rate = trial.suggest_float("dropout_rate", 0.2, 0.7)
    batch_size = trial.suggest_int("batch_size", 16, 128, step=16)
    optimizer_name = trial.suggest_categorical("optimizer", ["SGD", "Adam", "RMSprop"])

    # Load dataset (resized to 224x224 because AlexNet-style models expect large inputs)
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])
    train_dataset = datasets.CIFAR10(root="./data", train=True, transform=transform, download=True)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    # Initialize the model (the AlexNet class defined above)
    model = AlexNet(num_classes=10, dropout_rate=dropout_rate)
    criterion = nn.CrossEntropyLoss()
    optimizer = getattr(optim, optimizer_name)(model.parameters(), lr=learning_rate)

    # Train the model (1 epoch for simplicity) and track the training accuracy
    model.train()
    correct, total = 0, 0
    for epoch in range(1):
        for images, labels in train_loader:
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)

    # Optuna maximises whatever the objective returns
    return correct / total


# Optimise hyperparameters using Optuna's Bayesian-style (TPE) sampler
study = optuna.create_study(direction="maximize")
study.optimize(hypertuning, n_trials=50)

# Print the best hyperparameters
print("Best hyperparameters:", study.best_params)

2. GPU Training

Another main method introduced with AlexNet is its reliance on the GPU alongside the CPU, since the millions of trainable parameters make CPU-only training impractical. Briefly, the method splits the convolutional layers across devices (Split Convolutional Layers for Parallel Processing), so that different parts of the network are processed on different GPUs; a minimal sketch of this idea is given below.
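Here is a minimal sketch of this kind of model parallelism in PyTorch. It assumes two GPUs (cuda:0 and cuda:1) are available and reuses the AlexNet class sketched earlier; the split point and the SplitAlexNet name are illustrative assumptions, not the exact split used in the original paper.

import torch
import torch.nn as nn

class SplitAlexNet(nn.Module):
    """Illustrative model parallelism: the convolutional stack is split across two GPUs."""
    def __init__(self, base_model):
        super().__init__()
        # First part of the convolutional layers on GPU 0, the remainder on GPU 1
        self.part1 = base_model.features[:6].to("cuda:0")
        self.part2 = base_model.features[6:].to("cuda:1")
        self.avgpool = base_model.avgpool.to("cuda:1")
        self.classifier = base_model.classifier.to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        x = self.part2(x.to("cuda:1"))  # move the activations to the second GPU
        x = torch.flatten(self.avgpool(x), 1)
        return self.classifier(x)

# Usage (assuming the AlexNet class from earlier): model = SplitAlexNet(AlexNet(num_classes=10))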

Here is how to request GPU usage in PyTorch for a very deep learning architecture:

import torch

# Use the GPU if one is available; otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

In TensorFlow, GPU utilisation during training is handled automatically by the framework; you can check which devices are available to it using the following code:

from tensorflow.python.client import device_lib

# Lists all devices (CPUs and GPUs) visible to TensorFlow
print(device_lib.list_local_devices())
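As an alternative, recent TensorFlow releases also provide a simpler device query; a minimal check might look like this:

import tensorflow as tf

# Lists the GPUs visible to TensorFlow; an empty list means training will fall back to the CPU
print(tf.config.list_physical_devices('GPU'))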

3. Overlapping Max-Pooling Layer

An overlapping max-pooling layer is one whose stride is smaller than its pooling window size, so that neighbouring windows share inputs and features are preserved across the input feature maps.

Advantages

Better feature selection: because adjacent pooling windows overlap, small spatial shifts in the feature maps are less likely to discard useful activations, which improves feature selection; the trade-off is a slight additional computational cost.
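As a small illustration in PyTorch (the 55 x 55 feature-map size is an assumption matching the output of AlexNet's first convolution), the 3 x 3 window with stride 2 makes neighbouring windows overlap:

import torch
import torch.nn as nn

x = torch.randn(1, 64, 55, 55)                       # example feature map

overlapping = nn.MaxPool2d(kernel_size=3, stride=2)  # window (3) larger than stride (2), as in AlexNet
standard = nn.MaxPool2d(kernel_size=2, stride=2)     # non-overlapping pooling for comparison

print(overlapping(x).shape)  # torch.Size([1, 64, 27, 27])
print(standard(x).shape)     # torch.Size([1, 64, 27, 27])

Both produce 27 x 27 outputs here, but in the overlapping version each input value contributes to several pooling windows, which is what preserves features under small spatial shifts.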

Implementation of AlexNet for signal processing

Spectrogram image of a signal obtained using the Fourier transform of the time-series signal

AlexNet is efficient at image classification and can also be applied to signals, widening the applications of this very deep learning architecture. Examples include speech recognition, natural disaster monitoring, brain activity monitoring using EEG, and speech-emotion recognition (Lech et al., 2020). Signals and images are different forms of data, 1D and 2D respectively, and monitoring activity with signals from contact-based sensors can be a much more precise application. To apply AlexNet to signals, the frequency components of the time-series data first need to be derived using methods such as wavelet analysis, Fourier analysis, the short-time Fourier transform or the Hilbert transform. These frequency components enhance feature extraction, resulting in better performance. A complete tutorial on frequency analysis techniques will be added separately in the signal processing series. The frequency components are still numerical arrays, so they must be converted into an RGB image before they can be used as input to AlexNet. This image is called a spectrogram, the RGB representation of the frequency components of the original time-series data.
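Below is a minimal sketch of this conversion using SciPy and Matplotlib. The sine-plus-noise signal, the sampling rate fs and the window parameters (nperseg, noverlap) are illustrative assumptions; in practice x would be the recorded EEG or audio samples.

import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

# Illustrative 1D signal: a 5 Hz tone plus noise, sampled at fs Hz
fs = 256
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.random.randn(t.size)

# Short-time Fourier transform -> time-frequency representation (frequencies x time frames)
f, frames, Sxx = signal.spectrogram(x, fs=fs, nperseg=128, noverlap=64)

# Render the time-frequency matrix as an RGB image that an AlexNet-style model can consume
plt.figure(figsize=(2.24, 2.24), dpi=100)  # roughly 224 x 224 pixels
plt.pcolormesh(frames, f, 10 * np.log10(Sxx + 1e-12), shading='gouraud')
plt.axis('off')
plt.savefig('spectrogram.png', bbox_inches='tight', pad_inches=0)
plt.close()

The saved spectrogram.png can then be loaded with the usual image transforms (resize to 224 x 224, convert to a tensor, normalise) and passed to AlexNet like any other image.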

 

References:

 

1. Eldem, H., Ülker, E., & Işıklı, O. Y. (2023). Alexnet architecture variations with transfer learning for classification of wound images. Engineering Science and Technology, an International Journal, 45, 101490. https://doi.org/10.1016/j.jestch.2023.101490

2. Mirchandani, R., Yoon, C., Prakash, S., Khaire, A., Naran, A., Nair, A., & Ganti, S. (2023). Comparing the architecture and performance of AlexNet, Faster R-CNN, and YOLOv4 in the multiclass classification of Alzheimer brain MRI scans. [Unpublished manuscript]. AI4ALL.

3. Lu, S., Lu, Z., & Zhang, Y.-D. (2019). Pathological brain detection based on AlexNet and transfer learning. Journal of Computational Science, 30, 41–47. https://doi.org/10.1016/j.jocs.2018.11.004

4. Diep, Q. B., Phan, H. Y., & Truong, T.-C. (2024). Crossmixed convolutional neural network for digital speech recognition. PLOS ONE, 19(4), e0302394. https://doi.org/10.1371/journal.pone.0302394


 
