1) Overview

    - 60 million parameters and 650,000 neurons

    - Consists of 5 convolutional layers, some max-pooling layers, and 3 fully-connected layers.

    - Uses a 1000-way softmax for classification.

    - To make training faster, AlexNet used non-saturating neurons (ReLU) and an efficient GPU implementation of the convolution operation.

    - Uses dropout to reduce overfitting in the fully-connected layers.


2) ReLU activation function

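    - LeNet-5 used the tanh function.

    - AlexNet used the ReLU function, f(x) = max(0, x). Because ReLU does not saturate for positive inputs, the network trains several times faster than an equivalent network with tanh units.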
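
To illustrate why a non-saturating activation helps training, here is a minimal sketch that compares the gradients of tanh and ReLU. The function names are hypothetical and chosen only for this example.

```python
import math

def relu(x: float) -> float:
    """ReLU activation: f(x) = max(0, x)."""
    return max(0.0, x)

def relu_grad(x: float) -> float:
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def tanh_grad(x: float) -> float:
    """Gradient of tanh: 1 - tanh(x)^2, which shrinks toward 0 as |x| grows."""
    return 1.0 - math.tanh(x) ** 2

# As the input grows, the tanh gradient vanishes while the ReLU gradient stays at 1.
for x in (0.5, 2.0, 5.0):
    print(f"x={x}: tanh'={tanh_grad(x):.6f}, relu'={relu_grad(x):.1f}")
```

For large positive inputs the tanh gradient is close to zero, so weight updates become tiny, while the ReLU gradient stays constant.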


3) Dropout

    - AlexNet used dropout with probability 0.5 in the first two fully-connected layers.
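
A minimal sketch of dropout with probability 0.5, assuming the scheme of zeroing outputs during training and scaling outputs by the keep probability at test time. The helper names `dropout_train` and `dropout_test` are hypothetical; modern frameworks usually implement the "inverted" variant, which rescales at training time instead.

```python
import random

def dropout_train(activations, p_drop=0.5, rng=random):
    """Training: set each activation to zero with probability p_drop."""
    return [0.0 if rng.random() < p_drop else a for a in activations]

def dropout_test(activations, p_drop=0.5):
    """Test: keep every neuron but scale outputs by the keep probability."""
    return [a * (1.0 - p_drop) for a in activations]

rng = random.Random(0)  # seeded so the sketch is reproducible
hidden = [1.0] * 1000
dropped = dropout_train(hidden, rng=rng)
print("kept during training:", sum(1 for v in dropped if v != 0.0))
print("test-time output for [2.0, 4.0]:", dropout_test([2.0, 4.0]))
```

With p_drop = 0.5, about half of the neurons are silenced in each training pass, so a neuron cannot rely on the presence of particular other neurons.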


4) Overlapping pooling

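    - AlexNet used max pooling with a 3 x 3 window and stride 2, so neighbouring windows overlap.

    - Overlapping pooling produces output of the same dimensions as non-overlapping pooling (2 x 2 window, stride 2), but it lowered the top-1 and top-5 error rates by 0.4% and 0.3%.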
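
The relation between window size z and stride s can be sketched in one dimension. This is an illustrative example, not the original implementation; the 55-wide input is an assumption taken from the CONV1 output size commonly quoted for AlexNet.

```python
def pooled_size(n: int, z: int, s: int) -> int:
    """Output length when pooling n inputs with window z and stride s (no padding)."""
    return (n - z) // s + 1

def max_pool_1d(row, z, s):
    """1-D max pooling: take the max of each length-z window, moving s steps at a time."""
    return [max(row[i:i + z]) for i in range(0, len(row) - z + 1, s)]

row = [5, 1, 2, 8, 3, 4, 9]
overlapping = max_pool_1d(row, z=3, s=2)      # windows share one element -> [5, 8, 9]
non_overlapping = max_pool_1d(row, z=2, s=2)  # windows are disjoint      -> [5, 8, 4]
print(overlapping, non_overlapping)

# Both schemes give the same output size on a 55-wide feature map (assumed size).
print(pooled_size(55, z=3, s=2), pooled_size(55, z=2, s=2))  # 27 27
```

Pooling overlaps whenever s < z. The output size is the same in both cases, but each overlapping window sees one extra input.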


5) Local Response Normalization

    - This layer is useful when we are dealing with ReLU neurons, whose outputs are unbounded.

    - Each activation is normalized by the activity of its neighbouring channels at the same spatial position, so a strongly activated neuron stands out from its neighbours. This makes it easier to detect high-frequency features with a large response.

    - In contrast, if all neighbouring neurons are large, normalization diminishes all of them.
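
The two cases above can be checked with a small sketch of local response normalization at a single spatial position. It assumes the formula b_i = a_i / (k + alpha * sum_j a_j^2)^beta, where j runs over the n adjacent channels, with the hyperparameters k = 2, n = 5, alpha = 1e-4, beta = 0.75 reported in the AlexNet paper. The function name is hypothetical.

```python
def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize each channel activation by the squared activity of its n neighbours.

    a: activations of all channels at one spatial position (list of floats).
    """
    num_channels = len(a)
    half = n // 2
    out = []
    for i in range(num_channels):
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        sq_sum = sum(a[j] ** 2 for j in range(lo, hi + 1))
        out.append(a[i] / (k + alpha * sq_sum) ** beta)
    return out

lone = local_response_norm([0.0, 0.0, 100.0, 0.0, 0.0])        # one strong neuron
crowd = local_response_norm([100.0, 100.0, 100.0, 100.0, 100.0])  # all neurons strong
print(lone[2], crowd[2])
```

The middle neuron keeps more of its response when its neighbours are quiet than when every neighbour is equally large.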


6) Data augmentation

    - Artificially enlarge the dataset using label-preserving transformations.

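    - To keep augmentation computationally free, the transformed images are generated on the CPU while the GPU is training on the previous batch.

    (1) For training, take random 224 x 224 crops (and their horizontal reflections) from the 256 x 256 images. For testing, take five 224 x 224 crops (four corners and the center) plus their horizontal reflections, and average the predictions over the ten patches.

    (2) Alter the intensities of the RGB channels in training images, using PCA on the RGB pixel values of the training set.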
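
A minimal sketch of the crop-and-reflect augmentation in (1), using nested lists as a stand-in for a single-channel image. All function names are hypothetical, and the tiny 8 x 8 image with 6 x 6 crops only stands in for the real image and crop sizes.

```python
import random

def crop(img, top, left, size):
    """Cut a size x size patch whose top-left corner is at (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]

def hflip(img):
    """Horizontal reflection: reverse every row."""
    return [row[::-1] for row in img]

def random_train_patch(img, size, rng):
    """Training: random crop, then flip with probability 0.5."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - size)
    left = rng.randint(0, w - size)
    patch = crop(img, top, left, size)
    return hflip(patch) if rng.random() < 0.5 else patch

def ten_test_patches(img, size):
    """Test: four corner crops and the center crop, plus their reflections."""
    h, w = len(img), len(img[0])
    positions = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size),
                 ((h - size) // 2, (w - size) // 2)]
    patches = [crop(img, t, l, size) for t, l in positions]
    return patches + [hflip(p) for p in patches]

img = [[r * 8 + c for c in range(8)] for r in range(8)]
train_patch = random_train_patch(img, 6, random.Random(0))
test_patches = ten_test_patches(img, 6)
print(len(train_patch), len(train_patch[0]), len(test_patches))
```

At test time the network's predictions on the ten patches would be averaged; that step is omitted here because it needs a trained model.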



ref : http://cs231n.stanford.edu/slides/2020/lecture_9.pdf


1) GPU

    - A single GTX 580 GPU, which they used, has only 3GB of memory. This limits the maximum size of the networks that can be trained on it.

    - They spread the network across two GPUs.

    - The GPUs communicate only in certain layers (CONV3, FC6, FC7, FC8).


2) stride 4 for CONV1

    - They applied stride 4 at CONV1. Because the input image is large, the larger stride shrinks the output feature map and reduces the computation in CONV1 and the following layers.
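
The effect of the stride on the output size can be sketched with the usual convolution size formula. The 227 x 227 input and the 11 x 11 filter are assumptions taken from the commonly quoted AlexNet configuration, not from the notes above; the helper name is hypothetical.

```python
def conv_output_size(n: int, f: int, stride: int, pad: int = 0) -> int:
    """Spatial output size of a convolution: (n + 2*pad - f) // stride + 1."""
    return (n + 2 * pad - f) // stride + 1

side_s4 = conv_output_size(227, 11, stride=4)  # 55
side_s1 = conv_output_size(227, 11, stride=1)  # 217

# Number of output positions per filter, and hence the relative cost of CONV1.
ratio = (side_s1 ** 2) / (side_s4 ** 2)
print(side_s4, side_s1, round(ratio, 1))
```

Under these assumptions, stride 4 evaluates each filter at more than 15 times fewer positions than stride 1 would.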




