How does a FC layer work in a typical CNN
I am new to CNNs and NNs. I am reading this blog: CNN, and one part confuses me: what operation will be performed on the input vector/matrix? Will we be using the typical ANN equation $O = W^T \cdot \text{input}$, and then a sigmoid on top of it?
neural-network
edited Aug 11 at 12:09
Djib2011
asked Aug 11 at 11:05
user57521
2 Answers
Yes. Essentially, a typical CNN consists of two parts:
- The convolution and pooling layers, whose goal is to extract features from the images. These are the first layers in the network.
- The final layer(s), which are usually fully connected (FC) layers, whose goal is to classify those features.
The latter follow the typical equation $f(W^T \cdot X + b)$, where $f$ is an activation function. In the context of CNNs, $f$ is usually a ReLU, except for the activation function of the final layer, which is selected according to the nature of the problem. The most common cases are (sketched below):
- Sigmoid activation functions work for binary classification problems.
- Softmax activation functions work for both binary and multi-class classification problems.
- For regression problems, the final layer has no activation.
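For concreteness, here is a minimal sketch of the two classification activations in plain NumPy (purely illustrative; the example values are my own, not from the answer):

```python
import numpy as np

def sigmoid(z):
    # Binary classification: maps a single score to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Multi-class classification: maps a score vector to probabilities
    # that sum to 1. Subtracting the max keeps the exponentials stable.
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(sigmoid(0.5))                        # ~0.622
print(softmax(np.array([2.0, 1.0, 0.1])))  # three probabilities summing to 1
```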
One final note I'd like to make is that before entering the first FC layer, the output of the previous layer is flattened. By this I mean that the (typically 3) dimensions of that tensor are laid out into one large dimension.
For example, a tensor with a shape of $(5, 5, 32)$, when flattened, would become $(5 \cdot 5 \cdot 32) = (800)$.
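As a minimal sketch of that flatten-then-FC step (plain NumPy; the feature-map shape and the 10-unit layer are illustrative assumptions):

```python
import numpy as np

# A (5, 5, 32) feature map from the last conv/pooling layer,
# flattened into a vector of 5 * 5 * 32 = 800 values.
feature_map = np.random.randn(5, 5, 32)
x = feature_map.reshape(-1)          # shape: (800,)

# One fully connected layer computing f(W^T X + b), here with 10 units.
W = np.random.randn(800, 10) * 0.01  # weight matrix (hypothetical size)
b = np.zeros(10)                     # bias vector

z = W.T @ x + b                      # the "typical ANN equation"
a = np.maximum(z, 0)                 # ReLU, the usual hidden-layer f in CNNs

print(a.shape)                       # (10,)
```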
answered Aug 11 at 12:03
Djib2011
Basically, yes. But in order to pass the input from a convolutional or max-pooling layer to a fully connected one, you need to "flatten" the input tensor. That is, either flatten the tensor/multi-dimensional array coming out of the convolutional layer, or use something like Global Average Pooling, which reduces the tensor to a vector.
You can check code snippets in different frameworks; that will help you understand the process.
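For example, here is a minimal sketch of both options in Keras (the input shape, filter counts, and 10-class head are illustrative assumptions, not from the answer):

```python
import tensorflow as tf

# Option 1: flatten the conv output, then classify with an FC layer.
flatten_model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),                 # (H, W, C) -> (H * W * C,)
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Option 2: Global Average Pooling averages each channel,
# reducing the conv output to a vector of length C directly.
gap_model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),  # (H, W, C) -> (C,)
    tf.keras.layers.Dense(10, activation='softmax'),
])

flatten_model.summary()
gap_model.summary()
```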
It should also be noted that fully connected layers are used not only as the last layer that outputs class probabilities in a CNN; check, for example, VGG networks, which have 2-3 fully connected layers at the end.
One last remark: to get class scores you usually (not always!) use softmax, not a simple sigmoid. Softmax ensures that the values in your output vector sum to 1.
answered Aug 11 at 12:03
Alexandru Burlacu