1x1 Convolution. How does the math work?

I came across Andrew Ng's lecture on 1x1 convolutions.
There he explains that you can use a 1x1x192 convolution to shrink a 28x28x192 volume.

But when I do

import torch

input_ = torch.randn([28, 28, 192])
filter = torch.zeros([1, 1, 192])

# element-wise product; the filter is broadcast over all 28x28 positions
out = torch.mul(input_, filter)

I obviously get a 28x28x192 tensor. So how am I supposed to shrink it?
Do I just add up the results of each 1x1x192 * 1x1x192 kernel product, so that I get a 28x28x1 matrix?

asked 3 hours ago

Mihkel L.

convnet

2 Answers

Let's go back to ordinary convolution: say you have a 28x28x3 image (3 = R, G, B).

I use Keras rather than torch, but I think the principle carries over.

When you apply a 2D convolution and pass a filter size of, say, 3x3, the framework extends your filter from 3x3 to 3x3x3, where the last 3 comes from the depth of the image.

The same happens deeper in the network: after a first convolution layer with 100 filters you obtain a volume of size 28x28x100. At the second convolution layer you only choose the first two dimensions of the filter, say 4x4, and the framework applies a filter of dimension 4x4x100.
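Since the question uses torch, here is a minimal sketch of the same behaviour in PyTorch rather than Keras. The layer sizes mirror the numbers above; the 8 filters in the second layer are an arbitrary choice made for this illustration. PyTorch stores the weights as (filters, input depth, height, width), so the input depth shows up as the second entry.

```python
import torch.nn as nn

# First layer: a 3x3 kernel on a 3-channel (RGB) input, 100 filters.
# PyTorch weight layout: (out_channels, in_channels, kernel_h, kernel_w).
conv1 = nn.Conv2d(in_channels=3, out_channels=100, kernel_size=3)
conv1_shape = tuple(conv1.weight.shape)  # each of the 100 filters is 3x3x3

# Second layer: a 4x4 kernel on the 100-channel output of the first layer.
# out_channels=8 is an arbitrary value chosen for this sketch.
conv2 = nn.Conv2d(in_channels=100, out_channels=8, kernel_size=4)
conv2_shape = tuple(conv2.weight.shape)  # each of the 8 filters is 4x4x100

print(conv1_shape)
print(conv2_shape)
```

You only pass the spatial size (3 or 4); the depth of each filter is fixed by `in_channels`.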

So, to answer your question: if you apply a 1x1 convolution to a 28x28x100 volume with k filters, you obtain an activation map (result) of dimension 28x28xk.

And that is the shrinking suggested by Ng: choose k smaller than the input depth.
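A quick shape check in PyTorch, as a sketch: the value k = 32 is just an example, and note that PyTorch puts the channel dimension before height and width, so 28x28x100 becomes (1, 100, 28, 28) with a batch dimension of 1.

```python
import torch
import torch.nn as nn

k = 32  # number of 1x1 filters; an example value, any k works
x = torch.randn(1, 100, 28, 28)  # (batch, channels, height, width)

# kernel_size=1 means each filter only looks at one pixel across all 100 channels.
conv1x1 = nn.Conv2d(in_channels=100, out_channels=k, kernel_size=1)
y = conv1x1(x)

out_shape = tuple(y.shape)
print(out_shape)
```

Height and width are unchanged; only the depth goes from 100 to k.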

To fully answer your question, the math is simple: apply ordinary convolution with 3D filters. Each output value is the sum of the products of the overlapping elements of the filter and the image.
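To make the "multiply, then sum" step concrete, here is a small sketch in plain Python with made-up numbers: a 2x2 image with 3 channels stands in for 28x28x192, and the helper name `conv1x1_single` is introduced here for illustration only.

```python
# A tiny "image": 2x2 pixels, 3 channels per pixel (stand-in for 28x28x192).
image = [
    [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]],
    [[2.0, 2.0, 2.0], [4.0, 0.0, 1.0]],
]
# One 1x1 filter: one weight per input channel (stand-in for 1x1x192).
weights = [1.0, 0.5, -1.0]


def conv1x1_single(image, weights):
    """Apply one 1x1 filter: per pixel, multiply channel-wise, then sum."""
    return [
        [sum(w * c for w, c in zip(weights, pixel)) for pixel in row]
        for row in image
    ]


out = conv1x1_single(image, weights)
print(out)  # [[-1.0, 0.5], [1.0, 3.0]]
```

The element-wise product alone keeps the channel dimension; it is the sum over channels that collapses each pixel to a single number, so one filter yields a 2x2x1 result here (28x28x1 in the question).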

answered 1 hour ago

Francesco Pegoraro

Your example uses a single filter; the video suggests using 32 filters instead.

Let's take a closer look: each 1x1 convolutional filter acts only locally, looking at a single pixel with its 192 channels. Its output is one value per pixel, which effectively reduces the dimension to 28x28x1. You don't want just this one channel, though, but 32. The solution is simply to use 32 filters: each channel of the resulting 28x28x32 tensor corresponds to one 1x1 conv filter.
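A sketch of this with the tensor layout from the question (height x width x channels). The random `weights` tensor is an assumption standing in for learned filter weights, and bias is left out. Stacking 32 filters turns the per-pixel operation into a 192-vector times a 192x32 matrix, and the result can be checked against PyTorch's own convolution:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
input_ = torch.randn(28, 28, 192)  # height x width x channels, as in the question
weights = torch.randn(32, 192)     # 32 filters, 192 channel weights each (made up)

# Per pixel: 192-vector times a 192x32 matrix gives a 32-vector.
out = input_ @ weights.T           # shape (28, 28, 32)

# Same computation via conv2d, which expects (batch, channels, height, width).
x = input_.permute(2, 0, 1).unsqueeze(0)        # (1, 192, 28, 28)
ref = F.conv2d(x, weights.view(32, 192, 1, 1))  # (1, 32, 28, 28)
ref = ref.squeeze(0).permute(1, 2, 0)           # back to (28, 28, 32)

out_shape = tuple(out.shape)
matches = torch.allclose(out, ref, atol=1e-3)
print(out_shape, matches)
```

Seen this way, a 1x1 convolution is the same small fully connected layer applied independently at every pixel.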

Mathematically: Let $M\in\mathbb{R}^{28\times28\times192}$ be the pixel matrix. Now we can define a 1x1 convolutional filter, call it $\phi:\mathbb{R}^{28\times28\times192}\to\mathbb{R}^{28\times28\times1}$, with $\phi:x_{i,j}\mapsto g(x_{i,j})$ for a $g:\mathbb{R}^{192}\to\mathbb{R}$. In a standard convolutional layer $g$ is a weighted sum of the 192 channel values (plus an optional bias), usually followed by a nonlinear activation; viewed more abstractly, $g$ need not be linear. Consequently we can define $\psi:\mathbb{R}^{28\times28\times192}\to\mathbb{R}^{28\times28\times32}$ using $\psi:x\mapsto\left(\phi_1(x), \dots, \phi_{32}(x)\right)^T$. If we now apply $\psi$ to $M$, we achieve the desired dimensionality reduction to $28\times28\times32$.
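As a sketch of what $g$ looks like for an ordinary convolution layer (the weights $w_c$ and bias $b$ are notation introduced here for illustration; any activation function applied afterwards is left out):

$$g(x_{i,j}) = \sum_{c=1}^{192} w_c \, x_{i,j,c} + b$$

With all $w_c = 1$ and $b = 0$ this reduces to simply adding the 192 channel values; in a trained network the $w_c$ are learned, and each of the 32 filters $\phi_1, \dots, \phi_{32}$ has its own set.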

edited 1 hour ago

answered 2 hours ago

André

Can you show me the math on this? Obviously I don't understand the wording of this problem. =D
– Mihkel L.
2 hours ago

I edited my answer :)
– André
1 hour ago

Okay, I'll show you my thinking. When I multiply 1x1x192 with 1x1x192 I get 1x1x192. I'm not seeing how I can multiply a 1x1x192 matrix with 1x1 and not get a third dimension of 192. The math you showed is what I want, but I don't get what's inside of g.
– Mihkel L.
37 mins ago

Could g just be adding the 192 results together, or should there be more smarts behind it?
– Mihkel L.
30 mins ago
