
python - How does adaptive pooling in pytorch work?

Adaptive pooling is a great function, but how does it work? It appears to insert padding or shrink/expand kernel sizes in what seems like a patterned but fairly arbitrary way. The PyTorch documentation I can find is no more descriptive than "put desired output size here." Does anyone know how this works, or can point to where it's explained?

Some test code on a 1x1x6 tensor, (1,2,3,4,5,6), with an adaptive output of size 8:

import torch
import torch.nn as nn

class TestNet(nn.Module):
    def __init__(self):
        super(TestNet, self).__init__()
        self.avgpool = nn.AdaptiveAvgPool1d(8)  # only the target output length is given

    def forward(self,x):
        print(x)
        x = self.avgpool(x)
        print(x)
        return x

def test():
    x = torch.Tensor([[[1,2,3,4,5,6]]])
    net = TestNet()
    y = net(x)
    return y

test()

Output:

tensor([[[ 1.,  2.,  3.,  4.,  5.,  6.]]])
tensor([[[ 1.0000,  1.5000,  2.5000,  3.0000,  4.0000,  4.5000,  5.5000,
       6.0000]]])

If it mirror-pads by one on the left and right (operating on (1,1,2,3,4,5,6,6)) and has a kernel of 2, then the outputs for all positions except 4 and 5 make sense, except of course the output isn't the right size. Is it also padding the 3 and 4 internally? If so, it's operating on (1,1,2,3,3,4,4,5,6,6), which, with a size 2 kernel, produces the wrong output size and would also miss a 3.5 output. Is it changing the size of the kernel?
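
One quick way to test the mirror-padding guess (a rough sketch; the replicate padding and kernel size 2 are just my assumption about what it might be doing):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.Tensor([[[1, 2, 3, 4, 5, 6]]])

# Hypothesis: repeat the edge values once on each side, then average-pool
# with kernel size 2 and stride 1.
padded = F.pad(x, (1, 1), mode='replicate')            # (1, 1, 2, 3, 4, 5, 6, 6)
guess = F.avg_pool1d(padded, kernel_size=2, stride=1)
print(guess)    # 7 values: 1.0, 1.5, 2.5, 3.5, 4.5, 5.5, 6.0

actual = nn.AdaptiveAvgPool1d(8)(x)
print(actual)   # 8 values: 1.0, 1.5, 2.5, 3.0, 4.0, 4.5, 5.5, 6.0

The guess reproduces most of the values, but it is one element short and never produces the 3.0 and 4.0.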

Am I missing something obvious about the way this works?



1 Reply


In general, pooling reduces dimensions. If you want to increase dimensions, you might want to look at interpolation.
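
For example, to stretch the 6-element input from the question to length 8, you could do something like this (a rough sketch; the linear mode and align_corners choice are just illustrative):

import torch
import torch.nn.functional as F

x = torch.Tensor([[[1, 2, 3, 4, 5, 6]]])  # shape (1, 1, 6)

# Linear interpolation to length 8, instead of "pooling up" to a larger size.
y = F.interpolate(x, size=8, mode='linear', align_corners=True)
print(y)  # 8 evenly spaced values from 1 to 6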

Anyway, let's talk about adaptive pooling in general. You can look at the source code here. Some have claimed that adaptive pooling is the same as standard pooling, with the stride and kernel size calculated from the input and output sizes. Specifically, the following parameters are used:

  1. Stride = (input_size//output_size)
  2. Kernel size = input_size - (output_size-1)*stride
  3. Padding = 0

These are obtained by inverting the standard pooling formula: with padding = 0, output_size = floor((input_size - kernel_size) / stride) + 1, and substituting the kernel size above gives floor((output_size - 1)*stride / stride) + 1 = output_size, so the size always comes out right. However, while these parameters DO produce output of the desired size, the result is not necessarily the same as that of adaptive pooling. Here is a test snippet:

import torch
import torch.nn as nn

in_length = 5
out_length = 3

x = torch.arange(0, in_length).view(1, 1, -1).float()
print(x)

stride = (in_length // out_length)  # parameters derived from the formulas above
avg_pool = nn.AvgPool1d(
        stride=stride,
        kernel_size=(in_length-(out_length-1)*stride),
        padding=0,
    )
adaptive_pool = nn.AdaptiveAvgPool1d(out_length)

print(avg_pool.stride, avg_pool.kernel_size)

y_avg = avg_pool(x)
y_ada = adaptive_pool(x)

print(y_avg)
print(y_ada)
print('Error: ', torch.abs(y_avg - y_ada).sum().item())  # total absolute difference

Output:

tensor([[[0., 1., 2., 3., 4.]]])
(1,) (3,)
tensor([[[1., 2., 3.]]])
tensor([[[0.5000, 2.0000, 3.5000]]])
Error:  1.0

Average pooling pools from elements (0, 1, 2), (1, 2, 3) and (2, 3, 4).

Adaptive pooling pools from elements (0, 1), (1, 2, 3) and (3, 4). (Change the input a bit, as in the sketch below, to see that the middle output is not pooled from element (2) alone.)
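
For instance, making the middle element stand out (a small variation of the snippet above; the value 100 is arbitrary) shows that the middle adaptive output averages three elements rather than copying element (2), while the outer outputs do not see it at all:

import torch
import torch.nn as nn

x = torch.Tensor([[[0, 0, 100, 0, 0]]])

print(nn.AvgPool1d(kernel_size=3, stride=1)(x))  # tensor([[[33.3333, 33.3333, 33.3333]]])
print(nn.AdaptiveAvgPool1d(3)(x))                # tensor([[[ 0.0000, 33.3333,  0.0000]]])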

  • You can tell that adaptive pooling tries to reduce the overlap between pooling windows.
  • The difference can be mitigated by using padding together with count_include_pad=True, but in general I don't think the two can be made exactly identical in 2D or higher for all input/output sizes. One would need different padding on the left and right, which pooling layers don't currently support.
  • From a practical perspective it should not matter much.
  • Check the source code for the actual implementation; a simplified sketch follows.
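
As I read the source, output index i averages the input elements from floor(i * input_size / output_size) up to ceil((i + 1) * input_size / output_size). A minimal 1D sketch under that assumption (a simplified reimplementation, not the actual kernel):

import torch

def adaptive_avg_pool1d_sketch(x, output_size):
    # x has shape (N, C, L); each output bin averages a floor/ceil slice of L.
    length = x.shape[-1]
    out = []
    for i in range(output_size):
        start = (i * length) // output_size                         # floor
        end = ((i + 1) * length + output_size - 1) // output_size   # ceil
        out.append(x[..., start:end].mean(dim=-1, keepdim=True))
    return torch.cat(out, dim=-1)

x = torch.Tensor([[[1, 2, 3, 4, 5, 6]]])
print(adaptive_avg_pool1d_sketch(x, 8))    # matches the 8-value output in the question
print(torch.nn.AdaptiveAvgPool1d(8)(x))

This reproduces both the 5-to-3 example above and the 6-to-8 example from the question, which is consistent with the overlapping, variable-size windows described here.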
