Capsule-Network-Tutorial
This is an easy-to-follow Capsule Network tutorial with clean, readable code: Capsule Network.ipynb
Dynamic Routing Between Capsules
Understanding Hinton’s Capsule Networks blog posts:
class PrimaryCaps(nn.Module):
    def __init__(self, num_capsules=8, in_channels=256, out_channels=32, kernel_size=9):
Per paper: "The second layer (PrimaryCapsules) is a convolutional capsule layer with 32 channels of convolutional 8D capsules (i.e. each primary capsule contains 8 convolutional units with a 9 × 9 kernel and a stride of 2)"
This indicates that num_capsules should be 32 and out_channels should be 8.
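For reference, here is a minimal sketch of a PrimaryCaps layer written with the paper's naming (32 capsule channels of 8D capsules). The single-conv-plus-reshape formulation and the squash helper are my assumptions, not this repo's exact code; one wide convolution is parameter-equivalent to running the parallel 9×9 convolutions separately:

import torch
import torch.nn as nn

class PrimaryCaps(nn.Module):
    # 32 channels of convolutional 8D capsules, 9x9 kernel, stride 2 (per the paper).
    def __init__(self, num_capsules=32, in_channels=256, capsule_dim=8, kernel_size=9):
        super().__init__()
        self.num_capsules = num_capsules
        self.capsule_dim = capsule_dim
        # One conv producing all capsule channels at once; equivalent in
        # parameters to num_capsules parallel convs with capsule_dim outputs each.
        self.conv = nn.Conv2d(in_channels, num_capsules * capsule_dim,
                              kernel_size=kernel_size, stride=2)

    def forward(self, x):
        u = self.conv(x)                               # [B, 32*8, 6, 6] on MNIST
        B, _, H, W = u.shape
        u = u.view(B, self.num_capsules, self.capsule_dim, H, W)
        u = u.permute(0, 1, 3, 4, 2).contiguous()      # put the 8D capsule dim last
        u = u.view(B, -1, self.capsule_dim)            # [B, 32*6*6, 8]
        return squash(u)

def squash(s, dim=-1, eps=1e-8):
    # Squashing non-linearity: short vectors shrink toward 0, long ones approach unit norm.
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

Functionally the tutorial's parameterization (8 convs of 32 channels each) can represent the same mapping; the paper's naming just makes the 32-capsule-channels / 8-dimensions roles explicit.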
Awesome repo. I've been trying to implement an MLP-based caps net, where the primary caps analyze subsets of features in groups, feeding them into per-capsule MLPs whose outputs are then pushed to downstream capsules.
For some reason, I am getting terrible results. Do you know what I might be doing wrong?
What makes MLP capsules different from convolutional primary capsules, and what should I be aware of? How do I optimize to get good results? I think dynamic routing is essential for establishing a hierarchy, but in a non-image context I'm unsure how the affine transformation is helpful.
Can you post code showing how one might do this with MLPs instead of convolutional layers?
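Not a maintainer, but here is one way such a layer could be sketched, assuming a flat feature vector as input; the class name MLPPrimaryCaps, the grouping scheme, and the hidden width are all illustrative choices, not anything from this repo:

import torch
import torch.nn as nn

class MLPPrimaryCaps(nn.Module):
    # Hypothetical MLP-based primary capsules: split the input features into
    # disjoint groups and run a small per-capsule MLP on each group.
    def __init__(self, in_features=256, num_capsules=32, capsule_dim=8, hidden=64):
        super().__init__()
        assert in_features % num_capsules == 0
        self.group_size = in_features // num_capsules
        self.num_capsules = num_capsules
        self.mlps = nn.ModuleList([
            nn.Sequential(nn.Linear(self.group_size, hidden),
                          nn.ReLU(),
                          nn.Linear(hidden, capsule_dim))
            for _ in range(num_capsules)
        ])

    def forward(self, x):                 # x: [B, in_features]
        groups = x.view(x.size(0), self.num_capsules, self.group_size)
        u = torch.stack([mlp(groups[:, i]) for i, mlp in enumerate(self.mlps)], dim=1)
        return squash(u)                  # [B, num_capsules, capsule_dim]

def squash(s, dim=-1, eps=1e-8):
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

One thing to be aware of: unlike convolutional capsules, these MLPs share no weights across positions, so the parameter count per capsule is much higher; that alone could contribute to the degraded results you're seeing.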
Fix issue 13 https://github.com/higgsfield/Capsule-Network-Tutorial/issues/13
I wanted to try working on CIFAR-10, so I changed the channel number to 3 in the ConvLayer and the kernel dim to 24 in the ConvLayer. But it is not working: in PrimaryCaps, the line u = u.view(x.size(0), 32 * 6 * 6, -1) gives me this error:
<ipython-input-4-7b4e2b87bd5c> in forward(self, x)
     14         print("PrimaryCaps {}".format(x.size()))
     15         #u = u.view(x.size(0), 32 * 6 * 6, -1)
---> 16         u = u.view(x.size(0), 32 * 6 * 6, -1)
     17         return self.squash(u)
     18
RuntimeError: invalid argument 2: size '[100 x 1152 x -1]' is invalid for input with 204800 elements at /pytorch/aten/src/TH/THStorage.c:37
Do you have any advice for me?
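Hard to be certain without seeing your full changes, but the numbers in the error are consistent with a shape mismatch from the larger input. A sketch of the arithmetic, assuming the tutorial's architecture (conv1 with a 9×9 kernel and stride 1, primary caps with a 9×9 kernel and stride 2):

def conv_out(size, kernel, stride=1, padding=0):
    # standard convolution output-size formula
    return (size + 2 * padding - kernel) // stride + 1

# MNIST:    28 -> 20 (conv1) -> 6 (primary caps)  => 32 * 6 * 6 = 1152 routes
# CIFAR-10: 32 -> 24 (conv1) -> 8 (primary caps)  => 32 * 8 * 8 = 2048 routes
h = conv_out(conv_out(32, 9), 9, stride=2)   # 8
print(32 * h * h)                            # 2048

# so for CIFAR-10 the reshape should be:
# u = u.view(x.size(0), 32 * 8 * 8, -1)

Note that 100 × 2048 = 204800, matching the element count in your error with a batch size of 100. If the DigitCaps layer also hard-codes num_routes = 32 * 6 * 6, it needs the same 2048 value.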
Hello, I copied the code into PyCharm, but I found that the weights are NaN. Why?
Similar to others, I found many issues with this implementation (a lot of mistakes!). So I decided to create my own. It is bug-free and works very well. You can find it here:
https://github.com/hula-ai/capsule_network_dynamic_routing
To handle:
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
I don't know why the training accuracy gets lower and lower as the epochs go on. Also, the margin loss and reconstruction loss can't be calculated.
In the latest version of PyTorch, loss.data returns the value, while loss.data[0] throws an error.
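For anyone hitting this, a minimal before/after of the fix (the accumulator name is assumed from a typical training loop, not quoted from the repo):

# PyTorch <= 0.3: a loss was a 1-element tensor, so loss.data[0] worked.
# PyTorch >= 0.4: a loss is a 0-dim tensor; indexing it raises the IndexError above.
train_loss += loss.item()    # .item() extracts the Python number from a 0-dim tensor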
Hi, I think the softmax in the routing algorithm is being calculated over the wrong dimension.
Currently the code has:
b_ij = Variable(torch.zeros(1, self.num_routes, self.num_capsules, 1))
...
for iteration in range(num_iterations):
    c_ij = F.softmax(b_ij)
Since the dim parameter is not passed to the F.softmax call, it defaults to dim=1 and computes the softmax over the self.num_routes dimension (the input capsules, 1152 here), whereas the softmax should be computed so that the c_ij between each input capsule and all the capsules in the next layer sum to 1.
Thus the correct call should be:
c_ij = F.softmax(b_ij, dim=2)
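A quick sanity check illustrating the difference, with shapes taken from the snippet above (num_capsules=10 assumed for the MNIST digit capsules):

import torch
import torch.nn.functional as F

num_routes, num_capsules = 1152, 10
b_ij = torch.zeros(1, num_routes, num_capsules, 1)

c_wrong = F.softmax(b_ij, dim=1)   # normalizes over the 1152 input capsules
c_right = F.softmax(b_ij, dim=2)   # normalizes over the 10 output capsules

# For one input capsule, its couplings to all output capsules should sum to 1:
print(c_wrong[0, 0, :, 0].sum())   # tensor(0.0087) -- wrong axis
print(c_right[0, 0, :, 0].sum())   # tensor(1.)     -- as intended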