
Description of architecture #6

Open
wbrandenburger opened this issue Jun 25, 2019 · 2 comments
Labels
question Further information is requested

Comments

wbrandenburger (Collaborator) commented Jun 25, 2019

I need much more information about the architecture. I think it would be helpful if we always had a short table listing the successively applied layers (convolutions (and strides), (un)pooling layers, ReLU functions, skip connections) as well as the input image size and the resulting feature maps. Can you create a document for the latest architecture that describes this sequence? With a current description we can discuss with Matthias a little better.

wbrandenburger added the question label (Further information is requested) and removed the discussion label on Jun 25, 2019
HannesStark (Owner) commented:
ENCODER

Input: 256x256x3

Layer 1: Convolution with 32 filters, kernel size 3, strides of 2, activation function ReLU
Result: 128x128x32

Layer 2: Convolution with 64 filters, kernel size 3, strides of 2, activation function ReLU
Result: 64x64x64

Layer 3: Convolution with 64 filters, kernel size 3, strides of 2, activation function ReLU
Result: 32x32x64

Layer 4: Flatten
Result: 65,536

Layer 5: Dense to 256 with no activation function
Result: 256

Half of Layer 5's output (128 values) is taken as the mean and the other half as the standard deviation, and 128 values are then sampled from the normal distribution:
Result: 128
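
A minimal Keras sketch of this encoder, assuming TensorFlow 2 and `padding='same'` (needed to reproduce the halved spatial sizes above); the names `latent_dim` and `sample` are illustrative, not the repository's actual code:

```python
import tensorflow as tf

latent_dim = 128  # size of the sampled latent vector

# Hypothetical reconstruction of the encoder described above.
encoder = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(256, 256, 3)),
    tf.keras.layers.Conv2D(32, 3, strides=2, padding='same', activation='relu'),  # -> 128x128x32
    tf.keras.layers.Conv2D(64, 3, strides=2, padding='same', activation='relu'),  # -> 64x64x64
    tf.keras.layers.Conv2D(64, 3, strides=2, padding='same', activation='relu'),  # -> 32x32x64
    tf.keras.layers.Flatten(),                                                    # -> 65,536
    tf.keras.layers.Dense(2 * latent_dim),                                        # -> 256, no activation
])

# Reparameterization: split the 256 outputs into two halves and sample
# 128 latent values. Parameterizing the second half as log-variance is
# the common choice; the description above says "standard deviation",
# so treat this as one possible variant.
def sample(x):
    mean, logvar = tf.split(encoder(x), num_or_size_splits=2, axis=1)
    eps = tf.random.normal(shape=tf.shape(mean))
    return mean + tf.exp(0.5 * logvar) * eps  # -> 128 values per example
```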

DECODER

Input: 128

Layer 1: Dense to 32^3 with activation function ReLU
Result: 32^3 = 32,768

Layer 2: Reshape to 32x32x32
Result: 32x32x32

Layer 3: Transposed convolution with 32 filters, kernel size 3, strides of 2, activation function ReLU
Result: 64x64x32

Layer 4: Transposed convolution with 32 filters, kernel size 3, strides of 2, activation function ReLU
Result: 128x128x32

Layer 5: Transposed convolution with 32 filters, kernel size 3, strides of 2, activation function ReLU
Result: 256x256x32

Layer 6: Transposed convolution with 3 filters, kernel size 3, strides of 1, no activation function
Result: 256x256x3
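
A matching sketch for the decoder, under the same assumptions (Keras with `padding='same'`, so each stride-2 transposed convolution doubles the spatial size, which reproduces the shapes listed above):

```python
import tensorflow as tf

# Hypothetical reconstruction of the decoder described above.
decoder = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(128,)),
    tf.keras.layers.Dense(32 * 32 * 32, activation='relu'),                                # -> 32,768
    tf.keras.layers.Reshape((32, 32, 32)),                                                 # -> 32x32x32
    tf.keras.layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu'),  # -> 64x64x32
    tf.keras.layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu'),  # -> 128x128x32
    tf.keras.layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu'),  # -> 256x256x32
    tf.keras.layers.Conv2DTranspose(3, 3, strides=1, padding='same'),                      # -> 256x256x3, no activation
])
```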

HannesStark (Owner) commented:
For comparison, here is the architecture of the CVAE for MNIST, which works well and is easy to reproduce.

ENCODER

Input: 28x28x1

Layer 1: Convolution with 32 filters, kernel size 3, strides of 2, activation function ReLU
Result: 14x14x32

Layer 2: Convolution with 64 filters, kernel size 3, strides of 2, activation function ReLU
Result: 7x7x64

Layer 3: Flatten
Result: 3,136

Layer 4: Dense to 50 with no activation function
Result: 50

Half of Layer 4's output (25 values) is taken as the mean and the other half as the standard deviation, and 25 values are then sampled from the normal distribution:
Result: 25
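
The same kind of Keras sketch for the MNIST encoder (again an assumption about the framework, with `padding='same'`; `mnist_latent_dim` is an illustrative name):

```python
import tensorflow as tf

mnist_latent_dim = 25

# Hypothetical reconstruction of the MNIST CVAE encoder described above.
mnist_encoder = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, strides=2, padding='same', activation='relu'),  # -> 14x14x32
    tf.keras.layers.Conv2D(64, 3, strides=2, padding='same', activation='relu'),  # -> 7x7x64
    tf.keras.layers.Flatten(),                                                    # -> 3,136
    tf.keras.layers.Dense(2 * mnist_latent_dim),                                  # -> 50, split into mean/spread
])
```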

DECODER

Input: 25

Layer 1: Dense to 7×7×32 with activation function ReLU
Result: 7×7×32 = 1,568

Layer 2: Reshape to 7x7x32
Result: 7x7x32

Layer 3: Transposed convolution with 64 filters, kernel size 3, strides of 2, activation function ReLU
Result: 14x14x64

Layer 4: Transposed convolution with 32 filters, kernel size 3, strides of 2, activation function ReLU
Result: 28x28x32

Layer 5: Transposed convolution with 1 filter, kernel size 3, strides of 1, no activation function
Result: 28x28x1
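
And a sketch of the MNIST decoder under the same assumptions:

```python
import tensorflow as tf

# Hypothetical reconstruction of the MNIST CVAE decoder described above.
mnist_decoder = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(25,)),
    tf.keras.layers.Dense(7 * 7 * 32, activation='relu'),                                  # -> 1,568
    tf.keras.layers.Reshape((7, 7, 32)),                                                   # -> 7x7x32
    tf.keras.layers.Conv2DTranspose(64, 3, strides=2, padding='same', activation='relu'),  # -> 14x14x64
    tf.keras.layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu'),  # -> 28x28x32
    tf.keras.layers.Conv2DTranspose(1, 3, strides=1, padding='same'),                      # -> 28x28x1, no activation
])
```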
