Binary cross-entropy: works in Keras but not in Lasagne?

votes
0

I'm using the same convolutional neural network structure in Keras and Lasagne. For now I've switched to a simple network to see whether that would change anything, but it didn't.

In Keras it works fine and produces values between 0 and 1 with good accuracy. In Lasagne the values mostly come out wrong; the output seems to be the same for every input.

In short: it trains and predicts fine in Keras, but not in my Lasagne version.

Structure in Lasagne:

def structure(w=5, h=5):
    try:

        input_var = T.tensor4('inputs')
        target_var = T.bmatrix('targets')

        network = lasagne.layers.InputLayer(shape=(None, 1, h, w), input_var=input_var)

        network = lasagne.layers.Conv2DLayer(
            network, num_filters=64, filter_size=(3, 3), stride=1, pad=0,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())

        network = lasagne.layers.Conv2DLayer(
            network, num_filters=64, filter_size=(3, 3), stride=1, pad=0,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())

        network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2), stride=None, pad=(0, 0), ignore_border=True)

        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=256,
            nonlinearity=lasagne.nonlinearities.rectify, W=lasagne.init.GlorotUniform())

        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=1,
            nonlinearity=lasagne.nonlinearities.sigmoid)

        print("...Output", lasagne.layers.get_output_shape(network))

        return network, input_var, target_var

    except Exception as inst:
        print("Failure to build NN!", type(inst), inst.args, inst)

    return None

In Keras:

def getModel(w,h):
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation, Flatten
    from keras.layers import Convolution2D, MaxPooling2D
    from keras.optimizers import SGD

    model = Sequential()

    model.add(Convolution2D(64, 3, 3, border_mode='valid', input_shape=(1, h, w)))
    model.add(Activation('relu'))
    model.add(Convolution2D(64, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Convolution2D(128, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(Convolution2D(128, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    #
    model.add(Flatten())
    #
    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))

    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))

    #
    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='binary_crossentropy', optimizer='sgd')

    return model

And to train in Keras:

model.fit(x, y, batch_size=512, nb_epoch=500, verbose=2, validation_split=0.2, shuffle=True, show_accuracy=True)

And to train and predict in Lasagne:

Training:

prediction = lasagne.layers.get_output(network)

loss = lasagne.objectives.binary_crossentropy(prediction, target_var)
loss = loss.mean()

params = lasagne.layers.get_all_params(network, trainable=True)

# updates = lasagne.updates.sgd(loss, params, learning_rate=learning_rate)
updates = lasagne.updates.nesterov_momentum(loss_or_grads=loss, params=params, learning_rate=learning_rate, momentum=momentum_rho)

#
test_prediction = lasagne.layers.get_output(network, deterministic=True)
test_loss = lasagne.objectives.binary_crossentropy(test_prediction, target_var)
test_loss = test_loss.mean()

# Accuracy
test_acc = lasagne.objectives.binary_accuracy(test_prediction, target_var)
test_acc = test_acc.mean()

train_fn = theano.function([input_var, target_var], loss, updates=updates)
val_fn = theano.function([input_var, target_var], [test_loss, test_acc])
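As a sanity check on what `binary_crossentropy` computes here, a plain NumPy sketch of the per-batch mean loss (illustrative only, not Lasagne's implementation):

```python
import numpy as np

def binary_crossentropy(p, t, eps=1e-7):
    # Element-wise BCE: -(t*log(p) + (1-t)*log(1-p)),
    # clipped to avoid log(0), then averaged over the batch.
    p = np.clip(p, eps, 1 - eps)
    return np.mean(-(t * np.log(p) + (1 - t) * np.log(1 - p)))

# A confident correct prediction gives a small loss;
# an uninformative 0.5 prediction gives log(2) ~ 0.693.
print(binary_crossentropy(np.array([0.99, 0.01]), np.array([1.0, 0.0])))
```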

And I'm using these iterators, which I expect are not the cause of this... or maybe they are?

def iterate_minibatches_getOutput(self, inputs, batchsize):
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt]

def iterate_minibatches(self, inputs, targets, batchsize, shuffle=False):
    assert len(inputs) == len(targets)
    if shuffle:
        indices = np.arange(len(inputs))
        np.random.shuffle(indices)
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]
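One subtlety worth noting (a side observation, not necessarily the bug): the loop bound `len(inputs) - batchsize + 1` silently drops a final partial batch, so any preallocated `predictions` array keeps its initial values in those tail rows. A minimal demonstration:

```python
import numpy as np

def iterate_minibatches_getOutput(inputs, batchsize):
    # Same loop bound as above: stops before a final partial batch.
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        yield inputs[start_idx:start_idx + batchsize]

data = np.arange(10)
batches = list(iterate_minibatches_getOutput(data, batchsize=4))
print(len(batches))                   # 2 full batches of 4
print(sum(len(b) for b in batches))   # 8 of 10 elements; 2 never yielded
```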

Prediction:

test_prediction = lasagne.layers.get_output(self.network, deterministic=True)
predict_fn = theano.function([self.input_var], test_prediction)


index = 0
for batch in self.iterate_minibatches_getOutput(inputs=submission_feature_x, batchsize=self.batch_size):
    y = predict_fn(batch)
    start = index * self.batch_size
    end = (index + 1) * self.batch_size
    predictions[start:end] = y
    index += 1

print("debug -->", predictions[0:10])
print("debug max ---->", np.max(predictions))
print("debug min ----->", np.min(predictions))

This prints:

debug --> [[ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.32534513]]
debug max ----> 1.0
debug min -----> 0.0

The results are completely wrong. What confuses me is that the same setup works fine in Keras.

Also, the validation accuracy never changes:

Epoch 2 of 30 took 9.5846s
  Training loss:                0.22714619
  Validation loss:              0.17278196
  Validation accuracy:          95.85454545 %
Epoch 3 of 30 took 9.6437s
  Training loss:                0.22646923
  Validation loss:              0.17249792
  Validation accuracy:          95.85454545 %
Epoch 4 of 30 took 9.6464s
  Training loss:                0.22563262
  Validation loss:              0.17235395
  Validation accuracy:          95.85454545 %
Epoch 5 of 30 took 10.5069s
  Training loss:                0.22464556
  Validation loss:              0.17226825
  Validation accuracy:          95.85454545 %
...
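A plausible reading of the frozen 95.85% (this assumes the label balance, which the post does not state): if the sigmoid output is stuck below 0.5, binary accuracy thresholds every prediction to 0, and accuracy collapses to the negative-class fraction of the 11416-example validation set:

```python
import numpy as np

# Hypothetical validation labels with ~4.15% positives (474 of 11416)
y = np.zeros(11416)
y[:474] = 1
preds = np.full(11416, 0.325)  # a sigmoid output stuck at one value

# Threshold at 0.5: every prediction becomes 0,
# so accuracy equals the fraction of zero labels.
acc = np.mean((preds > 0.5).astype(int) == y)
print(acc)
```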

Please help! What am I doing wrong?


These are the shapes being used:

x_train.shape  (102746, 1, 17, 17)
y_train.shape  (102746, 1)
x_val.shape  (11416, 1, 17, 17)
y_val.shape  (11416, 1)
Posted 15/04/2016 at 14:32


1 answer

votes
3

The problem was:

target_var = T.bmatrix('targets')

It should be:

target_var = T.fmatrix('targets')

Also, the learning rate was too low.

And in the Keras script there was another mistake:

sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer='sgd')

It should be:

sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)
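The pitfall in that last line: passing the string `'sgd'` makes the framework build a fresh default optimizer and silently ignore the `SGD` instance configured just above. A generic sketch of the string-vs-instance pattern (hypothetical names, not Keras internals):

```python
# Framework-style registry mapping names to default-configured optimizers
OPTIMIZERS = {'sgd': lambda: {'lr': 0.01}}  # framework default lr

def compile_model(optimizer):
    # A string looks up a *fresh default* optimizer, discarding
    # whatever instance the caller configured elsewhere.
    if isinstance(optimizer, str):
        return OPTIMIZERS[optimizer]()
    return optimizer

custom = {'lr': 0.001}
print(compile_model('sgd'))   # default settings, custom lr ignored
print(compile_model(custom))  # configured instance respected
```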
Answered 17/04/2016 at 01:01
