
Comments (8)

bfelbo commented on June 1, 2024

The probability outputs you listed don't make sense. Probabilities can never be negative. I'd look into that first before digging into anything else :)

from deepmoji.

bfelbo commented on June 1, 2024

I'm assuming you're supplying your own PRETRAINED_PATH to deepmoji_transfer() and not the default DeepMoji one?


bfelbo commented on June 1, 2024

This seems to be more of a usage question than a problem with the package, so I'm closing it here. However, I'm happy to help you brainstorm what the issue could be if you provide more info.


tiimoS commented on June 1, 2024

@bfelbo thanks for the feedback!
I have set everything up from scratch again and now the probabilities are all positive :)

I'm assuming you're supplying your own PRETRAINED_PATH to deepmoji_transfer() and not the default DeepMoji one?
Yes, I used finetune_dataset.py to train the model on a conversation dataset with 6 emotion labels. The fine-tuning worked fine, and a test accuracy of about 70% was reported at the end, which would be okay!

However, I observe strange behaviour when running the predictions:

  1. When I run my predictions on the same test set that was used during fine-tuning, I only get a fraction of the accuracy reported after fine-tuning (only around 5-20%). I use nearly the same code as in score_emojis.py; the only differences are that I use deepmoji_transfer instead of deepmoji_emojis and that PRETRAINED_PATH points to my fine-tuned .hdf5 weights file. I cannot explain the difference in the accuracy and performance of the model. Do you have any idea?

One initial error was that I was using a different max_len during fine-tuning than during prediction. Using the same max_len in both improved the results a bit.

import numpy as np

from deepmoji.model_def import deepmoji_transfer
from deepmoji.sentence_tokenizer import SentenceTokenizer


class Predictor:
    vocabulary = set_vocabulary()
    max_len = 20  # must match the max_len used during fine-tuning

    def __init__(self):
        # Load the fine-tuned weights into the transfer architecture.
        self.model = deepmoji_transfer(nb_classes=6, maxlen=Predictor.max_len,
                                       weight_path=FINETUNED_MODEL_PATH,
                                       extend_embedding=1058)
        self.tokenizer = SentenceTokenizer(Predictor.vocabulary, Predictor.max_len)
        self.emotions_dict = {0: 'angry', 1: 'disgusted', 2: 'afraid',
                              3: 'joyful', 4: 'sad', 5: 'surprised'}

    def predict_emotions(self, message):
        message = unicode(message)  # Python 2
        tokenized, _, _ = self.tokenizer.tokenize_sentences([message])
        prob = self.model.predict(tokenized)[0]
        top_prediction_idx_sorted = top_elements(prob, 6)
        prediction_labels_sorted = [self.emotions_dict[idx] for idx in top_prediction_idx_sorted]
        probabilities_sorted = [prob[idx] for idx in top_prediction_idx_sorted]
        return prediction_labels_sorted, probabilities_sorted


def top_elements(array, k):
    # Indices of the k largest entries, ordered by descending value.
    ind = np.argpartition(array, -k)[-k:]
    return ind[np.argsort(array[ind])][::-1]

All probabilities predicted remain below 20%, which is pretty bad :S

  2. The predictions produced by prob = self.model.predict(tokenized)[0] are not deterministic. When I initialize the model multiple times, I get different predictions for the same sentence with each model. I don't think this behaviour is expected?
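For reference, the top_elements helper above can be sanity-checked on its own with a toy probability vector (the values here are made up):

```python
import numpy as np


def top_elements(array, k):
    # Indices of the k largest entries, ordered by descending value.
    ind = np.argpartition(array, -k)[-k:]
    return ind[np.argsort(array[ind])][::-1]


# Made-up probability vector for 6 classes.
probs = np.array([0.05, 0.40, 0.10, 0.25, 0.15, 0.05])
print(top_elements(probs, 3))  # [1 3 4]
```

With k equal to the number of classes (6, as in the snippet above), this simply returns all class indices sorted by descending probability.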

Any support is highly appreciated !


bfelbo commented on June 1, 2024

The predictions produced by prob = self.model.predict(tokenized)[0] are not deterministic. When I initialize the model multiple times, I get different predictions on the same sentence with each model. I don't think this behaviour is expected?

This sounds strange and definitely not intended. My guess would be that the model is not loading all your weights correctly and some weights in the model are getting randomly initialized, leading to this non-deterministic behavior. Make sure you check that the weights for your final softmax layer are being loaded correctly :)
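This effect can be illustrated without DeepMoji itself. The sketch below (pure numpy, with a made-up dense-plus-softmax layer standing in for the model's final layer) shows why randomly initialized softmax weights produce different predictions on each model initialization, while properly loaded weights are deterministic:

```python
import numpy as np


def softmax_layer(features, weights):
    # Toy stand-in for a final dense layer followed by softmax.
    logits = features.dot(weights)
    e = np.exp(logits - logits.max())
    return e / e.sum()


features = np.ones(8)            # stand-in for the encoder output
saved = np.full((8, 6), 0.1)     # stand-in for checkpointed softmax weights

rng = np.random.RandomState()
random_a = softmax_layer(features, rng.randn(8, 6))  # "weights not loaded"
random_b = softmax_layer(features, rng.randn(8, 6))
loaded_a = softmax_layer(features, saved)            # "weights loaded"
loaded_b = softmax_layer(features, saved)

# Random initialization changes the prediction on every run;
# loaded weights give the same prediction every time.
assert not np.allclose(random_a, random_b)
assert np.allclose(loaded_a, loaded_b)
```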


tiimoS commented on June 1, 2024

Make sure you check that the weights for your final softmax layer are being loaded correctly :)

I assume this would be the code snippet below from the model_def.py file. I see that the softmax layer is excluded even when you pass your own weights. I removed this exclusion and now the model is deterministic! Thanks a lot for the help, highly appreciated!
Any reason why this layer is being excluded in the first place?

def deepmoji_transfer(nb_classes, maxlen, weight_path=None, extend_embedding=0,
                      embed_dropout_rate=0.25, final_dropout_rate=0.5,
                      embed_l2=1E-6):
    model = deepmoji_architecture(nb_classes=nb_classes,
                                  nb_tokens=NB_TOKENS + extend_embedding,
                                  maxlen=maxlen, embed_dropout_rate=embed_dropout_rate,
                                  final_dropout_rate=final_dropout_rate, embed_l2=embed_l2)

    if weight_path is not None:
        # The softmax layer is excluded here, so its weights
        # stay randomly initialized after loading.
        load_specific_weights(model, weight_path,
                              exclude_names=['softmax'],
                              extend_embedding=extend_embedding)
    return model


bfelbo commented on June 1, 2024

It's assumed that the deepmoji_transfer method is called before doing the transfer learning (i.e. before training a new softmax layer). See this description at the top of the method:

Loads the pretrained DeepMoji model for finetuning/transfer learning. Does not load weights for the softmax layer.

If you want to load all the weights, you can use deepmoji_architecture() directly to define the architecture, then either call load_specific_weights() with exclude_names=[] or use Keras' model.load_weights() to load all weights.
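Putting that advice together with the earlier snippets, a minimal sketch could look like the following (nb_classes, maxlen, extend_embedding, and FINETUNED_MODEL_PATH are taken from the code above; this is untested and depends on the DeepMoji package being installed):

```python
from deepmoji.model_def import deepmoji_architecture, load_specific_weights
from deepmoji.global_variables import NB_TOKENS

# Build the architecture directly instead of via deepmoji_transfer(),
# then load *all* weights, including the softmax layer.
model = deepmoji_architecture(nb_classes=6,
                              nb_tokens=NB_TOKENS + 1058,
                              maxlen=20)
load_specific_weights(model, FINETUNED_MODEL_PATH,
                      exclude_names=[],  # do not skip the softmax weights
                      extend_embedding=1058)
```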


tiimoS commented on June 1, 2024

Thanks a lot for the support! I am now using deepmoji_architecture() and load_specific_weights() as you proposed. Now everything works smoothly.

