
Comments (8)

bfelbo commented on June 1, 2024

The probability outputs you listed don't make sense. Probabilities can never be negative. I'd look into that first before digging into anything else :)

from deepmoji.

bfelbo commented on June 1, 2024

I'm assuming you're supplying your own PRETRAINED_PATH to deepmoji_transfer() and not the default DeepMoji one?


bfelbo commented on June 1, 2024

This seems to be more of a usage question than a problem with the package, so I'm closing it here. However, I'm happy to help you brainstorm what the issue could be if you provide more info.


tiimoS commented on June 1, 2024

@bfelbo thanks for the feedback!
I have set everything up from scratch again and now the probabilities are all positive :)

I'm assuming you're supplying your own PRETRAINED_PATH to deepmoji_transfer() and not the default DeepMoji one?
Yes, I used finetune_dataset.py to train the model on a conversation dataset with 6 emotion labels. The fine-tuning worked fine, and a test accuracy of about 70% was reported at the end, which would be okay!

However, I observe strange behaviour when running the predictions:

  1. When I run my predictions on the same test set that was used during fine-tuning, I only get a fraction of the accuracy reported after fine-tuning (only around 5-20%). I use nearly the same code as in score_emojis.py; the only differences are that I use deepmoji_transfer instead of deepmoji_emojis and that PRETRAINED_PATH points to my fine-tuned .hdf5 weights file. I cannot explain the difference in the accuracy and performance of the model. Do you have any idea?

One initial error was that I was using a different max_len during fine-tuning than during prediction. Using the same max_len in both improved the results a bit.

import numpy as np

from deepmoji.model_def import deepmoji_transfer
from deepmoji.sentence_tokenizer import SentenceTokenizer


class Predictor:
    vocabulary = set_vocabulary()
    max_len = 20  # must match the max_len used during fine-tuning

    def __init__(self):
        # Load the fine-tuned weights into the transfer architecture.
        self.model = deepmoji_transfer(nb_classes=6, maxlen=Predictor.max_len,
                                       weight_path=FINETUNED_MODEL_PATH,
                                       extend_embedding=1058)
        self.tokenizer = SentenceTokenizer(Predictor.vocabulary, Predictor.max_len)
        self.emotions_dict = {0: 'angry', 1: 'disgusted', 2: 'afraid',
                              3: 'joyful', 4: 'sad', 5: 'surprised'}

    def predict_emotions(self, message):
        message = unicode(message)  # Python 2
        tokenized, _, _ = self.tokenizer.tokenize_sentences([message])
        prob = self.model.predict(tokenized)[0]
        top_prediction_idx_sorted = top_elements(prob, 6)
        prediction_labels_sorted = [self.emotions_dict[idx] for idx in top_prediction_idx_sorted]
        probabilities_sorted = [prob[idx] for idx in top_prediction_idx_sorted]
        return prediction_labels_sorted, probabilities_sorted


def top_elements(array, k):
    # Indices of the k largest entries, ordered by descending value.
    ind = np.argpartition(array, -k)[-k:]
    return ind[np.argsort(array[ind])][::-1]

All probabilities predicted remain below 20%, which is pretty bad :S

  2. The predictions produced by prob = self.model.predict(tokenized)[0] are not deterministic. When I initialize the model multiple times, I get different predictions for the same sentence with each model. I don't think this behaviour is expected?
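For reference, the top_elements helper above can be sanity-checked on its own with a toy probability vector (the values here are made up):

```python
import numpy as np


def top_elements(array, k):
    # Indices of the k largest entries, ordered by descending value.
    ind = np.argpartition(array, -k)[-k:]
    return ind[np.argsort(array[ind])][::-1]


# Made-up probability vector for 6 classes.
probs = np.array([0.05, 0.40, 0.10, 0.25, 0.15, 0.05])
print(top_elements(probs, 3))  # [1 3 4]
```

With k equal to the number of classes (6, as in the snippet above), this simply returns all class indices sorted by descending probability.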

Any support is highly appreciated !


bfelbo commented on June 1, 2024

The predictions produced by prob = self.model.predict(tokenized)[0] are not deterministic. When I initialize the model multiple times, I get different predictions on the same sentence with each model. I don't think this behaviour is expected?

This sounds strange and definitely not intended. My guess would be that the model is not loading all your weights correctly and some weights in the model are getting randomly initialized, leading to this non-deterministic behavior. Make sure you check that the weights for your final softmax layer are being loaded correctly :)
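This effect can be illustrated without DeepMoji itself. The sketch below (pure numpy, with a made-up dense-plus-softmax layer standing in for the model's final layer) shows why randomly initialized softmax weights produce different predictions on each model initialization, while properly loaded weights are deterministic:

```python
import numpy as np


def softmax_layer(features, weights):
    # Toy stand-in for a final dense layer followed by softmax.
    logits = features.dot(weights)
    e = np.exp(logits - logits.max())
    return e / e.sum()


features = np.ones(8)            # stand-in for the encoder output
saved = np.full((8, 6), 0.1)     # stand-in for checkpointed softmax weights

rng = np.random.RandomState()
random_a = softmax_layer(features, rng.randn(8, 6))  # "weights not loaded"
random_b = softmax_layer(features, rng.randn(8, 6))
loaded_a = softmax_layer(features, saved)            # "weights loaded"
loaded_b = softmax_layer(features, saved)

# Random initialization changes the prediction on every run;
# loaded weights give the same prediction every time.
assert not np.allclose(random_a, random_b)
assert np.allclose(loaded_a, loaded_b)
```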


tiimoS commented on June 1, 2024

Make sure you check that the weights for your final softmax layer are being loaded correctly :)

I assume this would be the code snippet below from the model_def.py file. I see that the softmax layer is excluded even when you pass your own weights. I removed this exclusion and now the model is deterministic! Thanks a lot for the help, highly appreciated!
Any reason why this layer is being excluded in the first place?

def deepmoji_transfer(nb_classes, maxlen, weight_path=None, extend_embedding=0,
                      embed_dropout_rate=0.25, final_dropout_rate=0.5,
                      embed_l2=1E-6):
    model = deepmoji_architecture(nb_classes=nb_classes,
                                  nb_tokens=NB_TOKENS + extend_embedding,
                                  maxlen=maxlen, embed_dropout_rate=embed_dropout_rate,
                                  final_dropout_rate=final_dropout_rate, embed_l2=embed_l2)

    if weight_path is not None:
        # The softmax layer is excluded here, so its weights
        # stay randomly initialized after loading.
        load_specific_weights(model, weight_path,
                              exclude_names=['softmax'],
                              extend_embedding=extend_embedding)
    return model


bfelbo commented on June 1, 2024

It's assumed that the deepmoji_transfer method is called before doing the transfer learning (i.e. before training a new softmax layer). See this description at the top of the method:

Loads the pretrained DeepMoji model for finetuning/transfer learning. Does not load weights for the softmax layer.

If you want to load all the weights, you can use deepmoji_architecture() directly to define the architecture, then either call load_specific_weights() with exclude_names=[] or use Keras' model.load_weights() to load all weights.
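Putting that advice together with the earlier snippets, a minimal sketch could look like the following (nb_classes, maxlen, extend_embedding, and FINETUNED_MODEL_PATH are taken from the code above; this is untested and depends on the DeepMoji package being installed):

```python
from deepmoji.model_def import deepmoji_architecture, load_specific_weights
from deepmoji.global_variables import NB_TOKENS

# Build the architecture directly instead of via deepmoji_transfer(),
# then load *all* weights, including the softmax layer.
model = deepmoji_architecture(nb_classes=6,
                              nb_tokens=NB_TOKENS + 1058,
                              maxlen=20)
load_specific_weights(model, FINETUNED_MODEL_PATH,
                      exclude_names=[],  # do not skip the softmax weights
                      extend_embedding=1058)
```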


tiimoS commented on June 1, 2024

Thanks a lot for the support! I am now using deepmoji_architecture() and load_specific_weights() as you proposed. Now everything works smoothly.

