Comments (8)
The probability outputs you listed don't make sense. Probabilities can never be negative. I'd look into that first before digging into anything else :)
from deepmoji.
I'm assuming you're supplying your own PRETRAINED_PATH to deepmoji_transfer() and not the default DeepMoji one?
This seems to be more of a usage question than a problem with the package so I'm closing it here. However, I'm happy to help you brainstorm what the issue could be if you provide more info.
@bfelbo thanks for the feedback!
I have setup everything from scratch again and now the probabilities are all positive :)
I'm assuming you're supplying your own PRETRAINED_PATH to deepmoji_transfer() and not the default DeepMoji one?
Yes, I have used finetune_dataset.py to train the model on a conversation dataset with 6 emotion labels. The fine-tuning worked alright and reported a test accuracy of about 70% at the end, which would be okay!
However, I observe strange behaviour when running the predictions:
- When I run predictions on the same test set that was used during fine-tuning, I only get a fraction of the accuracy reported after fine-tuning (only around 5-20%). I use nearly the same code as in score_emojis.py, the only differences being the use of deepmoji_transfer instead of deepmoji_emojis, and PRETRAINED_PATH pointing to my fine-tuned .hdf5 weights file. I cannot explain the difference in the accuracy and performance of the model. Do you have any idea?
One initial error was that I was using a different max_len in the fine-tuning process than in prediction. Now I am using the same max_len, which improved the results a bit.
import numpy as np

from deepmoji.model_def import deepmoji_transfer
from deepmoji.sentence_tokenizer import SentenceTokenizer


class Predictor:
    vocabulary = set_vocabulary()
    max_len = 20

    def __init__(self):
        self.model = deepmoji_transfer(nb_classes=6, maxlen=Predictor.max_len,
                                       weight_path=FINETUNED_MODEL_PATH,
                                       extend_embedding=1058)
        self.tokenizer = SentenceTokenizer(Predictor.vocabulary, Predictor.max_len)
        self.emotions_dict = {0: 'angry', 1: 'disgusted', 2: 'afraid',
                              3: 'joyful', 4: 'sad', 5: 'surprised'}

    def predict_emotions(self, message):
        message = unicode(message)
        tokenized, _, _ = self.tokenizer.tokenize_sentences([message])
        prob = self.model.predict(tokenized)[0]
        top_prediction_idx_sorted = top_elements(prob, 6)
        prediction_labels_sorted = [self.emotions_dict[idx] for idx in top_prediction_idx_sorted]
        probabilities_sorted = [prob[idx] for idx in top_prediction_idx_sorted]
        return prediction_labels_sorted, probabilities_sorted


def top_elements(array, k):
    # Indices of the k largest entries, sorted by descending value
    ind = np.argpartition(array, -k)[-k:]
    return ind[np.argsort(array[ind])][::-1]
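For reference, the ranking helper can be sanity-checked in isolation on a toy probability vector (the values below are made up for illustration):

```python
import numpy as np

def top_elements(array, k):
    # Indices of the k largest entries, sorted by descending value
    ind = np.argpartition(array, -k)[-k:]
    return ind[np.argsort(array[ind])][::-1]

# Toy probability vector: index 1 is largest (0.40), then 3 (0.25), then 4 (0.15)
probs = np.array([0.05, 0.40, 0.10, 0.25, 0.15, 0.05])
idx = top_elements(probs, 3)
# idx -> [1, 3, 4], i.e. 'disgusted', 'joyful', 'sad' in the emotions_dict above
```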
All predicted probabilities remain below 20%, which is pretty bad :S
- The predictions produced by prob = self.model.predict(tokenized)[0] are not deterministic. When I initialize the model multiple times, I get different predictions on the same sentence with each model. I don't think this behaviour is expected?
Any support is highly appreciated !
The predictions produced by prob = self.model.predict(tokenized)[0] are not deterministic. When I initialize the model multiple times, I get different predictions on the same sentence with each model. I don't think this behaviour is expected?
This sounds strange and definitely not intended. My guess would be that the model is not loading all your weights correctly and some weights in the model are getting randomly initialized, leading to this non-deterministic behavior. Make sure you check that the weights for your final softmax layer are being loaded correctly :)
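One practical check, assuming a Keras model: build the model twice, pull the softmax layer's arrays with layer.get_weights(), and compare them. Correctly loaded weights are identical across instantiations, while a random re-initialization yields different arrays each time. The comparison itself can be sketched with plain numpy (the arrays below merely simulate the two cases):

```python
import numpy as np

def weights_match(w_a, w_b):
    # True if two lists of weight arrays are element-wise (nearly) identical
    return all(np.allclose(a, b) for a, b in zip(w_a, w_b))

# Simulated softmax weights: two random initializations vs. weights read from disk
random_a = [np.random.RandomState(1).randn(64, 6)]
random_b = [np.random.RandomState(2).randn(64, 6)]
loaded_a = [np.full((64, 6), 0.5)]
loaded_b = [np.full((64, 6), 0.5)]

differs = not weights_match(random_a, random_b)  # random init: arrays differ
matches = weights_match(loaded_a, loaded_b)      # loaded weights: arrays match
```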
Make sure you check that the weights for your final softmax layer are being loaded correctly :)
I assume this would probably be the code snippet below from the model_def.py file. I see that the softmax layer is being excluded when you pass your own weights. I have removed this and now the model is deterministic! Thanks a lot for the help, highly appreciated!
Any reason why this layer is being excluded in the first place?
def deepmoji_transfer(nb_classes, maxlen, weight_path=None, extend_embedding=0,
                      embed_dropout_rate=0.25, final_dropout_rate=0.5,
                      embed_l2=1E-6):
    model = deepmoji_architecture(nb_classes=nb_classes,
                                  nb_tokens=NB_TOKENS + extend_embedding,
                                  maxlen=maxlen, embed_dropout_rate=embed_dropout_rate,
                                  final_dropout_rate=final_dropout_rate, embed_l2=embed_l2)
    if weight_path is not None:
        load_specific_weights(model, weight_path,
                              exclude_names=['softmax'],
                              extend_embedding=extend_embedding)
    return model
It's assumed that the deepmoji_transfer method is called before doing the transfer learning (i.e. training a new softmax layer). See this description at the top of the method:
Loads the pretrained DeepMoji model for finetuning/transfer learning. Does not load weights for the softmax layer.
If you're loading all the weights, you could use deepmoji_architecture() directly to define the architecture. You can then call load_specific_weights() with exclude_names=[] or use Keras' model.load_weights() to load all weights.
Thanks a lot for the support! I am now using deepmoji_architecture() and load_specific_weights() as you proposed. Now everything works smoothly.