
redbud-tree-depression's People

Contributors

talhanai


redbud-tree-depression's Issues

Question about X_{train, dev} shape and content

Hi @talhanai,

The input tensor to the LSTM model has shape [Nexamples, Ntimesteps, Nfeatures]. Nfeatures is the feature dimension (audio = 279, text = 100), but how do I make sense of Nexamples and Ntimesteps? I am guessing they relate to the timestep and stride parameters mentioned in the paper (audio: 30 timesteps, stride 1; text: 7 timesteps, stride 3). I would appreciate it if you could elaborate on how you used the (timestep, stride) parameters to reshape your feature set into the LSTM input tensor.

Initially, I thought Nexamples referred to the number of responses, e.g. 8,050 (from Sections 4.1.2 and 4.1.3), and thus that the number of examples for audio and for text should be the same. But then this part of Section 4.3.2 confused me:

"The audio and text inputs for each LSTM branch had different strides and timesteps yielding a different number of training (and development) examples, therefore we needed to equalize the number of examples (Audio was 30 timesteps, with stride 1. Text was 7 timesteps, and stride 3). This step was performed by padding the number of training examples in the smaller set (text) to match that larger set (audio) by mapping examples together that appeared in the same window of the interview."
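
For concreteness, here is the windowing I currently imagine; the function and the exact scheme are my guess, not necessarily what the paper does:

import numpy as np

def window_features(features, timesteps, stride):
    """Slice a [Nframes, Nfeatures] matrix into overlapping windows.

    Guessed scheme: each window of `timesteps` consecutive rows becomes
    one example, and windows start every `stride` rows, so
    Nexamples = (Nframes - timesteps) // stride + 1.
    """
    n_frames, n_feats = features.shape
    n_examples = (n_frames - timesteps) // stride + 1
    out = np.empty((n_examples, timesteps, n_feats), dtype=features.dtype)
    for i in range(n_examples):
        start = i * stride
        out[i] = features[start:start + timesteps]
    return out

# audio: 30 timesteps, stride 1; text: 7 timesteps, stride 3
audio = window_features(np.random.rand(100, 279), timesteps=30, stride=1)
text  = window_features(np.random.rand(100, 100), timesteps=7,  stride=3)
print(audio.shape, text.shape)  # (71, 30, 279) (32, 7, 100)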

Thanks in advance!

data/audio/x_train.npy

Hello, is there any code that can generate the training data (e.g. data/audio/x_train.npy), or do we need to write our own code based on the paper?

How to exclude some features?

"From the initial set of 553 features, we excluded all features without a statistically significant univariate correlation with outcomes on the training set (|ρ| < 1e-01, p > 1e-02) nor a significant L1 regularized logistic regression model coefficient (|β| < 1e-04), thus resulting in a subset of 279 features and 8,050 examples (responses)"

How do you exclude features to arrive at the subset of 279 features?
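
Here is how I understood the two filters. The thresholds come from the quoted passage, but everything else (Spearman ρ, scikit-learn, and the either-or reading of "without ... nor ...") is my own guess:

import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LogisticRegression

def select_features(X, y):
    """Guessed reading: drop a feature only if it fails BOTH tests,
    i.e. keep it if |rho| >= 1e-1 with p <= 1e-2, OR |beta| >= 1e-4."""
    rho = np.empty(X.shape[1])
    pval = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        rho[j], pval[j] = spearmanr(X[:, j], y)
    univariate_ok = (np.abs(rho) >= 1e-1) & (pval <= 1e-2)

    # L1-regularized logistic regression over all features
    l1 = LogisticRegression(penalty='l1', solver='liblinear')
    l1.fit(X, y)
    l1_ok = np.abs(l1.coef_.ravel()) >= 1e-4

    keep = univariate_ok | l1_ok
    return X[:, keep], keep

# X: [8050 examples, 553 features], y: binary labels
# X_sub, mask = select_features(X, y)  # should ideally leave 279 features

Is this what you did?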

Feature processing

How do you deal with the audio files containing the interviewer's voice? How do you remove the interviewer's voice? And how do you extract the higher-order statistical features from the 79 COVAREP features?
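
On the last point, this is the kind of per-response summary I currently have in mind; the exact set of statistics is my assumption:

import numpy as np
from scipy.stats import skew, kurtosis

def summarize_covarep(frames):
    """Collapse a [Nframes, 79] COVAREP matrix into one vector.

    Guessed statistics: mean, std, skewness, kurtosis, min, max per
    dimension, giving 79 * 6 = 474 values; the paper may use others.
    """
    return np.concatenate([frames.mean(axis=0),
                           frames.std(axis=0),
                           skew(frames, axis=0),
                           kurtosis(frames, axis=0),
                           frames.min(axis=0),
                           frames.max(axis=0)])

# vec = summarize_covarep(covarep_frames)  # covarep_frames: [Nframes, 79]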

Question about validation and test data

Hi @talhanai,

I hope you can help me out with a question about your trainLSTM.py code.

# train model
model.fit(X_train, Y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_dev, Y_dev),
          class_weight=cweight,
          callbacks=callbacks_list)

# load best model and evaluate
model.load_weights(filepath=filepath_best)

# gotta compile it
model.compile(loss=loss,
              optimizer=sgd,
              metrics=['accuracy'])

# return predictions of best model
pred       = model.predict(X_dev,   batch_size=None, verbose=0, steps=None)
pred_train = model.predict(X_train, batch_size=None, verbose=0, steps=None)

return pred, pred_train

and then, in the calling code:

# 5. evaluate performance
f1 = metrics.f1_score(Y_dev, np.round(pred), pos_label=1)

In particular, I am having trouble understanding why you use X_dev and Y_dev as both the validation data and the test data. Using them for both validation and testing would result in data leakage.

From reading your paper, I understand that you worked only with the training and development sets of the DAIC dataset. So here I am assuming that X_train and Y_train come from the training set, and X_dev and Y_dev from the development set.
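
If I understand correctly, the DAIC test labels were withheld for the AVEC challenge, which would explain reporting on the development set; still, a leakage-free protocol would look something like the following sketch (it reuses the variable names from your snippet, and the 80/20 split is arbitrary):

from sklearn.model_selection import train_test_split

# carve a validation split out of the training set so the development
# set stays untouched until the final evaluation
X_tr, X_val, Y_tr, Y_val = train_test_split(
    X_train, Y_train, test_size=0.2, stratify=Y_train, random_state=0)

model.fit(X_tr, Y_tr,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_val, Y_val),  # drives checkpointing only
          class_weight=cweight,
          callbacks=callbacks_list)

# the development set is now a genuine held-out test set
pred = model.predict(X_dev)
f1 = metrics.f1_score(Y_dev, np.round(pred), pos_label=1)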

Any insights would be very much appreciated!

A problem with accuracy

Hi, sorry to bother you. I tried to reproduce the experimental results of your work, but I ran into a problem. Without any changes to your code, the predictions for train and validation are all negative samples:

  • dev_pred: [[0.30978903], ........, [0.30978903]]
  • pred_train_audio: [[0.30978903], ........, [0.30978903]]

In particular, during training: 1) the loss decreases from 0.6555 to 0.5990; 2) the accuracy is stuck at 0.7143, which is the ratio of negative samples to all samples.
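
That 0.7143 matches the majority-class baseline, so the network seems to have collapsed to always predicting the negative class. A quick sanity check would be (variable names are mine; Y_train holds the binary training labels):

import numpy as np

# accuracy obtained by always predicting the majority class
baseline = max(np.mean(Y_train == 0), np.mean(Y_train == 1))
print('majority-class baseline:', baseline)  # 0.7143 here

# inverse-frequency class weights, one common way to fight the collapse
n = len(Y_train)
cweight = {0: n / (2 * np.sum(Y_train == 0)),
           1: n / (2 * np.sum(Y_train == 1))}
print('class weights:', cweight)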

Can you help me solve this problem? Thank you so much!

How to calculate the MAE and RMSE metrics?

Hi, I have read your paper. I am wondering why the MAE and RMSE values in your experiments are greater than 1. When I use the MAE metric in Keras to train my model, its value stays between 0 and 1. Could you tell me how you calculated the metrics in your paper?
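
My guess (unconfirmed) is that the paper computes MAE and RMSE on the raw PHQ score scale (roughly 0 to 24) rather than on normalized network outputs, which would explain values above 1; Keras reports MAE in whatever units the targets are in, so with 0/1 labels it stays below 1. The metrics themselves are straightforward:

import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# hypothetical PHQ scores: on a 0-24 scale, errors above 1 are expected
y_true = np.array([5.0, 12.0, 0.0, 18.0])
y_pred = np.array([7.0, 9.5, 2.0, 15.0])
print(mae(y_true, y_pred), rmse(y_true, y_pred))  # 2.375 ~2.41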

How to get 8050 training examples (subject responses)?

I really appreciate that you published your code!

I am currently trying to replicate your feature generation process. Could you please elaborate on how you narrowed the training set down to the 8,050 examples? My understanding is that they are only the subject's responses to Ellie's queries, but I am having difficulty arriving at the exact number of training examples that you report.
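
For reference, this is how I currently count participant responses from the DAIC transcripts (the file layout and column names are what I remember from the distributed TRANSCRIPT.csv files, and merging adjacent participant turns, or dropping some sessions, could easily account for the mismatch):

import glob
import pandas as pd

total = 0
for path in glob.glob('*/*_TRANSCRIPT.csv'):  # one transcript per interview
    df = pd.read_csv(path, sep='\t')
    # keep only the subject's turns, i.e. rows not spoken by Ellie
    responses = df[df['speaker'].str.strip().str.lower() == 'participant']
    total += len(responses)
print('participant responses:', total)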

Thanks in advance!
