florentf9 / deeptemporalclustering

Keras implementation of the Deep Temporal Clustering (DTC) model

License: MIT License

Python 100.00%
clustering deep-learning dtc time-series

deeptemporalclustering's People

Contributors

arthurgsf, florentf9

deeptemporalclustering's Issues

how to load model.h5

Hi Florent, I have some document vectors and tried to fit the model on them.
I now have some trained models (named like 'model_40.h5'). I want to inspect the clustering details or predict clusters for new document vectors. How can I load these DTC models?
Many thanks.

Practical Use

While the theory is interesting and has some applications, using what you've made is difficult. I would like to see an example.

Loss interpretation

Hi,
Judging by the loss argument of the compile function, isn't the order of the losses loss[0] = reconstruction loss, loss[1] = clustering loss and loss[2] = heatmap loss?

The order below, taken from the code, looks incorrect. Please confirm:
logdict['L'] = loss[0]
logdict['Lr'] = loss[1]
logdict['Lc'] = loss[2]
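
One possible reading, offered as a sketch rather than a confirmed answer: for a Keras model with several outputs, train_on_batch and evaluate return the total weighted loss first, followed by one loss per output in output order. Assuming the model here has a reconstruction output followed by a clustering output (an assumption about this repository, not something shown in the issue), the quoted mapping would be consistent with that convention. The helper label_losses and the numeric values below are hypothetical.

```python
def label_losses(loss_values, output_labels):
    """Pair Keras-style loss values with readable names.

    For a multi-output model Keras reports
    [total, loss_output_1, loss_output_2, ...],
    so the first entry is the combined loss, not an individual term.
    """
    labels = ["L"] + list(output_labels)
    if len(loss_values) != len(labels):
        raise ValueError(
            "expected %d loss values, got %d" % (len(labels), len(loss_values))
        )
    return dict(zip(labels, loss_values))


# Hypothetical values, for illustration only.
logdict = label_losses([0.75, 0.5, 0.25], ["Lr", "Lc"])
print(logdict)
```

If a heatmap output is added, the same convention would put its loss after the others, so the list would have one more entry.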

Heatmap issue

I am getting the heatmap below for the data window shown above it. There is no event in this window, yet the heatmap value is close to 15.

[images]

When there is an event in the data window, the heatmap value is also 15, the same as in the window without an event.

[images]

Any suggestions for generating the heatmap graph?

CuDNNLSTM not found

Hey,

I'm using the newest versions of Keras (2.4.0) and TensorFlow (2.4), and with these I can't load CuDNNLSTM (TAE.py).

I also tried installing an older version of Keras (2.3.0) and a TensorFlow version that includes CuDNNLSTM, but that produces new errors as well.

Could you create a new requirements.txt that lists all the libraries with the specific versions you use?

I hope you can help me.
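
For context: in TensorFlow 2.x, CuDNNLSTM is no longer exported from tf.keras.layers, and the standard LSTM layer uses the cuDNN kernel on GPU when left at its default activations with no recurrent dropout. One workaround people use is an import fallback. The sketch below is not part of TAE.py; resolve_layer and the candidate list are hypothetical, and whether the rest of the repository runs unchanged on newer versions is not established here.

```python
import importlib

# Candidate (module, attribute) pairs, tried in order. Which of these exist
# depends on the installed Keras/TensorFlow versions (an assumption).
DEFAULT_CANDIDATES = (
    ("keras.layers", "CuDNNLSTM"),        # standalone Keras 2.x
    ("tensorflow.keras.layers", "LSTM"),  # TensorFlow 2.x
    ("keras.layers", "LSTM"),
)


def resolve_layer(candidates=DEFAULT_CANDIDATES):
    """Return (layer_class, dotted_name) for the first importable candidate."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not installed, try the next candidate
        layer = getattr(module, attr, None)
        if layer is not None:
            return layer, module_name + "." + attr
    raise ImportError("none of the candidate layers could be imported")


# Intended use inside the model definition (not executed here):
# RecurrentLayer, origin = resolve_layer()
```

Printing the returned dotted name makes it easy to see which implementation was actually picked up in a given environment.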

Agglomerative Clustering without n_clusters

I am testing this out for a music-similarity dataset, which does not have a defined number of clusters. Would your DTC library work the same for use with Agglomerative Clustering where {n_clusters=None, distance_threshold=d, compute_full_tree=True}?

It would seem that TSClusteringLayer and heatmap generation require n_clusters.

input shape

Hello,
I am trying to replicate this DTC with my own data: a single series of 25000 time steps with 17 features. When I pass it to the encoder, the number of time steps is reduced but the number of features is not. I tried transposing the input data, but then I get a dimension error.

Can anyone tell me the correct input dimensions for the encoder?

data shape = (25000, 17),
reshaped = (25000, 1, 17)??
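
The shape errors reported elsewhere on this page suggest the encoder takes 3-D input of shape (n_samples, timesteps, n_features). Under that assumption, one way to prepare a single long series is to cut it into fixed-length windows, each of which becomes one sample to cluster. This is a minimal sketch: make_windows is a hypothetical helper, and the window length of 100 is an arbitrary illustrative choice.

```python
import numpy as np


def make_windows(series, window):
    """Cut a (total_steps, n_features) array into non-overlapping windows.

    Returns an array of shape (n_windows, window, n_features). Trailing
    steps that do not fill a whole window are dropped.
    """
    series = np.asarray(series)
    if series.ndim != 2:
        raise ValueError("expected a 2-D array (total_steps, n_features)")
    n_windows = series.shape[0] // window
    if n_windows == 0:
        raise ValueError("series is shorter than one window")
    trimmed = series[: n_windows * window]
    return trimmed.reshape(n_windows, window, series.shape[1])


data = np.zeros((25000, 17))        # placeholder with the shape from the question
X = make_windows(data, window=100)  # window=100 is an illustrative choice
print(X.shape)                      # (250, 100, 17)
```

With this layout, reshaping to (25000, 1, 17) would instead mean 25000 samples of a single time step each, which leaves nothing temporal for the encoder to compress.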

Heatmap use

Could you please explain how to visualize the heatmap weights on a time series?

About the loss value

When I use

python DeepTemporalClustering.py --heatmap False --dist_metric cid --dataset CBF --pool_size 8

to train the DTC, the loss value always increases suddenly at the 8th epoch, no matter which dataset is used. It looks like the weights of Lr and Lc also change with the epoch, which really confuses me.
Looking forward to your reply, thank you!

Dimension Reduction

Hi, thank you very much for your implementation. While looking into the autoencoder architecture, I didn't understand how it performs dimension reduction: the encoder seems to pass on a vector of the same length because return_sequences is True. I am sorry if I haven't fully understood your implementation.
I am working on an encoder for time series data. My data consists of 200 samples, each with 1500 points (an acceleration signal from a measurement), so its shape is (200, 1500). I want to encode these 200 responses into a latent space of, say, 2 or 4 dimensions. The output of the latent variable should look like (2, 1500).
Can you help me out with how to use this architecture?
Zohaib

ValueError: Input 0 is incompatible with layer AE: expected shape=(None, 5210, 6), found shape=(None, 6)

Hello,
Thank you very much for the code.
I am having some issues using your classes with my own program (also when running the main code in DeepTemporalClustering.py).
My code so far:

dataSource = web.DataReader('S68.SI', 'yahoo', start=start_date, end=end_date)
X_train = dataSource.to_numpy()
# Some constant values
n_clusters = 2
pretrain_optimizer = 'adam'
optimizer = 'adam'
batch_size = 64
# Initialize model
dtc.initialize()
dtc.model.summary()
dtc.compile(gamma=1.0, optimizer=optimizer, initial_heatmap_loss_weight=0.1, final_heatmap_loss_weight=0.9)
# Pre train
dtc.pretrain(X=X_train, optimizer=pretrain_optimizer,
                     epochs=10, batch_size=batch_size,
                     save_dir='results/tmp')

At this point I'm getting the above-mentioned error.

The X_train shape is 5210 by 6, i.e. 5210 timesteps and 6 features.

Upon investigation, it seems that this line of code in TAE.py is causing the problem:
x = Input(shape=(timesteps, input_dim), name='input_seq')
Is it necessary for the Input shape to include the timesteps?
I checked online and it seems that only the features should be part of the input shape. Is this correct?

Thank you and regards.

Nan: Predicted Value

Hi,
All predicted values are NaN when I run your code. In DTC.fit, everything goes well when p and q are calculated for the first time, just after "init_cluster_weights". But once it reaches "model.fit(X_train, [X_train, p]" for the first time and computes the predicted values in the next iteration, all predicted values turn out to be NaN.

I have tried several modifications, including:
increasing the batch size
reducing the learning rate of Adam
gradient clipping
redefining the KL loss function to avoid log(0)

I am sure there is no NaN or inf in the input data. Could you please help me solve the problem?
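
On the "avoid log(0)" attempt, here is a minimal NumPy sketch of a KL divergence that clips both distributions away from zero before taking the logarithm. It only illustrates the numerics and is not the loss used in this repository; safe_kl and the eps value are hypothetical, and inside a Keras model the same clipping would have to be written with backend operations. NaNs can also originate elsewhere (for example in a distance computation), so this may not be the cause.

```python
import numpy as np


def safe_kl(p, q, eps=1e-10):
    """Row-wise KL(p || q) with both inputs clipped to [eps, 1].

    Clipping keeps log() and the division finite when an entry is exactly 0.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)


# Distributions with exact zeros, which would give inf/NaN without clipping.
p = np.array([[0.5, 0.5, 0.0]])
q = np.array([[0.5, 0.0, 0.5]])
kl = safe_kl(p, q)
print(np.isfinite(kl))
```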

Looking forward to your reply and HAPPY CHRISTMAS HOLIDAY!

Dependency Problems with cudnn and Tensorflow

Hi, I have been trying to use this implementation to cluster some time series data in the education domain. Keras on my machine uses the TensorFlow backend (1.13.1) instead of Theano. My cuDNN version was 7.6 and my CUDA toolkit version was 10.0.
Running the code with this configuration in an Anaconda environment gives me the following error (while it prints epoch 1/10 on the terminal):
"Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR"
Can you please tell me whether the implementation depends on particular versions of TensorFlow and the CUDA toolkit? I tried to downgrade TensorFlow to 1.12.0, but then I had to downgrade the CUDA toolkit as well, and it still did not work.

variable time step

Hi, the documentation for the DTC object, found in DeepTemporalClustering.py, indicates that the timesteps parameter can be variable. However, when I instantiate it as follows:

dtc = DTC(n_clusters=3,
          input_dim=X_train.shape[-1],
          timesteps=None,
          n_filters=50,
          kernel_size=10,
          strides=1,
          pool_size=None,
          n_units=[50, 1],
          alpha=1,
          dist_metric='eucl',
          cluster_init='kmeans',
          heatmap=False)

I get an error. There is an assert which raises a TypeError.

TypeError: unsupported operand type(s) for %: 'NoneType' and 'int'

Should I be using 0 instead of None?
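
The TypeError shows that the constructor evaluates timesteps % <some integer>, which cannot work when timesteps is None. Assuming the check requires the series length to be a multiple of pool_size (an inference from the error message, not confirmed by the source), one workaround is to fix a concrete length and trim every series to it. The helper compatible_length below is hypothetical.

```python
def compatible_length(timesteps, pool_size):
    """Largest length <= timesteps that is divisible by pool_size."""
    if timesteps is None:
        raise ValueError("timesteps must be a concrete integer, not None")
    if pool_size <= 0:
        raise ValueError("pool_size must be a positive integer")
    length = (timesteps // pool_size) * pool_size
    if length == 0:
        raise ValueError("series is shorter than pool_size")
    return length


# Example: a series of 130 steps with pool_size=8 would be trimmed to 128.
print(compatible_length(130, 8))  # 128
```

Passing 0 instead of None would presumably satisfy a modulo check, but a model built for zero time steps is unlikely to be useful, so a concrete length seems the safer choice.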

Heatmap

Hello, how can I see the generated heatmap? I set --heapmap to true, but I don't see the generated heatmap.

Assertion error

Hello, I have an assertion error when I run your code. I don't know how to modify it, and I don't know the function of this assertion. Can you explain it?

Problem with Autoencoder Dimensions

Hello, I'm trying to replicate your examples but keep getting this error on the output dimensions of the autoencoder.

Pretraining...
Traceback (most recent call last):
  File "DeepTemporalClustering.py", line 535, in <module>
    save_dir=args.save_dir)
  File "DeepTemporalClustering.py", line 313, in pretrain
    self.autoencoder.fit(X, X, batch_size=batch_size, epochs=epochs, verbose=verbose)
  File "C:\Users\Computer\Anaconda3\lib\site-packages\keras\engine\training.py", line 1154, in fit
    batch_size=batch_size)
  File "C:\Users\Computer\Anaconda3\lib\site-packages\keras\engine\training.py", line 621, in _standardize_user_data
    exception_prefix='target')
  File "C:\Users\Computer\Anaconda3\lib\site-packages\keras\engine\training_utils.py", line 145, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected output_seq to have shape (6400, 1) but got array with shape (128, 1)

The autoencoder output is expecting 6400 = 128 (timesteps) x 50 (n_filters). I know it's in the autoencoder because I checked the output dimensions of the encoder, decoder and autoencoder:

[image: output dimensions of encoder, decoder and autoencoder]

I tried replacing it with the

output = Conv1D(1, kernel_size, strides=strides, padding='same', activation='linear', name='output_seq')(decoded)

line that was commented out in TAE.py but that just returned another error:

ValueError: Input 0 is incompatible with layer output_seq: expected ndim=3, found ndim=4

I also tried using temporal_autoencoder_v2 in TAE.py but that just returned another shape error:

ValueError: Input 0 is incompatible with layer dense: expected shape=(None, 16, 100), found shape=(None, 16, 2)

I am very cautious of playing with the architecture too much as I want to be able to replicate the results. Any suggestions on what to try?

ValueError: The name "reshape" is used 2 times in the model. All layer names should be unique.

Hi,

Running the code in terminal resulted in the following error. Do you happen to know what the problem is? Thanks.

(DeepTemporalClustering) e:\DeepTemporalClustering>python DeepTemporalClustering.py --heatmap true --n_clusters 2 --pool_size 8
Namespace(ae_weights=None, alpha=1.0, batch_size=64, cluster_init='kmeans', dataset='CBF', dist_metric='eucl', epochs=100, eval_epochs=1, final_heatmap_loss_weight=0.9, finetune_heatmap_at_epoch=8, gamma=1.0, heatmap=True, initial_heatmap_loss_weight=0.1, kernel_size=10, n_clusters=2, n_filters=50, n_units=[50, 1], patience=5, pool_size=8, pretrain_epochs=10, save_dir='results/tmp', save_epochs=10, strides=1, tol=0.001)
128
0
2021-01-31 18:22:27.823495: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:AutoGraph could not transform <bound method TSClusteringLayer.call of <TSClusteringLayer.TSClusteringLayer object at 0x000001EAF288B9D0>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Traceback (most recent call last):
  File "DeepTemporalClustering.py", line 516, in <module>
    dtc.initialize()
  File "DeepTemporalClustering.py", line 113, in initialize
    self.model = Model(inputs=self.autoencoder.input,
  File "E:\miniconda3_64\envs\DeepTemporalClustering\lib\site-packages\tensorflow\python\keras\engine\training.py", line 242, in __new__
    return functional.Functional(*args, **kwargs)
  File "E:\miniconda3_64\envs\DeepTemporalClustering\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "E:\miniconda3_64\envs\DeepTemporalClustering\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 115, in __init__
    self._init_graph_network(inputs, outputs)
  File "E:\miniconda3_64\envs\DeepTemporalClustering\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "E:\miniconda3_64\envs\DeepTemporalClustering\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 190, in _init_graph_network
    nodes, nodes_by_depth, layers, _ = _map_graph_network(
  File "E:\miniconda3_64\envs\DeepTemporalClustering\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 941, in _map_graph_network
    raise ValueError('The name "' + name + '" is used ' +
ValueError: The name "reshape" is used 2 times in the model. All layer names should be unique.
