
seeg's Issues

The results do not match the paper.

Good job! However, when I tried to reproduce the results with the given settings, the FGD of the model I trained did not match the value reported in the paper. The training log is below. Can you identify what went wrong?

2023-04-03 19:20:41,227: PyTorch version: 1.4.0
2023-04-03 19:20:41,227: CUDA version: 10.1
2023-04-03 19:20:41,227: 2 GPUs, default cuda:0
2023-04-03 19:20:41,228: {'GAN_noise_size': 0,
'batch_size': 256,
'config': 'config/multimodal_context_beat_ep200.yml',
'discriminator_lr_weight': 0.2,
'dropout_prob': 0.3,
'epochs': 200,
'eval_net_path': 'output/train_h36m_gesture_autoencoder/gesture_autoencoder_checkpoint_best.bin',
'freeze_wordembed': False,
'hidden_size': 300,
'input_context': 'audio',
'learning_rate': 0.0005,
'loader_workers': 16,
'loss_gan_weight': 5.0,
'loss_kld_weight': 0.1,
'loss_reg_weight': 0.05,
'loss_regression_weight': 500.0,
'loss_warmup': 10,
'mean_dir_vec': [[0.0154009],
[-0.9690125],
[-0.0884354],
[-0.0022264],
[-0.8655276],
[0.4342174],
[-0.0035145],
[-0.8755367],
[-0.4121039],
[-0.9236511],
[0.3061306],
[-0.0012415],
[-0.5155854],
[0.8129665],
[0.0871897],
[0.2348464],
[0.1846561],
[0.8091402],
[0.9271948],
[0.2960011],
[-0.013189],
[0.5233978],
[0.8092403],
[0.0725451],
[-0.2037076],
[0.1924306],
[0.8196916]],
'mean_pose': [[3.06e-05],
[0.0004946],
[0.0008437],
[0.0033759],
[-0.2051629],
[-0.0143453],
[0.0031566],
[-0.3054764],
[0.0411491],
[0.0029072],
[-0.4254303],
[-0.001311],
[-0.1458413],
[-0.1505532],
[-0.0138192],
[-0.2835603],
[0.0670333],
[0.0107002],
[-0.2280813],
[0.112117],
[0.2087789],
[0.1523502],
[-0.1521499],
[-0.0161503],
[0.291909],
[0.0644232],
[0.0040145],
[0.2452035],
[0.1115339],
[0.2051307]],
'model': 'multimodal_context',
'model_save_path': 'output/train_beat_with-cat_emb_with-lrelu_200ep_dis-add-layer_2023-04-03-19_20_41',
'motion_resampling_framerate': 15,
'n_layers': 4,
'n_poses': 34,
'n_pre_poses': 4,
'name': 'multimodal_context',
'pose_representation': '3d_vec',
'random_seed': -1,
'save_result_video': True,
'subdivision_stride': 10,
'test_data_path': ['data/ted_dataset/lmdb_test'],
'train_data_path': ['data/ted_dataset/lmdb_train'],
'val_data_path': ['data/ted_dataset/lmdb_val'],
'wordembed_dim': 300,
'wordembed_path': 'data/fasttext/crawl-300d-2M-subword.bin',
'z_type': 'speaker'}
2023-04-03 19:20:41,228: Reading data 'data/ted_dataset/lmdb_train'...
2023-04-03 19:20:41,228: Found the cache data/ted_dataset/lmdb_train_cache
2023-04-03 19:20:41,229: Reading data 'data/ted_dataset/lmdb_val'...
2023-04-03 19:20:41,230: Found the cache data/ted_dataset/lmdb_val_cache
2023-04-03 19:20:41,230: Reading data 'data/ted_dataset/lmdb_test'...
2023-04-03 19:20:41,230: Found the cache data/ted_dataset/lmdb_test_cache
2023-04-03 19:20:41,230: building a language model...
2023-04-03 19:20:41,230: loaded from data/ted_dataset/vocab_cache.pkl
2023-04-03 19:22:04,617: [VAL] loss: 0.163, joint mae: 0.06099, accel diff: 0.00236, FGD: 231.373, feat_D: 90.965 / 77.7s
2023-04-03 19:22:04,618: *** BEST VALIDATION LOSS: 231.373
2023-04-03 19:22:05,151: Saved the checkpoint
2023-04-03 19:29:38,933: EP 0 (156) | 8m 57s, 541 samples/s | loss: 44.316, KLD: 0.011, DIV_REG: -0.373,
2023-04-03 19:33:40,215: EP 0 (312) | 12m 58s, 480 samples/s | loss: 41.546, KLD: 0.011, DIV_REG: -0.327,
2023-04-03 19:36:49,128: EP 0 (468) | 16m 7s, 534 samples/s | loss: 39.885, KLD: 0.013, DIV_REG: -0.426,
2023-04-03 19:39:19,590: EP 0 (624) | 18m 38s, 500 samples/s | loss: 39.260, KLD: 0.018, DIV_REG: -0.483,
2023-04-03 19:41:30,009: EP 0 (780) | 20m 48s, 510 samples/s | loss: 38.680, KLD: 0.026, DIV_REG: -0.549,
2023-04-03 19:42:51,142: [VAL] loss: 0.114, joint mae: 0.03950, accel diff: 0.00255, FGD: 213.396, feat_D: 73.584 / 79.7s
2023-04-03 19:42:51,143: *** BEST VALIDATION LOSS: 213.396
2023-04-03 19:42:51,711: Saved the checkpoint
2023-04-03 19:46:17,849: EP 1 (156) | 25m 36s, 513 samples/s | loss: 38.114, KLD: 0.034, DIV_REG: -0.621,
2023-04-03 19:49:09,744: EP 1 (312) | 28m 28s, 520 samples/s | loss: 37.637, KLD: 0.041, DIV_REG: -0.662,
2023-04-03 19:51:53,730: EP 1 (468) | 31m 12s, 552 samples/s | loss: 37.183, KLD: 0.048, DIV_REG: -0.724,
2023-04-03 19:54:25,152: EP 1 (624) | 33m 43s, 483 samples/s | loss: 36.862, KLD: 0.054, DIV_REG: -0.770,
2023-04-03 19:56:47,302: EP 1 (780) | 36m 5s, 572 samples/s | loss: 36.474, KLD: 0.060, DIV_REG: -0.836,
2023-04-03 19:58:08,175: [VAL] loss: 0.113, joint mae: 0.03928, accel diff: 0.00258, FGD: 192.299, feat_D: 73.296 / 79.5s
2023-04-03 19:58:08,176: *** BEST VALIDATION LOSS: 192.299
2023-04-03 19:58:08,747: Saved the checkpoint
2023-04-03 20:01:33,645: EP 2 (156) | 40m 52s, 515 samples/s | loss: 36.272, KLD: 0.065, DIV_REG: -0.891,
2023-04-03 20:04:22,613: EP 2 (312) | 43m 41s, 543 samples/s | loss: 36.034, KLD: 0.069, DIV_REG: -0.953,
2023-04-03 20:07:04,479: EP 2 (468) | 46m 23s, 506 samples/s | loss: 35.748, KLD: 0.074, DIV_REG: -1.005,
2023-04-03 20:09:35,175: EP 2 (624) | 48m 53s, 136 samples/s | loss: 35.513, KLD: 0.078, DIV_REG: -1.076,
2023-04-03 20:11:59,147: EP 2 (780) | 51m 17s, 529 samples/s | loss: 35.277, KLD: 0.081, DIV_REG: -1.151,
2023-04-03 20:13:20,638: [VAL] loss: 0.113, joint mae: 0.03925, accel diff: 0.00258, FGD: 179.544, feat_D: 73.309 / 80.0s
2023-04-03 20:13:20,639: *** BEST VALIDATION LOSS: 179.544
2023-04-03 20:13:21,233: Saved the checkpoint
2023-04-03 20:16:55,393: EP 3 (156) | 56m 14s, 584 samples/s | loss: 34.959, KLD: 0.085, DIV_REG: -1.212,
2023-04-03 20:19:47,767: EP 3 (312) | 59m 6s, 515 samples/s | loss: 34.806, KLD: 0.088, DIV_REG: -1.283,
2023-04-03 20:22:30,763: EP 3 (468) | 61m 49s, 471 samples/s | loss: 34.728, KLD: 0.090, DIV_REG: -1.335,
2023-04-03 20:25:01,649: EP 3 (624) | 64m 20s, 500 samples/s | loss: 34.382, KLD: 0.092, DIV_REG: -1.402,
2023-04-03 20:27:27,613: EP 3 (780) | 66m 46s, 477 samples/s | loss: 34.216, KLD: 0.094, DIV_REG: -1.453,
2023-04-03 20:28:48,844: [VAL] loss: 0.114, joint mae: 0.03963, accel diff: 0.00260, FGD: 170.022, feat_D: 73.421 / 79.7s
2023-04-03 20:28:48,845: *** BEST VALIDATION LOSS: 170.022
2023-04-03 20:28:49,463: Saved the checkpoint
2023-04-03 20:32:23,060: EP 4 (156) | 71m 41s, 539 samples/s | loss: 34.081, KLD: 0.095, DIV_REG: -1.511,
2023-04-03 20:35:22,058: EP 4 (312) | 74m 40s, 522 samples/s | loss: 34.009, KLD: 0.096, DIV_REG: -1.582,
2023-04-03 20:38:10,138: EP 4 (468) | 77m 28s, 491 samples/s | loss: 33.686, KLD: 0.097, DIV_REG: -1.633,
2023-04-03 20:40:50,572: EP 4 (624) | 80m 9s, 504 samples/s | loss: 33.670, KLD: 0.098, DIV_REG: -1.697,
2023-04-03 20:43:21,220: EP 4 (780) | 82m 39s, 510 samples/s | loss: 33.488, KLD: 0.098, DIV_REG: -1.722,
2023-04-03 20:44:42,185: [VAL] loss: 0.114, joint mae: 0.03949, accel diff: 0.00259, FGD: 157.507, feat_D: 73.292 / 79.5s
2023-04-03 20:44:42,186: *** BEST VALIDATION LOSS: 157.507
2023-04-03 20:44:42,787: Saved the checkpoint
2023-04-03 20:48:20,144: EP 5 (156) | 87m 38s, 543 samples/s | loss: 33.453, KLD: 0.098, DIV_REG: -1.794,
2023-04-03 20:51:19,736: EP 5 (312) | 90m 38s, 463 samples/s | loss: 33.235, KLD: 0.098, DIV_REG: -1.835,
2023-04-03 20:54:12,778: EP 5 (468) | 93m 31s, 452 samples/s | loss: 33.175, KLD: 0.098, DIV_REG: -1.861,
2023-04-03 20:56:53,481: EP 5 (624) | 96m 12s, 424 samples/s | loss: 33.122, KLD: 0.098, DIV_REG: -1.900,
2023-04-03 20:59:48,576: EP 5 (780) | 99m 7s, 508 samples/s | loss: 33.074, KLD: 0.098, DIV_REG: -1.927,
2023-04-03 21:01:09,537: [VAL] loss: 0.115, joint mae: 0.03969, accel diff: 0.00259, FGD: 151.947, feat_D: 73.636 / 79.7s
2023-04-03 21:01:09,538: *** BEST VALIDATION LOSS: 151.947
2023-04-03 21:01:10,133: Saved the checkpoint
2023-04-03 21:05:09,722: EP 6 (156) | 104m 28s, 560 samples/s | loss: 32.867, KLD: 0.097, DIV_REG: -1.973,
2023-04-03 21:08:15,264: EP 6 (312) | 107m 33s, 491 samples/s | loss: 32.860, KLD: 0.097, DIV_REG: -1.995,
2023-04-03 21:11:07,136: EP 6 (468) | 110m 25s, 537 samples/s | loss: 32.834, KLD: 0.097, DIV_REG: -2.021,
2023-04-03 21:13:40,498: EP 6 (624) | 112m 59s, 511 samples/s | loss: 32.830, KLD: 0.097, DIV_REG: -2.052,
2023-04-03 21:16:07,461: EP 6 (780) | 115m 26s, 529 samples/s | loss: 32.728, KLD: 0.098, DIV_REG: -2.084,
2023-04-03 21:17:28,712: [VAL] loss: 0.116, joint mae: 0.04008, accel diff: 0.00258, FGD: 142.532, feat_D: 74.084 / 79.7s
2023-04-03 21:17:28,713: *** BEST VALIDATION LOSS: 142.532
2023-04-03 21:17:29,269: Saved the checkpoint
2023-04-03 21:20:23,333: EP 7 (156) | 119m 42s, 533 samples/s | loss: 32.533, KLD: 0.098, DIV_REG: -2.093,
2023-04-03 21:22:31,282: EP 7 (312) | 121m 49s, 536 samples/s | loss: 32.640, KLD: 0.098, DIV_REG: -2.125,
2023-04-03 21:24:03,300: EP 7 (468) | 123m 21s, 326 samples/s | loss: 32.452, KLD: 0.097, DIV_REG: -2.141,
2023-04-03 21:25:25,536: EP 7 (624) | 124m 44s, 533 samples/s | loss: 32.523, KLD: 0.097, DIV_REG: -2.166,
2023-04-03 21:26:44,911: EP 7 (780) | 126m 3s, 612 samples/s | loss: 32.543, KLD: 0.097, DIV_REG: -2.177,
2023-04-03 21:28:05,493: [VAL] loss: 0.115, joint mae: 0.04011, accel diff: 0.00258, FGD: 141.492, feat_D: 74.010 / 79.0s
2023-04-03 21:28:05,494: *** BEST VALIDATION LOSS: 141.492
2023-04-03 21:28:06,058: Saved the checkpoint
2023-04-03 21:30:04,757: EP 8 (156) | 129m 23s, 356 samples/s | loss: 32.433, KLD: 0.096, DIV_REG: -2.200,
2023-04-03 21:31:49,412: EP 8 (312) | 131m 8s, 553 samples/s | loss: 32.293, KLD: 0.096, DIV_REG: -2.179,
2023-04-03 21:33:15,824: EP 8 (468) | 132m 34s, 565 samples/s | loss: 32.235, KLD: 0.097, DIV_REG: -2.221,
2023-04-03 21:34:34,820: EP 8 (624) | 133m 53s, 587 samples/s | loss: 32.261, KLD: 0.097, DIV_REG: -2.225,
2023-04-03 21:35:53,075: EP 8 (780) | 135m 11s, 581 samples/s | loss: 32.294, KLD: 0.096, DIV_REG: -2.247,
2023-04-03 21:37:13,929: [VAL] loss: 0.117, joint mae: 0.04056, accel diff: 0.00258, FGD: 137.467, feat_D: 74.355 / 79.2s
2023-04-03 21:37:13,930: *** BEST VALIDATION LOSS: 137.467
2023-04-03 21:37:14,500: Saved the checkpoint
2023-04-03 21:39:07,209: EP 9 (156) | 138m 25s, 554 samples/s | loss: 32.193, KLD: 0.096, DIV_REG: -2.265,
2023-04-03 21:40:48,792: EP 9 (312) | 140m 7s, 543 samples/s | loss: 32.190, KLD: 0.095, DIV_REG: -2.268,
2023-04-03 21:42:18,787: EP 9 (468) | 141m 37s, 563 samples/s | loss: 32.051, KLD: 0.095, DIV_REG: -2.276,
2023-04-03 21:43:34,018: EP 9 (624) | 142m 52s, 537 samples/s | loss: 32.141, KLD: 0.095, DIV_REG: -2.291,
2023-04-03 21:44:53,100: EP 9 (780) | 144m 11s, 574 samples/s | loss: 31.999, KLD: 0.095, DIV_REG: -2.306,
2023-04-03 21:46:14,119: [VAL] loss: 0.117, joint mae: 0.04083, accel diff: 0.00257, FGD: 137.812, feat_D: 74.562 / 79.4s
2023-04-03 21:46:14,120: best validation loss so far: 137.467 at EPOCH 9
2023-04-03 21:46:14,647: Saved the checkpoint
2023-04-03 21:48:00,942: EP 10 (156) | 147m 19s, 550 samples/s | loss: 32.110, KLD: 0.095, DIV_REG: -2.316,
2023-04-03 21:49:37,853: EP 10 (312) | 148m 56s, 436 samples/s | loss: 32.034, KLD: 0.095, DIV_REG: -2.309,
2023-04-03 21:51:09,313: EP 10 (468) | 150m 27s, 376 samples/s | loss: 31.805, KLD: 0.095, DIV_REG: -2.322,
2023-04-03 21:52:42,833: EP 10 (624) | 152m 1s, 281 samples/s | loss: 31.974, KLD: 0.095, DIV_REG: -2.337,
2023-04-03 21:54:04,110: EP 10 (780) | 153m 22s, 536 samples/s | loss: 31.751, KLD: 0.095, DIV_REG: -2.349,
2023-04-03 21:55:18,336: [VAL] loss: 0.117, joint mae: 0.04091, accel diff: 0.00258, FGD: 134.009, feat_D: 75.092 / 72.6s
2023-04-03 21:55:18,337: *** BEST VALIDATION LOSS: 134.009
2023-04-03 21:55:18,902: Saved the checkpoint
2023-04-03 21:58:24,244: EP 11 (156) | 157m 42s, 233 samples/s | loss: 31.920, gen: 3.477, dis: 0.000, KLD: 0.094, DIV_REG: -2.338,
2023-04-03 22:01:31,057: EP 11 (312) | 160m 49s, 249 samples/s | loss: 31.757, gen: 3.484, dis: 0.000, KLD: 0.094, DIV_REG: -2.377,
2023-04-03 22:04:40,562: EP 11 (468) | 163m 59s, 242 samples/s | loss: 31.769, gen: 3.493, dis: 0.000, KLD: 0.095, DIV_REG: -2.353,
2023-04-03 22:07:48,177: EP 11 (624) | 167m 6s, 227 samples/s | loss: 31.798, gen: 3.501, dis: 0.000, KLD: 0.095, DIV_REG: -2.375,
2023-04-03 22:10:52,274: EP 11 (780) | 170m 10s, 238 samples/s | loss: 31.712, gen: 3.503, dis: 0.000, KLD: 0.095, DIV_REG: -2.372,
...
2023-04-06 06:50:01,238: EP 196 (312) | 3569m 19s, 239 samples/s | loss: 29.604, gen: 3.586, dis: 0.000, KLD: 0.139, DIV_REG: -2.699,
2023-04-06 06:52:48,415: EP 196 (468) | 3572m 7s, 246 samples/s | loss: 29.521, gen: 3.587, dis: 0.000, KLD: 0.139, DIV_REG: -2.685,
2023-04-06 06:55:44,495: EP 196 (624) | 3575m 3s, 227 samples/s | loss: 29.409, gen: 3.579, dis: 0.000, KLD: 0.139, DIV_REG: -2.717,
2023-04-06 06:58:39,586: EP 196 (780) | 3577m 58s, 225 samples/s | loss: 29.638, gen: 3.586, dis: 0.000, KLD: 0.139, DIV_REG: -2.693,
2023-04-06 06:59:04,010: [VAL] loss: 0.124, joint mae: 0.04391, accel diff: 0.00251, FGD: 96.078, feat_D: 77.709 / 22.1s
2023-04-06 06:59:04,011: best validation loss so far: 92.642 at EPOCH 196
2023-04-06 07:02:02,511: EP 197 (156) | 3581m 21s, 249 samples/s | loss: 29.508, gen: 3.579, dis: 0.000, KLD: 0.139, DIV_REG: -2.697,
2023-04-06 07:04:52,607: EP 197 (312) | 3584m 11s, 243 samples/s | loss: 29.474, gen: 3.572, dis: 0.000, KLD: 0.139, DIV_REG: -2.702,
2023-04-06 07:07:40,048: EP 197 (468) | 3586m 58s, 233 samples/s | loss: 29.623, gen: 3.582, dis: 0.000, KLD: 0.139, DIV_REG: -2.698,
2023-04-06 07:10:25,800: EP 197 (624) | 3589m 44s, 240 samples/s | loss: 29.549, gen: 3.573, dis: 0.000, KLD: 0.139, DIV_REG: -2.703,
2023-04-06 07:13:13,151: EP 197 (780) | 3592m 31s, 228 samples/s | loss: 29.625, gen: 3.576, dis: 0.000, KLD: 0.139, DIV_REG: -2.716,
2023-04-06 07:13:38,024: [VAL] loss: 0.124, joint mae: 0.04386, accel diff: 0.00252, FGD: 94.531, feat_D: 77.749 / 22.8s
2023-04-06 07:13:38,025: best validation loss so far: 92.642 at EPOCH 196
2023-04-06 07:16:35,221: EP 198 (156) | 3595m 53s, 237 samples/s | loss: 29.624, gen: 3.582, dis: 0.000, KLD: 0.138, DIV_REG: -2.717,
2023-04-06 07:19:27,460: EP 198 (312) | 3598m 46s, 227 samples/s | loss: 29.587, gen: 3.588, dis: 0.000, KLD: 0.139, DIV_REG: -2.704,
2023-04-06 07:22:20,742: EP 198 (468) | 3601m 39s, 240 samples/s | loss: 29.528, gen: 3.580, dis: 0.000, KLD: 0.139, DIV_REG: -2.697,
2023-04-06 07:25:09,966: EP 198 (624) | 3604m 28s, 234 samples/s | loss: 29.491, gen: 3.578, dis: 0.000, KLD: 0.139, DIV_REG: -2.707,
2023-04-06 07:27:59,485: EP 198 (780) | 3607m 18s, 234 samples/s | loss: 29.544, gen: 3.579, dis: 0.000, KLD: 0.139, DIV_REG: -2.695,
2023-04-06 07:28:24,571: [VAL] loss: 0.124, joint mae: 0.04387, accel diff: 0.00251, FGD: 96.358, feat_D: 77.631 / 22.9s
2023-04-06 07:28:24,572: best validation loss so far: 92.642 at EPOCH 196
2023-04-06 07:31:15,323: EP 199 (156) | 3610m 34s, 232 samples/s | loss: 29.599, gen: 3.579, dis: 0.000, KLD: 0.139, DIV_REG: -2.722,
2023-04-06 07:34:01,768: EP 199 (312) | 3613m 20s, 235 samples/s | loss: 29.642, gen: 3.580, dis: 0.000, KLD: 0.138, DIV_REG: -2.708,
2023-04-06 07:36:48,889: EP 199 (468) | 3616m 7s, 242 samples/s | loss: 29.409, gen: 3.572, dis: 0.000, KLD: 0.138, DIV_REG: -2.694,
2023-04-06 07:39:35,402: EP 199 (624) | 3618m 54s, 252 samples/s | loss: 29.493, gen: 3.574, dis: 0.000, KLD: 0.138, DIV_REG: -2.695,
2023-04-06 07:42:24,087: EP 199 (780) | 3621m 42s, 235 samples/s | loss: 29.578, gen: 3.577, dis: 0.000, KLD: 0.138, DIV_REG: -2.708,
2023-04-06 07:42:26,224: --------- best loss values ---------
2023-04-06 07:42:26,225: loss: 0.113 at EPOCH 3
2023-04-06 07:42:26,225: joint_mae: 0.039 at EPOCH 3
2023-04-06 07:42:26,225: frechet: 92.642 at EPOCH 196
2023-04-06 07:42:26,225: feat_dist: 73.292 at EPOCH 5
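To make the FGD trend in a log like the one above easier to inspect, here is a minimal sketch that pulls the metrics out of the `[VAL]` lines and reports the lowest FGD. The function names `parse_val_lines` and `best_fgd` are hypothetical helpers, not part of the released code, and the regular expression assumes the exact `[VAL]` line layout shown in this log.

```python
import re

# Matches the metric fields of a "[VAL]" line as printed in the log above.
# Assumption: the field order and labels are exactly as in this log.
VAL_RE = re.compile(
    r"\[VAL\] loss: (-?[\d.]+), joint mae: (-?[\d.]+), "
    r"accel diff: (-?[\d.]+), FGD: (-?[\d.]+), feat_D: (-?[\d.]+)"
)


def parse_val_lines(log_text):
    """Return one dict of metrics per [VAL] line, in log order."""
    rows = []
    for line in log_text.splitlines():
        match = VAL_RE.search(line)
        if match is None:
            continue  # skip training lines, config dump, etc.
        loss, joint_mae, accel_diff, fgd, feat_d = map(float, match.groups())
        rows.append(
            {
                "loss": loss,
                "joint_mae": joint_mae,
                "accel_diff": accel_diff,
                "fgd": fgd,
                "feat_d": feat_d,
            }
        )
    return rows


def best_fgd(rows):
    """Return (index, row) of the validation pass with the lowest FGD."""
    return min(enumerate(rows), key=lambda pair: pair[1]["fgd"])


if __name__ == "__main__":
    # Two [VAL] lines copied from the log above.
    sample = (
        "2023-04-03 19:22:04,617: [VAL] loss: 0.163, joint mae: 0.06099, "
        "accel diff: 0.00236, FGD: 231.373, feat_D: 90.965 / 77.7s\n"
        "2023-04-03 19:42:51,142: [VAL] loss: 0.114, joint mae: 0.03950, "
        "accel diff: 0.00255, FGD: 213.396, feat_D: 73.584 / 79.7s\n"
    )
    rows = parse_val_lines(sample)
    index, row = best_fgd(rows)
    print(index, row["fgd"])
```

The index returned is the position of the validation pass in the log, not necessarily the epoch number, because this log runs one validation pass before the first training epoch.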

semantic prompter

Nice work. I could not find any code related to the semantic prompter, and the default bash script seems to train a Tri model. Is there anything wrong with the released version? Could you give more instructions?
