byungsook / deep-fluids

Deep Fluids: A Generative Network for Parameterized Fluid Simulations

Home Page: http://www.byungsoo.me/project/deep-fluids

Python 97.84% Batchfile 2.16%

deep-fluids's Issues

Why `curl` in `build_test_model` rather than `jacobian3`?

Hi @byungsook. In the build_test_model function, curl is used:

            self.G_ = curl(self.G_s)

while in build_model method, jacobian3 is called:

            _, self.G_ = jacobian3(self.G_s)

Why is that? I am looking to advect the model predictions in order to obtain files in .vdb format. Thus, I plan to call jacobian3 just as in training, and then use the advect method from the smoke3_vel_buo.py script.

mantaflow OpenGL functions not declared in scope

Hi, I'm currently trying to set up mantaflow and I keep running into the issue of OpenGL functions not being found when running make with -DGUI=ON. I've made sure to install all the required packages, so this issue has me stumped.

Input pipeline: batches created without 'replacing'

Hello,
As far as I can tell, when input data is read from file and enqueued in the FIFO queue (from which batches are dequeued), the code samples from a uniform distribution with replacement (data.py, line 128). Is there any reason why sampling without replacement was not used? I think sampling without replacement is the common practice: all the data is guaranteed to be 'seen' once in each epoch, leading to better training in the short run.

Thank you!

edited: replaced 'with' with 'without', sorry for the confusion
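
To make the distinction concrete, here is a minimal sketch of the two sampling schemes using only the Python standard library. It is not the code from data.py; the function names and the toy dataset of 10 indices are assumptions made for illustration.

```python
import random

def batches_with_replacement(num_items, batch_size, num_batches, rng):
    """Draw every index independently; an item may repeat or never appear."""
    return [[rng.randrange(num_items) for _ in range(batch_size)]
            for _ in range(num_batches)]

def batches_without_replacement(num_items, batch_size, rng):
    """Shuffle once per epoch, then slice; each item appears exactly once."""
    order = list(range(num_items))
    rng.shuffle(order)
    return [order[i:i + batch_size] for i in range(0, num_items, batch_size)]

rng = random.Random(0)  # fixed seed so the toy run is repeatable
epoch_with = batches_with_replacement(10, 5, 2, rng)
epoch_without = batches_without_replacement(10, 5, rng)

# Flatten the epoch and check that it covers every index exactly once.
seen_without = sorted(i for batch in epoch_without for i in batch)
print("with replacement:   ", epoch_with)
print("without replacement:", epoch_without)
print("every item seen once:", seen_without == list(range(10)))
```

With replacement, coverage of the dataset within one epoch is only guaranteed in expectation; the shuffle-and-slice variant guarantees it by construction.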

Why is the output of EncoderBE3 of size z_num?

Why is the output shape of EncoderBE3 z_num? According to the paper, I think it should be c_num. Also, there are no supervised parameters P in the AE and AE3 networks.

training error

Dear Kim,
I ran the dataset generation and it worked. But when I run python main.py for training, I get an error message:
KeyError: "Registering two gradient with name 'BlockLSTM'! (Previous registration was in register /home/symphony/.conda/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/framework/registry.py:66)

I connect remotely to a Red Hat Linux server with TensorFlow 1.14.0. Could you please help me? Many thanks.

Different normalization for training and test data

@byungsook, could you please explain why the training and test data are normalized differently?
In data.py, line 332, the "y" variable is normalized by:

for i, ri in enumerate(y_range): 
    y[i] = (y[i]-ri[0]) / (ri[1]-ri[0]) * 2 - 1

whereas in trainer.py in line 327 we have:

c1 = p1/float(y1-1)*2-1
c2 = p2/float(y2-1)*2-1

In plain language, in the first case we have: (p_current - p_min)/(p_max - p_min)*2 - 1
and in the second: p_current/(p_number_of_values - 1)*2 - 1
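
The two formulas give the same value if p1 is the index of a sample in a set of evenly spaced parameter values between p_min and p_max. Whether that holds for a given dataset is an assumption to check, not something this page confirms. A small sketch with made-up values (the function names and the numbers 0.2, 0.8 and 7 are illustrative, not taken from the repository):

```python
def normalize_by_range(p, p_min, p_max):
    """Form used on the training side: map [p_min, p_max] to [-1, 1]."""
    return (p - p_min) / (p_max - p_min) * 2 - 1

def normalize_by_index(index, num_values):
    """Form used on the test side: map index 0..num_values-1 to [-1, 1]."""
    return index / float(num_values - 1) * 2 - 1

p_min, p_max, num_values = 0.2, 0.8, 7
step = (p_max - p_min) / (num_values - 1)   # spacing of the evenly spaced samples

max_diff = 0.0
for index in range(num_values):
    p = p_min + index * step                # physical value of the index-th sample
    a = normalize_by_range(p, p_min, p_max)
    b = normalize_by_index(index, num_values)
    max_diff = max(max_diff, abs(a - b))

print("largest difference between the two forms:", max_diff)
```

Substituting p = p_min + i * (p_max - p_min) / (n - 1) into the first formula gives i / (n - 1) * 2 - 1, which is the second formula. If the samples are not evenly spaced, the two forms disagree.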

Torch implementation

Hi @byungsook. I reimplemented the 3D smoke experiments in PyTorch in my fork of your project https://github.com/vivanov879/deep-fluids. I also described the experiments and my analysis of the results in the paper https://github.com/vivanov879/deep-fluids/blob/master/PyTorch%20implementation%20of%20Deep%20Fluids.pdf. Feel free to share your thoughts on that, and thanks for the awesome project.

Visualizing the output

I retrained the network and can produce new output. However, I don't know how to display the result the way you pictured it in your paper. Is there an app I should use, such as Blender?

Many thanks for your great work.

Tensor shape or OOM error for smoke3_mov200_f400 training

TL;DR: Could you please provide the commands for training a moving 3D smoke source?

Setup Info:
Tensorflow version: tensorflow-gpu, 1.12.0, Channel: pypi

Issue:
In accordance with run.bat, I generated the smoke3_mov200_f400 dataset using the following command:

python main.py --arch=ae --z_num=16 --max_epoch=10 --filter=64 --is_3d=True --lr_max=0.00005 --dataset=smoke3_mov200_f400 --res_x=48 --res_y=72 --res_z=48 --batch_size=4 --num_worker=1

Then trained the AE model using,

python main.py --arch=ae --z_num=16 --max_epoch=10 --filter=64 --is_3d=True --lr_max=0.00005 --dataset=smoke3_mov200_f400 --res_x=48 --res_y=72 --res_z=48 --batch_size=4 --num_worker=1

However, when I try to generate the latent code set (trained model stored in log/smoke3_mov200_f400/1213_021319_ae_tag/) with

python main.py --is_train=False --load_path=log/smoke3_mov200_f400/1213_021319_ae_tag/  --arch=ae --z_num=16 --max_epoch=20 --is_3d=True --dataset=smoke_mov200_f400 --res_x=48 --res_y=72 --res_z=48 --test_batch_size=4

I run into the following error,

Traceback (most recent call last):
  File "main.py", line 35, in <module>
    main(config)
  File "main.py", line 21, in main
    trainer = Trainer3(config, batch_manager)
  File "/code/deep-fluids/trainer.py", line 123, in __init__
    self.sess = sv.prepare_or_wait_for_session(config=sess_config)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 730, in prepare_or_wait_for_session
    init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 288, in prepare_session
    config=config)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 218, in _restore_checkpoint
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1582, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

I modified the latent code command with --filter=64, but then I ran out of memory,

Traceback (most recent call last):
  File "main.py", line 35, in <module>
    main(config)
  File "main.py", line 31, in main
    trainer.test()
  File "/code/deep-fluids/trainer.py", line 308, in test
    self.test_ae()
  File "/code/deep-fluids/trainer.py", line 502, in test_ae
    c = self.sess.run(self.z, {self.x: x})
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[100,64,48,72,48] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node AE_1/enc/0_conv/Conv3D (defined at /root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py:1057)  = Conv3D[T=DT_FLOAT, data_format="NDHWC", dilations=[1, 1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_Placeholder_3_0_0/_1283, AE/enc/0_conv/weights/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Even after reducing the batch size via --test_batch_size=1, I still get OOM.

Since the paper suggests that all the training was done on an NVIDIA Titan X (12 GB) GPU and I have the Titan X Pascal (12 GB), I'm assuming that memory shouldn't be the problem.
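
For scale, the size of the single tensor named in the OOM message can be worked out from its shape. The sketch below assumes 4 bytes per element (float32, matching the "type float" in the log); the helper name tensor_bytes is illustrative.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for one dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element

shape = (100, 64, 48, 72, 48)        # shape reported in the ResourceExhaustedError
size_bytes = tensor_bytes(shape)
size_gib = size_bytes / 2**30

print(f"{size_bytes} bytes = {size_gib:.2f} GiB")
# prints "4246732800 bytes = 3.96 GiB"
```

One activation of this shape already takes about a third of a 12 GB card, and an encoder usually keeps more than one such buffer alive. Note also that the leading dimension in the logged shape is 100 rather than the value passed as --test_batch_size, so it may be worth checking where that 100 comes from.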

Could you please provide the commands for training a moving 3D smoke source? run.bat only provides them for smoke3_mov, 2D. Alternatively, could you please help me identify what I'm doing wrong?

Thanks

Any plans to release pre-trained weights?

I was wondering if there are any plans to release pre-trained weights or checkpoints for a partially trained network?

I'm currently training the model with data generated by ./scene/smoke_pos_size.py, but it'd be nice if access to pre-trained weights was provided.

running error

Sorry to ask a question that seems to have nothing to do with this project. It seems there is something wrong with my mantaflow installation, but I don't know how to fix it. When I run smoke_pos_size.py, I get an error:
NameError: name 'copyGridToArrayMAC' is not defined
Not only copyGridToArrayMAC: none of the NumPy conversion functions work. Have you ever encountered this problem?

PS: I have reinstalled mantaflow, and the same error still happens.

Using own data?

I have some fire spread simulation data that's not generated by mantaflow and is in a NumPy array. I looked at the data generated by the scripts in this repo, but I don't know what the format is.

If I run manta scene/smoke_pos_size.py, it generates a bunch of .npz files with fields x and y. x has shape (128,96,2) and y has shape (3,). What do those shapes represent? I want to massage my data to fit so I can use it.

Velocity field -> density

Hi @byungsook. Can you please clarify one point for me: is it possible to move from velocity field to density? While reading the Deep Fluids paper, I noticed that the main focus is the velocity field. However, I can't grasp how to move from a velocity field to density. For example, the "Algorithm 1, Simulation with the Latent Space Integration Network" is about velocity field generation. However, the rendered images shown in the paper Figures 8, 9, 10, 11 are all obtained via rendering. Is it possible to make Maya/Blender render with only velocity fields, or is there some explicit way of getting density out of velocity fields for rendering?
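
In grid-based smoke solvers, density is commonly treated as a passive scalar that is carried along by the velocity field, so a density sequence can be obtained by advecting an initial density through the velocity fields frame by frame. The following is a minimal 2D semi-Lagrangian sketch in plain Python. It is not mantaflow's implementation, and it assumes cell-centred velocities measured in cells per unit time (not a staggered MAC grid); all names are illustrative.

```python
def bilinear(field, x, y):
    """Sample a 2D grid (list of rows) at fractional position (x, y), clamped to the grid."""
    h, w = len(field), len(field[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = field[y0][x0] * (1 - fx) + field[y0][x1] * fx
    bottom = field[y1][x0] * (1 - fx) + field[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def advect_density(density, vel_x, vel_y, dt):
    """One semi-Lagrangian step: trace each cell backwards along the velocity and sample there."""
    h, w = len(density), len(density[0])
    return [[bilinear(density, x - dt * vel_x[y][x], y - dt * vel_y[y][x])
             for x in range(w)] for y in range(h)]

# Toy example: a single blob of density in a uniform flow towards +x.
h, w = 4, 6
density = [[0.0] * w for _ in range(h)]
density[1][1] = 1.0
vel_x = [[1.0] * w for _ in range(h)]   # one cell per unit time in +x
vel_y = [[0.0] * w for _ in range(h)]

density = advect_density(density, vel_x, vel_y, dt=1.0)
print(density[1])   # the blob has moved from column 1 to column 2
```

In a real pipeline the velocity arrays would be the network outputs for successive frames, and a source term would add new density at the inflow each step.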

The default data_format of jacobian is wrong

    def jacobian(x, data_format='NHCW'):
        if data_format == 'NCHW':
            x = nchw_to_nhwc(x)

The default data_format should be 'NHWC'.
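
Judging only from the three lines quoted above, the misspelled default does not raise an error: 'NHCW' simply fails the comparison with 'NCHW', so no transpose happens, which is the same behaviour 'NHWC' would give. The sketch below reproduces that check in isolation and shows one way to make such typos fail loudly. The function names and the VALID_FORMATS tuple are hypothetical and not part of the repository.

```python
def needs_transpose(data_format='NHCW'):
    """Mirrors the quoted check: only the exact string 'NCHW' triggers a transpose."""
    return data_format == 'NCHW'

VALID_FORMATS = ('NHWC', 'NCHW')   # assumption: the only two layouts intended

def needs_transpose_checked(data_format='NHWC'):
    """Same check, but unknown layout strings are rejected instead of ignored."""
    if data_format not in VALID_FORMATS:
        raise ValueError(f"unknown data_format {data_format!r}, expected one of {VALID_FORMATS}")
    return data_format == 'NCHW'

print(needs_transpose())            # False: the typo silently behaves like 'NHWC'
print(needs_transpose('NCHW'))      # True
try:
    needs_transpose_checked('NHCW')
except ValueError as err:
    print("rejected:", err)
```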

problem in manta

Thanks for your nice code. I have a problem running manta on Linux. I followed the instructions and installed manta. When I run the test case ./manta ../scenes/simpleplume.py, I get this message:

Version: mantaflow 0.12 64bit fp1 omp commit 15eaf4aa72da62e174df6c01f85ccd66fde20acc from Sep 25 2019, 15:17:48
QXcbConnection: Could not connect to display 
Aborted (core dumped)

I changed the CMake command to: cmake .. -DGUI=OFF -DOPENMP=ON -DNUMPY=ON
but nothing changed. I connect remotely to a Red Hat 4.4.7-23 machine. Do you have any idea?

Why is Algorithm 1 faster than integrating ODE?

Hi @byungsook. Can you please provide some intuition for why Algorithm 1, "Simulation with the Latent Space Integration Network", would be faster than integrating the ODEs that obey the inviscid momentum equation Du/Dt = −∇p + g and the mass conservation equation ∇ · u = 0? To me, it seems that you have all the formulas and can integrate them directly, while a neural network introduces multiple layers, which would take far more time.

How do you visualize the network output: velocity fields and density.

Hi. How do you obtain the pretty video visualizations shown in the YouTube video accompanying the paper? I tried plotting the velocity field images via matplotlib, but I believe there is a better way to do this. Do you maybe export the mantaflow format to Blender somehow?

I am not able to see any output files generated in the log folder

Hello,

Could you please tell me where to find the output after running the test command? Which folder under log is it written to? Training the model takes too much time; could we get a pre-trained model to see the results?

Interpreting output of the smoke3_vel_buo simulation

Hi, for the 3D buoyant smoke plume simulation, I ran manta with the command:

..\manta\build\Release\manta.exe .\scene\smoke3_vel_buo.py

which generated vector field arrays of size (32, 64, 112, 3). Then I trained the model (only for 50 epochs so far) with the command

python main.py \
  --is_3d=True \
  --dataset=smoke3_vel5_buo3_f250 \
  --res_x=112 \
  --res_y=64 \
  --res_z=32 \
  --batch_size=5 \
  --max_epoch=50 \
  --num_worker=1 \
  --log_step=100 \
  --test_step=20 \
  --load_path=log/smoke3_vel5_buo3_f250/0324_110822_de_tag/

and I generate predictions with this command:

python main.py \
--is_train=False \
--load_path=log/smoke3_vel5_buo3_f250/0324_110822_de_tag/ \
--test_batch_size=5 \
--is_3d=True \
--dataset=smoke3_vel5_buo3_f250 \
--res_x=112 \
--res_y=64 \
--res_z=32 \
--batch_size=4 \
--num_worker=1

But the output arrays, which I assume are the model's prediction, are all of size (32, 64, 3, 2). Are these outputs 3D vector fields? Do they correspond to some subset of the volume over which the manta simulation was run?

copyGridToArrayMAC(vel, v_) Dimensions do not match Error

After successfully installing and running mantaflow, I try to run the data generation script by entering the command:

$ ..\manta\build\Release\manta.exe .\scene\smoke_pos_size.py

However, I receive the following error:
Error in copyGridToArrayMAC
sim: 0%| | 0/200 [00:00<?, ?it/s]
scenes: 0%| | 0/105 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "scene/smoke_pos_size.py", line 259, in <module>
    main()
  File "scene/smoke_pos_size.py", line 199, in main
    copyGridToArrayMAC(vel, v_)
RuntimeError: The dimensions of source grid (96, 128, 1) and target numpy array (3, 96, 128) do not match!
Error raised in /workdir/manta/source/plugin/numpyconvert.cpp:122
Script finished.

While installing, I deviated from the original guidelines in readme.md at two points:

1. I installed TensorFlow version 2.11 instead of 1.15, since the latter is incompatible with my Python version 3.9.
2. I built manta from the 'master' branch instead of '15eaf4', since the latter was giving the error 'ImportError: numpy.core.multiarray failed to import' even with -DNUMPY=ON.

I would be grateful if someone could advise on how to resolve this issue.

Computing the evaluation time

Hi @byungsook,
Sorry, I am struggling a bit to reproduce the exact evaluation time you reported in the paper (0.958 ms for a batch of size 5 for the "smoke inflow" dataset). I get a time about two orders of magnitude higher, ~200 ms.
Could you please share how you measured the time? Did you call time.time() inside the test_() function in trainer.py? Did you compute it for a single batch, or did you run multiple batches and take the average? Thank you a lot!
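
How the paper's figure was measured is not stated here, so the following is only a generic measurement sketch: discard a few warm-up calls (the first runs of a model often include one-off setup cost), then average over many timed calls. The function fake_inference is a hypothetical stand-in for one forward pass, and the warm-up and repeat counts are arbitrary.

```python
import time

def average_ms(fn, warmup=5, repeats=50):
    """Average wall-clock time of fn() in milliseconds, excluding warm-up calls."""
    for _ in range(warmup):
        fn()                          # discarded: may include one-off setup cost
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats * 1000.0

calls = {"count": 0}

def fake_inference():
    """Hypothetical stand-in for one forward pass (e.g. a single sess.run call)."""
    calls["count"] += 1
    return sum(i * i for i in range(1000))

avg_ms = average_ms(fake_inference)
print(f"{avg_ms:.3f} ms per call, averaged over 50 timed runs")
```

time.perf_counter is used instead of time.time because it is a monotonic, high-resolution clock intended for measuring short durations. Whether file I/O and output saving are inside or outside the timed region can also change the result considerably.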