deep-fluids's People

Contributors

byungsook

deep-fluids's Issues

Why `curl` in `build_test_model` rather than `jacobian3`?

Hi @byungsook. In the build_test_model function, curl is used:

            self.G_ = curl(self.G_s)

while in build_model method, jacobian3 is called:

            _, self.G_ = jacobian3(self.G_s)

Why the difference? I want to advect the model predictions to obtain files in .vdb format, so I plan to call jacobian3 just as in training and then use the advect method from the smoke3_vel_buo.py script.
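For background on what a curl of a predicted potential provides in general, here is a minimal NumPy sketch. It is not the repository's curl or jacobian3: the function names curl3 and divergence3, the (z, y, x) axis order, and the (x, y, z) component order are all assumptions made for illustration. It only demonstrates that a velocity obtained as the curl of any vector potential is divergence-free under a consistent finite-difference scheme.

```python
import numpy as np

def curl3(a, spacing=1.0):
    """Finite-difference curl of a vector field `a` with shape (D, H, W, 3).

    Assumed layout (hypothetical, not taken from the repository):
    array axes are (z, y, x) and the last axis holds (x, y, z) components.
    """
    ax, ay, az = a[..., 0], a[..., 1], a[..., 2]
    # np.gradient returns one derivative per array axis: (d/dz, d/dy, d/dx).
    dax_dz, dax_dy, _ = np.gradient(ax, spacing)
    day_dz, _, day_dx = np.gradient(ay, spacing)
    _, daz_dy, daz_dx = np.gradient(az, spacing)
    u = daz_dy - day_dz  # x component
    v = dax_dz - daz_dx  # y component
    w = day_dx - dax_dy  # z component
    return np.stack([u, v, w], axis=-1)

def divergence3(vel, spacing=1.0):
    """Finite-difference divergence using the same axis convention."""
    du_dx = np.gradient(vel[..., 0], spacing, axis=2)
    dv_dy = np.gradient(vel[..., 1], spacing, axis=1)
    dw_dz = np.gradient(vel[..., 2], spacing, axis=0)
    return du_dx + dv_dy + dw_dz

rng = np.random.default_rng(0)
potential = rng.standard_normal((8, 12, 10, 3))  # arbitrary vector potential
velocity = curl3(potential)
max_div = np.abs(divergence3(velocity)).max()    # zero up to round-off
```

The divergence vanishes here because the difference operators along different axes commute, so the mixed derivatives cancel term by term.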

mantaflow OpenGL functions not declared in scope

Hi, I'm currently trying to set up mantaflow and I keep running into OpenGL functions not being found when I run make with -DGUI=ON. I've made sure to install all the required packages, so this issue has me stumped.

Input pipeline: batches created without 'replacing'

Hello,
As far as I can tell, when reading input data from file and enqueuing it in the FIFO queue (from which batches are dequeued), the code samples from a uniform distribution with replacement (data.py, line 128). Is there any reason why sampling without replacement was not used? I think sampling without replacement is the common practice: it guarantees that every sample is seen exactly once per epoch, which should lead to better training in the short run.

Thank you!

edited: replaced 'with' with 'without', sorry for the confusion
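To make the distinction concrete, here is a minimal sketch of the two sampling schemes using plain NumPy index arrays. It does not reproduce the queue logic in data.py; the function names and the sizes are hypothetical.

```python
import numpy as np

def batches_with_replacement(num_samples, batch_size, num_batches, rng):
    """Each batch is drawn independently; a sample may repeat or be skipped."""
    for _ in range(num_batches):
        yield rng.integers(0, num_samples, size=batch_size)

def batches_without_replacement(num_samples, batch_size, rng):
    """Shuffle once per epoch, then slice; every sample appears exactly once."""
    order = rng.permutation(num_samples)
    for start in range(0, num_samples, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
num_samples, batch_size = 20, 4  # hypothetical sizes

epoch = np.concatenate(
    list(batches_without_replacement(num_samples, batch_size, rng)))
seen_without = np.unique(epoch).size  # always equals num_samples

drawn = np.concatenate(list(batches_with_replacement(
    num_samples, batch_size, num_samples // batch_size, rng)))
seen_with = np.unique(drawn).size     # at most num_samples, usually fewer
```

Over many epochs both schemes visit all the data in expectation; the difference is mostly in per-epoch coverage and variance.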

training error

Dear Kim,
I ran the dataset generation and it worked, but when I run python main.py for training, I get an error message:
KeyError: "Registering two gradient with name 'BlockLSTM'! (Previous registration was in register /home/symphony/.conda/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/framework/registry.py:66)

I am working remotely on a Red Hat Linux server with TensorFlow 1.14.0. Could you please help me? Many thanks.

Different normalization for training and test data

@byungsook could you please explain why the training and test data are normalized differently?
In data.py, line 332, the "y" variable is normalized by:

for i, ri in enumerate(y_range): 
    y[i] = (y[i]-ri[0]) / (ri[1]-ri[0]) * 2 - 1

whereas in trainer.py in line 327 we have:

c1 = p1/float(y1-1)*2-1
c2 = p2/float(y2-1)*2-1

In plain language, in the first case we have: (p_current - p_min)/(p_max - p_min)*2 - 1
and in the second: p_current/(p_number_of_values - 1)*2 - 1
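A quick numerical check, under the assumption that the test-time p is an integer index into evenly spaced parameter values (this is not confirmed anywhere in this thread), shows the two formulas giving the same result. The helper names and the example range below are hypothetical.

```python
import numpy as np

def normalize_by_range(value, lo, hi):
    """Training-style formula: (p - p_min) / (p_max - p_min) * 2 - 1."""
    return (value - lo) / (hi - lo) * 2 - 1

def normalize_by_index(index, count):
    """Test-style formula: p / (n - 1) * 2 - 1, with p an index in [0, n-1]."""
    return index / float(count - 1) * 2 - 1

count = 5
lo, hi = 0.2, 0.6                      # hypothetical parameter range
values = np.linspace(lo, hi, count)    # evenly spaced physical values
indices = np.arange(count)             # the matching integer indices

by_range = normalize_by_range(values, lo, hi)
by_index = normalize_by_index(indices, count)
same = np.allclose(by_range, by_index)
```

If the parameter values are not evenly spaced, or p is a physical value rather than an index, the two formulas no longer agree.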

Visualizing the output

I retrained the network and can produce new output. However, I don't know how to display the results the way they are pictured in your paper. Is there an application I should use, such as Blender?

Many thanks for your great work.

Tensor shape or OOM error for smoke3_mov200_f400 training

TL;DR: Could you please provide the commands for training a moving 3D smoke source?

Setup Info:
TensorFlow version: tensorflow-gpu, 1.12.0, Channel: pypi

Issue:
In accordance with run.bat, I generated the smoke3_mov200_f400 dataset using the following command:

python main.py --arch=ae --z_num=16 --max_epoch=10 --filter=64 --is_3d=True --lr_max=0.00005 --dataset=smoke3_mov200_f400 --res_x=48 --res_y=72 --res_z=48 --batch_size=4 --num_worker=1

Then trained the AE model using,

python main.py --arch=ae --z_num=16 --max_epoch=10 --filter=64 --is_3d=True --lr_max=0.00005 --dataset=smoke3_mov200_f400 --res_x=48 --res_y=72 --res_z=48 --batch_size=4 --num_worker=1

However, when I try generating latent code set with (trained model stored in log/smoke3_mov200_f400/1213_021319_ae_tag/ )

python main.py --is_train=False --load_path=log/smoke3_mov200_f400/1213_021319_ae_tag/  --arch=ae --z_num=16 --max_epoch=20 --is_3d=True --dataset=smoke_mov200_f400 --res_x=48 --res_y=72 --res_z=48 --test_batch_size=4

I run into the following error,

Traceback (most recent call last):
  File "main.py", line 35, in <module>
    main(config)
  File "main.py", line 21, in main
    trainer = Trainer3(config, batch_manager)
  File "/code/deep-fluids/trainer.py", line 123, in __init__
    self.sess = sv.prepare_or_wait_for_session(config=sess_config)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 730, in prepare_or_wait_for_session
    init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 288, in prepare_session
    config=config)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 218, in _restore_checkpoint
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1582, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

I modified the latent code command with --filter=64, but then I ran out of memory,

Traceback (most recent call last):
  File "main.py", line 35, in <module>
    main(config)
  File "main.py", line 31, in main
    trainer.test()
  File "/code/deep-fluids/trainer.py", line 308, in test
    self.test_ae()
  File "/code/deep-fluids/trainer.py", line 502, in test_ae
    c = self.sess.run(self.z, {self.x: x})
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[100,64,48,72,48] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node AE_1/enc/0_conv/Conv3D (defined at /root/miniconda3/envs/sandbox/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py:1057)  = Conv3D[T=DT_FLOAT, data_format="NDHWC", dilations=[1, 1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_Placeholder_3_0_0/_1283, AE/enc/0_conv/weights/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Even after reducing the batch size via --test_batch_size=1, I still get OOM.

Since the paper suggests that all the training was done on an NVIDIA Titan X (12 GB) GPU and I have the Titan X Pascal (12 GB), I'm assuming that memory shouldn't be the problem.
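For scale, the size of the single tensor named in the OOM message can be worked out directly from the reported shape, assuming 4 bytes per float32 element:

```python
# Shape copied from the OOM message: [100, 64, 48, 72, 48], type float.
shape = (100, 64, 48, 72, 48)
bytes_per_float = 4  # float32

num_elements = 1
for dim in shape:
    num_elements *= dim

size_bytes = num_elements * bytes_per_float
size_gib = size_bytes / 2**30  # about 3.96 GiB for this one activation
```

That is roughly a third of a 12 GB card for one layer's output, before any other activations or weights. The leading dimension in the log is 100 rather than 1 or 4, which may mean that this code path feeds more samples at once than --test_batch_size; that is an inference from the log, not something confirmed in this thread.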

Could you please provide the commands for training a moving 3D smoke source? run.bat only provided it for smoke3_mov, 2D. Or alternatively please help me identify what I'm doing wrong?

Thanks

Any plans to release pre-trained weights?

I was wondering if there are any plans to release pre-trained weights or checkpoints for a partially trained network?

I'm currently training the model with data generated by ./scene/smoke_pos_size.py, but it'd be nice if access to pre-trained weights was provided.

running error

Sorry to ask a question that seems to have nothing to do with this project. It seems there is something wrong with my mantaflow, but I don't know how to fix it. When I run smoke_pos_size.py, I get an error:
NameError: name 'copyGridToArrayMAC' is not defined. It is not only copyGridToArrayMAC; every function related to NumPy conversion fails. Have you ever encountered this problem?

PS: I have reinstalled mantaflow, and the same error still occurs.

Using own data?

I have some fire spread simulation data that's not generated by mantaflow and is in a NumPy array. I looked at the data generated by the scripts in this repo, but I don't know what the format is.

If I run manta scene/smoke_pos_size.py, it generates a bunch of .npz files with fields x and y. x has shape (128, 96, 2) and y has shape (3,). What do those shapes represent? I want to massage my data to fit so I can use it.
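As a starting point, here is a minimal sketch that writes and reads back an .npz file with the same field names and shapes as described above. The interpretation in the comments (x as a 2D velocity field with two components per cell, y as three scalar parameters), the file name, and the parameter values are assumptions, not something confirmed in this thread.

```python
import os
import tempfile

import numpy as np

# Shapes taken from the post; the meaning of each axis is assumed.
velocity = np.zeros((128, 96, 2), dtype=np.float32)   # assumed (H, W, components)
params = np.array([0.5, 0.1, 0.0], dtype=np.float32)  # assumed scene parameters

out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "0_0.npz")  # hypothetical file name

# Store the arrays under the same keys the generated files use.
np.savez_compressed(path, x=velocity, y=params)

with np.load(path) as data:
    loaded_x_shape = data["x"].shape
    loaded_y_shape = data["y"].shape
```

Inspecting one of the generated files the same way (np.load, then printing each key's shape and value range) is a quick way to check these assumptions against real data.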

Velocity field -> density

Hi @byungsook. Can you please clarify one point for me: is it possible to move from velocity field to density? While reading the Deep Fluids paper, I noticed that the main focus is the velocity field. However, I can't grasp how to move from a velocity field to density. For example, the "Algorithm 1, Simulation with the Latent Space Integration Network" is about velocity field generation. However, the rendered images shown in the paper Figures 8, 9, 10, 11 are all obtained via rendering. Is it possible to make Maya/Blender render with only velocity fields, or is there some explicit way of getting density out of velocity fields for rendering?
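One generic way to get density from a velocity field is to advect a density field passively through it, one time step at a time. Below is a minimal 2D semi-Lagrangian sketch in NumPy. It is not the repository's or mantaflow's implementation; the function name, the grid layout, and the unit conventions are assumptions for illustration.

```python
import numpy as np

def advect_density(density, velocity, dt):
    """One semi-Lagrangian step: trace each cell backwards along the
    velocity and bilinearly sample the old density at that position.

    density: (H, W); velocity: (H, W, 2) holding (vx, vy) in cells per
    unit time (assumed convention).
    """
    h, w = density.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Backtrace and clamp the source positions to the grid.
    src_x = np.clip(xs - dt * velocity[..., 0], 0, w - 1)
    src_y = np.clip(ys - dt * velocity[..., 1], 0, h - 1)
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = src_x - x0
    fy = src_y - y0
    # Bilinear interpolation between the four surrounding cells.
    top = density[y0, x0] * (1 - fx) + density[y0, x1] * fx
    bottom = density[y1, x0] * (1 - fx) + density[y1, x1] * fx
    return top * (1 - fy) + bottom * fy

density = np.zeros((8, 8))
density[2, 3] = 1.0                  # a single blob of smoke
velocity = np.zeros((8, 8, 2))
velocity[..., 1] = 1.0               # uniform flow: one row per time step
moved = advect_density(density, velocity, dt=1.0)
```

Repeating this step with the velocity field of each frame, plus a density source at the inflow, yields a density sequence that a renderer can consume.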

problem in manta

Thanks for your nice code. I have a problem running manta on Linux. I followed the instructions and installed manta. When I run the test case, ./manta ../scenes/simpleplume.py, I get this message:

Version: mantaflow 0.12 64bit fp1 omp commit 15eaf4aa72da62e174df6c01f85ccd66fde20acc from Sep 25 2019, 15:17:48
QXcbConnection: Could not connect to display 
Aborted (core dumped)

I changed the CMake command to: cmake .. -DGUI=OFF -DOPENMP=ON -DNUMPY=ON
but nothing changed. I am working remotely on Red Hat 4.4.7-23. Do you have any idea?

Why is Algorithm 1 faster than integrating ODE?

Hi @byungsook. Can you please provide some intuition for why Algorithm 1, "Simulation with the Latent Space Integration Network", would be faster than integrating the ODEs that obey the inviscid momentum equation Du/Dt = −∇p + g and the mass conservation equation ∇ · u = 0? To me, it seems you have all the formulas and can integrate them directly, while a neural network introduces multiple layers, which should take far more time.

Interpreting output of the smoke3_vel_buo simulation

Hi, for the 3D buoyant smoke plume simulation, I ran manta with the command:

..\manta\build\Release\manta.exe .\scene\smoke3_vel_buo.py

which generated vector field arrays of size (32, 64, 112, 3). Then I've trained the model (only for 50 epochs so far) with the command

python main.py \
  --is_3d=True \
  --dataset=smoke3_vel5_buo3_f250 \
  --res_x=112 \
  --res_y=64 \
  --res_z=32 \
  --batch_size=5 \
  --max_epoch=50 \
  --num_worker=1 \
  --log_step=100 \
  --test_step=20 \
  --load_path=log/smoke3_vel5_buo3_f250/0324_110822_de_tag/ 

and I generate predictions with this command:

python main.py \
--is_train=False \
--load_path=log/smoke3_vel5_buo3_f250/0324_110822_de_tag/ \
--test_batch_size=5 \
--is_3d=True \
--dataset=smoke3_vel5_buo3_f250 \
--res_x=112 \
--res_y=64 \
--res_z=32 \
--batch_size=4 \
--num_worker=1

But the output arrays, which I assume are the model's prediction, are all of size (32, 64, 3, 2). Are these outputs 3D vector fields? Do they correspond to some subset of the volume over which the manta simulation was run?

copyGridToArrayMAC(vel, v_) Dimensions do not match Error

After successfully installing and running mantaflow, I try to run the data generation script by entering the command:

$ ..\manta\build\Release\manta.exe .\scene\smoke_pos_size.py

However, I receive the following error:
Error in copyGridToArrayMAC
sim: 0%| | 0/200 [00:00<?, ?it/s]
scenes: 0%| | 0/105 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "scene/smoke_pos_size.py", line 259, in <module>
    main()
  File "scene/smoke_pos_size.py", line 199, in main
    copyGridToArrayMAC(vel, v_)
RuntimeError: The dimensions of source grid (96, 128, 1) and target numpy array (3, 96, 128) do not match!
Error raised in /workdir/manta/source/plugin/numpyconvert.cpp:122
Script finished.

While installing, I deviated from original guidelines in readme.md at two points:

  1. Installed TensorFlow version 2.11 instead of 1.15, since the latter is incompatible with my Python version, 3.9
  2. Installed manta from the 'master' branch instead of '15eaf4', since the latter was giving the error 'ImportError: numpy.core.multiarray failed to import' even with -DNUMPY=ON

I would be grateful if someone could advise on how to resolve this issue.

Computing the evaluation time

Hi @byungsook ,
Sorry, I am struggling to reproduce the evaluation time you reported in the paper (0.958 ms for a batch of size 5 on the "smoke inflow" dataset). I get a time about two orders of magnitude higher, roughly 200 ms.
Could you please share how you measured the time? Did you call time.time() inside the test_() function in trainer.py? Did you time a single batch, or did you run multiple batches and take the average? Thanks a lot!
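How the paper's figure was measured is not stated in this thread. As a general-purpose sketch, a timing harness usually discards a few warm-up calls (the first calls in many frameworks include one-time setup cost) and then averages over repeated runs. The helper name time_call and the NumPy stand-in workload below are hypothetical; in practice the callable would wrap the session call being measured.

```python
import time

import numpy as np

def time_call(fn, warmup=3, repeats=20):
    """Return (mean, min) wall-clock seconds of fn() after warm-up calls."""
    for _ in range(warmup):
        fn()  # discard: may include one-time initialization
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return float(np.mean(samples)), float(np.min(samples))

# Stand-in workload; replace with the evaluation call being measured.
batch = np.ones((5, 64, 64), dtype=np.float32)
mean_s, min_s = time_call(lambda: np.tanh(batch).sum())
```

When timing GPU work, the measured call needs to block until the result is available (for example by fetching it to the host); otherwise only the launch overhead is timed.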
