senguptaumd / Background-Matting
Background Matting: The World is Your Green Screen
Home Page: https://grail.cs.washington.edu/projects/background-matting/
Hi,
Can this support video communication at 1280x720 resolution and frame rates above 20 fps? What work would need to be done to support that?
Hi, I saw this interesting work in GitHub trending. Wonderful work.
From the sample data, it seems the background matting is image-to-image.
Can it run from video to video (input a video and output the target video)?
CUDA Device: 0
Using image mode
Traceback (most recent call last):
File "test_background-matting_image.py", line 179, in
comp_im_tr1=composite4(fg_out0,back_img10,alpha_out0)
File "/content/drive/My Drive/Background-Matting/functions.py", line 8, in composite4
im = alpha * fg + (1 - alpha) * bg
ValueError: operands could not be broadcast together with shapes (1080,1920,1) (1080,1980,3)
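The two shapes differ in width (1920 vs. 1980): the target background image is 60 px wider than the input frame, so the broadcast in composite4 fails. A minimal sketch of a guard (assuming OpenCV-style HxWxC float arrays; this is not the repo's exact code) that resizes the background to match before compositing:
import cv2
def composite_safe(fg, bg, alpha):
    # alpha: (H, W, 1) in [0, 1]; fg: (H, W, 3); bg may have a different size.
    if bg.shape[:2] != fg.shape[:2]:
        bg = cv2.resize(bg, (fg.shape[1], fg.shape[0]))  # cv2.resize takes (width, height)
    return alpha * fg + (1 - alpha) * bg
Alternatively, re-export the target background at exactly the input resolution.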
background/0002.png is missing.
0001.png is fine.
In a video, the RCNN may find no person in a frame.
That causes the `where` array here to error.
The bbox could be returned as None instead, as in the attached proposed_fix.zip.
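A minimal sketch of such a guard (assuming the function receives a binary person mask and returns box coordinates, as in functions.py; this is not the repo's exact code):
import numpy as np
def get_bbox_safe(mask, R, C):
    where = np.array(np.where(mask > 0))
    if where.size == 0:
        return None  # no person detected in this frame
    x1, y1 = np.amin(where, axis=1)
    x2, y2 = np.amax(where, axis=1)
    return [x1, x2, y1, y2]
The caller can then skip frames where the bbox is None instead of crashing.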
Hi, thanks for your work.
I found that in compose_image_withshift() in functions.py, image_sh is wrapped in torch.autograd.Variable(), which would detach it from the previous computation graph, preventing gradient back-propagation from loss_ganG to the generator. So I was wondering whether loss_ganG was actually being used.
Could I change it to:
def compose_image_withshift(alpha_pred, fg_pred, bg, seg):
    image_sh = torch.zeros(fg_pred.shape).cuda()
    for t in range(0, fg_pred.shape[0]):
        al_tmp = to_image(seg[t, ...]).squeeze(2)
        where = np.array(np.where((al_tmp > 0.1).astype(np.float32)))
        x1, y1 = np.amin(where, axis=1)
        x2, y2 = np.amax(where, axis=1)
        # select shift; n positive indicates shift to the right
        n = np.random.randint(-(y1 - 10), al_tmp.shape[1] - y2 - 10)
        alpha_pred_sh = torch.cat((alpha_pred[t, :, :, -n:], alpha_pred[t, :, :, :-n]), dim=2)
        fg_pred_sh = torch.cat((fg_pred[t, :, :, -n:], fg_pred[t, :, :, :-n]), dim=2)
        alpha_pred_sh = (alpha_pred_sh + 1) / 2
        image_sh[t, ...] = fg_pred_sh * alpha_pred_sh + (1 - alpha_pred_sh) * bg[t, ...]
    # return torch.autograd.Variable(image_sh.cuda())
    return image_sh
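One quick way to settle whether the wrapper actually detaches the composite is to inspect the autograd graph at runtime (a hedged sanity check, not part of the repo):
# If image_sh is still connected to the generator's graph,
# requires_grad is True and grad_fn is not None.
image_sh = compose_image_withshift(alpha_pred, fg_pred, bg, seg)
print(image_sh.requires_grad, image_sh.grad_fn)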
So GReal also outputs F and α, but we set α to all 1's... wouldn't that mean our composite is just the foreground (with a black background)? Is that what you mean by "would result in simply copying the entire input image into the composite passed to D"? Because I don't see why it is "indeed real".
Thank you once again! I really appreciate your replies :)
First of all, great work!!
I want to reproduce the result on the sample videos following the procedure in the repo, but I cannot get the same effect as yours, and I'm confused. The problem seems to be that the _back.png obtained by running test_pre_process_video.py looks incorrect, like the following pictures in sample_video/input.
This is one frame extracted from teaser.mov.
This is the corresponding background generated by running test_pre_process_video.py.
Obviously something is wrong. As the README.md says, if there are significant exposure changes between the captured image and the captured background, use bias-gain adjustment to account for that. Should I turn on the bias-gain adjustment part of test_pre_process_video.py?
Is that correct? Thanks very much!!
There is no problem running the alignment part of the demo. What is the reason for the following error when running it on my own picture?
python test_pre_process.py -i sample_data/my_photo/
Traceback (most recent call last):
File "test_pre_process.py", line 108, in
back_align = alignImages(back, image,mask)
File "test_pre_process.py", line 42, in alignImages
im1Reg = cv2.warpPerspective(im1, h, (width, height))
cv2.error: OpenCV(3.4.5) /io/opencv/modules/imgproc/src/imgwarp.cpp:2927: error: (-215:Assertion failed) (M0.type() == CV_32F || M0.type() == CV_64F) && M0.rows == 3 && M0.cols == 3 in function 'warpPerspective'
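cv2.findHomography can return None (e.g., when too few good feature matches are found between the image and the background), and passing None as M trips exactly this assertion inside warpPerspective. A hedged guard around the call in alignImages (assuming h comes from cv2.findHomography, and that points1/points2/im1 are named as in test_pre_process.py):
h, status = cv2.findHomography(points1, points2, cv2.RANSAC)
if h is None:
    # Too few usable matches; warping would crash with the assertion above.
    raise RuntimeError('Homography estimation failed; check that the image '
                       'and background overlap enough and have usable texture.')
im1Reg = cv2.warpPerspective(im1, h, (width, height))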
When I run test_background-matting_image.py to test images, GPU memory usage increases considerably at the beginning and then drops. What is the cause of this?
Is there an error, or a specific setting, in line 99 of networks.py?
oth_feat=torch.cat([self.comb_back(torch.cat([img_feat,back_feat],dim=1)),self.comb_seg(torch.cat([img_feat,seg_feat],dim=1)),self.comb_multi(torch.cat([img_feat,back_feat],dim=1))],dim=1)
Maybe it should be self.comb_multi(torch.cat([img_feat,multi_feat],dim=1))?
Hi, a question:
What's the difference between the output foreground and, say, the original pixels * (alpha > 0.95)? Is the network somehow changing the color information in the output?
Thanks!
Running the Colab notebook works fine up to segmentation, using the sample images. I did 'fix' the paths etc. up to this point. Then I get this error:
CUDA Device: 0
Using image mode
Traceback (most recent call last):
File "Background-Matting/test_background-matting_image.py", line 121, in <module>
bbox=get_bbox(rcnn,R=bgr_img0.shape[0],C=bgr_img0.shape[1])
File "/content/Background-Matting/functions.py", line 38, in get_bbox
x1, y1 = np.amin(where, axis=1)
File "<__array_function__ internals>", line 6, in amin
File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 2746, in amin
keepdims=keepdims, initial=initial, where=where)
File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity
Will diagnose more and share updates
CUDA Device: 0
Using video mode
Traceback (most recent call last):
File "test_background-matting_image.py", line 54, in
model_name1=fo[0]
IndexError: list index out of range
This error occurred when executing this command:
python test_background-matting_image.py -m real-hand-held -i sample_video/input/ -o sample_video/output/ -tb sample_video/background/
Traceback (most recent call last):
File "test_background-matting_image.py", line 20, in
print('CUDA Device: ' + os.environ["CUDA_VISIBLE_DEVICES"])
File "/home/dwijayanto/anaconda3/envs/back-matting/lib/python3.6/os.py", line 669, in getitem
raise KeyError(key) from None
KeyError: 'CUDA_VISIBLE_DEVICES'
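The script reads CUDA_VISIBLE_DEVICES unconditionally, so it crashes when the variable is unset. Either export it before running (CUDA_VISIBLE_DEVICES=0 python test_background-matting_image.py ...), or guard the lookup; a minimal sketch:
import os
# Default to GPU 0 when CUDA_VISIBLE_DEVICES is not set in the environment.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")
print('CUDA Device: ' + os.environ["CUDA_VISIBLE_DEVICES"])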
In the code:
Background-Matting/train_real_fixed.py
Line 136 in 808b6c7
versus in the article:
al_loss=2 * l1_loss(alpha_pred_sup,alpha_pred,mask0)+4 * g_loss(alpha_pred_sup,alpha_pred,mask0)
Which is better?
How do I run this on Windows? It seems to be a path problem.
Background-Matting-master\Models\real-fixed-cam>export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
'export' is not recognized as an internal or external command,
operable program or batch file.
Can it work in Google Colab?
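The export line is POSIX shell syntax (with backquoted pwd), which cmd.exe does not understand; hence the 'export' is not recognized error. A rough Windows cmd equivalent (an untested sketch) is:
set PYTHONPATH=%PYTHONPATH%;%cd%;%cd%\slim
In PowerShell that would be roughly $env:PYTHONPATH = "$env:PYTHONPATH;$pwd;$pwd\slim". Running inside WSL or Google Colab avoids the issue entirely.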
Hi,
I was trying test_background-matting_image.py, testing on Windows.
I did the installations from requirements.txt, but got an error:
No module named 'scipy'
Please add scipy to requirements.txt.
I got it working after that :)
Very cool technology! Is there any hope of it being implemented on mobile (iOS/Android)?
Or are there any ideas or reference programs that could be used as a starting point?
Thank you very much!
English is not my native language; please excuse typing errors.
Where is the code to get the "perturbed version" of B? Or are the background images in the sample data already the perturbed versions?
Hi, guys!
I'm very curious about your datasets.
Can you release your datasets as soon as possible?
Thanks!!!
!CUDA_VISIBLE_DEVICES=0 python test_background-matting_image.py -m real-fixed-cam -i sample_video_fixed/input/ -o sample_video_fixed/output/ -tb sample_video_fixed/background/ -b sample_video_fixed/teaser_back.png
CUDA Device: 0
Using video mode
Traceback (most recent call last):
File "test_background-matting_image.py", line 121, in
bbox=get_bbox(rcnn,R=bgr_img0.shape[0],C=bgr_img0.shape[1])
File "/content/drive/My Drive/background_matting/Background-Matting/functions.py", line 38, in get_bbox
x1, y1 = np.amin(where, axis=1)
File "<array_function internals>", line 6, in amin
File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 2746, in amin
keepdims=keepdims, initial=initial, where=where)
File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity
Hi,
are you sure about the version of TensorFlow for use with CUDA 10.0? TensorFlow 1.4 requires CUDA 8.
I think you meant version 1.14.
I'm trying with 1.15 and it works well.
I read the Adobe paper about their dataset and also dropped them an email. To my understanding, they have 455 ground-truth foregrounds and alpha mattes that they composite onto 100 backgrounds each. I'm still awaiting their reply as to whether their dataset includes those 100 stand-alone backgrounds. Meanwhile, I would like to hear from you as to why you decided to make your own set of synthetic composites with backgrounds drawn from MS COCO. Thank you.
Hello guys,
I am a student at Carnegie Mellon University, currently working on a video performance piece for one of my classes, and I am completely blown away by this algorithm. It took me some time to make CUDA work on my PC, but after that it worked perfectly in image mode. Here is a short clip with the results (edited in Premiere):
I would only like to know whether it is possible to change the 512x512 crop resolution to a higher value, so it can track a bigger motion area in the original image/video.
Thanks!
rcnn=np.delete(rcnn, range(reso[0],reso[0]+K), 0)
Why do you have this line of code for the segmentation mask?
Is it possible to run Background-Matting without a CUDA-enabled GPU?
Hi, I tried to run my own footage, but this came up when I ran:
python test_background-matting_image.py -m real-hand-held -i sample_video/input/ -o sample_video/output/ -tb sample_video/background/
Traceback (most recent call last):
File "test_background-matting_image.py", line 85, in <module>
bg_im0=cv2.imread(os.path.join(data_path, filename.replace('_img','_back'))); bg_im0=cv2.cvtColor(bg_im0,cv2.COLOR_BGR2RGB);
cv2.error: OpenCV(3.4.5) C:\projects\opencv-python\opencv\modules\imgproc\src\color.cpp:181: error: (-215:Assertion failed) !_src.empty() in function 'cv::cvtColor'
Did I fail to set something up?
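That !_src.empty() assertion in cvtColor almost always means cv2.imread returned None, i.e., the matching _back.png for this _img.png was not found at the path being built. A hedged guard around the quoted line (not the repo's exact code):
import os
import cv2
bg_path = os.path.join(data_path, filename.replace('_img', '_back'))
bg_im0 = cv2.imread(bg_path)
if bg_im0 is None:
    raise FileNotFoundError('Missing or unreadable background image: ' + bg_path)
bg_im0 = cv2.cvtColor(bg_im0, cv2.COLOR_BGR2RGB)
Check that every xxx_img.png in the input folder has a corresponding xxx_back.png.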
Hi,
if I have a picture with one person (_img.png) but no corresponding background picture (_back.png), can I still do background matting? In real life we often just take the picture (_img.png) without capturing the background (_back.png). Thanks.
Following are the images I provided.
Image:
Background:
Target Background:
I used Facebook's Detectron2 to get the mask. The mask is clearly not good.
Following are the results that I got.
Compose:
Foreground:
Matte:
Out:
I'm thinking of retraining the Adobe network on all 450 images instead of just the non-transparent ones, as mentioned in the paper. I'm also searching for a better segmentation model, one trained on everyday objects instead of just humans. Please let me know if you're aware of any.
Do you have any other suggestions I should look into? Please let me know, thanks!
Were the effects of leaving M behind studied for single images? From what I see, it looks like it was left out for the sake of being able to use the model for both photo and video. Yet it seems this would introduce some bias, since the model would have seen 4 duplicate channels of the same image on top of the 3 color channels of the original image.
Why should we predict the foreground color? I think we can compute it from the alpha and the input image.
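In principle, yes: the matting equation I = alpha * F + (1 - alpha) * B can be solved for F once alpha and the captured background B are known. The catch is that the division by alpha is ill-conditioned wherever alpha is small (hair strands, motion blur), so small alpha errors produce large color errors, which is presumably why the network predicts F directly. A hedged numpy sketch of the algebra on toy data:
import numpy as np
H, W = 4, 4  # toy example
F_true = np.random.rand(H, W, 3)
B = np.random.rand(H, W, 3)
alpha = np.random.rand(H, W, 1)
I = alpha * F_true + (1 - alpha) * B  # the matting equation
# Invert: F = (I - (1 - alpha) * B) / alpha.
# Near alpha == 0 the division amplifies any error in alpha or I,
# which is why a predicted F tends to be more stable than this closed form.
eps = 1e-6
F_rec = (I - (1.0 - alpha) * B) / np.maximum(alpha, eps)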
What is your human-in-the-loop hypothesis when you apply this to non-realtime production?
How would an artist interact with the tool to fine-tune your results manually?
Every time I run the test_segmentation_deeplab.py file, it creates a new temporary folder and takes a long time to download the same model file.
Why not create a folder such as "deeplab_model" and cache the model there?
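A hedged sketch of such a cache (the folder name is hypothetical, and the URL is the one the stock DeepLab demo uses; this is not the repo's code): download the checkpoint once into a fixed directory and reuse it on later runs.
import os
import urllib.request
MODEL_DIR = 'deeplab_model'  # hypothetical cache folder
MODEL_URL = 'http://download.tensorflow.org/models/deeplabv3_pascal_train_aug_2018_01_04.tar.gz'
os.makedirs(MODEL_DIR, exist_ok=True)
tarball = os.path.join(MODEL_DIR, os.path.basename(MODEL_URL))
if not os.path.exists(tarball):
    urllib.request.urlretrieve(MODEL_URL, tarball)
# then point the DeepLabModel loader at `tarball` instead of the temp dir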
Hi,
I load the segmentation and matting models at the same time, hoping to get the segmentation image from the segmentation net and the matte from the matting net. But:
2020-05-25 11:06:13.415797: E tensorflow/stream_executor/cuda/cuda_dnn.cc:324] Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2020-05-25 11:06:13.418751: E tensorflow/stream_executor/cuda/cuda_dnn.cc:324] Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Traceback (most recent call last):
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node MobilenetV2/Conv/Conv2D}}]]
[[{{node SemanticPredictions}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "demo.py", line 10, in
img_seg = P.pred(img)
File "/home/sundy/Background-Matting/pred_seg.py", line 162, in pred
res_im,seg = self.MODEL.run(image)
File "/home/sundy/Background-Matting/pred_seg.py", line 70, in run
feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node MobilenetV2/Conv/Conv2D (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]
[[node SemanticPredictions (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]
Caused by op 'MobilenetV2/Conv/Conv2D', defined at:
File "demo.py", line 6, in
P = pred_seg()
File "/home/sundy/Background-Matting/pred_seg.py", line 154, in init
self.MODEL = DeepLabModel(download_path)
File "/home/sundy/Background-Matting/pred_seg.py", line 49, in init
tf.import_graph_def(graph_def, name='')
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 235, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3433, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3433, in
for c_op in c_api_util.new_tf_operations(self)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3325, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/home/sundy/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()
UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node MobilenetV2/Conv/Conv2D (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]
[[node SemanticPredictions (defined at /home/sundy/Background-Matting/pred_seg.py:49) ]]
Hi, first of all, great work!
I'm testing your fixed-camera model on full-body standing videos (with a fixed camera, obviously) and, although it is pretty good, there are still some errors at the feet, near the hands, and between the legs.
After reading your post on Towards Data Science, I retrained your final model with a couple of these videos, but, contrary to what I expected, the resulting inference was slightly worse. I'm using the captured backgrounds provided.
According to the documentation, target backgrounds should have roughly similar lighting to the original videos. Could that be the cause? If so, how could I create backgrounds with lighting similar to the video I'm trying to process?
Alpha mask with original fixed-cam model:
Hi,
In networks.py, multi_feat is not used at all? Should I change it to oth_feat=torch.cat([self.comb_back(torch.cat([img_feat,back_feat],dim=1)),self.comb_seg(torch.cat([img_feat,seg_feat],dim=1)),self.comb_multi(torch.cat([img_feat,multi_feat],dim=1))],dim=1)? Thank you so much!
def forward(self, image, back, seg, multi):
    img_feat1 = self.model_enc1(image)
    img_feat = self.model_enc2(img_feat1)
    back_feat = self.model_enc_back(back)
    seg_feat = self.model_enc_seg(seg)
    multi_feat = self.model_enc_multi(multi)  # computed but never used below
    oth_feat = torch.cat([self.comb_back(torch.cat([img_feat, back_feat], dim=1)),
                          self.comb_seg(torch.cat([img_feat, seg_feat], dim=1)),
                          self.comb_multi(torch.cat([img_feat, back_feat], dim=1))],  # back_feat again, not multi_feat
                         dim=1)
    out_dec = self.model_res_dec(torch.cat([img_feat, oth_feat], dim=1))
    out_dec_al = self.model_res_dec_al(out_dec)
    al_out = self.model_al_out(out_dec_al)
    out_dec_fg = self.model_res_dec_fg(out_dec)
    out_dec_fg1 = self.model_dec_fg1(out_dec_fg)
    fg_out = self.model_fg_out(torch.cat([out_dec_fg1, img_feat1], dim=1))
I was wondering when the training code will be shared?
Thank you for this amazing project!
Given the complexity of running the inference, it would be worthwhile to have a Colab notebook like the recent "[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting".
First off,
this tool works very well in the situation you describe and on the sample data. The directions are also good enough that I was able to run the sample as well as try my own data. Congratulations on that; one thing missing from many projects is good directions.
That said, it looks like minor differences in the segmentation mask (maskDL) for each frame of the video may be causing the composited result frame to be cropped/resized differently.
I exported a video to PNG images, with the same background for each frame. The camera is on a tripod, so I used the real-fixed-cam model. I then ran the procedure with a solid green target background so I could key the background out later.
This worked mostly exactly like the sample data, but there was a reflection off an appliance that produced some aberrations in the DeepLab mask.
This may have caused the subject focus to change between frames and 're-centered' the frame around the subject. In some cases, it resized the subject in one dimension only, making the subject look 'thin/compressed' in some frames and normal in others.
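One possible workaround (an untested sketch, not part of the repo) is to compute the person bounding box once, e.g., as the union over all frames, and reuse that fixed box for every frame so the crop and resize stay constant across the clip:
# Assumes per-frame binary person masks in `masks`, and that get_bbox
# returns [x1, x2, y1, y2] per frame or None when no person is found.
boxes = [get_bbox(mask, R=h, C=w) for mask in masks]
boxes = [b for b in boxes if b is not None]
x1 = min(b[0] for b in boxes); x2 = max(b[1] for b in boxes)
y1 = min(b[2] for b in boxes); y2 = max(b[3] for b in boxes)
fixed_bbox = [x1, x2, y1, y2]  # use this box for every frame
With a tripod shot this should also remove per-frame jitter caused by small mask differences.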
Background-Matting/networks.py
Lines 97 to 99 in c374150
You didn't use the "multi_feat".
I was wondering if I can use it for video streaming.