
Comments (23)

ankush-me avatar ankush-me commented on August 27, 2024 5

As far as I recall, the depth prediction network has a size limit of 600 px (and there is no size limit, at least for the gPb-UCM segmentation method). So I scale the image down if it is larger than 600 px, and then bring all three things (RGB, depth, segmentation) to the same size (check here).
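A minimal sketch of that resizing step, using only NumPy. It assumes the 600 px limit applies to the longer image side, and the names `target_size` and `resize_nearest` are hypothetical helpers, not functions from the SynthText code. Nearest-neighbour sampling is used so that no new label values are invented when the segmentation is resized.

```python
import numpy as np

MAX_SIDE = 600  # assumption: the 600 px limit applies to the longer side

def target_size(height, width, max_side=MAX_SIDE):
    """Return (height, width) scaled down so the longer side is at most max_side."""
    scale = min(1.0, max_side / max(height, width))
    return max(1, round(height * scale)), max(1, round(width * scale))

def resize_nearest(arr, out_h, out_w):
    """Nearest-neighbour resize over the first two axes of arr."""
    rows = np.arange(out_h) * arr.shape[0] // out_h
    cols = np.arange(out_w) * arr.shape[1] // out_w
    return arr[rows[:, None], cols]

# Placeholder inputs standing in for a real image and its segmentation.
rgb = np.zeros((1200, 800, 3), dtype=np.uint8)
seg = np.zeros((1200, 800), dtype=np.int32)

out_h, out_w = target_size(*rgb.shape[:2])
# Stand-in for the depth network's output at the reduced size.
depth = np.ones((out_h, out_w), dtype=np.float32)

# Bring RGB and segmentation to the size of the depth map.
rgb_small = resize_nearest(rgb, out_h, out_w)
seg_small = resize_nearest(seg, out_h, out_w)
```

Images that are already within the limit are left at their original size, since the scale factor is capped at 1.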

from synthtext.

junboLiu avatar junboLiu commented on August 27, 2024

I mean, how do I get the depth and segmentation information for an image?


ankush-me avatar ankush-me commented on August 27, 2024

Hi @junboLiu! You can use depth-from-single-image ConvNets and popular segmentation methods like gPb-UCM or "Holistically-Nested Edge Detection" (HED).


crazylyf avatar crazylyf commented on August 27, 2024

There are indoor and outdoor pre-trained models for FCRN depth prediction. Is there much difference between them, and which one should I choose?


cjnolet avatar cjnolet commented on August 27, 2024


ankush-me avatar ankush-me commented on August 27, 2024

The indoor/outdoor models are expected to perform better on images from their respective domains. However, the choice may not be critical for synthetic text placement.


crazylyf avatar crazylyf commented on August 27, 2024

@cjnolet @ankush-me
I see, thx~


crazylyf avatar crazylyf commented on August 27, 2024

Another question: the depth maps in the dset.h5 file have two channels, while the depth predicted by FCRN has only one channel. Is there any additional preprocessing? Best!


ankush-me avatar ankush-me commented on August 27, 2024

FCRN in our CVPR work does not predict depth.

The two channels correspond to predictions by the depth-prediction models trained on outdoor and indoor datasets.


oneOfThePeople avatar oneOfThePeople commented on August 27, 2024

How do you deal with the problem that each image has a different size, while the depth prediction and segmentation networks require a fixed input size?


oneOfThePeople avatar oneOfThePeople commented on August 27, 2024

I see you resize to the size of the depth image and not to the original one. Why?
Thanks a lot for your patience.


crazylyf avatar crazylyf commented on August 27, 2024

Hi @ankush-me
What do the segmentation variables seg, area, and label mean? Do seg and area refer to the superpixels of the proposals and the areas of the bounding boxes from gPb-UCM? What is label?
Thanks a lot!


ankush-me avatar ankush-me commented on August 27, 2024

@oneOfThePeople : I did not want to interpolate the depth image.

@crazylyf : "Seg" is a region labeling, where pixels in each segment have the same label. "Label" is just a vector of the unique values in "seg". "Area" is the number of pixels in each segment, where the i-th "area" corresponds to the segment with the i-th "label".
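As a small illustration of that relationship (a sketch on a toy array, not the code used to build the dataset), `np.unique` with `return_counts=True` yields both vectors from a label map in one call:

```python
import numpy as np

# Toy 3x3 region labeling: every pixel holds the label of its segment.
seg = np.array([[1, 1, 2],
                [1, 3, 2],
                [1, 3, 2]])

# label: sorted unique values in seg; area[i]: pixel count of segment label[i].
label, area = np.unique(seg, return_counts=True)

for lbl, a in zip(label, area):
    print(f"segment {lbl}: {a} pixels")
```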


oneOfThePeople avatar oneOfThePeople commented on August 27, 2024

OK, thanks a lot.


CatherineYao avatar CatherineYao commented on August 27, 2024

How do I get "seg", "label", and "area" from the gPb-UCM code (https://github.com/jponttuset/mcg) that you recommended? Thanks a lot. @ankush-me


FuriousRococo avatar FuriousRococo commented on August 27, 2024

Hello Ankush, I've been stuck on creating segmentation information for the images and I really need your help! I've managed to run your code on Windows with Python 3.5 and OpenCV 3, but I couldn't find a replacement for gPb-UCM. Is there any method written in Python that can run on Windows?


ankush-me avatar ankush-me commented on August 27, 2024

I got the segmentation information from gPb-UCM / MCG (as pointed out above).
You can use scikit-image (I haven't used this myself): http://scikit-image.org/docs/dev/auto_examples/segmentation/plot_segmentations.html


 avatar commented on August 27, 2024

Hello Ankush, I am having trouble creating my own dataset. I want to create a new dset.h5, but I couldn't find how to do it. Can you help me?


425183525 avatar 425183525 commented on August 27, 2024

Hello Ankush, I have generated a .mat file. How do I turn a .mat file into a .h5 file?


T43113452 avatar T43113452 commented on August 27, 2024

Sorry. I didn't know that depth translates into perspective.


ahappycutedog avatar ahappycutedog commented on August 27, 2024

Hello Ankush, I have generated a .mat file. How do I turn a .mat file into a .h5 file?

Hello, I'd like to ask: did you generate this .mat file on Ubuntu? Didn't you get any errors? With MATLAB 2016 on Ubuntu 16.04, I get this error:
WARNING: Database pascal2012 (folder JPEGImages) not found in /path/to/pascal2012/
WARNING: Database SBD (folder images) not found in /path/to/SBD/
WARNING: Database COCO (folder images) not found in /path/to/COCO/
-- You can disable this warning in install.m --
-- Successful installation of MCG. Enjoy! --
Warning: The temporary variable im will be cleared at the beginning of each iteration of the parfor loop.
Any value assigned to it before the loop will be lost. If im is used before it is assigned in the parfor loop, a runtime error will occur.
See Parallel for Loops in MATLAB, "Temporary Variables".

In run_ucm (line 22)
Starting parallel pool (parpool) using the 'local' profile ... connected to 12 workers.
1 of 3
err
2 of 3
err
3 of 3
Analyzing and transferring files to the workers ...done.
1 of 3
err
2 of 3
err
3 of 3
Error using im2ucm (line 39)
An UndefinedFunction error was thrown on the workers for 'loadvar'. This might be because the file containing 'loadvar' is not accessible on the workers. Use addAttachedFiles(pool, files) to specify the
required files to be attached. See the documentation for 'parallel.Pool/addAttachedFiles' for more details.

Error in run_ucm (line 22)
parfor i = 1:numel(imname)

Caused by:
Undefined function 'loadvar' for input arguments of type 'char'.

Do you know how to solve this?


etsegenet2020 avatar etsegenet2020 commented on August 27, 2024

Hi, I am using this code for my project, but I get an error like this:
Traceback (most recent call last):
File "/home/molaligntamiru/Documents/SynthText/synthgen.py", line 658, in render_text
txt_render_res = self.place_text(img,place_masks[ireg],
File "/home/molaligntamiru/Documents/SynthText/synthgen.py", line 500, in place_text
render_res = self.text_renderer.render_sample(font,collision_mask)
File "/home/molaligntamiru/Documents/SynthText/text_utils.py", line 366, in render_sample
text_type = sample_weighted(self.p_text)
File "/home/molaligntamiru/Documents/SynthText/text_utils.py", line 22, in sample_weighted
return p_dict[np.random.choice(ps,p=ps)]
File "mtrand.pyx", line 892, in numpy.random.mtrand.RandomState.choice
ValueError: a must be 1-dimensional or an integer

What can I do?
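One plausible cause, offered as an assumption since line 21 of text_utils.py is not shown in the traceback: under Python 3, `dict.keys()` returns a view rather than a list, and `np.random.choice` rejects it because NumPy cannot turn the view into a 1-D array. The sketch below wraps the keys in `list(...)`. The dictionary `p_text` is a hypothetical example, and the keys are assumed to be probabilities that sum to 1.

```python
import numpy as np

def sample_weighted(p_dict):
    """Pick one value from p_dict, whose keys are the sampling probabilities."""
    # list(...) turns the Python 3 dict view into a sequence NumPy treats as 1-D.
    ps = list(p_dict.keys())
    return p_dict[np.random.choice(ps, p=ps)]

# Hypothetical probabilities, not the values used in SynthText.
p_text = {0.25: 'WORD', 0.75: 'LINE'}
text_type = sample_weighted(p_text)
print(text_type)
```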


Lane689 avatar Lane689 commented on August 27, 2024

Hi guys, @crazylyf , @cjnolet

How did you extract depth images from the h5 file, or how do you load depth images into the synthtext script?

Here is my code:

import h5py
import numpy as np
import cv2

save_dir = 'C:/Users.../depth_imgs'

with h5py.File('depth.h5', 'r') as hf:
    image_ds = hf
    for imagename in image_ds.keys():
        IMAGE_arr = image_ds[imagename][()]
        cv2.imwrite(f"{save_dir}/{imagename}", IMAGE_arr)
        cv2.waitKey(1000)
        cv2.destroyAllWindows()

But when I run this code it shows me this error:

[screenshot of the error message]

Then I tried print(IMAGE_arr.dtype, IMAGE_arr.shape), which gives these results:

float32 (2, 600, 400)
float32 (2, 600, 493)
float32 (2, 600, 284)

The leading "2" here refers to two different depth estimates from two different neural networks (one trained mostly on indoor images, the other on outdoor images).
Because of that I can't imwrite() the images into a folder, whereas I successfully saved the images from seg.h5 (segmentation images).
OpenCV's imwrite expects (height, width, channels).
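A possible way around this, as a sketch on random data in the same (2, height, width) layout: treat each of the two estimates as its own single-channel image and rescale it to 8-bit before saving. The helper name `depth_to_uint8` is hypothetical, the min-max scaling is only for viewing and discards the original depth units, and which index holds the indoor or outdoor estimate is not assumed here.

```python
import numpy as np

def depth_to_uint8(channel):
    """Linearly rescale one (H, W) float depth map to 0..255 for viewing."""
    lo, hi = float(channel.min()), float(channel.max())
    if hi == lo:
        # A constant map has no range to stretch, so return all zeros.
        return np.zeros(channel.shape, dtype=np.uint8)
    return ((channel - lo) / (hi - lo) * 255.0).round().astype(np.uint8)

# Random stand-in with the dtype and shape printed above.
rng = np.random.default_rng(0)
depth = rng.random((2, 600, 400), dtype=np.float32)

# One 2-D uint8 image per depth estimate.
images = [depth_to_uint8(depth[i]) for i in range(depth.shape[0])]
print([img.shape for img in images])
```

Each entry of `images` is a plain (height, width) uint8 array, which can then be passed to `cv2.imwrite` with a file name that has an image extension.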
