ika-rwth-aachen / cam2bev
TensorFlow Implementation for Computing a Semantically Segmented Bird's Eye View (BEV) Image Given the Images of Multiple Vehicle-Mounted Cameras.
License: MIT License
Thank you very much for your excellent work. How can I download VTD?
Hello, it seems strange that you use image masks as the model's input. Have you ever tried using the original images as input?
First of all thank you so much for this great work.
In the paper I read that the field of view of the ground truth image for the full 360° is approximately 70m x 44m. Do you have an approximation of the field of view of the single-input model too?
Based on the ground truth BEV image, could we also approximate the field of view it would have in a real-world application?
Thanks a lot in advance.
Hello! I've found a performance issue in /model/train.py: batch() should be called before map(), which could make your program more efficient. The TensorFlow documentation supports this.
Detailed description is listed below:
dataTrain.batch(conf.batch_size, drop_remainder=True) should be called before dataTrain.map(parse_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE).
dataValid.batch(1) should be called before dataValid.map(parse_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE).
Besides, you need to check whether the function called in map() (e.g., parse_sample called in dataValid.map(parse_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE)) is affected, so that the changed code works properly. For example, if parse_sample needed data with shape (x, y, z) as its input before the fix, it would require data with shape (batch_size, x, y, z) afterwards.
Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.
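The shape requirement described above can be illustrated without TensorFlow. In the following minimal sketch, NumPy arrays stand in for tensors, and parse_sample / parse_batch are hypothetical stand-ins that only normalize pixel values; they are not the repository's functions. It shows that a function written for (x, y, z) inputs has to accept (batch_size, x, y, z) once batching happens first, and that both orders yield the same batches when the function vectorizes cleanly.

```python
import numpy as np

def parse_sample(image):
    """Hypothetical per-sample parser: expects one image of shape (x, y, z)."""
    assert image.ndim == 3
    return image.astype(np.float32) / 255.0

def parse_batch(images):
    """Hypothetical batched parser: expects shape (batch_size, x, y, z)."""
    assert images.ndim == 4
    return images.astype(np.float32) / 255.0

rng = np.random.default_rng(0)
samples = [rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8) for _ in range(8)]
batch_size = 4
starts = range(0, len(samples), batch_size)

# map() then batch(): parse every sample on its own, then stack into batches
map_then_batch = [
    np.stack([parse_sample(s) for s in samples[i:i + batch_size]]) for i in starts
]

# batch() then map(): stack first, then parse the whole batch in one call
batch_then_map = [
    parse_batch(np.stack(samples[i:i + batch_size])) for i in starts
]

same = all(np.array_equal(a, b) for a, b in zip(map_then_batch, batch_then_map))
print(same, batch_then_map[0].shape)
```

Whether the reordering is a drop-in change for the real parse_sample depends on what it does; steps that only accept a single element, such as reading one file from a path, would presumably need extra handling.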
Hello! When training on your dataset, the run fails every time at the last step. I changed the batch size to 2, and it still fails at the last step. What is your training time? Can you help me out here? Thank you!
Hello, I was inspired after reading the paper.
I would like to know how to generate this simulated data. I want to simulate a garage scene to train the model.
What is the difference between label and input images in the response I got from a prior issue?
If I'm using my own dataset with only front camera images as input, is the following organization correct?
To confirm, the segmented images do not go in the preprocessing scripts?
The problem is that your input image has a fourth (alpha) channel, so that the resized image has shape (256, 512, 4). This causes the crash during one-hot encoding.
I will push a fix tomorrow so that an image will always be loaded as RGB instead of RGBA, even if an alpha channel is present. In the meantime, you can fix it yourself by replacing utils.py#L77 with
img = tf.image.decode_png(img, channels=3)
Some more notes on your files:
0,0,142 (RGB) blue is listed in the convert_10.xml. You need to check the colors you specify there.
(640, 480), while your input image has shape (480, 640). Keep in mind that both will be center-cropped/resized to (256, 512).
Originally posted by @lreiher in #3 (comment)
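The alpha-channel problem described above can be reproduced in miniature. The sketch below is a NumPy illustration, not the repository's one_hot_encode_image_op: the function name, the two-color palette, and the tiny image are assumptions made for the example. It shows that comparing pixels against 3-component palette colors only works for 3-channel images, and that dropping the alpha channel first (the NumPy analogue of decoding with channels=3) lets the encoding go through.

```python
import numpy as np

def one_hot_encode(image, palette):
    """One-hot encode an (H, W, 3) RGB image against a list of RGB colors."""
    if image.shape[-1] != 3:
        raise ValueError(f"expected 3 channels, got {image.shape[-1]}")
    # Allocate the output once, before looping over the classes
    one_hot = np.zeros(image.shape[:2] + (len(palette),), dtype=np.float32)
    for class_idx, color in enumerate(palette):
        # A pixel belongs to a class if all three channels match its color
        mask = np.all(image == np.array(color, dtype=image.dtype), axis=-1)
        one_hot[..., class_idx] = mask
    return one_hot

# Hypothetical two-class palette and a 2x2 RGBA image
palette = [(0, 0, 142), (128, 64, 128)]
rgba = np.zeros((2, 2, 4), dtype=np.uint8)
rgba[..., :3] = (0, 0, 142)
rgba[0, 0, :3] = (128, 64, 128)
rgba[..., 3] = 255  # fully opaque alpha channel

rgb = rgba[..., :3]  # drop alpha before encoding
encoded = one_hot_encode(rgb, palette)
print(encoded.shape)
```

Passing the RGBA array directly raises the ValueError, which mirrors the shape mismatch between [3] and [256,512,4] reported in the traceback further down this page.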
Firstly, I would like to thank you very much for this repository and the ideas introduced in the paper. I have reproduced the results using 1_FRLR deeplab_mobilenet and the provided dataset and was looking to test on some images that we have collected on my side.
What I found is that, using the trained model, I wasn't able to interpret the results for my own images. (We mount the fisheye camera on a truck and run segmentation and our own IPM to generate the homography image as input.)
Below are the input and result from our custom image
Input:
Result:
Result using 1_FRLR validation data provided
Input:
Result:
I am not sure why there is such a huge difference in the results. Although I was expecting a poorer result because of the change in data domain, I wasn't expecting instances like cars, which were represented in our own homography image, to disappear upon inference.
I was wondering if there is anything that I did wrong during the predict step, and I look forward to hearing from anyone who could shed light on whether it's a prediction step issue or a model generalization issue.
Thank you
Hello, thank you for this repository and the ideas. I have recently been trying to reproduce this work.
I find that on my dataset, using the SpatialTransformer to preprocess the input images does not perform better than splicing the IPM images together, so I am trying to find out what I did wrong with the SpatialTransformer unit.
In order to use the uNetXST model, I need to precalculate the uNetXST_homographies for 1_FRLR.py as theta_init, the init parameter passed to the SpatialTransformer. However, I found that the uNetXST homography I calculated for the front camera with the provided script does not match the data provided in preprocessing/homography_converter/uNetXST_homographies/1_FRLR.py, while those for the other three cameras all match.
# uNetXST_ homographies about front
# provided in 1_FRLR.py
np.array([[4.651574574230558e-14, 10.192351107009959, -5.36318723862984e-07],
[-5.588661045867985e-07, 0.0, 2.3708767903941617],
[35.30731833118676, 0.0, -1.7000018578614013]]), # front
# what i calculated:
# using the IPM homographies provided in preprocessing/homography_converter/README.md
[[6.5627512483814406e-15, 10.192351107009959, -5.363187232807152e-07],
[-5.588661045867985e-07, 0.0, 2.3708767903941617],
[35.30731833118676, 0.0, -1.7000018578614013]]
For this reason, I recalculated the IPM homographies and found that the values for the cameras also differ from those provided in preprocessing/homography_converter/README.md.
# what is provided in preprocessing/homography_converter/README.md
# OpenCV homography for front:
# [[0.0, 0.8841865353311344, -253.37277367000263], [0.049056392233805146, 0.5285437237795494, -183.265385638118], [-0.0, 0.001750144780726984, -0.5285437237795492]]
# OpenCV homography for rear:
# [[6.288911300436434e-18, 0.8292344604207404, -264.08036704706365], [-0.04905639223380515, 0.5285437237795513, -135.9750235247304], [-0.0, 0.0017501447807269904, -0.5285437237795512]]
# OpenCV homography for left:
# [[0.04905639223380514, 0.7984814950483465, -264.7865925612947], [3.0038376863423275e-18, 0.4821577791689496, -159.26320930902278], [-0.0, 0.0016334684620118568, -0.49330747552758086]]
# OpenCV homography for right:
# [[-0.04905639223380516, 0.7984814950483448, -217.49623044790604], [3.0038376863423283e-18, 0.5044571718862112, -138.69450590963578], [-0.0, 0.0016334684620118542, -0.49330747552758]]
# what i calculated:
OpenCV homography for front:
[[6.288911300436432e-18, 0.8841865353311336, -253.37277367000237], [0.049056392233805146, 0.5285437237795487, -183.26538563811778], [1.1577177764518928e-20, 0.0017501447807269821, -0.5285437237795486]]
OpenCV homography for rear:
[[6.288911300436434e-18, 0.8292344604207396, -264.08036704706336], [-0.049056392233805146, 0.5285437237795507, -135.97502352473026], [-4.194994772127064e-21, 0.0017501447807269886, -0.5285437237795506]]
OpenCV homography for left:
[[0.04905639223380514, 0.7984814950483464, -264.7865925612947], [3.0038376863423302e-18, 0.48215777916894953, -159.26320930902278], [8.805125002324379e-36, 0.0016334684620118568, -0.49330747552758086]]
OpenCV homography for right:
[[-0.04905639223380516, 0.7984814950483449, -217.49623044790607], [3.0038376863423263e-18, 0.5044571718862112, -138.69450590963578], [-5.442256126106322e-36, 0.0016334684620118542, -0.49330747552758]]
This leads to more differences in the resulting uNetXST homographies, but surprisingly the result for the right camera is the same as in the provided file:
H = [
np.array([[4.651574574230558e-14, 10.192351107009959, -5.36318723862984e-07],
[-5.588661045867985e-07, 0.0, 2.3708767903941617],
[35.30731833118676, 0.0, -1.7000018578614013]]), # front
# what i calculated
# [[-5.336674296656391e-14, 10.192351107009959, -5.363187163709389e-07],
# [-5.588660999399972e-07, -1.3484445368213003e-16, 2.3708767903941643],
# [35.30731833118661, -1.5835212325065431e-16, -1.700001857861401]]
np.array([[-5.336674306912119e-14, -10.192351107009957, 5.363187220578325e-07],
[5.588660952931949e-07, 3.582264351370481e-23, 2.370876772982613],
[-35.30731833118661, -2.263156574813233e-15, -0.5999981421386035]]), # rear
# what i calculated
# [[2.6539246969884692e-14, -10.192351107009959, 5.363187207328902e-07],
# [5.58866099939995e-07, -4.8860892910808006e-17, 2.3708767729826152],
# [-35.30731833118661, -2.9784330148858767e-15, -0.5999981421386087]]
np.array([[20.38470221401992, 7.562206982469407e-14, -0.28867638384075833],
[-3.422067857504854e-23, 2.794330463189411e-07, 2.540225111648729],
[2.1619497190382224e-15, -17.65365916559334, -0.4999990710692976]]), # left
# what i calculated
# [[20.38470221401992, 3.566907532807007e-14, -0.28867638384075667],
# [-3.422067879481381e-23, 2.794330463189408e-07, 2.5402251116487293],
# [2.1619497190382196e-15, -17.65365916559332, -0.4999990710692995]]
np.array([[-20.38470221401991, -4.849709834037436e-15, 0.2886763838407495],
[-3.4220679184765114e-23, -2.794330512976549e-07, 2.5402251116487626],
[2.161949719038217e-15, 17.653659165593304, -0.5000009289306967]]) # right
# what i calculated
# [[-20.38470221401991, -4.849709834037436e-15, 0.28867638384074945],
# [-3.4220679184765114e-23, -2.794330512976549e-07, 2.5402251116487626],
# [2.161949719038217e-15, 17.653659165593304, -0.5000009289306967]]
]
The differences seem small given the orders of magnitude, but I'm not sure whether they affect the performance of the uNetXST model, because the performance of the uNetXST model on my dataset is not as good as the result obtained by directly using the IPM image as input.
Finally, is my calculation correct? If I want to use my own dataset, how can I get the correct uNetXST homography values?
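One way to judge whether such differences matter is to compare the matrices numerically instead of by eye. The sketch below takes the two front-camera matrices quoted above and measures their largest absolute difference; the tolerance of 1e-9 is an arbitrary choice for this illustration. The entries that differ are on the order of 1e-14 next to entries on the order of 1 to 35, which is consistent with double-precision round-off, although this check alone cannot say anything about the training behaviour.

```python
import numpy as np

# Front-camera matrix as provided in 1_FRLR.py (copied from the issue text)
provided = np.array([
    [4.651574574230558e-14, 10.192351107009959, -5.36318723862984e-07],
    [-5.588661045867985e-07, 0.0, 2.3708767903941617],
    [35.30731833118676, 0.0, -1.7000018578614013],
])

# Front-camera matrix as recalculated by the issue author
calculated = np.array([
    [6.5627512483814406e-15, 10.192351107009959, -5.363187232807152e-07],
    [-5.588661045867985e-07, 0.0, 2.3708767903941617],
    [35.30731833118676, 0.0, -1.7000018578614013],
])

# Largest element-wise deviation between the two matrices
max_abs_diff = np.max(np.abs(provided - calculated))

# Absolute-tolerance comparison; 1e-9 is an assumed threshold for this sketch
matches = np.allclose(provided, calculated, rtol=0.0, atol=1e-9)

print(max_abs_diff, matches)
```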
Hi,
I'm trying to train your UNetXST model for the front view only on a Google Colab notebook, but it takes forever for each epoch. (2hrs +)
The notebook is running Python 3.10.11 with CUDA 11.8 and Tensorflow 2.12.0 preinstalled. In requirements.txt you suggest to train with Tensorflow<2.5.0, but this seems to affect only the Deeplab models.
Another thing that I have noticed is the TF-TRT Warning: Could not find TensorRT... do I need to install TensorRT for training?
Thank you in advance!
Below is the output of the terminal while training:
2023-05-22 12:35:07.066135: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-05-22 12:35:07.119840: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-22 12:35:08.005860: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Found 32190 training samples
Found 3172 validation samples
2023-05-22 12:36:46.183583: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2023-05-22 12:36:46.183643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 38284 MB memory: -> device: 0, name: NVIDIA A100-SXM4-40GB, pci bus id: 0000:00:04.0, compute capability: 8.0
Built data pipeline for training
Built data pipeline for validation
Compiled model uNetXST.py
Starting training...
2023-05-22 12:36:59.890772: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_1' with dtype string and shape [32190]
[[{{node Placeholder/_1}}]]
2023-05-22 12:36:59.891129: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_1' with dtype string and shape [32190]
[[{{node Placeholder/_1}}]]
Epoch 1/100
2023-05-22 12:37:19.169642: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inmodel_5/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
2023-05-22 12:37:25.133130: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:424] Loaded cuDNN version 8700
2023-05-22 12:37:31.275985: I tensorflow/compiler/xla/service/service.cc:169] XLA service 0x7f0603f4c350 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-05-22 12:37:31.276032: I tensorflow/compiler/xla/service/service.cc:177] StreamExecutor device (0): NVIDIA A100-SXM4-40GB, Compute Capability 8.0
2023-05-22 12:37:31.420157: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2023-05-22 12:37:31.919526: I ./tensorflow/compiler/jit/device_compiler.h:180] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
54/6438 [..............................] - ETA: 2:13:38 - loss: 1.3788 - categorical_accuracy: 0.5008 - mean_io_u_with_one_hot_labels: 0.2887
Hello, in ipm.py, when the R matrix for the camera is set, the rotations are applied in this order: first pitch, then yaw.
Could this make the yaw rotation incorrect, because the Y axis has changed after pitch is applied?
(It is fine when pitch is 0, but when I changed the pitch to a non-zero value, it generated a wrong result.)
I think we should apply yaw first, and then pitch.
By the way, there is another question. When training the Spatial Transformer Network, what should the regularization coefficient be set to in order to avoid the loss becoming NaN?
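The rotation-order concern raised above can be checked numerically. The following sketch builds two elementary rotation matrices and compares both multiplication orders. The axis assignment (yaw about z, pitch about y) and the angles are assumptions for illustration and may differ from the convention in ipm.py, so the sketch only demonstrates that the order matters once both angles are non-zero, not which order is the right one for the repository.

```python
import numpy as np

def rot_y(angle):
    """Rotation about the y axis (assumed here to represent pitch)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def rot_z(angle):
    """Rotation about the z axis (assumed here to represent yaw)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

yaw = np.deg2rad(30.0)

# With zero pitch, rot_y is the identity, so both orders agree
pitch = 0.0
same_when_pitch_zero = np.allclose(rot_z(yaw) @ rot_y(pitch),
                                   rot_y(pitch) @ rot_z(yaw))

# With non-zero pitch, the two orders give different matrices
pitch = np.deg2rad(10.0)
same_when_pitch_nonzero = np.allclose(rot_z(yaw) @ rot_y(pitch),
                                      rot_y(pitch) @ rot_z(yaw))

print(same_when_pitch_zero, same_when_pitch_nonzero)
```

This matches the observation in the issue that the problem only shows up for a non-zero pitch.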
Hi,
Thank you again for the great work!
I am wondering if this model can also work well for BEV-to-front-view translation?
I tried reproducing the results mentioned in the paper. To train deeplab-xception and deeplab-mobilenet, I simply changed "./train.py -c config.1_FRLR.unetxst.yml" to "./train.py -c config.1_FRLR.deeplab-xception.yml" and "./train.py -c config.1_FRLR.deeplab-mobilenet.yml". The training successfully starts but no improvement in performance occurs. I did not change any of the configurations and the training stops at epoch 21 for both models. Thanks for your time and I look forward to your guidance.
Hello, I found a performance issue in the definition of one_hot_encode_image_op in model/utils.py: tf.zeros will be called repeatedly during program execution, resulting in reduced efficiency. I think it should be created before the loop.
Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.
Hello,
Thanks for sharing your work, it looks awesome. Would you be willing to provide pretrained weights of your models? It would be much easier to compare custom results with your work if there were no need to reproduce the training, which unfortunately may take a lot of time.
Best regards.
Hello,
Could you please explain how one can use real-world images after training the model? I have tested the model successfully on the validation dataset from VTD, but for real-world data, I believe that I have to semantically segment it with the color palette that was used for training. Is there an existing model that you recommend for segmentation? (i.e., which model did you use to label the left-most real-world input pictures of Fig. 6 from the paper?)
Thank you
Originally posted by @mikqs in #16 (comment)
Dear Sir
Thanks so much for your amazing work. Could you share the pre-trained models for both the 360° view and the frontal view?
Yours
Hazem
Thanks for your great work. I am reading the source code. In ipm.py, why is t computed by multiplying the camera position by -R?
def setT(self, XCam, YCam, ZCam):
    X = np.array([XCam, YCam, ZCam])
    self.t = -self.R.dot(X)
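A likely explanation, assuming the common pinhole convention x_cam = R x_world + t, with R rotating world axes into camera axes and (XCam, YCam, ZCam) being the camera position C in world coordinates: the camera position itself must map to the camera origin, so 0 = R C + t and therefore t = -R C. The sketch below checks this with a hypothetical rotation and camera position; none of the numbers come from the repository's camera configs.

```python
import numpy as np

# Hypothetical world-to-camera rotation: 30 degrees about the z axis
yaw = np.deg2rad(30.0)
c, s = np.cos(yaw), np.sin(yaw)
R = np.array([[c, -s, 0.0],
              [s, c, 0.0],
              [0.0, 0.0, 1.0]])

# Hypothetical camera position in world coordinates (XCam, YCam, ZCam)
cam_position = np.array([1.7, 0.0, 1.4])

# Same formula as in setT()
t = -R.dot(cam_position)

def world_to_camera(point):
    """Apply the extrinsic transform x_cam = R x_world + t."""
    return R.dot(point) + t

# The camera position maps to the origin of the camera frame
origin_in_cam = world_to_camera(cam_position)

# Equivalent form: subtract the camera position first, then rotate
point = np.array([5.0, 2.0, 0.0])
shifted_then_rotated = R.dot(point - cam_position)

print(origin_in_cam, np.allclose(world_to_camera(point), shifted_then_rotated))
```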
Hi, I cannot do the whole training in one go, so I have to split it into multiple training sessions. I couldn't figure out a way to resume training from where the last session left off, e.g. by loading the last trained weights and then continuing the training.
Thank you for your time. Looking forward to your reply.
Hi,
I am planning to use your software for a real application. I want to use the single input model with the uNetXST configuration and a single front facing camera. Is there any way to train the model with your dataset or do I need to create my own dataset with my camera parameters? Also, my camera operates in HD resolution (1280x720), does the training data also need to have the same resolution?
Many thanks in advance!
Hi, may I know at what frame rate the images in the dataset were taken?
Hello @lreiher, thank you for the fabulous work!
We're trying to reproduce your results on the UNetXST model with your dataset and your configuration (config.1_FRLR.unetxst.yml). I can confirm a previous issue: the RAM usage grows continuously and rapidly as soon as the training starts. I am talking about the main RAM, not the GPU RAM. By the time the process is killed due to OOM, it occupies ~55GB of physical memory. We're trying to train on two RTX 3090s with 24GB RAM each.
Can you think of a part of your code where stuff accumulates in memory without being garbage collected? It's very strange that you haven't encountered this. As a note, we were able to train MobileNetV2 with your 1_FRLR configuration, so this issue seems specific to UNetXST.
I am trying to train this model on my own data. I have been able to get it to work before with my own data, but I wanted to have my images semantically segmented beforehand, so I used a different model to do so, and I think in doing so I must have changed my environment enough to start getting this error because I highly doubt it's an issue with the new images I'm using. They are the same size as the previous. I have run the requirements.txt file and still am getting this issue. I have posted the error below. Any help on what the problem might be and how to fix it would be greatly appreciated. Thank you in advance!
Starting training...
Train for 200 steps, validate for 1169 steps
Epoch 1/100
2020-12-06 20:35:50.965473: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Incompatible shapes: [3] vs. [256,512,4]
[[{{node Equal_29}}]]
[[IteratorGetNext]]
2020-12-06 20:35:50.987892: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Incompatible shapes: [3] vs. [256,512,4]
[[{{node Equal_29}}]]
[[IteratorGetNext]]
[[metrics/mean_io_u_with_one_hot_labels/StatefulPartitionedCall/confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/else/_6/Assert/data_1/_20]]
1/200 [..............................] - ETA: 35:21WARNING:tensorflow:Can save best model only with val_mean_io_u_with_one_hot_labels available, skipping.
WARNING:tensorflow:Early stopping conditioned on metric val_mean_io_u_with_one_hot_labels
which is not available. Available metrics are:
Traceback (most recent call last):
File "./train.py", line 185, in <module>
callbacks=callbacks)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
use_multiprocessing=use_multiprocessing)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 342, in fit
total_epochs=epochs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
batch_outs = execution_function(iterator)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
distributed_function(input_fn))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
result = self._call(*args, **kwds)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 632, in _call
return self._stateless_fn(*args, **kwds)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 2363, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1611, in _filtered_call
self.captured_inputs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
ctx=ctx)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [3] vs. [256,512,4]
[[{{node Equal_29}}]]
[[IteratorGetNext]] [Op:__inference_distributed_function_17137]
Function call stack:
distributed_function
Hello,
Thank you for sharing such a well-documented code repository for your work! Could you also share the coordinate system (left-handed, right-handed, or something else) that the preprocessing/ipm/ipm.py code is following? I see a comment about switching axes in the setR() function, so I just wanted to make sure.
Thanks!
First and foremost, thank you for providing such a brilliant work!
Now here comes my issue. I first downloaded the synthetic dataset you provide. Then, when I tried to use ipm.py in your preprocessing directory, I used the front/rear/left/right images in data/1_FRLR/examples and the camera calibration files in preprocessing/camera_configs/1_FRLR/ as the arguments. This results in a totally blank output image.
After debugging your code, I realized that this is caused by the mismatched resolution of the images in the examples directory. Your configuration files indicate that your cameras have a resolution of 964x604, while the example images have a resolution of 320x200.
Now I understand that this directory stores the images shown in your README and that I should select a set of images from the train or val directory instead. However, for anyone coming to this repo for the first time, this arrangement is very confusing, because at that moment we can hardly tell that it is a repo in your GitLab.
Therefore I highly recommend that you adjust this arrangement, for example by giving the images in the examples directory the proper resolution, or by putting another set of example images in your preprocessing directory.
It is not a critical issue. Your work is still generally wonderful!
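For readers who hit the same resolution mismatch, one general option (besides using the full-resolution train or val images) is to rescale the pinhole intrinsics to the resolution of the images at hand. The sketch below assumes a plain resize without cropping, ignores half-pixel offsets, and uses hypothetical focal lengths and principal point; only the two resolutions, 964x604 and 320x200, are taken from the issue above. The helper scale_intrinsics is an illustration and not part of the repository.

```python
import numpy as np

def scale_intrinsics(K, original_size, new_size):
    """Rescale a 3x3 pinhole intrinsic matrix for a resized image.

    Sizes are given as (width, height). Assumes a plain resize (no crop).
    """
    orig_w, orig_h = original_size
    new_w, new_h = new_size
    # Scaling the pixel axes scales the first two rows of K
    S = np.diag([new_w / orig_w, new_h / orig_h, 1.0])
    return S @ K

# Hypothetical intrinsics for a 964x604 camera (fx, fy, px, py are made up)
K = np.array([[300.0, 0.0, 482.0],
              [0.0, 300.0, 302.0],
              [0.0, 0.0, 1.0]])

# Intrinsics matching the 320x200 example images
K_small = scale_intrinsics(K, original_size=(964, 604), new_size=(320, 200))
print(K_small)
```

With these made-up values, the principal point moves from the centre of the 964x604 image to the centre of the 320x200 image, as expected for a pure resize.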
Our camera suite lacks view of a ~30 degree region in the rear of the vehicle. Classical methods of image stitching would break down without overlapping regions between cameras for image stitching. Since this method uses a learning network, I could see it being robust to this.
So to clarify the question: how would you expect this method to respond to a non-360° view? Would the region just show up as the added "occluded" class, since it is not within the view of any of the cameras?
Hello, may I ask if there is a trained model that can directly perform inference?
Hello! I would like to ask whether the drone config parameters must be set to exact values, or whether we can choose suitable ones ourselves. If exact values are required, how were they obtained?
Thanks so much for sharing this method and code in such a well-documented fashion. I just had a question of clarification regarding the use of different one-hot-palette labels in the multi-view versus single-view networks.
I started off training the 1_FRLR method, and I observe that the one-hot-palette-input, convert_10, and the one-hot-palette-label, convert_9+occl, seem very similar - the main difference is that the "sky" RGB values are changed to "occluded" instead, which makes sense because the inputs will not see an occluded class, and the BEV will not see the sky.
However, now looking at the 2_F config file, I see that while convert_10 is still used for the one-hot-palette-input, the one-hot-palette-label is now using convert_3+occl. My understanding is that now the network input views are seeing classes that the ground truth input will never have - for example, terrain that was seen as "9" in the input will be understood as "3" in the ground truth BEV. So my questions are:
Hi @lreiher, thanks for your great work.
I have run into a training problem. When I run the code (with the released dataset), at the end of the first epoch (6638/6639), possibly in the validation stage, the GPU reports that there is not enough memory and the training breaks. I run the code on a Titan X GPU.
I copy some of the error output below:
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
2020-12-24 16:12:51.644620: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 8388608 totalling 8.00MiB
2020-12-24 16:12:51.644637: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 8415488 totalling 8.03MiB
2020-12-24 16:12:51.644654: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 3 Chunks of size 10485760 totalling 30.00MiB
2020-12-24 16:12:51.644672: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 15204352 totalling 14.50MiB
2020-12-24 16:12:51.644689: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 28 Chunks of size 26214400 totalling 700.00MiB
2020-12-24 16:12:51.644707: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 29360128 totalling 28.00MiB
2020-12-24 16:12:51.644725: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 32505856 totalling 31.00MiB
2020-12-24 16:12:51.644743: I tensorflow/core/common_runtime/bfc_allocator.cc:921] Sum Total of in-use chunks: 64.00GiB
2020-12-24 16:12:51.644760: I tensorflow/core/common_runtime/bfc_allocator.cc:923] total_region_allocated_bytes_: 68719476736 memory_limit_: 68719476736 available bytes: 0 curr_region_allocation_bytes_: 68719476736
2020-12-24 16:12:51.644783: I tensorflow/core/common_runtime/bfc_allocator.cc:929] Stats:
Limit: 68719476736
InUse: 68714584576
MaxInUse: 68719357696
NumAllocs: 115809
MaxAllocSize: 33554432
2020-12-24 16:12:51.645379: W tensorflow/core/common_runtime/bfc_allocator.cc:424] ****************************************************************************************************
What is the speed of this implementation? We are mostly interested in finding a way to stitch camera images and had thoughts about transferring to bird's eye view for depth information. Our purpose is to implement this on our autonomous race car, and we will be running at high speeds, so traditional homography seems to have too many inaccuracies.
Your methods are of interest to us. However, given that this is on a race car, speed is of high importance. We are currently running our cameras anywhere from 25 to 40 Hz. In the paper, you mention 2 Hz. Is this a consequence of the speed of the network?
In short, what's the latency or maximum speed of the network?
Thank you for your excellent work! I have some questions to ask.
Thanks for your great work. I'd like to know how I can get the camera_configs for my own data. My own data is collected with CARLA. Can you give me some suggestions? Thanks a lot.
Hello!
I want to train the model (uNetXST) on the original images as input. I was wondering what changes need to be made in that case?
I assume I will have to change train.py so that no one-hot encoding is done on the input? In that case, what will the value of n_classes_input be? Just the number of image channels (3)?