Official implementation of "Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth (Boosting Monocular Depth Estimation with Sparse Guided Points)"
Dear authors,
According to Eq. 2 in the paper, the input to the linear regression is the predicted depth. In your implementation, however, I found that the predicted depth is first refined with a global linear regression, and that result then serves as the input to the local linear regression. Could you please explain the reason for this difference?
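To make the question concrete, here is a minimal NumPy sketch of the two orderings as I understand them. It is not taken from the repository: `fit_scale_shift`, `gaussian_weights`, the Gaussian kernel, the bandwidth, and the synthetic data are all my own assumptions for illustration, and the fit is evaluated only at the sparse guide points for a single query location.

```python
import numpy as np


def fit_scale_shift(pred, target, weights=None):
    """Weighted least-squares fit of s, t so that s * pred + t ~= target.

    Hypothetical helper, not the repository's function.
    """
    if weights is None:
        weights = np.ones_like(pred)
    sw = np.sqrt(weights)
    # Scale both sides by sqrt(w) so ordinary lstsq solves the weighted problem.
    A = np.stack([pred, np.ones_like(pred)], axis=1) * sw[:, None]
    (s, t), *_ = np.linalg.lstsq(A, target * sw, rcond=None)
    return s, t


def gaussian_weights(positions, query, bandwidth):
    """Gaussian kernel weights of sparse points relative to one query pixel (assumed kernel)."""
    d2 = np.sum((positions - query) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


# Synthetic sparse guide points (made-up data, for illustration only).
rng = np.random.default_rng(0)
positions = rng.uniform(0, 100, size=(50, 2))           # pixel coordinates
pred = rng.uniform(1, 10, size=50)                      # predicted depth at those pixels
guide = 2.0 * pred + 1.0 + rng.normal(0, 0.1, size=50)  # sparse guided depth

query = np.array([50.0, 50.0])
w = gaussian_weights(positions, query, bandwidth=30.0)

# Ordering A (my reading of Eq. 2): local fit directly on the predicted depth.
s_a, t_a = fit_scale_shift(pred, guide, w)
depth_a = s_a * pred + t_a

# Ordering B (my reading of the code): global fit first, then local fit on its output.
s_g, t_g = fit_scale_shift(pred, guide)
pred_global = s_g * pred + t_g
s_b, t_b = fit_scale_shift(pred_global, guide, w)
depth_b = s_b * pred_global + t_b
```

In this unregularized sketch the two orderings give the same aligned depth up to floating-point error, because an affine pre-transform of the input does not change what a scale-and-shift fit can represent. If the results differ in your implementation, I assume it comes from details not captured here (for example regularization or clipping), and that is the part I would like to understand.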