Add warp_perspective operator #5542

Open
wants to merge 1 commit into base: main

banasraf
Collaborator

Category:

New feature

Description:

Adds a new experimental.warp_perspective operator that uses the CV-CUDA operator as its implementation.

Additional information:

Affected modules and functionalities:

New operator

Key points relevant for the review:

Correctness of the handling of the parameters

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@banasraf force-pushed the add-warp-perspective-operator branch from eacdc59 to a0f9683 on June 28, 2024 10:57

elif dtype == np.int16 or dtype == np.uint16:
    eps = 5

test_utils.compare_pipelines(pipe1, pipe2, batch_size=bs, N_iterations=10, eps=eps)

Check failure

Code scanning / CodeQL

Potentially uninitialized local variable Error

Local variable 'eps' may be used before it is initialized.
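
One way to silence this warning is to give `eps` a value on every path. The sketch below is only an illustration: the helper name `tolerance_for` and the fallback value of 1 are assumptions, and only the 16-bit integer tolerance of 5 comes from the snippet above.

```python
import numpy as np


def tolerance_for(dtype):
    """Pick a comparison tolerance for a dtype (hypothetical helper)."""
    dtype = np.dtype(dtype)
    # Assumed fallback so that eps is always initialized.
    eps = 1
    if dtype == np.int16 or dtype == np.uint16:
        # Value taken from the test snippet under review.
        eps = 5
    return eps
```

An alternative with the same effect would be a final `else` branch that raises on unsupported dtypes, which makes an unexpected type fail loudly instead of using a default.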
.AddOptionalArg<float>("matrix",
R"doc(
Perspective transform mapping of destination to source coordinates.
If `inverse_map` argument is set to true, the matrix is interpreted
Contributor

@mzient mzient Jul 1, 2024

Suggested change
- If `inverse_map` argument is set to true, the matrix is interpreted
+ If `inverse_map` argument is set to false, the matrix is interpreted

?
At least that's what OpenCV's documentation says.
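
For reference, OpenCV's `warpPerspective` treats the matrix as a source-to-destination transform and inverts it internally, unless `WARP_INVERSE_MAP` is set, in which case the matrix is used directly as the destination-to-source lookup. Below is a minimal NumPy sketch of that convention. The helper name `lookup_matrix` is hypothetical, and whether this operator should follow the same convention is the open question here.

```python
import numpy as np


def lookup_matrix(matrix, inverse_map):
    """Return the destination-to-source matrix used for sampling.

    Follows the OpenCV convention: with inverse_map=True the matrix is
    already destination-to-source; otherwise it is source-to-destination
    and has to be inverted first.
    """
    m = np.asarray(matrix, dtype=np.float64).reshape(3, 3)
    return m if inverse_map else np.linalg.inv(m)
```

For example, a forward shift of 3 pixels in x becomes a lookup that maps destination point (3, 0) back to source point (0, 0).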

Comment on lines +94 to +103
cv2_warp_perspective(
dst[f, :, :, :],
img[f, :, :, :],
matrix,
layout,
border_mode,
interp_type,
inverse_map,
fill_value,
)
Contributor

I have objections against using OpenCV as a reference. It has a different notion of pixel centers than we do. If we match OpenCV, it's actually a bad thing. Instead, when the perspective matrix is actually an affine transform matrix, WarpPerspective should match WarpAffine.
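
The consistency requirement can be stated on coordinates alone: an affine 2x3 matrix extended with a `[0, 0, 1]` row is a perspective matrix whose homogeneous divisor is always 1, so both transforms must map every point identically. Here is a minimal NumPy sketch of that check. All helper names are hypothetical, and it compares coordinate mappings only, not the actual WarpAffine and WarpPerspective outputs.

```python
import numpy as np


def affine_to_perspective(affine):
    """Extend a 2x3 affine matrix to the equivalent 3x3 perspective matrix."""
    affine = np.asarray(affine, dtype=np.float64)
    if affine.shape != (2, 3):
        raise ValueError("expected a 2x3 affine matrix")
    return np.vstack([affine, [0.0, 0.0, 1.0]])


def apply_affine(affine, points):
    """Map Nx2 points with a 2x3 affine matrix."""
    affine = np.asarray(affine, dtype=np.float64)
    points = np.asarray(points, dtype=np.float64)
    return points @ affine[:, :2].T + affine[:, 2]


def apply_perspective(matrix, points):
    """Map Nx2 points with a 3x3 matrix, including the homogeneous divide."""
    matrix = np.asarray(matrix, dtype=np.float64)
    points = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ matrix.T
    return mapped[:, :2] / mapped[:, 2:3]
```

A test along these lines could feed the same affine matrix to both operators and compare the resulting images instead of comparing against OpenCV.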

Comment on lines +179 to +180
matrix = AcquireTensorArgument(ws, scratchpad, matrix_arg_, TensorShape<1>(9),
nvcvop::GetDataType<float>(), "W");
Contributor

@mzient mzient Jul 2, 2024

I'm pretty sure we need to apply a fixup to the matrix to match WarpAffine. We can add OpenCV compatibility (here and in WarpAffine) as an option, but I think being self-consistent is far better than randomly matching a patchwork of common libraries.
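
A sketch of what such a fixup could look like, under two assumptions that would need to be confirmed against the code: the user-facing matrix is expressed in coordinates where pixel centers sit at half-integers (origin at the corner of the top-left pixel), and the backend expects OpenCV-style coordinates where pixel centers sit at integers. The helper names are hypothetical.

```python
import numpy as np


def translation(tx, ty):
    """3x3 homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])


def to_center_origin(matrix, offset=0.5):
    """Re-express a corner-origin matrix in center-origin coordinates.

    A center-origin coordinate c corresponds to the corner-origin
    coordinate c + offset, so the input is shifted by +offset before the
    transform and the result is shifted back by -offset afterwards.
    """
    m = np.asarray(matrix, dtype=np.float64).reshape(3, 3)
    return translation(-offset, -offset) @ m @ translation(offset, offset)
```

The identity and pure translations are unchanged by this conjugation, but scaling and rotation are not: a 2x scale, for instance, picks up a 0.5-pixel shift.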


DALI_SCHEMA(experimental__WarpPerspective)
.DocStr(R"doc(
TODO
Collaborator

Should probably be gone before merging. It'd be nice to have a description of what this operator actually does.

@@ -264,7 +264,7 @@ if (BUILD_CVCUDA)
set(DALI_BUILD_PYTHON ${BUILD_PYTHON})
set(BUILD_PYTHON OFF)
# for now we use only median blur from CV-CUDA
- set(CV_CUDA_SRC_PATERN medianblur median_blur morphology)
+ set(CV_CUDA_SRC_PATERN medianblur median_blur morphology warp)
Collaborator

Nitpick: I know it's not you but this should be PATTERN

@@ -0,0 +1,372 @@
# Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Collaborator

We do a lot of parameter handling with tons of bug-prone ifs. I think we need to have some negative tests to check if we handle invalid argument combinations properly.
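
A rough shape for such negative tests, shown against a stand-in validator rather than the real operator. `check_fill_value` and `raises_value_error` are hypothetical, and the rule that `fill_value` must be empty or have one entry per channel is an assumption based on the channel-count check in this PR. In the actual test suite the same pattern would wrap a pipeline build and run.

```python
def check_fill_value(fill_value, channels):
    """Hypothetical stand-in for the operator's fill_value validation."""
    if len(fill_value) not in (0, channels):
        raise ValueError(
            f"fill_value has {len(fill_value)} elements, "
            f"expected 0 or {channels}"
        )


def raises_value_error(fn, *args):
    """Return True if calling fn(*args) raises ValueError."""
    try:
        fn(*args)
    except ValueError:
        return True
    return False


# Each entry is one invalid argument combination that must be rejected.
INVALID_CASES = [
    ([1.0, 2.0], 3),
    ([1.0, 2.0, 3.0, 4.0], 3),
]
```

Keeping the invalid combinations in a table makes it cheap to add a case whenever a new argument check is introduced.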

Comment on lines +41 to +42
"Transformation matrix data. Should be used to pass the GPU data. "
"For CPU data, the `matrix` argument should be used.")
Member

The `_data` suffix and the word "data" in the description give me the impression that this is somehow a lower-level thing than the `matrix` argument.

Suggested change
- "Transformation matrix data. Should be used to pass the GPU data. "
- "For CPU data, the `matrix` argument should be used.")
+ "Like ``matrix`` keyword argument, but accepts matrices placed in GPU memory.")

case the matrix can be placed on the GPU.)doc",
std::vector<float>({}), true, true)
.AddOptionalArg("border_mode",
"Border mode to be used when accessing elements outside input image.",
Member

I'd enumerate the supported values here. It's a string, so I guess the user will have no link to an enumeration.

.AddOptionalArg("border_mode",
"Border mode to be used when accessing elements outside input image.",
"constant")
.AddOptionalArg("interp_type", "Interpolation method.", "nearest")
Member

Same here.

"Value used to fill areas that are outside the source image when the "
"\"constant\" border_mode is chosen.",
std::vector<float>({}))
.AddOptionalArg<bool>("inverse_map", "Inverse perspective transform matrix", false);
Member

Suggested change
- .AddOptionalArg<bool>("inverse_map", "Inverse perspective transform matrix", false);
+ .AddOptionalArg<bool>("inverse_map", "If set to false (default), the ``matrix`` is interpreted as...", false);

if (channels > 0) {
if (channels == static_cast<int>(fill_value_arg_.size())) {
float4 fill_value{0, 0, 0, 0};
memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float));
Member

paranoid nitpick (due to narrowing cast in the channels check):

Suggested change
- memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float));
+ memcpy(&fill_value, fill_value_arg_.data(), sizeof(float4));
Suggested change
- memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float));
+ memcpy(&fill_value, fill_value_arg_.data(), sizeof(decltype(fill_value)));

auto width = std::max<int>(std::roundf(shape_arg_[1]), 1);
auto out_sample_shape = (channels != -1) ? TensorShape<>({height, width, channels}) :
TensorShape<>({height, width});
output_shape = TensorListShape<>::make_uniform(input.num_samples(), out_sample_shape);
Member

Isn't the size argument marked per-sample?
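
If the size is indeed per-sample, the output shape would have to be built sample by sample instead of with `make_uniform`. Here is a plain-Python sketch of that logic. The function names are hypothetical, the (height, width) order of the size pair is assumed from `shape_arg_[1]` being the width, and `round_half_away` is meant to mimic `std::roundf`, since Python's built-in `round` rounds halves to even.

```python
import math


def round_half_away(x):
    """Round to nearest integer, halves away from zero (like std::roundf)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def per_sample_output_shapes(sizes, channels):
    """Build one output shape per sample from per-sample (height, width).

    channels == -1 means the data has no channel dimension.
    """
    shapes = []
    for size_h, size_w in sizes:
        # Clamp to at least 1, as in the snippet under review.
        height = max(round_half_away(size_h), 1)
        width = max(round_half_away(size_w), 1)
        if channels != -1:
            shapes.append((height, width, channels))
        else:
            shapes.append((height, width))
    return shapes
```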

for (int d = shape.size() - 2; d >= 0; --d) {
inBuf.strides[d] = shape[d + 1] * inBuf.strides[d + 1];
}
TensorLayout out_layout = layout.empty() ? tensor.GetLayout() : layout;
Member

Should we validate that a non-empty layout has as many entries as the tensor has dims?
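
The check being asked about could look roughly like this, written in Python for brevity. `resolve_layout` is a hypothetical name, and layouts are modelled as plain strings with one character per dimension.

```python
def resolve_layout(layout, tensor_layout, ndim):
    """Pick the explicit layout if given, else the tensor's own layout.

    A non-empty result must name exactly ndim dimensions.
    """
    out_layout = layout if layout else tensor_layout
    if out_layout and len(out_layout) != ndim:
        raise ValueError(
            f"layout '{out_layout}' has {len(out_layout)} dims, "
            f"but the tensor has {ndim}"
        )
    return out_layout
```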
