Add warp_perspective operator #5542
Signed-off-by: Rafal Banas <[email protected]>
eacdc59 to a0f9683
.AddOptionalArg<float>("matrix", | ||
R"doc( | ||
Perspective transform mapping of destination to source coordinates. | ||
If `inverse_map` argument is set to true, the matrix is interpreted |
If `inverse_map` argument is set to true, the matrix is interpreted | |
If `inverse_map` argument is set to false, the matrix is interpreted |
?
At least that's what OpenCV's documentation says.
cv2_warp_perspective( | ||
dst[f, :, :, :], | ||
img[f, :, :, :], | ||
matrix, | ||
layout, | ||
border_mode, | ||
interp_type, | ||
inverse_map, | ||
fill_value, | ||
) |
I have objections against using OpenCV as a reference. It has a different notion of pixel centers than we do. If we match OpenCV, it's actually a bad thing. Instead, when the perspective matrix is actually an affine transform matrix, WarpPerspective should match WarpAffine.
matrix = AcquireTensorArgument(ws, scratchpad, matrix_arg_, TensorShape<1>(9), | ||
nvcvop::GetDataType<float>(), "W"); |
I'm pretty sure we need to apply a fixup to the matrix to match WarpAffine. We can add OpenCV compatibility (here and in WarpAffine) as an option, but I think being self-consistent is far better than randomly matching a patchwork of common libraries.
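One way such a fixup could look, as a minimal sketch rather than the actual DALI code: it assumes one convention places pixel centers at half-integer coordinates (pixel `i` centered at `i + 0.5`) and the other at integer coordinates, which is an assumption not spelled out in this thread. The helper names `translation`, `to_integer_center_convention` and `apply` are hypothetical.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(t):
    """Homogeneous 3x3 matrix that shifts both x and y by t."""
    return [[1.0, 0.0, t], [0.0, 1.0, t], [0.0, 0.0, 1.0]]


def to_integer_center_convention(m_half):
    """Hypothetical fixup, assuming m_half maps destination to source
    coordinates with pixel centers at i + 0.5. A coordinate x in the
    integer-center convention corresponds to x + 0.5 in the half-center
    one, so the converted matrix is T(-0.5) * M * T(+0.5)."""
    return matmul3(translation(-0.5), matmul3(m_half, translation(0.5)))


def apply(m, x, y):
    """Apply a 3x3 perspective matrix to a point (with perspective divide)."""
    xs, ys, w = (row[0] * x + row[1] * y + row[2] for row in m)
    return xs / w, ys / w


# A 2x scale in the half-center convention maps center 0.5 to 1.0.
scale2 = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
fixed = to_integer_center_convention(scale2)
# In the integer-center convention, the same pixel (center 0) maps to 0.5.
print(apply(fixed, 0.0, 0.0))
```

Because the same conjugation works for any 3x3 matrix, an affine matrix embedded in a perspective matrix would get the same offset as in an affine-only path, which is the self-consistency argued for above.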
|
||
DALI_SCHEMA(experimental__WarpPerspective) | ||
.DocStr(R"doc( | ||
TODO |
Should probably be gone before merging. It'd be nice to have a description of what this operator actually does.
@@ -264,7 +264,7 @@ if (BUILD_CVCUDA) | |||
set(DALI_BUILD_PYTHON ${BUILD_PYTHON}) | |||
set(BUILD_PYTHON OFF) | |||
# for now we use only median blur from CV-CUDA | |||
set(CV_CUDA_SRC_PATERN medianblur median_blur morphology) | |||
set(CV_CUDA_SRC_PATERN medianblur median_blur morphology warp) |
Nitpick: I know it's not you but this should be PATTERN
@@ -0,0 +1,372 @@ | |||
# Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
We do a lot of parameter handling with tons of bug-prone ifs. I think we need to have some negative tests to check if we handle invalid argument combinations properly.
"Transformation matrix data. Should be used to pass the GPU data. " | ||
"For CPU data, the `matrix` argument should be used.") |
The `_data` suffix and the word "data" in the description give me the impression that this is somehow a lower-level thing than the `matrix` argument.
"Transformation matrix data. Should be used to pass the GPU data. " | |
"For CPU data, the `matrix` argument should be used.") | |
"Like ``matrix`` keyword argument, but accepts matrices placed in GPU memory.") |
case the matrix can be placed on the GPU.)doc", | ||
std::vector<float>({}), true, true) | ||
.AddOptionalArg("border_mode", | ||
"Border mode to be used when accessing elements outside input image.", |
I'd enumerate the supported values here. It's a string, so I guess the user will have no link to any enumeration.
.AddOptionalArg("border_mode", | ||
"Border mode to be used when accessing elements outside input image.", | ||
"constant") | ||
.AddOptionalArg("interp_type", "Interpolation method.", "nearest") |
Same here.
"Value used to fill areas that are outside the source image when the " | ||
"\"constant\" border_mode is chosen.", | ||
std::vector<float>({})) | ||
.AddOptionalArg<bool>("inverse_map", "Inverse perspective transform matrix", false); |
.AddOptionalArg<bool>("inverse_map", "Inverse perspective transform matrix", false); | |
.AddOptionalArg<bool>("inverse_map", "If set to false (default), the ``matrix`` is interpreted as...", false); |
if (channels > 0) { | ||
if (channels == static_cast<int>(fill_value_arg_.size())) { | ||
float4 fill_value{0, 0, 0, 0}; | ||
memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float)); |
paranoid nitpick (due to narrowing cast in the channels check):
memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float)); | |
memcpy(&fill_value, fill_value_arg_.data(), sizeof(float4)); |
memcpy(&fill_value, fill_value_arg_.data(), fill_value_arg_.size() * sizeof(float)); | |
memcpy(&fill_value, fill_value_arg_.data(), sizeof(decltype(fill_value))); |
auto width = std::max<int>(std::roundf(shape_arg_[1]), 1); | ||
auto out_sample_shape = (channels != -1) ? TensorShape<>({height, width, channels}) : | ||
TensorShape<>({height, width}); | ||
output_shape = TensorListShape<>::make_uniform(input.num_samples(), out_sample_shape); |
Isn't the `size` argument marked per-sample?
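To illustrate the concern, here is a rough sketch of computing one output shape per sample instead of a single uniform shape. It mirrors the rounding and clamping from the snippet above, but the function names and the `(height, width)` ordering of each size entry are assumptions for illustration, not the operator's actual code.

```python
import math


def round_half_away(x):
    """Round like C's roundf: halves go away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def output_shapes(sizes, channels):
    """Hypothetical per-sample shape computation.

    sizes    -- one (height, width) pair per sample (assumed ordering)
    channels -- number of channels, or -1 if there is no channel dimension
    """
    shapes = []
    for h, w in sizes:
        # Clamp to at least 1 pixel, as in the reviewed snippet.
        height = max(round_half_away(h), 1)
        width = max(round_half_away(w), 1)
        if channels != -1:
            shapes.append((height, width, channels))
        else:
            shapes.append((height, width))
    return shapes


print(output_shapes([(2.5, 0.2), (480, 640)], 3))
```

With a per-sample `size`, the result is a list of possibly different shapes, so a uniform shape list would only be correct when every sample requests the same size.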
for (int d = shape.size() - 2; d >= 0; --d) { | ||
inBuf.strides[d] = shape[d + 1] * inBuf.strides[d + 1]; | ||
} | ||
TensorLayout out_layout = layout.empty() ? tensor.GetLayout() : layout; |
Should we validate that the layout, when it is non-empty, has a size that matches the number of dims?
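A check along these lines might look like the sketch below. It is only an illustration of the suggested validation: `resolve_layout` is a hypothetical helper, and layouts are modeled as plain strings such as `"HWC"`.

```python
def resolve_layout(explicit_layout, tensor_layout, ndim):
    """Hypothetical helper: pick the explicit layout if given, otherwise
    fall back to the tensor's own layout, then validate its length."""
    layout = explicit_layout if explicit_layout else tensor_layout
    # An empty layout carries no information, so only non-empty ones are checked.
    if layout and len(layout) != ndim:
        raise ValueError(
            f"Layout '{layout}' has {len(layout)} dimensions, "
            f"but the tensor has {ndim}."
        )
    return layout


print(resolve_layout("", "HWC", 3))
```

Failing early with a clear message would avoid computing strides from a layout that does not describe the tensor.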
Category:
New feature
Description:
Adds a new experimental.warp_perspective operator that uses the CV-CUDA operator as its implementation.
Additional information:
Affected modules and functionalities:
New operator
Key points relevant for the review:
Correctness of the handling of the parameters
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A