Add sanity check for length of a PE resource type name #974

HoundThe · 2021-06-22T15:49:14Z

Aims to fix #959

Sometimes the type name pointer points to (probably?) junk data, which results in a string with absurd length and content. A simple sanity check on the string length could solve this problem.

As in example 097a0f8b8c3f2b90f5360f27da5ef8e5a7406c6f211c8d3e56a1671933508bc1. More details can be seen in the mentioned issue.

If this solution is "OK", I'll add a test case in regression test suite.

PeterMatula

This string length solution is OK. However, I would like to discuss few things - possibly propose other solutions, that could work together with the ultimate string length limit implemented as is:

Are these strings typically, or by definition, null-terminated? Would it make sense to add check that would break string loading if either simple null char, or wide null char, was encountered?
Are these strings typically readable? Would it be too strict if we added a check that would break parsing if a non-printable (ASCII) character was encountered?

But these are just things to discuss - rule out. In general, I like that we would not be too strict/clever and do it like LIEF.

HoundThe · 2021-07-09T19:33:48Z

I am afraid that the resource string has the following format (https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#resource-directory-string): 2 byte length at start, and then the variable length Unicode string, which is probably UTF 16 (I don't see it defined there to confirm)? So no null terminators and parsing this would be probably a bit complex, but maybe doable?

But I am thinking if it makes sense, to have wide null in a string even tho it is not null terminated. I am not 100 % sure about the variable length UTF 16 encoding and what it means to have 0x0000 🤔 Might be a usable heuristic.

PeterMatula · 2021-07-12T05:42:21Z

Well, in that case 100 chars limit it is ok:

Do not try to do "printable" chars heuristics, I don't think it makes sense for UTF 16.
Yes, 0x0000 should be terminating char for such strings, but because there is also string size, and LIEF is not using this heuristic, I'm afraid there might be a reason. So lets just keep it like this, and solve it only ad-hoc if it causes another problem.

Btw, tests are OK. They failed because tests for #973 were merged to master and master was not merged here. Maybe we should/could configure TC to tests the state of PR after it is merged to master (@s3rvac?) but that might have other disadvantages? So I don't know.

@HoundThe I'm merging this, please add a regression test(s).

s3rvac · 2021-07-13T05:03:42Z

Btw, tests are OK. They failed because tests for #973 were merged to master and master was not merged here. Maybe we should/could configure TC to tests the state of PR after it is merged to master (@s3rvac?) but that might have other disadvantages? So I don't know.

Yes, this is a possibility. There are the following options:

Build just the PR branch (this is the current state).
Build just the PR as if it was merged to master (this is suggested in the quote).
Build both the PR branch as well as the PR as if it was merged to master. This would result in two builds for every configuration on every PR.

If we choose to go with option 2 or 3, we will have to decide what to do in case there are merge conflicts (the build would fail in such case) and also consider proper changes to the handling of the retdec-regression-tests repo (e.g. when there is a branch having the same name in both the retdec and retdec-regression-tests repos, that branch is used for testing).

PeterMatula · 2021-07-13T07:58:54Z

@s3rvac I think, we are ultimately interested in state 2. Option 3 could be useful in some cases, but because builds on windows may take nearly an hour, it would often be too slow to work with. I think, merge conflicts are not a big problem - they need to be solved before merging to master anyway, therefore when they are solved, and sources presumably modified, tests need to run again anyway - i.e. they should be solved first and then we can run option 2 tests.
I created #985 to explore this.

Add sanity check for max resource string length

ace5f3a

HoundThe requested a review from PeterMatula June 22, 2021 15:51

Change code style to follow rest the existing code

9cd9397

PeterMatula reviewed Jul 9, 2021

View reviewed changes

PeterMatula merged commit 922745a into avast:master Jul 12, 2021

HoundThe mentioned this pull request Jul 12, 2021

Test for PE resource junk string check avast/retdec-regression-tests#97

Merged

PeterMatula added a commit that referenced this pull request Jul 12, 2021

CHANGELOG.md: add entry for #974 and #959

362b48a

PeterMatula mentioned this pull request Jul 13, 2021

CI: Build PR merged with master in TeamCity #985

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add sanity check for length of a PE resource type name #974

Add sanity check for length of a PE resource type name #974

HoundThe commented Jun 22, 2021

PeterMatula left a comment

HoundThe commented Jul 9, 2021 •

edited

Loading

PeterMatula commented Jul 12, 2021

s3rvac commented Jul 13, 2021

PeterMatula commented Jul 13, 2021

Add sanity check for length of a PE resource type name #974

Add sanity check for length of a PE resource type name #974

Conversation

HoundThe commented Jun 22, 2021

PeterMatula left a comment

Choose a reason for hiding this comment

HoundThe commented Jul 9, 2021 • edited Loading

PeterMatula commented Jul 12, 2021

s3rvac commented Jul 13, 2021

PeterMatula commented Jul 13, 2021

HoundThe commented Jul 9, 2021 •

edited

Loading