Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sarif: initial implementation of csdiff fingerprints #168

Merged
merged 11 commits into from
May 3, 2024

Conversation

kdudka
Copy link
Member

@kdudka kdudka commented Mar 29, 2024

csdiff/v0 fingerprints

It hashes the data that csdiff uses in its matching algorithm.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

csdiff/v1 fingerprints

It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces. For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by csgrep --embed-context.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: #98

@kdudka kdudka self-assigned this Mar 29, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Mar 29, 2024
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: csutils#98
Closes: csutils#168
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
@kdudka kdudka force-pushed the sarif-fingerprints branch 2 times, most recently from 53621d0 to 05db189 Compare April 1, 2024 13:55
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
@kdudka kdudka marked this pull request as ready for review April 1, 2024 14:12
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 9, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 19, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 23, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 23, 2024
@jamacku
Copy link
Member

jamacku commented Apr 26, 2024

Great example when codeql fingerprints work:

✋ Defects, NEEDS INSPECTION
Error: SHELLCHECK_WARNING:
./src/functions.sh:309:46: warning[SC2154]: GITHUB_REF is referenced but not assigned.
#  306|   uploadSARIF () {
#  307|     is_debug && local verbose=--verbose
#  308|   
#  309|->   echo '{"commit_oid":"'"${HEAD}"'","ref":"'"${GITHUB_REF//merge/head}"'","analysis_key":"differential-shellcheck","sarif":"'"$(gzip -c output.sarif | base64 -w0)"'","tool_names":["differential-shellcheck"]}' > payload.json
#  310|   
#  311|     local curl_args=(
#  312|       "${verbose:---silent}"

Error: SHELLCHECK_WARNING:
./src/functions.sh:331:24: info[SC2312]: Consider invoking this command separately to avoid masking its return value (or use '|| true' to ignore).
#  328|   
#  329|   get_shellcheck_version () {
#  330|     local shellcheck_version
#  331|->   shellcheck_version=$(shellcheck --version | grep -w "version:" | cut -s -d ' ' -f 2)
#  332|   
#  333|     echo "${shellcheck_version}"
#  334|   }

Error: SHELLCHECK_WARNING:
./src/functions.sh:331:47: info[SC2312]: Consider invoking this command separately to avoid masking its return value (or use '|| true' to ignore).
#  328|   
#  329|   get_shellcheck_version () {
#  330|     local shellcheck_version
#  331|->   shellcheck_version=$(shellcheck --version | grep -w "version:" | cut -s -d ' ' -f 2)
#  332|   
#  333|     echo "${shellcheck_version}"
#  334|   }

The code hasn't changed for the last two defects, but they were still reported by csdiff.

I'll try to check if csdiff fingerprints would give the same results as codeql.

@jamacku
Copy link
Member

jamacku commented Apr 26, 2024

Here is the csgrep output from before the change and after:

You should be able to reproduce the above situation form:

kdudka pushed a commit to kdudka/csdiff that referenced this pull request Apr 26, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
@kdudka
Copy link
Member Author

kdudka commented Apr 26, 2024

@jamacku Nice! I have added it as a regression test in 95e35d3. It passes with ca7f32a but not without it.

Copy link
Member

@jamacku jamacku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kdudka

@jamacku jamacku requested a review from lzaoral April 26, 2024 16:24
src/csgrep.cc Outdated Show resolved Hide resolved
src/csgrep.cc Outdated Show resolved Hide resolved
src/lib/finger-print.cc Show resolved Hide resolved
src/lib/finger-print.cc Outdated Show resolved Hide resolved
src/lib/finger-print.cc Show resolved Hide resolved
src/lib/finger-print.hh Outdated Show resolved Hide resolved
src/lib/hash-util.hh Outdated Show resolved Hide resolved
src/lib/writer-json-sarif.cc Outdated Show resolved Hide resolved
kdudka pushed a commit to kdudka/csdiff that referenced this pull request May 2, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
@kdudka kdudka requested a review from lzaoral May 2, 2024 15:46
@lzaoral
Copy link
Member

lzaoral commented May 2, 2024

Somehow, the comment I've just written is not here. Reposting:

Copy link
Member

@lzaoral lzaoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up to you if you want to use a move or copy constructor for the elements of the map. Otherwise, LGTM.

jamacku added a commit to jamacku/differential-shellcheck that referenced this pull request May 3, 2024
This is preparation for Fingerprints that are coming in next version of csdiff.

csutils/csdiff#168

> It hashes the data that csdiff uses in its matching algorithm
> and the line content without spaces. For this fingerprint to
> be computed, the results need to include the line content for
> the key event in the format produced by `csgrep --embed-context`.
jamacku added a commit to redhat-plumbers-in-action/differential-shellcheck that referenced this pull request May 3, 2024
This is preparation for Fingerprints that are coming in next version of csdiff.

csutils/csdiff#168

> It hashes the data that csdiff uses in its matching algorithm
> and the line content without spaces. For this fingerprint to
> be computed, the results need to include the line content for
> the key event in the format produced by `csgrep --embed-context`.
kdudka added 9 commits May 3, 2024 12:52
It hashes the data that csdiff uses in its matching algorithm.
The interfaces are already prepared for csdiff/v1, which will
also take the line content into account when available.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

Related: https://issues.redhat.com/browse/OSH-9
Related: csutils#98
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: csutils#98
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
... when the line content in the `csgrep --embed-context` format
is available.

Out of the 1783 fingerprints generated for the csgrep regression tests
we got 115 collisions, which need to be analyzed.  Some of them look
undesired as, for example:
```
Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:256: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 254|             release_header (hdr);
 # 255|             hdr->name = (void *)name;
 # 256|->           hdr->value = (void *)value;
 # 257|             hdr->release_policy = release_policy;
 # 258|             return;

Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:271: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 269|     hdr = &req->headers[req->hcount++];
 # 270|     hdr->name = (void *)name;
 # 271|->   hdr->value = (void *)value;
 # 272|     hdr->release_policy = release_policy;
 # 273|   }
```

Related: csutils#98
No change in behavior intended with this commit.

Related: csutils#98
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka pushed a commit to kdudka/csdiff that referenced this pull request May 3, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
jamacku and others added 2 commits May 3, 2024 13:04
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
We use boost-1.69 in our EPEL-7 builds.

Related: csutils#98
Closes: csutils#168
@kdudka
Copy link
Member Author

kdudka commented May 3, 2024

@jamacku @lzaoral Thank you both for review!

@kdudka kdudka merged commit 4ae879f into csutils:main May 3, 2024
33 checks passed
@kdudka kdudka deleted the sarif-fingerprints branch May 3, 2024 11:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SARIF: Please add support for fingerprints in SARIF format 🐾
3 participants