Add support for entity-matching in domain= filter option
Related issue:
- uBlockOrigin/uBlock-issues#1008

This commit adds support for entity-matching in the filter
option `domain=`. Example:

    pattern$domain=google.*

The `*` above is meant to match any suffix from the Public
Suffix List. The semantics are exactly the same as those of
the already existing entity-matching support in static
extended filtering:

- https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#entity
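
To illustrate the entity-matching semantics described above, here is a minimal sketch. It is not the code from this commit: `SUFFIXES` is a tiny hypothetical stand-in for the Public Suffix List, and `matchesEntity` is an illustrative name.

```javascript
// Hypothetical, tiny stand-in for the Public Suffix List.
const SUFFIXES = new Set([ 'com', 'co.uk', 'de', 'fr' ]);

// Returns true if `hostname`, or one of its parent domains, matches
// `entity`, where `entity` ends with '.*' (e.g. 'google.*').
const matchesEntity = (hostname, entity) => {
    if ( entity.endsWith('.*') === false ) { return false; }
    const prefix = entity.slice(0, -2);
    const labels = hostname.split('.');
    // Try every split point: the right part must be a known public
    // suffix, and the left part must end with `prefix` at a label boundary.
    for ( let i = 1; i < labels.length; i++ ) {
        const left = labels.slice(0, i).join('.');
        const right = labels.slice(i).join('.');
        if ( SUFFIXES.has(right) === false ) { continue; }
        if ( left === prefix || left.endsWith('.' + prefix) ) { return true; }
    }
    return false;
};
```

Under these assumptions, `google.*` matches `www.google.com` and `google.co.uk`, but not `google.evil.com` or `notgoogle.com`.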

Additionally, in this commit:

Fix cases where "just-origin" filters of the form `|http*://`
were erroneously normalized to `|http://`. The proper
normalization of `|http*://` is `*`.
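
The following sketch shows why `|http://` is the wrong normalization. It uses a hypothetical, minimal pattern-to-RegExp conversion (`patternToRegex` is not a uBlock Origin function) which handles only a leading `|` anchor and `*` wildcards.

```javascript
// Hypothetical helper: converts a simplified filter pattern into a RegExp.
// Only a leading '|' (start anchor) and '*' (wildcard) are supported.
const patternToRegex = pattern => {
    const anchored = pattern.startsWith('|');
    const body = (anchored ? pattern.slice(1) : pattern)
        .replace(/[.+?^${}()|[\]\\]/g, '\\$&')  // escape RegExp specials, except '*'
        .replace(/\*/g, '.*');                  // '*' matches any run of characters
    return new RegExp((anchored ? '^' : '') + body);
};

const original = patternToRegex('|http*://');
const wrong = patternToRegex('|http://');
const proper = patternToRegex('*');
```

In this sketch, `original` and `proper` both match `https://example.com/`, while `wrong` matches only `http://` URLs, so normalizing to `|http://` would silently drop every HTTPS request from the filter's scope.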

Add support for storing hostname strings in the character
buffer of a hntrie container. As of commit time, there are
5,544 instances of FilterOriginHit and 732 instances of
FilterOriginMiss, filters which each require storing/matching
a single hostname string. Those strings are now stored in the
character buffer of the already existing origin-related
hntrie container. (The same approach is used for plain
patterns which are not part of a bidi-trie.)
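
The idea of keeping single hostnames in a shared byte buffer can be sketched as follows. This is a simplified, hypothetical model (`HostnameStore` is not part of uBlock Origin): it assumes ASCII/punycode hostnames and leaves out the 255-character limit and the slot bookkeeping of the real container.

```javascript
// Simplified, hypothetical stand-in for a trie container's character buffer.
class HostnameStore {
    constructor(size = 1024) {
        this.buf = new Uint8Array(size);
        this.end = 0;   // index of the next free byte
    }
    // Append the hostname's character codes; return the offset where it starts.
    store(hn) {
        const n = hn.length;
        if ( this.end + n > this.buf.length ) {
            const grown = new Uint8Array(Math.max(this.buf.length * 2, this.end + n));
            grown.set(this.buf);
            this.buf = grown;
        }
        const offset = this.end;
        for ( let i = 0; i < n; i++ ) {
            this.buf[offset + i] = hn.charCodeAt(i);
        }
        this.end = offset + n;
        return offset;
    }
    // Decode `n` bytes starting at `offset` back into a string.
    extract(offset, n) {
        return new TextDecoder().decode(this.buf.subarray(offset, offset + n));
    }
    // True if `hn` equals the stored hostname or is a subdomain of it.
    matches(hn, offset, n) {
        const start = hn.length - n;
        if ( start < 0 ) { return false; }
        for ( let j = 0; j < n; j++ ) {
            if ( this.buf[offset + j] !== hn.charCodeAt(start + j) ) { return false; }
        }
        // Require a label boundary so 'badexample.com' does not match 'example.com'.
        return start === 0 || hn.charCodeAt(start - 1) === 0x2E /* '.' */;
    }
}
```

A filter then only needs to keep two integers (offset and length) instead of a JavaScript string, which is presumably where the benefit comes from when thousands of such filters exist.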
gorhill committed May 24, 2020
1 parent 56a3aff commit 3c67d2b
Showing 3 changed files with 376 additions and 207 deletions.
4 changes: 2 additions & 2 deletions src/js/background.js
@@ -138,8 +138,8 @@ const µBlock = (( ) => { // jshint ignore:line

     // Read-only
     systemSettings: {
-        compiledMagic: 27, // Increase when compiled format changes
-        selfieMagic: 26, // Increase when selfie format changes
+        compiledMagic: 28, // Increase when compiled format changes
+        selfieMagic: 28, // Increase when selfie format changes
     },

     // https://github.com/uBlockOrigin/uBlock-issues/issues/759#issuecomment-546654501
43 changes: 43 additions & 0 deletions src/js/hntrie.js
@@ -407,6 +407,49 @@ const HNTrieContainer = class {
         return true;
     }

+    // The following *Hostname() methods can be used to store hostname strings
+    // outside the trie. This is useful to store/match hostnames which are
+    // not part of a collection, and yet still benefit from storing the strings
+    // into a trie container's character buffer.
+    // TODO: WASM version of matchesHostname()
+
+    storeHostname(hn) {
+        let n = hn.length;
+        if ( n > 255 ) {
+            hn = hn.slice(-255);
+            n = 255;
+        }
+        if ( (this.buf.length - this.buf32[CHAR1_SLOT]) < n ) {
+            this.growBuf(0, n);
+        }
+        const offset = this.buf32[CHAR1_SLOT];
+        this.buf32[CHAR1_SLOT] = offset + n;
+        const buf8 = this.buf;
+        for ( let i = 0; i < n; i++ ) {
+            buf8[offset+i] = hn.charCodeAt(i);
+        }
+        return offset - this.buf32[CHAR0_SLOT];
+    }
+
+    extractHostname(i, n) {
+        const textDecoder = new TextDecoder();
+        const offset = this.buf32[CHAR0_SLOT] + i;
+        return textDecoder.decode(this.buf.subarray(offset, offset + n));
+    }
+
+    matchesHostname(hn, i, n) {
+        this.setNeedle(hn);
+        const buf8 = this.buf;
+        const hr = buf8[255];
+        if ( n > hr ) { return false; }
+        const hl = hr - n;
+        const nl = this.buf32[CHAR0_SLOT] + i;
+        for ( let j = 0; j < n; j++ ) {
+            if ( buf8[nl+j] !== buf8[hl+j] ) { return false; }
+        }
+        return n === hr || hn.charCodeAt(hl-1) === 0x2E /* '.' */;
+    }
+
     async enableWASM() {
         if ( typeof WebAssembly !== 'object' ) { return false; }
         if ( this.wasmMemory instanceof WebAssembly.Memory ) { return true; }