Skip to content

Commit

Permalink
New design, bug fixes
Browse files Browse the repository at this point in the history
**tldr: this change preserves the existing APIs but makes the library
faster and also able to recursively format structs and arrays.**

Also a new API: `redact.ManualBuffer`, which gives ownership of
swapping between safe and unsafe output to the caller. This is
envisioned to be useful in CockroachDB to help implement `tree.FmtCtx`.

Details:

We are really looking at a major redesign.

The previous design was attempting to preserve Go's own `fmt/print.go`
as much as possible, and was "injecting" special formatting behavior
by wrapping every printed value in a custom object whose type had a
`Format()` method. The idea was to give “ownership” of adding
redaction markers to each argument in turn. That worked for simple
values but there was no good way to make it able to print structs and
arrays recursively.

This new implementation customizes the `print.go` code more heavily.
In particular, the check for safe/unsafe types is now done at every
level of the recursive print. It also makes the code use a
`redact.ManualBuffer` as target during the print evaluation, which
saves string copies (hence a general performance boost).

The new supported cases can be understood by looking at the diffs
on the files `markers_test.go` and `internal/rfmt/registry_test.go`.

Note about the recursive display of struct values: the print logic
uses reflection to print out structs. However, the Go semantics
prevent a package from looking at non-exported fields in structs from
other packages. This means that the print logic is unable to recognize
the various safety interfaces for non-exported struct fields:
`SafeValue`, `SafeFormatter`, `FormatError`, etc. These are thus
considered unsafe regardless of the interfaces implemented.

The safe type registry and the special types `RedactableString` and
`RedactableBytes` are still recognized for non-exported structs
though.

Peformance improvement using the included `bench_test.go`:

```
name       old time/op    new time/op    delta
Redact-32    2.71µs ±22%    1.49µs ± 0%  -45.25%  (p=0.000 n=10+9)

name       old alloc/op   new alloc/op   delta
Redact-32      360B ± 0%      248B ± 0%  -31.11%  (p=0.000 n=10+10)

name       old allocs/op  new allocs/op  delta
Redact-32      12.0 ± 0%       5.0 ± 0%  -58.33%  (p=0.000 n=10+10)
```
  • Loading branch information
knz committed Jun 25, 2021
1 parent 2b6aa62 commit 3e96329
Show file tree
Hide file tree
Showing 36 changed files with 2,961 additions and 1,133 deletions.
163 changes: 100 additions & 63 deletions api.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,22 +14,22 @@

package redact

import "fmt"
import (
"fmt"

"github.com/cockroachdb/redact/builder"
i "github.com/cockroachdb/redact/interfaces"
b "github.com/cockroachdb/redact/internal/buffer"
fw "github.com/cockroachdb/redact/internal/fmtforward"
m "github.com/cockroachdb/redact/internal/markers"
w "github.com/cockroachdb/redact/internal/redact"
ifmt "github.com/cockroachdb/redact/internal/rfmt"
)

// SafeFormatter is implemented by object types that want to separate
// safe and non-safe information when printed out by a Printf-like
// formatter.
type SafeFormatter interface {
// SafeFormat is like the Format method of fmt.Formatter, except
// that it operates using a SafePrinter instead of a fmt.State for
// output.
//
// The verb argument is the control character that defines
// the formatting mode in the surrounding Printf call.
// For example, if this method is called to format %03d,
// the verb is 'd'.
SafeFormat(s SafePrinter, verb rune)
}
type SafeFormatter = i.SafeFormatter

// SafeValue is a marker interface to be implemented by types that
// alias base Go types and whose natural representation via Printf is
Expand All @@ -48,17 +48,13 @@ type SafeFormatter interface {
// collect all the types that implement this interface, as well as all
// uses of this type, and produce a report. Changes to this report
// should receive maximal amount of scrutiny during code reviews.
type SafeValue interface {
SafeValue()
}
type SafeValue = i.SafeValue

// SafeMessager is an alternative to SafeFormatter used in previous
// versions of CockroachDB.
// NB: this interface is obsolete. Use SafeFormatter instead.
// TODO(knz): Remove this.
type SafeMessager interface {
SafeMessage() string
}
type SafeMessager = i.SafeMessager

// SafePrinter is a stateful helper that abstracts an output stream in
// the context of printf-like formatting, but with the ability to
Expand All @@ -67,60 +63,101 @@ type SafeMessager interface {
// This package provides one implementation of this using marker
// delimiters for unsafe data, see markers.go. We would like to aim
// for alternate implementations to generate more structured formats.
type SafePrinter interface {
// SafePrinter inherits fmt.State to access format flags, however
// calls to fmt.State's underlying Write() as unsafe.
fmt.State

// SafePrinter provides the SafeWriter interface.
SafeWriter
}
type SafePrinter = i.SafePrinter

// SafeWriter provides helper functions for use in implementations of
// SafeFormatter, to format mixes of safe and unsafe strings.
type SafeWriter interface {
// SafeString emits a safe string.
SafeString(SafeString)

// SafeRune emits a safe rune.
SafeRune(SafeRune)

// Print emits its arguments separated by spaces.
// For each argument it dynamically checks for the SafeFormatter or
// SafeValue interface and either use that, or mark the argument
// payload as unsafe.
Print(args ...interface{})

// For printf, a linter checks that the format string is
// a constant literal, so the implementation can assume it's always
// safe.
Printf(format string, arg ...interface{})

// UnsafeString writes an unsafe string.
UnsafeString(string)

// UnsafeByte writes an unsafe byte.
UnsafeByte(byte)

// UnsafeBytes writes an unsafe byte slice.
UnsafeBytes([]byte)

// UnsafeRune writes an unsafe rune.
UnsafeRune(rune)
}
type SafeWriter = i.SafeWriter

// SafeString aliases string. This is not meant to be used directly;
// the type definition ensures that SafePrinter's SafeString method
// can only take constant string literals as arguments. Typically, a
// Go linter would ensure that ConstantString is never used to cast a
// value.
type SafeString string

// SafeValue makes SafeString a SafeValue.
func (SafeString) SafeValue() {}
type SafeString = i.SafeString

// SafeRune aliases rune. See the explanation for SafeString.
type SafeRune rune
type SafeRune = i.SafeRune

// RedactableString is a string that contains a mix of safe and unsafe
// bits of data, but where it is known that unsafe bits are enclosed
// by redaction markers ‹ and ›, and occurrences of the markers
// inside the original data items have been escaped.
//
// Instances of RedactableString should not be constructed directly;
// instead use the facilities from print.go (Sprint, Sprintf)
// or the methods below.
type RedactableString = m.RedactableString

// RedactableBytes is like RedactableString but is a byte slice.
//
// Instances of RedactableBytes should not be constructed directly;
// instead use the facilities from print.go (Sprint, Sprintf)
// or the methods below.
type RedactableBytes = m.RedactableBytes

// StartMarker returns the start delimiter for an unsafe string.
func StartMarker() []byte { return m.StartMarker() }

// EndMarker returns the end delimiter for an unsafe string.
func EndMarker() []byte { return m.EndMarker() }

// RedactedMarker returns the special string used by Redact.
func RedactedMarker() []byte { return m.RedactedMarker() }

// EscapeMarkers escapes the special delimiters from the provided
// byte slice.
func EscapeMarkers(s []byte) []byte { return m.EscapeMarkers(s) }

// EscapeBytes escapes markers inside the given byte slice and encloses
// the entire byte slice between redaction markers.
// EscapeBytes escapes markers inside the given byte slice and encloses
// the entire byte slice between redaction markers.
func EscapeBytes(s []byte) RedactableBytes { return ifmt.EscapeBytes(s) }

// ManualBuffer is a variable-sized buffer of bytes with
// Write methods. Writes are escaped in different ways
// depending on the output mode.
// Note: This type is only meant for "advanced" usage.
// For most common uses, consider using StringBuilder instead.
type ManualBuffer = b.Buffer

// Unsafe turns any value that would otherwise be considered safe,
// into an unsafe value.
func Unsafe(a interface{}) interface{} { return w.Unsafe(a) }

// Safe turns any value into an object that is considered as safe by
// the formatter.
//
// This is provided as an “escape hatch” for cases where the other
// interfaces and conventions fail. Increased usage of this mechanism
// should be taken as a signal that a new abstraction is missing.
// The implementation is also slow.
func Safe(a interface{}) SafeValue { return w.Safe(a) }

// RegisterRedactErrorFn registers an error redaction function for use
// during automatic redaction by this package.
// Provided e.g. by cockroachdb/errors.
func RegisterRedactErrorFn(fn func(err error, p i.SafePrinter, verb rune)) {
ifmt.RegisterRedactErrorFn(fn)
}

// MakeFormat is a helper for use by implementations of the
// SafeFormatter interface. It reproduces the format currently active
// in fmt.State and verb. This is provided because Go's standard
// fmt.State does not make the original format string available to us.
//
// If the return value justV is true, then the current state
// was found to be %v exactly; in that case the caller
// can avoid a full-blown Printf call and use just Print instead
// to take a shortcut.
func MakeFormat(s fmt.State, verb rune) (justV bool, format string) {
return fw.MakeFormat(s, verb)
}

// SafeValue makes SafeRune a SafeValue.
func (SafeRune) SafeValue() {}
// StringBuilder accumulates strings with optional redaction markers.
//
// It implements io.Writer but marks direct writes as redactable.
// To distinguish safe and unsafe bits, it also implements the SafeWriter
// interface.
type StringBuilder = builder.StringBuilder
26 changes: 26 additions & 0 deletions bench_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
// Copyright 2021 The Cockroach Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing
// permissions and limitations under the License.

package redact

import "testing"

func BenchmarkRedact(b *testing.B) {
x := struct {
a int
}{456}
for i := 0; i < b.N; i++ {
_ = Sprintf("hello %v %v %v", 123, x, Safe("world"), Unsafe("universe"))
}
}
91 changes: 30 additions & 61 deletions builder.go → builder/builder.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,13 +12,15 @@
// implied. See the License for the specific language governing
// permissions and limitations under the License.

package redact
package builder

import (
"bytes"
"fmt"
"io"
"unicode/utf8"

i "github.com/cockroachdb/redact/interfaces"
ib "github.com/cockroachdb/redact/internal/buffer"
ifmt "github.com/cockroachdb/redact/internal/rfmt"
)

// StringBuilder accumulates strings with optional redaction markers.
Expand All @@ -27,26 +29,13 @@ import (
// To distinguish safe and unsafe bits, it also implements the SafeWriter
// interface.
type StringBuilder struct {
// we use bytes.Buffer internally to simplify the implementation of
// the SafeWriter interface.
buf bytes.Buffer
ib.Buffer
}

// String returns the accumulated string, with redaction markers stripped out.
// To obtain the redactable string, call RedactableString().
func (b StringBuilder) String() string { return b.RedactableString().StripMarkers() }

var _ fmt.Stringer = StringBuilder{}
var _ fmt.Stringer = (*StringBuilder)(nil)

// RedactableString returns the accumulated string, including redaction markers.
func (b StringBuilder) RedactableString() RedactableString { return RedactableString(b.buf.String()) }

// RedactableBytes returns the accumulated bytes, including redaction markers.
func (b StringBuilder) RedactableBytes() RedactableBytes { return RedactableBytes(b.buf.Bytes()) }

// SafeFormat implements SafeFormatter.
func (b StringBuilder) SafeFormat(p SafePrinter, _ rune) {
func (b StringBuilder) SafeFormat(p i.SafePrinter, _ rune) {
// We only support the %v / %s natural print here.
// Go supports other formatting verbs for strings: %x/%X/%q.
//
Expand All @@ -64,86 +53,66 @@ func (b StringBuilder) SafeFormat(p SafePrinter, _ rune) {
p.Print(b.RedactableString())
}

var _ SafeFormatter = StringBuilder{}
var _ SafeFormatter = (*StringBuilder)(nil)

// Len returns the number of accumulated bytes, including redaction
// markers; b.Len() == len(b.RedactableString()).
func (b *StringBuilder) Len() int { return b.buf.Len() }

// Cap returns the capacity of the builder's underlying byte slice. It is the
// total space allocated for the string being built and includes any bytes
// already written.
func (b *StringBuilder) Cap() int { return b.buf.Cap() }

// Reset resets the Builder to be empty.
func (b *StringBuilder) Reset() { b.buf.Reset() }
var _ i.SafeFormatter = StringBuilder{}
var _ i.SafeFormatter = (*StringBuilder)(nil)

// StringBuilder implements io.Writer.
// Direct Write() calls are considered unsafe.
var _ io.Writer = (*StringBuilder)(nil)

// Write implements the io.Writer interface.
func (b *StringBuilder) Write(s []byte) (int, error) {
b.UnsafeBytes(s)
return len(s), nil
b.SetMode(ib.UnsafeEscaped)
return b.Buffer.Write(s)
}

// StringBuilder implements SafeWriter.
var _ SafeWriter = (*StringBuilder)(nil)
var _ i.SafeWriter = (*StringBuilder)(nil)

// Print is part of the SafeWriter interface.
func (b *StringBuilder) Print(args ...interface{}) {
_, _ = Fprint(&b.buf, args...)
b.SetMode(ib.PreRedactable)
_, _ = ifmt.Fprint(&b.Buffer, args...)
}

// Printf is part of the SafeWriter interface.
func (b *StringBuilder) Printf(format string, args ...interface{}) {
_, _ = Fprintf(&b.buf, format, args...)
b.SetMode(ib.PreRedactable)
_, _ = ifmt.Fprintf(&b.Buffer, format, args...)
}

// SafeString is part of the SafeWriter interface.
func (b *StringBuilder) SafeString(s SafeString) {
w := escapeWriter{w: &b.buf, enclose: false}
_, _ = w.Write([]byte(s))
func (b *StringBuilder) SafeString(s i.SafeString) {
b.SetMode(ib.SafeEscaped)
_, _ = b.Buffer.WriteString(string(s))
}

// SafeRune is part of the SafeWriter interface.
func (b *StringBuilder) SafeRune(s SafeRune) {
if s == startRedactable || s == endRedactable {
s = escapeMark
}
_, _ = b.buf.WriteRune(rune(s))
func (b *StringBuilder) SafeRune(s i.SafeRune) {
b.SetMode(ib.SafeEscaped)
_ = b.Buffer.WriteRune(rune(s))
}

// UnsafeString is part of the SafeWriter interface.
func (b *StringBuilder) UnsafeString(s string) {
w := escapeWriter{w: &b.buf, enclose: true, strip: true}
_, _ = w.Write([]byte(s))
b.SetMode(ib.UnsafeEscaped)
_, _ = b.Buffer.WriteString(s)
}

// UnsafeRune is part of the SafeWriter interface.
func (b *StringBuilder) UnsafeRune(s rune) {
_, _ = b.buf.WriteRune(startRedactable)
b.SafeRune(SafeRune(s))
_, _ = b.buf.WriteRune(endRedactable)
b.SetMode(ib.UnsafeEscaped)
_ = b.Buffer.WriteRune(s)
}

// UnsafeByte is part of the SafeWriter interface.
func (b *StringBuilder) UnsafeByte(s byte) {
_, _ = b.buf.WriteRune(startRedactable)
if s >= utf8.RuneSelf ||
s == startRedactableS[0] || s == endRedactableS[0] {
// Unsafe byte. Escape it.
_, _ = b.buf.Write(escapeBytes)
} else {
_ = b.buf.WriteByte(s)
}
_, _ = b.buf.WriteRune(endRedactable)
b.SetMode(ib.UnsafeEscaped)
_ = b.Buffer.WriteByte(s)
}

// UnsafeBytes is part of the SafeWriter interface.
func (b *StringBuilder) UnsafeBytes(s []byte) {
w := escapeWriter{w: &b.buf, enclose: true, strip: true}
_, _ = w.Write(s)
b.SetMode(ib.UnsafeEscaped)
_, _ = b.Buffer.Write(s)
}
Loading

0 comments on commit 3e96329

Please sign in to comment.