Decode headers as latin1/UTF-8, show real reason phrase
External changes:

- We now print the actual reason phrase sent by the server instead
  of guessing it from the status code. That is, if servers reply with
  "200 Wonderful" instead of "200 OK" then we show that. This is
  especially useful for status codes that xh doesn't recognize.

- Header values are now decoded as latin1, with the UTF-8 decoding
  also shown if applicable.

- A new FAQ file with an entry that explains header value encoding.
  Header output now hyperlinks to this entry when relevant and if
  supported by the terminal.
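
As a rough illustration of the header value decoding described above (a sketch only, not xh's actual code: `decode_latin1` and `display_header_value` are hypothetical names, and unlike real output this makes no attempt to sanitize control characters):

```rust
/// Decode bytes as latin1 (ISO-8859-1): each byte maps to the Unicode
/// code point with the same numeric value, so this can never fail.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Hypothetical helper: render a header value as `<latin1 value>`, or as
/// `<latin1 value> (UTF-8: <utf-8 value>)` when the bytes are non-ASCII
/// and also happen to be valid UTF-8.
fn display_header_value(bytes: &[u8]) -> String {
    let latin1 = decode_latin1(bytes);
    if bytes.is_ascii() {
        // Pure ASCII decodes identically either way, so show it once.
        return latin1;
    }
    match std::str::from_utf8(bytes) {
        Ok(utf8) => format!("{latin1} (UTF-8: {utf8})"),
        // Not valid UTF-8: only the latin1 decoding is meaningful.
        Err(_) => latin1,
    }
}
```

Calling `display_header_value("☺".as_bytes())` yields the latin1 decoding of the three bytes `E2 98 BA` followed by the `(UTF-8: ☺)` suffix, while a plain ASCII value such as `b"text/html"` comes back unchanged.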

Under the hood we now color headers manually. It's still hooked up to
the `.tmTheme` files but not to the `.sublime-syntax` file. This lets
us highlight the latin1 header values differently. In the future we
could use the same approach to optimize JSON highlighting.

I'm unsure about the position of the hyperlink. Currently it's the
text "UTF-8" in `<latin1 value> (UTF-8: <utf-8 value>)`. But that
means it's only shown if the value can be decoded as UTF-8. An
alternative is to turn the latin1 value itself into a hyperlink, but
that's confusing if the value itself is already a URL (which is a
common case for the `Location` header).

I also don't feel that our text is quite distinct enough from the
header value in the default `ansi` theme. Though the hyperlink does
help to set it apart.
blyxxyz committed Jul 4, 2024
1 parent 2c7eaf9 commit 00bc6f2
Showing 17 changed files with 641 additions and 166 deletions.
7 changes: 7 additions & 0 deletions Cargo.lock

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -26,6 +26,7 @@ dirs = "5.0"
encoding_rs = "0.8.28"
encoding_rs_io = "0.1.7"
flate2 = "1.0.22"
hyper = { version = "1.2", default-features = false }
indicatif = "0.17"
jsonxf = "1.1.0"
memchr = "2.4.1"
@@ -42,6 +43,7 @@ serde = { version = "1.0", features = ["derive"] }
serde-transcode = "1.1.1"
serde_json = { version = "1.0", features = ["preserve_order"] }
serde_urlencoded = "0.7.0"
supports-hyperlinks = "3.0.0"
termcolor = "1.1.2"
time = "0.3.16"
unicode-width = "0.1.9"
21 changes: 21 additions & 0 deletions FAQ.md
@@ -0,0 +1,21 @@
<h3 name="header-value-encoding">Why do some HTTP headers show up mangled?</h3>

HTTP header values are officially only supposed to contain ASCII. Other bytes are "opaque data":

> Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [[ISO-8859-1](https://datatracker.ietf.org/doc/html/rfc7230#ref-ISO-8859-1)], supporting other charsets only through use of [[RFC2047](https://datatracker.ietf.org/doc/html/rfc2047)] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [[USASCII](https://datatracker.ietf.org/doc/html/rfc7230#ref-USASCII)]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.
([RFC 7230](https://datatracker.ietf.org/doc/html/rfc7230#section-3.2.4))

In practice some headers are for some purposes treated like UTF-8, which supports all languages and characters in Unicode. But if you try to access header values through a browser's `fetch()` API or view them in the developer tools then they tend to be decoded as ISO-8859-1, which only supports a very limited number of characters and may not be the actual intended encoding.

xh as of version 0.23.0 shows the ISO-8859-1 decoding by default to avoid a confusing difference with web browsers. If the value looks like valid UTF-8 then it additionally shows the UTF-8 decoding.

That is, the following request:
```console
xh -v https://example.org Smile:☺
```
Displays the `Smile` header like this:
```
Smile: â�º (UTF-8: ☺)
```
The server will probably see `â�º` instead of the smiley. Or it might see `☺` after all. It depends!
33 changes: 0 additions & 33 deletions assets/syntax/basic/http.sublime-syntax

This file was deleted.

13 changes: 12 additions & 1 deletion assets/themes/ansi.tmTheme
@@ -165,6 +165,17 @@
<string>#0C000000</string>
</dict>
</dict>
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
</plist>
</plist>
14 changes: 13 additions & 1 deletion assets/themes/fruity.tmTheme
@@ -163,6 +163,18 @@
<string>#CA000000</string>
</dict>
</dict>
<!-- FIXME: does this color fit the theme? -->
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
</plist>
</plist>
14 changes: 13 additions & 1 deletion assets/themes/monokai.tmTheme
@@ -185,6 +185,18 @@
<string>#C5000000</string>
</dict>
</dict>
<!-- FIXME: does this color fit the theme? -->
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
</plist>
</plist>
14 changes: 13 additions & 1 deletion assets/themes/solarized.tmTheme
@@ -174,6 +174,18 @@
<string>#21000000</string>
</dict>
</dict>
<!-- FIXME: does this color fit the theme? -->
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
</plist>
</plist>
4 changes: 2 additions & 2 deletions src/buffer.rs
@@ -301,8 +301,8 @@ impl Buffer {
})
}

pub fn print(&mut self, s: impl AsRef<[u8]>) -> io::Result<()> {
self.write_all(s.as_ref())
pub fn print(&mut self, s: &str) -> io::Result<()> {
self.write_all(s.as_bytes())
}

pub fn guess_pretty(&self) -> Pretty {
4 changes: 4 additions & 0 deletions src/cli.rs
@@ -1070,6 +1070,10 @@ impl Theme {
Theme::Fruity => "fruity",
}
}

    pub(crate) fn as_syntect_theme(&self) -> &'static syntect::highlighting::Theme {
        &crate::formatting::THEMES.themes[self.as_str()]
    }
}

#[derive(Debug, Clone, Copy)]
28 changes: 23 additions & 5 deletions src/formatting.rs
@@ -1,4 +1,7 @@
use std::io::{self, Write};
use std::{
    io::{self, Write},
    sync::OnceLock,
};

use syntect::dumps::from_binary;
use syntect::easy::HighlightLines;
@@ -9,6 +12,9 @@ use termcolor::WriteColor;

use crate::{buffer::Buffer, cli::Theme};

pub(crate) mod headers;
pub(crate) mod palette;

pub fn get_json_formatter(indent_level: usize) -> jsonxf::Formatter {
let mut fmt = jsonxf::Formatter::pretty_printer();
fmt.indent = " ".repeat(indent_level);
@@ -30,7 +36,7 @@ pub fn serde_json_format(indent_level: usize, text: &str, write: impl Write) ->
Ok(())
}

static TS: once_cell::sync::Lazy<ThemeSet> = once_cell::sync::Lazy::new(|| {
pub(crate) static THEMES: once_cell::sync::Lazy<ThemeSet> = once_cell::sync::Lazy::new(|| {
from_binary(include_bytes!(concat!(
env!("OUT_DIR"),
"/themepack.themedump"
@@ -53,14 +59,14 @@ pub struct Highlighter<'a> {
impl<'a> Highlighter<'a> {
pub fn new(syntax: &'static str, theme: Theme, out: &'a mut Buffer) -> Self {
let syntax_set: &SyntaxSet = match syntax {
"json" | "http" => &PS_BASIC,
"json" => &PS_BASIC,
_ => &PS_LARGE,
};
let syntax = syntax_set
.find_syntax_by_extension(syntax)
.expect("syntax not found");
Self {
highlighter: HighlightLines::new(syntax, &TS.themes[theme.as_str()]),
highlighter: HighlightLines::new(syntax, theme.as_syntect_theme()),
syntax_set,
out,
}
@@ -103,7 +109,9 @@ fn convert_style(style: syntect::highlighting::Style) -> termcolor::ColorSpec {
use syntect::highlighting::FontStyle;
let mut spec = termcolor::ColorSpec::new();
spec.set_fg(convert_color(style.foreground))
.set_underline(style.font_style.contains(FontStyle::UNDERLINE));
.set_underline(style.font_style.contains(FontStyle::UNDERLINE))
.set_bold(style.font_style.contains(FontStyle::BOLD))
.set_italic(style.font_style.contains(FontStyle::ITALIC));
spec
}

@@ -142,3 +150,13 @@ fn convert_color(color: syntect::highlighting::Color) -> Option<termcolor::Color
Some(Color::Rgb(color.r, color.g, color.b))
}
}

pub(crate) fn supports_hyperlinks() -> bool {
    static SUPPORTS_HYPERLINKS: OnceLock<bool> = OnceLock::new();
    *SUPPORTS_HYPERLINKS.get_or_init(supports_hyperlinks::supports_hyperlinks)
}

pub(crate) fn create_hyperlink(text: &str, url: &str) -> String {
    // https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
    format!("\x1B]8;;{url}\x1B\\{text}\x1B]8;;\x1B\\")
}
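
A small sketch of how `create_hyperlink` might be combined with a fallback for terminals without hyperlink support. `link_if_supported` and the example URL are assumptions made for illustration. In the commit the capability check comes from `supports_hyperlinks()`; here it is replaced by a plain `bool` parameter so the example needs only the standard library.

```rust
/// Wrap `text` in an OSC 8 escape sequence so that supporting terminals
/// render it as a hyperlink to `url` (same format string as in the diff).
fn create_hyperlink(text: &str, url: &str) -> String {
    format!("\x1B]8;;{url}\x1B\\{text}\x1B]8;;\x1B\\")
}

/// Hypothetical helper: emit a hyperlink only when the terminal is known
/// to support it, otherwise fall back to the plain text.
fn link_if_supported(text: &str, url: &str, supported: bool) -> String {
    if supported {
        create_hyperlink(text, url)
    } else {
        text.to_owned()
    }
}
```

For example, `link_if_supported("UTF-8", "https://example.org/faq", false)` returns just `UTF-8`, so output piped to a file or shown in an unsupported terminal stays free of escape sequences.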