Decode headers as latin1/UTF-8, show real reason phrase #377

Open: wants to merge 1 commit into base: master
7 changes: 7 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -26,6 +26,7 @@ dirs = "5.0"
encoding_rs = "0.8.28"
encoding_rs_io = "0.1.7"
flate2 = "1.0.22"
hyper = { version = "1.2", default-features = false }
indicatif = "0.17"
jsonxf = "1.1.0"
memchr = "2.4.1"
@@ -42,6 +43,7 @@ serde = { version = "1.0", features = ["derive"] }
serde-transcode = "1.1.1"
serde_json = { version = "1.0", features = ["preserve_order"] }
serde_urlencoded = "0.7.0"
supports-hyperlinks = "3.0.0"
termcolor = "1.1.2"
time = "0.3.16"
unicode-width = "0.1.9"
21 changes: 21 additions & 0 deletions FAQ.md
@@ -0,0 +1,21 @@
<h3 name="header-value-encoding">Why do some HTTP headers show up mangled?</h3>

HTTP header values are officially only supposed to contain ASCII. Other bytes are "opaque data":

> Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [[ISO-8859-1](https://datatracker.ietf.org/doc/html/rfc7230#ref-ISO-8859-1)], supporting other charsets only through use of [[RFC2047](https://datatracker.ietf.org/doc/html/rfc2047)] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [[USASCII](https://datatracker.ietf.org/doc/html/rfc7230#ref-USASCII)]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.

([RFC 7230](https://datatracker.ietf.org/doc/html/rfc7230#section-3.2.4))

In practice, some headers are sometimes treated as UTF-8, which can represent every Unicode character. But if you access header values through a browser's `fetch()` API or view them in the developer tools, they tend to be decoded as ISO-8859-1, which maps each byte to one of only 256 characters and may not be the intended encoding.

xh as of version 0.23.0 shows the ISO-8859-1 decoding by default to avoid a confusing difference with web browsers. If the value looks like valid UTF-8 then it additionally shows the UTF-8 decoding.

For example, this request:
```console
xh -v https://example.org Smile:☺
```
displays the `Smile` header like this:
```
Smile: â�º (UTF-8: ☺)
```
The server will probably see `â�º` instead of the smiley. Or it might see `☺` after all. It depends!
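
As a rough sketch of the display rule described above (not taken from xh's source; all function names here are hypothetical), the ISO-8859-1 reading is always produced, and the UTF-8 reading is added only when the bytes are non-ASCII and valid UTF-8:

```rust
/// Decode raw header bytes as ISO-8859-1: each byte becomes the Unicode
/// code point with the same numeric value (U+0000 through U+00FF).
fn decode_latin1(raw: &[u8]) -> String {
    raw.iter().map(|&b| char::from(b)).collect()
}

/// Return the UTF-8 reading only when it would differ from the
/// ISO-8859-1 reading and the bytes are valid UTF-8.
fn decode_utf8_if_distinct(raw: &[u8]) -> Option<&str> {
    if raw.is_ascii() {
        // Pure ASCII reads the same under both encodings.
        return None;
    }
    std::str::from_utf8(raw).ok()
}

/// Hypothetical helper: format a header value the way the FAQ describes.
fn describe_header_value(raw: &[u8]) -> String {
    let latin1 = decode_latin1(raw);
    match decode_utf8_if_distinct(raw) {
        Some(utf8) => format!("{latin1} (UTF-8: {utf8})"),
        None => latin1,
    }
}
```

Given the three bytes of `☺` (`E2 98 BA`), `describe_header_value` returns the ISO-8859-1 reading followed by `(UTF-8: ☺)`. A lone `E9` byte is not valid UTF-8, so only the ISO-8859-1 reading (`é`) is returned.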
33 changes: 0 additions & 33 deletions assets/syntax/basic/http.sublime-syntax

This file was deleted.

13 changes: 12 additions & 1 deletion assets/themes/ansi.tmTheme
@@ -165,6 +165,17 @@
<string>#0C000000</string>
</dict>
</dict>
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
-</plist>
+</plist>
14 changes: 13 additions & 1 deletion assets/themes/fruity.tmTheme
@@ -163,6 +163,18 @@
<string>#CA000000</string>
</dict>
</dict>
<!-- FIXME: does this color fit the theme? -->
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
-</plist>
+</plist>
14 changes: 13 additions & 1 deletion assets/themes/monokai.tmTheme
@@ -185,6 +185,18 @@
<string>#C5000000</string>
</dict>
</dict>
<!-- FIXME: does this color fit the theme? -->
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
-</plist>
+</plist>
14 changes: 13 additions & 1 deletion assets/themes/solarized.tmTheme
@@ -174,6 +174,18 @@
<string>#21000000</string>
</dict>
</dict>
<!-- FIXME: does this color fit the theme? -->
<dict>
<key>name</key>
<string>Error</string>
<key>scope</key>
<string>error</string>
<key>settings</key>
<dict>
<key>foreground</key>
<string>#01000000</string>
</dict>
</dict>
</array>
</dict>
-</plist>
+</plist>
4 changes: 2 additions & 2 deletions src/buffer.rs
@@ -301,8 +301,8 @@ impl Buffer {
})
}

-pub fn print(&mut self, s: impl AsRef<[u8]>) -> io::Result<()> {
-self.write_all(s.as_ref())
+pub fn print(&mut self, s: &str) -> io::Result<()> {
+self.write_all(s.as_bytes())
}

pub fn guess_pretty(&self) -> Pretty {
4 changes: 4 additions & 0 deletions src/cli.rs
@@ -1070,6 +1070,10 @@ impl Theme {
Theme::Fruity => "fruity",
}
}

pub(crate) fn as_syntect_theme(&self) -> &'static syntect::highlighting::Theme {
&crate::formatting::THEMES.themes[self.as_str()]
}
}

#[derive(Debug, Clone, Copy)]
28 changes: 23 additions & 5 deletions src/formatting.rs
@@ -1,4 +1,7 @@
-use std::io::{self, Write};
+use std::{
+io::{self, Write},
+sync::OnceLock,
+};

use syntect::dumps::from_binary;
use syntect::easy::HighlightLines;
@@ -9,6 +12,9 @@ use termcolor::WriteColor;

use crate::{buffer::Buffer, cli::Theme};

pub(crate) mod headers;
pub(crate) mod palette;

pub fn get_json_formatter(indent_level: usize) -> jsonxf::Formatter {
let mut fmt = jsonxf::Formatter::pretty_printer();
fmt.indent = " ".repeat(indent_level);
@@ -30,7 +36,7 @@ pub fn serde_json_format(indent_level: usize, text: &str, write: impl Write) ->
Ok(())
}

-static TS: once_cell::sync::Lazy<ThemeSet> = once_cell::sync::Lazy::new(|| {
+pub(crate) static THEMES: once_cell::sync::Lazy<ThemeSet> = once_cell::sync::Lazy::new(|| {
from_binary(include_bytes!(concat!(
env!("OUT_DIR"),
"/themepack.themedump"
@@ -53,14 +59,14 @@ pub struct Highlighter<'a> {
impl<'a> Highlighter<'a> {
pub fn new(syntax: &'static str, theme: Theme, out: &'a mut Buffer) -> Self {
let syntax_set: &SyntaxSet = match syntax {
-"json" | "http" => &PS_BASIC,
+"json" => &PS_BASIC,
_ => &PS_LARGE,
};
let syntax = syntax_set
.find_syntax_by_extension(syntax)
.expect("syntax not found");
Self {
-highlighter: HighlightLines::new(syntax, &TS.themes[theme.as_str()]),
+highlighter: HighlightLines::new(syntax, theme.as_syntect_theme()),
syntax_set,
out,
}
@@ -103,7 +109,9 @@ fn convert_style(style: syntect::highlighting::Style) -> termcolor::ColorSpec {
use syntect::highlighting::FontStyle;
let mut spec = termcolor::ColorSpec::new();
spec.set_fg(convert_color(style.foreground))
-.set_underline(style.font_style.contains(FontStyle::UNDERLINE));
+.set_underline(style.font_style.contains(FontStyle::UNDERLINE))
+.set_bold(style.font_style.contains(FontStyle::BOLD))
+.set_italic(style.font_style.contains(FontStyle::ITALIC));
spec
}

@@ -142,3 +150,13 @@ fn convert_color(color: syntect::highlighting::Color) -> Option<termcolor::Color
Some(Color::Rgb(color.r, color.g, color.b))
}
}

pub(crate) fn supports_hyperlinks() -> bool {
static SUPPORTS_HYPERLINKS: OnceLock<bool> = OnceLock::new();
*SUPPORTS_HYPERLINKS.get_or_init(supports_hyperlinks::supports_hyperlinks)
}

pub(crate) fn create_hyperlink(text: &str, url: &str) -> String {
// https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
format!("\x1B]8;;{url}\x1B\\{text}\x1B]8;;\x1B\\")
}
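
A usage sketch for the hyperlink helper above, under stated assumptions: `create_hyperlink` is copied from the diff, while `link_or_plain` and its `hyperlinks_enabled` parameter are hypothetical. In the PR itself the flag would presumably come from `supports_hyperlinks()`; it is passed in here so the example has no external dependencies.

```rust
/// Copied from the diff: wrap `text` in an OSC 8 terminal hyperlink
/// pointing at `url`. The sequence is
/// ESC ] 8 ; ; URL ESC \ TEXT ESC ] 8 ; ; ESC \
fn create_hyperlink(text: &str, url: &str) -> String {
    format!("\x1B]8;;{url}\x1B\\{text}\x1B]8;;\x1B\\")
}

/// Hypothetical helper: emit a hyperlink only when the terminal is
/// believed to support it, otherwise fall back to the plain text.
fn link_or_plain(text: &str, url: &str, hyperlinks_enabled: bool) -> String {
    if hyperlinks_enabled {
        create_hyperlink(text, url)
    } else {
        text.to_owned()
    }
}
```

Falling back to plain text matters because terminals that do not understand OSC 8 may print parts of the escape sequence literally.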