SIP001: Header obfuscating #26

Closed
madeye opened this issue Dec 13, 2016 · 78 comments

Comments

@madeye
Contributor

madeye commented Dec 13, 2016

Shadowsocks Improvement Proposal 001

SIP001 - Allow header obfuscating to cheat on QoS.

Recently, the QoS policies of some ISPs have become unreasonable. A cheap way to work around this is header obfuscation, which inserts fake headers before the shadowsocks handshake packets.

For example, before a shadowsocks request, we insert this HTTP POST header:

    POST / HTTP/1.1\r\n
    Host: www.baidu.com:8388\r\n
    User-Agent: curl/7.45.1\r\n
    Accept: */*\r\n
    Content-Type: application/octet-stream\r\n
    Content-Length: 176\r\n
    \r\n

Similarly, we insert this HTTP header before a shadowsocks response.

    HTTP/1.1 200 OK\r\n
    Server: nginx/1.0.2\r\n
    Date: Tue, 13 Dec 2016 13:25:12 GMT\r\n
    Content-Type: application/octet-stream\r\n
    Content-Length: 176\r\n
    Connection: keep-alive\r\n
    Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform\r\n
    Pragma: no-cache\r\n
    \r\n

With this SIP, we may be able to cheat most QoS mechanisms, avoiding QoS-related packet drops and bandwidth limits.
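The framing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the obfs branch: the function names `build_request_header`, `obfuscate` and `deobfuscate` are hypothetical, and the header fields are copied from the example request.

```python
CRLF = b"\r\n"
HEADER_END = CRLF + CRLF  # the blank line that terminates an HTTP header


def build_request_header(host: str, port: int, body_len: int) -> bytes:
    """Build the fake POST header shown in the proposal (hypothetical helper)."""
    lines = [
        "POST / HTTP/1.1",
        f"Host: {host}:{port}",
        "User-Agent: curl/7.45.1",
        "Accept: */*",
        "Content-Type: application/octet-stream",
        f"Content-Length: {body_len}",
    ]
    return CRLF.join(line.encode("ascii") for line in lines) + HEADER_END


def obfuscate(first_packet: bytes, host: str, port: int) -> bytes:
    """Prepend the fake header to the first, already encrypted, packet."""
    return build_request_header(host, port, len(first_packet)) + first_packet


def deobfuscate(data: bytes) -> bytes:
    """Drop everything up to and including the first blank line."""
    end = data.find(HEADER_END)
    if end == -1:
        raise ValueError("incomplete header")
    return data[end + len(HEADER_END):]
```

Because the header always comes first, searching for the first blank line is enough even if the encrypted payload happens to contain the same byte sequence.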

A demonstration can be found here: https://github.com/shadowsocks/shadowsocks-libev/tree/obfs

Any suggestion is welcome.

@Mygod
Contributor

Mygod commented Dec 13, 2016

  1. This feature is optional and configurable right?
  2. Why does it use \r\n instead of \n?
  3. May I suggest using POST and adding Content-Length to the request, since we need to post data to the server?
  4. Content-Type: text/html and Content-Encoding: gzip don't match the content the server returns, which would look suspicious. How about using application/octet-stream and removing Content-Encoding (which means anything is valid)?

@madeye
Contributor Author

madeye commented Dec 13, 2016

  1. Yes, it would introduce additional identifiable features into the traffic. We may refine the implementation to make it closer to real HTTP traffic.
  2. That could be a problem. For now, we should warn users about the risk and keep this feature disabled by default.
  3. Do you mean we should fake the header like a CDN header?

@madeye
Contributor Author

madeye commented Dec 13, 2016

@Mygod

  1. Right, optional and configurable.
  2. According to the RFC, it should be \r\n. Correct me if I'm wrong.

       HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all
       protocol elements except the entity-body (see appendix 19.3 for
       tolerant applications). The end-of-line marker within an entity-body
       is defined by its associated media type, as described in section 3.7.

           CRLF           = CR LF

  3. Yes, it looks like a good idea.
  4. Ditto.

@nekolab

nekolab commented Dec 13, 2016

I suggest letting users define the request and response headers themselves instead of using a fixed template.

@Mygod
Contributor

Mygod commented Dec 13, 2016 via email

@nekolab

nekolab commented Dec 13, 2016

Fine. Another question is whether this is a connection-level header or a conversation-level header.

A connection-level header appears only when the TCP connection is established; after that it is never sent again. A conversation-level header appears throughout the TCP stream: every call to send adds a fake header to the stream.

Neither POST nor GET in HTTP can semantically represent a connection-level header, because after one send-recv round an ordinary HTTP client either closes the TCP connection or holds it for another HTTP request (with another header), whereas our TCP connection keeps sending and receiving data.

I'm not familiar with the libev version of SS, but after a quick look I believe this implementation uses a connection-level header. Correct me if I'm wrong.

A conversation-level header may look more like an ordinary HTTP client that uses POST and reuses the connection, but will it decrease performance, make finding and removing the fake headers more complex, or add even more identifiable characteristics to the protocol?
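To make the distinction concrete, here is a minimal Python sketch of a connection-level stripper. The class name `ConnectionLevelStripper` is hypothetical, and this is not taken from shadowsocks-libev. It removes a single fake header at the start of the stream, even when the header arrives split across several reads, and passes everything after it through untouched.

```python
class ConnectionLevelStripper:
    """Strip one fake header at the start of a stream, then pass data through."""

    HEADER_END = b"\r\n\r\n"

    def __init__(self) -> None:
        self._buffer = b""
        self._header_done = False

    def feed(self, chunk: bytes) -> bytes:
        """Return the payload bytes contained in chunk (possibly empty)."""
        if self._header_done:
            # After the handshake, nothing is stripped any more.
            return chunk
        self._buffer += chunk
        end = self._buffer.find(self.HEADER_END)
        if end == -1:
            # Header still incomplete: wait for more data.
            return b""
        payload = self._buffer[end + len(self.HEADER_END):]
        self._buffer = b""
        self._header_done = True
        return payload
```

A conversation-level variant would instead have to frame every message (for example by honoring Content-Length) and look for a header before each one, which is where the extra cost and complexity mentioned above would come from.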

@v3aqb

v3aqb commented Dec 13, 2016

How about using a WebSocket header?

@Mygod
Contributor

Mygod commented Dec 14, 2016

Hmm. Maybe we can support both HTTP mode and WebSocket mode?

@madeye
Contributor Author

madeye commented Dec 14, 2016

WebSocket looks like a great idea. It helps to avoid the conversation-level headers mentioned by @nekolab.

I'm not a big fan of fully customized headers, which may introduce illegal usage of this feature.

@nekolab

nekolab commented Dec 14, 2016

We may run some tests to confirm whether the WebSocket header can actually cheat QoS. I'm not entirely sure, since it's a newer protocol and may be ignored by QoS. If it works, I vote yes.

@ayanamist

I don't think a WebSocket header will cheat QoS, since the cheat that has been proven to work seems to be very badly implemented.

SSR with simple_http has been proven to cheat QoS successfully on Hangzhou Telecom. SSR with simple_http uses the GET method with a request body, which is definitely a malformed HTTP request.

Do you plan to move some data, such as the IV, from the request body to the request path as SSR does? That would make the request URL differ from request to request, which I think would make detection harder.

@ayanamist

@wongsyrone I don't understand what you said. If a request is invalid, it can't pass shadowsocks' existing verification mechanism, so where would a correct response come from?
In fact, I think it will decrease the risk of exposing the server side, since the server can behave like a normal HTTP server.

@madeye
Contributor Author

madeye commented Dec 14, 2016

Updated with WebSocket obfuscation via shadowsocks/shadowsocks-libev@6176903

Request:

    GET / HTTP/1.1\r\n
    Host: www.baidu.com:8388\r\n
    User-Agent: curl/7.18.1\r\n
    Upgrade: websocket\r\n
    Connection: Upgrade\r\n
    Sec-WebSocket-Key: XVOfcm44bdPb0+xNrmf4tg==\r\n
    \r\n

Response:

    HTTP/1.1 101 Switching Protocols\r\n
    Server: nginx/1.2.2\r\n
    Date: Wed, 14 Dec 2016 13:42:07 GMT\r\n
    Upgrade: websocket\r\n
    Connection: Upgrade\r\n
    Sec-WebSocket-Accept: byeMGrcAr+bKUtt+i2Thaw==\r\n
    \r\n

Basically, it's still HTTP GET obfuscation. However, the WebSocket protocol makes the whole traffic stream look more normal.
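For reference, a genuine WebSocket server derives Sec-WebSocket-Accept from the client's Sec-WebSocket-Key as defined in RFC 6455. The sketch below shows that derivation in Python. The function names `make_key` and `accept_for` are hypothetical, and the thread does not say whether the obfs branch computes the value this way.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_key() -> str:
    """Generate a random Sec-WebSocket-Key: 16 random bytes, Base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def accept_for(key: str) -> str:
    """Compute Sec-WebSocket-Accept: Base64 of SHA-1 over key + GUID."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

A value derived this way is 28 characters long, because SHA-1 yields 20 bytes. An observer that validates the handshake could tell a correctly derived value from a made-up one, so matching the RFC may be worth the small cost.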

@Mygod
Contributor

Mygod commented Dec 14, 2016

> illegal usage of this feature.

Hmm I thought that was the point of this feature.

@simonsmh

@wongsyrone That's why it should be disabled by default if necessary.
@Mygod Another project, shadowsocksr, can ban such IPs/domains for illegal usage on the server side. That's not the major issue.

@madeye
Contributor Author

madeye commented Dec 14, 2016

Actually, I don't think we need to worry about adding new features.

The soul of shadowsocks is to solve a stupid problem (you know what I mean) with as little effort as possible. If a small change works well, we just add it. If not, we drop it.

As an optional protocol extension, even if this proposal introduces new problems, we can continue to refine it or just drop it.

As the next step, I suggest doing more tests in real environments to see what happens.

@v2ray

v2ray commented Dec 14, 2016

Assuming the proposal applies to TCP connections only, this feature is equivalent to a customized HTTP proxy (say, ShadowHTTP).

The only difference is that ShadowHTTP transfers only encrypted content, whereas a normal HTTP proxy allows both plain and encrypted payloads. The HTTP method may be different but is configurable (as discussed above). Going further, ShadowHTTP may gain the ability to proxy out (or deny) invalid requests in order to avoid detection and probing. That is one step closer to being a normal HTTP proxy.

An HTTP proxy is fine, but it doesn't fit the needs of a SOCKS proxy. If your end goal is to cover UDP or provide other types of obfuscation, I would suggest making the design more fundamental and extensible, to accommodate potential growth in the future.

@ayanamist

@v2ray No, it is not an HTTP proxy but a SOCKS proxy obfuscated as an HTTP proxy, which definitely fits the needs of a SOCKS proxy.

@madeye
Contributor Author

madeye commented Dec 14, 2016

@v2ray The proposal here is header obfuscation, and the goal is to find a cheap way to cheat on QoS. In other words, it only does some simple obfuscation; there is no plan to implement the full HTTP protocol.

@pexcn

pexcn commented Dec 14, 2016

Good idea.

@librehat
Contributor

Will it be optionally enabled on the client side and the server side separately? (I.e., as a server, if I receive an obfuscated request, am I allowed to respond with a non-obfuscated response?) Or would it be similar to OTA, where an obfuscated request guarantees that the response is also obfuscated?

@v3aqb

v3aqb commented Dec 14, 2016

With a URI like this?

ss://method:password@hostname:port/?obfs=http[&hostname=www.baidu.com]

or

ss://method:password@hostname:port/?obfs=http[&header=BASE64-ENCODED-HEADER-DATA]
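A client could read the first URI form with the standard library alone. The sketch below is a hedged illustration in Python: the function name `parse_ss_uri` and the returned key names are hypothetical, and it assumes the plain `method:password@` form shown above rather than any encoded variant.

```python
from urllib.parse import parse_qs, urlsplit


def parse_ss_uri(uri: str) -> dict:
    """Split an ss:// URI into connection settings and obfs options."""
    parts = urlsplit(uri)
    if parts.scheme != "ss":
        raise ValueError("not an ss:// URI")
    query = parse_qs(parts.query)
    return {
        "method": parts.username,
        "password": parts.password,
        "server": parts.hostname,
        "server_port": parts.port,
        # Both options are optional; None means "not given".
        "obfs": query.get("obfs", [None])[0],
        "obfs_host": query.get("hostname", [None])[0],
    }
```

For example, `parse_ss_uri("ss://aes-256-cfb:secret@example.com:8388/?obfs=http&hostname=www.baidu.com")` yields `obfs` set to `http` and `obfs_host` set to `www.baidu.com`. Passwords containing reserved characters would need percent-decoding, which this sketch leaves out.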

@madeye
Contributor Author

madeye commented Dec 15, 2016

@librehat Right, it's totally optional. Both client and server should enable the same obfuscation. On the server side, when obfuscation is enabled, the server can still handle the normal protocol without obfuscation. So:

-------------------------------------------
| Client-Obfs |   Server-Obfs  |  Working |
-------------------------------------------
| Yes         |   Yes          |  Yes     |
| Yes         |   No           |  No      |
| No          |   Yes          |  Yes     |
| No          |   No           |  Yes     |
-------------------------------------------
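The table reduces to a single rule: the only failing combination is an obfuscating client talking to a server without obfuscation support. As a small Python sketch (the function name `connection_works` is hypothetical):

```python
def connection_works(client_obfs: bool, server_obfs: bool) -> bool:
    """Return True when the client/server combination in the table works.

    An obfs-enabled server still accepts plain clients, so the only
    failing case is an obfuscating client with a plain server.
    """
    return server_obfs or not client_obfs
```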

@v3aqb The first one looks better. As the hostname should be ASCII, no need to do base64 encoding.

@librehat
Contributor

librehat commented Dec 15, 2016

@madeye Actually, I don't think the server needs to be able to disable obfs if it supports it, since it should be fully backward compatible. We don't have to add one more server-side config option each time a new feature is proposed (though that can also be up to each implementation).

@madeye
Contributor Author

madeye commented Dec 15, 2016

@librehat I think there are two reasons why we need to provide an option on the server side:

  1. Prevent potential security issues. If a security issue is found in the future, users can easily disable obfuscation support on their servers. And users who don't want to take the risk of enabling obfuscation can still keep updating to the latest software, with obfuscation disabled by default.
  2. Support different kinds of obfuscation. Currently we only have HTTP obfuscation, but someday we may have more, so an option for switching between different obfuscation implementations is necessary.

@ghost

ghost commented Dec 17, 2016

Is there a reproducible test that demonstrates the problem in the first place, that is, that an ISP favors an HTTP request over a shadowsocks TCP request? I am not observing it.

@ghost

ghost commented Dec 17, 2016

@nekolab I don't believe the HTTP/1.1 spec rules out multiplexing; in other words, strict request/response semantics are only a convention. A single obfuscation header at the start of the TCP stream should be sufficient.

@madeye
Contributor Author

madeye commented Dec 18, 2016

@nfjinjing If you have a China Telecom link, you can run experiments between 9:00 PM and 11:00 PM on any day. In fact, according to some internal sources at Cisco, a similar QoS mechanism was deployed on the ASR 1000 series for China Telecom years ago.

@ghost

ghost commented Dec 18, 2016

@madeye That's very interesting. Unfortunately because of a different ISP, I can't verify it myself.

I tried the obfs branch at 3d71c2. How do I know whether obfuscation is turned on? There seem to be no options to enable it, and I didn't find any HTTP headers with tcpdump.

@falseen

falseen commented Dec 29, 2016

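My English is rather poor, so I wrote this in Chinese.

The current obfs is a good idea, and SSR has already put it into practice; it really does work in some regions. But once it is deployed at scale, its patterns will still be easy to find.

I think one possible way forward for ss is a "platform + plugins" model: ss provides the platform and other developers provide plugins. This development model lowers the cost of derivative development to some extent and also increases the diversity of ss traffic signatures. Even if ss stops being updated some day, a plugin-based design can keep it alive. So I wonder whether obfs could be designed as a plugin, with the corresponding interfaces fleshed out, to give ss an unlimited future.

To go one step further, the design could look like this:
Design a mechanism that lets plugins be compiled separately. Dropping a compiled plugin into the server/client directory would be enough to update it, without recompiling the whole server/client. The server could even push plugins to the client, which would be more convenient still (especially on mobile). If we get that far, a new ecosystem might form.

The current ecosystem is: core developers develop and compile, third-party developers help fix bugs, users use the software and give feedback.
The future ecosystem might be: core developers provide the platform, refine the plugin mechanism, and compile; third-party developers provide and compile plugins; users choose among plugins and give feedback.
(Perhaps the "third-party developers" I mention will never exist. Whatever, I just wanted to share my idea, that's all.)

This is just my humble opinion; criticism is welcome.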

@Artoria2e5

Artoria2e5 commented Dec 30, 2016

Since @falseen has mentioned some ideas regarding a pluggable obfuscation system, I would like to draw attention to supporting Tor's Pluggable Transport protocol, which allows Tor to talk to separate obfuscating programs ("Pluggable Transports"; PTs) such as obfs2/3/4, meek, fteproxy and ScrambleSuit. Tor has a very rich repository of PTs, and there is no reason not to use these field-tested and well-reviewed implementations.

For faking HTTP traffic to get better QoS, Tor already has fteproxy, which transforms traffic into something that matches a specified regex. Tor's evaluation highlights a few weaknesses in fteproxy, but some of them are actually not hard to fix, since SS deployments have more room for customization:

  • fteproxy makes no effort to hide packet size/timing signatures. Since obfs4 can do all of this, a very lame hack is possible: just wrap fteproxy around an obfs4 instance configured to do so.
  • fteproxy uses a static key in Tor deployments and is therefore vulnerable to active probing at its own level. But SS itself can perform some key derivation from the given password(s) to make the key non-static.
  • fteproxy currently cuts the connection on receiving a normal HTTP response. This is a fatal issue to be fixed by SS developers.

Regarding the super-well-known obfs4, there is actually some timing obfuscation that is not enabled by default due to the non-trivial performance penalty and the costs on censors like the GFW themselves. It might be worth mentioning, as there are increasing concerns over timing detection of looks-like-nothing transports like SS itself and obfs4.

A successful non-Tor PT protocol implementation is @gumblex's ptproxy.

In retrospect, even kcptun can be made a PT this way. The name "Pluggable Transport" itself does not limit the transport to obfuscators; it can be anything that provides a transport-layer tunnel. And who said that we can't chain them?


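The SSR obfuscation that @falseen mentioned reminded me of obfs4. obfs4 is actually one of Tor's Pluggable Transports (PTs). Transport programs (usually obfuscators) talk to Tor through a public protocol, which in effect already implements this plugin proposal. Tor has many good obfuscation components, and there is no reason not to use them.

Correction: the "obfs" in SSR is just short for "obfuscation"; I mistook it for obfs4.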

@anonymous-contributor

anonymous-contributor commented Dec 30, 2016

Personally speaking, I don't think the current obfuscation really obfuscates anything.
Packet sequence and timing are not changed at all.
This seems to be a dirty hack for a given ISP, neither elegant nor generic.

So I never liked the idea itself.

Here's a +1 for Tor PT. In fact, I have been using obfsproxy (ScrambleSuit) with SS for a long time.
My ISP seemed to RST my connections quite often with plain ss (AES-encrypted, of course).
I switched to obfsproxy + ss, and things have worked fine since then.

I'm using the proxied mode for now, but it should not be hard to support managed mode.
(I've always wanted to add managed mode, but since the current proxy mode works fine and I'm too lazy, so...)

BTW, the latest obfs4 PT only supports managed mode.

So I would prefer to deprecate the current dirty hack and just implement obfsproxy managed mode.
This is not only generic, but also KISS.

Thanks

@madeye
Contributor Author

madeye commented Dec 31, 2016

After reviewing the whole proposal again, I realized that I made a big mistake here.

As mentioned by @Artoria2e5 and @anonymous-contributor, this proposal is actually "a dirty hack for a given ISP". We should not add it directly to the shadowsocks protocol; doing so would also break the KISS principle that we have insisted on for the past four years.

So, here are my next steps:

  1. I'll deprecate this change in the next release of shadowsocks-libev.
  2. This proposal is still useful for many users, so I plan to move all the related implementation from shadowsocks-libev to a new project (simple-obfs?). So if you're already using this feature or working on a compatible implementation, don't worry: the new project will continue to work as a plugin server for you.
  3. As proposed by @Mygod in SIP002 - Optional extension configurations as query strings in ss URLs #27, we can keep adding more obfuscating tools, e.g. obfs4, as plugin servers and recommend them to all shadowsocks users.

BTW, I forked obfs4 months ago and modified it to work in standalone mode as a simple tunnel tool. It may be useful if we plan to add obfs4 as a plugin server in the future. https://github.com/madeye/obfs4-tunnel

Thanks again for all the suggestions and comments in this issue. You're awesome!

@Mygod
Contributor

Mygod commented Dec 31, 2016

How does a plugin server work? Shadowsocks clients are written in very different languages. Does this mean every client has to build a plugin platform first before implementing new plugins?

EDIT: In comparison, Tor's approach seems more doable.

@madeye
Contributor Author

madeye commented Dec 31, 2016

@Mygod I mean plugin servers like shadowsocks over kcptun or tor over obfs4. Any plugin server can work with every implementation of shadowsocks.

@Mygod
Contributor

Mygod commented Dec 31, 2016

Yeah I just realized that. In that case is it possible to use multiple plugins at the same time? For example, shadowsocks over obfs4 over kcptun?

@madeye
Contributor Author

madeye commented Dec 31, 2016

For now, I think we should avoid this kind of plugin over plugin... And actually, obfs over kcptun is meaningless.

@anonymous-contributor

anonymous-contributor commented Dec 31, 2016

For easy configuration, I prefer to use obfsproxy's managed mode instead of the standalone one.
(More and more like Tor, right?)

This lets us configure the ss client like this:

{
  "server": "test.example.com",
  "server_port": "obfs4: 6666"
}

And configure server like:

{
  "port_password":
  {
        "obfs4: 6666": "OBFSpaSsWoRD",
        "6667": "PLAINpaSSwoRD"
  }
}

Such a configuration can save a lot of time and avoids needing a second password for ScrambleSuit.
(We can just hash the ss password and use the hash as the ScrambleSuit password.)
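The password-hashing idea could look like the Python sketch below. Everything specific in it is an assumption made for illustration: the function name `derive_plugin_password`, the use of SHA-256 with a label prefix, and the truncation to 20 bytes with Base32 encoding. Check what format the transport actually expects before relying on it.

```python
import base64
import hashlib


def derive_plugin_password(ss_password: str, label: str = "scramblesuit") -> str:
    """Derive a second secret from the ss password (illustrative only).

    The label keeps secrets for different plugins distinct. The 20-byte
    length and the Base32 output format are assumptions, not a documented
    requirement of any transport.
    """
    material = label.encode("utf-8") + b"\0" + ss_password.encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return base64.b32encode(digest[:20]).decode("ascii")
```

A plain hash adds no strength to a weak password; a deliberately slow key derivation function would be the safer choice if the ss password is short.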

Furthermore, it's possible to stack all the plugins together:
(I must be crazy to do that, although I didn't find a good way to pass tcptun parameters.)

{
  "port_password":
   {
       "obfs4+tcptun: 6666": "WTFarewedoing?"
   }
}
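Keys such as "obfs4: 6666", "obfs4+tcptun: 6666" and "6667" in the configs above follow a simple pattern that a server could parse as sketched below in Python. The function name `parse_port_key` is hypothetical, and the syntax is only the one suggested in this comment, not an existing shadowsocks feature.

```python
from typing import List, Tuple


def parse_port_key(key: str) -> Tuple[List[str], int]:
    """Split a port_password key into (plugin chain, port).

    "6667"               -> ([], 6667)
    "obfs4: 6666"        -> (["obfs4"], 6666)
    "obfs4+tcptun: 6666" -> (["obfs4", "tcptun"], 6666)
    """
    if ":" in key:
        plugins, port = key.split(":", 1)
        chain = [name.strip() for name in plugins.split("+") if name.strip()]
        return chain, int(port)
    # No plugin prefix: a plain port number.
    return [], int(key)
```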

@simonsmh

simonsmh commented Dec 31, 2016 via email

@ghost

ghost commented Dec 31, 2016

I suggest we first clarify what it is that we are trying to obfuscate against.

Two things can disrupt internet usage: the GFW and the ISP, which might be dealt with similarly or differently.

In this case, if I understand correctly, we are dealing with QoS, which is deployed at ISPs. What has been proposed, and has proven useful on China Telecom, might be a dirty hack, but it is a working one.

obfs4 does not seem to me to solve this particular problem, since it's a "look-like nothing obfuscation protocol", as described on the project page, and that's exactly what an ISP thinks of shadowsocks. We need at least some tests showing its effectiveness on China Telecom before even considering it as an alternative.

According to @anonymous-contributor, obfs4 seems to prevent ISPs from resetting TCP connections, but is this a general phenomenon or an isolated instance?

None of this should stop the development of the general architecture, of course. And since the original proposal still fits and has been tested, I see no reason to abandon it. As long as there is a pluggable design, it can be amended at any time.

@anonymous-contributor

@nfjinjing It depends more on your VPS (and the TCP congestion algorithm setup) than on the so-called obfs.

Just as your benchmark shows, the dirty hack does improve performance, but only slightly compared with the BBR TCP congestion algorithm.
The bottleneck lies in the VPS location/route and the TCP congestion algorithm.

So, at least for me, integrating such a hack into SS is not worth it.
Who knows when other users will ask for yet another dirty hack for another ISP?
If we start integrating them, it will turn into a game of whack-a-mole in ss.

The Tor PT method, on the other hand, is both generic and KISS. Any ISP-specific hack can be a PT, and if a lot of users really need it, the project will grow and we will know.
And a generic Tor PT-style interface will be quite easy for us to integrate (in managed mode, just a few lines of JSON config).

As for the obfs4 vs. RST problem, it may be an isolated case, but that doesn't change the KISS principle above.

@gumblex

gumblex commented Dec 31, 2016

@nfjinjing obfs4 can't prevent RSTs. Its purpose is to defeat blocking or QoS that is based on observing specific protocol characteristics and timing.

@madeye
Contributor Author

madeye commented Dec 31, 2016

@anonymous-contributor Supporting PT looks like a good idea. Any interest in opening a pull request?

@ghost

ghost commented Dec 31, 2016

@anonymous-contributor Thanks for the clarification. I'm not against PT or something similar; I'm in favor of it. I may have had the wrong impression that the proposal, which could be implemented as a "plugin", was being replaced by obfs4.

@anonymous-contributor

@madeye I'll set aside some time to implement managed mode support, along with the JSON configuration part.

But don't expect it soon; it may take a month or two.

(I'm just too busy launching satellites in KSP 🚀 )

@madeye
Contributor Author

madeye commented Dec 31, 2016

@anonymous-contributor Great! I'll keep studying more details about PT.

(And to be honest, I'm also busy with fighting against Germans in the Argonne Forest)

@ghost

ghost commented Dec 31, 2016

@gumblex I'm curious whether such a capability has been observed at any ISP?

@liaozibo

I think ss could open up a server-side API, and the client side should do the same.
Developers could then build many server-side and client-side plugins to obfuscate the TCP/UDP stream.
If any developer drops out, it would not affect the ss community.
This development model could make the ss community (and protocol) much stronger.

@cat-new

cat-new commented Jan 2, 2017

Should we also decide by vote whether to add obfuscation?
https://goo.gl/forms/PIJ4ykg6NCViKtdD2
This form closes at 24:00 on January 31, 2017!

@simonsmh

simonsmh commented Jan 2, 2017 via email

@ghost

ghost commented Jan 2, 2017

On second thought, the "optional" nature of simple-obfs might be a problem. Just imagine what the GFW will think when it sees a service that sometimes looks like an almost valid HTTP request, in which the host name probably won't match, and sometimes looks like nothing at all.
