This repository has been archived by the owner on Dec 18, 2018. It is now read-only.

Add an option to Kestrel to disable threadpool dispatching #1390

Closed
halter73 opened this issue Feb 23, 2017 · 14 comments

Comments

@halter73
Member

No description provided.

@tmds
Contributor

tmds commented Apr 6, 2017

@halter73 @davidfowl This is now an InternalKestrelServerOptions, is the plan to make it a KestrelServerOptions?

@halter73
Member Author

halter73 commented Apr 6, 2017

I don't think there are currently plans to put it on KestrelServerOptions proper. We are using the internal options in some of our benchmarks, and the improvement seen by disabling threadpool dispatching is negligible in netcoreapp2.0.

@tmds
Contributor

tmds commented Apr 6, 2017

I did a benchmark using two 8-core Azure machines (D4, I think), à la TechEmpower plaintext. It was using the 1.1 runtime; I don't know if that makes a big difference.
With threadpool dispatching, giving Kestrel 2 threads had the best performance.
Without threadpool dispatching, giving Kestrel 8 threads had the best performance.
The increase was about 7%. Did you see a similar increase? Perhaps my benchmarking was not OK.
I think enabling this depends on the application, and for some transports the benefit may be greater than for others.

@tmds
Contributor

tmds commented Apr 6, 2017

From the looks of aspnet/benchmarks, it seems the no-dispatching scenario is run with kestrelThreadCount equal to 1 or 2.
For plaintext, this compares "dispatching using all cores" to "no dispatching using 2 cores".
Shouldn't kestrelThreadCount for the no-dispatching scenario be set to the number of cores on the machine to make it a fair comparison?

@benaadams
Contributor

benaadams commented Apr 6, 2017

The threadpool has been improved in 2.0. No-dispatching means the application code blocks I/O. The plaintext response is just a memory copy of a cached set of 13 bytes, so there is no real application work (just server work). With the threadpool, keeping the total I/O thread count below the physical core count makes sense, since you still need CPU for the threadpool to do the application work.

Not dispatching introduces head-of-line blocking (which was one of the "failures" of HTTP/1.1 pipelining), where a slow request on one connection stops the processing of all other connections on the same thread, regardless of CPU being free.

That's not to mention user application code making a sync (blocking) Read/Write, a Task.Wait() or Task.Result, or a sync SQL Connect/Execute call, which knocks out an entire Kestrel thread of I/O processing.

Perhaps dispatching just the application code to the thread pool, while keeping the Kestrel server code on the same thread, might be a good compromise between the two?

However, picking a site at random... For the new .NET docs API browser site, my browser sends 7 kB of request headers, mostly cookies, which have to be parsed by the server; that is CPU-bound work. Parsing that 7 kB to create the header dictionary, request object, etc. will block all I/O on that thread, regardless of whether CPU is free, even though all the I/O has already been done for that request.

So on balance, for the general case, it likely works out better to dedicate the Kestrel I/O threads to doing the I/O, then dispatch the processing of the data to spare CPU while the I/O thread goes back to getting the next bit of data.

Otherwise you'll have very unbalanced and poor utilization of CPU.

Just my 2c...
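[Editor's note: the dispatch-vs-inline tradeoff described above can be sketched in plain Java with a `java.util.concurrent` pool. This is illustrative only, not Kestrel code; the "handlers" are hypothetical stand-ins for request processing.]

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: when handlers are dispatched to a pool, a blocked
// "slow" handler does not stop a "fast" handler running on another thread.
public class DispatchSketch {
    static List<String> run() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2); // dispatch pool
        CountDownLatch fastDone = new CountDownLatch(1);
        CountDownLatch slowUnblocked = new CountDownLatch(1);
        List<String> completed = Collections.synchronizedList(new ArrayList<>());

        // Slow handler: simulates a blocking sync call in application code.
        pool.submit(() -> {
            try { slowUnblocked.await(); } catch (InterruptedException ignored) {}
            completed.add("slow");
        });
        // Fast handler: submitted second, but completes while "slow" is blocked.
        pool.submit(() -> {
            completed.add("fast");
            fastDone.countDown();
        });

        fastDone.await(5, TimeUnit.SECONDS); // fast finishes despite slow blocking
        slowUnblocked.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return completed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints [fast, slow]
    }
}
```

With a single-threaded, no-dispatch executor the same workload stalls: the slow handler holds the only thread, so "fast" cannot run until it finishes, which is exactly the head-of-line blocking described above.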

@tmds
Contributor

tmds commented Apr 6, 2017

I mean 1 or 2.

    <Scenarios Include="-n Plaintext -o KestrelHttpServer@dev --kestrelThreadCount 1" />
    <Scenarios Include="-n Plaintext -o KestrelHttpServer@dev --kestrelThreadCount 1 --kestrelThreadPoolDispatch false" />
    <Scenarios Include="-n Plaintext -o KestrelHttpServer@feature/dev-si --kestrelThreadCount 2" />
    <Scenarios Include="-n Plaintext -o KestrelHttpServer@feature/dev-si --kestrelThreadCount 2 --kestrelThreadPoolDispatch false" />

I think to make it a fair comparison, it should be 6 when kestrelThreadPoolDispatch is false.

@tmds
Contributor

tmds commented Apr 6, 2017

6 being the number of cores on an Intel® Xeon® Processor E5-1650.

@Drawaes
Contributor

Drawaes commented Apr 6, 2017

I think tweaking this on and off at this point is a bit academic. There are enough moving parts and variables already; the goal should be a solid baseline for this release, including the new transport types. When the dust has settled, I think the whole threading subject could be revisited, including NUMA dispatching, where NICs are located relative to sockets, where the pools are, etc.

@tmds
Contributor

tmds commented Apr 6, 2017

I agree it makes sense to defer request handling to the threadpool to avoid the issues caused by bad application code.

What are your thoughts on the threadcount used for the benchmark?

@benaadams
Contributor

What are your thoughts on the threadcount used for the benchmark?

I don't know the rationale; at a guess, it's examining the overhead associated with dispatching to the threadpool vs. not?

@davidfowl
Member

@tmds I think you're right, we need to change the benchmark. If we're using libuv threads, make threads = number of cores, or number of cores * 2.

@davidfowl davidfowl reopened this Apr 9, 2017
@tmds
Contributor

tmds commented Apr 11, 2017

Related topic: Kestrel ThreadCount.

So Netty defaults to double the number of CPUs ('logical processors').
And Kestrel defaults to half.

When dispatching every request to the threadpool, it makes sense to have a lower number of I/O threads: there is more load in parsing the request and generating a response than in getting data from/to the kernel.
Perhaps Netty doesn't dispatch? Otherwise, why would they have double the amount...
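[Editor's note: the two defaults being compared can be put side by side in a small sketch. The exact clamping in Kestrel's source may differ; the simple halving below is an assumption based on this thread.]

```java
// Assumed defaults from the discussion above (not the actual Kestrel/Netty
// source): Netty worker threads default to 2 * cores, while Kestrel's libuv
// thread count defaults to roughly half the logical processors, minimum 1.
public class ThreadDefaults {
    static int nettyWorkers(int cores)   { return cores * 2; }
    static int kestrelThreads(int cores) { return Math.max(1, cores / 2); }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("netty=" + nettyWorkers(cores)
                + " kestrel=" + kestrelThreads(cores));
    }
}
```

On the 6-core E5-1650 mentioned above, that would be 12 Netty workers versus 3 Kestrel threads, a 4x difference in I/O thread count between the two models.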

@tmds
Contributor

tmds commented Apr 11, 2017

From http://stackoverflow.com/questions/5474372/how-netty-uses-thread-pools

Per end point there is a boss thread and a worker thread pool. The worker thread pool defaults to 2 times the number of cores.

  • The boss thread is similar to the ListenerPrimary in that it accepts and distributes connections. It does not handle requests.
  • The worker threads do the reads and writes. The handlers are executed in the worker threads. So no dispatching to a threadpool.
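[Editor's note: the boss/worker model described above can be sketched in plain Java as an illustrative echo server. This is not Netty's actual implementation; it only mirrors the structure: one boss thread accepts and hands off connections, and 2 * cores worker threads read and write inline with no further dispatch.]

```java
import java.io.*;
import java.net.*;
import java.util.concurrent.*;

// Illustrative boss/worker echo server (assumed structure, not Netty code):
// the boss thread only accepts and distributes; handlers run entirely on
// worker threads, doing reads and writes inline.
public class BossWorkerSketch {
    private final ServerSocket boss;
    private final ExecutorService workers =
        Executors.newFixedThreadPool(2 * Runtime.getRuntime().availableProcessors());

    public BossWorkerSketch() throws IOException {
        boss = new ServerSocket(0); // ephemeral port for the sketch
        Thread bossThread = new Thread(() -> {
            try {
                while (true) {
                    Socket conn = boss.accept();        // boss only accepts...
                    workers.submit(() -> handle(conn)); // ...and hands off
                }
            } catch (IOException closed) { /* server socket closed */ }
        });
        bossThread.setDaemon(true);
        bossThread.start();
    }

    // Runs on a worker thread: read and write inline, no threadpool dispatch.
    private static void handle(Socket conn) {
        try (conn;
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(conn.getInputStream()));
             PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) out.println(line); // echo
        } catch (IOException ignored) {}
    }

    public int port() { return boss.getLocalPort(); }

    public static void main(String[] args) throws Exception {
        BossWorkerSketch server = new BossWorkerSketch();
        try (Socket client = new Socket("127.0.0.1", server.port());
             PrintWriter out = new PrintWriter(client.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(client.getInputStream()))) {
            out.println("ping");
            System.out.println(in.readLine()); // echoed back: ping
        }
    }
}
```

A blocking handler here ties up one of the 2 * cores workers, which is why the larger worker count matters in this model: it reduces the chance of all workers being blocked at once.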

@tmds
Contributor

tmds commented Apr 12, 2017

This approach matches well with what @benaadams explained before.
By not doing any work, the boss thread avoids getting blocked.
And there are more worker threads, which reduces the chance of all of them getting blocked.
