Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrupted data in response, when changing a preset #3641

Closed
1 task done
nilvanis opened this issue Jan 3, 2024 · 7 comments
Closed
1 task done

Corrupted data in response, when changing a preset #3641

nilvanis opened this issue Jan 3, 2024 · 7 comments
Labels
bug fixed in source This issue is unsolved in the latest release but fixed in master

Comments

@nilvanis
Copy link

nilvanis commented Jan 3, 2024

What happened?

When requesting a preset change via JSON, I see some corrupted data in the response.
Preset changes successfully, but the response generates an error in Home Assistant, e.g.:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa6 in position 861: invalid start byte

It happens most of the times.
Very rarely response is correct.
It happens when trying to activate any configured preset.

To Reproduce Bug

  1. Use ESP8266_2M (Athom LS-4P in my case)
  2. Create multiple segments and presets.
  3. Change preset via JSON.
    Example:
$ curl -D - -s -H 'Content-Type: application/json' -X POST http://192.168.1.10/json/state -d '{"ps":1,"v":true}'
HTTP/1.1 200 OK
Content-Length: 1708
Content-Type: application/json
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: *
Access-Control-Allow-Headers: *
Connection: close
Accept-Ranges: none

{"on":true,"bri":93,"transition":7,"ps":2,"pl":-1,"nl":{"on":false,"dur":60,"mode":1,"tbri":0,"rem":-1},"udpn":{"send":false,"recv":true,"sgrp":1,"rgrp":1},"lor":0,"mainseg":3,"seg":[{"id":0,"start":22,"stop":89,"len":67,"grp":1,"spc":0,"of":0,"on":true,"frz":false,"bri":255,"cct":127,"set":0,"n":"Prawo","col":[[255,160,0],[0,0,0],[0,0,0]],"fx":28,"sx":94,"ix":128,"pal":0,"c1":128,"c2":128,"c3":16,"sel":false,"rev":false,"mi":false,"o1":false,"o2":false,"o3":false,"si":0,"m12":1},{"id":1,"start":89,"stop":128,"len":39,"grp":1,"spc":0,"of":0,"on":true,"frz":false,"bri":255,"cct":127,"set":0,"n":"Furtka","col":[[255,170,0],[0,0,0],[0,0,0]],"fx":0,"sx":128,"ix":128,"pal":0,"c1":128,"c2":128,"c3":16,"sel":false,"rev":false,"mi":false,"o1":false,"o2":false,"o3":false,"si":0,"m12":1},{"id":2,"start":128,"stop":137,"len":9,"grp":1,"spc":0,"of":0,"on":true,E8|▒▒▒*▒▒u▒▒::▒▒▒p

                                                                                                                                                         <<▒.
00▒]    ,,▒,,▒▒
,,             ,,(#<<PF$00▒▒

V       O       8s-Control-Allow-Origin: *
Access-Control-Allow-Methods: *
Access-Control-Allow-Headers: *
Connection: close
Accept-Ranges: none

"202

V       HTTP/1.1 200 OK
Content-Length: 1708
Content-Type: application/json
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: *
Access-Control-Allow-Headers: *
Connection: close
Accept-Ranges: none

{"on":true,"bri":93,"transition":7,"ps":2,"pl":-1,"nl":{"on":false,"dur":60,"mode":1,"tbri":0,"rem":-1},"udpn":{"send":false,"recv":true,"sgrp":1,"rgrp":1},"lor":0,"mainseg":3,"seg":[{"id":0,"start":22,"stop":89,"len":67,"grp":1,"spc":0,"of":0,"on":true,"frz":

Expected Behavior

$ curl -D - -s -H 'Content-Type: application/json' -X POST http://192.168.1.10/json/state -d '{"ps":1,"v":true}'
HTTP/1.1 200 OK
Content-Length: 1715
Content-Type: application/json
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: *
Access-Control-Allow-Headers: *
Connection: close
Accept-Ranges: none

{"on":true,"bri":93,"transition":7,"ps":1,"pl":-1,"nl":{"on":false,"dur":60,"mode":1,"tbri":0,"rem":-1},"udpn":{"send":false,"recv":true,"sgrp":1,"rgrp":1},"lor":0,"mainseg":0,"seg":[{"id":0,"start":22,"stop":89,"len":67,"grp":1,"spc":0,"of":0,"on":true,"frz":false,"bri":255,"cct":127,"set":0,"n":"Prawo","col":[[255,160,0],[0,0,0],[0,0,0]],"fx":49,"sx":128,"ix":128,"pal":0,"c1":128,"c2":128,"c3":16,"sel":true,"rev":false,"mi":false,"o1":false,"o2":false,"o3":false,"si":0,"m12":0},{"id":1,"start":89,"stop":128,"len":39,"grp":1,"spc":0,"of":0,"on":true,"frz":false,"bri":255,"cct":127,"set":0,"n":"Furtka","col":[[255,170,0],[0,0,0],[0,0,0]],"fx":49,"sx":128,"ix":128,"pal":0,"c1":128,"c2":128,"c3":16,"sel":false,"rev":false,"mi":false,"o1":false,"o2":false,"o3":false,"si":0,"m12":0},{"id":2,"start":128,"stop":137,"len":9,"grp":1,"spc":0,"of":0,"on":true,"frz":false,"bri":255,"cct":127,"set":0,"n":"SlupekPrawy","col":[[255,170,0],[0,0,0],[0,0,0]],"fx":49,"sx":128,"ix":128,"pal":0,"c1":128,"c2":128,"c3":16,"sel":false,"rev":false,"mi":false,"o1":false,"o2":false,"o3":false,"si":0,"m12":0},{"id":3,"start":150,"stop":161,"len":11,"grp":1,"spc":0,"of":0,"on":true,"frz":false,"bri":255,"cct":127,"set":0,"n":"SlupekLewy","col":[[255,170,0],[0,0,0],[0,0,0]],"fx":49,"sx":128,"ix":128,"pal":0,"c1":128,"c2":128,"c3":16,"sel":false,"rev":false,"mi":false,"o1":false,"o2":false,"o3":false,"si":0,"m12":0},{"id":4,"start":161,"stop":200,"len":39,"grp":1,"spc":0,"of":0,"on":true,"frz":false,"bri":255,"cct":127,"set":0,"n":"Lewo","col":[[255,170,0],[0,0,0],[0,0,0]],"fx":49,"sx":128,"ix":128,"pal":0,"c1":128,"c2":128,"c3":16,"sel":false,"rev":false,"mi":false,"o1":false,"o2":false,"o3":false,"si":0,"m12":0}]}

Install Method

Self-Compiled

What version of WLED?

0.15, 202401010

Which microcontroller/board are you seeing the problem on?

ESP8266

Relevant log/trace output

No response

Anything else?

Issue probably exists also on 0.14.0, see: home-assistant/core#106375

Code of Conduct

  • I agree to follow this project's Code of Conduct
@nilvanis nilvanis added the bug label Jan 3, 2024
@blazoncek
Copy link
Collaborator

blazoncek commented Jan 3, 2024

Known issue. Cause still unknown, suspected TCP stack or AsyncTCP error (reuse pointer after free).
Possible duplicate of: #3599 #3447

EDIT: It will not happen on all ESP8266. In my experience original Lolin Wemos D1 mini will not exhibit this error.

@nilvanis
Copy link
Author

nilvanis commented Jan 3, 2024

Actually I experienced the #3447 issue in 0.14, but I don't see it in 0.15

@blazoncek
Copy link
Collaborator

The issue appeared when we updated ESP8266 platform in PIO. We cannot revert because newer NeoPixelBus requires it.

@willmmiles
Copy link
Contributor

willmmiles commented Jan 5, 2024

I'm not 100% finished debugging yet, but I'm out of time for the moment, so I'll give a quick report. It's looking to me like the issue is re-use of the global ArduinoJSON buffer before the AsyncJsonResponse from serveJSON is finished with it. In the case when the response is too big to send in a single packet case, the AsyncWebServer seems to hold a reference to the JSON object until it receives an ack packet from the client, at which point it generates the rest of the response. However serveJson immediately releases the lock, and in the trace when I got a corrupted packet, I found another client took the lock (handlePresets) and started using the buffer before the final reply packet was sent. So when AsyncWebServer went to re-generate the reply text to send the second packet, it wound up walking through now invalid pointers.

@blazoncek
Copy link
Collaborator

@willmmiles thanks to your hints I can now reliably reproduce corruption (not exactly the same as above, but it is consistent). It is indeed linked to use-after-free situation regarding JSON buffer. But not JSON document itself but strings within it.
It will always happen in serializeModeData() and serializeModeNames() which fill JSON array with names extracted from PROGMEM.

I wonder why that doesn't happen on ESP32 (and variants) but only on ESP8266. And not always at that.

@willmmiles
Copy link
Contributor

I wonder why that doesn't happen on ESP32 (and variants) but only on ESP8266. And not always at that.

You're safe if the reply is less than the outbound TCP buffer size, so the server code could queue the entire response to the underlying IP stack before returning to the wled code. On ESP8266 that's 1024 bytes, which I think will cover most use cases of 2 segments or less. A quick google search suggests the ESP32 defaults to 5760 bytes, which will cover a lot more segments and effects without issues.

And even then, it'll only have problems if something else tries to use the JSON buffer and overwrites the strings before the wifi code finishes the transfer (ie. a classic race condition). This didn't consistenly happen in my test setup -- depending on what else was going on, most of the time it would still complete successfully for me.

willmmiles added a commit to willmmiles/WLED that referenced this issue Jan 6, 2024
Extend the JSON response class to hold the global JSON buffer lock
until the transaction is completed.

Fixes Aircoookie#3641
@blazoncek blazoncek added the fixed in source This issue is unsolved in the latest release but fixed in master label Jan 6, 2024
@blazoncek
Copy link
Collaborator

Closing as corrupt JSON has been fixed in 0.14.1-b3.
If the issue persist in 0.14.1-b3 or later, please open a new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug fixed in source This issue is unsolved in the latest release but fixed in master
Projects
None yet
Development

No branches or pull requests

3 participants