Skip to content
This repository has been archived by the owner on Mar 26, 2020. It is now read-only.

JSON output is locale-dependent and is not cross-platform. #38

Open
sergey-shambir opened this issue Jul 29, 2015 · 16 comments
Open

JSON output is locale-dependent and is not cross-platform. #38

sergey-shambir opened this issue Jul 29, 2015 · 16 comments
Labels

Comments

@sergey-shambir
Copy link

json11 uses locale-dependent functions for conversion between double and it's string representation. On Ubuntu with Russian language it prints doubles with comma like '3,1415926'. On Windows with Russian language pack or any OS with English language it prints '3.1415926' and parser also expects '3.1415926' so '3,1415926' cannot be parsed.

@peterritter
Copy link

Sergey

Thanks for pointing that out. I'm just a user myself, not any of the
developers, and I hadn't thought about this issue so far. This is actually
a problem for me too. I wonder if the json specification mandates how
floating point numbers are to presented and whether json is cross platform
by definition.

Peter

On Wed, Jul 29, 2015 at 6:26 AM, sergey-shambir [email protected]
wrote:

json11 uses locale-dependent functions for conversion between double and
it's string representation. On Ubuntu with Russian language it prints
doubles with comma like '3,1415926'. On Windows with Russian language pack
or any OS with English language it prints '3.1415926' and parser also
expects '3.1415926' so '3,1415926' cannot be parsed.


Reply to this email directly or view it on GitHub
#38.

@crazy2be
Copy link

It does: http://www.json.org/

It is very clear that a period is supposed to be used for the decimal point if you look at the parsing diagrams.

@j4cbo
Copy link
Contributor

j4cbo commented Oct 15, 2015

This is definitely a bug. We likely don't have time to implement a proper fix, but I'd love to see a pull request. Options include:

  • Use snprintf_l if available or emulate somehow if not
  • Loop through the string after conversion and replace localeconv()->decimal_point with .
  • Embed our own copy of grisu2 (I think this is my preferred option)

@artwyman artwyman added the bug label Apr 9, 2016
@Dushistov
Copy link
Contributor

json11 also have str->double bug, yes, in parse_number() it checks that number have "." as delimiter, but after that it uses std::strtod, and this function depend on locale, so 10.5 it can convert to 10.0, because of it expects another delimiter.

@Dushistov
Copy link
Contributor

What about import code from rapid json, it has very fast string<->double routies: https://github.com/miloyip/rapidjson/tree/master/include/rapidjson/internal

and the code is under MIT license?

@Dushistov
Copy link
Contributor

Also there is https://github.com/google/double-conversion from author of girus3 (BSD like license), what about use it as submodule?

@Dushistov
Copy link
Contributor

I convert json11 to usage of double-conversion (for string->double and double->string): Dushistov@115fc76

nativejson-benchmark from here
https://github.com/miloyip/nativejson-benchmark give:

Before:
Parse Time (ms) 51
Stringify Time (ms) 109

After:
Parse Time (ms) 41
Stringify Time (ms) 23

As you can see speedup 24% for parsing, and 374% for dump,
plus it not depend on locale.

What do you think?

@artwyman
Copy link
Contributor

I like the idea of fixing this, and like the sound of those benchmarks. A submodule seems quite heavy weight for json11, which is currently a very tiny library which is easy to integrate. Dropbox's own build environment wouldn't have any trouble with a submodule, but it's a lot of overhead for other users. Is there a simpler option?

I'll get an answer on whether the license is okay, but I suspect it would be.

@dabulla
Copy link

dabulla commented May 10, 2017

Bump. In my opinion this is a pretty heavy bug. I'm happy I found this discussion after googleing the right words.
We used json11 in a project for a while and I just started to use it for the first double-values. After parsing there values where all truncated ints (which resulted in completely deformed 3D-Objects in a 3D Visualization). Can't think of an elegant way to fix this in my application: Replace "." by "," is not possible because commas have a different meaning in json. Using quotation would result in a string. We have to use another json library at the moment. If all our systems/test-systems where configured to English locale, we would not have found the bug and shipped a faulty application (which is not your fault of cause, thanks for the great work with json11 :) ). But this bug should at least be advertised a bit more.

@artwyman
Copy link
Contributor

Pull requests welcome.

@skypjack
Copy link

@artwyman

@j4cbo suggested the followings:

  • snprintf_l if available or emulate somehow if not
  • Loop through the string after conversion and replace localeconv()->decimal_point with .
  • Embed our own copy of grisu2 (I think this is my preferred option)

Are all good for you? Probably point 2 would be a quick (though pretty dirty) solution while waiting for someone to implement or port grisu2.

@artwyman
Copy link
Contributor

I don't have time to go deep, but they all sound plausible. My primary concerns on any new dependencies would be how portable they are, particularly to the mobile platforms where Dropbox uses json11. For something hand-tweaked like #2 I'd just want to make sure it has minimal risk for unintended impact (i.e. don't accidentally mess with commas in strings) and minimal performance impact.

@alekseyt
Copy link

Patch for json.cpp

diff --git a/json11.cpp b/json11.cpp
index 9647846..73d76e8 100644
--- a/json11.cpp
+++ b/json11.cpp
@@ -24,7 +24,9 @@
 #include <cmath>
 #include <cstdlib>
 #include <cstdio>
+#include <iomanip>
 #include <limits>
+#include <sstream>
 
 namespace json11 {
 
@@ -56,9 +58,11 @@ static void dump(NullStruct, string &out) {
 
 static void dump(double value, string &out) {
     if (std::isfinite(value)) {
-        char buf[32];
-        snprintf(buf, sizeof buf, "%.17g", value);
-        out += buf;
+        std::stringstream ss;
+        ss.imbue(std::locale::classic());
+        ss.precision(std::numeric_limits<double>::digits10);
+        ss << value;
+        out += ss.str();
     } else {
         out += "null";
     }
@@ -618,7 +622,13 @@ struct JsonParser final {
                 i++;
         }
 
-        return std::strtod(str.c_str() + start_pos, nullptr);
+        std::stringstream ss(str.c_str() + start_pos);
+        ss.imbue(std::locale::classic());
+        ss.precision(std::numeric_limits<double>::digits10);
+        double value = 0;
+        ss >> value;
+
+        return value;
     }
 
     /* expect(str, res)

To reproduce this in tests, locale is need to be set at the start of testcase (commented out in patch).

Patch for test.cpp

diff --git a/test.cpp b/test.cpp
index ab2e2b9..cf26b34 100644
--- a/test.cpp
+++ b/test.cpp
@@ -60,13 +60,16 @@ CHECK_TRAIT(is_nothrow_move_assignable<Json>);
 CHECK_TRAIT(is_nothrow_destructible<Json>);
 
 JSON11_TEST_CASE(json11_test) {
+    // setlocale(LC_ALL, "ru_RU.utf8");
+
     const string simple_test =
-        R"({"k1":"v1", "k2":42, "k3":["a",123,true,false,null]})";
+        R"({"k1":"v1", "k2":42.31415, "k3":["a",123,true,false,null]})";
 
     string err;
     const auto json = Json::parse(simple_test, err);
 
     std::cout << "k1: " << json["k1"].string_value() << "\n";
+    std::cout << "k2: " << json["k2"].number_value() << "\n";
     std::cout << "k3: " << json["k3"].dump() << "\n";
 
     for (auto &k : json["k3"].array_items()) {
@@ -158,12 +161,12 @@ JSON11_TEST_CASE(json11_test) {
     // Json literals
     const Json obj = Json::object({
         { "k1", "v1" },
-        { "k2", 42.0 },
+        { "k2", 42.31415 },
         { "k3", Json::array({ "a", 123.0, true, false, nullptr }) },
     });
 
     std::cout << "obj: " << obj.dump() << "\n";
-    JSON11_TEST_ASSERT(obj.dump() == "{\"k1\": \"v1\", \"k2\": 42, \"k3\": [\"a\", 123, true, false, null]}");
+    JSON11_TEST_ASSERT(obj.dump() == "{\"k1\": \"v1\", \"k2\": 42.31415, \"k3\": [\"a\", 123, true, false, null]}");
 
     JSON11_TEST_ASSERT(Json("a").number_value() == 0);
     JSON11_TEST_ASSERT(Json("a").string_value() == "a");

I think it's in the spirit of json11, but whatever, hope this helps.

@artwyman
Copy link
Contributor

Thanks for taking a stab at it. Please phrase your submissions in the form of a Pull Request, though, not a patch.

On a quick look, I wonder about the potential performance difference between stringstream and snprintf, and whether that could add up when printing a lot of numbers. I wouldn't know for sure if that's an issue without measuring, though.

@ashaduri
Copy link

C++17 has std::from_chars() and std::to_chars() specifically for this kind of usage (JSON, etc...). They're also supposed to be very fast (faster than the usual locale-aware functions).
An ifdef to enable these for people using C++17 would be very nice.

http://en.cppreference.com/w/cpp/utility/from_chars
http://en.cppreference.com/w/cpp/utility/to_chars

@alekseyt
Copy link

Nah, i'm good. I can only note that it's too trivial to be copyrighted, so anyone are welcome to do whatever they want with those patches.

rominf pushed a commit to rominf/json11 that referenced this issue Dec 12, 2019
Parse the number according to the JSON grammar, ignoring the current
locale.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

10 participants