Yes! The JSON library I wrote for the Zephyr RTOS does this. Say, for instance, you have the following struct:
    struct SomeStruct {
        char *some_string;
        int some_number;
    };
You would need to declare a descriptor, linking each field to how it's spelled in the JSON (e.g. the some_string member could be "some-string" in the JSON), the byte offset from the beginning of the struct where the field is (using the offsetof() macro), and the type.
The parser is then able to go through the JSON, and initialize the struct directly, as if you had reflection in the language. It'll validate the types as well. All this without having to allocate a node type, perform copies, or things like that.
This approach has its limitations, but it's pretty efficient -- and safe!
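To make the descriptor idea a bit more concrete, here's a minimal sketch. All the names (`field_descr`, `store_field`, and so on) are made up for illustration and this is not the actual Zephyr API; it also assumes the tokenizer has already split the input into key/value pairs, which is the part a real parser would do.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not Zephyr's actual API. */
enum field_type { FIELD_STRING, FIELD_INT };

struct field_descr {
    const char *json_name; /* how the field is spelled in the JSON */
    size_t offset;         /* byte offset of the member in the struct */
    enum field_type type;  /* type the value is expected to have */
};

struct SomeStruct {
    char *some_string;
    int some_number;
};

static const struct field_descr some_struct_descr[] = {
    { "some-string", offsetof(struct SomeStruct, some_string), FIELD_STRING },
    { "some-number", offsetof(struct SomeStruct, some_number), FIELD_INT },
};

/* Stores one already-tokenized key/value pair directly into `out`.
 * Returns 0 on success, -1 on unknown key, type mismatch, or bad number. */
static int store_field(const struct field_descr *descr, size_t n_descr,
                       void *out, const char *key,
                       enum field_type type, char *value)
{
    for (size_t i = 0; i < n_descr; i++) {
        if (strcmp(descr[i].json_name, key) != 0)
            continue;
        if (descr[i].type != type)
            return -1; /* type validation */

        void *field = (char *)out + descr[i].offset;
        if (type == FIELD_STRING) {
            /* Points into the caller's buffer; nothing is copied. */
            *(char **)field = value;
        } else {
            char *end;
            errno = 0;
            long v = strtol(value, &end, 10);
            if (errno != 0 || end == value || *end != '\0' ||
                v < INT_MIN || v > INT_MAX)
                return -1;
            *(int *)field = (int)v;
        }
        return 0;
    }
    return -1; /* key not present in the descriptor */
}
```

Calling `store_field(some_struct_descr, 2, &s, "some-number", FIELD_INT, text)` with `text` holding `"42"` would set `s.some_number` to 42 without any intermediate node being allocated.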
One thing to note, too, is that `atoi()` should be avoided as much as possible. On error (parse error, overflow, etc.), it has an unspecified return value (!), although most libcs will return 0, which can be just as bad in some scenarios.
Also not mentioned: atoi() can return a negative number, which is then passed to malloc(). malloc() takes a size_t, which is unsigned, so a negative argument gets converted to a very large number.
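A tiny sketch of that conversion, in case it helps. The helper name `size_seen_by_malloc` is made up; it just performs the same int-to-size_t conversion that happens implicitly when an int is passed to malloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns the size malloc() would actually see
 * if it were handed atoi()'s result directly. */
static size_t size_seen_by_malloc(const char *text)
{
    int n = atoi(text); /* "-1" parses fine and yields -1 */
    return (size_t)n;   /* same conversion as the implicit one in malloc(n) */
}
```

With the input `"-1"` this yields `SIZE_MAX`, so the allocation request is enormous rather than rejected as negative.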
It's better to use strtol(), but even that is a bit tricky to use: it doesn't touch errno when there's no error, yet you need to check errno to know whether things like overflow happened. So you need to set errno to 0 before calling the function. The man page explains how to use it properly.
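Roughly, the pattern looks like this. `parse_int` is a made-up helper name, and rejecting trailing characters is my own choice here; whether that's appropriate depends on what's being parsed.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parses a base-10 int, returning false on any error. */
static bool parse_int(const char *text, int *out)
{
    char *end;

    errno = 0; /* strtol() leaves errno alone on success, so clear it first */
    long value = strtol(text, &end, 10);

    if (errno != 0)
        return false; /* e.g. ERANGE on overflow/underflow */
    if (end == text)
        return false; /* no digits were consumed */
    if (*end != '\0')
        return false; /* trailing garbage after the number */
    if (value < INT_MIN || value > INT_MAX)
        return false; /* fits in long but not in int */

    *out = (int)value;
    return true;
}
```

A caller that wants a size would additionally reject negative values before handing the result to malloc().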
I think it would be a very interesting exercise for that web framework author to run their HTTP request parser through a fuzz-tester; clang comes with one that's quite good and easy to use (https://llvm.org/docs/LibFuzzer.html), especially if used alongside the address sanitizer or the undefined behavior sanitizer. Errors like the one I mentioned will most likely be found by a fuzzer really quickly. :)
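For reference, a libFuzzer harness is just one function. In the sketch below, `parse_request` is a stand-in I made up so the example is self-contained; a real harness would call the framework's own request parser there instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the code under test (hypothetical); a real harness would
 * call the framework's HTTP request parser here. */
static int parse_request(const char *buf, size_t len)
{
    return (len >= 4 && memcmp(buf, "GET ", 4) == 0) ? 0 : -1;
}

/* Entry point libFuzzer calls repeatedly with generated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    /* Copy into a NUL-terminated buffer, since parsers written around
     * C strings usually assume one. */
    char *copy = malloc(size + 1);
    if (!copy)
        return 0;
    if (size > 0)
        memcpy(copy, data, size);
    copy[size] = '\0';

    (void)parse_request(copy, size);

    free(copy);
    return 0; /* non-crashing inputs always return 0 */
}
```

Built with something like `clang -g -fsanitize=fuzzer,address,undefined harness.c`, the resulting binary generates inputs on its own and stops at the first sanitizer report.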
Unspecified, really? cppreference's [C documentation][1] says that it returns zero. The [OpenGroup][2] documentation doesn't specify a return value when the conversion can't be performed. This recent [draft][3] of the ISO standard for C says that if the value cannot be represented (does that mean over/underflow, bad parse, both, neither?), then it's undefined behavior.
So three references give three different answers.
You could always use sscanf instead, which tells you how many values were scanned (e.g. zero or one).
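Something along these lines, as a sketch (`scan_int` is a made-up name). Note that it only detects the "nothing parsed" case; trailing characters after the number are silently accepted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if exactly one int was scanned from `text`. */
static bool scan_int(const char *text, int *out)
{
    /* sscanf() returns the number of successfully converted items. */
    return sscanf(text, "%d", out) == 1;
}
```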
The Linux man page (https://man7.org/linux/man-pages/man3/atoi.3.html#VERSIONS) says that POSIX.1 leaves it unspecified. As you found out, it's really something that should be avoided as much as possible, especially if you value portability, because pretty much every source disagrees on how it should behave.
sscanf() has its own pitfalls, though. For instance, if you try to parse a number that's preceded by a lot of spaces, sscanf() will take a long time going through it. I've been hit by that when fuzzing Lwan.
Programs running under any Valgrind tool will be executed using a CPU emulator, making it quite a bit slower than, say, running the instrumented binaries as required by sanitizers; it's often an order of magnitude slower, but could very well be close to two orders of magnitude slower in some cases. This also means that it can't just be attached to an already-running program, because, well, it's emulating a whole CPU to track everything it can.
(Valgrind using a CPU emulator allows for a lot of interesting things, such as also emulating cache behavior and whatnot; it may be slow and have other drawbacks -- it has to be updated every time the instruction set adds a new instruction for instance -- but it's able to do things that aren't usually possible otherwise precisely because it has a CPU emulator!)
You're right and I was wrong, but in my experience Valgrind has been way faster than AddressSanitizer. I don't perceive a difference with Valgrind, while ASan makes the program around 10x slower.
I wrote the Lwan web server, which, similarly to Go, has its own scheduler and makes use of stackful coroutines. I have spent quite a bit of time Valgrinding it after adding the necessary instrumentation to not make Valgrind freak out due to the stack pointer changing like crazy. Despite a lot of Valgrind's limitations due to the way it works, it has been instrumental in finding some subtle concurrency issues in the scheduler and vicinity.
From a quick glance, it seems that Go is now registering the stacks and emitting stack change commands on every goroutine context switch. This is most likely enough to make Valgrind happy with Go's scheduler.
It's already pretty efficient but I'm working on it to make it even more efficient so I can use it as some sort of primitive fragment shader for an art project. This Forth variant is intended to execute Forth Haikus, as defined by the Forth Salon website.
I recently implemented a Forth, compatible with the Forth Haiku dialect used by https://forthsalon.appspot.com/ -- it uses a tail call/continuation-passing dispatching method, and performs some rudimentary optimizations. At some point it also spit out some C but I decided to give this feature the axe until it has a better infrastructure for optimizations and codegen.
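For anyone unfamiliar with that dispatching method, here's a toy sketch of the general technique, not the actual code from my Forth: every instruction is a function that does its work and then calls the next instruction in tail position. The names (`union insn`, `op_push`, `run`, ...) are invented for this example, there are no stack bounds checks, and it relies on the compiler turning these calls into jumps (clang can be forced to with `__attribute__((musttail))`).

```c
#include <stddef.h>

union insn;
/* Each op receives the instruction pointer (already past its own opcode)
 * and the data stack pointer, and returns the program's final result. */
typedef double (*op_fn)(const union insn *ip, double *sp);

union insn {
    op_fn op;       /* an opcode, stored as the function implementing it */
    double literal; /* an inline operand, used by op_push */
};

static double op_push(const union insn *ip, double *sp)
{
    *sp++ = ip->literal;         /* the operand follows the opcode */
    return ip[1].op(ip + 2, sp); /* tail call into the next instruction */
}

static double op_add(const union insn *ip, double *sp)
{
    sp[-2] += sp[-1];
    sp--;
    return ip->op(ip + 1, sp);
}

static double op_mul(const union insn *ip, double *sp)
{
    sp[-2] *= sp[-1];
    sp--;
    return ip->op(ip + 1, sp);
}

static double op_halt(const union insn *ip, double *sp)
{
    (void)ip;
    return sp[-1]; /* top of stack is the result */
}

/* Equivalent to the Forth program: 2 3 + 4 * */
static const union insn example_prog[] = {
    { .op = op_push }, { .literal = 2.0 },
    { .op = op_push }, { .literal = 3.0 },
    { .op = op_add },
    { .op = op_push }, { .literal = 4.0 },
    { .op = op_mul },
    { .op = op_halt },
};

static double run(const union insn *prog)
{
    double stack[16]; /* no overflow checks in this sketch */
    return prog->op(prog + 1, stack);
}
```

There's no central `switch` loop here: control flows from one handler straight into the next, which is what makes this style friendly to the branch predictor when the tail calls are compiled as jumps.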
The idea is to use it to drive an LED matrix and have a simple web UI to develop "fragment shaders" in Forth. It's developed as part of the Lwan project, although currently it generates GIF files on the fly rather than driving an LED matrix.
Reminds me of this sample from the Lwan project: https://time.lwan.ws/blocks -- where the clock is rendered on the server, and new frames are sent to the client using chunked encoding.
The parser is then able to go through the JSON, and initialize the struct directly, as if you had reflection in the language. It'll validate the types as well. All this without having to allocate a node type, perform copies, or things like that.
This approach has its limitations, but it's pretty efficient -- and safe!
Someone wrote a nice blog post about it (and even made a video) a while back: https://blog.golioth.io/how-to-parse-json-data-in-zephyr/
The opposite is true, too -- you can use the same descriptor to serialize a struct back to JSON.
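As a rough sketch of the serializing direction: a descriptor table listing each field's JSON name, offset, and type is enough to walk the struct and emit JSON. The names below (`field_descr`, `encode`, ...) are invented for illustration and are not the library's API; string escaping is left out to keep it short.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names throughout; not the library's actual API. */
enum field_type { FIELD_STRING, FIELD_INT };

struct field_descr {
    const char *json_name;
    size_t offset;
    enum field_type type;
};

struct SomeStruct {
    char *some_string;
    int some_number;
};

static const struct field_descr some_struct_descr[] = {
    { "some-string", offsetof(struct SomeStruct, some_string), FIELD_STRING },
    { "some-number", offsetof(struct SomeStruct, some_number), FIELD_INT },
};

/* Writes a JSON object into `buf`. Returns its length, or -1 if it does
 * not fit. Strings are NOT escaped in this sketch. */
static int encode(const struct field_descr *descr, size_t n_descr,
                  const void *in, char *buf, size_t size)
{
    size_t used = 0;

    if (size < 2)
        return -1;
    buf[used++] = '{';

    for (size_t i = 0; i < n_descr; i++) {
        const void *field = (const char *)in + descr[i].offset;
        const char *sep = i ? "," : "";
        int r;

        if (descr[i].type == FIELD_STRING)
            r = snprintf(buf + used, size - used, "%s\"%s\":\"%s\"", sep,
                         descr[i].json_name, *(char *const *)field);
        else
            r = snprintf(buf + used, size - used, "%s\"%s\":%d", sep,
                         descr[i].json_name, *(const int *)field);

        if (r < 0 || (size_t)r >= size - used)
            return -1; /* output was truncated */
        used += (size_t)r;
    }

    if (size - used < 2)
        return -1; /* no room for '}' and the terminator */
    buf[used++] = '}';
    buf[used] = '\0';
    return (int)used;
}
```

For a struct holding `"hi"` and `7`, this would produce `{"some-string":"hi","some-number":7}`.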
I've been maintaining it outside Zephyr for a while, although with different constraints (I'm not using it for an embedded system where memory is golden): https://github.com/lpereira/lwan/blob/master/src/samples/tec...