RFC: Improving JSON processing performance

bserdar
I've been thinking about using JSON documents in the core and mongo components. There are several problems with using pure JSON:
 1) We keep at least two copies of the doc for insert/update: the original as read from the db or submitted in the request, and the copy that we operate on.
 2) Document processing is always done referencing metadata, and we have to resolve field names to locate corresponding metadata for a node.
 3) During projection, we create a new copy, and that is a bottleneck.

Here's an alternative: instead of using the Jackson JSON tree to represent documents, we could use a custom Document class. This class would use Jackson JsonNode values for everything except objects and arrays, and keep hashmaps and lists for object and array nodes. To manage multiple copies of the same doc, we keep multiple copies of those hashmaps and lists, so copying a document means copying all the containers in the document tree without copying the value nodes.
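
A minimal sketch of the container-only copy, not code from the project. To keep it free of dependencies, plain Java objects stand in for the Jackson value nodes, and DocCopy and copyContainers are hypothetical names:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: objects are Maps, arrays are Lists, and anything
// else is treated as an immutable leaf (a stand-in for a Jackson value node).
final class DocCopy {
    private DocCopy() {}

    // Copies every Map and List in the tree; leaf values are shared by reference.
    static Object copyContainers(Object node) {
        if (node instanceof Map) {
            Map<String, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                copy.put((String) e.getKey(), copyContainers(e.getValue()));
            }
            return copy;
        }
        if (node instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) node) {
                copy.add(copyContainers(item));
            }
            return copy;
        }
        return node; // leaf value: shared, not copied
    }
}
```

Sharing leaves is only safe if value nodes are never mutated in place, so an update has to replace the entry in its container rather than modify the value.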

We associate the metadata with each document node once, so we don't have to look it up for every field.
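
Roughly what attaching metadata once could look like. FieldMd, MdNode and MdBinder are hypothetical names, the dotted-path keys are an assumption about how metadata is addressed, and arrays are left out for brevity:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for one field's metadata entry.
final class FieldMd {
    final String type;
    FieldMd(String type) { this.type = type; }
}

// Hypothetical document node that carries its metadata reference.
final class MdNode {
    final Object value; // leaf value; null for object nodes
    final FieldMd md;   // resolved once, when the node is built
    final Map<String, MdNode> fields = new LinkedHashMap<>();
    MdNode(Object value, FieldMd md) { this.value = value; this.md = md; }
}

final class MdBinder {
    private final Map<String, FieldMd> mdByPath;
    int lookups = 0; // counts name-to-metadata resolutions

    MdBinder(Map<String, FieldMd> mdByPath) { this.mdByPath = mdByPath; }

    // Builds the node tree, resolving metadata exactly once per node.
    MdNode bind(String path, Object raw) {
        lookups++;
        FieldMd md = mdByPath.get(path);
        if (!(raw instanceof Map)) {
            return new MdNode(raw, md);
        }
        MdNode node = new MdNode(null, md);
        for (Map.Entry<?, ?> e : ((Map<?, ?>) raw).entrySet()) {
            String name = (String) e.getKey();
            String childPath = path.isEmpty() ? name : path + "." + name;
            node.fields.put(name, bind(childPath, e.getValue()));
        }
        return node;
    }
}
```

After bind returns, any later processing step reads node.md directly and does not touch the path-to-metadata map again.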

To speed up projection, we keep a boolean at every node so projecting a document becomes setting those flags instead of computing a copy.
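
A rough sketch of flag-based projection, assuming the projection is given as a set of included dotted paths (real projections are richer than that). FlagNode and FlagProjector are hypothetical names:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FlagNode {
    final Object value; // leaf value; null for object nodes
    final Map<String, FlagNode> fields = new LinkedHashMap<>();
    boolean projected;  // set by the projector, read by the serializer
    FlagNode(Object value) { this.value = value; }
}

final class FlagProjector {
    private FlagProjector() {}

    // Sets the flag on every node. A node is kept if it or an ancestor is
    // included, or if any of its descendants is kept.
    static boolean mark(FlagNode node, String path, Set<String> included, boolean inherited) {
        boolean keep = inherited || included.contains(path);
        boolean passDown = keep;
        for (Map.Entry<String, FlagNode> e : node.fields.entrySet()) {
            String childPath = path.isEmpty() ? e.getKey() : path + "." + e.getKey();
            // '|=' always evaluates the right-hand side, so every child is visited.
            keep |= mark(e.getValue(), childPath, included, passDown);
        }
        node.projected = keep;
        return keep;
    }

    // Collects the leaf paths a serializer would emit, skipping unflagged nodes.
    static void visibleLeaves(FlagNode node, String path, List<String> out) {
        if (!node.projected) {
            return;
        }
        if (node.fields.isEmpty()) {
            out.add(path);
            return;
        }
        for (Map.Entry<String, FlagNode> e : node.fields.entrySet()) {
            String childPath = path.isEmpty() ? e.getKey() : path + "." + e.getKey();
            visibleLeaves(e.getValue(), childPath, out);
        }
    }
}
```

Applying a different projection to the same document is another call to mark, which overwrites the flags in place; no nodes are allocated.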

Thoughts?

Re: RFC: Improving JSON processing performance

jewzaam
Administrator
Makes sense, but it is something we already have to some degree with JsonDoc.  Would this new structure replace every place where we manage document structure?  If so, what would be part of it?
* the value: json node, map (object), or array
* the metadata reference
* boolean indicating if the value is projected

To support hooks we could distinguish between old and new value.

What about for associations?

Re: RFC: Improving JSON processing performance

bserdar
This would require changes everywhere the JSON doc structure is
modified. That would include associations, where we insert one JSON
doc into another.

The alternative is to somehow save on metadata lookups. If there's
another, faster way to look up metadata references, that should work
equally well.
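
One possible shape for that, as a sketch only: memoize the path-to-metadata
resolution so each distinct path is resolved once. CachingMdResolver is a
hypothetical name, and the resolver function stands in for whatever does the
lookup today. This version is not thread-safe and does not cache misses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical memoizing wrapper around an existing path-to-metadata lookup.
final class CachingMdResolver<M> {
    private final Function<String, M> slowResolver;
    private final Map<String, M> cache = new HashMap<>();
    int misses = 0; // how many times the slow resolver was actually called

    CachingMdResolver(Function<String, M> slowResolver) {
        this.slowResolver = slowResolver;
    }

    // Returns cached metadata for the path, resolving it on first use.
    M resolve(String path) {
        M md = cache.get(path);
        if (md == null) {
            misses++;
            md = slowResolver.apply(path);
            if (md != null) {
                cache.put(path, md);
            }
        }
        return md;
    }
}
```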

For projections, maybe we can come up with a completely new algorithm.

If we can solve these two, then we don't need to change the structure.

On Mon, Oct 19, 2015 at 1:10 PM, jewzaam [via lightblue-dev]
<[hidden email]> wrote:

> Makes sense, but it is something we already have to some degree with
> JsonDoc.  Would this new structure replace every place where we manage
> document structure?  If so, what would be part of it?
> * the value: json node, map (object), or array
> * the metadata reference
> * boolean indicating if the value is projected
>
> To support hooks we could distinguish between old and new value.
>
> What about for associations?