BY Akash Kava20 Dec 2021 Edit
YantraJS Performance Improvement using Sequence of T

While building parser, we found that we are creating huge list of tokens/nodes to build abstract syntax tree, and from abstract syntax tree we are creating IL code. Each of this list are very huge in size. And while trying to parse script with over 100K lines, parser would take more than 30 seconds.

And while profiling, we found two hot paths taking maximum time.

  1. String.Substring
  2. Array.Resize

We removed String.Substring by introducing StringSpan, this will be covered in a different post.

Array.Resize is called by List<T>.SetCapacity method. So we looked up source code of List<T> to investigate further.

Internals of List

List<T> uses an internal array to store items, and when there is no space left in the array, it resizes the array (which basically creates a larger array and copies all items from old array to new array). Default size of the array is 16, and as the list grows, it keeps on doubling the capacity. So when you are adding 17th item on the array, the capacity is set to 32. And when you are adding 33rd item, the capacity is set to 64 and so on with little variance.

This does not hurt performance for small scripts. But when we started parsing larger scripts, it started slowing down exponentially. As an entire script with too many closures and many nested functions, the tree is very large and memory usage and GC pressure is very high.

So we decided to drop List<T> and we created Sequence<T> which is suitable for append and read once scenario. Most of the times, in parser we are creating a list by appending items but we are only reading them once. However, when array is resized, every node is copied from old array to new array making many enumerations.

Sequence

Sequence<T> stores nodes of smaller array, and when there is no more free space, it creates a new node with a smaller array. Basically Sequence<T> is a linked list of variable sized arrays. Default size is 16, but if you specify an initial capacity, it will create first node of same capacity. `Sequence will not resize any array. The only drawback is indexed access is slower. When a sequence needs to be read, it starts from first node till last and enumerates all array items in the node.

Benefits are,

  1. It does not resize existing array
  2. It does not increase exponentially
  3. Appending is fast
  4. Enumerating is marginally slow but not at all noticeable.

Drawbacks,

  1. Indexed access is slow
  2. Insert is slow

Sequence is a basically a linked list of small arrays. Linked List are not suitable as smaller arrays will be localized while accessing and single node can have multiple array items. When AddRange is called with known size, Sequence will allocate all items at once if they are more than capacity.

BY Akash Kava
LikeCommentSave
LikeCommentSaveShare
0
Categories
General
YantraJS
Developer Guides
Tutorials
Web Atoms Updates

POPULAR POSTS
17 Mar 2021
LATEST ACTIVITY
Simmi Kava
commented this post.
Simmi Kava
liked this post.
Show more
ARCHIVES
2024
2023
2022
2021
TAGS
javascript (56)
developer (25)
javascriptdeveloper (16)
Xamarin.Forms (16)
Html (14)
typescript (12)
webatoms (12)
xamarin (11)
coding (10)
web-atoms (10)
arrays (9)
android (8)
javascript-developer (8)
csharp (7)
dotnet (7)
css (6)
update (6)
dotnet-standard (5)
function (5)
iOS (5)
methods (4)




Web Atoms: JSX (TSX + TypeScript) for Xamarin.Forms, Hot Reload Your App in Production Environment

PlaygroundSamples Repository