16 Aug 2020, 18:28

Getting Started With Handling TypeScript ASTs

TypeScript provides a language which builds on top of JavaScript, adding static types to help catch type mismatches at compile time (and even whilst you write!). This is useful from a development perspective as (without wanting to dive into the dynamic vs static typing debate) it makes it easier to reason about the data flowing through your programs, documenting the shape of data at each stage.

TypeScript is powerful in that it has a solid level of inference, which means you don’t need to strictly type every variable. For variable assignment it will take what’s called a best common type; this means if you provide an integer or float, it will infer a number type, but also if you have an array with multiple types it will type the variable as an array with those possible types (e.g. let arr = ["1", 2, null] would type to (string | number | null)[]). Alongside this, it also provides what you could call the ‘inverse’ of this: contextual typing. Contextual typing means that the type can be implied by the location and usage. For example, when you assign a function to environment callbacks like window.onmousemove in the browser, TypeScript can imply that the function’s parameter is of type MouseEvent because it knows onmousemove takes a function with this as its first parameter (the type is inferred from the left-hand side of the assignment).
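
A small sketch of both kinds of inference may help here. The describe helper and the Handler type below are hypothetical names made up for illustration, and Handler only stands in for the browser’s real onmousemove signature so that the snippet can run outside a browser:

```typescript
// Best common type: with strict null checks on, arr is inferred as (string | number | null)[]
const arr = ["1", 2, null];

// Hypothetical helper: accepts exactly the union that was inferred above
function describe(value: string | number | null): string {
  return value === null ? "null" : typeof value;
}

const kinds = arr.map(describe);

// Contextual typing: the parameter type of the arrow function is never written out,
// it is taken from the Handler type on the left-hand side of the assignment
type Handler = (event: { clientX: number; clientY: number }) => string;

const onMove: Handler = (event) => `${event.clientX},${event.clientY}`;

console.log(kinds.join(", "));                      // string, number, null
console.log(onMove({ clientX: 10, clientY: 20 }));  // 10,20
```

Hovering over arr and event in an editor should show the inferred types, even though neither was annotated.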

When the TypeScript compiler compiles your code, it creates an Abstract Syntax Tree (AST) of it. Essentially, an AST can be thought of as a tree representation of the syntax of your source code, with each node being a data structure representing a construct in the corresponding source code. The tree is complete with nodes for all the elements of the source code.

For example, if we took the simple TypeScript program:

console.log("hello world");

We would end up with an AST structure that could be represented textually in this form:

SourceFile
  ExpressionStatement
    CallExpression
      PropertyAccessExpression
        Identifier
        Identifier
      StringLiteral
  EndOfFileToken

Here, SourceFile represents the file itself, which has two child nodes: an ExpressionStatement (console.log("hello world");) and the EndOfFileToken, representing the end of the file. The ExpressionStatement comprises one CallExpression, which has one PropertyAccessExpression composed of two Identifiers: console and its property log. We also have the StringLiteral in the call expression which is, in this case, "hello world".

Each of these items in our tree is an AST node, which can be traversed and has a series of metadata attached to it. This data can be useful or interesting to us as developers. As an elementary example, we can determine the kind of the node; in TypeScript this is represented as an enum called SyntaxKind. To demonstrate, the value 243 maps to FunctionDeclaration in the SyntaxKind enum (the exact numbers can shift between TypeScript versions, so compare against the enum member rather than the raw number). This simple use only scratches the surface of what we can do, however, and as we go deeper into this blog post you’ll see the types of things we can do with the TypeScript AST.
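
To make the idea of node kinds concrete, here is a minimal sketch that uses a hand-built tree rather than the real compiler API. MiniSyntaxKind, MiniNode, leaf and renderTree are hypothetical stand-ins for TypeScript’s own types; the only behaviour borrowed from TypeScript itself is that a numeric enum can be indexed by value to get a member’s name:

```typescript
// Hypothetical, heavily reduced version of the SyntaxKind enum
enum MiniSyntaxKind {
  SourceFile,
  ExpressionStatement,
  CallExpression,
  PropertyAccessExpression,
  Identifier,
  StringLiteral,
  EndOfFileToken,
}

interface MiniNode {
  kind: MiniSyntaxKind;
  children: MiniNode[];
}

const leaf = (kind: MiniSyntaxKind): MiniNode => ({ kind, children: [] });

// Hand-built tree for: console.log("hello world");
const tree: MiniNode = {
  kind: MiniSyntaxKind.SourceFile,
  children: [
    {
      kind: MiniSyntaxKind.ExpressionStatement,
      children: [
        {
          kind: MiniSyntaxKind.CallExpression,
          children: [
            {
              kind: MiniSyntaxKind.PropertyAccessExpression,
              children: [leaf(MiniSyntaxKind.Identifier), leaf(MiniSyntaxKind.Identifier)],
            },
            leaf(MiniSyntaxKind.StringLiteral),
          ],
        },
      ],
    },
    leaf(MiniSyntaxKind.EndOfFileToken),
  ],
};

// Depth-first walk that indents each kind name by two spaces per level
function renderTree(node: MiniNode, depth = 0, lines: string[] = []): string[] {
  lines.push("  ".repeat(depth) + MiniSyntaxKind[node.kind]);
  for (const child of node.children) {
    renderTree(child, depth + 1, lines);
  }
  return lines;
}

console.log(renderTree(tree).join("\n"));
```

The output has the same shape as the textual tree shown earlier, and the MiniSyntaxKind[node.kind] lookup is the same reverse-mapping trick that turns a numeric kind into a readable name.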

To get more specific, here are some ideas of things you could do by traversing and manipulating ASTs:

  • Write documentation in markdown or HTML for your TypeScript code
  • Edit code programmatically by using the manipulation features
  • Automatically update code between new versions of a library or framework
  • Write new code, leveraging something like code-block-writer
  • Write types for the return payloads of your APIs for your frontend code
  • Write a custom linter for your TypeScript code

Now, let’s first dive into how we can explore ASTs using the TypeScript compiler API.

Making sense of TypeScript’s AST

The typescript package provides a compiler API, which allows you to access the AST nodes. In theory, if we wanted to print the above AST we could do so using the typescript module itself like so:

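import * as ts from 'typescript';

const code = "console.log('hello world')";

// The third argument is the language version used to parse the code
const sourceFile = ts.createSourceFile('temp.ts', code, ts.ScriptTarget.Latest);
let indent = 0;

function printTree(node: ts.Node) {
    // SyntaxKind is a numeric enum, so indexing it by value gives the kind's name
    console.log(new Array(indent + 1).join(' ') + ts.SyntaxKind[node.kind]);
    indent++;
    ts.forEachChild(node, printTree);
    indent--;
}
printTree(sourceFile);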

Here we create a temporary source file in memory using the code string, then traverse down the tree to print out the kind of each child node. This will print out the tree depicted above. Let’s take this a small step further and use the node APIs to only log out the program’s syntactic elements when they are string literals:

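import * as ts from 'typescript';

const code = "console.log('hello world')";

// The third argument is the language version used to parse the code
const sourceFile = ts.createSourceFile('temp.ts', code, ts.ScriptTarget.Latest);

function printTree(node: ts.Node) {
    if (ts.isStringLiteral(node)) {
        // node.text holds the literal's contents without the surrounding quotes
        console.log("Text:", node.text);
    }
    ts.forEachChild(node, printTree);
}
printTree(sourceFile);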

This program would log out hello world, as it ignores any node that isn’t a string literal.

This is perhaps one of the most rudimentary things we could do with the compiler API, but there’s a whole host of use cases and more detailed dives into the TypeScript Compiler API in the TypeScript GitHub documentation.

Introducing ts-morph

Some might argue that the TypeScript compiler API can be slightly cumbersome to work with. ts-morph is a package that wraps around the compiler API to provide a smoother and easier way to deal with the TypeScript code.

Let’s rewrite the first program we wrote using the TypeScript compiler API:

import { Project, Node } from "ts-morph";

const project = new Project();
const code = "console.log('hello world')";
const sourceFile = project.createSourceFile("temp.ts", code);

let indent = 0;

function printTree(node: Node) {
    console.log(new Array(indent + 1).join(' ') + node.getKindName());
    indent++;
    node.forEachChild(printTree);
    indent--;
}
printTree(sourceFile);

The main difference here is that instead of calling functions on the ts module, ts-morph wraps each node in an object that has those methods attached. As such we could turn the second program into:

import { Project, Node } from "ts-morph";

const project = new Project();
const code = "console.log('hello world')";
const sourceFile = project.createSourceFile("temp.ts", code);

// The callback is called for every child node, so the parameter is a Node rather than a SourceFile
function printTree(node: Node) {
    if (node.getKindName() === 'StringLiteral') {
        console.log("Text:", node.getFullText());
    }
    node.forEachChild(printTree);
}
printTree(sourceFile);

Putting ts-morph to work

Taking ts-morph a bit further, we can relatively straightforwardly go about doing more interesting things. Say we wanted to find all the classes in a file that contain a certain property name; we could write a function that does that like so:

import { ClassDeclaration, SourceFile } from "ts-morph";

const findClassesWithPropertyName = (sourceFile: SourceFile, name: string) => {

    const classes = sourceFile.getClasses();
    const classesWithProperty: ClassDeclaration[] = [];

    // Check every class in the file for a property with a matching name
    for (let i = 0; i < classes.length; i++) {
        const tsClass = classes[i];
        const matches = tsClass.getProperties().map((p) => p.getName()).includes(name);
        if (matches) {
            classesWithProperty.push(tsClass);
        }
    }

    return classesWithProperty;

}

Let’s think of another example program; in this case, we are interested in determining the depth of the deepest nested function within a source code file which contains top-level functions. We could achieve this using something like the following code:

import { Node, SourceFile } from "ts-morph";

const getDeepestFunction = (sourceFile: SourceFile) => {
    let deepest = 0;

    // Each stack entry records how many functions enclose the node
    const stack: { node: Node; depth: number }[] = [{ node: sourceFile, depth: 0 }];

    while (stack.length) {
        const { node, depth } = stack.pop()!;
        const kind = node.getKindName();

        // Entering a function puts this node and its descendants one level deeper
        const isFunction = kind === 'FunctionDeclaration' || kind === 'ArrowFunction';
        const currentDepth = isFunction ? depth + 1 : depth;

        if (currentDepth > deepest) {
            deepest = currentDepth;
        }

        for (const child of node.getChildren()) {
            stack.push({ node: child, depth: currentDepth });
        }
    }

    return deepest;
}

Here you can see we create a stack and iterate through the child nodes, adding depth every time we encounter a function declaration or an ES6 arrow function. Each stack entry carries the depth of its enclosing functions, so sibling functions do not add to each other’s depth and the count naturally starts again from zero for each top-level node.

Replacing and updating AST nodes

Lastly, I want to demonstrate how you could rewrite your code’s AST using ts-morph (although the behaviour is basically identical for the core typescript library). Let’s say we have a code snippet that converts 10000000000 bytes to kilobytes, megabytes and gigabytes. However, we notice that each exponent is off by one:

const totalSize = 10000000000;
const totalSizeKB = totalSize / Math.pow(1024,2);
const totalSizeMB = totalSize / Math.pow(1024,3);
const totalSizeGB = totalSize / Math.pow(1024,4);

This is of course slightly contrived and we could certainly fix this issue manually, but it’s useful to demonstrate how we can start doing code transforms and updating our code programmatically. We could write a function that changes the powers from 2, 3 and 4 to be 1, 2 and 3 respectively:

import { SourceFile, ts } from "ts-morph";

const rewriteMemoryPowers = (sourceFile: SourceFile) => {
    return sourceFile.transform(traversal => {
        const node = traversal.visitChildren(); // Here traversal visits children in postorder
        if (ts.isNumericLiteral(node) && node.getText().length === 1) {
            // TypeScript versions before 4.0 expose this as ts.createNumericLiteral instead
            return ts.factory.createNumericLiteral(String(parseInt(node.getText(), 10) - 1));
        }
        return node;
    });
};

Here we use the transform method, whose callback receives a TransformTraversalControl. This, in turn, allows us to traverse down the AST and replace and update nodes; we call visitChildren to make sure we traverse down the child nodes (here ‘traversal’ also has a currentNode property if you are not interested in the node’s children).

The program goes through each node and its children, determines if it is a numeric literal, and if it is and has a length of 1 (e.g. 2, 3, 4) then we decrement it by one and create a new numeric literal in its place. That should fix our issue! Arguably the way this is coded isn’t super robust, but the aim here is to demonstrate how you can use the transform method to recurse down the AST and update the behaviour of the program by changing its nodes.
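
If the post-order idea feels abstract, the following sketch reproduces it on a tiny hand-rolled tree with no dependencies. MiniNode, transformTree, decrementSingleDigit and printNode are hypothetical names and this is not how ts-morph is implemented; it only mirrors the pattern of visiting children first and then deciding whether to replace the current node:

```typescript
// Hypothetical miniature AST, not ts-morph's real node types
type MiniNode =
  | { kind: "NumericLiteral"; text: string }
  | { kind: "Identifier"; text: string }
  | { kind: "CallExpression"; callee: MiniNode; args: MiniNode[] };

// Post-order: rebuild the children first, then hand the rebuilt node to the visitor
function transformTree(node: MiniNode, visit: (n: MiniNode) => MiniNode): MiniNode {
  if (node.kind === "CallExpression") {
    const rebuilt: MiniNode = {
      kind: "CallExpression",
      callee: transformTree(node.callee, visit),
      args: node.args.map((arg) => transformTree(arg, visit)),
    };
    return visit(rebuilt);
  }
  return visit(node);
}

// Same rule as in the post: single-digit numeric literals are decremented by one
const decrementSingleDigit = (n: MiniNode): MiniNode =>
  n.kind === "NumericLiteral" && n.text.length === 1
    ? { kind: "NumericLiteral", text: String(parseInt(n.text, 10) - 1) }
    : n;

// Turns the tree back into source text
function printNode(node: MiniNode): string {
  return node.kind === "CallExpression"
    ? `${printNode(node.callee)}(${node.args.map(printNode).join(", ")})`
    : node.text;
}

const powCall: MiniNode = {
  kind: "CallExpression",
  callee: { kind: "Identifier", text: "Math.pow" },
  args: [
    { kind: "NumericLiteral", text: "1024" },
    { kind: "NumericLiteral", text: "2" },
  ],
};

const fixedCall = printNode(transformTree(powCall, decrementSingleDigit));
console.log(fixedCall); // Math.pow(1024, 1)
```

Note that 1024 is left alone because its text is four characters long, which is the same length check the ts-morph version relies on.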

Saving and emitting to disk

If you have made transforms to your code, or you’re interested in converting it to JavaScript, saving and emitting will be useful features. Thankfully with ts-morph, this is reasonably straightforward. In our previous example, if we wanted to save the file to disk, we could do something like so:

const sourceFile = project.createSourceFile("conversion.ts", code);
rewriteMemoryPowers(sourceFile).saveSync();

Here we use saveSync, but you can use save if you would like that to be asynchronous. On a similar note, assuming we wanted to emit the compiled JavaScript file, we could do:

const sourceFile = project.createSourceFile("conversion.ts", code);
rewriteMemoryPowers(sourceFile).emitSync();

This will write a conversion.js file to disk. Here we have been using in-memory files up until the point we save them, but we can also read files from disk, and even whole directories if we so wish. See the ts-morph docs on navigation and directories.

Conclusion

Hopefully, this blog post has shown you how you can get started with exploring TypeScript ASTs and how you can manipulate their data structures in useful ways.

An aim here was that the ideas at the beginning of this post give some inspiration for what might be possible. If there is interest I can look into exploring one of these examples in a further blog post.

24 May 2020, 18:28

AssemblyScript: Passing Data to and From Your WebAssembly Program

AssemblyScript takes a strict subset of TypeScript and allows it to be compiled to WebAssembly. This is a very persuasive selling point for developers familiar with JavaScript and TypeScript as it immediately allows them to transfer their skills to be able to write WebAssembly programs. This is exciting as WebAssembly has proved useful for things ranging from game engines such as Unity to design tools such as Figma.

In this short blog post, we will look at how you can pass data from JavaScript to WebAssembly programs created using AssemblyScript. This blog will assume you’ve followed the quick start guide and that you are familiar with npm and JavaScript.

As it currently stands the only types WebAssembly supports are integers and floats, which are stored in linear memory (a single contiguous address space of bytes) for the WebAssembly program. As a language, AssemblyScript works with us in abstracting away some of the complexity of managing more complex types like strings and arrays. A common use-case for AssemblyScript might be to write a WebAssembly module and then make use of that within a JavaScript runtime. At some point, it will probably be necessary to pass some data from the JavaScript code to the WebAssembly module and/or back again. Let’s take a look at how we go about passing various data types to AssemblyScript.

Numbers

Passing numbers to our program is straightforward and doesn’t require any special treatment. We can achieve this by just passing number values directly to our WebAssembly module function. You can use i32 in your AssemblyScript code for integers and f32 for float types, like so:


    // AssemblyScript
    export function add(a: i32, b: i32): i32 {
      return a + b;
    }

    export function addFloats(a: f32, b: f32): f32 {
      return a + b;
    }


    // JavaScript
    const {
        add,
        addFloats
    } = wasmModule.exports;

    const result = add(1, 2);
    // result will be 3

    const floatResult = addFloats(1.5, 2.5);
    // floatResult will be 4
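
One thing worth keeping in mind is that JavaScript numbers are 64-bit floats, so values are converted when they cross into i32 or f32 parameters. The sketch below imitates those conversions in plain TypeScript without loading a module; toI32 and toF32 are hypothetical helper names, built on the standard | 0 and Math.fround operations, which I am assuming match what happens at the WebAssembly boundary:

```typescript
// An i32 parameter keeps only a 32-bit signed integer:
// the fractional part is dropped and out-of-range values wrap around
const toI32 = (value: number): number => value | 0;

// An f32 parameter is rounded to the nearest single-precision float
const toF32 = (value: number): number => Math.fround(value);

console.log(toI32(1.9));               // 1
console.log(toI32(Math.pow(2, 31)));   // -2147483648
console.log(toF32(1.5) + toF32(2.5));  // 4
console.log(toF32(0.1) === 0.1);       // false
```

In other words, add(1.9, 1) would be expected to return 2 rather than 2.9, and values such as 0.1 lose a little precision when they pass through an f32.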

This is great, especially if we don’t need to deal with any other types. However if our program gets more complex we may need to start dealing with other types.

Introducing the loader

As mentioned previously, WebAssembly as it stands only deals with number types. So how can we go about passing JavaScript data types like strings and arrays to our WebAssembly programs? One solution is to use the AssemblyScript loader, which simplifies the process of loading more complex data types into the WebAssembly memory. The module provides a set of convenience functions to allow loading types like strings and arrays into memory, returning their pointers. It also allows for managing their lifecycle via retaining and releasing them. To get started using the AssemblyScript loader, let’s install it into our project using npm:

npm install @assemblyscript/loader

Once we’ve compiled our AssemblyScript program to a wasm file, we will want to use this in our web application.

Let’s start with how we go about instantiating our program (here we will be in a Node environment):


    const fs = require("fs");
    const loader = require("@assemblyscript/loader");
    const buf = fs.readFileSync('./build/optimized.wasm');
    const wasm = new WebAssembly.Module(new Uint8Array(buf));
    loader.instantiate(wasm, { 
      env: { 
        abort: (err) => {
          console.error(err)
        }
      }
    }).then((wasmModule) => {

      console.log(wasmModule.exports);
      // Code to use the instantiated wasm module

    });

Strings

For more complex types like strings, we can leverage the loader. Strings in AssemblyScript are immutable, and hence we can’t change a string once we’ve passed its pointer to the AssemblyScript function. We could, however, return a pointer to a newly constructed string value. In this case, we’ll replace ‘hello’ in a string with ‘hi’ in the string and return a new string pointer, and then read it with the __getString method:


    // AssemblyScript
    export function replaceHelloWithHi(a: string): string {
      return a.replace("hello", "hi");
    }


    // JavaScript
    const {
        __retain,
        __allocString,
        __getString,
        __release,
        replaceHelloWithHi
    } = wasmModule.exports;
    const originalStr = "hello world";
    const ptr = __retain(__allocString(originalStr));
    const newPtr = replaceHelloWithHi(ptr);
    const newStr = __getString(newPtr);
    // Release the pointers, not the JavaScript string copied out of memory
    __release(ptr);
    __release(newPtr);
    console.log(newStr);
    // logs out 'hi world'
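
To make the pointer idea a little more tangible, here is a rough sketch of a string being written into, and read back out of, a block of linear memory. It is not the loader’s actual implementation: allocString, getString, the bump pointer and the length-prefixed layout are all assumptions made for illustration, with the text stored as UTF-16 code units:

```typescript
// Stand-in for the module's linear memory
const memory = new ArrayBuffer(256);

// Hypothetical bump allocator: the next free byte offset (kept 4-byte aligned)
let nextFree = 16;

// Writes a length header followed by UTF-16 code units, returning the pointer
function allocString(text: string): number {
  const ptr = nextFree;
  new Uint32Array(memory, ptr, 1)[0] = text.length;
  const units = new Uint16Array(memory, ptr + 4, text.length);
  for (let i = 0; i < text.length; i++) {
    units[i] = text.charCodeAt(i);
  }
  // Advance past the header and the payload, rounded up to a multiple of 4 bytes
  nextFree = ptr + 4 + Math.ceil((text.length * 2) / 4) * 4;
  return ptr;
}

// Reads the string back out of memory from its pointer
function getString(ptr: number): string {
  const length = new Uint32Array(memory, ptr, 1)[0];
  const units = new Uint16Array(memory, ptr + 4, length);
  let out = "";
  for (let i = 0; i < length; i++) {
    out += String.fromCharCode(units[i]);
  }
  return out;
}

const ptr = allocString("hello world");
// Strings are immutable, so the replacement is written to a new location
const newPtr = allocString(getString(ptr).replace("hello", "hi"));
const roundTripped = getString(newPtr);

console.log(ptr, newPtr, roundTripped); // 16 44 hi world
```

The two pointers differ because the original string is left untouched, which is the same reason replaceHelloWithHi hands back a new pointer.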

Arrays

If we have a regular untyped array on our JavaScript side, we’ll still need to allocate a typed array on the WebAssembly side; we can use an array of the AssemblyScript i32 type for this. To allocate it, the loader needs the array type’s ID, which we can get using the idof function. For Typed Arrays we can use the same approach with the appropriate Typed Array type, in this case, Int32Array. We use idof like so:


    // AssemblyScript
    export const i32ArrayId = idof<i32[]>()
    export const Int32ArrayId = idof<Int32Array>()

Now that we have the IDs, we can pass them to the appropriate functions from the AssemblyScript loader. We will need to allocate an array in the module’s memory and retain it to make sure it doesn’t get collected prematurely. Let’s work through this for the example of summing an array:


    // AssemblyScript
    export function sumArray(arr: i32[]): i32 {
      let sum: i32 = 0;
      for (let i: i32 = 0; i < arr.length; i++) {
        sum = sum + arr[i];
      }
      return sum;
    }


    // JavaScript
    const { 
        __retain, 
        __allocArray,
        __release,
        i32ArrayId,
        Int32ArrayId,
        sumArray
    } = wasmModule.exports;


    // Untyped arrays

    const arrayPtr = __retain(__allocArray(i32ArrayId, [1, 2, 3]))
    const sum = sumArray(arrayPtr);

    __release(arrayPtr);

    // Now for TypedArrays

    const typedArrayPtr = __retain(__allocArray(Int32ArrayId, new Int32Array([1, 2, 3])))
    const typedSum = sumArray(typedArrayPtr);


    __release(typedArrayPtr);

Are there any other approaches?

The loader is deliberately quite minimalist and not as abstracted as it potentially could be. If you are looking for something simpler, I would definitely recommend taking a look at Aaron Turner’s as-bind library, which streamlines the process. For example, we can reduce the string example to the following code:


    // JavaScript
    import fs from "fs";
    import { AsBind } from "as-bind";
    const wasm = fs.readFileSync("./build/optimized.wasm");

    (async () => {
      const asBindInstance = await AsBind.instantiate(wasm);

      // replace is case-sensitive, so we pass a lowercase 'hello'
      const response = asBindInstance.exports.replaceHelloWithHi("hello world!");
      console.log(response); // hi world!
    })();

08 May 2020, 18:28

Writing Web Workers in TypeScript

TypeScript has taken the web development world by storm, and I too am a fan. Unfortunately what I’m not a fan of is contention on the main thread, which has increased over time as we ship more and more JavaScript to our pages.

I’ve written in previous posts about Web Workers, but for those of you not familiar, they allow the developer to move work off the main thread and into a separate thread of execution. These work great for tasks that often block, such as processing large amounts of data. This is very common in audio, gaming and mapping applications, for example. We can also leverage them for more generic work, and Surma has done a great job of explaining why that is an important consideration for web developers.

In this post, I want to show how you can write Workers in TypeScript and build them using the popular bundler Webpack. The first step we need to take is to install all the modules we need via npm as development dependencies. We can do this from our command line like so:

npm install webpack webpack-cli worker-loader typescript ts-loader --save-dev

We also need to set up a TypeScript configuration file, tsconfig.json, in our root directory. We can do a rudimentary implementation like this:

{
    "compilerOptions": {
      "outDir": "./dist/",
      "noImplicitAny": true,
      "module": "es6",
      "target": "es5",
      "allowJs": true,
      "sourceMap": true
    }
}

You can adjust this to your tastes, but this is a barebones starter to get going. Next, let’s set up the webpack.config.js file, again in our root directory, to configure Webpack and allow us to build our application and worker:

const path = require("path");

module.exports = {
  mode: "development",
  entry: "./src/index.ts",
  devtool: "inline-source-map",
  module: {
    rules: [
      // Handle TypeScript
      {
        test: /\.tsx?$/,
        use: "ts-loader",
        exclude: [/node_modules/],
      },
      // Handle our workers
      {
        test: /\.worker\.js$/,
        use: { loader: "worker-loader" },
      },
    ],
  },
  resolve: {
    extensions: [".ts", ".js"],
  },
  output: {
    // This is required so the workers know where to be loaded from
    publicPath: "/dist/",
    filename: "bundle.js",
    path: path.resolve(__dirname, "dist/"),
  },
};

This covers the build step side of things; now we can look at our code itself. Let’s assume we have a src folder for our source code, and a dist folder for our compiled code. The first thing we’ll want to do is set up types for the Workers so that TypeScript doesn’t complain:

// types.d.ts
declare module "worker-loader!*" {
    class WebpackWorker extends Worker {
      constructor();
    }

    export default WebpackWorker;
}

Now, let’s write a Worker. As an example of a large workload, this Worker will generate primes using the Sieve of Eratosthenes and return them to the main thread:

// worker.ts

// We alias self to ctx and give it our newly created type
const ctx: Worker = self as any;

class SieveOfEratosthenes {

    // This is the logic for giving us back the primes up to a given number
    calculate(limit: number) {

      const sieve = [];
      const primes: number[] = [];
      let k;
      let l;

      sieve[1] = false;
      for (k = 2; k <= limit; k += 1) {
        sieve[k] = true;
      }

      for (k = 2; k * k <= limit; k += 1) {
        if (sieve[k] !== true) {
          continue;
        }
        for (l = k * k; l <= limit; l += k) {
          sieve[l] = false;
        }
      }

      sieve.forEach(function (value, key) {
        if (value) {
          this.push(key);
        }
      }, primes);

      return primes;

    }

}

// Set up a new prime sieve once on instantiation
const sieve = new SieveOfEratosthenes();

// Listen for messages from the main thread
ctx.addEventListener("message", (event) => {

    // Get the limit from the event data
    const limit = event.data.limit;

    // Calculate the primes
    const primes = sieve.calculate(limit);

    // Send the primes back to the main thread
    ctx.postMessage({ primes });
});

And then back in our main thread we instantiate the worker and send a message asking for the primes up to 1000:

// index.ts

// Note the worker-loader! syntax to keep Webpack happy
import PrimeWorker from "worker-loader!./worker";

const worker = new PrimeWorker();

worker.postMessage({ limit: 1000 });
worker.onmessage = (event) => {
  document.getElementById("primes").innerHTML = event.data.primes;
};
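
Since event.data is untyped on both sides, one option is to describe the message shapes once and keep the sieve logic in a plain function that can be tested without spinning up a Worker. The sketch below is only one way this could be arranged; PrimeRequest, PrimeResponse and handlePrimeRequest are hypothetical names that do not appear in the example repository:

```typescript
// Hypothetical message shapes shared by index.ts and the worker
interface PrimeRequest {
  limit: number;
}

interface PrimeResponse {
  primes: number[];
}

// Pure function holding the worker's logic, so it can run outside a Worker
function handlePrimeRequest(request: PrimeRequest): PrimeResponse {
  const limit = Math.max(request.limit, 0);
  // 0 means "still assumed prime", 1 means "known to be composite"
  const isComposite = new Uint8Array(limit + 1);
  const primes: number[] = [];

  for (let k = 2; k <= limit; k++) {
    if (isComposite[k]) {
      continue;
    }
    primes.push(k);
    // Cross off every multiple of k, starting from k squared
    for (let l = k * k; l <= limit; l += k) {
      isComposite[l] = 1;
    }
  }

  return { primes };
}

const response = handlePrimeRequest({ limit: 30 });
console.log(response.primes.join(", ")); // 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
```

Inside the worker, the message listener could then post back handlePrimeRequest(event.data), and the main thread could treat event.data as a PrimeResponse.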

Now we can build the file. Here we assign a build script, "build": "webpack", in our package.json, which will build the file for us into a dist directory as bundle.js. This can then be referenced inside your webpage of choice.

If you want to see the full working example I’ve posted it to this GitHub repository for you to experiment with.