Modular JavaScript bindings from Rust

I've been working on a Rust library for time series data analysis which comes with both Python and JavaScript bindings. The JavaScript bindings are generated from a single Rust crate in the Cargo workspace. That has been OK so far, but as the scope of the project has grown (from just forecasting originally, to outlier detection, clustering, changepoint detection, and more), the WASM bundle has grown to about 1MB, which is... not enormous, but definitely not ideal.

It's particularly annoying when users only want to use a tiny fraction of the library, but must load the entire WASM bundle first. What I'd really like is for my JS library's package.json to look something like this:

{
  "name": "@bsull/augurs",
  "version": "0.6.0",
  "files": [
    "*.wasm",
    "*.js",
    "*.d.ts",
    "snippets/"
  ],
  "main": "core.js",
  "exports": {
    ".": "./core.js",
    "./clustering": "./clustering.js",
    "./dtw": "./dtw.js",
    "./prophet": "./prophet.js",
    "./outlier": "./outlier.js"
  },
  "types": "augurs.d.ts"
}

Users could then import the parts of the library they need like this:

import { Prophet } from '@bsull/augurs/prophet';

I can think of a few ways to do this:

1. Manually split the JS crate into multiple crates

This approach appears to be the most straightforward. Rather than the JS crate being a single crate with each set of bindings in a module, we split it into multiple crates, each with a single module. We can then use wasm-pack with each crate to generate the JS bindings for that module, shove them all into a single directory, manually generate the package.json file, and we're done.
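The "manually generate the package.json" step could be scripted. Here's a minimal sketch, assuming the output of each per-crate wasm-pack build has already been copied into one directory as `<module>.js`. The `buildPackageJson` helper and its arguments are hypothetical names for illustration, not something wasm-pack provides.

```javascript
// Hypothetical helper: build the package.json contents for a set of
// per-module bindings that have been collected into one directory.
// Assumes each module `foo` has its entry point at `./foo.js`.
function buildPackageJson(name, version, rootModule, subModules) {
  const exports = { ".": `./${rootModule}.js` };
  for (const mod of subModules) {
    // Each submodule becomes its own subpath export, e.g. "./prophet".
    exports[`./${mod}`] = `./${mod}.js`;
  }
  return {
    name,
    version,
    files: ["*.wasm", "*.js", "*.d.ts", "snippets/"],
    main: `${rootModule}.js`,
    exports,
  };
}

const pkg = buildPackageJson("@bsull/augurs", "0.6.0", "core", [
  "clustering",
  "dtw",
  "prophet",
  "outlier",
]);

// Serialize with two-space indentation, ready to be written to package.json.
console.log(JSON.stringify(pkg, null, 2));
```

Generating the exports map from a list of module names means adding a new module is a one-line change, though it does nothing about the duplication problem described next.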

This is fine, but it's a bit of a pain to maintain (each original Rust crate has to have a corresponding JS crate). Not only that, but each WASM module is self-contained, so if a user wants to use more than one module, there's a bunch of duplication between them (e.g. all the WASM machinery, serde stuff, tracing, etc. is duplicated). So the overall bundle size is probably larger than what we started with, but if someone only wants to use outlier detection, they save bandwidth. Great.

But. Not all of the modules are self-contained. For example, both the clustering and dtw modules use a shared DistanceMatrix type, which is intended to be opaque to users: it's returned from the distance matrix calculation functions in dtw and consumed by the clustering functions in clustering. This will only work if the WASM modules know how to talk to each other, which they don't. Passing an object returned from one module to another isn't possible - they each have their own memory space. We'd need to serialize the data in one module and deserialize it in the other in order to pass it between them, which can be quite slow if the data is large.
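To make the memory problem concrete, here's a small sketch that uses two raw `WebAssembly.Memory` objects as stand-ins for the memories of two separately built modules. It isn't real wasm-pack output, and the `copyF64` helper and the 2x2 matrix are made up for illustration; the point is just that data written into one memory has to be copied to be visible in the other.

```javascript
// Two independent linear memories, standing in for the memories that two
// separately compiled WASM modules would each own (1 page = 64 KiB).
const dtwMemory = new WebAssembly.Memory({ initial: 1 });
const clusteringMemory = new WebAssembly.Memory({ initial: 1 });

// Pretend the dtw module wrote a 2x2 distance matrix at byte offset 0.
const len = 4;
new Float64Array(dtwMemory.buffer, 0, len).set([0, 1.5, 1.5, 0]);

// The same offset in the other memory holds unrelated data (zeros here), so
// handing clustering a "pointer" from dtw would be meaningless.
const beforeCopy = Array.from(new Float64Array(clusteringMemory.buffer, 0, len));

// Hypothetical helper: copy `len` f64 values from one memory to another.
function copyF64(from, fromOffset, to, toOffset, len) {
  const source = new Float64Array(from.buffer, fromOffset, len);
  const dest = new Float64Array(to.buffer, toOffset, len);
  dest.set(source);
  return dest;
}

const copied = copyF64(dtwMemory, 0, clusteringMemory, 0, len);
console.log(beforeCopy, Array.from(copied));
```

The copy is linear in the size of the data, and a distance matrix grows quadratically with the number of series, so this gets expensive quickly.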

It's certainly an option, but it's not perfect.

2. Create a WASM component for each piece of functionality

This feels exactly like the WASM component model's raison d'être: allowing multiple core WASM modules to talk to each other. The idea would be to create a separate WASM component for each module, starting with an interface defined using WIT, then using cargo-component to generate the bindings and implementing them in Rust. These would supersede the existing JS crate; we could then use jco to generate the JS bindings for each component.

The feasibility of this approach depends entirely on how jco generates the initialization code for the WASM modules. It's not clear to me how it would know about the dependencies between the modules, or how it would handle one module depending on another. Ideally it'd load exactly what it needs for any given module, but I don't yet know how to make it do that.

For example, in the situation above, both dtw and clustering depend on DistanceMatrix, but DistanceMatrix is defined in the core module. If someone imports clustering, I'd want the bindings to load and instantiate the core and clustering modules. Then if someone imports dtw, I'd want the bindings to load and instantiate only the dtw module, and use the existing core module.
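The loading behaviour I have in mind looks something like the sketch below. This is a hand-rolled loader, not anything jco generates: the `dependencies` graph, `createLoader`, and the fake `instantiate` callback are all hypothetical, and a real version would fetch and instantiate WASM instead of pushing names onto an array.

```javascript
// Hypothetical dependency graph: which modules each module needs first.
const dependencies = {
  core: [],
  clustering: ["core"],
  dtw: ["core"],
};

// Build a loader that instantiates each module at most once.
// `instantiate(name, deps)` is a stand-in for fetching and instantiating
// the WASM for `name`; it may return a value or a promise.
function createLoader(dependencies, instantiate) {
  const cache = new Map(); // module name -> promise of the instance
  function load(name) {
    if (!cache.has(name)) {
      // Start (or reuse) the load of every dependency first.
      const deps = (dependencies[name] ?? []).map(load);
      cache.set(
        name,
        Promise.all(deps).then((resolved) => instantiate(name, resolved)),
      );
    }
    return cache.get(name);
  }
  return load;
}

// Fake instantiation that just records the order in which modules load.
const instantiated = [];
const load = createLoader(dependencies, (name) => {
  instantiated.push(name);
  return { name };
});

// Importing clustering pulls in core; importing dtw afterwards reuses it.
const done = load("clustering")
  .then(() => load("dtw"))
  .then(() => instantiated);
done.then((order) => console.log(order)); // core, clustering, dtw
```

Caching the promise rather than the resolved instance means two imports that race each other still share a single instantiation of core.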

I've yet to find out if this is possible, but I'll write more about it if I do.

One way I think it might work is by having a final wrapper WASM component which imports and re-exports the other components. This way, the dependency tree of the modules would be known to jco, so it would hopefully be able to generate optimal bindings. The last time I tried this, it didn't work for two reasons:

First, the initialization code generated by jco was very eager to load all of the WASM modules, even if they weren't needed, which is no better than the approach we're currently taking. It is possible to modify the instantiation code somewhat (mentioned in the instantiation docs) but I struggled to do anything meaningful here.

Second, the Rust bindings generated by cargo-component produced separate Rust modules with separate types for each interface, so I would have had to write a ton of boilerplate to convert types between those expected by each module. I asked about this on the jco Zulip and it sounded like this might be fixable by first defining the types in some separate shared WIT file. I'll probably try that next.

And... I think that's it? If you know any other ways to do this, please let me know! My contact details are on the about page.