How to improve Aurelia's watcher performance

Published on Monday, April 10, 2017 at 04:00 in Programming & Tools

Aurelia's CLI has a watcher feature which - in theory - improves your daily work quite a bit: it watches files for changes, recompiles your project automatically and uses browser link to refresh the page you're currently working on. For quite some time the watcher had a rather unpleasant property: it crashed. A lot. Fortunately, that was resolved when a pull request was accepted for the CLI. Disappointment struck when I tested the fix though; yes, the crashes were gone, but the watcher's performance was now horrible. Looking through the code I realized that the fix simply turned off incremental TypeScript builds to get rid of the crash. After a discussion on GitHub, I decided to take a deeper look into this, and - well - here are the results.

Let's start with the numbers:

  • A full "au build" of our current main project took 28 seconds. With an idle watcher, a single file change (independent of the file's type) took 18 seconds to complete. Doing the same but with a change of two files: 34 seconds, which is actually slower than simply invoking a full rebuild manually. Wow.
  • With all of the changes outlined below applied, the improved numbers are:
    • Idle watcher, single .ts file change: 3 seconds
    • Idle watcher, single .html file change: < 1 second
    • Idle watcher, single .less file change: 7 seconds
    • Idle watcher, change of three files (.ts, .html, .less) "simultaneously" ("save all"): 9 seconds

Sounds like a hell of an improvement, right? Here's what I did for that.

Note that all changes are applied on the template level, meaning you only have to work on the project files generated by the Aurelia CLI. You do not have to fiddle with the inner workings of the CLI itself, patch any dependencies or anything similar. It's actually quite straightforward.

2017-04-14: Aurelia team member Jeroen Vinke will pick up these changes and include them in the CLI itself; so if you're patient enough you'll probably see at least some of these in a future version. ;)

Important: When I sat down to work on this, I decided to do my experiments on the build task (build.ts), not on the run task (run.ts). This feature ("build --watch") has also been requested before (see issue #265), and since we are not using the built-in serve feature at all, it was the logical thing to do for me. The following concepts apply in the same way no matter where you put the watcher logic, though. If you prefer to keep it in the run task, you simply need to pass some of the information described below from the run task to the build task.

Step 1: Change the way gulp.watch is used

If you only want to fix the crashes and keep the incremental build features of TypeScript, you need to do the following:

  • Obviously, revert the previous "fix" for the crash to re-enable incremental builds (2c74cfe)
  • Replace the individual gulp.watch calls for each project source with a single gulp.watch that takes an array of globs instead. Since all of the individual watchers triggered the same process anyway, this does not change the behavior in any other way. So basically:
let watch = function() {
  // gulp.watch(project.transpiler.source, refresh).on('change', onChange);
  // gulp.watch(project.markupProcessor.source, refresh).on('change', onChange);
  // gulp.watch(project.cssProcessor.source, refresh).on('change', onChange);

  gulp.watch(
    [
      project.transpiler.source, 
      project.markupProcessor.source, 
      project.cssProcessor.source
    ], refresh).on('change', onChange);
}

But let's go a step further: I recommend switching from the built-in gulp.watch feature to the separate gulp-watch package (subtle difference, but a huge improvement). The built-in feature does not pass on information about individual file changes, which makes further optimization difficult. The changes described in the following steps rely on this information, so if you want that benefit, you need to switch to gulp-watch (or a similar package that supports this).

Another small optimization is to turn off content reading. By default, the watcher reads the file contents, but they are thrown away because the actual build process re-reads them using gulp.src. For gulp-watch, there's official documentation about how to turn content reading off (see options.read). Since both the built-in gulp.watch feature and gulp-watch rely on chokidar, I would suspect that should also work for gulp.watch (although it's not documented there and I didn't test that). Sample:

import * as gulpWatch from "gulp-watch";

let watch = function() {
  return gulpWatch(
    [
      project.transpiler.source,
      project.markupProcessor.source,
      project.cssProcessor.source
    ],
    {
      read: false, // performance optimization: do not read actual file contents
      verbose: true
    },
    (vinyl) => {
      // do something with the info, or simply trigger an incremental build across all sources here
    });
};

Step 2: Add debouncing to the build logic

The above solution fixes the crashes and improves performance for simple scenarios, but still has the "multiple successive builds" problem when more than one file is changed rapidly ("save all"). So I decided to take a look at debouncing the build. Amazingly, this took the most time to get right. The main difficulty is finding a solution that integrates nicely with gulp's function composition and its asynchronous nature. If you look around you'll find quite a few discussions about this and how it should be done (for example in this GitHub issue: gulpjs/gulp#1304). I use the standard "debounce" Node package. If you choose a different package, make sure it triggers on the trailing edge or at least can be configured/forced to do so.

The solutions proposed there all had the problem that they could not be applied easily to the Aurelia CLI, mostly because the underlying gulp wrapper of the CLI apparently makes some assumptions about the task structure and crashes if these are not satisfied (I mostly had problems with the makeInjectable function in Aurelia CLI's gulp.js). Since I didn't want to dig that deep into the code base, I preferred bridging to the gulp world of things with a simple manual solution. It basically looks like this (this is pseudo-code; the full/real code follows below):

import * as debounce from "debounce";
const debounceWaitTime = 100;

let isBuilding = false;
let refresh = debounce(() => {
  if (isBuilding) {
    return;
  }

  isBuilding = true;
  
  triggerActualBuild()
    .then(() => {
      isBuilding = false;
      if (weHaveAnotherPendingBuildRequest) { refresh(); }
    });
  
}, debounceWaitTime);

There are probably nicer solutions than that, but like I said I wanted to fix this on the project template level and not dig into the Aurelia CLI package itself.
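
To make the pseudo-code above a bit more tangible, here is a minimal runnable sketch of the same idea without any dependencies. It is not the code from my build.ts: debounceTrailing is a hand-rolled, hypothetical stand-in for the "debounce" package, triggerActualBuild just simulates a build with a timer, and hasPendingRequest is an assumed flag that plays the role of weHaveAnotherPendingBuildRequest.

```typescript
// Hypothetical stand-in for the "debounce" package: fires once on the
// trailing edge, i.e. waitMs after the LAST call of a burst.
function debounceTrailing(fn: () => void, waitMs: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) {
      clearTimeout(timer); // a newer call restarts the waiting period
    }
    timer = setTimeout(() => {
      timer = undefined;
      fn();
    }, waitMs);
  };
}

const debounceWaitTime = 100;
let isBuilding = false;
let hasPendingRequest = false; // assumption: a plain flag instead of a queue
let buildCount = 0;

// Simulated build (assumption): counts invocations and takes 250 ms.
function triggerActualBuild(): Promise<void> {
  buildCount++;
  return new Promise<void>((resolve) => setTimeout(resolve, 250));
}

const runBuild = (): void => {
  if (isBuilding) {
    hasPendingRequest = true; // remember the request, handle it after the build
    return;
  }

  isBuilding = true;
  hasPendingRequest = false;

  triggerActualBuild()
    .catch((err) => console.error("Build failed:", err))
    .then(() => {
      isBuilding = false; // reset even if the build failed
      if (hasPendingRequest) {
        refresh();
      }
    });
};

const refresh = debounceTrailing(runBuild, debounceWaitTime);

// Three rapid requests ("save all") collapse into a single build.
refresh();
refresh();
refresh();
```

Right after the three calls no build has started yet; the single build only kicks off once the debounce time has passed without further requests. Resetting isBuilding after a failed build as well is a choice I made for this sketch, so one broken build does not block the watcher forever.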

Step 3: Perform selective builds only

With that in place you get rid of the crashes and prevent rapid changes to multiple files from triggering multiple successive builds. Basically, you'll reach the "9 seconds" per incremental build performance level I mentioned in the beginning. Now, as you can see from the performance numbers above, our LESS task contributes the most to incremental build times, while we actually do not do much LESS editing in our day-to-day work. With that in mind I decided it would be nice to only trigger those parts of the build process which actually need to be performed, for example only do a TypeScript compile if we actually changed .ts files etc. To achieve this, here is what I did:

  • When gulp-watch triggers, push the changed file's information into a "poor man's queue" (an array) so it is preserved until the debounced function actually is triggered
  • In the refresh function, collect the actual file changes and test them against the configured globs of the project sources to determine which tasks need to be executed
  • That array then can also be nicely used to determine whether another build needs to be triggered when the current one finishes (because more file changes may have piled up during the build)
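
Reduced to its core, the selection logic from the list above might look like the following sketch. Everything in it is illustrative: selectTasks, the task names and the glob strings are assumptions, and extensionMatcher is a deliberately naive stand-in so the snippet runs without dependencies. In the real code, minimatch does the matching against the globs from the project configuration.

```typescript
// A matcher decides whether a path matches a glob; minimatch(path, glob)
// has this shape and would be plugged in here in a real project.
type Matcher = (path: string, glob: string) => boolean;

interface SourceGlobs {
  [taskName: string]: string;
}

// Returns the names of all tasks whose source glob matches at least one changed path.
function selectTasks(paths: string[], sources: SourceGlobs, matches: Matcher): string[] {
  return Object.keys(sources).filter((task) =>
    paths.some((path) => matches(path, sources[task])));
}

// Naive stand-in matcher (assumption): only understands globs of the form "dir/**/*.ext".
const extensionMatcher: Matcher = (path, glob) => {
  const [dir, ext] = glob.split("/**/*");
  return path.startsWith(dir + "/") && path.endsWith(ext);
};

// Assumed globs, similar in shape to what a generated project configuration contains.
const sources: SourceGlobs = {
  transpile: "src/**/*.ts",
  markup: "src/**/*.html",
  css: "src/**/*.less"
};

// The "poor man's queue": the watcher callback pushes, the debounced function drains.
const pendingPaths: string[] = ["src/app.ts", "src/app.html"];
const drained = pendingPaths.splice(0); // empties the queue and returns its former content

console.log(selectTasks(drained, sources, extensionMatcher)); // [ 'transpile', 'markup' ]
```

Draining with splice(0) keeps the same array instance, so anything the watcher pushes while a build is running ends up in the now empty queue and can be detected once the build finishes.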

Ok, let's put the puzzle pieces together and see what we get.

import * as minimatch from "minimatch"; /* used to test paths against globs */
import * as gulp from "gulp";
import * as gulpWatch from "gulp-watch";
import * as debounce from "debounce";
// more imports as required, for Aurelia's sub tasks, project configuration etc...

const debounceWaitTime = 100;
let isBuilding = false;
let pendingRefreshPaths = [];

let watch = () => {
  return gulpWatch(
    [
      project.transpiler.source,
      project.markupProcessor.source,
      project.cssProcessor.source
    ],
    {
      read: false, // performance optimization: do not read actual file contents
      verbose: true
    },
    (vinyl) => {
      if (vinyl.path && vinyl.cwd && vinyl.path.startsWith(vinyl.cwd)) {
        let pathToAdd = vinyl.path.substr(vinyl.cwd.length + 1);
        log(`Watcher: Adding path ${pathToAdd} to pending build changes...`);
        pendingRefreshPaths.push(pathToAdd); 
        refresh();
      }
    });
};

The refresh function itself is the one that's debounced, and looks like this:

let refresh = debounce(() => {
  if (isBuilding) {
    log("Watcher: A build is already in progress, deferring change detection...");
    return;
  }

  isBuilding = true;

  let paths = pendingRefreshPaths.splice(0);
  let tasks = [];
  
  // dynamically compose tasks, note: extend as needed, for example with copyFiles, linting etc.
  if (paths.find((x) => minimatch(x, project.cssProcessor.source))) {
    log("Watcher: Adding CSS tasks to next build...");
    tasks.push(processCSS);
  }

  if (paths.find((x) => minimatch(x, project.transpiler.source))) {
    log("Watcher: Adding transpile task to next build...");
    tasks.push(transpile);
  }

  if (paths.find((x) => minimatch(x, project.markupProcessor.source))) {
    log("Watcher: Adding markup task to next build...");
    tasks.push(processMarkup);
  }

  if (tasks.length === 0) {
    log("Watcher: No relevant changes found, skipping next build.");
    isBuilding = false;
    return;
  }
  
  let toExecute = gulp.series(
    readProjectConfiguration,
    gulp.parallel(tasks),
    writeBundles,
    (done) => {
      isBuilding = false;
      done();
      if (pendingRefreshPaths.length > 0) {
        log("Watcher: Found more pending changes after finishing build, triggering next one...");
        refresh();
      }
    }
  );

  toExecute();
}, debounceWaitTime);

The remaining build.ts content stays more or less the same; I only added the --watch command line option, like this:

let processBuildPipeline = gulp.series(
  readProjectConfiguration,
  gulp.parallel(
    transpile,
    processMarkup,
    processCSS,
    copyFiles
  ),
  writeBundles
);

let main;

if (CLIOptions.hasFlag("watch")) {
  main = gulp.series(
    processBuildPipeline,
    watch
  );
} else {
  main = processBuildPipeline;
}

export default main;

function readProjectConfiguration() {
  return build.src(project);
}

function writeBundles() {
  return build.dest();
}

function log(message: string) {
  console.log(message);
}

With that in place, you now have a rock solid watcher that does not crash and performs way better than the current official template implementation. Even switching between largely different git branches, which changes dozens of files at once, leaves the watcher unimpressed, and finishing the build is still only a matter of seconds.

Enjoy!

Tags: Aurelia · Performance