Test-Driven Development and Dependency Injection are the way

For newcomers to the journey of building skott, a quick introduction to set the context.

skott is an all-in-one devtool that automatically analyzes, searches and visualizes dependencies from JavaScript, TypeScript (JSX/TSX) and Node.js (ES6, CommonJS) projects. It deeply traverses a directory, parses all supported files, resolves module imports, builds the whole graph and exposes a graph API from it.

With that graph, you can do many things, such as visualizing it in a web application:

An overview of the skott web application rendering a project dependency graph.

If you’re familiar with static analysis tools (if not, see my previous post on the subject), you already know some of the challenges. In the context of skott, here are a few:

extracting workspace information such as manifests (package.json), lockfiles (package-lock.json, yarn.lock, pnpm-lock.yaml) and config files (tsconfig.json, .eslintrc.json),
traversing the file system, using ignore patterns and discarding gitignored files,
parsing files with multiple parsers for each supported language, covering every specificity (e.g. JSX/TSX),
walking the emitted AST to collect all module declarations from each file, for both CommonJS and ESM,
resolving all module declarations to their real file-system location,
building the graph from all the resolved modules.

That’s quite a lot, even in a simplified version.

The question is: how do you manage that complexity so that you’re fast, confident and safe enough to add/remove/update features, fix bugs and carry out large refactorings? In other words, how do you enable true software agility?

Testing

The first thing that comes to mind is tests. Writing good tests brings safety and confidence. But tests are double-edged: depending on what is tested (the system under test) and how, they can lead to false confidence, sneak in poor design decisions such as coupling code to implementation details, and make refactoring painful and unsafe. As software engineers, this is something we should avoid at all costs.

In my previous post Don’t target 100% coverage, I show how tests written a certain way can produce unexpected behaviors and a false sense of security. I recommend reading it if you’re not familiar with Test-First, Test-Last, Test-Driven Development, mutation testing and code coverage.

Test-First, Test-Last and Test-Driven Development

Let’s see the differences between these three.

Test-First

Writing the test before writing any line of the targeted feature. This puts a precise plan on what should be implemented but misses the most important part, the how, because Test-First doesn’t provide an incremental feedback loop.

Test-First can be useful and generally helps increase specification coverage, but it omits the fact that new constraints and ideas are discovered along the way. It doesn’t let tests drive the implementation, and it favors writing big chunks of code at once: you go from 0 lines to X lines, where X is whatever was needed for the test to pass. Among those chunks, some may be useless or too costly to introduce given the complexity you face all at once.

Test-Last

Probably the most popular approach: writing all the tests after all the feature code is written. This aims to bring safety around already-produced code, but does it really? Not quite. Test-Last is often used to convert a desire for confidence (more tests, higher coverage) into a sneaky, false feeling of safety.

Remember Goodhart’s law: when a measure becomes a target, it ceases to be a good measure.

As a result, lots of unnecessary and noisy tests get added, covering parts of the code that don’t need it, either because higher-level tests already do, or because they repeat things already tested. How would you even know a test is useless or redundant? You add a test, it passes, but is that due to the code you just added, or was it already the case?

Because Test-Last introduces tests at the very last step of the feature life cycle, it comes with pain: design smells, hidden implementation details, code that isn’t easily testable. That’s wasted time, because you could have figured it out earlier. It also tends to favor mocks (the test-double type, specifically), which are a smell in most cases: they introduce structural coupling between tests and code (asserting that X calls Y with abc parameters), drastically reducing flexibility and the ability to refactor.

In that case, tests become fragile: they depend on implementation details and break at the next function change, making you resent the tests themselves.

“Tests prevent me from refactoring, they break all the time when I change the code, I’m losing flexibility, it’s not convenient.”
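To make the structural coupling described above concrete, here is a minimal sketch contrasting an interaction-based check with a behavior-based one. Every name in it (PriceCatalog, Checkout, the item prices) is hypothetical and chosen only for illustration; it is not taken from skott.

```typescript
// Hypothetical domain: PriceCatalog and Checkout exist only for this illustration.
interface PriceCatalog {
  priceOf(item: string): number;
}

class Checkout {
  constructor(private readonly catalog: PriceCatalog) {}

  total(items: string[]): number {
    return items.reduce((sum, item) => sum + this.catalog.priceOf(item), 0);
  }
}

// Interaction-based check: records *how* Checkout talks to its dependency.
const recordedCalls: string[] = [];
const spyCatalog: PriceCatalog = {
  priceOf(item: string): number {
    recordedCalls.push(item);
    return 1;
  },
};
new Checkout(spyCatalog).total(["apple", "pear"]);
// This assertion would break if Checkout started caching or reordering lookups,
// even though the computed total would still be correct.
const interactionCheck =
  JSON.stringify(recordedCalls) === JSON.stringify(["apple", "pear"]);

// Behavior-based check: only the observable outcome matters.
const prices: Record<string, number> = { apple: 2, pear: 3 };
const fakeCatalog: PriceCatalog = {
  priceOf: (item: string): number => prices[item] ?? 0,
};
const behaviorTotal = new Checkout(fakeCatalog).total(["apple", "pear"]);
```

The second check survives any internal rewrite of total as long as the result stays the same, which is the flexibility the quote above says is being lost.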

Is there a better way to achieve efficiency and safety while keeping the code highly flexible, easily refactorable, and abstracted from the implementation details we don’t want to depend on?

Test-Driven Development (TDD)

Disclaimer: I’m not advocating TDD at all costs; it’s not a silver bullet. The goal is to offer an overview of a highly misunderstood and underestimated discipline. It’s then your choice to try it, blame it, or blame me. But believe me: once you become decent at TDD, you never look back.

Test-Driven Development is a more evolved version of Test-First. It’s a software-development discipline that drives you to find the quickest, best path to writing each line of code targeting a domain specification, through fine-grained decomposed steps (so-called baby steps), dealing with complexity incrementally.

Using tests, TDD drives the writing of just enough code to make a failing test ❌ turn GREEN ✅, in very short feedback-loop cycles.

If there is no prerequisite, no failing test, why would I add any line of code? Nothing in my system justifies it yet. Maybe the behavior is already implemented? The first step is to make sure we have a failing test: that’s our first checkpoint.

Once we have it, we write the code required to make it pass. Now we’re sure we did something useful: code justified by a specification that went from red to green.

Right after, the third TDD step kicks in: refactoring. You’re free to refactor and produce the best code while keeping the test green. That’s why TDD fundamentally helps design software: it lets you refactor the code safely, as often as you like and at any point in time, while ensuring the expected behavior stays intact.

TDD requires good software-engineering skills to be effective. It creates a perpetual confrontation between the design choices to be made and the current system requirements. But it needs solid design and refactoring skills upfront, otherwise you may get blocked or make the wrong decisions (or worse, none at all).

TDD is all about feedback loops, and so is software engineering in general. Consider the feedback loops we work with every day:

IDE: the red underlines, warnings, auto-format and popups all tell you whether you’re on the right path.
Compilers: they tell you whether the code is syntactically valid and, in statically typed languages, type-correct. TypeScript, for instance, improves the loop with type safety and a great real-time experience via the Language Server Protocol (LSP).
CI/CD pipelines: they continuously tell you whether the product could be deployed to production, passing all verification steps.

All of these leverage the fail fast principle: identify failures quickly rather than letting them persist or, worse, letting customers discover the bugs. The feedback loop is the foundation of continuous improvement.

TDD is about having the shortest feedback loop toward the written code, asserting whether the system produces the expected outcome as you add, remove or update code. Thanks to automated tests, the loop is quick enough in most cases, as long as you respect the Fast nature from the F.I.R.S.T principles.

One downside: TDD can become counter-productive if applied incorrectly. In my opinion it’s better not to practice TDD than to practice it the wrong way. The learning curve is steep: it requires strong technical skills and a mental shift.

Putting it into small practical examples

Back to our initial question: how do you manage complexity so that you’re fast, confident and safe enough to add/remove/update features, fix bugs and continuously improve the code through refactoring?

For now, the only approach I’ve found that works for all of these is Test-Driven Development. You might be able to do the same without it, but at what cost, with what confidence, and how fast? Without introducing any regression? I’ve been there too; now I could never go back to working without it.

Let’s take a highly simplified version of a skott feature.

Given two JavaScript modules where one depends on the other, both should be analyzed, and a graph should contain two vertices with one edge representing that relationship:

index.js

import { add } from "./feature.js";

// Rest of the code consuming the function, we don't care about that

feature.js

export function add(a, b) {
  return a + b;
}

We want to:

read both files,
find the module declarations and see that index.js imports feature.js,
build a graph of two vertices, index.js and feature.js, with a directed edge from index.js to feature.js (we say index.js is “adjacent to” feature.js).

So the expected outcome of our system is a graph shaped like this:

{
  "index.js": {
    "adjacentTo": ["feature.js"]
  },
  "feature.js": {
    "adjacentTo": []
  }
}

This is the expected outcome we want skott to produce.
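As a sketch, this shape could be captured with a small TypeScript type. The Vertex and Graph names and the countEdges helper are assumptions added for illustration; only the adjacentTo field comes from the example above.

```typescript
// Hypothetical type names; only `adjacentTo` is taken from the expected outcome.
type Vertex = { adjacentTo: string[] };
type Graph = Record<string, Vertex>;

const expectedGraph: Graph = {
  "index.js": { adjacentTo: ["feature.js"] },
  "feature.js": { adjacentTo: [] },
};

// Counts directed edges by summing the length of every vertex's outgoing list.
function countEdges(graph: Graph): number {
  return Object.values(graph).reduce(
    (total, vertex) => total + vertex.adjacentTo.length,
    0
  );
}

const vertexCount = Object.keys(expectedGraph).length;
const edgeCount = countEdges(expectedGraph);
```

With this representation, the two vertices are the keys of the record and the single edge is the one entry in index.js's adjacentTo list.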

Test-First

Test-First would suggest starting by writing the test with the whole expectation we have from our system:

describe("Graph construction", () => {
  describe("When having two JavaScript modules", () => {
    describe("When the first module imports the second", () => {
      test("Should produce a graph with two nodes and one edge from the index module to the imported module", () => {
        createFile("feature.js").withContent(
          `
            export function add(a, b) {
                return a + b;
            }
          `
        );
        createFile("index.js").withContent(
          `
            import { add } from './feature.js';
          `
        );

        const graph = new GraphResolver();

        expect(graph.resolve()).toEqual({
          "index.js": {
            adjacentTo: ["feature.js"],
          },
          "feature.js": {
            adjacentTo: [],
          },
        });
      });
    });
  });
});

The test includes the three main components: Arrange / Act / Assert. Running it will surely fail, since we don’t have any code yet. But how do we make it pass?

Quick reminder of all the steps we must go through: file traversal, file parsing, module extraction, module resolution, graph construction. That’s a long way, with many components involved, so we might spend quite a bit of time. Unfortunately the test won’t be useful during all that time; it only asserts the final result once everything is already produced.

Test-Last

Nothing happens here: Test-Last doesn’t want us to write any test yet. Good luck.

Test-Driven Development

Finally, something that helps us reach the desired behavior. As said, TDD wants us to take an incremental approach and let the code grow in complexity across steps. By design, it wants the baby step approach: add the minimum of code to make the test pass.

Here’s the first test that comes to mind:

describe("Graph construction", () => {
  describe("When not having any modules", () => {
    test("Should produce an empty graph", () => {
      const graph = new GraphResolver();

      expect(graph.resolve()).toEqual({});
    });
  });
});

First things first: we focus on the shape of the contract we want to expose. It constrains our thinking: one problem at a time. One benefit of TDD is that the test becomes the first client of the code itself, letting the design emerge progressively.

Here’s the quickest way to make the test pass:

class GraphResolver {
  resolve() {
    return {};
  }
}

Easy: a raw class with a method returning a hardcoded empty object.

Then comes the question: what’s the next test? Don’t forget, the end goal is to traverse files and build dependencies between them.

TDD efficiency comes from the choice of tests: there’s no magic. The developer is responsible for finding the right order; each missed or oversized step is a missed opportunity to fully benefit from TDD. But don’t worry: if you end up there, you can always downshift to a smaller step.

So what comes next? After the “no modules” case, let’s introduce one module to analyze:

describe("When having one JavaScript module", () => {
    test("Should produce a graph with the module", () => {
      /**
       * ARRANGE
       *
       * How can we create a file system context including that file specifically
       * for this test?
       */
      const graph = new GraphResolver();

      expect(graph.resolve()).toEqual({
        "index.js": {},
      });
    });
  });

Even if it looks simple, it grows in complexity as we introduce the notion of a file in the Arrange part. Our tool is based on the file system, so it must read from it at some point. Does that mean the test itself should read from the file system? Not at all.

But why not use the real file system right away, via the Node.js API? Because introducing the real file system is exactly what we want to avoid in unit tests: we want them fast, isolated and repeatable across executions, and the file-system layer meets none of these criteria. We want a fully manageable, specialized version of the Node.js file-system module. Do we need to fake everything in it? No, just the subset we need for now.

That small test brings a whole new level of thinking around the feature. You could argue it complicates the task for little gain, but it forces us to build testable code over which we have full control, which already leads us to designing solutions.

What we want first is a minimal in-memory file system that can fake a real one:

test("Should produce a graph with the module", () => {
  const fileSystem = new FakeFileSystem();
  fileSystem.createFile("index.js").empty();
});

Now that we have that in-memory context, we want it to be usable within the GraphResolver context, in other words, injected into it. This naturally leads us to Dependency Injection (DI).

Dependency Injection

Dependency injection is a pattern for decoupling the usage of dependencies from their creation. It’s the process of injecting a service’s dependencies from the outside world; the service itself doesn’t know how to create them.

The dependency (the file-system module) is created outside the GraphResolver context and injected right away:

class FakeFileSystem {
 // ...
}

const fileSystem = new FakeFileSystem();
fileSystem.createFile("index.js").empty();

class GraphResolver {
  constructor(private readonly fileSystem: FakeFileSystem) {}
}

// Dependency Injection
const graphResolver = new GraphResolver(fileSystem);
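For contrast, here is a hedged sketch of what the same service might look like without injection. HardWiredResolver, InjectedResolver and InMemoryFiles are hypothetical names used only to show the difference; they are not part of skott.

```typescript
// Hypothetical stand-in for a file-system dependency.
class InMemoryFiles {
  constructor(private readonly names: string[]) {}

  readFiles(): string[] {
    return this.names;
  }
}

// Without injection: the service creates its own dependency,
// so a test has no way to decide which files exist.
class HardWiredResolver {
  private readonly files = new InMemoryFiles(["index.js"]);

  resolve(): string[] {
    return this.files.readFiles();
  }
}

// With injection: the caller creates the dependency and hands it over.
class InjectedResolver {
  constructor(private readonly files: InMemoryFiles) {}

  resolve(): string[] {
    return this.files.readFiles();
  }
}

const hardWired = new HardWiredResolver().resolve();
const injected = new InjectedResolver(
  new InMemoryFiles(["a.js", "b.js"])
).resolve();
```

The injected variant lets each test set up exactly the context it needs, which is what the Arrange step was missing earlier.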

Now we can fake the required behavior from the file-system module:

class FakeFileSystem {
  fs = {};

  createFile(name) {
    return {
      empty: () => {
        this.fs[name] = "";
      },
    };
  }

  readFiles() {
    return Object.keys(this.fs);
  }
}

A simple fake that emulates the operations we need for now: createFile and readFiles. Nothing more; we only cover what’s needed in the test’s context.

To make the test pass, we consume this implementation in the GraphResolver service:

class GraphResolver {
  graph = {};

  constructor(private readonly fileSystem: FakeFileSystem) {}

  resolve() {
    for (const filename of this.fileSystem.readFiles()) {
      this.graph[filename] = {};
    }

    return this.graph;
  }
}

And it passes, matching our expectations. Note how focused we are on the current desired behavior: we produced only the minimum needed. The readFiles method only yields the filename, since the test only requires a record including that name, even though we know we’ll later need file content too.

We’re also fully in-memory and isolated from the real file system: the dependency produces only the outcome we need, and we don’t have to fake the whole Node.js file-system API.
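Put together, the pieces from this step could look like the following typed sketch. The type annotations are assumptions added so the snippet type-checks on its own; the behavior mirrors the fake and the resolver shown above.

```typescript
class FakeFileSystem {
  // In-memory store: file name -> file content.
  private readonly fs: Record<string, string> = {};

  createFile(name: string) {
    return {
      empty: (): void => {
        this.fs[name] = "";
      },
    };
  }

  readFiles(): string[] {
    return Object.keys(this.fs);
  }
}

class GraphResolver {
  private readonly graph: Record<string, object> = {};

  constructor(private readonly fileSystem: FakeFileSystem) {}

  resolve(): Record<string, object> {
    for (const filename of this.fileSystem.readFiles()) {
      this.graph[filename] = {};
    }

    return this.graph;
  }
}

// Arrange: build the in-memory context.
const fileSystem = new FakeFileSystem();
fileSystem.createFile("index.js").empty();

// Act: inject the fake and resolve the graph.
const resolvedGraph = new GraphResolver(fileSystem).resolve();
```

In a test runner, the Assert step would then compare resolvedGraph with the expected record containing only index.js.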

Here I deliberately skip some intermediate steps, knowing for sure we’ll want to read all files from the directory at some point. Instead of introducing a readFile method (single file), I go straight to readFiles. As a rule of thumb, follow a sequence of Transformations (the Transformation Priority Premise, by Robert C. Martin) to produce the minimal code that makes the test pass in short loops.

After the green step comes the third TDD phase: refactoring.

Refactoring is about changing the system design without altering its behavior.

We can use that phase to improve a small thing in the DI setup. DI can be used in a way that decouples from concrete implementations. Above, GraphResolver is directly coupled to a FakeFileSystem instance, leaving no room for flexibility and offering no way to inject anything else matching the same contract. Worse, it leaks implementation details: FakeFileSystem exposes createFile, which exists only for the test but has no value to GraphResolver. GraphResolver should only know about what it cares about: the readFiles method.

So we introduce a generic interface, making GraphResolver depend on abstractions, not on a concrete implementation. This brings us to the Dependency Inversion Principle (the D in SOLID). That way, skott gets a runtime-agnostic way of traversing file systems:

interface FileSystem {
   readFiles(): string[];
}

class FakeFileSystem implements FileSystem {
  // unchanged
}

class GraphResolver {
  constructor(private readonly fileSystem: FileSystem) {}
                                           // ^ this changes
}

All tests still pass: we only introduced an interface and let the compiler check that GraphResolver depends on nothing more than the FileSystem contract.
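To illustrate what depending on the abstraction buys, here is a sketch of a second implementation backed by the real Node.js file system. NodeFileSystem is a hypothetical name and a deliberately naive adapter (one directory, no recursion, no ignore patterns); it is not how skott does it.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

interface FileSystem {
  readFiles(): string[];
}

// Hypothetical adapter: lists only the entries directly inside `directory`.
class NodeFileSystem implements FileSystem {
  constructor(private readonly directory: string) {}

  readFiles(): string[] {
    return fs.readdirSync(this.directory);
  }
}

// Usage against a throwaway temporary directory.
const directory = fs.mkdtempSync(path.join(os.tmpdir(), "graph-"));
fs.writeFileSync(path.join(directory, "index.js"), "");

const files = new NodeFileSystem(directory).readFiles();

// Clean up the temporary directory once the listing is done.
fs.rmSync(directory, { recursive: true, force: true });
```

Because GraphResolver only knows about the FileSystem interface, either this adapter or the in-memory fake can be injected without touching the resolver.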

Now let’s start introducing dependencies between JavaScript modules. As said, those can be modeled with a directed graph, where files are vertices and relationships are directed edges.

In the terminology used throughout this post, a vertex A that depends on another vertex B is said to be “adjacent to B”.

One concept at a time. This time we don’t even need a new test. We can update the last one:

expect(graph.resolve()).toEqual({
-        "index.js": {},
+        "index.js": {
+          adjacentTo: [],
+        },
      });

This fails; now we make it pass:

resolve() {
    for (const filename of this.fileSystem.readFiles()) {
-      this.graph[filename] = {};
+      this.graph[filename] = {
+        adjacentTo: [],
+      };
    }

    return this.graph;
  }

Adding two modules wouldn’t push us forward. We need a test that drives us to the case where an import creates an edge between two modules:

describe("When having two modules with one dependency", () => {
    test("Should produce a graph with both modules and a dependency between the two", () => {
      const fileSystem = new FakeFileSystem();
      const fileWithModuleImport = `import "./feature.js";`;

      fileSystem.createFile("index.js").content(fileWithModuleImport);
      fileSystem.createFile("feature.js").empty();

      const graph = new GraphResolver(fileSystem);

      expect(graph.resolve()).toEqual({
        "index.js": {
          adjacentTo: ["feature.js"],
        },
        "feature.js": {
          adjacentTo: [],
        },
      });
    });
  });

Still writing the easiest code that makes the test pass, we add a small if in resolve. This step assumes readFiles now yields [fileName, fileContent] pairs instead of bare names, so the resolver can look at each file’s content. From the Transformation Priority Premise, this is the middle transformation: (unconditional → if) splitting the execution path:

  resolve() {
    for (const [fileName, fileContent] of this.fileSystem.readFiles()) {
+      if (fileContent.includes("import")) {
+        // The test writes the import with double quotes, so we cut on '"'
+        const moduleName = fileContent.split("./")[1].split('"')[0];
+
+        this.graph[fileName] = {
+         adjacentTo: [moduleName],
+        };
+
+       continue;
+     }

      this.graph[fileName] = {
        adjacentTo: [],
      };
    }

    return this.graph;
  }

Test passing ✅. Note how ugly that split is, maybe the ugliest statement I’ve ever written, but we don’t care: it turns the test green. Remember, after this step you’re free to refactor as much as you want.
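The test and the loop above rely on two things the earlier fake did not provide: a content helper on createFile, and a readFiles that returns name/content pairs. A minimal sketch of how the fake and its interface might be extended follows; the exact signatures are assumptions.

```typescript
interface FileSystem {
  // Each entry is a [fileName, fileContent] pair.
  readFiles(): [string, string][];
}

class FakeFileSystem implements FileSystem {
  private readonly fs: Record<string, string> = {};

  createFile(name: string) {
    return {
      empty: (): void => {
        this.fs[name] = "";
      },
      // Assumed helper: stores the given content under the file name.
      content: (fileContent: string): void => {
        this.fs[name] = fileContent;
      },
    };
  }

  readFiles(): [string, string][] {
    return Object.entries(this.fs);
  }
}

const fileSystem = new FakeFileSystem();
fileSystem.createFile("index.js").content(`import "./feature.js";`);
fileSystem.createFile("feature.js").empty();

const entries = fileSystem.readFiles();
```

The fake still covers only what the current test needs: two ways to create a file and one way to read them all back.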

Let’s slightly improve the code during refactoring:

resolve() {
    for (const [fileName, fileContent] of this.fileSystem.readFiles()) {
      if (fileContent.includes("import")) {
+        // Accepts single or double quotes around the module specifier
+        const moduleImportParser = /import ["'](.*)["'];/;
+        // Index 0 is the whole match; the captured specifier is at index 1
+        const [, moduleImport] = moduleImportParser.exec(fileContent);
         // path comes from Node.js "path" module
+        const moduleName = path.basename(moduleImport);

        this.graph[fileName] = {
          adjacentTo: [moduleName],
        };

        continue;
      }

      this.graph[fileName] = {
        adjacentTo: [],
      };
    }

    return this.graph;
  }

The parsing in the two previous steps is an implementation detail, part of our own business logic, and should remain hidden, allowing heavy refactoring without breaking the API surface. As long as resolve returns the expected graph, we’re fine, because that’s what matters most.

As you progressively add cases to reliably parse all types of module imports (and there are many, including fun edge cases), you’ll realize a RegExp doesn’t scale and becomes too complex to manage. But nothing prevents you from introducing an ECMAScript-compliant parser (meriyah, acorn, swc) that does the job for you. The best part: it stays hidden inside the internals of your use case, allowing infinite refactoring and improvement.
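One way to keep that freedom is to isolate the parsing behind a single internal function, so its body can later be swapped for a real parser without touching the tests. The sketch below is an assumption about how that seam could look: extractModuleImports and resolveGraph are hypothetical names, and the RegExp only handles plain static imports.

```typescript
type Graph = Record<string, { adjacentTo: string[] }>;

// Internal detail: a naive RegExp today, possibly a real parser tomorrow.
// Callers only ever see the list of imported specifiers.
function extractModuleImports(fileContent: string): string[] {
  const importPattern = /import\s+(?:[^"']*?\s+from\s+)?["']([^"']+)["']/g;
  const imports: string[] = [];
  let match: RegExpExecArray | null;

  while ((match = importPattern.exec(fileContent)) !== null) {
    imports.push(match[1]);
  }

  return imports;
}

function resolveGraph(files: [string, string][]): Graph {
  const graph: Graph = {};

  for (const [fileName, fileContent] of files) {
    graph[fileName] = {
      // Simplification: strips a leading "./" instead of resolving real paths.
      adjacentTo: extractModuleImports(fileContent).map((specifier) =>
        specifier.replace(/^\.\//, "")
      ),
    };
  }

  return graph;
}

const graph = resolveGraph([
  ["index.js", `import { add } from "./feature.js";`],
  ["feature.js", "export function add(a, b) { return a + b; }"],
]);
```

Tests that assert only on the returned graph keep passing whichever strategy lives inside extractModuleImports.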

I won’t go into the parser implementation itself (you can check skott’s source code for a complete example), but hopefully you can now imagine the next steps, and see the advantages TDD brings.

Wrap up

By working through a few use cases, we saw how Test-Driven Development incrementally drives the development of a feature.

Not only does it surface constraints very early thanks to the feedback loop, it also naturally forces us to make the code easily testable (fast, isolated, repeatable) via Dependency Injection. It reduces the complexity we deal with at each new step, while going faster and safer. And we can lean heavily on refactoring phases to make the code cleaner (clean code and great design being prerequisites for efficient refactoring), confident the system behavior still matches our expectations.

Personal take: I would not feel confident with skott’s features if I hadn’t built 130+ unit tests with Test-Driven Development. Features can be added without fear, and refactoring done with ease and confidence.

This doesn’t mean skott can’t produce bugs: it means that for the use cases skott is expected to handle, it most likely works as expected. A bug will usually be a missing system behavior or edge case. And that’s fine: to fix it, you just add the test case reproducing the missing behavior, and let the TDD flow take over.

This was the final chapter of the journey of building skott. The whole project is open source on GitHub.