JavaScript Module Bundlers Demystified

JavaScript module bundlers are something of a black box for most web developers, yet they are a cornerstone of building modern web applications. There are many of them out there these days, the most popular being Webpack, Rollup, Parcel, Snowpack and now Vite. They power most JavaScript frameworks under the hood. So, whether you know it or not, you have been using them directly or indirectly.

First of all, let's answer a basic question.

What is a JavaScript bundler?

A JavaScript bundler is a tool that takes your code and all the dependencies your application needs and bundles them into a small number of files (often only one) that are production-ready and loadable in the browser. It usually starts with an entry file, and from there it bundles up all of the code needed for that entry file.

Why do we need that? Well, the underlying problem is handling dependencies in frontend code. Historically, JavaScript hasn’t had a standard way to require dependencies from your code. There were no import or require statements. This is where a bundler shines.

It generates a dependency graph as it traverses your code files. Beginning with the entry point you specified, the module bundler keeps track of both your source files’ dependencies and third-party dependencies. This dependency graph tells the bundler exactly which source and third-party files the application needs, so nothing that is required gets left out of the bundle.

Now, all modern browsers support modules natively. So we should ask the question: "Then why bundle in the first place?" Well, the web has many different environments, and you want your code to run in as many of them as possible while writing it only once. Bundlers solve that issue. Third-party code is expensive to ship, but static code analysis can optimize it with techniques like cherry picking and tree shaking, and most bundlers now support these. And there's more: we can also simplify what is shipped by turning 100 files into 1, limiting the data and resource cost for the user.

But let’s ignore bundlers for a second, as if they never existed. Say you have a website with a script like this:

//index.html
<html>
    <body>
        <script src="./index.js"></script>
    </body>
</html>

//index.js
const [a, b] = [5, 10];
const c = a * b;
const d = a + b;

Now suppose you want to use a function that is defined in another script.

//math.js
function multiply(a, b){
    return a * b;
}
function add(a, b){
    return a + b;
}

What would you do? The obvious answer is to change the standard script tag to a "module".

//index.html
<html>
    <body>
        <script type="module" src="./index.js"></script>
    </body>
</html>

And import the functions from the other script file.

//math.js
export function multiply(a, b){
    return a * b;
}
export function add(a, b){
    return a + b;
}

//index.js
import { multiply, add } from "./math.js";

const [a, b] = [5, 10];
const c = multiply(a, b);
const d = add(a, b);

But older browsers do not implement ES modules. Suppose we need to support a browser that does not support JS modules: how would you import and export things in JavaScript? How would you make the functions of your code visible to the outside world, and how would you import functions from other people’s code?

The only way has always been to use global variables.

//index.html
<html>
    <body>
        <script src="./math.js"></script>
        <script src="./index.js"></script>
    </body>
</html>

But this is not good, as the multiply and add functions will become part of the window object and be accessible from anywhere. To solve this problem we can use IIFEs (immediately invoked function expressions), something like this:

//math.js
const App = {};

(function () {
  App.multiply = function multiply(a, b) {
    return a * b;
  };
  App.add = function add(a, b){
    return a + b;
  };
})();

//index.js
const [a, b] = [4, 5];
const c = App.multiply(a, b);
const d = App.add(a, b);

This pattern keeps most functions and variables out of the global namespace, but it still has some flaws. We have to wrap every single file in its own IIFE. There is still a single name in the global namespace, namely App, so if some other library also defines a global App there will be a naming collision, and that would be bad. Also, the order of the scripts matters: you need to be careful with the order in which you put the script tags. This becomes harder and harder to maintain as the dependencies get more complex.

//index.html
<html>
    <body>
       <!-- changed the order -->
       <script src="./index.js"></script>
       <script src="./math.js"></script>
       <!-- this will cause an error as `index.js` does not know what `App` is. -->
    </body>
</html>
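
The naming collision is just as easy to run into. The sketch below simulates three unrelated scripts in a single file. The second "library" and its version property are made up for illustration, and globalThis stands in for the browser's window object.

```javascript
// Script 1: our own math.js attaches its functions to a global `App`.
globalThis.App = {
  multiply: function multiply(a, b) {
    return a * b;
  },
};

// Script 2: a hypothetical third-party library that also claims the name `App`.
globalThis.App = { version: "1.0.0" };

// Script 3: our index.js now finds the wrong `App`.
const multiplyStillExists = typeof globalThis.App.multiply === "function";
console.log(multiplyStillExists); // false
```

Neither script did anything unusual on its own, which is what makes this kind of bug hard to track down.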

To solve these issues, a group came together to create a standard for defining modules in the JavaScript ecosystem, and they came up with CommonJS.

The CommonJS group defined a module format that solves JavaScript scope issues by making sure each module is executed in its own namespace. This is achieved by forcing each module to explicitly export the variables it wants to expose to the outside world, and to explicitly require the other modules it needs in order to work.

Using CommonJS, we could have solved our issue something like this:

//math.js
module.exports = {
  multiply: function multiply(a, b) {
    return a * b;
  },
  add: function add(a, b) {
    return a + b;
  },
};

//index.js
const math = require("./math");

const [a, b] = [4, 5];
const c = math.multiply(a, b);
const d = math.add(a, b);

But there are some cons:

  1. Browsers don't support CommonJS out of the box.
  2. require() is synchronous, so if we need to require a file that hasn’t been loaded yet, we have to make an HTTP request, but that is asynchronous.

The second problem can be solved by putting all the dependencies in one file, so that all the code is in memory and ready to be used when the require() function is invoked.

And this is when module bundlers came into existence. A module bundler examines your whole code base, looks for the require statements and the exports of each of your modules, and intelligently bundles all of your modules into a single file.

A bundler's operation is divided into two main stages: dependency graph generation and eventual bundling.

Mapping a Dependency Graph

The first thing a module bundler does is generate a relationship map of all the served files. This process is called Dependency Resolution. To do this, the bundler requires an entry file which should ideally be your main file. It then parses through this entry file to understand its dependencies.

In our case the index.js file is the entry file and has only one dependency which is math.js.

Following that, it traverses the dependencies to determine the dependencies of these dependencies.

So, now it will try to traverse the dependencies of the math.js module, which in our case has none. But let's say math.js also has two dependencies, multiply.js and add.js.

//math.js
const multiply = require("./multiply");
const add = require("./add");

module.exports = {
  multiply,
  add,
};

//add.js
module.exports = function add(a, b) {
  return a + b;
};

//multiply.js
module.exports = function multiply(a, b) {
  return a * b;
};

It assigns unique IDs to each file it sees throughout this process. Finally, it extracts all dependencies and generates a dependency graph that depicts the relationship between all files.
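
To make this phase more concrete, here is a minimal sketch of dependency resolution over our four files. It makes several simplifying assumptions: the files are kept in memory in a hypothetical files object, require calls are found with a regular expression (real bundlers use a proper parser), every required path is assumed to need a ".js" suffix, and IDs are plain counters.

```javascript
// Hypothetical in-memory project; a real bundler reads these files from disk.
const files = {
  "./index.js": 'const math = require("./math");',
  "./math.js": 'const multiply = require("./multiply");\nconst add = require("./add");',
  "./multiply.js": "module.exports = function multiply(a, b) { return a * b; };",
  "./add.js": "module.exports = function add(a, b) { return a + b; };",
};

// Collect the paths passed to require("...") in one file.
function findDependencies(source) {
  const pattern = /require\(["'](.+?)["']\)/g;
  // Assumption: required paths omit the extension, so ".js" is appended.
  return Array.from(source.matchAll(pattern), (match) => match[1] + ".js");
}

// Walk the files starting at the entry, giving each new file the next ID.
function buildGraph(entry) {
  const graph = {};
  const queue = [entry];
  let nextId = 0;
  while (queue.length > 0) {
    const path = queue.shift();
    if (graph[path]) continue; // already visited
    const dependencies = findDependencies(files[path]);
    graph[path] = { id: nextId++, dependencies };
    queue.push(...dependencies);
  }
  return graph;
}

const graph = buildGraph("./index.js");
console.log(graph);
```

In this sketch index.js gets ID 0 and math.js gets ID 1, with multiply.js and add.js recorded as the dependencies of math.js.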

It enables the bundler to construct a dependency order, which is vital for retrieving functions when they are needed in the browser. It also prevents naming conflicts, since the bundler has a complete map of all the files and their dependencies.

It also detects unused files, allowing us to get rid of unnecessary files and code.
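
Detecting unused files comes down to asking which files can never be reached from the entry point. The sketch below assumes the graph has already been built; the dependencyGraph object, the allFiles list and the extra legacy.js file are all invented for this example.

```javascript
// Dependency graph as a bundler might record it (paths are illustrative).
const dependencyGraph = {
  "./index.js": ["./math.js"],
  "./math.js": ["./multiply.js", "./add.js"],
  "./multiply.js": [],
  "./add.js": [],
};

// Every file in the project folder, including one that nothing requires.
const allFiles = ["./index.js", "./math.js", "./multiply.js", "./add.js", "./legacy.js"];

// Collect every file reachable from the entry point.
function collectReachable(entry, graph) {
  const reachable = new Set();
  const stack = [entry];
  while (stack.length > 0) {
    const file = stack.pop();
    if (reachable.has(file)) continue; // already handled
    reachable.add(file);
    stack.push(...(graph[file] || []));
  }
  return reachable;
}

const reachable = collectReachable("./index.js", dependencyGraph);
const unusedFiles = allFiles.filter((file) => !reachable.has(file));
console.log(unusedFiles); // [ './legacy.js' ]
```

Because legacy.js never shows up while walking the graph, it can safely be left out of the bundle.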

Bundling

After receiving inputs and traversing its dependencies during the Dependency Resolution phase, a bundler delivers static assets that the browser can successfully process. This output stage is called Packing. During this process, the bundler will leverage the dependency graph to integrate our multiple code files, inject the require function and the module.exports object, and return a single executable bundle that the browser can load successfully.
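
As a rough illustration of packing, the sketch below turns a table of module sources into the text of one script and then runs it. This is not how any particular bundler is implemented. The moduleSources table, the numeric IDs and the pack helper are assumptions, and the entry module exports its results only so that we can inspect them.

```javascript
// Hypothetical module table: ID -> source text, with dependencies already
// rewritten from file paths to numeric IDs by the first phase.
const moduleSources = {
  0: "const math = require(1); module.exports = { c: math.multiply(4, 5), d: math.add(4, 5) };",
  1: "module.exports = { multiply: require(2), add: require(3) };",
  2: "module.exports = function multiply(a, b) { return a * b; };",
  3: "module.exports = function add(a, b) { return a + b; };",
};

// Turn the table into the text of a single self-contained script.
function pack(sources, entryId) {
  const factories = Object.entries(sources)
    .map(([id, code]) => `${id}: function (module, exports, require) { ${code} }`)
    .join(",\n    ");
  return `(function () {
  var factories = {
    ${factories}
  };
  var cache = {};
  function require(id) {
    if (cache[id]) return cache[id].exports;
    var module = (cache[id] = { exports: {} });
    factories[id](module, module.exports, require);
    return module.exports;
  }
  return require(${entryId});
})()`;
}

const bundle = pack(moduleSources, 0);
// Evaluating the generated text stands in for the browser loading the bundle.
const result = new Function(`return ${bundle};`)();
console.log(result); // { c: 20, d: 9 }
```

The generated text is an IIFE that carries every module plus a tiny require function, so it needs nothing from the outside to run.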

If we look at the file that webpack generates for our application, we will see something like this:

(() => {
  var r = {
      241: (r) => {
        r.exports = function (r, t) {
          return r + t;
        };
      },
      52: (r, t, o) => {
        const n = o(992),
          e = o(241);
        r.exports = { multiply: n, add: e };
      },
      992: (r) => {
        r.exports = function (r, t) {
          return r * t;
        };
      },
    },
    t = {};
  function o(n) {
    var e = t[n];
    if (void 0 !== e) return e.exports;
    var s = (t[n] = { exports: {} });
    return r[n](s, s.exports, o), s.exports;
  }
  (() => {
    const r = o(52),
      [t, n] = [4, 5],
      e = r.multiply(t, n),
      s = r.add(t, n);
    console.log(e, s);
  })();
})();

Let's try to understand the generated code.

First of all, it wrapped everything inside an IIFE to keep everything out of the global namespace, just like we did earlier.

Then, based on the dependency graph, it created a map of modules keyed by unique IDs to integrate our multiple code files:

var r = {
      // this function holds the code of `add.js` and fills in its module.exports
      241: (r) => {
        r.exports = function (r, t) {
          return r + t;
        };
      },
      // this function holds the code of `math.js` and fills in its module.exports
      52: (r, t, o) => {
        const n = o(992),
          e = o(241);
        r.exports = { multiply: n, add: e };
      },
      // this function holds the code of `multiply.js` and fills in its module.exports
      992: (r) => {
        r.exports = function (r, t) {
          return r * t;
        };
      },
    },

The function below takes a unique ID and returns the module.exports of that module. It runs the module's function the first time the module is requested and caches the result in t for later calls.

function o(n) {
    var e = t[n];
    if (void 0 !== e) return e.exports;
    var s = (t[n] = { exports: {} });
    return r[n](s, s.exports, o), s.exports;
  }
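
Since the minified names are hard to follow, here is a readable sketch of the same idea. The names moduleFactories, moduleCache and requireModule are mine, not webpack's, and the timesMathRan counter exists only to show that a module body runs once.

```javascript
// Module table keyed by ID; the ID and the counter are made up for this sketch.
let timesMathRan = 0;
const moduleFactories = {
  52: (moduleRecord) => {
    timesMathRan += 1; // count how often the module body executes
    moduleRecord.exports = { add: (a, b) => a + b };
  },
};

const moduleCache = {};

// Readable counterpart of the minified `o(n)` function.
function requireModule(moduleId) {
  const cached = moduleCache[moduleId];
  if (cached !== undefined) return cached.exports; // already loaded: reuse it
  // Register the module before running it, then let the factory fill `exports`.
  const moduleRecord = (moduleCache[moduleId] = { exports: {} });
  moduleFactories[moduleId](moduleRecord, moduleRecord.exports, requireModule);
  return moduleRecord.exports;
}

const first = requireModule(52);
const second = requireModule(52);
console.log(first === second, timesMathRan); // true 1
```

The second call never touches the factory again, so every part of the application that requires the same module shares one exports object.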

The IIFE below is the code of index.js, which has one dependency, math.js. So it passes the unique ID of the math.js file to get the exports of the math module and uses them.

(() => {
    const r = o(52),
      [t, n] = [4, 5],
      e = r.multiply(t, n),
      s = r.add(t, n);
    console.log(e, s);
  })();

So, as you can see the bundled file has everything packed inside it. This is what a bundler does in general under the hood.

I hope you found this blog helpful. Thanks for reading 👋

Support Goutam Nath by becoming a sponsor. Any amount is appreciated!