A Case Study of Incrementally Rewriting a JavaScript Codebase to TypeScript
In late 2024, fed up with frequent losses in developer productivity at my workplace at the time, I set out to convince our team and leadership that we should incrementally rewrite our existing virtual health platform from JavaScript to TypeScript. I will give an overview of how I gathered data to justify this position, and of the role I played in the rewriting project itself.
I have found data-driven decision making effective in winning over peers and leaders. Reliable historical data can be difficult to acquire, especially if nobody anticipated the need for it in advance, but when we can gather data to support a position the discussion becomes clearer: we are discussing facts rather than opinions. I started by writing a Deno script that accessed the company's GitHub repository via the GitHub API, which allows operations such as listing PRs filtered by various criteria, including their labels.
Using some back-of-the-napkin math, I estimated that ~45% of developer time was spent on bugfixes and that, of that 45%, 40% was spent on type safety fixes, for a total of roughly 20% (0.45 × 0.40 = 0.18). This was a very rough estimate: the first data point was based on PR frequency, since team members did not actually log their time, whereas the second was determined by examining a random sample of the code changes from bugfix PRs. The 20% figure assumes that a bugfix takes the same time on average whether it addresses a logic error or a type safety error, which is not necessarily true. On the other hand, the metric does not include all costs associated with type unsafety. Beyond the developer time spent fixing a type safety bug, there is also the time QA developers spend finding and recording the bugs, the time admins spend receiving and communicating bug reports, and the time developers spend working around type safety issues during feature work.
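The arithmetic behind this estimate can be sketched as follows. This is a minimal illustration, not the original script: the function names, the `acme/platform` repository, the `bug` label, and the PR counts are all hypothetical. The only real pieces are GitHub's issue search endpoint and its `repo:`, `is:pr`, `is:merged`, and `label:` qualifiers; the `total_count` field of that endpoint's response would supply the counts.

```typescript
/** Builds a GitHub search URL matching merged PRs, optionally with one label.
 *  Assumes the label contains no spaces (those would need quoting). */
function buildPrSearchUrl(owner: string, repo: string, label?: string): string {
  const qualifiers = [`repo:${owner}/${repo}`, "is:pr", "is:merged"];
  if (label !== undefined) qualifiers.push(`label:${label}`);
  return "https://api.github.com/search/issues?q=" +
    encodeURIComponent(qualifiers.join(" "));
}

/** Estimated fraction of developer time attributed to type safety bugs. */
function estimateTypeSafetyShare(
  bugfixPrs: number,
  totalPrs: number,
  typeSafetyInSample: number,
  sampleSize: number,
): number {
  const bugfixShare = bugfixPrs / totalPrs; // proxy for time spent on bugfixes
  const typeSafetyShare = typeSafetyInSample / sampleSize; // from the manual sample
  return bugfixShare * typeSafetyShare;
}

// Hypothetical counts chosen to reproduce the percentages quoted above.
const share = estimateTypeSafetyShare(45, 100, 40, 100);
console.log(buildPrSearchUrl("acme", "platform", "bug"));
console.log(`${(share * 100).toFixed(0)}% of developer time`);
```

The product comes out at about 0.18, which is the basis for the rounded 20% figure. Multiplying the two shares is only valid under the equal-time assumption described above.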
Presenting this data, which shows that we're spending 20% of our developer time solving problems that the TypeScript compiler could have detected like
As for the transition itself, I will give as many details as I can remember, but without access to the repo I will necessarily be a bit vague in places. The goals of the rewrite were:
- complete compatibility between rewritten modules and existing codebase
- uphold TS best practices, in particular avoiding the use of `any` wherever possible
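To illustrate why avoiding `any` was a goal in its own right, here is a minimal sketch. The `Patient` shape and the function names are hypothetical and not taken from the project. It contrasts a parameter typed as `any`, which the compiler never questions, with one typed as `unknown`, which must be narrowed before use.

```typescript
// Hypothetical shape, used only for illustration.
interface Patient {
  name: string;
}

// With `any`, the compiler accepts every property access, so a malformed
// value is only discovered at runtime.
function unsafeInitial(patient: any): string {
  return patient.name.charAt(0); // throws a TypeError if `name` is missing
}

// With `unknown`, the value has to be narrowed by a type guard first.
function isPatient(value: unknown): value is Patient {
  return typeof value === "object" && value !== null &&
    typeof (value as { name?: unknown }).name === "string";
}

function safeInitial(value: unknown): string | undefined {
  return isPatient(value) ? value.name.charAt(0) : undefined;
}

const malformed: unknown = { fullName: "Ada" };
console.log(safeInitial(malformed)); // undefined, no exception

let threw = false;
try {
  unsafeInitial(malformed);
} catch {
  threw = true;
}
console.log(threw); // true
```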
Compatibility
The purpose of maintaining compatibility is to ensure that all other feature development can continue while the rewriting project is ongoing. Rewriting projects that keep the existing and rewritten codebases separate have several well-known downsides:
- development on the existing project stops, leading to protracted stagnation which may give competitors an edge (as with the rewrite of Netscape, which effectively ended that company's competitiveness in the market)
- development of the new codebase doesn't have immediate positive effects on the company's clients. This makes it harder to justify the project to stakeholders, since users will only use the new codebase once feature parity is achieved (which may take a long time)
- if work on both codebases continues, then development on both projects is slowed, and the size of the rewriting task gets larger and larger
The solution to these drawbacks is the Strangler Fig pattern, in which the rewritten modules are integrated into the existing codebase and progressively replace existing modules. With this refactoring pattern, code that uses the new architectural pattern, in our case TypeScript, grows around and slowly "strangles" the existing application. The refactor is managed through a seam: on one side is the new pattern, on the other the old. Changes such as bug fixes and new features can be made on either side of the seam, but over time the seam is pushed further into the old architecture until eventually the old architecture is gone completely. This may seem abstract, so I'll explain concretely how the Strangler Fig pattern was used in this project.
I planned a strategy that allowed us to incrementally rewrite the codebase while keeping up TS best practices, including not polluting the TS modules with `any` types, which turn off type checking for expressions involving them. The strategy was to model the JS codebase as a directed acyclic graph. The nodes of this graph are the codebase's modules and the edges are import statements, directed from the imported module to the importing module. Its source nodes (nodes without incoming edges) are therefore the modules that do not depend on other JS modules (though they may have library dependencies, provided those dependencies also export types). Rewriting a module removes its node from the model graph, potentially creating additional source nodes. At any given moment, all source nodes are eligible to be rewritten, and we established the expectation that other work requiring edits to such modules would first rewrite them in TS, depending on how time-sensitive the task at hand was.
Initially, the modules eligible for rewriting were React components at the leaves of the component tree (which often included forms), modules that wrapped third-party APIs, modules that extracted logic that would otherwise be duplicated, the Redux store, and modules that queried our own REST and GraphQL endpoints. The types for the REST endpoints were not flawless, since the API could produce data whose structure didn't match the contract, though this was relatively uncommon. The types for the GraphQL endpoints were mostly reliable, since the `amplify codegen` tool could be configured to also produce types corresponding to all possible GraphQL queries. In some cases, however, the production database held data that did not match the GraphQL schema, which AWS AppSync uses to define our DynamoDB database. This was typically the result of old bugs that had since been fixed but had left behind corrupt data that was never sanitized. Such data could cause the GraphQL endpoints to return responses that did not match the generated types for the specified query, causing runtime exceptions.
Overall, this strategy meant that the codebase could be rewritten module by module until completion.
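The graph model can be sketched in a few lines. This is an illustration under assumptions rather than the tooling we used: the module paths are hypothetical, and the graph is stored as a map from each module to the JS modules it imports. In this representation, a source node is a module whose imports have all been rewritten already.

```typescript
// Maps each module path to the JS modules it imports (hypothetical paths).
type ImportGraph = Map<string, string[]>;

/** Modules not yet rewritten whose imports have all been rewritten:
 *  the current source nodes of the model graph. */
function rewriteCandidates(graph: ImportGraph, rewritten: Set<string>): string[] {
  const candidates: string[] = [];
  graph.forEach((imports, mod) => {
    if (rewritten.has(mod)) return;
    if (imports.every((dep) => rewritten.has(dep))) candidates.push(mod);
  });
  return candidates.sort();
}

/** Rewrites every current candidate, then repeats, yielding successive waves. */
function rewriteWaves(graph: ImportGraph): string[][] {
  const rewritten = new Set<string>();
  const waves: string[][] = [];
  while (rewritten.size < graph.size) {
    const wave = rewriteCandidates(graph, rewritten);
    // No candidates but modules remain: the graph is not acyclic after all.
    if (wave.length === 0) throw new Error("import cycle detected");
    wave.forEach((mod) => rewritten.add(mod));
    waves.push(wave);
  }
  return waves;
}

const graph: ImportGraph = new Map<string, string[]>([
  ["api/client.js", []],
  ["utils/format.js", []],
  ["components/Form.js", ["utils/format.js"]],
  ["pages/Intake.js", ["components/Form.js", "api/client.js"]],
]);

console.log(rewriteWaves(graph));
// [["api/client.js", "utils/format.js"], ["components/Form.js"], ["pages/Intake.js"]]
```

In practice the waves were not processed in lockstep; any module in the current candidate set could be rewritten whenever someone needed to touch it.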
This approach had one downside: it required using `tsc` to generate JS output files, which were then committed to the repo. Other JS files imported from these output files rather than from the TS files directly. Most modern TypeScript projects use a workflow that avoids dealing with JS output files directly, either by using TS-native runtimes like Bun or Deno or by using a bundler like Vite, which generates a /dist folder that is not committed to the repository.
Best Practices
I have already covered how we were able to avoid the use of `any` by carefully selecting which modules to rewrite, but several other practices also aimed to improve the reliability of the resulting TS code.
Branded Types
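As a generic illustration of the technique named in this heading (not code from the project), a branded type attaches a compile-time-only tag to a primitive so that structurally identical values cannot be mixed up. The `Brand` helper and the `PatientId` and `AppointmentId` names below are hypothetical.

```typescript
// The brand exists only in the type system; at runtime these are plain strings.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

type PatientId = Brand<string, "PatientId">;
type AppointmentId = Brand<string, "AppointmentId">;

/** The single place where a raw string becomes a PatientId: validate, then cast. */
function toPatientId(raw: string): PatientId {
  if (raw.trim() === "") throw new Error("PatientId must not be empty");
  return raw as PatientId;
}

function toAppointmentId(raw: string): AppointmentId {
  if (raw.trim() === "") throw new Error("AppointmentId must not be empty");
  return raw as AppointmentId;
}

function describePatient(id: PatientId): string {
  return `patient:${id}`;
}

const patientId = toPatientId("p-123");
const appointmentId = toAppointmentId("a-456");

console.log(describePatient(patientId));
console.log(typeof appointmentId); // still a string at runtime
// describePatient(appointmentId); // compile error: AppointmentId is not a PatientId
// describePatient("p-123");       // compile error: a plain string is not a PatientId
```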
Strict
There were several difficulties during the rewriting process. Initially, we expected that modules could be converted to TS with a one-to-one equivalence in their logic, so that rewriting would be a simple matter of making the implicit type model of the JS modules explicit by adding type annotations. However, it soon became clear that almost every JS module of some complexity contained small type safety bugs, which may or may not have had a noticeable effect at runtime. In some cases the bug would only throw an exception in a rare edge case, or would cause an error or warning to be printed to the browser console but otherwise have no effect. As an aside, this application produced an enormous amount of warning and error output in the console at runtime, mainly due to type safety bugs, which harmed both debugging efforts and application performance. The presence of these bugs meant that rewriting most modules was not trivial; instead, the logic had to be redesigned in some way to introduce a coherent and explicit type model. This slowed the rewriting effort and increased the likelihood of regressions, which typically would not be discovered automatically due to the low coverage of our automated tests.
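A contrived example of the kind of redesign this involved is sketched below. The function names and the appointment-duration scenario are hypothetical, not recovered from the codebase. The loose version tolerates a value that is sometimes a string and sometimes a number; the explicit version parses the raw input once at the boundary so that the core logic only ever sees numbers.

```typescript
// JS-style version: `duration` arrives as "30" from a form and as 30 from the API.
function endMinuteLoose(startMinute: any, duration: any): any {
  return startMinute + duration; // silently concatenates when duration is a string
}

// Explicit model: the ambiguity is named and resolved in one place.
type RawDuration = string | number;

function parseDuration(raw: RawDuration): number {
  const minutes = typeof raw === "number" ? raw : Number(raw);
  if (!Number.isFinite(minutes) || minutes < 0) {
    throw new Error(`invalid duration: ${String(raw)}`);
  }
  return minutes;
}

// The core logic now has a single, coherent type.
function endMinute(startMinute: number, duration: number): number {
  return startMinute + duration;
}

console.log(endMinuteLoose(540, "30"));           // "54030"
console.log(endMinute(540, parseDuration("30"))); // 570
```

Note that the two versions are not logically equivalent: the rewrite changes behavior for string input, which is exactly why such conversions carried a risk of regression.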
Another difficulty arose from the preference for one-to-one logic equivalence. In some modules, a perfect match between the implicit and explicit type models required new or uncommonly used TS features, but we were already limited to the TS versions that would work with create-react-app (a method of bootstrapping a React application that was deprecated even at the project's initiation). It was thus challenging to settle on a configuration and TS version suitable for rewriting all modules.
Overall, the project succeeded in improving development velocity, though I didn't stick around long enough to see it completed.