Isenguard: A Self-Service Test Automation Framework
Alan Lee, Software Engineer
At Ethos Life, we have an in-house testing framework called Isenguard, and we’d like to introduce it to you as an example of the ethos (ha!) of our Test team.
FITTING INTO OUR CORE PRINCIPLES
We know that testing increases the “good” at the expense of “cheap” and “fast.” We firmly believe it is a worthy trade-off: testing increases our product quality, which reduces the risk of degrading the customer experience and helps us achieve our mission of protecting millions of families.
But how can we maximize the benefits of testing (increasing the “good”) and minimize the expense (being able to build products “cheap” and “fast”)?
By sticking to the Test team’s four operating principles, of course!
First, we want to increase efficiency and productivity, which Isenguard helped achieve by automating many manual processes and streamlining result gathering, making testing faster and less painful.*
We also aim to provide early and virtuous feedback and to make quality everyone’s responsibility. In that spirit, Isenguard was designed to be self-service: by cutting out engineers as middlepersons, users have direct access to the tool and the test results. Users can then quickly iterate on the feedback the tool provides, and since they’re closest to the implementation, they can become the guardians of the quality of the product they’re working on.
HOW WAS IT USED IN THE PAST?
During the summer of 2021, we had a major project called 100% Disclosures Automation. Its goal was to allow our Underwriting Engine to make 100% automated decisions — versus requiring a manual approval process in many cases — based on the answers the customers provided in the interview (we call these answers “disclosures”). To complete the project, analysts and PMs needed to make thousands of changes to the questions and answers, and ideally every one of those changes should be tested to confirm that they produced the expected outcomes (e.g. what product is the customer eligible for? What health rating should they receive? Should they be approved or declined?).
The old way of testing was for the author (i.e. the person making changes to the questions/answers) to go through the customer flow — sign up for a policy on the UI, input fake personal info (address, name, DOB, etc.), answer all the interview questions — and then query a couple of databases and make a couple of API calls to get the data on the outcomes. In a short test case, the whole process took ~5–7 minutes, but ~10 minutes was closer to reality.
With thousands of changes, there were tens of thousands of possible paths (an interview has a tree structure, so a different answer to the same question can lead down a different question path), and each path represented a test case.
If we had stuck to the old way of testing, assuming 10,000 test cases, we would’ve needed about 100,000 minutes (roughly 1,667 hours) to complete testing. That’s about 42 days of 10-hour-per-day work, 7 days a week, with four people dedicated to testing and nothing else. And that’s not counting the time needed to prepare test cases, inevitably rerun some cases due to test data errors or fat-finger errors caused by fatigue, and so on.
So when thinking about the testing strategy for this project, we realized that our API end-to-end integration tests already had the ability to create applications by making a series of API calls. It was certainly faster and arguably more reliable compared to using the UI.
That’s when the idea of an orchestrator service was born. We envisioned this service to ingest test data reflecting the interview changes, coordinate the appropriate API calls to create and complete policy applications, fetch the outcome data, and compile the data into a single report. It’d have a simple frontend that allowed interview authors (the users of this testing tool) to upload test data and start test runs.
Together, the orchestrator backend and the frontend became the self-service test automation framework. And in true Ethos fashion, we gave it a name: Isenguard, paying tribute to Tolkien while working in a pun (it guards against errors).
We chose TypeScript and Express to build the orchestrator backend, mostly for practical reasons. Our Consumer App’s API e2e test, which Isenguard was based on, was already built with TypeScript and Jest, so using TypeScript again allowed for a seamless transfer of code and logic. Express is a fairly lightweight and easy-to-use framework for Node, and its middleware-driven design was a natural fit for an API that essentially runs through a series of tasks (which became the middlewares) when called.
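To make that concrete, here’s a minimal sketch of what a middleware-driven test-run endpoint could look like. The route, stage names, and request shape are illustrative assumptions, not Isenguard’s actual code.

```typescript
import express, { Request, Response, NextFunction } from "express";

const app = express();
app.use(express.json());

// Each stage of a test run is expressed as a middleware; the names are illustrative.
function parseTestCases(req: Request, _res: Response, next: NextFunction) {
  // Validate and normalize the uploaded test data before the run starts.
  req.body.cases = req.body.cases ?? [];
  next();
}

async function createApplications(_req: Request, _res: Response, next: NextFunction) {
  // Placeholder: drive the series of API calls that creates and completes
  // each policy application against the backend services.
  next();
}

async function fetchOutcomes(_req: Request, _res: Response, next: NextFunction) {
  // Placeholder: query downstream services for product eligibility,
  // health rating, and approve/decline decisions.
  next();
}

function compileReport(_req: Request, res: Response) {
  // Collapse the per-case outcomes into a single report for the author.
  res.json({ status: "completed" });
}

app.post("/test-runs", parseTestCases, createApplications, fetchOutcomes, compileReport);
```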
The lightweight frontend was built with React, mostly due to our familiarity with the library.
The first technical challenge was to modify the end-to-end test’s logic so that the orchestrator could be fully data-driven: the authors could test with any combination of questions and answers, as opposed to the Consumer e2e test’s largely static test data geared towards regression testing. By leveraging a more complex data structure (a deeply nested JSON array of objects) and revamping the question-answering logic, the orchestrator was able to go through any question path presented by the Interview Engine (our microservice responsible for conducting interviews on the backend) as long as the author provided the answers they wanted to use.
While designing the data structure, we also introduced the concept of “baseline” answers to streamline test case creation. If the author only wanted to test the changes to one particular question and its follow-up questions, they only needed to include the answers for that branch of questions and nothing else, because the default “baseline” answers would be used. This sped up test case creation and reduced payload sizes.
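As a rough illustration of the data-driven traversal and the baseline fallback, here’s a simplified TypeScript sketch. The type names, question IDs, and baseline values are hypothetical, and the real payload is a much more deeply nested structure.

```typescript
type Answer = string | string[];

interface TestCase {
  name: string;
  // Only the answers the author wants to vary; everything else falls back to baselines.
  overrides: Record<string, Answer>; // keyed by question id (hypothetical shape)
}

// Default "baseline" answers used for any question the author didn't override.
const baselineAnswers: Record<string, Answer> = {
  tobacco_use: "no",
  hospitalized_recently: "no",
};

// Resolve the answer for whatever question the Interview Engine presents next.
function answerFor(questionId: string, testCase: TestCase): Answer | undefined {
  return testCase.overrides[questionId] ?? baselineAnswers[questionId];
}

// Walk the question path one question at a time; nextQuestion and submit stand in
// for the real Interview Engine API calls.
async function runInterview(
  testCase: TestCase,
  nextQuestion: () => Promise<string | null>,
  submit: (questionId: string, answer: Answer) => Promise<void>
): Promise<void> {
  let questionId = await nextQuestion();
  while (questionId !== null) {
    const answer = answerFor(questionId, testCase);
    if (answer === undefined) {
      throw new Error(`No override or baseline answer for question ${questionId}`);
    }
    await submit(questionId, answer);
    questionId = await nextQuestion();
  }
}
```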
The second technical challenge was increasing the speed of going through test cases. The Consumer API e2e test ran during off-work hours and only had 30–40 test cases, so speed wasn’t a huge concern. But to handle the large volume of cases and provide early feedback for quick iterations, we leveraged ES2020’s Promise.allSettled (not Promise.all, because we didn’t want a failed case to halt the execution of others) to run batches of test cases concurrently. Besides the overall speed, concurrency provided a nicer UX, too: the author didn’t have to upload test cases one at a time and could simply upload a big batch, start a test run, and come back later for the results. Early testing showed that executing in batches reduced each test case’s average time to ~20 seconds, from slightly over one minute when running sequentially.
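A minimal sketch of that batching approach, assuming a runTestCase function that executes a single case; the batch size and error handling are simplified.

```typescript
// Run test cases in fixed-size batches. Promise.allSettled keeps one failing case
// from halting the rest of its batch; failures are collected for the report.
async function runInBatches<T>(
  cases: T[],
  runTestCase: (testCase: T) => Promise<void>,
  batchSize = 10
): Promise<{ testCase: T; reason: unknown }[]> {
  const failures: { testCase: T; reason: unknown }[] = [];
  for (let i = 0; i < cases.length; i += batchSize) {
    const batch = cases.slice(i, i + batchSize);
    const results = await Promise.allSettled(batch.map(runTestCase));
    results.forEach((result, j) => {
      if (result.status === "rejected") {
        failures.push({ testCase: batch[j], reason: result.reason });
      }
    });
  }
  return failures;
}
```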
While solving the speed problem, we also ran into an issue with the Node event loop. Waiting for an underwriting decision was originally a synchronous, blocking operation, which caused trouble when our Kubernetes cluster ran its regular liveness probes against our services. We had to unblock that wait so I/O cycles weren’t “starved”; otherwise Isenguard would be mistakenly judged “dead” and restarted, interrupting long test runs (the longest one we did was ~4 hours).
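The fix amounts to polling with an awaited delay instead of blocking, so the event loop stays free to answer liveness probes between checks. A simplified sketch, with getDecision standing in for the real status check:

```typescript
// A non-blocking wait: sleeping via setTimeout yields back to the event loop on
// every iteration, so liveness probes (and other I/O) keep getting served.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// getDecision is a placeholder for the call that checks whether underwriting has decided.
async function waitForDecision(
  getDecision: () => Promise<string | null>,
  pollIntervalMs = 2000,
  timeoutMs = 10 * 60 * 1000
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const decision = await getDecision();
    if (decision !== null) return decision;
    await sleep(pollIntervalMs); // yield to the event loop between polls
  }
  throw new Error("Timed out waiting for an underwriting decision");
}
```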
Examples of other challenges included choosing a reporting format that balanced richness of info and readability (we settled on CSV), retrying a test run with cases that received errors that were not test data related (e.g. upstream service errors), and identifying the bottlenecks within a test run to shave off seconds here and there.
By the time we reached the finish line of this major project, Isenguard allowed us to crunch through about 14,000 test cases in roughly two weeks, with one analyst focused on testing full-time. That throughput was not something we had thought possible, and it gave us much more confidence in rolling out this highly complex project on a tight timeline.
CURRENT USAGE
Since the end of the 100% Disclosures Automation project, minor features have been added to Isenguard to test various rules-automation work. More recently, we realized Isenguard had essentially been running something close to an end-to-end test for the Underwriting Engine’s services, so we introduced more features to tailor it to that need. One notable feature is the integration with Cypress UI test suites: the lengthy application creation is handled by Isenguard, and then the Cypress tests take over by searching for the policy in our Workbench UI and validating the accuracy of the decisions and policy statuses. We automated the e2e test runs with GitHub Actions and added the capability to send test summaries to a Slack channel, with embedded links to download CSVs and to rerun tests if desired.
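The Slack piece can be as simple as posting a summary to an incoming webhook. Here’s a rough sketch of that idea; the summary shape and message format are assumptions, not Isenguard’s actual integration. (This assumes Node 18+, where fetch is available globally.)

```typescript
// Hypothetical summary shape for a completed e2e run.
interface RunSummary {
  total: number;
  passed: number;
  failed: number;
  csvUrl: string;   // link to download the CSV report
  rerunUrl: string; // link to rerun failed cases
}

// Post the summary to a Slack incoming webhook.
async function postSummaryToSlack(summary: RunSummary, webhookUrl: string): Promise<void> {
  const text =
    `E2E run finished: ${summary.passed}/${summary.total} passed, ${summary.failed} failed.\n` +
    `Report: ${summary.csvUrl} | Rerun: ${summary.rerunUrl}`;
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
}
```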
This is already saving us a lot of time (about five engineer-hours per week on the Underwriting team’s pre-deployment checks that used to be manual tests) and giving us more signal on the reliability of the Workbench by letting us run daily automated regression tests.
(To get a glimpse of our complex and fascinating Underwriting Engine, check out this blog post.)
CAPABILITIES FOR THE FUTURE
Isenguard is now the backbone for the Underwriting team to test product changes and enhancements quickly, which in turn allows us to increase confidence and velocity in our releases. So, naturally, we will be extending its capabilities.
For example, we have several initiatives focusing on reducing costs by more intelligently determining when or whether to order certain third party evidence. As a result, the underwriting process will increase in complexity, with the timing of those external API calls becoming even more critical. Here, Isenguard has room to grow to become a more comprehensive e2e testing tool.
Stay tuned for more blogs on how we build tools to solve challenges in order to serve our customers with high quality products!
Alan Lee joined Ethos in April 2021 as a Software Engineer on the Test team. When Alan isn’t building tools at Ethos, he enjoys herding the dog and cats, hiking with family, and taking the occasional road trip. Interested in joining Alan’s team? Learn more about our career opportunities here.
*Not being dramatic here. One PM, in her presentation, used “painful” to describe the testing process for interview changes.