Testing on Kotlin Multiplatform Mobile and a Strategy to Speed Up Development Time (2022)
The main focus of Kotlin Multiplatform is to avoid duplicating domain logic in different programming languages. You write it once and reuse it on different platforms.
Because the KMM code base is the heart of multiple platforms, it needs to work correctly. If the shared code is broken, then all its platforms will work incorrectly. In my opinion, the best way to ensure that something works correctly is to write a test for it.
In this article, I'll share my experience writing tests for Kotlin Multiplatform and my strategy for speeding up development using tests. At FootballCo we use this strategy for our app, and we see that it helps our development cycle.
Even though the article focuses on Kotlin Multiplatform, a lot of the principles can also be applied to plain Kotlin applications or any other type of application.
The Kotlin Multiplatform testing ecosystem
Testing Framework
Compared to the JVM, the Kotlin Multiplatform ecosystem is still relatively young. JUnit can only be used on JVM platforms; other platforms depend on the Kotlin standard library testing framework (kotlin.test).
An alternative way to test Kotlin Multiplatform code is to use a different testing framework, such as Kotest. I don't have much experience with it, but I found it less reliable than writing tests with the standard testing framework. For example, I was unable to run a single test case (a function) through the IDE, although this may come down to an incorrect set-up on my part.
Assertions
Kotest also has an assertion library in addition to the testing framework, which can be used alongside the Kotlin standard library testing framework. Another library is Atrium; however, I haven't had a chance to try it yet, so I can't give an opinion on it.
Mocking
For a long time, Kotlin Multiplatform did not have a mocking framework because mocking on Kotlin / Native is really complex. Things seem to have changed because Mockk now supports Kotlin Multiplatform. However, there still might be issues on the Kotlin / Native side. An alternative to using a mocking framework is writing the mock or any other test double by hand, which will be explained in more detail in the next section.
Mocking is prevalent in tests that treat a single class as a unit, so let's touch on what can be defined as a unit before diving into the Kotlin Multiplatform testing strategy.
Definition of a Unit
One unit, one class
On Android, a unit is usually considered to be one class whose dependencies are all mocked, probably using a framework like Mockito or Mockk. These frameworks can easily be abused, which leads to brittle tests that are coupled to the implementation details of the system under test. However, these types of unit tests are the easiest to write and read (given that the number of mocks is not that high).
Another benefit is that the APIs of all the internal dependencies (class names, function signatures etc.) become more refined because they are used inside the tests (e.g. for set-up or verification) through mocks. The downside is that these mocks often make refactoring harder, since changing implementation details (for example, extracting a class) will most likely break the test (because the extracted class needs to be mocked), even though the behavior of the feature did not change.
These types of tests work in isolation and only verify that one unit (in this case, a class) works correctly. In order to verify that a group of units behaves correctly together, additional integration tests are needed.
One unit, multiple classes
An alternative way of thinking about a unit could be a cohesive group of classes for a given feature. These tests try to use real dependencies instead of mocks, however the boundary dependencies (e.g. networks, persistence etc.) are still replaced with a test double (usually written by hand instead of mocked by a framework).
The most frequent test doubles for boundary dependencies are Fakes, which resemble the real implementation but in a simpler form to allow testing (e.g. replacing a real database with an in-memory one). The benefits and downsides of this approach will be discussed at the end of this article.
Side note: the term "mock" is overloaded and misused in a lot of places. I won't get into the reasons why, but if you're interested in learning more about it and other test doubles, take a look here:
My strategy for testing Kotlin Multiplatform
Because mocking in Kotlin Multiplatform is far from perfect, I went down the path of writing test doubles by hand. The problem with this is that if we wanted to write every Kotlin Multiplatform unit test like on Android (unit == class), we would be forced to create an interface for every class along with a test double for it, which would add unnecessary complexity just for testing purposes.
This is why I decided to treat a unit as a feature / behavior (group of classes). This way, there are fewer test doubles involved, and the system is tested in a more production-like setting.
Depending on the complexity, the tests might become integration tests rather than unit tests, but in the grand scheme of things it's not that important as long as the system is properly tested.
The system under test
Most of the time, the system under test would be the public domain class that the Kotlin Multiplatform module exposes. If we had a feature that allowed the user to input a keyword and get a search result based on that keyword, the functional API could have the following signature:
fun performSearch(input: String): List<String>
This could be a function of an interface, a use case or anything else. The critical part is that this is the public function that is called from the platforms.
Tests for this feature could look like this:
class SuccessfulSearchTest
class NetworkErrorSearchTest
class InvalidKeywordSearchTest
Each test class exercises a different path that the system could take: in this case, one happy path and two unhappy paths. The tests could focus only on the domain layer, where the network API is faked, or they could also include the data layer, where the real network layer is used but mocked somehow (e.g. Ktor MockEngine, MockWebServer, a mocked HTTP interceptor).
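As a rough sketch of the domain-only variant, the feature could be wired to a hand-written fake as shown below. Only performSearch comes from the signature above; SearchApi, FakeSearchApi, NetworkException and SearchEngine's constructor are hypothetical names, and mapping both failure paths to an empty list is an assumption made for illustration, not necessarily how a real feature would report errors.

```kotlin
// Boundary dependency: the only part that gets replaced by a test double.
interface SearchApi {
    fun search(keyword: String): List<String>
}

// Hypothetical error type thrown by the network layer.
class NetworkException(message: String) : Exception(message)

// Hand-written fake: returns canned results, or fails on demand.
class FakeSearchApi(
    private val results: List<String> = emptyList(),
    private val shouldFail: Boolean = false,
) : SearchApi {
    // Recorded so a test can check whether the network was reached at all.
    val requestedKeywords = mutableListOf<String>()

    override fun search(keyword: String): List<String> {
        requestedKeywords.add(keyword)
        if (shouldFail) throw NetworkException("no connection")
        return results.filter { it.contains(keyword, ignoreCase = true) }
    }
}

// System under test: the public entry point called from the platforms.
class SearchEngine(private val api: SearchApi) {
    fun performSearch(input: String): List<String> {
        val keyword = input.trim()
        // Placeholder validation rule (assumption): at least 3 characters.
        if (keyword.length < 3) return emptyList()
        return try {
            api.search(keyword)
        } catch (e: NetworkException) {
            // Assumption: a network failure surfaces as an empty result.
            emptyList()
        }
    }
}
```

The happy-path test would build SearchEngine with a FakeSearchApi holding canned results, the network error test would pass shouldFail = true, and the invalid keyword test could additionally check that requestedKeywords stayed empty. None of them need to know which classes SearchEngine uses internally.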
The keyword validation might contain a lot of edge cases, which may be hard to test through InvalidKeywordSearchTest, since that class should focus only on the domain aspects of what happens when a keyword is invalid. All the edge cases could be tested in a separate class:
class KeywordValidatorTest {
    fun `"ke" is invalid`()
    fun `"key" is valid`()
    fun `" ke " is invalid`()
    fun `" key " is valid`()
}
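For illustration, a validator that would satisfy those four cases might look like the sketch below. The rule (trim, then require at least three characters) is inferred purely from the test names, and KeywordValidator, isValid and minLength are hypothetical names.

```kotlin
// Hypothetical validator; the rule is inferred from the test names only.
class KeywordValidator(private val minLength: Int = 3) {
    // Surrounding whitespace is ignored before the length check.
    fun isValid(keyword: String): Boolean = keyword.trim().length >= minLength
}

// The edge cases expressed as data: input mapped to its expected validity.
val keywordCases: Map<String, Boolean> = mapOf(
    "ke" to false,
    "key" to true,
    " ke " to false,
    " key " to true,
)
```

Each entry can back one test function, or a single test can loop over the map. Either way, the edge cases stay out of the broader feature tests.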
Of course, for more complex features, the tests can be written in a more granular manner and also focus on testing internal classes. However, testing the KMM "public API" is a good start.
Test set up
Tests which use real dependencies can be hard to set up and can involve a lot of boilerplate. One way to mitigate this would be to create an object mother, which could be a top-level function that creates the system under test, making it usable in multiple test classes.
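A minimal sketch of such an object mother is shown below. The SearchApi, SearchEngine and FakeSearchApi declarations are simplified, hypothetical stand-ins that only exist so the example compiles on its own; the part of interest is the createSearchEngine function at the end.

```kotlin
// Simplified stand-ins (hypothetical) so the sketch is self-contained.
interface SearchApi {
    fun search(keyword: String): List<String>
}

class SearchEngine(private val api: SearchApi) {
    fun performSearch(input: String): List<String> = api.search(input.trim())
}

class FakeSearchApi(private val results: List<String>) : SearchApi {
    override fun search(keyword: String): List<String> =
        results.filter { it.contains(keyword, ignoreCase = true) }
}

// Object mother: one place that knows how to wire the system under test.
// The defaults describe the common case; a test overrides only what it cares about.
fun createSearchEngine(
    results: List<String> = listOf("Kotlin", "Koin", "Ktor"),
    api: SearchApi = FakeSearchApi(results),
): SearchEngine = SearchEngine(api)
```

A test can then call createSearchEngine() for the default set-up, or createSearchEngine(results = listOf("Swift")) when the data matters. If the constructor of the system under test changes, only this function has to be updated.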
Another way to achieve this would be to use dependency injection. Luckily, Koin allows for easy test integration, which more or less comes down to this:
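import kotlin.test.AfterTest
import kotlin.test.BeforeTest
import org.koin.core.context.startKoin
import org.koin.core.context.stopKoin
import org.koin.test.KoinTest
import org.koin.test.inject

class SuccessfulSearchTest : KoinTest {
    // Resolved lazily from the Koin graph started in setUp()
    private val searchEngine: SearchEngine by inject()

    @BeforeTest
    fun setUp() {
        startKoin {
            modules(systemUnderTestModule)
        }
    }

    @AfterTest
    fun teardown() {
        // Stop Koin so that each test starts from a fresh graph
        stopKoin()
    }

    // ...
}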
The test needs a Koin module which will provide all the needed dependencies. If the Kotlin Multiplatform code base is modularized (like in my KMM modularization article), the systemUnderTestModule could be the public Koin module that is attached to the dependency graph (e.g. module, dependency graph). An example test suite which uses Koin for test set up can be found in my ktor-mock-tests repository.
Contract tests
When creating test doubles, there might be a point when they start becoming complex, just because the production code they are replacing is also complex. Writing tests for test helpers might seem unnecessary, but how would you otherwise prove that a test double behaves like its production counterpart? Contract tests serve that exact purpose: they verify that multiple implementations behave in the same way (that their contract is preserved).
For example, suppose the system under test uses a database to persist its data. Using a real database for every test will make the tests run a lot longer. To help with this, a fake database could be written to make the tests faster. The real database would then be used in only one test class, and the fake in all the remaining ones.
Let's say that the real database has the following rules (contract):
- adding a new item updates a "reactive stream"
- new items cannot overwrite an existing item if their id is the same
The contract base test could look like this:
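import io.kotest.assertions.assertSoftly
import io.kotest.matchers.collections.shouldContain
import io.kotest.matchers.collections.shouldNotContain
import kotlin.test.Test

abstract class DatabaseContractTest {
    // Each subclass supplies the implementation under test
    abstract var sut: Dao

    @Test
    fun `New items are correctly added`() {
        val item = Item(1, "name")
        sut.addItem(item)
        sut.items shouldContain item
    }

    @Test
    fun `Items with the same id are not overwritten`() {
        val existingItem = Item(1, "name")
        sut.addItem(existingItem)
        val newItem = Item(1, "new item")
        sut.addItem(newItem)
        // Report both assertion results instead of stopping at the first failure
        assertSoftly {
            sut.items shouldNotContain newItem
            sut.items shouldContain existingItem
        }
    }
}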
The base class contains the tests which will be run on the implementations (real and fake database):
class SqlDelightDatabaseContractTest : DatabaseContractTest() {
    override var sut: Dao = createRealDatabase()
}

class FakeDatabaseContractTest : DatabaseContractTest() {
    override var sut: Dao = createFakeDatabase()
}
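To make the fake side more concrete, here is a minimal in-memory sketch that would follow both contract rules. The items property and addItem function are taken from the contract test above; the exact shape of Dao and Item is assumed, and observe is a hypothetical plain callback standing in for the "reactive stream" so that the example needs nothing beyond the standard library.

```kotlin
data class Item(val id: Int, val name: String)

// Assumed shape of the Dao used by the contract test. `observe` is a
// hypothetical stand-in for the reactive stream.
interface Dao {
    val items: List<Item>
    fun addItem(item: Item)
    fun observe(listener: (List<Item>) -> Unit)
}

class FakeDao : Dao {
    // LinkedHashMap keeps insertion order; entries are keyed by item id.
    private val storage = LinkedHashMap<Int, Item>()
    private val listeners = mutableListOf<(List<Item>) -> Unit>()

    override val items: List<Item>
        get() = storage.values.toList()

    override fun addItem(item: Item) {
        // Contract rule: an item with an existing id is never overwritten.
        if (storage.containsKey(item.id)) return
        storage[item.id] = item
        // Contract rule: every successful insert notifies the observers.
        listeners.forEach { it(items) }
    }

    override fun observe(listener: (List<Item>) -> Unit) {
        listeners.add(listener)
        // Emit the current state as soon as someone subscribes.
        listener(items)
    }
}

fun createFakeDatabase(): Dao = FakeDao()
```

Because the fake is this small, it is tempting to trust it blindly. Running the same contract test against it and against the real database is what justifies using it everywhere else.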
This is just a trivial example to show a glimpse of what can be done with contract tests. If you'd like to learn more about this, feel free to check out these resources:
Big shout out to Jov Mit for creating so much Android-related testing content, which inspired this testing strategy (if you're interested in Test Driven Development, be sure to check out his insightful screencast series on YouTube).
Benefits
Development speed
The strategy I'm proposing would verify the KMM feature / module correctness at a larger scale instead of focusing on verifying individual classes. This more closely resembles how the code behaves in production, which gives us more confidence that the feature will work correctly in the application. This in turn means that there is less need to actually open up the application every time.
Building applications using Kotlin Multiplatform usually takes longer than building their fully native counterparts. The Android app can be built relatively fast thanks to incremental compilation on the JVM; for iOS, however, the story is different. Kotlin / Native compilation in itself is pretty fast. The issue arises when creating the Objective-C binary, where the Gradle tasks linkDebugFrameworkIos and linkReleaseFrameworkIos are called (and take a really long time to complete). Luckily, tests avoid that because they only compile Kotlin / Native without creating the Objective-C binary.
Ignoring the build speed issues, let's say that the build didn't take longer. Building the whole application means building all of its parts. But when we work on a feature, we typically only want to focus on and verify a small portion of the entire application. Tests allow just that: verifying only a portion of the app without needing to build everything. When we're finished working on a feature, we can plug the code into the application and verify that it correctly integrates with other parts of the application.
Test function / test case names
Because these tests focus more on the end result of the feature rather than on implementation details of a single class, the test function names reflect the behavior of the feature. A lot of the time, this behavior also represents the business requirements of the system.
Refactoring
With this testing strategy, refactoring would be easier because the tests don't dig into the implementation details of the system under test (like mocks tend to do). They only focus on the end result: as long as the behavior remains the same*, the tests don't care how it was achieved.
* And it should be the same, since that's what refactoring is all about.
Kotlin Multiplatform threading*
One of the biggest benefits of this approach is that it also verifies the correctness of the threading on Kotlin / Native. The old memory model is really strict when it comes to mutating data on a background thread. This causes a lot of unexpected errors which only happen on iOS, which unfortunately has a much higher compilation time than Android. Tests that include all the thread switching and mutations can verify the threading correctness much, much faster than running the iOS app ever could.
* Before the new memory model.
Test double reusability
The last thing I want to touch on is test double reusability. To keep the code DRY, the test doubles could also be moved to a common testing module, which helps with their reusability. For example, the data layer test doubles (e.g. network or persistence) can often be reused for UI tests. An example of this can be found in my Ktor Mock Engine article, where the integration tests and UI tests use the same engine for returning predefined data (not strictly a test double, but you get the idea). The repository in the article is Android only, but it can easily be applied to Kotlin Multiplatform since Ktor has great support for it.
Downsides
Test speed
The first thing I want to address is the test speed because no one wants to wait too long for the tests to complete. Tests which treat a unit as a class are super fast, but only when they use plain, hand-written test doubles. Mocking frameworks with all their magic take up a lot of time and make the tests much slower compared to a test double written by hand.
The test strategy I'm proposing does not use any mocking framework, only test doubles written by hand. However, the tests use a lot more production code in a single test case, which does take more time. From my experience working with these types of tests on Kotlin Multiplatform, I didn't see anything worrying about the test speed (besides Kotlin / Native taking longer). Additionally, if the KMM code base is modularized then only the tests from a given module are executed, which is a much smaller portion of the code base.
Test speed is a subjective topic, and everyone has a different opinion on it. Martin Fowler has an interesting article which touches on this topic:
Test readability
As I said before, tests using real dependencies can involve some boilerplate and set-up code. Real dependencies cannot be brought into an ideal testing state as easily as test doubles; most of the time, the set-up will be longer and more difficult.
This brings me to the point that these types of tests require more knowledge about how the system under test works internally: not only its dependencies, but sometimes dependencies of dependencies etc. This is a double-edged sword, because the tests are not as easy to understand, but once you understand them you'll most likely know how the system under test works.
Hard to define what should be tested
Tests where a unit is a class are easy to write because they always focus on a single class. When a unit is a group of classes, it is hard to define what the group should be: how deep should the test go?
Unfortunately, there is no rule that works in every case. Every system is different, and has different business requirements. If you start noticing that the test class is becoming too big and too complex, this might be a sign that the test goes too deep. In this case, maybe the most nested / complex dependencies could have their own test classes (and be replaced by a test double in the broader tests).
Summary
In my opinion, Kotlin Multiplatform should be the most heavily tested part of the whole application. It is used by multiple platforms, so it should be as bulletproof as possible.
Writing tests during development can cut down on compilation time and give confidence that any future regressions (even 5 minutes later) will be caught by the test suite.
If you want to improve your Kotlin Multiplatform tests, check out my other articles: