04/27/2020

Migrating Transeo to Vapor 4

Over the past few weeks, our team has been working to upgrade our Vapor 3 codebase for the Transeo backend to the latest and greatest Vapor 4. This process was significantly easier than upgrading to Vapor 3, but it still had some bumps along the road that I think are worth documenting and sharing for others.

Context

For some context, Transeo is a web application that provides software to school districts that helps them manage their college, career, and life readiness indicators. We consume mass amounts of data from student profiles, activities, and job experiences, and use that data to surface insights for administrators around compliance, equity, and inclusion. We serve hundreds of thousands of users in states all around the country and are growing quickly.

In terms of the tech, we produce terabytes of log files, support ~220 database models, and maintain hundreds of unit and integration tests. It's nowhere close to a Google/Apple/Facebook codebase, but it's a lot to manage for a company as tiny as ours.

All of that is to say: we knew from day 1 that the migration was going to take a lot of work. To estimate schedules and resources, we turned to the last big migration we had performed, the conversion from Vapor 2 to Vapor 3. That migration took around 3 months of non-stop work since it was such a massive paradigm shift. This time, when all was said and done, we were able to convert our codebase from Vapor 3 to Vapor 4 in around 90 hours of work. That being said, your experience may differ based on lots of different factors, so take my advice with a grain of salt :)

The Migration Plan

One of our core requirements was 1:1 parity with our existing Vapor 3 backend. We serve a lot of users who rely on us to do their jobs every day - even one feature missing or broken because of the migration would be a problem.

When we started we considered two different options:

  1. Work inside of our existing codebase, update Vapor from the Package.swift file, and then start updating the actual code
  2. Start with a brand new project setup to use Vapor 4 and migrate the code from the Vapor 3 codebase into the Vapor 4 codebase, piece by piece.

I'm a big fan of receiving feedback at every step (both from the compiler and other humans), so I wanted something that we could build along the way to ensure that everything was migrating smoothly, as opposed to facing thousands of errors at the end. We decided to go with route #2 and it worked well for us.

The Process

After we scaffolded the project our process had very clear steps:

  1. Set up the configuration files
  2. Start with the models
  3. Add extensions, helpers, and DTO models
  4. Add in tests
  5. Start uncommenting routes, one by one
  6. Make tests pass

Step 1 - Configuration

Step 1 was straightforward as the project came with the necessary configuration files in place. We did a little bit of work to add support for Queues and MySQL, but past that it was easy.

Step 2 - Models

Step 2 is where we spent the vast majority of the time during the migration - to put it lightly, moving our models over was very labor intensive.

Let's look at a simple Vapor 3 model we had in our app:

import Foundation
import Vapor
import FluentMySQL

final class Tag: Content {
    var id: Int?
    
    var name: String
    var sendsParentEmail: Bool
    
    var createdAt: Date?
    var updatedAt: Date?
    var deletedAt: Date?
    
    init(name: String, sendsParentEmail: Bool) {
        self.name = name
        self.sendsParentEmail = sendsParentEmail
    }
}

extension Tag: MySQLModel {
    static var entity = "tags"
}

extension Tag: Migration {
    typealias Database = MySQLDatabase
    
    static func prepare(on connection: Database.Connection) -> Future<Void> {
        return Database.create(self, on: connection) { builder in
            try addProperties(to: builder)
        }
    }
}

extension Tag: Parameter { }

//MARK: - Timestamps + soft delete
extension Tag {
    static var createdAtKey: TimestampKey? = \.createdAt
    static var updatedAtKey: TimestampKey? = \.updatedAt
    static var deletedAtKey: TimestampKey? = \.deletedAt
}

This model is about as simple as it gets with Vapor 3 - two properties, no relationships, and a straightforward migration. Moving this over to Vapor 4 isn't hard, but it takes some mental rewiring. Here's what our model looks like now:

import Foundation
import Vapor
import Fluent

final class Tag: Content {
    @ID(custom: .id)
    var id: Int?
    
    @Field(key: "name")
    var name: String
        
    @Field(key: "sendsParentEmail")
    var sendsParentEmail: Bool
    
    @Timestamp(key: "createdAt", on: .create)
    var createdAt: Date?
    
    @Timestamp(key: "updatedAt", on: .update)
    var updatedAt: Date?
    
    @Timestamp(key: "deletedAt", on: .delete)
    var deletedAt: Date?
    
    init() { }
    
    init(name: String, sendsParentEmail: Bool) {
        self.name = name
        self.sendsParentEmail = sendsParentEmail
    }
}

extension Tag: Model {
    static let schema = "tags"
}

struct CreateTagMigration: Migration {
    func prepare(on database: Database) -> EventLoopFuture<Void> {
        return database.schema(Tag.schema)
            .autoIncrementingId()
            .field(\Tag.$name, .string, .required)
            .field(\Tag.$sendsParentEmail, .bool, .required)
            .timestampFields()
            .create()
    }
    
    func revert(on database: Database) -> EventLoopFuture<Void> {
        return database.schema(Tag.schema).delete()
    }
}

extension Tag: ParameterModel { }

(Side note - you can learn more about our custom helpers like ParameterModel in the next section)

The model isn't that much more complex in Vapor 4, but shifting your mindset to property wrappers instead of normal properties on a model takes a bit of time.

On the flip side, here's a more complicated model with relationships in Vapor 3:

import Foundation
import Vapor
import FluentMySQL

final class DistrictTag: Content {
    var id: Int?
    
    var district_id: District.ID
    var tag_id: Tag.ID
    
    var createdAt: Date?
    var updatedAt: Date?
    var deletedAt: Date?
    
    init(district_id: District.ID, tag_id: Tag.ID) {
        self.district_id = district_id
        self.tag_id = tag_id
    }
}

extension DistrictTag: MySQLModel {
    static var entity = "district_tags"
}

extension DistrictTag: Migration {
    typealias Database = MySQLDatabase
    
    static func prepare(on connection: Database.Connection) -> Future<Void> {
        return Database.create(self, on: connection) { builder in
            builder.field(for: \.id, isIdentifier: true)
            builder.field(for: \.district_id)
            builder.field(for: \.tag_id)
            builder.field(for: \.createdAt)
            builder.field(for: \.updatedAt)
            builder.field(for: \.deletedAt)
            
            builder.reference(from: \.district_id, to: \District.id)
            builder.reference(from: \.tag_id, to: \Tag.id)
        }
    }
}

extension DistrictTag {
    static var createdAtKey: TimestampKey? = \.createdAt
    static var updatedAtKey: TimestampKey? = \.updatedAt
    static var deletedAtKey: TimestampKey? = \.deletedAt
}

And in Vapor 4:

import Foundation
import Vapor
import Fluent

final class DistrictTag: Content {
    @ID(custom: .id)
    var id: Int?
    
    @Parent(key: "district_id")
    var district: District
    
    @Parent(key: "tag_id")
    var tag: Tag
    
    @Timestamp(key: "createdAt", on: .create)
    var createdAt: Date?
    
    @Timestamp(key: "updatedAt", on: .update)
    var updatedAt: Date?
    
    @Timestamp(key: "deletedAt", on: .delete)
    var deletedAt: Date?
    
    init() { }
    
    init(district_id: District.IDValue, tag_id: Tag.IDValue) {
        self.$district.id = district_id
        self.$tag.id = tag_id
    }
}

extension DistrictTag: Model {
    static let schema = "district_tags"
}

struct CreateDistrictTagMigration: Migration {
    func prepare(on database: Database) -> EventLoopFuture<Void> {
        return database.schema(DistrictTag.schema)
            .autoIncrementingId()
            .relation(\DistrictTag.$district.$id, required: true, references: \District.$id)
            .relation(\DistrictTag.$tag.$id, required: true, references: \Tag.$id)
            .timestampFields()
            .create()
    }
    
    func revert(on database: Database) -> EventLoopFuture<Void> {
        return database.schema(DistrictTag.schema).delete()
    }
}

The bulk of the time we spent migrating models to Vapor 4 went to models that had a large number of relationships. The DistrictTag example above only has two relationships and it's pretty simple. It started getting ugly when we got into models that had 5+ relationships and we needed to keep all of the field names straight (remember - we needed 1:1 parity with our existing codebase).

Additionally, due to the lack of type safety in migrations and the speed at which we were moving, there were quite a few instances where we marked a property required (or left off that requirement) when we shouldn't have. These bugs only surfaced much later when we got the application up and running and hit very specific edge cases (or, in lucky cases, our tests caught it).

All that being said, migrating our models first proved to be super important. All of our routes, DTOs, helpers, and queue jobs are based around these models - I don't think there's a single route that doesn't touch at least one of them. We never would have been able to complete the migration as quickly as we did without starting here.

Step 3 - Extensions and Helpers

One of the most beneficial strategies we implemented early on was using a collection of extensions and helpers to make more code compile as-is. Some of these we removed after the migration was complete, updating the call sites with a quick find and replace, and some we've left in place.

Migrations

These extensions make our migrations a little more DRY:

// Helpers for auto incrementing int ids and timestamps:
extension SchemaBuilder {
    public func autoIncrementingId() -> Self {
        self.field("id", .int, .identifier(auto: true))
    }
    
    public func timestampFields() -> Self {
        self
            .field("createdAt", .datetime)
            .field("updatedAt", .datetime)
            .field("deletedAt", .datetime)
    }
}

// Then, in your migration:
return database.schema(User.schema)
    .autoIncrementingId()
    .timestampFields()

Additionally, we've been using the generic schema builder from this gist. These extensions are a bit controversial, as they circumvent design decisions that were made specifically to keep schemas from being type-safe, but to get up and running as quickly as possible we decided to use them.

Parameters

These helpers restore familiar syntax to Vapor 4 from Vapor 3 regarding parameters.

// Restores some of the type-safe functionality of parameters (parts of this based off of @0xtim's work)
// A few gotchas: the IDValue on the Model must be an Int and this doesn't work with multiple parameters of the same type in the path
import Vapor
import Fluent

protocol ParameterModel {
    static var parameterKey: String { get }
    static var parameter: PathComponent { get }
}

extension ParameterModel where Self: Model {
    static var parameterKey: String {
        return Self.schema
    }
    
    static var parameter: PathComponent {
        return PathComponent(stringLiteral: ":\(Self.parameterKey)")
    }
}

extension Request {
    // `future(error:)` is the Request helper defined in the Futures section.
    func next<M: ParameterModel & Model>(_ type: M.Type) -> EventLoopFuture<M> where M.IDValue == Int {
        guard let stringValue = self.parameters.get(M.parameterKey) else {
            return future(error: Abort(.badRequest, reason: "Could not find \(M.parameterKey) parameter"))
        }
        
        guard let intValue = Int(stringValue) else {
            return future(error: Abort(.badRequest, reason: "Could not convert \(M.parameterKey) parameter to int"))
        }
        
        return M
            .find(intValue, on: db)
            .unwrap(or: Abort(.badRequest, reason: "Could not find a \(M.self) with ID of \(intValue)"))
    }
}


// In the model:
extension User: ParameterModel { }

// In the router:
routes.get(User.parameter, use: getUser)

// In the route:
let userQuery = req.next(User.self)

Futures

We also added these extensions to Request that call through to the underlying EventLoop helpers. This makes the call site a little cleaner (req.future vs req.eventLoop.future).

extension Request {
    /// Creates a new, succeeded `Future` from the worker's event loop with a `Void` value.
    ///
    ///    let a: Future<Void> = req.future()
    ///
    /// - Returns: The succeeded future.
    public func future() -> EventLoopFuture<Void> {
        return self.eventLoop.future()
    }
    
    /// Creates a new, succeeded `Future` from the worker's event loop.
    ///
    ///    let a: Future<String> = req.future("hello")
    ///
    /// - Parameter value: The value that the future will wrap.
    /// - Returns: The succeeded future.
    public func future<T>(_ value: T) -> EventLoopFuture<T> {
        return self.eventLoop.future(value)
    }
    
    /// Creates a new, failed `Future` from the worker's event loop.
    ///
    ///    let b: Future<String> = req.future(error: Abort(...))
    ///
    /// - Parameter error: The error that the future will wrap.
    /// - Returns: The failed future.
    public func future<T>(error: Error) -> EventLoopFuture<T> {
        return self.eventLoop.future(error: error)
    }
}

Large Post Bodies

Right now it's only possible to set the max incoming body size on a per route basis (pending Vapor #2312) so we added the following extensions in the meantime:

extension RoutesBuilder {
    @discardableResult
    public func postLarge<Response>(
        _ path: PathComponent...,
        use closure: @escaping (Request) throws -> Response
    ) -> Route
        where Response: ResponseEncodable
    {
        return self.on(.POST, path, body: .collect(maxSize: 10 << 20), use: closure)
    }

    @discardableResult
    public func patchLarge<Response>(
        _ path: PathComponent...,
        use closure: @escaping (Request) throws -> Response
    ) -> Route
        where Response: ResponseEncodable
    {
        return self.on(.PATCH, path, body: .collect(maxSize: 10 << 20), use: closure)
    }
}

We plan to remove these once the global max body size is configurable. To start, we just did a simple find and replace, swapping routes.post for routes.postLarge. (Side note - 10 << 20 bytes is 10 MiB, the size we need for our application; feel free to change that to whatever size you need.)
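For reference, the `10 << 20` in the helpers above is a bit shift: shifting left by 20 bits multiplies by 2^20, which is 1,048,576 bytes. A minimal sketch of that arithmetic follows. The `mebibytes` helper is a hypothetical name introduced for illustration and is not part of Vapor.

```swift
// Hypothetical helper (not a Vapor API): converts a MiB count to bytes.
// Shifting left by 20 bits is the same as multiplying by 1_048_576 (2^20).
func mebibytes(_ count: Int) -> Int {
    return count << 20
}

let maxBodySize = mebibytes(10)
print(maxBodySize) // 10485760
```

A named helper like this may make the limit easier to spot and change than a bare shift expression repeated in every route helper.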

Model Lifecycle Configuration

Vapor 4 introduces a cool new way to hook into the Model lifecycle (before delete, etc) but adding it to the configuration file can get a little verbose if you have a lot of middleware objects. We added the following function:

import Fluent

public func lifecycle(_ app: Application) {
    app.databases.middleware.use(MyModelMiddleware(), on: .mysql)
}

And then from configure.swift we call:

//MARK: - Register Model Lifecycles
lifecycle(app)

Key Storage

We had a concept built in Vapor 3 that we call KeyStorage which sucks in a bunch of environment variables and allows routes, queue jobs, etc to use those variables without repeating the typical Environment.get call. We ported this to Vapor 4 as well. Here's what that looks like:

struct KeyStorage {
    let mySuperSecretApiKey: String
}

extension Application {
    struct KeyStorageKey: StorageKey {
        typealias Value = KeyStorage
    }
    
    var keyStorage: KeyStorage {
        get {
            guard let val = self.storage[KeyStorageKey.self] else { fatalError("Register key storage") }
            return val
        }
        set {
            self.storage[KeyStorageKey.self] = newValue
        }
    }
}

extension Request {
    var keyStorage: KeyStorage { self.application.keyStorage }
}

Then, in configure.swift:

app.keyStorage = KeyStorage(mySuperSecretApiKey: Environment.get("SUPER_SECRET_API_KEY") ?? "")

Finally, in the route:

req.keyStorage.mySuperSecretApiKey
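One thing worth noting about the configure.swift line above: `?? ""` quietly substitutes an empty string when the variable is missing. If failing at boot is preferable, a sketch along these lines could work. It reads from a plain dictionary so that it runs without Vapor. `StrictKeyStorage`, `KeyStorageError`, and the `environment:` initializer are hypothetical names introduced for illustration.

```swift
import Foundation

// Hypothetical error type for a missing or empty environment variable.
struct KeyStorageError: Error, Equatable {
    let missingKey: String
}

// Hypothetical stricter variant of KeyStorage: construction fails
// instead of falling back to an empty string.
struct StrictKeyStorage {
    let mySuperSecretApiKey: String

    init(environment: [String: String]) throws {
        guard let key = environment["SUPER_SECRET_API_KEY"], !key.isEmpty else {
            throw KeyStorageError(missingKey: "SUPER_SECRET_API_KEY")
        }
        self.mySuperSecretApiKey = key
    }
}

// Usage at boot, passing the real process environment:
// let storage = try StrictKeyStorage(environment: ProcessInfo.processInfo.environment)
```

Whether to fail fast or fall back to a default is a judgment call; the fallback keeps local development friction low, while the throwing initializer surfaces a misconfigured deployment immediately.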

Step 4 - Adding Tests

After we got all of our helpers and models set up, we introduced tests into the codebase. Our Vapor 3 codebase had hundreds of unit and integration tests and our goal was to keep every single one (we ended up dropping around 30 due to Swift 5.2 automatic test discovery - we no longer needed a check for parity between the allTests array and the XCTManifests entry).

As expected, most of our tests were failing during this step. The ones that were passing were either tests that didn't hit any database models or tests that focused on database model methods (i.e. testing model lifecycle hooks).

However, having these tests in place gave us the confidence to move forward to the routes.

Step 5 - Routes

Perhaps somewhat surprisingly, the routes were one of the quickest parts of the migration. After we had all the methods and models in place that the routes referenced, uncommenting and adjusting Fluent queries was straightforward. That being said, two things in particular made life a bit more difficult.

Lack of automatic content decoding

Vapor 3 had this awesome feature where the content expected in the body of the request could be automatically decoded before hitting the function, like so:

router.post(MyContentType.self, at: "url", use: myFunction)

func myFunction(req: Request, content: MyContentType) -> Future<HTTPStatus> {
	// use content here
	return req.future(.ok)
}

This functionality disappeared in Vapor 4, which broke a ton of our route definitions. Thankfully, the fix is easy: you just have to decode the content directly in the function now.

router.post("url", use: myFunction)

func myFunction(req: Request) throws -> EventLoopFuture<HTTPStatus> {
    // Decoding can throw, so the handler is marked `throws`.
    let content = try req.content.decode(MyContentType.self)
    // use content here
    return req.future(.ok)
}

Although it's a little less magical, it works just as well and is probably a little clearer.

No more long route strings

Vapor 4's routing engine no longer allows a single PathComponent to contain / or multiple paths. For example, the following worked in Vapor 3:

routes.get("my/pathed/url/here", use: myFunction)

In Vapor 4, you must break out each path component and omit the /:

routes.get("my", "pathed", "url", "here", use: myFunction)

If your app starts throwing 404s in places that you are positive have a registered route, this might be why.
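When converting a large number of Vapor 3 route strings, it can help to generate the individual segments mechanically rather than retyping them. The following is a minimal sketch in plain Swift. `pathSegments` is a hypothetical helper introduced for illustration, not a Vapor API.

```swift
// Hypothetical helper for the migration: splits a Vapor 3 style route
// string into the individual segments Vapor 4 expects as separate arguments.
func pathSegments(_ path: String) -> [String] {
    // split(separator:) drops empty pieces, so leading, trailing,
    // and doubled slashes are ignored.
    return path.split(separator: "/").map { String($0) }
}

print(pathSegments("my/pathed/url/here")) // ["my", "pathed", "url", "here"]
```

The output can be pasted into the route registration as separate string arguments.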

Step 6 - Green Tests

After getting all of our controllers to compile, it was time to make our tests green and happy. An initial run of the tests showed about a 50% success rate - not exactly promising but workable. A shocking number of tests were failing because of a really simple mistake/misunderstanding - accessing parent relationship IDs on child models.

Let's say we have a Tag and that Tag optionally has a Folder. You'd expect the Tag model to have a property like this:

@OptionalParent(key: "folder_id")
var folder: Folder?

In a route, you might want to check to see if a Tag has a Folder or not. While converting the routes, the quickest way to do this that made the compiler happy was to write:

tag.folder?.id == nil

This will only work under one circumstance - if you call .with(\.$folder) on the Tag query. If you simply want to check if the folder_id column from the database is nil or not, you have to prefix the property with the $ symbol (accessing the underlying PropertyWrapper):

tag.$folder.id == nil

This isn't a huge deal, but it's easy enough to miss because the compiler doesn't complain about version #1. After hunting our codebase for instances of this (they were all over), we got our tests about 90% passing. The remaining 10% were a combination of Vapor/Fluent bugs, properties that were incorrectly labeled required, and tests on some complex custom SQL queries we were running.
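To make the difference between the two spellings concrete, here is a toy model of the idea in plain Swift. It is not Fluent's implementation: `OptionalParentSketch`, `TagSketch`, and this `Folder` struct are hypothetical stand-ins that only mirror the behavior described above, where the foreign key is always stored but the related object is present only after an explicit load.

```swift
// Toy stand-in for a parent record; not a Fluent model.
struct Folder {
    let id: Int
    let name: String
}

// Simplified, hypothetical wrapper: the foreign key is always stored,
// the related object only after it has been loaded.
@propertyWrapper
struct OptionalParentSketch {
    var id: Int?          // mirrors the folder_id column
    var loaded: Folder?   // filled in only by an explicit load

    init(id: Int? = nil) {
        self.id = id
        self.loaded = nil
    }

    // `tag.folder` reads this: nil unless the relation was loaded.
    var wrappedValue: Folder? { loaded }

    // `tag.$folder` reads this: the wrapper itself, including the raw id.
    var projectedValue: OptionalParentSketch {
        get { self }
        set { self = newValue }
    }
}

struct TagSketch {
    @OptionalParentSketch var folder: Folder?

    init(folderID: Int?) {
        _folder = OptionalParentSketch(id: folderID)
    }
}

let tag = TagSketch(folderID: 7)
print(tag.folder?.id == nil)  // true: relation not loaded, although folder_id is set
print(tag.$folder.id == nil)  // false: the raw foreign key is 7
```

Both expressions compile, which is why the mistake is easy to miss: only the `$` form answers the question "is the column set?".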

Bonus Step - Database Imports + Manual Testing

In our case, just getting the tests running wasn't quite good enough as we still had a few edge cases hiding bugs. One way that we solved this was through an import of our Vapor 3 database into the new Vapor 4 version. Ideally, if our goal of 1:1 parity was accomplished, the data should import with no problems. With a few more tweaks to our migrations and database, we were able to accomplish repeatable successful database imports.

We also spent a day going through every feature of our app on the frontend and checking to make sure that all functionality remained the same. Our tests covered almost all of this but it was reassuring to know that a human validated those tests as well.

Hurdles

The process, while much easier than previous migrations, was not entirely straightforward.

Logging

For compliance/audit and user experience reasons, we store all of our application logs in AWS S3 (shipped using Fluentd) and analyze them using Elasticsearch/Kibana via CHAOSSEARCH. Changing our ETL pipeline to accommodate a new log output format wasn't a viable option, so we needed a way to backport the new format to what we expected.

In Vapor 3, we had written a custom fork of swift-log that wrapped the logger inside of a class so that the object could be passed through the request lifecycle. We no longer needed that functionality as Vapor handled that automatically, but we did need to adjust the format of the log output. This is relatively straightforward, accomplished by simply registering a different Logger backend in main.swift:

try CustomLoggingSystem.bootstrap()

However, Vapor's new log service came with some nice helpers that allow the developer to change the log level via environment variables or flags passed into the executable. I didn't want us to lose that functionality just because we needed a custom log format. What we settled on is pulling the main Vapor log bootstrap code into our project but then calling into the swift-log default backend directly. It looks like this:

import Vapor
import Logging

extension LoggingSystem {
    public static func customBootstrap(from environment: inout Environment) throws {
        struct LogSignature: CommandSignature {
            @Option(name: "log", help: "Change log level")
            var level: Logger.Level?
            init() { }
        }

        // Determine log level from environment.
        let level = try LogSignature(from: &environment.commandInput).level
            ?? Environment.process.LOG_LEVEL
            ?? (environment == .production ? .notice: .info)

        // Disable stack traces if log level > debug.
        if level > .debug {
            StackTrace.isCaptureEnabled = false
        }

        // Bootstrap the logger; the console argument is unused by the custom handler below.
        return LoggingSystem.customBootstrap(console: Terminal(), level: level)
    }
    
    public static func customBootstrap(console: Console, level: Logger.Level = .info) {
        self.bootstrap { label in
            var logHandler = StreamLogHandler.standardOutput(label: label)
            logHandler.logLevel = level
            return logHandler
        }
    }
}

Then, in main.swift we call:

try LoggingSystem.customBootstrap(from: &env)
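The level lookup in customBootstrap falls through three sources in order: the --log flag, the LOG_LEVEL environment variable, and finally a default that depends on the environment. A minimal sketch of that precedence in plain Swift follows. `Level` and `resolveLevel` are hypothetical stand-ins so the example runs without Vapor or swift-log, and unlike the real code this sketch simply skips a value it does not recognize.

```swift
// Hypothetical stand-in for Logger.Level so the sketch runs without swift-log.
enum Level: String {
    case debug, info, notice, error
}

// Mirrors the fall-through order: flag first, then env var, then a default.
// Assumption: an unrecognized string is skipped rather than reported.
func resolveLevel(flag: String?, environmentValue: String?, isProduction: Bool) -> Level {
    if let flag = flag, let level = Level(rawValue: flag) {
        return level
    }
    if let value = environmentValue, let level = Level(rawValue: value) {
        return level
    }
    return isProduction ? .notice : .info
}

print(resolveLevel(flag: nil, environmentValue: "error", isProduction: true)) // error
```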

Now our logs look like this, instead of the default Vapor format:

2020-04-27T11:25:00-0500 notice: role=student district_id=1 user_id=1 method=GET school_id=1 request-id=96F3A623-9A14-4158-9993-9B8EC81A09BE path=redacted request

This has worked perfectly for us and our ETL pipeline is happy :)
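For anyone checking their own pipeline against this layout, the sample line above can be read as timestamp, level, space-separated key=value metadata, then the message. The sketch below reproduces that layout in plain Swift. `formatLogLine` is a hypothetical function, it assumes the trailing word in the sample is the log message, and it sorts the metadata keys only to keep the output deterministic.

```swift
// Hypothetical formatter reproducing the layout of the sample line:
// "<timestamp> <level>: <key=value pairs> <message>".
func formatLogLine(
    timestamp: String,
    level: String,
    metadata: [String: String],
    message: String
) -> String {
    // Keys are sorted here only to make the output deterministic.
    let pairs = metadata
        .sorted { $0.key < $1.key }
        .map { "\($0.key)=\($0.value)" }
        .joined(separator: " ")
    let metadataPart = pairs.isEmpty ? "" : " \(pairs)"
    return "\(timestamp) \(level):\(metadataPart) \(message)"
}

let line = formatLogLine(
    timestamp: "2020-04-27T11:25:00-0500",
    level: "notice",
    metadata: ["user_id": "1", "method": "GET"],
    message: "request"
)
print(line) // 2020-04-27T11:25:00-0500 notice: method=GET user_id=1 request
```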

Squashing Bugs

There were several bugs that we found during our migration - mostly in Fluent - that we reported and in some cases fixed thanks to the generous help of the Vapor community. (Side note - bugs at this stage were acceptable as Fluent still hasn't hit a stable 4.0).

Here are the issues and pull requests I filed along the way during our migration:

  1. https://github.com/vapor/fluent-kit/issues/254
  2. https://github.com/vapor/fluent-kit/issues/257
  3. https://github.com/vapor/fluent-kit/issues/247
  4. https://github.com/vapor/fluent-kit/issues/245
  5. https://github.com/vapor/fluent-kit/issues/242
  6. https://github.com/vapor/fluent-kit/pull/244
  7. https://github.com/vapor/fluent-kit/pull/256
  8. https://github.com/vapor/fluent-kit/pull/258
  9. https://github.com/vapor/fluent-kit/pull/248
  10. https://github.com/vapor/sql-kit/pull/102
  11. https://github.com/vapor/vapor/pull/2315

As mentioned, the Vapor community was a tremendous help at every turn and cared deeply about unblocking our migration when it got stuck.

Documentation

One of the goals for Vapor 4 is to improve upon the documentation and make it easier to find answers to common questions. Significant improvements have been made in this area, but the docs are still incomplete for Vapor 4 so there were moments where I needed to dig into the package tests to find specific syntax. This wasn't a showstopper for us, and the documentation will be completed before all packages hit GM, but it's something to be aware of if you're going to start your migration now - asking well-worded questions and hunting for answers is a critical skillset at this stage.

Outcomes

Overall we are super pleased with the outcomes of the migration. Our codebase is cleaner, our performance is better, and overall developers are happier.

Performance

We haven't run any performance tests that would hold up scientifically, but in terms of raw response time averages, we saw a pretty nice speed up of around 20% on most of our routes. I'm most pleased that some of our heaviest routes from Vapor 3 that had lots of dependent queries and children to fetch also saw improvements that are noticeable.

Rigorous performance tests for Vapor 4 have yet to be done, but that's a known to-do item for the Vapor team.

Codebase Quality

One of the more unexpected benefits of upgrading to Vapor 4 is that many of our call sites are easier to read and easier to write. This impact is most noticeable when we are querying children along with their parents using the new .with feature - we no longer have to chain on a ton of joins or push data down through multiple future chains to collect everything we need.

Testing

Testing is a lot easier now, and our tests are running up to 2 times faster due to a trick I wanted to implement in Vapor 3 but couldn't due to the restrictions of the framework. Instead of reverting all of the migrations after every test and then re-migrating at the top of the next test, we run migrations once for all tests and then simply truncate the tables after each test so that the structure is already in place for the subsequent execution. We accomplish this using raw queries (syntax for non-MySQL databases may differ slightly):

func reset() throws {
    let url = URL(string: Environment.get(Constants.databaseURL)!)!
    let dbInfo = MySQLConfiguration(url: url)?.database
    let conn = self.db as! SQLDatabase
    
    let results = try conn.raw("""
    select table_name from information_schema.`TABLES`
    where TABLE_SCHEMA = '\(dbInfo!)' and TABLE_NAME != "_fluent_migrations"
    """).all().wait()
        
    let tableNames = try results.map { try $0.decode(column: "table_name", as: String.self) }
    for table in tableNames {
        try conn.raw("SET FOREIGN_KEY_CHECKS=0;").run().wait()
        try conn.raw("TRUNCATE TABLE \(table)").run().wait()
        try conn.raw("SET FOREIGN_KEY_CHECKS=1;").run().wait()
    }
}
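A possible variation on the reset above is to build the full list of statements up front, switching foreign key checks off once for the whole pass instead of once per table. The sketch below only constructs the SQL strings in plain Swift. `resetStatements` is a hypothetical helper, and it assumes every statement is then executed on the same MySQL connection, since FOREIGN_KEY_CHECKS is set per session.

```swift
import Foundation

// Hypothetical helper: builds the statements for one reset pass.
// Foreign key checks are switched off once, not once per table.
func resetStatements(for tables: [String]) -> [String] {
    var statements = ["SET FOREIGN_KEY_CHECKS=0;"]
    // Skip Fluent's bookkeeping table so migrations are not re-run.
    for table in tables where table != "_fluent_migrations" {
        // Backtick-quote the identifier and double any embedded backticks.
        let quoted = "`" + table.replacingOccurrences(of: "`", with: "``") + "`"
        statements.append("TRUNCATE TABLE \(quoted);")
    }
    statements.append("SET FOREIGN_KEY_CHECKS=1;")
    return statements
}

for statement in resetStatements(for: ["tags", "_fluent_migrations", "district_tags"]) {
    print(statement)
}
```

Keeping the string building separate from execution also makes the reset logic testable without a database.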

Conclusion

Migrating to Vapor 4 was a big task that our team set out to complete but it was incredibly worthwhile and our codebase is stronger because of it. This wouldn't have been possible without the incredibly patient and kind help from people like Tanner, Tim, Gwynne, and countless others from the Vapor community.

If you haven't had a chance yet to try out Vapor, now is a great time - the framework is better than ever and the community is the best online community I've ever been a part of. The ecosystem is growing rapidly and Apple is investing heavily - seriously, give it a shot.

Let me know if you use any of the tips or tricks mentioned in this article! I'd love to hear how your migration goes.
