Overview

Overview of Floodplain

Floodplain

Floodplain is a data liberation platform. In a nutshell, you point it at a database, and Floodplain receives all changes to that database (logically speaking, including all the changes that occurred before that point). It then optionally transforms those changes in some way, and finally pushes them to another database.

That gives us a new materialized view in a separate database, updated in near real time.
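To make the idea concrete, here is a minimal in-memory sketch of the flow described above: changes are replayed in order, optionally transformed, and upserted into (or deleted from) a target view. It does not use Floodplain's API. `MaterializedViewSketch`, `Change`, and the `name` column are hypothetical names invented for illustration, and a plain map stands in for the target database.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch; none of these names are Floodplain API. */
final class MaterializedViewSketch {

    /** One change to a source row; a null row means the row was deleted. */
    static final class Change {
        final String key;
        final Map<String, Object> row;

        Change(String key, Map<String, Object> row) {
            this.key = key;
            this.row = row;
        }
    }

    // Stand-in for the target database: a table keyed by primary key.
    private final Map<String, Map<String, Object>> target = new LinkedHashMap<>();
    private final UnaryOperator<Map<String, Object>> transform;

    MaterializedViewSketch(UnaryOperator<Map<String, Object>> transform) {
        this.transform = transform;
    }

    /** Apply one change: delete on a null row, otherwise transform and upsert. */
    void apply(Change change) {
        if (change.row == null) {
            target.remove(change.key);
        } else {
            target.put(change.key, transform.apply(change.row));
        }
    }

    Map<String, Map<String, Object>> view() {
        return target;
    }

    public static void main(String[] args) {
        // Example transformation: upper-case the (assumed) "name" column.
        MaterializedViewSketch view = new MaterializedViewSketch(row -> {
            Map<String, Object> out = new LinkedHashMap<>(row);
            out.put("name", row.get("name").toString().toUpperCase(Locale.ROOT));
            return out;
        });

        List<Change> changes = List.of(
                new Change("1", Map.of("name", "alice")),   // existing row, replayed
                new Change("2", Map.of("name", "bob")),     // existing row, replayed
                new Change("1", Map.of("name", "alicia")),  // later update
                new Change("2", null));                     // later delete

        changes.forEach(view::apply);
        System.out.println(view.view()); // prints {1={name=ALICIA}}
    }
}
```

Replaying the full change history and then continuing with live changes is what lets the target start out complete and then stay current.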

Its closest competitor would be KSQLdb. Where KSQLdb implements a SQL variant, we chose a more general-purpose language. Which works better for you will depend on your use case and personal preference.

First of all, KSQL(db) is much more mature, and Floodplain is very much alpha. KSQLdb is developed by Confluent, the company behind Kafka, whereas Floodplain is developed as an unpaid open source project by a tiny team.

So while Floodplain is not nearly as mature as KSQL, there are reasons to favor its approach:

SQL

Perhaps it is personal taste, but I don’t like SQL much. It has served the world well, and there are many very skilled people who can do amazing things with it, but that does not necessarily make it the best choice for what is essentially a new field.

The streaming SQL of KSQL is another extension of an already pretty weak standard. While different versions of SQL seem similar, switching between implementations is far from painless, and the superficial similarities actually make it more confusing.

We think that embracing a new language is more effective than forcing a query language further and further into a non-query domain. Static analysis of SQL is also tricky, as is IDE integration. A modern, statically typed language can help the developer much more.

Integrating existing code into Floodplain streams is also much easier, although KSQL is working on improving this. All stateful and stateless transformations are just Kotlin and Java code, so reusing code you already have is straightforward.
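As a rough illustration of what "just Java code" can mean, the sketch below shows a stateless transformation that reuses an existing method and a stateful one that keeps a running total. It uses only the JDK and is not Floodplain's API. `TransformationSketch`, `addVat`, the 21% rate, and the `netCents`/`grossCents` fields are all assumptions made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; none of these names are Floodplain API. */
final class TransformationSketch {

    /** Stands in for pre-existing application code (assumed 21% VAT, rounded half up). */
    static long addVat(long netCents) {
        return netCents + (netCents * 21 + 50) / 100;
    }

    /** Stateless: each record is transformed on its own by calling the existing method. */
    static final Function<Map<String, Object>, Map<String, Object>> WITH_GROSS = row -> {
        Map<String, Object> out = new HashMap<>(row);
        long net = ((Number) row.get("netCents")).longValue();
        out.put("grossCents", addVat(net));
        return out;
    };

    /** Stateful: remembers a running total per customer across records. */
    static final class RunningTotal {
        private final Map<String, Long> totals = new HashMap<>();

        long add(String customerId, long cents) {
            return totals.merge(customerId, cents, Long::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> order = Map.of("customerId", "c1", "netCents", 1000L);
        Map<String, Object> enriched = WITH_GROSS.apply(order);

        RunningTotal totals = new RunningTotal();
        long total = totals.add("c1", (Long) enriched.get("grossCents"));
        System.out.println(total); // prints 1210
    }
}
```

The point of the sketch is only that both kinds of transformation are ordinary functions and objects, so existing libraries and business logic can be called directly.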

What is it good for? When you need another, different materialized view of your database. Maybe you don’t like working with your original database, but changing it is too invasive. Maybe you need much more read capacity than your current database can handle.

Or even: you have an existing application that runs fine, but you don’t want to touch the code with a ten-foot pole.

What is it not good for? If eventual consistency is a problem, Floodplain is not for you. Whenever a transaction happens in the source database, the changes pass through Floodplain and reach the target database at some later point. How long that takes depends on a lot of things. Usually it is milliseconds, sometimes longer.
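The following toy model shows what that delay means for readers of the target database. It is only a sketch built on JDK collections, not Floodplain code. `EventualConsistencySketch`, `write`, and `drain` are hypothetical names, and the queue stands in for whatever is still in flight between source and target.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch; none of these names are Floodplain API. */
final class EventualConsistencySketch {
    final Map<String, String> source = new HashMap<>();
    final Map<String, String> target = new HashMap<>();

    // Changes that have been committed in the source but not yet applied to the target.
    private final Queue<String[]> inFlight = new ArrayDeque<>();

    /** A write is visible in the source immediately; the target only gets it later. */
    void write(String key, String value) {
        source.put(key, value);
        inFlight.add(new String[] {key, value});
    }

    /** Stands in for the pipeline catching up: apply queued changes in order. */
    void drain() {
        String[] change;
        while ((change = inFlight.poll()) != null) {
            target.put(change[0], change[1]);
        }
    }

    public static void main(String[] args) {
        EventualConsistencySketch db = new EventualConsistencySketch();
        db.write("balance", "100");
        System.out.println(db.target.get("balance")); // prints null: not propagated yet
        db.drain();
        System.out.println(db.target.get("balance")); // prints 100
    }
}
```

Any reader that must see its own write immediately would have to keep reading from the source in this model, which is why strict consistency requirements are a poor fit.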

If you have an application that is hitting the limit of the write traffic it can take, Floodplain will probably not help. If writing to a database is problematic without Floodplain, it is probably problematic with Floodplain. The only situation where Floodplain might help is when the source database is struggling with the combination of read and write traffic. In that case you could move most read traffic (remember: eventually consistent) to the destination database.

What is it not yet good for? Floodplain only uses single-partition topics for now.

Where should I go next?

Getting Started: Get started with Floodplain
Examples: Check out some example code!

Last modified May 11, 2020: some house cleaning (b1ff86f)