When it comes to managing multi-tenancy, there are several architectural routes you can take. Two of the most popular are:
- Create a database per tenant.
- Use a single database, but with tenant IDs to separate data.
Each of these options has pros and cons, but getting into them isn't the goal of this blog. Today we'll explore what option 1 might look like on a single CockroachDB cluster.
Is this a sustainable model on CockroachDB? How many databases and tables can we add? I'll try to answer these questions and more as I add around 22,000 databases and 300,000 tables to a cluster.
- Create a 9-node CockroachDB cluster.
- Load x instances of the TPC-C database.
- Use the built-in cockroach workload command and a script to execute x workloads against x number of databases.
- Monitor stability and performance throughout the exercise.
For this experiment, I created a cluster in Google Cloud Platform (GCP) with the below specs.
I won’t be doing a how-to on the CockroachDB setup in this blog, but below are some links to get you started. I ended up deploying all my nodes in a single AZ since I didn’t care about survivability. I also did not create a load balancer in GCP, but rather used HaProxy on the VM I used to run a workload.
There are a few cluster settings I enabled to improve performance. The main setting we’ll keep throughout the duration of the cluster is:
set cluster setting spanconfig.host_coalesce_adjacent.enabled = true;
This setting allows multiple tables per range, rather than a table per range, which can come in handy when adding as many tables as I plan to.
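If you want to sanity-check the effect of coalescing, one possible way (an assumption on my part, using internals visible to the root user) is to watch the cluster's range count as tables are added:

```sql
-- Count the cluster's ranges; with coalescing enabled, adjacent small
-- tables can share ranges, so this number grows far more slowly per table.
SELECT count(*) FROM crdb_internal.ranges;
```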
The below settings are only used for initial schema and data load. More can be read about their purpose here.
set cluster setting kv.range_merge.queue_interval = "50ms";
set cluster setting jobs.registry.interval.gc = "30s";
set cluster setting jobs.registry.interval.cancel = "180s";
set cluster setting jobs.retention_time = "15s";
The ultimate goal is to load a lot of databases and tables – but I also want to easily run workloads against those databases once they are in the cluster. The easiest way to accomplish that for me was to leverage one of the built-in workloads in CockroachDB.
While there are a number of them available, I chose the TPC-C workload since it's a good test of a transactional workload and it can also report back efficiency numbers. TPC-C consists of 9 tables in total. I'd like to see at least 300k tables in total, so we have a ways to go.
If you are unfamiliar with TPC-C, more information is available here.
The base command for initializing a schema is quite simple with the built-in command. It looks like the following, where
localhost is ideally a load balancer instead.
cockroach workload init tpcc 'postgresql://root@localhost:26257?sslmode=disable'
This command will create 9 tables in a database named
tpcc and load 1 warehouse worth of data which comes out to ~200MB. The number of warehouses is configurable via a flag, but we’ll keep it with the default of 1.
Once the data is compressed and replicated it comes out to around 70MB in CockroachDB. We’ll have to keep this number in mind since we currently have 3.1TB of storage available. The final database result for 1 warehouse looks like this.
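To sanity-check whether 20,000 copies will fit, a quick bit of arithmetic using the ~70MB post-replication figure above (the numbers are the rough estimates from this post, not exact measurements):

```shell
# Back-of-the-envelope capacity check (assumptions: ~70MB per replicated
# 1-warehouse tpcc database, 3.1TB ~= 3,100,000MB of available storage).
PER_DB_MB=70
TOTAL_MB=3100000
echo "20000 databases use ~$((20000 * PER_DB_MB / 1000)) GB"
echo "rough ceiling: $((TOTAL_MB / PER_DB_MB)) databases"
```

So 20,000 databases lands around 1.4TB, comfortably under half of the available storage.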
There are a few things we need to do to expand on the base command. The first one should be obvious: we can’t use the name
tpcc if we are loading tens of thousands of databases.
Depending on how much time you have, this second tip may be equally important. The
workload init command does an
INSERT by default. While this is ok and may be something you want to test, I’d like to switch to
IMPORT and make things a bit faster.
With these two thoughts in mind, our base command now looks something more like this:
cockroach workload init tpcc --db database_$i --data-loader import 'postgresql://root@localhost:26257?sslmode=disable'
Throw in a Bash Script
You’ll notice the command above specifies
database_$i since we want to load
i instances. Feel free to get as fancy as you want here with your scripting abilities, but in its simplest form, we can do something like the following.
#!/usr/bin/env bash
for i in {1..20000}
do
  cockroach workload init tpcc --db database_$i --data-loader import 'postgresql://root@localhost:26257?sslmode=disable'
done
This will admittedly take a while as we are loading 20,000 databases, 180,000 tables, and ~1.5TB of data (after compression). As I said, feel free to write a more impressive script to parallelize this logic and speed up the process.
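One possible parallelized version (a sketch, not what I ran): print one `workload init` command per database and let xargs run several at a time. It's shown here as a dry run over 3 databases; for the real thing, change `gen_cmds 3` to `gen_cmds 20000` and pipe the output into `xargs -d '\n' -n1 -P8 sh -c` to execute up to 8 loads concurrently.

```shell
#!/usr/bin/env bash
# Sketch of a parallel loader: emit one init command per database.
gen_cmds() {
  for i in $(seq 1 "$1"); do
    echo "cockroach workload init tpcc --db database_$i --data-loader import 'postgresql://root@localhost:26257?sslmode=disable'"
  done
}

# Dry run over 3 databases so the commands are visible before committing
# to all 20,000. Real run:
#   gen_cmds 20000 | xargs -d '\n' -n1 -P8 sh -c
gen_cmds 3
```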
If you want to speed up past the
IMPORT command, you can leverage a BACKUP and RESTORE process that I found to be even faster.
First, back up a TPC-C database into cloud storage. This example shows
s3, but other cloud storage options are available as well. These commands are run once, from within the SQL shell.
BACKUP DATABASE tpcc INTO 's3://BUCKET NAME?AWS_ACCESS_KEY_ID=KEY ID&AWS_SECRET_ACCESS_KEY=SECRET ACCESS KEY' AS OF SYSTEM TIME '-10s';
Then, restore the tpcc database into a new database with a different name.
RESTORE DATABASE tpcc FROM LATEST IN 's3://bucket_name?AWS_ACCESS_KEY_ID=key_id&AWS_SECRET_ACCESS_KEY=access_key' WITH DETACHED, new_db_name="database_$i";
We could also loop through this command in the language of our choice to achieve somewhat faster results than the IMPORT.
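Such a loop might look like the following sketch: generate one RESTORE statement per tenant database, where DETACHED queues each restore as a background job so the loop doesn't block on it. The bucket name and keys are placeholders, as in the examples above. It's shown as a dry run over 2 databases; pipe `restore_stmts 20000` into `cockroach sql --insecure` to actually run the statements.

```shell
#!/usr/bin/env bash
# Sketch: emit one DETACHED RESTORE statement per tenant database.
restore_stmts() {
  for i in $(seq 1 "$1"); do
    echo "RESTORE DATABASE tpcc FROM LATEST IN 's3://bucket_name?AWS_ACCESS_KEY_ID=key_id&AWS_SECRET_ACCESS_KEY=access_key' WITH DETACHED, new_db_name='database_$i';"
  done
}

# Dry run over 2 databases. Real run:
#   restore_stmts 20000 | cockroach sql --insecure
restore_stmts 2
```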
As I mentioned above, 20k iterations only gets us to 180k tables. I'd like to get closer to 300k tables, but I'm not super interested in adding more data since we're already at 50% capacity.
At this point, I added ~700 more databases, each with 150 tables. This schema was quite different from the TPC-C tables already loaded. How, or if, you choose to get to this higher number of tables, I'll leave up to you.
Just a quick confirmation of the cluster size and objects added!
First, we'll see the total database count is just over 22.5k. Then we'll do a count on the descriptors. Descriptors include tables, among a few other things.
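For reference, both checks can be run from the SQL shell; a sketch, assuming the system and SHOW catalogs are accessible as root:

```sql
-- Total databases (should come out just over 22.5k):
SELECT count(*) FROM [SHOW DATABASES];
-- Total descriptors, which include tables among a few other things:
SELECT count(*) FROM system.descriptor;
```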
Here is the Cluster Overview.
Running a Workload
Now that we have everything loaded, we can go through a similar process to run workloads against some of those same databases. While I don't have the infrastructure in place to run 22.5k concurrent TPC-C workloads, hopefully I can get something decent going.
Just like the initial load, CockroachDB makes actually running the load simple as well. The example below will run a TPC-C workload for 60 minutes.
cockroach workload run tpcc --duration=60m 'postgresql://root@localhost:26257?sslmode=disable'
Expanding the Base Command
There are a few modifications I made here as well, which ends up looking quite similar to the initial load.
cockroach workload run tpcc --db database_$i --duration=60m --conns=1 --display-every=1m 'postgres://root@localhost:26257?sslmode=disable'
Once again, we've added a database flag so we can pick among the databases we created. I also added a connections flag and set it to 1. By default, the workload will create 2 connections per run.
Lastly, I added a
--display-every flag. This simply changes how often the workload prints out its per-operation statistics. The default is 1s, which prints a lot, and I wasn't really planning on digesting all that data for 500+ workloads. Later on, we'll grab the final workload statistics that are printed out at the end.
Now that we have our base command, we can create another script to loop through and start these processes. I used a separate virtual machine, outside the cluster, with HaProxy installed. FYI, lower these numbers quite a bit if you don't have a powerful VM. I was using an n2-standard-32.
Below is the simple script I used that loops through starting the workload 500 times.
#!/usr/bin/env bash
for i in {1..500}
do
  nohup cockroach workload run tpcc --db database_$i --duration=60m --conns=1 --display-every=1m 'postgres://root@localhost:26257?sslmode=disable' > results/$i.txt 2>&1 &
done
The redirection at the end throws the output of each workload into its own file inside a results directory.
For my actual tests, I leveraged the following syntax so I wasn't always selecting just the first 500. (I realize there's a semi-decent chance of dupes with this logic.)
nohup cockroach workload run tpcc --db database_$(( ( RANDOM % 20000 ) + 1 )) --conns=1
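If the dupes bother you, a duplicate-free alternative (a sketch using GNU shuf, not what I ran) is to shuffle the indexes 1..20000 and take the first 500, so each launched workload hits a distinct database:

```shell
# Pick 500 distinct database indexes out of 20,000 without replacement.
for i in $(shuf -i 1-20000 -n 500); do
  echo "database_$i"   # stand-in for the nohup `workload run` command above
done
```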
Below are the results for workloads of sizes 500, 750, and 1000. Each ran for 1 hour.
Overall, I thought CockroachDB handled these tests well while maintaining a TPC-C efc percentage of 97% throughout all tests. Mileage will vary depending on the number of nodes and vCPUs, but this could be a viable option if your current solution is a physical database per tenant.
There are some things I would expand on when testing this model further:
- Impact and speed of making a schema change across every database
- Implementing role-based access to each database
- QPS against a non-TPC-C workload
- Even more tables! Although not mentioned in the blog, I did test 200k vs 300k and saw no difference in performance.