Creating a cluster
To create your first ClickHouse cluster, click on ‘create cluster’ and you will be presented with all the customization options for deploying your cluster.
Cluster name
You may give your cluster any name of up to 50 characters, including spaces, numbers and special characters.
note
Your cluster name must always start with a letter, not a number or a special character.
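The naming rules above can be sketched as a small check. This is only an illustration: the function name `is_valid_cluster_name` is hypothetical, and it assumes "letter" means any alphabetic character, which may be broader than what the Gigapipe UI actually accepts.

```python
MAX_NAME_LENGTH = 50  # maximum length stated in the naming rules


def is_valid_cluster_name(name: str) -> bool:
    """Return True if `name` is 1 to 50 characters long and starts with a letter."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    # Assumption: any alphabetic first character counts as a "letter".
    # Spaces, numbers and special characters are allowed after it.
    return name[0].isalpha()


print(is_valid_cluster_name("My cluster #1"))  # True
print(is_valid_cluster_name("1st cluster"))    # False
```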
Cloud provider
You may deploy your cluster on either AWS or GCP. Free trial accounts are limited to AWS and our default cluster.
For more information about AWS please click here: https://docs.aws.amazon.com/AmazonECS/latest/userguide/clusters.html
For more information about GCP please click here: https://cloud.google.com/compute
Region
To obtain the lowest latency when working with your data, we recommend choosing the region closest to where you work, unless you have specific requirements to keep your data in a particular region.
All geographical regions supported by AWS and GCP are available in Gigapipe. If your preferred region is not in the existing list, please email support@gigapipe.com and request that the new region be added to the system. It will usually be available within 24 hours.
List of AWS regions: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Regions
List of GCP regions: https://cloud.google.com/compute/docs/regions-zones
Regions are grouped by continental area. You will only be able to see regions available for the specific cloud provider selected (AWS or GCP).
Gigapipe will NEVER move your data from your specified region. All local data laws and compliance apply to your specified region.
Machine
tip
If unsure, always start with the smallest available machine type. You can always change the machine type at a later date.
Gigapipe recommends and supports machine types best optimized for deploying your ClickHouse clusters. These are:
AWS: M5 instances
Machine name | vCPUs | Memory (GiB) |
---|---|---|
m5.large | 2 | 8 |
m5.xlarge | 4 | 16 |
m5.2xlarge | 8 | 32 |
m5.4xlarge | 16 | 64 |
m5.8xlarge | 32 | 128 |
For full details follow this link: https://aws.amazon.com/ec2/instance-types/m5/
GCP: E2 machine series
Machine name | vCPUs | Memory (GB) |
---|---|---|
e2-standard-2 | 2 | 8 |
e2-standard-4 | 4 | 16 |
e2-standard-8 | 8 | 32 |
e2-standard-16 | 16 | 64 |
e2-standard-32 | 32 | 128 |
For full details follow this link: https://cloud.google.com/compute/docs/general-purpose-machines#e2_machine_types
Shards and Replicas
info
Number of nodes = Shards * Replicas
You can decide how many Shards (principal nodes) and Replicas (mirrors) are in your cluster to provide the desired performance and contingency infrastructure for your installation. However, the default recommended cluster for production environments has 3 shards and 1 replica (3 nodes) to ensure high availability.
For testing, we recommend a single shard with a single replica (1 node).
tip
When deploying a cluster, we recommend a node count (shards * replicas) divisible by the number of availability zones in your chosen region. Gigapipe always has 3 availability zones for every region, so the number of nodes should always be 3, 6, 9, etc.
For example: 3 shards and 1 replica, 3 shards and 2 replicas, or 6 shards and 2 replicas.
You can deploy on as many or as few shards and replicas as needed.
For more information on nodes, shards and replicas please click here: https://en.wikipedia.org/wiki/Shard_(database_architecture).
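The node arithmetic and the availability-zone tip can be checked with a few lines of code. The sketch below is illustrative only; the function names `node_count` and `spreads_evenly` are hypothetical, and the zone count of 3 is taken from the tip above.

```python
AVAILABILITY_ZONES = 3  # per the tip: three availability zones in every region


def node_count(shards: int, replicas: int) -> int:
    """Number of nodes = shards * replicas."""
    if shards < 1 or replicas < 1:
        raise ValueError("shards and replicas must both be at least 1")
    return shards * replicas


def spreads_evenly(shards: int, replicas: int, zones: int = AVAILABILITY_ZONES) -> bool:
    """Return True if the node count is divisible by the number of zones."""
    return node_count(shards, replicas) % zones == 0


for shards, replicas in [(1, 1), (3, 1), (3, 2), (6, 2)]:
    nodes = node_count(shards, replicas)
    print(f"{shards} shards x {replicas} replicas = {nodes} nodes, "
          f"even spread: {spreads_evenly(shards, replicas)}")
```

A single-shard, single-replica test cluster (1 node) does not divide evenly across 3 zones, which is acceptable for testing but not what the tip recommends for production.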
warning
When creating tables, ALWAYS use 'ON CLUSTER' to ensure the table is created on all nodes in the cluster.
For example: CREATE TABLE `Table name_local` ON CLUSTER `{cluster}`
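To make the 'ON CLUSTER' pattern concrete, here is a minimal sketch that assembles a complete statement as a string. The helper `create_local_table_ddl`, the example table and its columns are all hypothetical. It assumes the servers define a `cluster` macro so that `'{cluster}'` resolves to the cluster name, and it uses a plain `MergeTree` engine for brevity; clusters with replicas typically use `ReplicatedMergeTree` instead.

```python
from typing import Mapping


def create_local_table_ddl(
    table: str,
    columns: Mapping[str, str],
    order_by: str,
    cluster: str = "{cluster}",  # assumption: resolved by a server-side macro
) -> str:
    """Build a CREATE TABLE ... ON CLUSTER statement for a `<table>_local` table."""
    column_lines = ",\n".join(
        f"    `{name}` {col_type}" for name, col_type in columns.items()
    )
    return (
        f"CREATE TABLE `{table}_local` ON CLUSTER '{cluster}'\n"
        f"(\n{column_lines}\n)\n"
        "ENGINE = MergeTree\n"
        f"ORDER BY {order_by}"
    )


# Hypothetical example table with two columns.
print(create_local_table_ddl("events", {"ts": "DateTime", "user_id": "UInt64"}, "ts"))
```

The generated statement can then be run through any ClickHouse client, connected to any node of the cluster.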
Disks
When deploying ClickHouse through Gigapipe you can have multiple disks of multiple available disk types.
Disk types available:
- AWS
  - (SSD) - gp2
    - General purpose SSD
    - Low-latency interactive apps
    - Development and test environments
  - (HDD) - st1
    - Throughput Optimized HDD
    - Data warehousing (for less frequently queried data)
    - Minimum 150 GiB per node
For more information on AWS available disk types please click here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html
- GCP
  - (SSD) - Regional SSD PD
    - Fast and reliable block storage with synchronous replication across two zones in a region
  - (HDD) - Regional standard PD
    - Efficient and reliable block storage with synchronous replication across two zones in a region
For more information on GCP disk types please click here: https://cloud.google.com/compute/docs/disks
You can deploy using a single disk. However, if you intend to use storage policies or TTLs to offload older or less frequently queried data, we advise deploying larger, slower, less expensive disks alongside the primary disk.
Admin username/password
Set the ‘Admin username’ and ‘Admin password’ (and confirm the password) for your ClickHouse cluster. You’ll need these credentials if you intend to interact directly with your ClickHouse cluster without using the Gigapipe UI.
Pricing calculator
All disk costs are passed along at cost (no mark-up) so you are never punished for storing more data!
Clusters are billed on a minute-by-minute basis; the monthly estimate is the expected cost if the cluster were to run for an entire month (a 730-hour month).
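The billing arithmetic can be sketched as follows. The hourly rate used here is a made-up figure for illustration, not a Gigapipe price, and the function names are hypothetical. The only values taken from the text are the 730-hour month and the per-minute billing granularity.

```python
HOURS_PER_MONTH = 730  # month length used for the monthly estimate
MINUTES_PER_HOUR = 60


def monthly_estimate(hourly_rate: float) -> float:
    """Expected cost if the cluster runs for a full 730-hour month."""
    return hourly_rate * HOURS_PER_MONTH


def billed_cost(minutes_running: int, hourly_rate: float) -> float:
    """Cost for the minutes actually used, billed minute by minute."""
    return minutes_running * hourly_rate / MINUTES_PER_HOUR


hourly_rate = 0.50  # hypothetical rate per cluster-hour
print(monthly_estimate(hourly_rate))  # 365.0
print(billed_cost(90, hourly_rate))   # 0.75
```

A cluster that runs for only part of the month therefore costs less than the displayed monthly estimate.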
Create Cluster
After reviewing all the details and the estimated monthly price of your new cluster, click the 'Create Cluster' button to deploy your cluster in your specified region and provider.