Replicate data to a ClickHouse database on DoubleCloud
Learn how to replicate data from Neon to a ClickHouse database on DoubleCloud
warning
DoubleCloud is winding down operations. Please see the DoubleCloud announcement for details. DoubleCloud will stop creating new accounts on October 1st, and existing DoubleCloud clients will have an opportunity to transition from DoubleCloud until March 1st, 2025.
Neon will remove DoubleCloud documentation from our site in the near future.
Neon's logical replication feature allows you to replicate data from your Neon Postgres database to external destinations.
ClickHouse is an open-source column-oriented database that allows you to query billions of rows in milliseconds. Its architecture is designed to handle analytical queries efficiently, which makes it ideal for data warehousing and analytics applications. Thanks to the columnar storage format, data can be compressed and retrieved more efficiently, allowing some analytical queries to execute 100 times faster compared to traditional databases like Postgres.
DoubleCloud is a managed data platform that helps engineering teams build data infrastructure with zero-maintenance open-source technologies.
In this guide, you will learn how to replicate data from a Neon Postgres database to a managed ClickHouse cluster with DoubleCloud Transfer — a real-time data replication tool. It natively supports ClickHouse data types, data mutations, automated migrations (adding columns), as well as emulating insertions and deletions. With Transfer, you can replicate your data to both managed ClickHouse clusters on DoubleCloud and on-premise ClickHouse instances.
Prerequisites
- A DoubleCloud account
- A Neon account
- Read the important notices about logical replication in Neon before you begin
Enable logical replication in Neon
important
Enabling logical replication modifies the Postgres wal_level
configuration parameter, changing it from replica
to logical
for all databases in your Neon project. Once the wal_level
setting is changed to logical
, it cannot be reverted. Enabling logical replication also restarts all computes in your Neon project, meaning active connections will be temporarily dropped before automatically reconnecting.
To enable logical replication in Neon:
- Select your project in the Neon Console.
- On the Neon Dashboard, select Settings.
- Select Logical Replication.
- Click Enable to enable logical replication.
You can verify that logical replication is enabled by running the following query from the Neon SQL Editor:
Create a Postgres role for replication
It is recommended that you create a dedicated Postgres role for replicating data. The role must have the REPLICATION
privilege. The default Postgres role created with your Neon project and roles created using the Neon CLI, Console, or API are granted membership in the neon_superuser role, which has the required REPLICATION
privilege.
Grant schema access to your Postgres role
If your replication role does not own the schemas and tables you are replicating from, make sure to grant access. For example, the following commands grant access to all tables in the public
schema to Postgres role replication_user
:
GRANT USAGE ON SCHEMA public TO replication_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replication_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO replication_user;
Granting SELECT ON ALL TABLES IN SCHEMA
instead of naming the specific tables avoids having to add privileges later if you add tables to your publication.
Unlike replicating to other destinations, you don't need to configure a publication and replication slot manually. DoubleCloud Transfer does that for you automatically.
Add DoubleCloud Transfer's IPs to the allowlist
If you are using Neon's IP Allow feature to limit IP addresses that can connect to Neon, add DoubleCloud Transfer's IPs to your allowlist in Neon:
# IPv6
2a05:d014:e78:3500::/56
# IPv4
3.77.1.232
3.74.181.206
3.78.156.2
3.77.29.32
3.125.212.122
For instructions, see Configure IP Allow. You'll need to do this before you can validate your connection in the next step. If you are not using Neon's IP Allow feature, you can skip this step.
Create a managed ClickHouse cluster on DoubleCloud
tip
If you already have a ClickHouse instance — for example, an on-premise one — and you want to transfer data there, skip this step and continue with steps described in Create endpoints in DoubleCloud.
- Log in to the DoubleCloud console.
- In the left menu, select Clusters, click Create cluster, and select ClickHouse.
- Select cluster parameters.
note
If you're just testing ClickHouse, you can proceed with default parameters that will create a fully functional cluster suitable for testing and development. For production, make sure to select at least three replicas, 16 GB of RAM, and dedicated Keeper hosts to ensure high availability.
-
Under Basic settings, enter the cluster name, for example
clickhouse-dev
. -
Click Submit at the bottom of the page. Creating a cluster takes around five minutes depending on the provider, region, and settings.
-
After the cluster status changes from Creating to Alive, select it in the cluster list.
-
On the Overview tab, click WebSQL at the top right.
WebSQL is a DoubleCloud service that allows you to connect to your managed ClickHouse clusters from your browser tab. It provides a full-fledged SQL editor that you can use to view databases and execute SQL queries.
-
Select a database in the connection manager on the left to open the query editor.
-
Create a database:
CREATE DATABASE IF NOT EXISTS <database_name> ON CLUSTER default
-
Make sure that the database has been created:
SHOW DATABASES
┌─name───────────────┐ │ INFORMATION_SCHEMA │ │ _system │ │ default │ │ <database_name> │ // your database │ information_schema │ │ system │ └────────────────────┘
Create endpoints in DoubleCloud
Before you create a transfer in DoubleCloud, you need to create a source endpoint that fetches data from Neon and a target endpoint that writes the data to ClickHouse.
To create a source endpoint:
-
In the left menu in the console, select Transfer.
-
Click Create → Source endpoint.
-
Under Basic settings, select PostgreSQL as the source type.
-
Enter a name for your source endpoint, for example
neon
. -
Under Endpoint parameters, enter connection details for your Neon database. You can get these details from your Neon connection string, which you'll find in the Connection Details widget on the Dashboard of your Neon project. For example, let's say this is your connection string:
postgresql://alex:AbC123dEf@ep-cool-darkness-123456.us-east-2.aws.neon.tech/dbname?sslmode=require
From this string, the values would show as below. Your actual values will differ, with the exception of the port number.
- Host: ep-cool-darkness-123456.us-east-2.aws.neon.tech
- Port: 5432
- Username: alex
- Password: AbC123dEf
- Database Name: dbname
-
Click Test connection and if it's successful, click Submit.
To create a target endpoint:
-
In the left menu in the console, select Transfer.
-
Click Create → Target endpoint.
-
Under Basic settings, select ClickHouse as the target type.
-
Enter a name for your source endpoint, for example
clickhouse
. -
If you created a managed ClickHouse cluster in DoubleCloud, select it as the target endpoint in Connection settings → Managed cluster.
If you want to transfer data to a ClickHouse instance elsewhere, select On-premise in Connection settings → Connection type and specify the connection details.
-
Enter the database name.
-
Click Test connection and if it's successful, click Submit.
Create a transfer in DoubleCloud
- In the left menu in the console, select Transfer and click Create transfer.
- Under Endpoints, select the source and target endpoints you created in the previous step.
- Enter the transfer name, for example
neon-to-clickhouse
. - Under Transfer settings, select Snapshot and replication as the transfer type and specify transfer parameters if needed.
tip
Even when logical replication isn't available on the Neon side, you can schedule Transfer to copy incremental data from Postgres to ClickHouse at a given interval. For that, enable Periodic snapshot and specify the time period.
-
Click Submit to create the transfer.
-
On the transfer page, click Activate.
When the data has transferred, the transfer status changes to Done.
Query the transferred data with WebSQL
note
You can use WebSQL only to connect to managed ClickHouse clusters on DoubleCloud. If you've transferred data to an on-premise ClickHouse cluster, use the ClickHouse client or a similar tool to connect to it.
-
In the left menu, select Clusters and select your cluster from the list.
-
On the Overview tab, click WebSQL at the top right.
-
Select the database you created earlier in the connection manager on the left.
-
In the query editor, enter and execute your query.
The query output will be displayed under the editor.
References
Need help?
Join our Discord Server to ask questions or see what others are doing with Neon. Users on paid plans can open a support ticket from the console. For more detail, see Getting Support.