---
title: The pg_mooncake extension
subtitle: Fast analytics in Postgres with columnstore tables and DuckDB execution
enableTableOfContents: true
updatedOn: '2025-08-02T10:33:29.237Z'
---
The [pg_mooncake](https://github.com/Mooncake-Labs/pg_mooncake) extension enables fast analytic workloads in Postgres by adding native columnstore tables and vectorized execution (DuckDB).
Columnstore tables improve analytical queries by storing data vertically, enabling compression and efficient column-specific retrieval with vectorized execution.
`pg_mooncake` columnstore tables are designed so that only metadata is stored in Postgres, while data is stored in an object store as Parquet files with [Iceberg](https://iceberg.apache.org/)or [Delta Lake](https://delta.io/) metadata.
Queries on `pg_mooncake` columnstore tables are executed by DuckDB.
The extension is maintained by [Mooncake Labs](https://www.mooncake.dev/).
You can create and use `pg_mooncake` columnstore tables like regular Postgres heap tables to run:
- Transactional `INSERT`, `SELECT`, `UPDATE`, `DELETE`, and `COPY` operations
- Joins with regular Postgres tables
In addition, you can:
- Load Parquet, CSV, and JSON files into columnstore tables
- Load Hugging Face datasets
- Run DuckDB specific aggregate functions like `approx_count_distinct`
- Read existing Iceberg and Delta Lake tables
- Write Delta Lake tables from Postgres tables
`pg_mooncake` is an open-source extension for Postgres that can be installed on any Neon Project using the instructions below.
## Use cases for pg_mooncake
`pg_mooncake` supports several use cases, including:
1. Analytics on Postgres data
2. Time Series & Log Analytics
3. Exporting Postgres tables to your Lake or Lakehouse
4. Querying and updating existing Lakehouse tables and Parquet files directly in Postgres
This guide provides a quickstart to the `pg_mooncake` extension.
## Enable the extension
The `pg_mooncake` extension is currently in Beta and classified as experimental in Neon. A separate, dedicated Neon project is recommended when using an extension that is still in Beta. For additional guidance, see [Experimental extensions](/docs/extensions/pg-extensions#experimental-extensions).
While the `pg_mooncake` extension is in Beta, you need to explicitly allow it to be used on Neon before you can install it. To do so, connect to your Neon database via an SQL client like [psql](/docs/connect/query-with-psql-editor) or the [Neon SQL Editor](/docs/get-started/query-with-neon-sql-editor) and run the `SET` command shown below.
```sql
SET neon.allow_unstable_extensions='true';
```
Install the extension:
```sql
CREATE EXTENSION pg_mooncake;
```
## Set up your object store
Run the commands outlined in the following steps on your Neon database to setup your object store.
_If you don't have an object storage bucket, you can get a free S3 express bucket [here](https://s3.pgmooncake.com/). When using the free s3 bucket, the `SELECT` and `SET` statements defined below are generated for you, which you can copy and run._
Add your object storage credentials. In this case, S3:
```sql
SELECT mooncake.create_secret('', 'S3', '',
'', '{"REGION": ""}');
```
Set your default bucket:
```sql
SET mooncake.default_bucket = 's3://';
```
The `pg_mooncake` extension also supports R2 and GCP buckets. For set up instructions, refer to **pg_mooncake's** [cloud storage docs](https://pgmooncake.com/docs/cloud-storage).
In the future, you will not have to bring your own bucket to use `pg_mooncake` with Neon.
## Create a columnstore table with `USING columnstore`
Run the following SQL statement on your Neon database to create a columnstore table:
```sql
CREATE TABLE reddit_comments(
author TEXT,
body TEXT,
controversiality BIGINT,
created_utc BIGINT,
link_id TEXT,
score BIGINT,
subreddit TEXT,
subreddit_id TEXT,
id TEXT
) using columnstore;
```
## Load data
You can find a list of data sources [here](https://pgmooncake.com/docs/load-data).
This dataset has 13 million rows and may take a few minutes to load.
```sql
INSERT INTO reddit_comments
(SELECT author, body, controversiality, created_utc, link_id, score, subreddit, subreddit_id, id
FROM mooncake.read_parquet('hf://datasets/fddemarco/pushshift-reddit-comments/data/RC_2012-01.parquet')
AS (author TEXT, body TEXT, controversiality BIGINT, created_utc BIGINT, link_id TEXT, score BIGINT, subreddit TEXT, subreddit_id TEXT, id TEXT));
```
## Query the table
Queries on columnstore tables are executed by DuckDB. For example, this aggregate query runs in ~200 milliseconds on 13 million rows:
```sql
-- Top commenters (excluding [deleted] users)
SELECT
author,
COUNT(*) as comment_count,
AVG(score) as avg_score,
SUM(score) as total_score
FROM reddit_comments
WHERE author != '[deleted]'
GROUP BY author
ORDER BY comment_count DESC
LIMIT 10;
```
## References
- [Repository](https://github.com/Mooncake-Labs/pg_mooncake)
- [Documentation](https://pgmooncake.com/docs)
- [Architecture](https://www.mooncake.dev/blog/how-we-built-pgmooncake)
- [YouTube demo](https://youtu.be/QDNsxw_3ris?feature=shared&t=2048)