Zero Copy Cloning in Snowflake Cloud Data Warehouse

Zero copy cloning is one of the cool and innovative features of Snowflake. In this tutorial you will learn:

What is zero copy cloning in Snowflake ?

Cloning, also referred to as “zero-copy cloning” creates a copy of a database, schema or table. A snapshot of data present in the source object is taken when the clone is created, and is made available to the cloned object. The cloned object is writable, and is independent of the clone source. That is, changes made to either the source object or the clone object are not part of the other

Advantages of zero copy cloning in Snowflake ?

Here are some of the benefits of zero copy cloning:

Zero Copy cloning is super fast, often referred to "Fast Clone" in Snowflake

You normally have to wait hours, days, maybe weeks to spin up a copy of your production data warehouse so you could have a test or a development environment? And have to pay extra for the test or development environment to able to hold all the replicated data? Or maybe you have some new data scientists that just want to play around with the data but they really need their own copy?

Cloning is very fast (as in it takes barely a minute to clone a 2TB database!), it allows you to have multiple copies of your data without the additional cost of storage usually associated with replicating data

Zero Copy Cloning saves you in Storage costs

Zero Copy cloning creates a copy of the object without having to duplicate the underlying storage. When a clone is created of a table, the clone utilizes no data storage because it shares all the existing micro-partitions of the original table at the time it was cloned; however, rows can then be added, deleted, or updated in the clone independently from the original table. Each change to the clone results in new micro-partitions that are owned exclusively by the clone and are protected through CDP

Zero copy cloning syntax in Snowflake

To clone a entire production database and make it available for development:

CREATE DATABASE Dev CLONE Prod;

To clone a schema

CREATE SCHEMA Dev.DataSchema1 CLONE Prod.DataSchema1;

To clone a single table:

CREATE TABLE C CLONE Dev.public.C;

FAQs on Zero copy cloning in Snowflake

Here are some of the frequently asked questions:

In case I do some DDL operation on the cloned table will that not affect my original table ?

No. the clone is its very own copy, any changes made to the clone are local to the clone

If there are some changes to original tables data like insert/update of rows will it automatically get reflected in the cloned data as well ?

No, when the clone is created it is a point in time view of the data. If you want a clone copy which receives updates from the source you could use Snowflake Data Sharing

Since zero-copy cloning is a copy of the database,schema or table do we pay for storage twice or only one

There is no additional storage cost for the zero copy cloning as all the data is shared from the source version. The only additional cost would be for any changed data blocks which are local to the cloned copy

What is the difference between cloning vs Time Travel in Snowflake ?

Time travel allows you to query tables as they were at a point in time in the past, up to 90 days with Enterprise edition. You can think of cloning as a point in time snapshot. You can clone from any point in time version within the 90 day retention period. Any clones which exist beyond the retention period are retained which means clones can be used as a form of backup

If the original table data has been deleted then will the clonned table be able to see the deleted data

Yes, the cloned table will still have the original data

Zero copy cloning sounds like magic. How does it work behind the scenes ?

Snowflake uses immutable blocks in the object store. When you delete something, it isn't really deleted. Instead, the Snowflake Global Services layer just records that the blocks are no longer relevant for that table. If table 1 had 0 days of retention, those blocks would have eventually been flushed after the data had been deleted. However, if you had cloned table 1 prior to deleting the data, the Global Services layer would remember those blocks were still being referenced and they wouldn't be flushed



Next Section: Data Sharing



MetricCamp.com