For the best web experience, please use IE11+, Chrome, Firefox, or Safari

What is Snowflake and what does it do?

What is Snowflake?

Snowflake is the most popular Software-as-a-Service (SaaS) based cloud data warehouse solution. It works on all major cloud platforms, including AWS (Amazon Web Services), Microsoft Azure, and GCP (Google Cloud Platform). Organizations can use Snowflake to store and analyze massive amounts of data with high scalability with decoupled storage and compute tiers. It is flexible enough to work with structured and semi-structured data. As a data warehouse solution, Snowflake acts as a single endpoint for performing the Extract, Transform, and Load (ETL) processes to get meaningful insights from data.

What are the benefits of using Snowflake?

There are many reasons why organizations are using Snowflake. Here’s some of the key benefits and advantages:

  • Fast data retrieval: Snowflake works efficiently with massive data volumes with faster data retrieval.
  • Vertical scalability: Its unique architecture separates the storage and compute resources. Therefore, users can scale storage or compute independently as per workload requirements. Its vertical scaling allows an increase in the compute resources for handling large workloads.
  • Horizontal scalability: Snowflake’s horizontal scaling allows adding compute clusters for distributing workloads to handle concurrent workloads.
  • Auto-scaling: Snowflake supports auto-scaling that dynamically configures the compute resources. For example, adding more clusters if the workload increases and scaling down clusters if the workload decreases. This auto-scaling ensures high performance with cost optimization.
  • Concurrency: Snowflake is designed to handle highly concurrent workloads without impacting performance. It ensures multiple users can access the environments with high performance and reliability.
  • Advanced security: Snowflake has industry-leading security features for protecting data. These features include federated authentication and single sign-on (SSO), key-pair authentication, multi-factor authentication, network policies and rules, private endpoints, access controls, encryption, and an OAuth security interface.
  • Extensive data integration support: Snowflake supports various data integration tools and technologies. Examples are Ascend, Azure Data Factory, Coalesce, Census, Datameer, Data Virtuality, Diyotta, Etleap, and etlworks.
  • Business Intelligence support: Various BI tools and technologies provide native support to Snowflake. For example, Adobe Analytics, Amazon QuickSight, Astrato, Google Looker Studio, IBM Cognos Analytics, Microsoft Power BI, Oracle Analytics, and Pentaho Business Analytics.
  • Flexible billing methods: It has on-demand month-to-month and pre-purchased resources. Its monthly billing is per the data storage (compressed terabytes per month) and compute resources.
  • SQL queries: It uses the SQL (SnowSQL) command line client tool to execute the SQL queries and perform DLL\DML operations.
  • Data sharing: Snowflake's unique feature includes data sharing among partners without duplicating it. Users only pay for data processing since no storage data exchange is involved.
  • Seamless integration: Snowflake supports seamless integration with various data integration tools, allowing organizations to ingest and transform data from different sources into their Snowflake data warehouse.

How does Snowflake work?

Snowflake is a cloud-native data warehouse solution. Snowflake has three layers of unique architecture. These layers are database storage, query processing, and cloud services.

  1. Database storage: Snowflake stores data in cloud storage, such as Amazon S3, Google Cloud Storage, and Azure Blob Storage. It stores data in optimized, compressed, and columnar formats. Data is not directly accessible by the users. Users can access it through SQL queries in Snowflake.
  2. Query processing: Snowflake query execution occurs in the processing layer. It uses a concept known as a virtual warehouse or warehouse, which is a cluster of computing resources. Each warehouse comprises multiple compute nodes (MPP compute cluster) allocated by Snowflake. Each virtual warehouse is an independent compute cluster. It provides all the required resources for running queries in a Snowflake session. These queries include DML queries such as updating, loading, and unloading data.
  3. Cloud services: The cloud service layer comprises various services for coordinating multiple activities in Snowflake. It includes authentication, infrastructure management, metadata management, query parsing, optimization, and access control.
How does Snowflake work?

Is Snowflake a database?

Yes, Snowflake is a database warehouse solution built for cloud-native architecture. It can be considered a cloud-based relational database or SaaS data warehouse. Snowflake includes the features of the traditional database with cloud capabilities. A few characteristics are as follows:

  • Snowflake stores data in compressed, optimized, and multiple micro partitions in the columnar format.
  • Snowflake supports the standard SQL statements for querying data. These statements include SELECT, TOP, FROM, JOIN, PIVOT, UNPIVOT, GROUP BY, HAVING, ORDER BY, LIMIT, FETCH.
  • Like a traditional database, it includes security features such as encryption, access controls, and multiple authentication mechanisms.

It provides mechanisms to quickly scale up or down your Snowflake resources to meet workload-changing needs.

Is Snowflake just a data warehouse?

Snowflake is a cloud database warehouse. However, its features and functionalities extended its capabilities beyond a traditional data warehouse solution. A few useful features are as follows:

  • Management and Development Capabilities
  • It has a collection of tools for working with the data warehouse solution.
  • SnowSight is for account management, resource monitoring and system usage, and querying data.
  • SnowSQL: It is a Python-based command line client tool.
  • Snowflake extension for Visual Studio code
  • Extensibility and Integration Capabilities
    • Snowflake's integrations with various data science, machine learning, and business intelligence tools enable it to expand its capability without depending on a specific tool for Extract-Transform-Load (ETL) operations.
    • Snowflake handles structured and semi-structured data such as JSON, Parquet, and Avro, which makes it suitable for data lakes.
    • It has an extensive set of client connectors and drivers. For example, Python connector, Spark Connector, Node.JS driver, Go Snowflake driver, JDFC client drive, and ODBC client driver.
  • Security and Compliance Capabilities
    • It supports business critical and sensitive data in compliance with the HIPAA and HITRUST CSF regulations.
    • It provides a granular object access control mechanism.
    • It provides isolation of data for loading and unloading using Amazon S3 policy controls, Azure Storage access controls, and Google Cloud storage access permissions.
    • It has built-in support for authentication methods: multi-factor authentication, federated authentication and single sign-on (SSO), Snowflake OAuth, and external OAuth.
    • Federated authentication and single sign-on (SSO).
    • It allows organizations to choose the geographical region for storing the data as required.
    • Choose the location of your data stored based on your region.
    • Snowflake enables users to recover their historical data for disaster recovery.

Is Snowflake SQL or MySQL?

Snowflake does not have a direct relationship with SQL or MySQL.

  • SQL is a language that works with relational database systems
  • MySQL is an open-source relational database system.

At a high level, the differences between Snowflake and MySQL are as follows.

MySQL

Snowflake

It is a traditional RDBMS.

It is a cloud-based solution.

It supports the horizontal scaling of resources.

It supports both horizontal and vertical scaling.

It uses standard SQL language with variations.

It uses Snowflake SQL to query the underlying data.

It is available on on-premises and cloud deployment models.

It is a fully managed AWS, Azure, or GCP cloud service.

Its architecture uses a server with storage and computing together.

Snowflake architecture includes separation for storage and computing.

Its main objective is to support the structured data. MySQL extends support for semi-structured data in limited form as well.

It supports structured, semi-structured, and Unstructured data as well.

MySQL is mainly used for web applications or workloads requiring a relational database for storing data.

Snowflake's purpose is to analyze data processing and data warehousing.

Get started now

Request a free trial of SharePlex® and see how to efficiently move workloads to Snowflake.