5 Tips and Tricks for Snowflake Data Warehousing
Author: Tobin Thankachen | 7 min read | September 22, 2022
Snowflake is a cloud data warehouse platform that has recently surged in popularity. When Snowflake went public in September 2020, it was the largest software IPO in history and one of the biggest IPOs of 2020. Since then, Snowflake has risen to claim 19 percent of the data warehousing market, second behind only Amazon Redshift.
With so many new adopters of Snowflake, it’s no surprise that many businesses need help optimizing their data warehouse. One of Datavail’s recent clients, a nonprofit in the education industry, needed advice on fine-tuning their Snowflake installation and using it to the fullest extent. Datavail highlighted five issues with the client’s deployment of their Snowflake data warehouse:
- Lack of separation: Instead of dividing their data into development, test, and production environments, the client placed it all into a single database.
- Backups and business continuity: The client had not yet implemented a robust backup strategy for their Snowflake data warehouse.
- Data quality: The client was not validating the quality of incoming data—neither the raw data, nor the staging data after performing data cleansing.
- Data privacy: The client did not leverage data masking to protect sensitive and confidential information during the data integration process.
- Data security: By default, the client allowed any IP address to access Snowflake (including users outside the organization), creating a security risk.
Below, we’ll discuss tips and best practices for addressing these Snowflake data warehousing problems, so that you can avoid them in your own Snowflake deployment.
1. Lack of separation
During the software lifecycle, it’s extremely common for users to work in separate environments such as development, testing, quality assurance, and production. What’s more, the contents of the underlying data warehouse may differ between these environments. For example, a test data warehouse may contain unorthodox records designed to check for errors, or it may have an extremely large number of records in order to stress test the software.
For this reason, setting up separate Snowflake databases for different environments is an IT best practice. Snowflake provides features such as Zero-Copy Cloning, which creates a copy of a database, schema, or table without duplicating the underlying storage, making it easy to populate one environment from the contents of another.
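As a rough illustration of Zero-Copy Cloning, the sketch below builds the `CREATE DATABASE ... CLONE` statements that would populate development and test databases from production. The database names (`PROD_DB`, `DEV_DB`, `TEST_DB`) and the helper functions are hypothetical. The code only produces SQL text, which you would run through your own Snowflake client.

```python
# Sketch: build zero-copy clone statements for non-production environments.
# All database names here are hypothetical placeholders.

def clone_statement(source_db: str, target_db: str) -> str:
    """Return a CREATE DATABASE ... CLONE statement for the given databases."""
    return f"CREATE DATABASE {target_db} CLONE {source_db};"


def environment_clones(source_db: str, environments: list[str]) -> list[str]:
    """Return one clone statement per environment, naming targets <ENV>_DB."""
    return [clone_statement(source_db, f"{env.upper()}_DB") for env in environments]


for statement in environment_clones("PROD_DB", ["dev", "test"]):
    print(statement)
```

Because a clone initially shares storage with its source, refreshing a development or test database this way does not double storage costs up front. Storage grows only as the clone and the source diverge.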
2. Backups and business continuity
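Database backups are an essential practice for any data-driven enterprise. Snowflake offers the Time Travel feature, allowing users to access historical data up to 90 days in the past (the default retention period is one day, and longer retention requires Enterprise Edition or higher). For more severe cases (e.g. a system failure), Snowflake includes the Fail-safe data recovery service as a last resort: for a further seven days after the Time Travel retention period ends, Snowflake itself may be able to recover historical data when other alternatives have failed.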
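To make Time Travel more concrete, here is a minimal sketch that builds three kinds of statement: one to set a table's retention period, one to query the table as it looked some seconds ago, and one to restore a dropped table. The table name `ORDERS` and the helper names are assumptions for illustration. The 0 to 90 day bound mirrors the limit mentioned above.

```python
# Sketch: build Time Travel statements. Table names are hypothetical.

def set_retention(table: str, days: int) -> str:
    """Return a statement setting the Time Travel retention period in days."""
    if not 0 <= days <= 90:
        raise ValueError("retention must be between 0 and 90 days")
    return f"ALTER TABLE {table} SET DATA_RETENTION_TIME_IN_DAYS = {days};"


def query_as_of(table: str, seconds_ago: int) -> str:
    """Return a query reading the table as it was `seconds_ago` seconds ago."""
    return f"SELECT * FROM {table} AT(OFFSET => -{seconds_ago});"


def undrop(table: str) -> str:
    """Return a statement restoring a dropped table within its retention period."""
    return f"UNDROP TABLE {table};"


print(set_retention("ORDERS", 30))
print(query_as_of("ORDERS", 3600))
print(undrop("ORDERS"))
```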
Snowflake also helps businesses improve the availability and reliability of their information. With Snowflake’s database replication, users can run multiple copies of the data warehouse in different regions, zones, or cloud providers. If the primary instance of the data warehouse goes down, users can fall back to a secondary copy without disrupting business operations.
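The database-level replication flow can be sketched as three statements: enable replication on the primary, create a replica in the secondary account, and refresh that replica. The organization, account, and database names below (`myorg.primary_acct`, `myorg.dr_acct`, `SALES_DB`) are hypothetical, and the helpers only build SQL text. Snowflake also offers replication and failover groups, which this sketch does not cover.

```python
# Sketch: build database replication statements. All identifiers are hypothetical.

def enable_replication(db: str, target_accounts: list[str]) -> str:
    """Run in the primary account: allow the database to replicate to other accounts."""
    accounts = ", ".join(target_accounts)
    return f"ALTER DATABASE {db} ENABLE REPLICATION TO ACCOUNTS {accounts};"


def create_replica(db: str, primary_account: str) -> str:
    """Run in the secondary account: create a replica of the primary database."""
    return f"CREATE DATABASE {db} AS REPLICA OF {primary_account}.{db};"


def refresh_replica(db: str) -> str:
    """Run in the secondary account: pull the latest changes from the primary."""
    return f"ALTER DATABASE {db} REFRESH;"


print(enable_replication("SALES_DB", ["myorg.dr_acct"]))
print(create_replica("SALES_DB", "myorg.primary_acct"))
print(refresh_replica("SALES_DB"))
```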
3. Data quality
Increasing the quality of information in your Snowflake data warehouse will require a fresh look at the first two stages of ETL: extracting and transforming the data. Start by examining the various sources that you consume during the extraction stage—from internal files and databases to external websites, services, and APIs. Are there any sources that are out of date or broken, or that you no longer use?
Next, make use of solutions such as dbt (data build tool) to improve the data transformation stage. dbt is an open-source tool that makes it easier to engineer robust, production-ready data pipelines while improving the modularity and portability of your transformation workflows. The dbt Cloud tool is part of Snowflake Partner Connect, which makes it a snap to try out the dbt platform.
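dbt ships with generic tests such as `not_null` and `unique` that flag bad rows in a model. The sketch below is not dbt itself. It is a plain-Python approximation of those two checks, run against a small made-up batch of staged records, to show the kind of validation that belongs after data cleansing. The column names and sample rows are invented for illustration.

```python
# Sketch: plain-Python versions of "not null" and "unique" data quality checks.
from collections import Counter


def not_null_failures(rows: list[dict], column: str) -> list[dict]:
    """Return the rows in which the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows: list[dict], column: str) -> list:
    """Return the non-null values of the column that occur more than once."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical staged records after cleansing.
staged = [
    {"student_id": 1, "email": "a@example.org"},
    {"student_id": 2, "email": None},
    {"student_id": 2, "email": "c@example.org"},
]

print(not_null_failures(staged, "email"))
print(unique_failures(staged, "student_id"))
```

Running the same checks on both the raw and the staged data makes it easier to tell whether a problem came from the source or from the transformation step.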
4. Data privacy
Different individuals, teams, or departments in an organization may require different levels of access to confidential information. This has led to the rise of data masking: concealing specific columns or tables of data from those who don’t need to use it (in particular, from external intruders).
Snowflake’s Dynamic Data Masking feature helps ensure privacy for the sensitive information inside your data warehouse. Using Dynamic Data Masking, data privacy officers can craft masking policies that are attached to columns in tables and views and applied at query time, and then change these policies as your needs evolve.
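A masking policy might look something like the following sketch, which builds a `CREATE MASKING POLICY` statement that reveals a string column only to selected roles, plus the `ALTER TABLE` statement that attaches it to a column. The policy, role, table, and column names (`EMAIL_MASK`, `PRIVACY_OFFICER`, `STUDENTS`, `EMAIL`) are hypothetical, and the masked placeholder value is an arbitrary choice.

```python
# Sketch: build Dynamic Data Masking statements. All identifiers are hypothetical.

def masking_policy(policy: str, allowed_roles: list[str]) -> str:
    """Return a policy that shows the real value only to the allowed roles."""
    roles = ", ".join(f"'{role}'" for role in allowed_roles)
    return (
        f"CREATE MASKING POLICY {policy} AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val ELSE '***MASKED***' END;"
    )


def apply_policy(table: str, column: str, policy: str) -> str:
    """Return a statement attaching the masking policy to a table column."""
    return f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {policy};"


print(masking_policy("EMAIL_MASK", ["PRIVACY_OFFICER"]))
print(apply_policy("STUDENTS", "EMAIL", "EMAIL_MASK"))
```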
5. Data security
Data privacy protects sensitive information from attackers, even if they manage to breach your defenses. What’s even better, however, is data security: ensuring that adversaries can’t access the data warehouse in the first place.
Snowflake includes several measures to help bolster the security of your enterprise data. These include:
- Network policies that allow users to restrict access to certain IP addresses or ranges of IP addresses.
- Role-based access control that grants access privileges to roles, which are in turn assigned to users.
- Multi-factor authentication to thwart phishing scams and other means of stealing user credentials.
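The first measure above, a network policy, can be sketched as follows. The code validates a list of allowed CIDR ranges with Python's standard `ipaddress` module, builds the `CREATE NETWORK POLICY` statement, and offers a local check of whether a given address would fall inside the allowed ranges. The policy name `corp_only` and the IP ranges (taken from the documentation-reserved blocks) are assumptions for illustration.

```python
# Sketch: build a network policy statement and check addresses against it locally.
import ipaddress


def network_policy(name: str, allowed_cidrs: list[str]) -> str:
    """Return a CREATE NETWORK POLICY statement after validating each CIDR range."""
    networks = [str(ipaddress.ip_network(cidr)) for cidr in allowed_cidrs]
    ip_list = ", ".join(f"'{net}'" for net in networks)
    return f"CREATE NETWORK POLICY {name} ALLOWED_IP_LIST = ({ip_list});"


def activate_policy(name: str) -> str:
    """Return a statement applying the policy to the whole account."""
    return f"ALTER ACCOUNT SET NETWORK_POLICY = {name};"


def is_allowed(ip: str, allowed_cidrs: list[str]) -> bool:
    """Return True if the address falls inside any allowed CIDR range."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


office_ranges = ["203.0.113.0/24"]  # hypothetical office network
print(network_policy("corp_only", office_ranges))
print(activate_policy("corp_only"))
print(is_allowed("203.0.113.25", office_ranges))
print(is_allowed("198.51.100.7", office_ranges))
```

Checking your own current address against the list before activating a policy is a sensible precaution, since an overly narrow list can lock administrators out.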
How Datavail Can Help with Snowflake Data Warehouses
From separating environments to concerns about data privacy and security, making the most of your Snowflake data warehouse is no mean feat. If you need help along the way, it’s a wise idea to join forces with an experienced data services provider like Datavail.
Datavail is a Snowflake technology partner with an in-house staff of skilled data warehousing experts. We have the knowledge and expertise our clients need from start to finish—from the initial IT assessment to the data warehouse implementation and beyond. Our list of Snowflake professional services includes:
- Data warehouse migrations
- Snowflake integrations and connectors
- Roadmaps and strategic planning
- Warehouse design and performance tuning
- Long-term support and maintenance
To learn how Datavail helped one client improve their Snowflake, Tableau, data integration, and software development processes, read our case study “Training and Education Organization Learns How to Get More from their Snowflake Data Warehouse Analytics Environment.” You can also get in touch with our team of data experts today to discuss how Datavail can help you meet your business needs and objectives.