Snowflake 101: 5 Ways to Build a Secure Data Cloud

Today, Snowflake is a favorite for all things data. The company started as a simple data warehouse platform a decade ago, but has since evolved into a full-featured data cloud supporting a wide range of workloads, including that of a data lake.

More than 6,000 enterprises currently trust Snowflake to manage their data workloads and produce insights and applications for business growth. Together, they have over 250 petabytes of data on the data cloud, with over 515 million data workloads running every day.

At this scale, security concerns are inevitable. Snowflake recognizes this and offers scalable security and access control features that protect not only accounts and users, but also the data they store. However, organizations can miss some of the basics, leaving their data clouds only partially secure.

Here are some quick tips for closing those gaps and creating a secure enterprise data cloud.

1. Secure your connection

First, all organizations using Snowflake, regardless of size, should focus on using secure networks and SSL/TLS protocols to prevent network-level threats. According to Matt Vogt, vice president of global solutions architecture at Immuta, a good way to start is to connect to Snowflake over a private IP address using a cloud service provider's private connectivity, such as AWS PrivateLink or Azure Private Link. This creates private VPC endpoints that allow direct and secure connectivity between your AWS/Azure VPCs and the Snowflake VPC without traversing the public internet. On top of that, network access controls, such as IP filtering, can also be used for third-party integrations, further strengthening security.
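As a rough illustration of IP filtering, the sketch below builds a Snowflake `CREATE NETWORK POLICY` statement from a list of CIDR ranges, validating each range first. The function name, the policy name, and the example ranges are hypothetical; the generated SQL would still need to be reviewed and run by an administrator.

```python
import ipaddress


def build_network_policy_sql(policy_name, allowed_cidrs, blocked_cidrs=()):
    """Return a CREATE NETWORK POLICY statement for the given IPv4 ranges.

    Hypothetical helper: it only builds the SQL text, it does not run it.
    """
    if not policy_name.isidentifier():
        raise ValueError(f"invalid policy name: {policy_name!r}")
    if not allowed_cidrs:
        raise ValueError("at least one allowed range is required")

    def quote_list(cidrs):
        # IPv4Network raises ValueError on malformed input, so a bad
        # range never reaches the generated SQL.
        validated = [str(ipaddress.IPv4Network(c, strict=False)) for c in cidrs]
        return ", ".join(f"'{c}'" for c in validated)

    sql = (
        f"CREATE NETWORK POLICY {policy_name} "
        f"ALLOWED_IP_LIST = ({quote_list(allowed_cidrs)})"
    )
    if blocked_cidrs:
        sql += f" BLOCKED_IP_LIST = ({quote_list(blocked_cidrs)})"
    return sql + ";"


if __name__ == "__main__":
    # Example ranges come from the documentation-only 203.0.113.0/24 block.
    print(build_network_policy_sql("corp_only", ["203.0.113.0/24"], ["203.0.113.99/32"]))
```

Validating the ranges before building the statement keeps a typo in an IP list from silently producing a policy that blocks or admits the wrong network.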

2. Protect source data

While Snowflake offers multiple layers of protection, such as Time Travel and Fail-safe, for data that has already been ingested, these tools can't help if the source data itself is missing, corrupted, or compromised (for example, maliciously encrypted for ransom). This type of problem, as suggested by Clumio's vice president of product, Chadd Kenney, can only be solved by protecting the data while it resides in an object storage repository such as Amazon S3, before ingestion. Also, to protect against logical deletions, it's a good idea to keep continuous, immutable, and preferably isolated backups that are instantly recoverable into Snowpipe.

3. Consider SCIM with multi-factor authentication

Enterprises should use SCIM (System for Cross-domain Identity Management) to automate the provisioning and management of user identities and groups (i.e., the roles used to authorize access to objects such as tables, views, and functions) in Snowflake. This makes user data more secure and simplifies the user experience by reducing the role of local system accounts. Additionally, by using SCIM where possible, enterprises can configure SCIM providers to synchronize users and roles with Active Directory users and groups.
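To make the provisioning step more concrete, here is a minimal sketch of the kind of user record an identity provider sends over SCIM, following the SCIM 2.0 core user schema. In practice the identity provider generates and sends this automatically; the function name and sample values are hypothetical, and which attributes a given Snowflake account honors is an assumption to verify against its SCIM configuration.

```python
import json

# Schema URN defined by the SCIM 2.0 core schema for user resources.
SCIM_USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"


def build_scim_user(user_name, given_name, family_name, email, active=True):
    """Return a SCIM 2.0 user resource as a plain dict (hypothetical helper)."""
    return {
        "schemas": [SCIM_USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
        # Setting active to False is how SCIM clients typically deprovision a user.
        "active": active,
    }


if __name__ == "__main__":
    payload = build_scim_user("jdoe", "Jane", "Doe", "jdoe@example.com")
    print(json.dumps(payload, indent=2))
```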

In addition, companies should use multi-factor authentication as an extra layer of security. Depending on the interface used, such as client applications using drivers, the Snowflake UI, or Snowpipe, the platform can support multiple authentication methods, including username/password, OAuth, key pair, external browser, federated authentication using SAML, and native Okta authentication. If multiple methods are supported, the company recommends favoring OAuth (either Snowflake OAuth or External OAuth), followed by external browser authentication, native Okta authentication, and key pair authentication.
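The recommended preference order can be expressed as a small selection routine. This is only a sketch: the method labels below are illustrative names chosen for this example, not Snowflake driver settings, and methods the recommendation does not rank are deliberately left out.

```python
# Preference order as described above, most preferred first.
# The string labels are illustrative, not real driver parameter values.
PREFERRED_AUTH_ORDER = ["oauth", "external_browser", "okta_native", "key_pair"]


def pick_auth_method(supported):
    """Return the most preferred method an interface supports, or None."""
    supported_set = set(supported)
    for method in PREFERRED_AUTH_ORDER:
        if method in supported_set:
            return method
    # None of the ranked methods is available for this interface.
    return None


if __name__ == "__main__":
    print(pick_auth_method(["key_pair", "external_browser"]))
```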

4. Use column-level access control

Organizations should use Snowflake's dynamic data masking and external tokenization capabilities to restrict certain users' access to sensitive information in certain columns. For example, dynamic data masking, which masks column data on the fly based on who is querying it, can be used to restrict column visibility based on the user's country, so that a US employee can view only US order data, while French employees can view only order data from France.

Both features are quite effective, but both rely on masking policies to work. To get the most out of them, organizations should first decide whether to centralize masking policy management or decentralize it to individual database-owning teams, depending on their needs. They should also use invoker_role() in policy conditions to let unauthorized users view aggregate data on protected columns while keeping individual values hidden.
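A masking policy of the kind described here might look like the following sketch, which builds the `CREATE MASKING POLICY` and `ALTER TABLE ... SET MASKING POLICY` statements as text. The helper names, the policy, table, column, and role names, and the mask literal are all hypothetical, and the condition uses `CURRENT_ROLE()` purely as a simple example.

```python
def _check_name(name):
    # Accept plain or dot-qualified identifiers only, so arbitrary
    # text cannot be spliced into the generated SQL.
    if not name or not all(part.isidentifier() for part in name.split(".")):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def build_masking_policy_sql(policy_name, allowed_roles):
    """Return a CREATE MASKING POLICY statement for a string column."""
    _check_name(policy_name)
    if not allowed_roles:
        raise ValueError("at least one role is required")
    roles = ", ".join(f"'{_check_name(r).upper()}'" for r in allowed_roles)
    return (
        f"CREATE OR REPLACE MASKING POLICY {policy_name} "
        "AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val "
        "ELSE '***MASKED***' END;"
    )


def build_apply_policy_sql(table, column, policy_name):
    """Return the statement that attaches the policy to one column."""
    return (
        f"ALTER TABLE {_check_name(table)} MODIFY COLUMN {_check_name(column)} "
        f"SET MASKING POLICY {_check_name(policy_name)};"
    )


if __name__ == "__main__":
    print(build_masking_policy_sql("email_mask", ["analyst"]))
    print(build_apply_policy_sql("sales.orders", "customer_email", "email_mask"))
```

Keeping the policy definition separate from the statement that attaches it reflects the management decision above: a central team can own the policy while table owners decide where to apply it. The condition could use a different context function in place of `CURRENT_ROLE()`, depending on how the policy should behave inside views and functions.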

5. Implement a unified audit model

Finally, organizations should remember to implement a unified audit model to ensure transparency around the policies they put in place. This helps them actively monitor policy changes, such as who created which policy and who granted user X or group Y access to certain data, which is as critical as monitoring query and data access patterns.

To view account usage patterns, use the system-defined read-only shared database named SNOWFLAKE. It has a schema named ACCOUNT_USAGE containing views that provide access to one year of audit logs.
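As one possible starting point for policy monitoring, the sketch below builds a query against the `QUERY_HISTORY` view in `SNOWFLAKE.ACCOUNT_USAGE` that lists recent statements mentioning masking policies. The function name, the text filter, and the default lookback window are assumptions made for illustration; a real audit would likely combine several views.

```python
def build_policy_audit_sql(days=7):
    """Return a query listing recent statements that touched masking policies.

    Hypothetical helper: the lookback is capped at 365 days to match the
    one year of history the ACCOUNT_USAGE views are described as holding.
    """
    if isinstance(days, bool) or not isinstance(days, int) or not 1 <= days <= 365:
        raise ValueError("days must be an integer between 1 and 365")
    return (
        "SELECT user_name, role_name, query_text, start_time\n"
        "FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY\n"
        f"WHERE start_time >= DATEADD(day, -{days}, CURRENT_TIMESTAMP())\n"
        "  AND query_text ILIKE '%MASKING POLICY%'\n"
        "ORDER BY start_time DESC;"
    )


if __name__ == "__main__":
    print(build_policy_audit_sql(30))
```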

VentureBeat’s mission is to be a digital public square for technical decision makers to learn about transformative enterprise technology and conduct transactions.