Determining the Root Cause of a Data Breach With “The 5 Whys”

January 29, 2013

The jarring sound of an iPhone vibrating against a mahogany nightstand at 3:15am.  This can’t be good.  Server down?  Much worse: 50,000 sensitive files have been stolen from a poorly permissioned file server.  First, damage control.  Next, investigation.

Problem: 50,000 files were stolen.

Why?  The files were accessible to everyone in the company, even guests.

Why?  The folder’s access control list was configured incorrectly.

Why?  Chuck the intern configured that file server in 2007 and it hasn’t been reviewed since.

Why?  We don’t have a process to review file system permissions.

Why?  Because manually reviewing every folder’s ACL for problems is like searching for a needle in a haystack…and THERE’S ONLY THREE OF US AND A THOUSAND FILE SERVERS! SHEESH!

This fun little question-asking technique is called The 5 Whys.  It was developed by Sakichi Toyoda at Toyota to determine the root cause of, and solution to, any given problem in the manufacturing process.  The technique has been borrowed by coders, sysadmins, and startup founders alike.

See, behind every technical problem is usually a human problem.

On the surface, it seems like the above fictional security incident was technical in nature – the ACL was configured incorrectly.  Deep down, however, the problem was the company’s non-existent entitlement review policy.

The 5 Whys technique encourages us to address the problem on multiple levels: fix the ACL, stop letting interns configure important systems by themselves, and institute a system for performing periodic entitlement reviews.

Sometimes it’s not feasible to immediately address every single problem uncovered, but 5 Whys suggests that if you make a proportional investment in the solution every time an incident occurs, you’ll eventually get to a point where you have an optimal level of protection against a given problem.  In our example, maybe you’d start by piloting entitlement reviews with a small business unit, or review just the super sensitive data sets.

The 5 Whys is an excellent technique for determining root cause so you can take reactive steps to ensure a problem doesn’t happen twice.  In my next post I’m going to talk about a new model for holistically evaluating your company’s risk profile so you can make proactive improvements.


Using Varonis: Involving Data Owners (Part I)

January 2, 2013

(This is one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance.  You can find the whole series here.)

Almost every organization is now data driven. With all the talk about data growth and big data analytics over the past couple of years, people have started to ask: “How do we maximize the value of our data? How can we make sure we’re deriving real business benefit?”

The keys to maximizing the value of our data are to gather the right intelligence about it, and then give the right people the ability to take action using the intelligence you’ve gathered.

Now that we know who our Data Owners are, it’s time to start getting them involved. Remember that it’s the owners—not IT—that have adequate context to make decisions about who should and shouldn’t have access to their assets.

The next step in operationalizing Varonis is to provide owners with intelligence about their data assets.  DatAdvantage can deliver data-driven reports that shed light on what is happening with their data: who can access it, what they’re doing with it, which data is stale, etc. Data-driven delivery greatly simplifies reporting: a single report goes out to all owners, and each copy contains information about only the data that owner owns.

An Example

Say you’ve spent a few weeks identifying and confirming business owners for all of the top-level folders on a large NAS (or two, or three…). Depending on the size of the company, this might be a few dozen or a few thousand people. One of the most common next steps is to provide permissions reports on all of these data sets to the relevant owners. So the HR owner gets a report on all of the users who have access to the HR folder, for instance. It’s the same with Finance, Marketing, R&D, etc. In the past, you would have to create and deliver a separate report for each owner, which depending on the complexity of your reporting process might be an onerous undertaking all by itself. DatAdvantage gives you a far better alternative.

In DatAdvantage, to accomplish the same thing, you’d only need to create a single report, and all owners would get permissions reports once a quarter (or however often you like). Create the report, include the proper filters and formatting, and then set up a data-driven subscription to be delivered on the first day of the first month of the quarter. That’s it; you’re done.

Every quarter, every data owner is going to get that report in their inbox, and the report will contain information about only the data that they own—they won’t see anything that doesn’t belong to them. As you add and change owners over time, the subscription will continue to work without intervention. If my job role changes and suddenly I’m the owner of additional folders, my permissions report will show those as well. If I’m no longer an owner, my report won’t contain information about what I no longer own.
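The “one report, many recipients” idea can be sketched in a few lines of Python. This is only an illustration of data-driven filtering, not how DatAdvantage implements it; the `PermissionRow` record, the sample folders and users, and the `reports_by_owner` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PermissionRow:
    folder: str   # folder the permission applies to
    owner: str    # business owner of that folder (hypothetical field)
    user: str     # user who has access
    access: str   # e.g. "read" or "modify"


def reports_by_owner(rows):
    """Split one permissions data set into one report per owner."""
    reports = defaultdict(list)
    for row in rows:
        reports[row.owner].append(row)
    return dict(reports)


# Made-up sample data for illustration only.
rows = [
    PermissionRow("/hr", "alice", "bob", "read"),
    PermissionRow("/hr", "alice", "carol", "modify"),
    PermissionRow("/finance", "dave", "erin", "read"),
]

reports = reports_by_owner(rows)
for owner, owned_rows in reports.items():
    print(owner, [(r.folder, r.user, r.access) for r in owned_rows])
```

Because the owner is looked up each time the report runs, reassigning a folder to a new owner changes who receives those rows without touching the report definition.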

Permissions reporting is a great use case for data-driven reports, but it’s not the only one. Reports that show actual access can be useful, too.  What if every data owner could see exactly who on their team was accessing data most? What about those people who weren’t accessing any? Or people from outside their team bumbling around?  Who creates content? Showing owners what data is stale or which folders are growing the fastest can help them understand how they’re using resources. Providing owners with intelligence about where their sensitive data is, where it’s exposed, and who has been accessing it leads to informed decisions about how they can reduce risk.

Once you’ve started putting intelligence into the hands of your owners, the next step is to give them the power to take action without bugging IT. We’ll cover that next.


Top 3 SharePoint Security Challenges

December 14, 2012

The rapid adoption of SharePoint has outpaced the ability of organizations to control its growth and enforce consistent policies for security and access control. The ease with which SharePoint sites can be created means that SharePoint use is decentralized and often outside the purview of IT departments, security personnel and even dedicated SharePoint administrators.

So what are the top 3 SharePoint security challenges?

1 – Organic and chaotic deployment of SharePoint sites

Pervasive departmental use of SharePoint means that all types of data make their way into SharePoint repositories. The data can range in sensitivity and importance and may easily include human resources or product information. So the problem for organizations becomes not only identifying sensitive data but also locating all SharePoint sites, existing and emerging.

2 – Ad hoc, complex permissions administration

The levels and types of permissions available with SharePoint are more complex than their NTFS counterparts, and the additional granularity and inheritance complexity creates more access levels and a high probability for erroneous or overly permissive access.

While access control decisions may be (rightly) left to the data owners through SharePoint’s permissions workflow, the complexity of its implementation often leads to inconsistency in ACL configuration and group assignment. Without strict auditing and oversight, permissions may be set in conflict with enterprise-level access policies, and may not include key business intelligence about why the access should be limited (e.g., content might be regulated or copyright protected).

3 – Limited, resource-intense auditing

Key to maintaining good access control over data is continuous monitoring of how data is being used. This is another challenge with a SharePoint environment. Microsoft SharePoint audit detail is geared toward helping site administrators manage content, not toward refining access policy. Consequently there is no way for SharePoint administrators to easily establish which users took what action on data.

The native auditing capabilities are also limited in terms of scalability across sites. “Normalizing” the data, i.e., creating a unified and accurate view of data use and access across sites and locations, is challenging and time-intensive. Exacerbating the problem is that files on SharePoint often make their way to other platforms like file shares and email – without a unified audit trail of activity, understanding how and by whom data is accessed in the collaborative environment can be a significant challenge.

Download our FREE guide to learn how to make sense of SharePoint permissions & lock down and monitor your sensitive data.


Top 5 Things IT Should Be Doing, But Isn’t

December 7, 2012

Posted on December 5, 2012 by 

A clear path to effective information governance.

1. Audit Data Access

Effective management of any data set is impossible without a record of access. Unless one can reliably observe data use, one cannot observe its non-use, misuse, or abuse. Without a record of data usage, one cannot answer critical questions, from the most basic ones, like “Who deleted my files?”, “What data does this person (or these people) use?”, and “What data isn’t used?”, to more complex ones, like “Who owns a data set?”, “Which data sets support this business unit?”, and “How can I lock down data without disrupting workflows?”
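To make the basic questions concrete, here is a minimal Python sketch of how they could be answered once an audit trail exists. The `AuditEvent` record, its fields, and the sample events are assumptions made for illustration; a real audit log will have its own schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AuditEvent:
    day: date
    user: str
    action: str   # "open", "modify", "delete", ...
    path: str


def who_deleted(events, path):
    """Users who deleted the given path, in event order."""
    return [e.user for e in events if e.action == "delete" and e.path == path]


def unused_paths(all_paths, events, since):
    """Paths with no recorded activity on or after `since`."""
    touched = {e.path for e in events if e.day >= since}
    return sorted(set(all_paths) - touched)


def most_active_users(events, folder):
    """Event counts per user for files under `folder`, busiest first."""
    prefix = folder.rstrip("/") + "/"
    counts = Counter(e.user for e in events if e.path.startswith(prefix))
    return counts.most_common()


# Made-up sample data for illustration only.
events = [
    AuditEvent(date(2012, 11, 1), "bob", "open", "/hr/policy.doc"),
    AuditEvent(date(2012, 11, 2), "bob", "modify", "/hr/policy.doc"),
    AuditEvent(date(2012, 11, 3), "carol", "delete", "/hr/old-plan.doc"),
    AuditEvent(date(2011, 1, 15), "erin", "open", "/finance/2010-budget.xls"),
]
all_paths = ["/hr/policy.doc", "/hr/old-plan.doc", "/finance/2010-budget.xls"]

print(who_deleted(events, "/hr/old-plan.doc"))
print(unused_paths(all_paths, events, since=date(2012, 1, 1)))
print(most_active_users(events, "/hr"))
```

None of these queries is possible without the event records themselves, which is the point of auditing first.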

2. Inventory Permissions and Directory Services Group Objects

Effective management of any data set is also impossible without understanding who has access to it. Access control lists and groups (in Active Directory, LDAP, etc.) are the fundamental protective control mechanism for all unstructured and semi-structured data platforms, yet too often IT cannot easily answer fundamental data protection questions like, “Who has access to a data set?” and “What data sets does a user or group have access to?” Answers to these questions must be accurate and accessible for data protection and management projects to succeed.
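Both questions come down to combining ACLs with group membership, including nested groups. The Python sketch below shows the idea on in-memory dictionaries; the `members_of` and `acls` structures and every name in them are hypothetical stand-ins for whatever a directory service and file system would actually return.

```python
def expand_group(group, members_of, seen=None):
    """Return all users in `group`, following nested groups."""
    seen = set() if seen is None else seen
    if group in seen:          # guard against circular nesting
        return set()
    seen.add(group)
    users = set()
    for member in members_of.get(group, ()):
        if member in members_of:      # member is itself a group
            users |= expand_group(member, members_of, seen)
        else:
            users.add(member)
    return users


def who_has_access(folder, acls, members_of):
    """Answer: who has access to a data set?"""
    users = set()
    for principal in acls.get(folder, ()):
        if principal in members_of:
            users |= expand_group(principal, members_of)
        else:
            users.add(principal)
    return users


def what_can_access(user, acls, members_of):
    """Answer: what data sets does a user have access to?"""
    return sorted(f for f in acls if user in who_has_access(f, acls, members_of))


# Made-up sample data for illustration only.
members_of = {
    "HR-All": ["HR-Managers", "bob"],
    "HR-Managers": ["alice"],
    "Finance": ["dave"],
}
acls = {
    "/hr": ["HR-All"],
    "/finance": ["Finance", "alice"],
}

print(who_has_access("/hr", acls, members_of))
print(what_can_access("alice", acls, members_of))
```

Note that alice reaches `/hr` only through a nested group, which is exactly the kind of access that is hard to see by reading a single ACL.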

3. Prioritize Which Data Should Be Addressed

While all data should be protected, some data needs to be protected much more urgently than other data. Some data sets have well known owners and well defined processes and controls for their protection, but many others are less understood. With an audit trail, data classification technology, and access control information, organizations can identify active and stale data, data that is considered sensitive, confidential, or internal, and data that is accessible to many people. These data sets should be reviewed and addressed quickly to reduce risk.
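One simple way to turn those three inputs into a review queue is to sort data sets by sensitivity and then by breadth of access. The Python sketch below is an assumed ordering chosen for illustration, not a prescribed risk model; the `DataSet` fields, the one-year staleness threshold, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    path: str
    sensitive: bool          # from data classification
    users_with_access: int   # from the permissions inventory
    days_since_access: int   # from the audit trail


def is_stale(data_set, stale_after_days=365):
    """Treat data untouched for longer than the threshold as stale."""
    return data_set.days_since_access > stale_after_days


def review_order(data_sets):
    """Sensitive data first, then the most widely accessible."""
    return sorted(data_sets, key=lambda d: (not d.sensitive, -d.users_with_access))


# Made-up sample data for illustration only.
data_sets = [
    DataSet("/share/lunch-menus", False, 900, 10),
    DataSet("/share/hr/salaries", True, 850, 3),
    DataSet("/share/legal/contracts", True, 12, 400),
]

for d in review_order(data_sets):
    print(d.path, "stale" if is_stale(d) else "active")
```

Stale data is flagged rather than ranked here, since it is often a candidate for archival instead of a permissions fix.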

Access our FREE Full Report, including the complete list of IT Must Do’s.

4. Remove Global Access Groups from ACLs (like “Everyone”) – especially where sensitive data is located

It is not uncommon for folders on file shares to have access control permissions allowing “Everyone,” or all “domain users” (nearly everyone), to access the data contained therein. SharePoint has the same problem (especially with authenticated users). Exchange has these, as well as “Anonymous User” access. This creates a significant security risk: any data placed in such a folder will inherit those “exposed” permissions, and those who place data in these wide-open folders may not be aware of the lax access settings. When sensitive data, like PII, credit card information, intellectual property, or HR information, is in these folders, the risks can become very significant. Global access to folders, SharePoint sites, and mailboxes should be removed and replaced with rules that give access to the explicit groups that need it.
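Finding these entries is a straightforward scan once the ACLs have been inventoried. Here is a minimal Python sketch, assuming the ACLs are already available as a mapping from resource to principal names; the mapping, the resource paths, and the `globally_exposed` helper are hypothetical, and the group names are the ones mentioned above.

```python
# Group names taken from the text; a real environment may use others.
GLOBAL_GROUPS = {"everyone", "domain users", "authenticated users", "anonymous user"}


def globally_exposed(acls):
    """Return {resource: [global principals]} for resources open to global groups."""
    findings = {}
    for resource, principals in acls.items():
        hits = [p for p in principals if p.strip().lower() in GLOBAL_GROUPS]
        if hits:
            findings[resource] = hits
    return findings


# Made-up sample data for illustration only.
acls = {
    "/share/hr": ["HR-Staff", "Everyone"],
    "/share/eng": ["Engineering"],
    "/sites/intranet": ["Authenticated Users"],
}

for resource, groups in globally_exposed(acls).items():
    print(resource, "is open to", ", ".join(groups))
```

The output is only a list of candidates; deciding which explicit groups should replace each global entry still requires knowing who actually uses the data.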

5. Identify Data Owners

IT should keep track of data business owners and the folders and SharePoint sites under their responsibility. By involving data owners, IT can expedite a number of the previously identified tasks, including verifying permissions revocation and review, and identifying data for archival. The net effect is a marked increase in the accuracy of data entitlement permissions and, therefore, data protection.



Using Varonis: Which Data Needs Owners?

December 6, 2012

(This is one entry in a series of posts about the Varonis Operational Plan – a clear path to data governance.  You can find the whole series here.)

Which Data Needs Owners?

In a single terabyte of data there are typically around 50,000 folders or containers, about 5% of which have unique permissions. If IT were to set a goal of assigning an owner for every unique ACL, they’d need to locate owners for 2,500 folders. That’s quite daunting. And most organizations aren’t dealing with a single terabyte of data; in fact, many enterprise installations we encounter are dealing with multiple petabytes of unstructured data. Clearly we need a more surgical approach to assigning owners.

Varonis tackled this problem with a longtime customer who needed to identify and assign owners for more than 200 terabytes of CIFS data on their fleet of NetApp filers. There were about 40,000 users in the company, approximately 3,000 of whom (as it turned out) needed to be designated as owners for some data.
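To see why a folder-by-folder approach doesn’t scale, the per-terabyte rule of thumb can be applied to this customer’s 200 terabytes. The short Python calculation below assumes the 50,000-folder and 5% figures scale linearly with data size, which is a simplification made only for illustration.

```python
FOLDERS_PER_TB = 50_000     # rule of thumb quoted earlier in this post
UNIQUE_ACL_PERCENT = 5      # share of folders with unique permissions


def unique_acl_folders(terabytes):
    """Estimated number of folders with unique ACLs (linear scaling assumed)."""
    return terabytes * FOLDERS_PER_TB * UNIQUE_ACL_PERCENT // 100


print(unique_acl_folders(1))
print(unique_acl_folders(200))
```

Under that assumption, 200 terabytes would contain on the order of half a million uniquely permissioned folders, far too many to assign owners to one at a time.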

When we started taking a close look at specific folders, we discovered that many of them (especially at the top of the hierarchy) simply didn’t need an owner; the only users who could read or write data, according to the ACL, were either service accounts or administrative/IT accounts.

What we needed was a methodology for locating the folders where business users had access and a way to identify the likely owner for just those folders. So that’s what we built.

The logic went like this:

  • Identify the topmost unique ACL in a tree where business users have access.
  • If that ACL’s permissions allow write access to users outside of IT, it’s considered a “demarcation point.”
  • For what’s left, identify higher-level demarcation points where non-IT users can only read data.
  • For each demarcation point, identify the most active users.
  • Correlate active users with other metadata, such as department name, payroll code, managed by, etc.
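The steps above can be sketched in Python. This is a simplified illustration under several assumptions, not the logic of the actual report: the `tree` contains only folders that have unique ACLs, each ACL maps a principal to a single `"read"` or `"write"` right, and `IT_PRINCIPALS`, `events`, and `directory` are hypothetical sample data.

```python
from collections import Counter

IT_PRINCIPALS = {"it-admins", "svc-backup"}   # assumed IT and service accounts

# Folders with unique ACLs only: path -> {principal: "read" | "write"}
tree = {
    "/share": {"it-admins": "write"},
    "/share/hr": {"it-admins": "write", "hr-staff": "write"},
    "/share/hr/reviews": {"it-admins": "write", "hr-managers": "write"},
    "/share/policies": {"it-admins": "write", "all-staff": "read"},
}


def business_rights(acl):
    """Rights granted to principals outside IT."""
    return {right for who, right in acl.items() if who not in IT_PRINCIPALS}


def is_under(path, ancestor):
    return path.startswith(ancestor.rstrip("/") + "/")


def demarcation_points(tree):
    top_down = sorted(tree, key=lambda p: (p.count("/"), p))
    points = {}
    # Pass 1: topmost folders where non-IT users can write.
    for path in top_down:
        if any(is_under(path, p) for p in points):
            continue
        if "write" in business_rights(tree[path]):
            points[path] = "write"
    # Pass 2: for what's left, topmost folders where non-IT users can only read.
    for path in top_down:
        if path in points or any(is_under(path, p) for p in points):
            continue
        if business_rights(tree[path]) == {"read"}:
            points[path] = "read"
    return points


def likely_owner(point, events, directory):
    """Most active user under a demarcation point, with their department."""
    activity = Counter(user for user, path in events if is_under(path, point))
    if not activity:
        return None
    user, _count = activity.most_common(1)[0]
    return user, directory.get(user)


# Made-up audit events (user, path) and directory metadata (user -> department).
events = [
    ("alice", "/share/hr/benefits.doc"),
    ("alice", "/share/hr/reviews/q1.doc"),
    ("bob", "/share/hr/roster.xls"),
    ("carol", "/share/policies/handbook.pdf"),
]
directory = {"alice": "HR", "bob": "HR", "carol": "Legal"}

for point, kind in demarcation_points(tree).items():
    print(point, kind, likely_owner(point, events, directory))
```

In this sample, `/share` gets no owner because only IT can touch it, and `/share/hr/reviews` is covered by the demarcation point above it.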

The end result of this process is that each demarcation point has a likely ownership candidate. For this particular customer, the next step was to go through a survey process to confirm ownership of each demarcation point with the likely owners (as determined by Varonis’ reports). Any data without a confirmed owner was locked down to remove non-IT access and underwent a separate disposition process.

Other customers have since added content classification and other risk factors in order to better prioritize the data ownership assignment process. With a good classification scheme in place, IT is able to start assigning owners to the most critical data first.

The key takeaway from this process is that we can use DatAdvantage to quickly identify the folders that need owners, as well as their likely owners, so IT doesn’t need to make decisions about 2,500 folders per terabyte of data.

While this report was originally a customization for one customer, we’ve now baked it right into DatAdvantage as report 12M – Recommended Base Folders.

Now that we know who our owners are, the next step is to start getting them involved. My next few posts will cover exactly how we do this using both DatAdvantage and DataPrivilege.

Stay tuned!


Using Varonis: Fixing the Biggest Problems

November 26, 2012

Now that we have a pretty good idea where the highest-risk data is, the question naturally turns to reducing that risk. Fixing permissions problems on Windows, SharePoint or Exchange has always been a significant operational challenge. I’ve been in plenty of situations as an admin where I know something is broken—a SharePoint site open to Authenticated Users for instance—but I’ve felt powerless to actually address the problem since any permissions change carries the risk of denying access to a user (or process) who needs it. Mistakes can have significant business impact depending on whose access you broke and on what data. Since we’re defining “at-risk” as being valuable data that’s over-exposed, that means that any accessibility problems we create will impact valuable data, and that can create more problems than we started with.

Step 3: Remediate High-Risk Data

The goal is to reduce risk by reducing permissions for those users or processes that don’t require access to the data in question.

The next step in the Varonis Operational Plan is fixing those high-risk access control issues that we’ve identified: data open to global access groups as well as concentrations of sensitive information open to either global groups or groups with many users. Since simply reducing access without any context can cause problems, we need to leverage metadata and automation through DatAdvantage.

Let’s tackle global access first. When everyone can access data, it’s very difficult to know who among the large set of potential users actually needs that access. If we know exactly who’s touching the data, we can be surgical about reducing access without causing any headaches.

DatAdvantage analyzes the data’s audit record over time in conjunction with access controls, showing folders, SharePoint sites, and other repositories that are accessible by global access groups, along with the users who have been accessing that data and who wouldn’t have had access without a global access group. In effect, it’s doing an environment-wide simulation to answer the question, “What if I removed every global access group from every ACL tomorrow? Who would be affected?” This report gives you some key information:

  • Which data is open to global access groups
  • Which part of that data is being accessed by users who wouldn’t otherwise be able to access
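The “what if” simulation can be illustrated with a small Python sketch. It is a toy model of the idea, not DatAdvantage’s implementation: it assumes ACLs are sets of principals, group membership is already flattened to users, and the audit trail is a list of (user, folder) pairs; all names and data are hypothetical.

```python
GLOBAL_GROUPS = {"Everyone", "Domain Users", "Authenticated Users"}


def affected_users(acls, members_of, events):
    """Users whose observed access depends on a global access group.

    acls:       folder -> set of principals on its ACL
    members_of: group -> set of users (already flattened)
    events:     iterable of (user, folder) access records
    """
    affected = {}
    for user, folder in events:
        principals = acls.get(folder, set())
        if not principals & GLOBAL_GROUPS:
            continue    # folder isn't open to a global group
        remaining = principals - GLOBAL_GROUPS
        still_allowed = user in remaining or any(
            user in members_of.get(group, ()) for group in remaining
        )
        if not still_allowed:
            affected.setdefault(folder, set()).add(user)
    return affected


# Made-up sample data for illustration only.
acls = {
    "/share/public": {"Everyone", "Marketing"},
    "/share/eng": {"Engineering"},
}
members_of = {"Marketing": {"mia"}, "Engineering": {"eli"}}
events = [
    ("mia", "/share/public"),   # keeps access through Marketing
    ("sam", "/share/public"),   # only had access through Everyone
    ("eli", "/share/eng"),      # folder has no global group
]

print(affected_users(acls, members_of, events))
```

Here only sam would lose access, so sam is the one user to provision explicitly before the global group is removed.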

And it’s not just global groups that DatAdvantage lets you do this with. Because every data touch by every user on every monitored server is logged, Varonis lets you do this kind of analysis for any user, in any group, on any file or folder. That means you can safely remediate access to all of the high-risk data without risking productivity. You can actually fix the problem without getting in anyone’s way.

The next step is to start shifting decision making from your IT staff to the people who actually should be making choices about who gets access to data: data owners.


Some Amazing Things About Your File System

October 25, 2012

by Andy Green

I was recently asked by one of our sales people to come up with a few unusual facts about user behaviors or statistics related to networked file systems. She was looking for a good anecdote that would make our customers reconsider conventional IT wisdom. I think I’ve found something to raise an IT admin’s eyebrow.

To be fair, my discovery has been known about in a general way for a long time. It’s even become part of our popular culture. No, I don’t mean Murphy’s Law, which is well-appreciated by IT journeymen. I am referring to the proverbial 80-20 rule, which was explained to me, with more than a little hand waving, when I first started in IT. It went something like this: “80% of the data is explained by 20% of the facts”.

As with many simply stated rules, 80-20 hides some deep ideas. It turns out to describe key stats in complex systems spanning economics, marketing, sociology, as well as a few physical sciences. In recent years, the rule has been found to apply to another and more familiar complex creation–the Internet.

The fancier way to describe the 80-20 rule is to say that distributions of data (a graph of web site visits, web link references, or, as we’ll see later, file sizes) are governed by so-called power laws. “Long tails” and “fat tails” are other terms used to talk about the relative weightiness of events at the extreme end of the data curve, that is, compared to the thinner limits of the more beloved bell-shaped curve.

There is strong evidence for the rule. Much has been written about fat tails with respect to web stats. You can partially satisfy your own curiosity by looking at the web traffic data collected by Quantcast. According to them, perennial top sites such as Facebook, Google, Yahoo, Twitter, MSN.com and a few others attract a disproportionate amount of total web visits.

From a quick back-of-the-envelope calculation using the Quantcast numbers, I tallied up close to 80% of monthly visitor traffic against just 40 of Quantcast’s top-ranked sites. These 40 sites, out of almost 400 million total web sites worldwide, are way, way less than 1% of the total. That’s a very skewed 80-20 pattern: closer to 80-.00001!
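The arithmetic behind that last figure is easy to check in Python, using the round numbers quoted above (40 sites out of roughly 400 million):

```python
top_sites = 40
total_sites = 400_000_000   # "almost 400 million", rounded for simplicity

# Share of all web sites represented by the top 40, as a percentage.
share_percent = top_sites / total_sites * 100
print(f"{share_percent:.5f}%")
```

The top 40 sites make up about 0.00001% of all sites, which is where the “80-.00001” figure comes from.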

What does this have to do with file systems? Networked file servers are complex enough, with a large enough community of users accessing an ever-changing supply of resources (files, directories, and access permissions), to potentially behave in similar ways to the Web.

In graphing the distributions of file sizes, researchers long ago noticed–long pause–a similar kind of skewed curve. While it may not be a true power law, the telltale fat tail shows up for extreme file sizes. For example, you can check out this paper from the folks at Microsoft Research wherein they plot byte-counts for their corporate file system.

Being curious about my own aged home computer, a 10 year-old Dell running Windows XP, I decided to take a quick peek at a histogram of its file system, using a freebie utility. Here’s what I learned: out of almost 70,000 files taking up about 29 GB of space, a mere 83 files, or a shade more than .1%, accounted for an astonishing 26% of the disk space!
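If you’d like to try the same measurement without a separate utility, a few lines of Python will do. This is a rough sketch rather than the tool I used: the helper names are made up, and the sample sizes at the bottom are invented so the example runs anywhere.

```python
import os


def file_sizes(root):
    """Sizes in bytes of every readable file under `root`."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            except OSError:
                continue    # skip files that vanish or can't be read
    return sizes


def top_share(sizes, top_n):
    """Fraction of total bytes held by the `top_n` largest files."""
    total = sum(sizes)
    if total == 0:
        return 0.0
    return sum(sorted(sizes, reverse=True)[:top_n]) / total


# Invented example: 99 small files and one very large one.
sample = [1] * 99 + [901]
print(top_share(sample, 1))

# On a real disk you might run, for example:
# print(top_share(file_sizes("C:\\"), 83))
```

Scanning an entire drive can take a while, so it’s worth trying a single folder first.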

[Figure: skewed disk utilization graph]

Even though I’m familiar with the research, I was still a little stunned to see the fat tail pattern play out on my personal computer. By the way, Microsoft Outlook® .pst files can reach huge sizes–you’ve been warned!

What’s going on to explain these renegade fat tails in corporate file systems?

One of the proposed ideas is that we, as file users, are copying existing files and then editing–adding or subtracting content–from them for the next person down the chain to modify and so on. Essentially, users are successively multiplying a file size by a random factor, and this has been shown to lead to fat-tailed file size curves.
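A quick simulation shows how this plays out. The Python sketch below is a toy model with assumed parameters (the starting size, the number of edits, and the range of the random factor are arbitrary choices, not figures from the research): every file starts at the same size and is repeatedly multiplied by a random factor.

```python
import random
import statistics


def simulate_sizes(n_files=10_000, n_edits=30, seed=0):
    """Simulate files whose sizes change by a random factor on each edit."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_files):
        size = 10_000.0                      # arbitrary starting size in bytes
        for _ in range(n_edits):
            size *= rng.uniform(0.5, 1.8)    # each edit shrinks or grows the file
        sizes.append(size)
    return sizes


sizes = simulate_sizes()
largest_1pct = sorted(sizes, reverse=True)[: len(sizes) // 100]

mean_to_median = statistics.mean(sizes) / statistics.median(sizes)
top_share = sum(largest_1pct) / sum(sizes)

print("mean / median:", round(mean_to_median, 2))
print("share of bytes in largest 1% of files:", round(top_share, 2))
```

Because the logarithm of each size is a sum of many independent random terms, the sizes end up roughly lognormal: most files stay small while a handful grow enormous, so the mean sits well above the median and the largest 1% of files hold a disproportionate share of the bytes.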

This copying behavior may also have a herd component to it. That is, we tend to edit files that have been copied or accessed more frequently. Preferences for popular files—or web sites or social networks—are also known to lead to fat-tailed distributions.

Based on my own experience as a user, I plead guilty to not only amending and expanding existing files but also echoing file permissions. When it came to read-write-execute or ACE metadata, I was definitely a member of the herd, following what someone else had done—that is, until I started at Varonis.

There’s an IT moral to all this. Your user community is, unfortunately, propagating the “everyone” group or other harmful ACEs, and also unknowingly helping to push files into the red-zone of the file size curve.

For my money, herding behaviors alone are reason enough to use Varonis’s DatAdvantage to really understand and manage your organization’s networked file systems. A file system and its community of users form a kind of social network in which it is quite easy to amplify bad habits.

So you’ll want Varonis’s software to automatically spot these patterns and then take more direct control over shaping your file system’s overall profile.


Introducing Varonis Data Transport Engine

September 6, 2012

For years, Varonis customers have been using Varonis DatAdvantage and the IDU Classification Framework to find data sets that they want to move or delete—stale data, active data, sensitive data, data belonging to department X or Y. Being able to easily find data based on permissions, activity, content, and other metadata accelerates lots of common IT data projects like migrations, mergers & acquisitions, archival, and disposition.

What would make it even easier? What if you could automatically copy, move, or delete data once you find it, without downtime, across domains or across platforms? What if you could automatically translate and optimize the permissions during a move, and simulate the move to see and edit the new directory and permissions structure before executing?

Now you can. Check out the new Varonis Data Transport Engine.


