Amazon today announced a new security service built to identify, classify, and protect sensitive data stored in AWS from leaks, breaches, and unauthorized access, with Amazon Simple Storage Service (S3) being the initial data store.
S3 appeals to organizations because of its simplicity: It's easy for users to sort their software and services data into "buckets" in the cloud. The catch is that it's equally easy for users to misconfigure permissions and leave data exposed, as evidenced by high-profile data leaks affecting Verizon, the WWE, the Republican National Committee, and Scottrade earlier this year.
Back in June, millions of voter records were leaked from an unsecured AWS S3 bucket storage account owned by Deep Root Analytics, which performed work on behalf of the Republican National Committee. Permissions had been set to public instead of private, making data files publicly accessible; in some cases, the records could also be downloaded.
One month later, a data leak at Dow Jones & Co. exposed the personal data of millions of customers after S3 settings had been configured to let any AWS Authenticated User download data using the bucket's URL. "Authenticated user" covers anyone who has a free AWS account, so the data was accessible to more than one million people.
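Both misconfigurations described above come down to a bucket access control list (ACL) that grants access to a broad predefined group. The following is a minimal sketch of how such grants could be spotted, assuming the ACL is available as a dictionary shaped like the response of an S3 "get bucket ACL" call. The function name and the sample ACL are hypothetical, and this is not how Macie itself works.

```python
# Predefined S3 group URIs for "everyone" and "anyone with an AWS account".
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
BROAD_GROUPS = {ALL_USERS: "public", AUTHENTICATED_USERS: "any AWS account"}


def find_broad_grants(acl):
    """Return (audience, permission) pairs for grants made to broad groups."""
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        uri = grantee.get("URI")
        if grantee.get("Type") == "Group" and uri in BROAD_GROUPS:
            findings.append((BROAD_GROUPS[uri], grant.get("Permission")))
    return findings


# Hypothetical ACL: the owner has full control, and any AWS account can read.
example_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group", "URI": AUTHENTICATED_USERS},
         "Permission": "READ"},
    ]
}

print(find_broad_grants(example_acl))
```

A bucket whose ACL returns any finding here is readable well beyond the owning organization and deserves a second look.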
Amazon's new Macie service was not created in response to this year's S3 leaks, but could help address similar incidents by alerting security teams to events like misconfigured bucket permissions, which led to the Deep Root Analytics leak.
The service finds and classifies data stored in S3, gives each data object a business value, and monitors for suspicious activity based on user authentications to data, times of access, and data access locations, according to Amazon.
Macie runs an engine designed specifically to detect common sources of personally identifiable information (PII) or sensitive personal information (SPI), Amazon's Tara Walker said in a blog post announcing the service. It also checks events in AWS CloudTrail for PUT requests in S3 buckets to detect and classify new information. The service also uses machine learning algorithms and natural language processing to automatically classify data objects by file and content type. It shows how data objects are classified and highlights data based on how critical it is for business use, personal use, and compliance.
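To illustrate the idea of watching CloudTrail for new S3 objects, here is a rough sketch that filters CloudTrail records for PutObject events. It assumes object-level (data event) logging is enabled for the bucket and that the records have already been parsed from JSON. The function name and sample records are hypothetical, and the sketch says nothing about how Macie processes these events internally.

```python
def new_s3_objects(records):
    """Yield (bucket, key) for each S3 PutObject event in CloudTrail records."""
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        if record.get("eventName") != "PutObject":
            continue
        # requestParameters can be null in CloudTrail, so fall back to {}.
        params = record.get("requestParameters") or {}
        bucket, key = params.get("bucketName"), params.get("key")
        if bucket and key:
            yield bucket, key


# Hypothetical records: one upload, one download, one non-S3 event.
sample_records = [
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "requestParameters": {"bucketName": "example-bucket", "key": "hr/payroll.csv"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "requestParameters": {"bucketName": "example-bucket", "key": "hr/payroll.csv"}},
    {"eventSource": "iam.amazonaws.com", "eventName": "CreateUser",
     "requestParameters": None},
]

for bucket, key in new_s3_objects(sample_records):
    print(f"new object to classify: s3://{bucket}/{key}")
```

Each (bucket, key) pair that comes out of the filter is a candidate for classification.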
Data is assigned a risk level ranging from 1 (lowest risk) to 10 (highest risk). Macie's dashboard groups data into high-risk S3 objects (those with risk levels 8-10), total event occurrences since Macie was enabled, and total user sessions. Users can define and customize automated remediation actions, such as triggering password reset policies, based on activity.
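The risk scale is simple enough to sketch. The snippet below picks out the high-risk group (levels 8-10) from a mapping of object keys to risk levels. The mapping, the object keys, and the function name are made up for illustration, and the levels themselves would come from Macie's own classification.

```python
RISK_MIN, RISK_MAX = 1, 10
HIGH_RISK_MIN = 8  # the dashboard's high-risk group covers levels 8-10


def high_risk_objects(risk_by_object):
    """Return the sorted keys of objects whose risk level is 8 or higher."""
    for key, level in risk_by_object.items():
        if not RISK_MIN <= level <= RISK_MAX:
            raise ValueError(f"risk level out of range for {key}: {level}")
    return sorted(k for k, v in risk_by_object.items() if v >= HIGH_RISK_MIN)


# Hypothetical classification results.
example_risks = {
    "public/logo.png": 1,
    "src/deploy_keys.txt": 10,
    "hr/payroll.csv": 8,
    "docs/roadmap.docx": 5,
}

print(high_risk_objects(example_risks))
```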
After Macie sets a baseline for the organization's sensitive data, it monitors for activity that could indicate risky behavior.
Users are alerted to suspicious behavior that could put information at risk; for example, if large quantities of source code are downloaded by a user account that doesn't usually access the data. The same would happen if there were sudden changes in the permissions of Amazon S3 buckets, or if API keys were uploaded into source code.
"By using machine learning to understand the content and user behavior of each organization, Amazon Macie can cut through huge volumes of data with better visibility and more accurate alerts, allowing customers to focus on securing their sensitive information instead of wasting time trying to find it," AWS CISO Stephen Schmidt said in a statement.
Macie can send findings to Amazon CloudWatch Events, and later this year it will support API endpoints through the AWS SDK so it can integrate with third-party tools. Planned integrations include providers like Palo Alto Networks, Trend Micro, and Splunk.