Cloud | 5/23/2019 02:30 PM
Google's Origin & the Danger of Link Sharing

How the act of sharing links to files stored in a public cloud puts organizations at risk, and what security teams can do to safeguard data and PII.

Some of us "seasoned" computer science professionals recall the early days of computing, before the Web and before PageRank, the key algorithmic innovation that enabled Google to grow to its current mammoth scale. Much has been written about Google's history and the rise of effective web search engines that ranked pages so users could easily find the most relevant information.

At the time, some in the computer science community concerned with security and privacy expressed fears that Google's web crawling and indexing might be illegal. Certainly, copyright issues would be in play if wholesale copying of web content wasn't permissible. Many of these issues were resolved over the years through agreed-upon rules of the road that permit crawling, page analysis, and indexing under policies and terms of service announced by webmasters. In a perfect Internet, all would be well.

Today, web crawling is continuous and ubiquitous, and its scope has broadened from web pages to general Internet searches and file shares. The downside is that Google searches can also capture and index files and data exposed in cloud shares. Alongside the many legitimate web crawlers that adhere to the rules in robots.txt, there are malicious crawlers that ignore those directives and scan and probe, sometimes successfully, to capture cloud-shared documents. It may not be immediately apparent when a spider has visited a cloud share; after all, you won't notice that your website has been crawled unless you explicitly look for it.
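For reference, that advisory mechanism looks something like the sketch below: a hypothetical robots.txt served at a site root asks crawlers to stay out of a shared-files path (the /shares/ path is invented for illustration). Well-behaved spiders honor these directives; malicious crawlers simply ignore them, which is why robots.txt should never be treated as an access control.

```
# Hypothetical robots.txt served at the site root.
# Compliant crawlers fetch and obey this before indexing.
User-agent: *          # applies to all crawlers
Disallow: /shares/     # ask crawlers not to index this path

# Note: this is advisory only. It also advertises that the path
# exists, so it is no substitute for real access controls.
```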

This is why it pays to be proactive. We experienced a related incident firsthand at Columbia University, where I work as a computer science professor. Long ago, before there were so many regulations around protecting personally identifiable information, student Social Security numbers were used as the unique identifier when entering a housing lottery for securing a dorm room on campus. The files associated with this lottery were then stored in the cloud and forgotten. That is, until Google's indexing made the Social Security numbers public and searchable, creating an incident years after the files were stored and the students had moved on from the university. The university's security team was able to remove the links and has since spent more time educating its faculty and students on data privacy best practices. It has also set up a scanning system to monitor for any instances of students' Social Security numbers being shared.
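The article doesn't describe the university's scanner, but the core idea is easy to sketch: walk a set of files and flag anything matching the Social Security number pattern. Everything below, including the directory path and function names, is hypothetical.

```python
import os
import re

# Matches the common SSN format NNN-NN-NNNN; a production scanner would
# also check unformatted 9-digit runs and validate area-number ranges.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scan_file(path):
    """Return any SSN-like strings found in a text file."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return SSN_PATTERN.findall(f.read())

def scan_tree(root):
    """Walk a directory tree and report files containing SSN-like data."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            hits = scan_file(path)
            if hits:
                print(f"{path}: {len(hits)} possible SSN(s)")

scan_tree("/data/cloud_share_mirror")  # hypothetical local copy of the share
```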

It is these types of incidents that drove the university to take precautions, update security policies, and anticipate risks related to Google indexing and link sharing. Just recently, data from more than 90 companies, Box itself among them, was exposed through Box accounts because employees shared links publicly.

How can security teams understand just how pervasive link-sharing risks are in their organizations? First, administrators should make sure the default access setting for shared links is "people in your company" to reduce accidental exposure of data to the public. Second, security policies for cloud-resident data should mirror any policies that apply to data stored on premises. That includes policies about downloading or sharing certain kinds of sensitive data, as well as encryption of sensitive data.
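As an illustration of the audit side, here is a minimal sketch using the Box Python SDK (boxsdk), assuming an already-authorized client; it walks a folder and flags items whose shared link is set to "open" (public) rather than "company". Treat the field handling as an assumption to verify against the SDK version you run.

```python
from boxsdk import OAuth2, Client

# A bare developer token is assumed for brevity; production code would
# use a proper OAuth2 or JWT app authorization flow.
auth = OAuth2(client_id="...", client_secret="...", access_token="...")
client = Client(auth)

# Folder "0" is the Box root; request the shared_link field explicitly.
for item in client.folder(folder_id="0").get_items(fields=["name", "shared_link"]):
    link = getattr(item, "shared_link", None)
    if link and link.get("access") == "open":
        print(f"PUBLIC LINK: {item.name} -> {link.get('url')}")
```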

Defenders typically resort to cloud log analysis to determine the extent of the problem. Such analytics can alert personnel to possibly misconfigured cloud share access controls, or to user security violations in which a shared link gives an interested spider access to a broad collection of documents.

The log analytics aren't easy to do, but generally, capturing all events, including time stamps, source IPs, agent strings, and URLs requested, is the basic starting point. There are numerous products available to assist in the process, for example, tools that trace and resolve source IPs and that analyze the timing of requests. Being alert to spiders is important, but once a spider has done its job and the shared documents have been exposed, what's next?
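A starting point needs nothing exotic. The sketch below parses Apache-style combined-log lines (a format assumption; each cloud provider has its own event schema, but the same basic fields are there) and flags self-identified crawler user agents plus unusually chatty source IPs.

```python
import re
from collections import Counter

# The Apache/nginx "combined" log format is assumed for illustration;
# cloud share audit logs carry the same basic fields (time, IP, agent, URL).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
CRAWLER_HINTS = ("bot", "spider", "crawler")  # catches self-identified agents only

def analyze(lines, rate_threshold=100):
    requests_per_ip = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue
        requests_per_ip[m["ip"]] += 1
        agent = m["agent"].lower()
        if any(hint in agent for hint in CRAWLER_HINTS):
            print(f"crawler hit: {m['ip']} fetched {m['url']} ({m['agent']})")
    # High request volume from a single source can indicate scraping.
    for ip, count in requests_per_ip.most_common():
        if count > rate_threshold:
            print(f"high-volume source: {ip} made {count} requests")

with open("access.log") as f:  # hypothetical log file path
    analyze(f)
```

Malicious crawlers rarely announce themselves in the agent string, which is why the volume and timing signals matter as much as the agent check.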

At that point, once a spider has scanned and indexed the files in the cloud share, the data owner has lost the ability to control access to them; in essence, all bets are off. So, the immediate questions security teams need to answer are: What was lost? Who is affected? Who is responsible? How did it get lost? Can it be prevented from happening again?

Cloud log analysis can help answer some of these questions. Appropriate mitigation actions in a case like this also include shutting down credentials for the person who shared the link, revoking user access to cloud-resident files, folders, or cloud shares, and, in some cases, decommissioning a public cloud folder and reconfiguring security settings for future files. That is how some of the organizations involved in the Box data leak responded.
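Some of these mitigations can be scripted. Continuing the earlier boxsdk sketch, under the same authorization assumptions and with placeholder IDs, revoking an exposed link and cutting off the over-sharing user are each roughly one call:

```python
# Continuing the boxsdk sketch above; the IDs are hypothetical placeholders.

# Revoke the public shared link on an exposed file.
client.file(file_id="123456789").remove_shared_link()

# Deactivate the account of the user who shared the link, pending review.
client.user(user_id="987654321").update_info(data={"status": "inactive"})
```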

At some point in the near to distant future, the information in cloud activity logs could be automatically analyzed using artificial intelligence, machine learning, or other technologies to lessen the workload of security professionals. Rather than spending resources digging through cloud logs, it may be possible to send teams real-time notifications when cloud security policies are violated, or when unsanctioned users open or download cloud-resident files that weren't meant for them.
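As a toy illustration of that direction (not any product's method), one could summarize each user's daily activity into a few features and let an off-the-shelf anomaly detector, such as scikit-learn's IsolationForest, surface outliers for review. The features and data below are invented for the sketch.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Invented per-user, per-day features: [downloads, distinct files touched,
# links shared, fraction of activity outside business hours].
rng = np.random.default_rng(seed=0)
normal = rng.normal(loc=[20, 10, 2, 0.1], scale=[5, 3, 1, 0.05], size=(500, 4))
suspicious = np.array([[400, 250, 40, 0.9]])  # mass download, mostly off-hours
activity = np.vstack([normal, suspicious])

detector = IsolationForest(contamination=0.01, random_state=0).fit(activity)
flags = detector.predict(activity)  # -1 marks anomalies

for row in activity[flags == -1]:
    print("review this user-day:", np.round(row, 2))
```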

Dr. Salvatore Stolfo is the founder and CTO of Allure Security. As a professor of artificial intelligence at Columbia University since 1979, Dr. Stolfo has spent a career figuring out how people think and how to make computers and systems think like people.
Comments
Dr.T | 5/29/2019 | 7:52:42 AM | Good questions
"At that point, once a spider has scanned and indexed the files in the cloud share, the data owner has lost the ability to control access to it; in essence, all bets are off. So, the immediate questions security teams need to know are: What was lost? Who is affected? Who is responsible? How did it get lost? Can it be prevented from happening again?" All good questions; as a security expert, you would be expected to know the answers, though.
Dr.T | 5/29/2019 | 7:50:53 AM | Re: Frightening thought too
"And also, if you are hunting, give hunting employers an entirely wrong impression of you particularly if you have MOVED as I have." Agreed. Keep the information current to avoid unnecessary trouble.
Dr.T | 5/29/2019 | 7:48:54 AM | Near future
"At some point in the near to distant future, the information in cloud activity logs could be automatically analyzed using artificial intelligence, machine learning, or other technologies to lessen the workload of security professionals." This makes sense. It is not practical otherwise.
Dr.T | 5/29/2019 | 7:47:14 AM | Re: Frightening thought too
"How MUCH of your old information is OUT THERE?" Good point. I would say more than we thought.
Dr.T | 5/29/2019 | 7:46:17 AM | AI for logs
It makes sense to analyze logs with AI/ML rather than having a human look at them; manual review could not possibly be efficient.
REISEN1955 | 5/23/2019 | 2:49:53 PM | Frightening thought too
How MUCH of your old information is OUT THERE? I just reviewed my LinkedIn profile and I need to modify my personal bio very badly tonight. OK, that is not vital stuff, but tons of old resumes are probably on a dozen websites, and from those one can learn all kinds of things. Also, if you are job hunting, they can give employers an entirely wrong impression of you, particularly if you have MOVED as I have. (I still get job leads from 5 states away.) So consider the amount of old data you have set in place, and consider changing, revising, or deleting it ASAP.