The nature of the experiment was quite simple: we deployed a dedicated web server and created secret and totally unpredictable URLs on it for each tested service, something similar to:
We then used various legitimate features of the tested services (detailed in the table below) to transmit the secret URLs, while carefully monitoring our web server logs for all incoming HTTP requests to see which services followed a secret link that nobody else was supposed to know about or access.
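The setup described above can be sketched roughly as follows. This is only an illustration, not the tooling used in the experiment: the host `https://example.com`, the function names and the token length are all assumptions, and the list of requested paths would in practice be extracted from the web server's access log.

```python
import secrets
from urllib.parse import urlsplit


def make_secret_urls(services, base_url="https://example.com", nbytes=32):
    """Return {service: url} with one unguessable path per tested service.

    base_url is a placeholder host, not the server used in the experiment.
    """
    return {
        name: f"{base_url}/{secrets.token_urlsafe(nbytes)}"
        for name in services
    }


def services_that_visited(secret_urls, requested_paths):
    """Return the set of services whose secret path shows up in the log.

    requested_paths is an iterable of request paths taken from the
    server's access log (log parsing itself is out of scope here).
    """
    path_to_service = {urlsplit(url).path: name
                       for name, url in secret_urls.items()}
    return {path_to_service[p] for p in requested_paths if p in path_to_service}
```

Because each service receives its own random path, any request for that path can be attributed to the single service the URL was shared through.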
During the 10 days of our experiment, we trapped only six services out of the 50. However, among those six were four of the biggest and most used social networks: Facebook, Twitter, Google+ and Formspring. The remaining two were URL shortening services: bit.ly and goo.gl.
While such behavior may be part of the legitimate functionality of URL shortening services, the same should not be true of social networks such as Facebook and Twitter. Since some services may run legitimate robots that automatically crawl every user-transmitted link (e.g. to verify and block spam links), we also created a robots.txt file on our web server that prohibited bots from accessing the server and its content. Only Twitter respected this restriction; all the other social networks simply ignored it and accessed the secret URL.
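To illustrate what respecting such a restriction means, here is a minimal sketch using Python's standard `urllib.robotparser`. The exact contents of the robots.txt file used in the experiment are not given, so the blanket "disallow everything" rule below is an assumption, as are the helper name and the example URL.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents: a rule that forbids every robot from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A well-behaved crawler performs a check like `is_allowed("Twitterbot/1.0", "https://example.com/some-secret-path")` before requesting a page and skips the request when the result is `False`. Note that robots.txt is purely advisory: nothing on the server side prevents a client from ignoring it.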
Below are the HTTP requests from the trapped services that accessed the secret URLs:
bit.ly: IP: 188.8.131.52 User-Agent: bitlybot
Facebook: IP: 184.108.40.206 User-Agent: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
Formspring: IP: 220.127.116.11 User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31
goo.gl: IP: 18.104.22.168 User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.4 (KHTML, like Gecko; Google Web Preview) Chrome/22.0.1229 Safari/537.4
Google+: IP: 22.214.171.124 User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google (+https://developers.google.com/+/web/snippet/)
Twitter: IP: 126.96.36.199 User-Agent: Twitterbot/1.0
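Records in the format listed above can be split into fields with a short script. The sketch below is an assumption-laden illustration: the regular expression only targets the "Service: IP: ... User-Agent: ..." layout used in this list, and the `BOT_MARKERS` heuristic is hypothetical, not a reliable way to detect crawlers.

```python
import re

# Matches the "Service: IP: <address> User-Agent: <string>" layout used above.
RECORD = re.compile(
    r"^(?P<service>[^:]+): IP: (?P<ip>\S+) User-Agent: (?P<ua>.+)$"
)

# Hypothetical substrings suggesting that a client announces itself as a robot.
BOT_MARKERS = ("bot", "externalhit", "google")


def parse_record(line):
    """Split one listing line into a dict with service, ip and ua keys."""
    match = RECORD.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized record: {line!r}")
    return match.groupdict()


def self_identifies(user_agent):
    """Return True if the User-Agent contains one of the assumed bot markers."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in BOT_MARKERS)
```

Applied to the records above, the Twitter and Facebook requests announce themselves as automated clients, whereas the Formspring request carries an ordinary Chrome User-Agent string and cannot be told apart from a regular browser by this field alone.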
Marsel Nizamutdinov, Chief Research Officer at High-Tech Bridge, comments: "The results of this experiment are quite interesting actually. The four trapped social networks justify their activities by “automated verifications”. However, it is technically impossible to verify what is really going on and how the information obtained on the user-transmitted URLs is being used. Today, quite a lot of web applications omit authentication and rely on temporary or unpredictable URLs to hide some content and, when users transfer such URLs via social networks, they cannot be sure that their information will indeed remain confidential. Unfortunately there is no way to keep the URL and its content confidential [if there is no authentication of course] while transferring the URL via social networks."