This is significant work done by Princeton researchers. It’s honestly a pretty damning indictment of the world’s most visited websites.
Most people who’ve spent time on the internet have some understanding that many websites log their visits and keep a record of what pages they’ve looked at. When you search for a pair of shoes on a retailer’s site, for example, the site records that you were interested in them. The next day, you see an advertisement for the same pair on Instagram or another social media site.
The idea of websites tracking users isn’t new, but research from Princeton University released last week indicates that online tracking is far more invasive than most users understand. In the first installment of a series titled “No Boundaries,” three researchers from Princeton’s Center for Information Technology Policy (CITP) explain how third-party scripts that run on many of the world’s most popular websites track your every keystroke and then send that information to a third-party server.
Some heavily trafficked sites run software that records every click and every word you type. If you go to a website, begin to fill out a form, and then abandon it, every letter you entered is still recorded, according to the researchers’ findings. If you accidentally paste something from your clipboard into a form, it’s also recorded. Facebook users were outraged in 2013 when it was discovered that the social network was doing something similar with status updates: it recorded what users typed, even if they never ended up posting it.
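To make the abandoned-form case concrete, here is a minimal simulation, not code from any real replay product. The `ReplayRecorder` class, its `on_input` method, and the sample email address are all hypothetical; the sketch only assumes that a recorder reacts to each input event rather than waiting for the form to be submitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReplayRecorder:
    """Hypothetical stand-in for a third-party session replay script."""
    sent_to_third_party: List[Dict[str, str]] = field(default_factory=list)

    def on_input(self, field_name: str, value: str) -> None:
        # Called on every keystroke or paste, not on form submission.
        self.sent_to_third_party.append({"field": field_name, "value": value})


def type_into(recorder: ReplayRecorder, field_name: str, text: str) -> None:
    """Simulate a user typing `text` one character at a time."""
    typed = ""
    for ch in text:
        typed += ch
        recorder.on_input(field_name, typed)


recorder = ReplayRecorder()
type_into(recorder, "email", "ann@example.com")

# The user closes the tab here; the form is never submitted.
form_submitted = False
```

In this sketch the full address is already in `recorder.sent_to_third_party` even though `form_submitted` stays `False`, which is the situation the researchers describe.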
These scripts, or bits of code that websites run, are called “session replay” scripts. Session replay scripts are used by companies to gain insight into how their customers are using their sites and to identify confusing webpages. But the scripts don’t just aggregate general statistics; they record individual browsing sessions and can play them back. The scripts don’t run on every page, but are often placed on pages where users input sensitive information, like passwords and medical conditions.
Most troubling is that the information session replay scripts collect can’t “reasonably be expected to be kept anonymous,” according to the researchers. Some of the companies that provide this software, like FullStory, design tracking scripts that even allow website owners to link the recordings they gather to a user’s real identity. On the backend, companies can see that a user is connected to a specific email or name. FullStory did not respond to a request for comment.
Companies that sell replay scripts do offer a number of redaction tools that allow websites to exclude sensitive content from recordings, and some even explicitly forbid the collection of user data. Still, the use of session replay scripts by so many of the world’s most popular websites has serious privacy implications.
“Collection of page content by third-party replay scripts may cause sensitive information such as medical conditions, credit card details, and other personal information displayed on a page to leak to the third-party as part of the recording,” the researchers wrote in their post.
Passwords are often accidentally included in recordings, even though the scripts are designed to exclude them. The researchers found that other personal information was also often not redacted, or only partially redacted, at least with some of the scripts. Two of the companies, UserReplay and SessionCam, block all user inputs by default (they just track where users are clicking), which is a far safer approach.
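The difference between the two redaction approaches can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor’s actual logic: the function names, the form layout, and the scenario (a password sitting in a plain text field, for instance after a “show password” toggle) are all hypothetical.

```python
from typing import Dict, Tuple

# Each form field maps to (declared input type, current value).
Form = Dict[str, Tuple[str, str]]


def redact_by_type(fields: Form) -> Dict[str, str]:
    """Denylist approach: mask only inputs declared as type 'password'."""
    return {
        name: ("*" * len(value) if input_type == "password" else value)
        for name, (input_type, value) in fields.items()
    }


def redact_all(fields: Form) -> Dict[str, str]:
    """Default-deny approach: mask every user input, whatever its type."""
    return {name: "*" * len(value) for name, (_type, value) in fields.items()}


form: Form = {
    "email": ("email", "ann@example.com"),
    # Assumed scenario: the password is held in a plain text field.
    "password": ("text", "hunter2"),
}

leaky = redact_by_type(form)   # the password passes through unmasked
safe = redact_all(form)        # nothing the user typed is recorded
```

A type-based filter only catches what it has been told to look for, so anything that doesn’t match slips into the recording. Blocking every input by default fails in the safer direction.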
Finally, the study’s authors are worried that session replay companies could be vulnerable to targeted hacks, especially because they’re likely high-value targets. For example, many of these companies have dashboards where clients can play back the recordings they collect.
It’s not just session replay scripts that are following you around the internet. A study published earlier this year found that nearly half of the world’s 1,000 most popular websites use the same tracking software to monitor your behavior in various ways.
If you want to block session replay scripts, popular ad-blocking tool AdBlock Plus will now protect you against all of the ones documented in the Princeton study. AdBlock Plus formerly only protected against some, but has now been updated to block all as a result of the researchers’ work.