Unveiling how porn platforms abuse personal data
with Tracking Exposed and GDPR
* * *
ALESSANDRO POLIDORO - GAETANO PRIORI - GIULIA CORONA - GIULIA GIORGI
to know more
see also our research
poTREX extension
Free software
pornhub.tracking.exposed
we analyze platforms
* * *
NOT PEOPLE
The TREX browser extension collects data in .json and .csv formats in order to decipher how proprietary algorithms work, in the public interest.
How platforms conceptualise gender has broader effects, as it reifies a specific, socially embedded cultural conception that is able to shape, affect, and maintain gender identities.
Why Pornhub?
→ Pornhub has established itself as the main gateway to pornographic digital content
→ The ‘platformization’ of porn culture
- 1. Datafication of users
- 2. Standardization of content based on criteria of popularity and predictability
- 3. Targeted content and ads (Paasonen, 2019)
→ ‘Year in Review’ (Pornhub Insights 2021): the transparency narrative
methodology
* * *
OUR RESEARCH IN BRIEF:
OUR MAIN ARGUMENT
Data show how the platform leverages affordances and algorithmic suggestions to build fixed and limited user gender identities, contributing to the reiteration of a heteronormative perspective on sexual desire and sexuality typical of a heterosexual, white, and hegemonic masculinity.
- Heteronormative
- White
- Masculine
WHAT COMES NEXT?
- Raise awareness about topics like GDPR, algorithms, and consent
- Hold the platform accountable for its lack of transparency about the tracking and use of users’ personal and behavioural data
- Call for action through TrEx’s free tools
algorithms as social policies
* * *
BIASED BLACK BOXES
Technology is social before it is technical. —Gilles Deleuze
Algorithmic systems have been criticized for perpetuating bias and discrimination and for contributing to inequality. Data and information are collected asymmetrically, which generates surveillance. Users' data are exploited.
from the movie: Brazil, 1985 by Terry Gilliam
SCRAPISM WITH POTREX
the practice of web scraping for artistic, emotional, and critical ends. [Sam Lavigne]
- 01. HTML websites: unstructured data
- 02. Potrex extension: scraping tool
- 03. Structured data: JSON/CSV
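As a minimal sketch of step 03, flat JSON records can be serialised to CSV as shown below. The helper name `toCsv` and the sample records are hypothetical and are not taken from the poTREX codebase; the quoting follows the usual RFC 4180 convention.

```javascript
// Hypothetical helper: serialise an array of flat objects to CSV text.
function toCsv(records, columns) {
  const escape = (value) => {
    const text = value === undefined || value === null ? '' : String(value);
    // Quote fields that contain separators, quotes or newlines.
    return /[",\n\r]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };
  const header = columns.map(escape).join(',');
  const rows = records.map((r) => columns.map((c) => escape(r[c])).join(','));
  return [header, ...rows].join('\n');
}

// Illustrative records only, not collected data.
const sample = [
  { title: 'Video A', duration: '17:15', sectionName: 'Hot, in US' },
  { title: 'Video "B"', duration: '03:02', sectionName: 'Recommended' },
];
const csv = toCsv(sample, ['title', 'duration', 'sectionName']);
```

Passing the column list explicitly keeps the column order stable even when some records lack a field.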
CSV|JSON structure
* * *
DATA FORMAT - HOMEPAGE
Each entry represents a video proposed by Pornhub.
These are the video snippets you might click on while visiting the platform.
{
  "title": "Sunny Sextape on the Sofa! Squirt, deepthroat",
  "authorName": "Leolulu",
  "authorLink": "/pornstar/leolulu",
  "duration": "17:15",
  "href": "/view_video.php?viewkey=ph5e18b11299830",
  "savingTime": "2020-01-19T22:18:10.522Z",
  "metadataId": "738c411c67c7b6107bbb3ff8631070011a814f48",
  "clientTime": "2020-01-19T22:17:48.000Z",
  "size": 421227,
  "randomUUID": "INITucmr5condtj2zkfy9o6cv4",
  "selector": "body",
  "incremental": 0,
  "amountGrossDimension": 0,
  "packet": 0,
  "type": "home",
  "processed": true,
  "step": 0,
  "session": 1,
  "pseudo": "blueberry-cake-pistachio",
  "sectionName": "Hot Porn Videos In United States",
  "sectionHref": "/video?o=ht&cc=us",
  "sectionOrder": 0,
  "displayOrder": 0
}
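To illustrate how such entries can be read back into a homepage layout, here is a minimal sketch. It assumes that `sectionOrder` and `displayOrder` are zero-based positions of the section on the page and of the video within its section; the function name `groupBySection` and the sample entries are hypothetical.

```javascript
// Rebuild the homepage layout from flat entries: sections sorted by
// sectionOrder, videos within each section sorted by displayOrder.
function groupBySection(entries) {
  const sections = new Map();
  for (const e of entries) {
    if (!sections.has(e.sectionName)) {
      sections.set(e.sectionName, { order: e.sectionOrder, videos: [] });
    }
    sections.get(e.sectionName).videos.push(e);
  }
  return [...sections.entries()]
    .sort((a, b) => a[1].order - b[1].order)
    .map(([name, s]) => ({
      name,
      hrefs: s.videos
        .sort((a, b) => a.displayOrder - b.displayOrder)
        .map((v) => v.href),
    }));
}

// Illustrative entries with placeholder hrefs, not real collected data.
const entries = [
  { sectionName: 'Recommended For You', sectionOrder: 1, displayOrder: 0, href: '/v/c' },
  { sectionName: 'Hot', sectionOrder: 0, displayOrder: 1, href: '/v/b' },
  { sectionName: 'Hot', sectionOrder: 0, displayOrder: 0, href: '/v/a' },
];
const layout = groupBySection(entries);
```

Comparing the `layout` of two profiles section by section is one way to see where the page is shared and where it is personalized.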
METHODOLOGY
‘gender’ and ‘sexual orientation’ are defined by the platform
- Videos per homepage: 46
- Homepages: 1600
- Videos: 45,959
- Reliability: 99.1%
- Unique videos: 118
Data collection relied on the ‘Pornhub Tracking Exposed’ (poTREX) infrastructure, which collects and processes data from Pornhub.com web pages, such as page layout, video order, titles and views, authors, categories, and more.
This data collection helped us to determine potential recurring patterns, especially regarding the underlying logics governing the different sections of the homepage.
Observing
- Homepage: it keeps changing for different users
- Sections: geographically or individually personalized
- Recommended: different gender identities have different recommendations
homepages layout
* * *
COMMON SECTIONS
The homepage is not completely individually personalized.
The majority of the sections propose the same videos to all users.
This is the case for:
· Hot Porn Videos in Your Country
· Most Viewed Videos in Your Country
· Recently Featured XXX Videos
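One way to check whether a section shows the same videos to different profiles is to compare the sets of videos each profile received. The sketch below uses Jaccard similarity for this; it is an illustrative choice, not necessarily the measure used in the research, and the profile names and video ids are made up.

```javascript
// Jaccard similarity between two lists of video ids
// (0 = no video in common, 1 = identical sets).
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  const shared = [...setA].filter((x) => setB.has(x)).length;
  const union = new Set([...setA, ...setB]).size;
  return union === 0 ? 1 : shared / union;
}

// Hypothetical observations: section name -> profile -> video ids shown.
const observed = {
  'Hot Porn Videos in Your Country': {
    profileA: ['v1', 'v2', 'v3'],
    profileB: ['v1', 'v2', 'v3'],
  },
  'Recommended For You': {
    profileA: ['v4', 'v5'],
    profileB: ['v5', 'v6'],
  },
};

const similarity = {};
for (const [section, byProfile] of Object.entries(observed)) {
  similarity[section] = jaccard(byProfile.profileA, byProfile.profileB);
}
```

A section with similarity close to 1 across all profile pairs behaves as a common section; lower values point to personalization.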
personalized content
* * *
RECOMMENDED CATEGORY FOR YOU
Not all 10 profiles shared the same 5 sections
The cluster seems to reflect gender-normativity. This is especially relevant considering that this specific section is missing for the Same Sex Couple (female), Non-Binary, Trans Female, and Trans Male profiles.
personalized content
* * *
RECOMMENDED FOR YOU
Common for all profiles
The gender-normative group showcases models, channels, and pornstars; the second group doesn't include channels (production companies). Does Pornhub manage content in relation to gender identity by also factoring in broader production and distribution logics?
“The result of this deeply male-dominated culture is that the male experience, the male perspective, has come to be seen as universal, while the female experience - that of half the global population, after all - is seen as, well, niche.”
Caroline Criado Perez, Invisible Women: Data Bias in a World Designed for Men, 2019
how do our tools work
* * *
POTREX CONTINUED
Gathering data from volunteers is essential for algorithm analysis.
But sometimes, to prove a certain bias, we have to start from a clean environment.
WHAT ARE WE LOOKING FOR?
→ Third Party Trackers
→ Recommended Content
→ Corn?
tool
* * *
Guardoni.js
In the potrex GitHub repository you can find our automation script at methodology/bin/guardoni.js
And what can it do for us?
LABORATORY SETUP
It can automate some boring tasks for us:
- 01. Install the potrex extension on a clean browser
- 02. Initialize a new profile if needed
- 03. Navigate through the website
- 04. Harvest tracking data and screenshots
PUPPETEER 101
* * *
From its GitHub repository:
const puppeteer = require('puppeteer');

(async () => {
  // start a headless browser and open a new tab
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // load the page and save a screenshot of it
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });

  await browser.close();
})();
AN EXPERIMENT SAMPLE
* * *
By an experiment we generally mean a set of commands that automates the navigation of our bot.
This includes a landing page, a list of videos to be watched, and the time to be spent on each page.
In the potrex repository, in methodology/json/phase1/guardoni2.json, we can see an excerpt:
[{
  "name": "Homepage first",
  "url": "https://www.pornhub.com/",
  "loadFor": "17s",
  "screenshotAfter": "5s"
},{
  "name": "Hentai-1",
  "loadFor": "3m",
  "screenshotAfter": "5s",
  "url": "https://www.pornhub.com/view_video.php?viewkey=ph604a68bb36307"
},{
  "name": "Hentai-2",
  "loadFor": "3m",
  "url": "https://www.pornhub.com/view_video.php?viewkey=ph604a625b562ea"
},{ ...
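How guardoni itself interprets these fields is not shown here, so the following is only a sketch under stated assumptions: `loadFor` and `screenshotAfter` are read as durations with an `s` (seconds) or `m` (minutes) suffix, and a step without `screenshotAfter` takes no screenshot. The helper names `toMilliseconds` and `planExperiment` are hypothetical, and the URLs are placeholders.

```javascript
// Convert strings such as "17s" or "3m" into milliseconds.
// Assumption: only the "s" and "m" suffixes seen in the excerpt are handled.
function toMilliseconds(text) {
  const match = /^(\d+)([sm])$/.exec(text);
  if (!match) throw new Error('Unsupported duration: ' + text);
  const amount = Number(match[1]);
  return match[2] === 'm' ? amount * 60000 : amount * 1000;
}

// Turn experiment steps into a plan with numeric timings.
function planExperiment(steps) {
  return steps.map((step) => ({
    name: step.name,
    url: step.url,
    loadForMs: toMilliseconds(step.loadFor),
    screenshotAfterMs: step.screenshotAfter
      ? toMilliseconds(step.screenshotAfter)
      : null,
  }));
}

// Steps shaped like the excerpt, with placeholder URLs.
const plan = planExperiment([
  { name: 'Homepage first', url: 'https://example.com/', loadFor: '17s', screenshotAfter: '5s' },
  { name: 'Video-1', url: 'https://example.com/video', loadFor: '3m' },
]);
```

Rejecting unknown suffixes with an error, rather than silently defaulting, makes a typo in an experiment file visible before the bot starts browsing.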
CODE SAMPLES
* * *
From methodology/src/domainSpecific.js
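you can see that we save everything from localStorage, and the cookies as well:
// copy every localStorage key/value pair into the json object
for (let i = 0; i < localStorage.length; i++) {
  const key = localStorage.key(i);
  json[key] = localStorage.getItem(key);
...
// ask the browser for all cookies through the DevTools protocol
const cookies = await page._client.send('Network.getAllCookies');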
CODE SAMPLES
* * *
...or how third-party trackers are saved
const up = url.parse(reqpptr.url());
// skip first-party requests (the site itself and its CDN)
if (_.endsWith(up.host, 'pornhub.com') || _.endsWith(up.host, 'phncdn.com'))
  return;
const full3rdparty = {
  method: reqpptr.method(),
  host: up.host,
  pathname: up.pathname,
  ...
// keep the request body for non-GET requests
if (full3rdparty.method != 'GET')
  full3rdparty.postData = reqpptr.postData();
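The filtering idea in the snippet above can be tried without a browser. The sketch below is a standalone variant, not the repository's code: it uses Node's built-in `URL` class instead of `url.parse` and lodash, and it matches first-party domains on a dot boundary, which is stricter than a plain suffix check (a host such as `notpornhub.com` counts as third party here). The function names and sample URLs are hypothetical.

```javascript
// First-party domains, as listed in the snippet above.
const FIRST_PARTY = ['pornhub.com', 'phncdn.com'];

// True when the host is neither a first-party domain nor one of its subdomains.
function isThirdParty(requestUrl) {
  const host = new URL(requestUrl).hostname;
  return !FIRST_PARTY.some((d) => host === d || host.endsWith('.' + d));
}

// Count third-party requests per host.
function countThirdParties(urls) {
  const counts = {};
  for (const u of urls.filter(isThirdParty)) {
    const host = new URL(u).hostname;
    counts[host] = (counts[host] || 0) + 1;
  }
  return counts;
}

// Placeholder request URLs for illustration.
const counts = countThirdParties([
  'https://www.pornhub.com/',
  'https://ci.phncdn.com/x.jpg',
  'https://tracker.example/pixel?id=1',
  'https://tracker.example/pixel?id=2',
  'https://notpornhub.com/',
]);
```

Aggregating by host gives a quick overview of which external parties a page contacts and how often.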
Anecdotal evidence suggests that there are significant differences between the third-party organizations operating in the porn and the regular web tracking industry as large online ad networks such as Google Ads set strict constraints for porn-related publishers, prohibiting the advertising of adult-oriented products and services. These restricting terms of services [...] opened new market opportunities for other actors who have specialized in providing advertising and tracking technologies to adult sites. This context has created, as a result, a parallel ecosystem of third-party service providers in the porn ecosystem who has not been scrutinized by regulators, policy makers, and the research community.
Tales from the Porn: A Comprehensive Privacy Analysis of the Web Porn Ecosystem, 2019
How did we get here?
* * *
The “Legal Vacuum”
A legal context in which things are not clear, there is no applicable law or in which some injustice is uncorrected
ABOUT THE LEGAL VACUUM
- Lack of norms (at every level)
- Lack of concepts (categories, principles)
- Lack of political will (controversial and complex topic)
In the words of Ludwig Mies van der Rohe “Less is more”?
In the European context, some precious tools are already filling the void
* * *
Legally speaking, what about porn platforms?
- “I shall not today attempt further to define the kinds of material […] but I know it when I see it” famous quote by Potter Stewart
- Producing, selling, and possessing pornography is still illegal in many nations
- There has been a significant effort to stop child pornography and non-consensual pornography
- We can see a growing effort to prevent underage users from accessing online pornography
- An interesting debate has sparked over the addictive nature of mainstream pornographic content
SOME OPEN ISSUES
- sex workers' and prostitutes' rights
- privacy
- role of pornography in our society
POSSIBLE APPROACHES
- Policy
- Advocacy
- Litigation
reactive and proactive
* * *
1. POLICY
- currently most of the power and agency over this field is in the hands of the private sector
- there are many projects that try to offer alternatives to mainstream online pornography, embedding values like feminism, gender non-conformity, sex worker empowerment, and more
- public opinion does not seem to see this issue as a priority, partly because of a cultural taboo
reactive and proactive
* * *
2. ADVOCACY
- this topic is often not welcomed by political institutions
- pushing for openness on these issues may escalate into a moral panic
- a blind fight against obscenity and other forms of extremism generates a toxic debate
only reactive
* * *
3. LITIGATION
- It seems to be the most attainable approach at present
- It brings together law and technology specialists during evidence acquisition and reporting
- It involves judges and Data Protection Authorities in shedding light on the legal vacuum
feedback
* * *
WHAT ABOUT YOU
- Are you aware of any issues?
- What would you like to investigate?
- How? Which methodology would you use?
mail: team[@]tracking.exposed