Unveiling how porn platforms abuse personal data

with Tracking Exposed and GDPR


* * *

ALESSANDRO POLIDORO - GAETANO PRIORI - GIULIA CORONA - GIULIA GIORGI
to know more
see also our research

poTREX extension

Free software pornhub.tracking.exposed

we analyze platforms

* * *

NOT PEOPLE

The TREX browser extension collects data in .json and .csv formats in order to decipher, in the public interest, how proprietary algorithms work.

How platforms conceptualise gender has broader effects, as it reifies a specific, socially embedded cultural conception that is able to shape, affect, and maintain gender identities.

Bivens et al. 2016

Why Pornhub?

  • Pornhub has established itself as the main gateway to digital pornographic content

  • The ‘platformization’ of porn culture

  • 1. Datafication of users
  • 2. Standardization of content based on criteria of popularity and predictability
  • 3. Targeted content and ads (Paasonen, 2019)
  • ‘Year in Review’ (Pornhub Insights 2021): the transparency narrative


methodology

* * *

OUR RESEARCH IN BRIEF:

  • 10 accounts
  • 1,600 variations of Pornhub’s homepage
  • 25,000 videos suggested
  • OUR MAIN ARGUMENT

    Data show how the platform leverages affordances and algorithmic suggestions to build fixed and limited user gender identities, contributing to the reiteration of a heteronormative perspective on sexual desire and sexuality typical of a heterosexual, white, and hegemonic masculinity.

    • Heteronormative
    • White
    • Masculine

    WHAT COMES NEXT?


    • Raise awareness about topics like GDPR, algorithms, consent

    • Hold the platform accountable for its lack of transparency about the tracking and use of users’ personal and behavioural data

    • Call for action through TrEx’s free tools


    algorithms as social policies

    * * *

    BIASED BLACK BOXES

    Technology is social before it is technical. —Gilles Deleuze


    Algorithmic systems have been criticized for perpetuating bias and discrimination and for contributing to inequality. Data and information are collected asymmetrically, which generates surveillance. Users' data are exploited.

    from the movie: Brazil, 1985 by Terry Gilliam

    SCRAPISM WITH POTREX

    the practice of web scraping for artistic, emotional, and critical ends. [Sam Lavigne]

    • 01. HTML Websites

      not structured data

    • 02. Potrex extension

      Scraping tool

    • 03. Structured Data

      JSON/CSV

    CSV|JSON structure

    * * *

    DATA FORMAT - HOMEPAGE

    Each entry represents a video proposed by Pornhub.
    These are the video snippets you might click on while visiting the platform.
          {
            "title": "Sunny Sextape on the Sofa! Squirt, deepthroat",
            "authorName": "Leolulu",
            "authorLink": "/pornstar/leolulu",
            "duration": "17:15",
            "href": "/view_video.php?viewkey=ph5e18b11299830",
            "savingTime": "2020-01-19T22:18:10.522Z",
            "metadataId": "738c411c67c7b6107bbb3ff8631070011a814f48",
            "clientTime": "2020-01-19T22:17:48.000Z",
            "size": 421227,
            "randomUUID": "INITucmr5condtj2zkfy9o6cv4",
            "selector": "body",
            "incremental": 0,
            "amountGrossDimension": 0,
            "packet": 0,
            "type": "home",
            "processed": true,
            "step": 0,
            "session": 1,
            "pseudo": "blueberry-cake-pistachio",
            "sectionName": "Hot Porn Videos In United States",
            "sectionHref": "/video?o=ht&cc=us",
            "sectionOrder": 0,
            "displayOrder": 0
          },
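As an illustration of how entries in this format could be explored, here is a minimal sketch that groups homepage entries by the section they appeared in. The helper name `groupBySection` and the sample entries are hypothetical, and the sketch assumes that `displayOrder` is the position of the video within its section.

```javascript
// Hypothetical helper (not part of potrex): group homepage entries by section.
function groupBySection(entries) {
  const sections = new Map();
  for (const entry of entries) {
    if (!sections.has(entry.sectionName)) sections.set(entry.sectionName, []);
    sections.get(entry.sectionName).push(entry);
  }
  // Assumption: displayOrder is the position of the video inside its section.
  for (const videos of sections.values()) {
    videos.sort((a, b) => a.displayOrder - b.displayOrder);
  }
  return sections;
}

// Made-up entries carrying only the fields used by this sketch.
const sampleEntries = [
  { title: 'video B', sectionName: 'Hot Porn Videos In United States', displayOrder: 1 },
  { title: 'video A', sectionName: 'Hot Porn Videos In United States', displayOrder: 0 },
  { title: 'video C', sectionName: 'Recommended For You', displayOrder: 0 },
];

const grouped = groupBySection(sampleEntries);
for (const [name, videos] of grouped) {
  console.log(name, videos.map((v) => v.title));
}
```

Grouping by `sectionName` first makes it possible to compare the same section across different profiles.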

    METHODOLOGY

    ‘gender’ and ‘sexual orientation’ are defined by the platform

      Data collection relied on the ‘Pornhub Tracking Exposed’ (poTREX) infrastructure, which collects and processes data from Pornhub.com web pages, such as page layout, video order, titles and views, authors, categories, and more.
      This data collection helped us identify potential recurring patterns, especially in the underlying logic governing the different sections of the homepage.

    • Videos per homepage: 46
    • Homepages: 1600
    • Videos: 45,959
    • Reliability: 99.1%
    • Unique videos: 118

    Observing

    • Homepage: it keeps changing for different users
    • Sections: geographically or individually personalized
    • Recommended: different gender identities have different recommendations
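One possible way to quantify such differences is to measure how much two profiles' suggestions overlap. The sketch below computes a Jaccard index over video `href` values. It is only an illustration and not necessarily the measure used in this research; the function name and the sample data are hypothetical.

```javascript
// Hypothetical sketch: Jaccard overlap between the videos shown to two profiles.
// 1 means identical suggestions, 0 means nothing in common.
function jaccard(entriesA, entriesB) {
  const a = new Set(entriesA.map((e) => e.href));
  const b = new Set(entriesB.map((e) => e.href));
  const shared = [...a].filter((href) => b.has(href)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : shared / union;
}

// Made-up suggestions for two profiles.
const profileA = [{ href: '/v1' }, { href: '/v2' }, { href: '/v3' }];
const profileB = [{ href: '/v2' }, { href: '/v3' }, { href: '/v4' }];

const overlap = jaccard(profileA, profileB); // 2 shared out of 4 distinct videos
console.log(overlap);
```

Under this measure, a section that shows the same videos to every profile would score close to 1 across all profile pairs, while a personalized section would score lower.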

    FINDINGS

    * * *

    A small summary

    graphs made with gephi

    homepages layout

    * * *

    COMMON SECTIONS

    The homepage is not completely individually personalized.
    The majority of the sections propose the same videos to all users.
    This is the case for:
    · Hot Porn Videos in Your Country
    · Most Viewed Videos in Your Country
    · Recently Featured XXX Videos

    personalized content

    * * *

    RECOMMENDED CATEGORY FOR YOU

    Not all 10 profiles shared the same 5 sections.
    The clusters seem to reflect gender normativity. This is especially relevant considering that this specific section is missing for the Same Sex Couple (female), Non-Binary, Trans Female, and Trans Male profiles.

    personalized content

    * * *

    RECOMMENDED
    FOR YOU

    Common to all profiles.
    The gender-normative group showcases models, channels, and pornstars; the second group doesn't include channels (production companies). Does Pornhub manage content in relation to gender identity while also factoring in broader production and distribution logics?

    “The result of this deeply male-dominated culture is that the male experience, the male perspective, has come to be seen as universal, while the female experience - that of half the global population, after all - is seen as, well, niche.”

    Caroline Criado Perez, Invisible Women: Data Bias in a World Designed for Men, 2019

    how do our tools work

    * * *

    POTREX CONTINUED





    Gathering data from volunteers is essential for algorithm analysis.

    But sometimes, to prove a certain bias, we have to start from a clean environment.


    WHAT ARE WE LOOKING FOR?

    • Third Party Trackers

    • Recommended Content

    • Corn?


    tool

    * * *

    Guardoni.js

    In the potrex GitHub repository you can find our automation script at methodology/bin/guardoni.js


    And what can it do for us?

    LABORATORY SETUP

    It can automate some boring tasks for us:

    • 01. Install the potrex extension on a clean browser

    • 02. Initialize a new profile if needed

    • 03. Navigate through the website

    • 04. Harvest tracking data and screenshots
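Steps 01 and 02 can be sketched as a set of browser launch options. This is a minimal sketch and not the actual guardoni.js code: the helper name and directory paths are hypothetical. It relies on `userDataDir` to keep a separate browser profile and on the Chromium flags `--load-extension` and `--disable-extensions-except`, whose availability may depend on the browser version.

```javascript
const path = require('path');

// Hypothetical helper (not taken from guardoni.js): build launch options
// that load an unpacked extension into a dedicated browser profile.
function buildLaunchOptions(extensionDir, profileDir) {
  const extension = path.resolve(extensionDir);
  return {
    // Assumption: the extension is loaded in a visible (non-headless) browser.
    headless: false,
    // A fresh directory here means a clean profile; reusing it keeps the state.
    userDataDir: path.resolve(profileDir),
    args: [
      `--disable-extensions-except=${extension}`,
      `--load-extension=${extension}`,
    ],
  };
}

// Both paths are placeholders.
const launchOptions = buildLaunchOptions('./extension', './profiles/demo');
console.log(launchOptions);
```

With Puppeteer installed, the resulting object would be passed to `puppeteer.launch(launchOptions)`.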

    PUPPETEER 101

    * * *

  • Chrome/Chromium automation through the DevTools Protocol!
  • It's just a Node.js library with JS extensions!
  • Made with UI testing and browser automation in mind
  • Full browser configuration capabilities
  • From its GitHub repository:

          
    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto('https://example.com');
      await page.screenshot({ path: 'example.png' });
      await browser.close();
    })();

    AN EXPERIMENT SAMPLE

    * * *

    In general, by an experiment we mean a set of commands that automates the navigation of our bot.

    This includes a landing page, a list of videos to be watched and the time to be spent on each page.

    Here is an excerpt from methodology/json/phase1/guardoni2.json in the potrex repository:

    [{
        "name": "Homepage first",
        "url": "https://www.pornhub.com/",
        "loadFor": "17s",
        "screenshotAfter": "5s"
    },{
        "name": "Hentai-1",
        "loadFor": "3m",
        "screenshotAfter": "5s",
        "url": "https://www.pornhub.com/view_video.php?viewkey=ph604a68bb36307"
    },{
        "name": "Hentai-2",
        "loadFor": "3m",
        "url": "https://www.pornhub.com/view_video.php?viewkey=ph604a625b562ea"
    },{
    ...
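A minimal sketch of how such a step list could be turned into a navigation plan. The parser below is an assumption and not the guardoni implementation: it only handles the `s` and `m` suffixes that appear in the excerpt, and the function names are hypothetical.

```javascript
// Hypothetical parser: convert strings such as "17s" or "3m" to milliseconds.
function parseDuration(text) {
  const match = /^(\d+)(s|m)$/.exec(text);
  if (!match) throw new Error(`Unsupported duration: ${text}`);
  const value = Number(match[1]);
  return match[2] === 'm' ? value * 60000 : value * 1000;
}

// Turn experiment steps into a plan with numeric timings.
function planSteps(steps) {
  return steps.map((step) => ({
    name: step.name,
    url: step.url,
    loadForMs: parseDuration(step.loadFor),
    // screenshotAfter is optional in the excerpt, so null means "no screenshot".
    screenshotAfterMs: step.screenshotAfter ? parseDuration(step.screenshotAfter) : null,
  }));
}

const plan = planSteps([
  { name: 'Homepage first', url: 'https://www.pornhub.com/', loadFor: '17s', screenshotAfter: '5s' },
  { name: 'Hentai-2', url: 'https://www.pornhub.com/view_video.php?viewkey=ph604a625b562ea', loadFor: '3m' },
]);
console.log(plan);
```

A runner could then walk the plan, open each `url`, wait `loadForMs`, and take a screenshot when `screenshotAfterMs` is set.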
          

    CODE SAMPLES

    * * *

    In methodology/src/domainSpecific.js
    you can see that we save everything from localStorage

    for (let i = 0; i < localStorage.length; i++) {
      const key = localStorage.key(i);
      json[key] = localStorage.getItem(key);
    ...
    const cookies = await page._client.send('Network.getAllCookies');
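The localStorage loop can be tried outside a browser with a small stand-in object. This sketch is illustrative only: `dumpStorage` and `fakeStorage` are hypothetical names, and the fake implements only the three members the loop uses (`length`, `key`, `getItem`).

```javascript
// Hypothetical helper: copy every key/value pair of a Storage-like object.
function dumpStorage(storage) {
  const json = {};
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    json[key] = storage.getItem(key);
  }
  return json;
}

// Minimal in-memory stand-in for window.localStorage, for demonstration only.
function fakeStorage(data) {
  const keys = Object.keys(data);
  return {
    length: keys.length,
    key: (i) => keys[i],
    getItem: (k) => data[k],
  };
}

// Made-up keys and values.
const dumped = dumpStorage(fakeStorage({ visitorId: 'abc', theme: 'dark' }));
console.log(dumped);
```

In a real page context, the same function would receive `window.localStorage` instead of the fake.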
          

    CODE SAMPLES

    * * *

    ...Or how third party trackers are saved

    const up = url.parse(reqpptr.url());
    if(_.endsWith(up.host, 'pornhub.com') ||
    _.endsWith(up.host, 'phncdn.com') )
      return;
    const full3rdparty = {
      method: reqpptr.method(),
      host: up.host,
      pathname: up.pathname,
    ...
    

    if(full3rdparty.method != 'GET') full3rdparty.postData = reqpptr.postData();
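The first-party filter can also be sketched as a standalone function. This is a variation on the excerpt, not the potrex code: it uses the WHATWG `URL` class instead of `url.parse`, and it matches on a domain boundary, so a host such as `notpornhub.com` is not mistaken for first-party. The function names and sample URLs are hypothetical.

```javascript
// Domains treated as first-party, taken from the excerpt above.
const FIRST_PARTY_DOMAINS = ['pornhub.com', 'phncdn.com'];

// Hypothetical helper: true when the request goes to a first-party domain
// or to one of its subdomains.
function isFirstParty(requestUrl) {
  const host = new URL(requestUrl).hostname;
  return FIRST_PARTY_DOMAINS.some(
    (domain) => host === domain || host.endsWith(`.${domain}`)
  );
}

// Collect the distinct third-party hosts from a list of request URLs.
function thirdPartyHosts(urls) {
  const hosts = urls
    .filter((u) => !isFirstParty(u))
    .map((u) => new URL(u).hostname);
  return [...new Set(hosts)];
}

// Made-up request URLs.
const hosts = thirdPartyHosts([
  'https://www.pornhub.com/',
  'https://ci.phncdn.com/pics/logo.png',
  'https://tracker.example/pixel.gif',
  'https://notpornhub.com/',
]);
console.log(hosts);
```

Note that `hostname` excludes the port, whereas `host` in the excerpt may include it.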

    Anecdotal evidence suggests that there are significant differences between the third-party organizations operating in the porn and the regular web tracking industry as large online ad networks such as Google Ads set strict constraints for porn-related publishers, prohibiting the advertising of adult-oriented products and services. These restricting terms of services [...] opened new market opportunities for other actors who have specialized in providing advertising and tracking technologies to adult sites. This context has created, as a result, a parallel ecosystem of third-party service providers in the porn ecosystem who has not been scrutinized by regulators, policy makers, and the research community.

    Tales from the Porn: A Comprehensive Privacy Analysis of the Web Porn Ecosystem, 2019


    How did we get here?

    * * *

    The “Legal Vacuum”

    A legal context in which things are unclear, no applicable law exists, or some injustice goes uncorrected

    ABOUT THE LEGAL VACUUM


    • Lack of norms (at every level)

    • Lack of concepts (categories, principles)

    • Lack of political will (controversial and complex topic)



    In the words of Ludwig Mies van der Rohe “Less is more”?

    In the European context, some valuable tools are already filling the void

    * * *

  • General Data Protection Regulation (since 2018)
  • Privacy and Electronic Communications Directive (since 2002)
  • Electronic Commerce Directive (since 2000)
  • … and more
  • Legally speaking, what about porn platforms?

    • “I shall not today attempt further to define the kinds of material […] but I know it when I see it” famous quote by Potter Stewart
    • Producing, selling and possessing pornography is still illegal in many countries
    • There has been a significant effort to stop child pornography and non-consensual pornography
    • We can see a growing effort to prevent underage users from accessing online pornography
    • An interesting debate has sparked over the addictive nature of mainstream pornographic content

    SOME OPEN ISSUES

    • sex workers' and prostitutes' rights
    • privacy
    • role of pornography in our society

    POSSIBLE APPROACHES


    • Policy

    • Advocacy

    • Litigation

    reactive and proactive

    * * *

    1. POLICY

    • currently, most of the power and agency in this field is in the hands of the private sector
    • there are many projects that try to offer alternatives to mainstream online pornography, embedding values like feminism, gender non-conformity, sex workers' empowerment and more
    • public opinion does not seem to see this issue as a priority, partly because of a cultural taboo

    reactive and proactive

    * * *

    2. ADVOCACY

    • this topic is often not welcomed by political institutions
    • pushing for openness on these issues may escalate into a moral panic
    • a blind fight against obscenity and other forms of extremism generates a toxic debate

    only reactive

    * * *

    3. LITIGATION

    • It seems to be the most feasible approach at present
    • It brings together law and technology specialists during evidence acquisition and reporting
    • It involves judges and Data Protection Authorities in shedding light on the legal vacuum


    feedback

    * * *

    WHAT ABOUT YOU


    • Are you aware of any issues?
    • What would you like to investigate?
    • How? Which methodology would you use?

    mail: team[@]tracking.exposed