• ParlimentOfDoom@piefed.zip · ↑8 · 18 hours ago

    There is no way AI is going to figure out I’m secretly a gorilla living in Saskatchewan and work for the local golf course.

  • hansolo@lemmy.today · ↑22 · 23 hours ago

    Oi crikey mate, I’m a 6’4" tall Swedish Olympic carpenter named Liam. Pip pip! I love to eat lingonberries!

    That ought to buy me a few days…

  • XLE@piefed.social · ↑16 · 23 hours ago

    The concept of de-anonymizing people isn’t new to AI. This just expedites it. And the best part is: Sam Altman and every other AI CEO are funding it… And so are you.

    Your tax dollars -> state-subsidized energy, water, money -> AI companies -> this

  • Em Adespoton · ↑8 · 23 hours ago

    Seems to me that all this would do in my case is identify that I’m likely the same person in the various forums I post on.

    I’ve used LLMs to go hunting for me online, and it took a LOT of prompting to link the same anonymous me across more than a few of the most popular online forums, even when I fed it excerpts from less popular places.

    Since I use a different voice in my private communications and don’t post to public forums under my real name, there’s very little to be matched.

    Matching up multiple Reddit and Instagram accounts? Sure, LLMs can do that with ease.

  • Artwork@lemmy.world · ↑9 · 24 hours ago

    We show that large language models can be used to perform at-scale deanonymization.

    With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator. We then design attacks for the closed-world setting.

    Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to:
    1. extract identity-relevant features,
    2. search for candidate matches via semantic embeddings, and
    3. reason over top candidates to verify matches and reduce false positives…
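
    A minimal sketch of those three steps, with toy stand-ins for the LLM components the paper actually uses (keyword tokens instead of LLM-extracted features, bag-of-words vectors instead of semantic embeddings, and a score-gap check instead of LLM verification). All names here are illustrative, not from the paper.

```python
# Toy sketch of the three-step closed-world matching pipeline.
# Stand-ins: keyword extraction, bag-of-words "embeddings", score-gap check.
from collections import Counter
import math

def extract_features(text):
    # Step 1 stand-in: keep longer tokens as "identity-relevant features".
    return [w.lower().strip(".,!?") for w in text.split() if len(w) > 3]

def embed(features):
    # Step 2 stand-in: bag-of-words counts as a crude embedding vector.
    return Counter(features)

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match(db_a, db_b, min_score=0.3, min_gap=0.1):
    # Step 3 stand-in: accept the top candidate only if it clears a score
    # threshold AND beats the runner-up by a margin, to cut false positives.
    embs_b = {uid: embed(extract_features(t)) for uid, t in db_b.items()}
    matches = {}
    for uid_a, text_a in db_a.items():
        e_a = embed(extract_features(text_a))
        scored = sorted(((cosine(e_a, e_b), uid_b)
                         for uid_b, e_b in embs_b.items()), reverse=True)
        if scored and scored[0][0] >= min_score and \
                (len(scored) < 2 or scored[0][0] - scored[1][0] >= min_gap):
            matches[uid_a] = scored[0][1]
    return matches
```

    The real pipeline replaces each stand-in with an LLM call; the structure (extract, retrieve candidates, verify with a margin) is the part carried over from the quoted description.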

    Our second dataset matches users across Reddit movie discussion communities; and the third splits a single user’s Reddit history in time to create two pseudonymous profiles to be matched. In each setting, LLM-based methods substantially outperform classical baselines, achieving up to 68% recall at 90% precision compared to near 0% for the best non-LLM method.
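
    For reference, "68% recall at 90% precision" means: of the matches the attack asserts, 90% are correct, and those correct assertions cover 68% of the true identities. A small helper (names hypothetical, not from the paper) makes the two metrics concrete:

```python
def precision_recall(predicted, truth):
    # predicted: {anon_id: matched_id or None if the attack abstains}
    # truth:     {anon_id: real matching id}
    made = {a: b for a, b in predicted.items() if b is not None}
    correct = sum(1 for a, b in made.items() if truth.get(a) == b)
    precision = correct / len(made) if made else 0.0   # of asserted matches
    recall = correct / len(truth) if truth else 0.0    # of all true pairs
    return precision, recall
```

    Abstaining on uncertain candidates trades recall for precision, which is why the verification step above matters.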

    Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered…

    The following prompt is used…

    Source: https://arxiv.org/pdf/2602.16800 [2026-02-18]

    ---

    I don’t know why people are so keen to put the details of their private life in public; they forget that invisibility is a superpower…
    ~ Banksy

    Related: https://cloudflare.com/learning/ai/data-poisoning