Memory.lol - a tiny web service that provides historical information about social media accounts

memory.lol

Overview

This project is a tiny web service that provides historical information about social media accounts.

It can currently be used to look up 542 million historical screen names for 443 million Twitter accounts. Most of this data has been scraped from either the Twitter Stream Grab or the Wayback Machine (both published by the Internet Archive).

Coverage should be fairly good (for non-protected accounts) going back to 2011, which is when the Twitter Stream Grab was launched.

Please note that this software is not "open source", but the source is available for use and modification by individuals, non-profit organizations, and worker-owned businesses (see the license section below for details).

Safety

All information provided by this service has been gathered from public archives, and in most cases it can easily be found through other means (such as a Twitter search for replies to an account). The goal of the service is to make it easier for researchers or journalists to identify directions for further investigation, and more generally to indicate to users that an account may be operating a scam, spreading disinformation, etc. If you have concerns about safety or privacy, you can contact me (via Twitter DM or email) and your request will be handled privately.

Current access restrictions

In most cases public access to the tool is currently limited to historical facts that have been observed in the past 60 days. There are two exceptions to this rule:

  • Some accounts are excluded at the request of the account owner.
  • Full histories are provided for a set of accounts compiled from several "bad actor" lists.

The full twelve years and half a billion screen names (minus requested exclusions) are available to a trusted group of researchers, journalists, and activists.

The service currently supports authenticating via a GitHub or Google account.

The service uses GitHub (or Google) only for authentication. It doesn't require any non-public or write access to the user's accounts, will never request any kind of password, and stores only the user's public information on its servers.

The service does not currently log requests in a way that would allow anyone with access to the server to link individual queries to specific authorized users, but I reserve the right to implement such logging in the future if there's any suggestion of abuse.

To log in visit https://api.memory.lol/v1/login/github, which will take you to a GitHub "Authorize memory.lol" page that will ask you to authorize "Limited access to your public data". Click "Authorize" and you'll be taken to a status URL that will show your current access level (which will be empty unless your account has been specifically approved for access). From there the full index will be available (if your account has been approved).

To log out go to https://api.memory.lol/v1/logout.

It's possible to use the full version of the service from the command-line via GitHub's device flow, but this currently isn't very convenient (see instructions below). I'll be providing a client that makes command-line use a little easier.

If you're interested in having your account approved for non-date-restricted access, please contact me.

Use cases

Accounts that engage in hate speech, scams, harassment, etc. on social media platforms sometimes try to obscure their identities by changing their screen names, and they often also have really bad opsec (for example using real names or other identifying information on accounts that they later intend to use anonymously).

Being able to look up historical social media profiles often makes it possible to identify the offline identities of these people (or at least to trace connections between their activities).

Here are a few examples off the top of my head (the first three are examples of the service in action, and the last two show how it can be used to confirm the work of others):

In many cases the information provided by the service won't be enough to identify a person, but it may provide hints about where to look next (for example, using cancel-culture to look up deleted tweets for old screen names is often a reasonable second step).

Detailed example

If you visit https://api.memory.lol/v1/tw/libsoftiktok in your browser, you'll see the following data:

{
  "accounts": [
    {
      "id": 1326229737551912960,
      "screen-names": {
        "chayaraichik": null,
        "cuomomustgo": null,
        "houseplantpotus": null,
        "shaya69830552": [
          "2020-11-10"
        ],
        "shaya_ray": [
          "2020-11-27",
          "2020-12-17"
        ],
        "libsoftiktok": [
          "2021-08-18",
          "2022-06-16"
        ]
      }
    }
  ]
}

Note that for some screen names we don't currently have information about when they were observed (e.g. the ones with null values above). If a screen name was observed on only one day in our data sets, there will be a single date. If there are two dates, they indicate the first and last day that the screen name was observed.

These date ranges will not generally represent the entire time that the screen name has been used (they just indicate when the account appears with that screen name in our data sets).
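
The date conventions above can be captured in a small helper. This is a minimal sketch in Python, assuming the response has already been parsed from JSON into the shape shown in the example; the function name describe_observation is hypothetical and not part of the service.

```python
def describe_observation(dates):
    """Turn one "screen-names" value into a human-readable description."""
    if dates is None:
        # null in the JSON: the name is known, but no observation dates are recorded.
        return "observation dates unknown"
    if len(dates) == 1:
        # A single date: the name was seen on only one day in the data sets.
        return f"observed on {dates[0]}"
    # Two dates: first and last day the name was seen (not the full period of use).
    return f"observed between {dates[0]} and {dates[-1]}"


# A subset of the "screen-names" object from the example response above.
screen_names = {
    "chayaraichik": None,
    "shaya69830552": ["2020-11-10"],
    "libsoftiktok": ["2021-08-18", "2022-06-16"],
}

for name, dates in screen_names.items():
    print(f"{name}: {describe_observation(dates)}")
```

Keeping the three cases separate makes it harder to mistake an observation window for the full period during which a screen name was in use.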

Other features

The service is very minimal. One of the few things it does support is querying multiple screen names via a comma-separated list (for example: https://api.memory.lol/v1/tw/jr_majewski,MayraFlores2022). It also supports searching for a screen name prefix (currently limited to 100 results; for example: https://api.memory.lol/v1/tw/tradwife*).

It currently only supports JSON output, but if you want a spreadsheet, for example, you can convert the JSON to CSV using a tool like gojq:

$ curl -s https://api.memory.lol/v1/tw/jr_majewski,MayraFlores2022 |
> gojq -r '.[].accounts | .[] | .id as $id | ."screen-names" | keys | [$id] + . | @csv'
89469296,"LaRepublicana86","MayraFlores2022","MayraNohemiF"
726873022603362304,"JRMajewski","jr_majewski"
1533878962455293953,"jr_majewski"

Or if you want one screen name per row:

$ curl -s https://api.memory.lol/v1/tw/jr_majewski,MayraFlores2022 |
> gojq -r '.[].accounts | .[] | .id as $id | ."screen-names" | keys | .[] | [$id, .] | @csv'
89469296,"LaRepublicana86"
89469296,"MayraFlores2022"
89469296,"MayraNohemiF"
726873022603362304,"JRMajewski"
726873022603362304,"jr_majewski"
1533878962455293953,"jr_majewski"
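
If gojq isn't available, the same one-row-per-screen-name conversion can be sketched in Python. This is an illustration rather than part of the project. It assumes the multi-name response is either a JSON object or a JSON array whose members each carry an "accounts" list (which is what the gojq filter above implies), and that the key is "screen-names" as in the filter. The function names are hypothetical.

```python
import csv
import io


def rows_per_screen_name(response):
    """Flatten a parsed multi-name response into (account ID, screen name) rows."""
    # Mirror jq's `.[]`: iterate over the values of an object or the items of an array.
    groups = response.values() if isinstance(response, dict) else response
    rows = []
    for group in groups:
        for account in group["accounts"]:
            # jq's `keys` returns sorted keys, so sort here to get the same order.
            for name in sorted(account["screen-names"]):
                rows.append((account["id"], name))
    return rows


def to_csv(rows):
    """Render rows as CSV, quoting strings but not numbers (like jq's @csv)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()
```

To use it, parse the body returned by the curl command with json.load and pass the result to rows_per_screen_name, then print the output of to_csv.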

Note that screen name queries are case-insensitive, but the results distinguish case (which can be useful for archives such as Archive Today, which only provide case-sensitive search).

Other endpoints

You can also look up an account's history by account ID (e.g. https://api.memory.lol/v1/tw/id/1326229737551912960 also shows the screen names for Raichik's account).
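
The three query forms described so far (one or more screen names, a screen name prefix, and an account ID) differ only in the final part of the URL. The following Python sketch builds each form from the example URLs in this document; the helper names are hypothetical, and percent-encoding the input is a precaution added here rather than a documented requirement of the service.

```python
from urllib.parse import quote

# Root taken from the example URLs in this document.
API_ROOT = "https://api.memory.lol/v1/tw"


def screen_name_url(*screen_names):
    """URL for one or more screen names, joined as a comma-separated list."""
    if not screen_names:
        raise ValueError("at least one screen name is required")
    return f"{API_ROOT}/{','.join(quote(name, safe='') for name in screen_names)}"


def prefix_url(prefix):
    """URL for a prefix search (a trailing * marks the prefix)."""
    return f"{API_ROOT}/{quote(prefix, safe='')}*"


def account_id_url(account_id):
    """URL for a lookup by numeric account ID."""
    return f"{API_ROOT}/id/{int(account_id)}"


print(screen_name_url("jr_majewski", "MayraFlores2022"))
print(prefix_url("tradwife"))
print(account_id_url(1326229737551912960))
```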

Authorized access via device flow

There are currently several steps if you want to access the full index from the command line. By default you will receive date-restricted results:

$ curl -s https://api.memory.lol/v1/tw/USForcesKorea | jq
{
  "accounts": [
    {
      "id": 4749974413,
      "id_str": "4749974413",
      "screen_names": {
        "USForcesKorea": [
          "2018-06-08",
          "2022-07-29"
        ]
      }
    }
  ]
}

To access the full index (assuming you have an approved account), you'll first need to get a device code and user code, using exactly this command:

$ curl -X POST -d 'client_id=b8ab5a8c1a2745d514b7' https://github.com/login/device/code
device_code=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&expires_in=898&interval=5&user_code=ABCD-0123&verification_uri=https%3A%2F%2Fgithub.com%2Flogin%2Fdevice

Next visit https://github.com/login/device in a browser and enter the user code you just received when prompted.

Lastly you need to get your bearer token (replacing device_code below with the one you were given, but again using the client_id shown here):

$ curl -X POST -d 'device_code=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&client_id=b8ab5a8c1a2745d514b7&grant_type=urn:ietf:params:oauth:grant-type:device_code' https://github.com/login/oauth/access_token
access_token=gho_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&scope=&token_type=bearer

You can then use this token to make authenticated queries:

$ curl -s -X POST -d 'token=gho_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' https://api.memory.lol/v1/tw/USForcesKorea | jq
{
  "accounts": [
    {
      "id": 26847645,
      "id_str": "26847645",
      "screen_names": {
        "USFKPAO": [
          "2011-10-19",
          "2016-06-19"
        ],
        "usforceskorea": [
          "2017-02-20",
          "2018-03-27"
        ]
      }
    },
    {
      "id": 4749974413,
      "id_str": "4749974413",
      "screen_names": {
        "usforceskorea": [
          "2016-02-26",
          "2017-02-07"
        ],
        "usforceskorea_": [
          "2017-02-19",
          "2018-04-24"
        ],
        "USForcesKorea": [
          "2018-06-08",
          "2022-07-29"
        ]
      }
    }
  ]
}

Eventually this process will be bundled up into a command-line client, but for now this approach will work with existing tools like curl.
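
Until that client exists, the curl steps above can be strung together in a short script. The following is a rough Python sketch using only the standard library, not the planned client. It reuses the URLs, client_id, and form fields from the commands above, and assumes GitHub replies with form-encoded bodies as in the sample output. All function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# client_id and grant type copied from the curl commands above.
CLIENT_ID = "b8ab5a8c1a2745d514b7"
DEVICE_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:device_code"


def parse_form(body):
    """Decode a form-encoded body such as 'a=1&b=' into a flat dict."""
    pairs = urllib.parse.parse_qs(body, keep_blank_values=True)
    return {key: values[0] for key, values in pairs.items()}


def post_form(url, fields):
    """POST form-encoded fields and return the response body as text."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(urllib.request.Request(url, data=data)) as response:
        return response.read().decode("utf-8")


def request_device_code():
    """Step 1: ask GitHub for a device code and a user code."""
    body = post_form("https://github.com/login/device/code", {"client_id": CLIENT_ID})
    return parse_form(body)


def request_access_token(device_code):
    """Step 3: exchange the device code for a bearer token."""
    fields = {
        "device_code": device_code,
        "client_id": CLIENT_ID,
        "grant_type": DEVICE_GRANT_TYPE,
    }
    return parse_form(post_form("https://github.com/login/oauth/access_token", fields))


def lookup(screen_name, token):
    """Authenticated query: POST the token to the screen name endpoint."""
    return json.loads(post_form(f"https://api.memory.lol/v1/tw/{screen_name}", {"token": token}))


def main():
    codes = request_device_code()
    # Step 2 happens in the browser, so wait for the user before continuing.
    print(f"Visit {codes['verification_uri']} and enter the code {codes['user_code']}")
    input("Press Enter once you have authorized the device... ")
    token = request_access_token(codes["device_code"])["access_token"]
    print(json.dumps(lookup("USForcesKorea", token), indent=2))
```

Calling main() runs the whole flow interactively. The sketch does no error handling: if the device has not been authorized yet, the token response will not contain an access_token field and the lookup step will fail with a KeyError.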

Importing data

The application currently supports importing data in two file formats. The first requires one Twitter user object per line (in JSON format with an additional snapshot field representing the observation time as an epoch second). The second is a CSV format with at least three columns (Twitter user ID, screen name, and observation time as epoch second).
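
As an illustration of the two formats, the Python sketch below produces one line of each. Only the snapshot field name and the three CSV columns come from the description above; the column order (user ID, screen name, epoch second), the absence of a header row, and the id and screen_name keys in the sample user object are assumptions, as are the helper names.

```python
import csv
import io
import json
from datetime import datetime, timezone


def epoch_second(observed_at):
    """Convert a timezone-aware datetime to a whole epoch second."""
    return int(observed_at.timestamp())


def user_json_line(user, observed_at):
    """First format: a user object plus a snapshot field, as one JSON line."""
    record = dict(user)
    record["snapshot"] = epoch_second(observed_at)
    return json.dumps(record)


def user_csv_line(user_id, screen_name, observed_at):
    """Second format: user ID, screen name, and observation time as a CSV row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([user_id, screen_name, epoch_second(observed_at)])
    return buffer.getvalue()


# Sample values taken from the USForcesKorea response shown earlier.
observed = datetime(2022, 7, 29, tzinfo=timezone.utc)
print(user_json_line({"id": 4749974413, "screen_name": "USForcesKorea"}, observed))
print(user_csv_line(4749974413, "USForcesKorea", observed), end="")
```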

Future

Anything about the web service is subject to change at any time, including its availability.

There are non-public endpoints that I'm likely to open up at some point. These provide full historical user profiles, information about suspension or deactivation status, etc.

Terms of service compliance

This web service simply provides an interface to an index for content that is hosted in public archives, and the project aims to be compliant with the terms of service of all platforms that were accessed in generating this index.

This repository does not contain data from any social media platform.

License

This software is published under the Anti-Capitalist Software License (v. 1.4).

Comments
  • added a v2 manifest for Firefox support

    Firefox seems to currently only support Manifest v2.

    Tested working on Firefox 102.0.1 (64-bit) on Windows 10

    To test: remove the v3 manifest.json, rename manifest_v2.json to manifest.json, then zip all the contents into a file with the .xpi extension. The add-on can then be loaded temporarily from about:debugging.

    opened by naresh97 3
  • Full access to history for "included" accounts

    This change provides untrusted users with full access to history for a set of 21k accounts.

    The list of accounts was created by merging four sources:

    1. The 10k top "Stop the Steal" promoters from the Cornell VoterFraud2020 dataset.
    2. The BlockDNat block list (focused on Deseret Nationalist accounts).
    3. The Antifa Checker 161 bad actor list.
    4. The 10k accounts most central to far-right networks according to the Twitter Watch project.

    Any account on one of these lists (together with a few miscellaneous additions) will be available to anyone accessing the service unless it has been explicitly excluded.

    opened by travisbrown 0
  • Fix some issues with browser extension

    Now much more robust with respect to styling and updating appropriately. Making this change in preparation for new features coming soon (specifically "Blocked by …" information).

    opened by travisbrown 0
  • Cleaning up React build

    Now supports both development and production builds. If you're running the web service locally at http://127.0.0.1:8000 you should only need to add a file like this:

    $ cat .env.development.local 
    REACT_APP_API_ROOT=
    
    opened by travisbrown 0
  • Prevent transgender harassment

    Would be cool to add a flag that if the bio contains the transgender flag or pronouns, the script fails. While people will be able to look at the script and edit out this feature, it will help prevent non-technical trolls / lazy bullies from surfacing dead names.

    opened by estelle 32
Owner
Travis Brown
Functional programmer mostly.