• igorlogius
        English
        9
        1 year ago

        And they sell it to the highest bidder to train their next LLM, which seems to be all the rage at the moment.

    • thisbenzingring
      English
      6
      1 year ago

      Text data on a compressed drive is tiny. On a modern server, reading text files from a compressed drive causes no noticeable performance hit, and the compression ratio for text and markup files is massive.
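A rough way to check the compression claim is to measure it. The sketch below uses Python's standard `zlib` module; the sample markup and the `compression_ratio` helper are made up for illustration, and because the sample is one comment repeated many times, the ratio it prints is far higher than real, varied forum text would give.

```python
import zlib


def compression_ratio(text: str, level: int = 6) -> float:
    """Return raw size divided by compressed size for UTF-8 encoded text."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, level)
    return len(raw) / len(packed)


# Hypothetical sample: one short marked-up comment repeated many times.
# Real comment data is far less repetitive, so expect a much lower ratio.
sample_comment = (
    "<div class='comment'><p>Text doesn't take up much space, "
    "but decades of text can add up.</p></div>\n"
)
corpus = sample_comment * 1000

ratio = compression_ratio(corpus)
print(f"{len(corpus.encode('utf-8'))} bytes raw, about {ratio:.0f}x smaller compressed")
```

Running the same helper on an export of real comments would give a more honest number than this synthetic corpus.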

      • @[email protected]
        English
        -3
        1 year ago

        Yes, text doesn’t take up much space, but decades of text can easily take up a lot of space, especially when you track things like edits.

        Not to mention that this data isn’t in text files. It’s going to be in a database, so the number of records that need to be parsed will impact performance. How big that impact is depends on how they set the database up.
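To illustrate how much the database setup matters, here is a minimal sketch using Python's built-in `sqlite3` module. The `comments` table, its columns, and the index name are assumptions for the example, not the schema any real site uses. It compares the query plan for the same lookup before and after adding an index.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> str:
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema: each comment belongs to a post.
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (post_id, body) VALUES (?, ?)",
    ((i % 500, f"comment {i}") for i in range(50_000)),
)

lookup = "SELECT body FROM comments WHERE post_id = ?"

# Without an index, every row has to be scanned to find one post's comments.
plan_without_index = query_plan(conn, lookup, (42,))

# With an index on post_id, SQLite can jump straight to the matching rows.
conn.execute("CREATE INDEX idx_comments_post ON comments (post_id)")
plan_with_index = query_plan(conn, lookup, (42,))

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The first plan reports a full table scan and the second a search using the index, so the cost of the lookup depends on how the table is indexed rather than only on how many records exist.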