Joined 9 hours ago
Cake day: February 1st, 2026

  • I’d still love to help from my PC, specifically on dataset 9. Is there a way we can exchange progress so I don’t start by downloading files you already have?

    Edit: just started scraping from page 18330 (you mentioned you ended around 18333), hoping I can fill in the remaining 4000-ish pages.

    Update 2 (17:15 UTC): just finished scraping up to the page 20500 limit you set in the code. There are 0 new files in the 18330-20500 range compared to the ones you already found. So unless I did something wrong, either your list is complete or the DOJ has been scrambling their shit (considering the large number of duplicate pages, I’m going with the second explanation).

    Either way, I’m gonna extract the 48GB and 100GB torrent directories now and mark down which of the files already exist within those torrents, so we can make an (intermediate) list of the files that are still missing from them.
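For anyone who wants to do the same cross-check, here is a rough sketch of how the marking could work. It assumes the known file list is just a collection of file names and that matching on base name (case-insensitive) is good enough; the function names `files_on_disk` and `split_known` are made up for illustration and are not from the scraper code.

```python
import os

def files_on_disk(roots):
    """Collect the lowercased base names of every file under the given directories."""
    names = set()
    for root in roots:
        for _dirpath, _dirnames, filenames in os.walk(root):
            names.update(name.lower() for name in filenames)
    return names

def split_known(known_names, roots):
    """Split the known file names into (present, missing), each sorted.

    Assumption: a file counts as present if any file with the same
    base name (ignoring case) exists somewhere under one of the roots.
    """
    on_disk = files_on_disk(roots)
    unique = set(known_names)
    present = sorted(n for n in unique if n.lower() in on_disk)
    missing = sorted(n for n in unique if n.lower() not in on_disk)
    return present, missing
```

Usage would be something like `present, missing = split_known(names, ["extracted_48gb", "extracted_100gb"])`, where both directory names are placeholders for wherever the torrents were extracted. Matching on name alone won't catch truncated or corrupted downloads, so comparing sizes or hashes would be a sensible next step if those are available.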