
Yeah, I'm not the one who generated the URL list, but I've also been getting a lot of pages without a downloadable document. I'm going to start on one of the URL lists posted here soon.

Alrighty, I'm currently in the middle of the archive.org upload, but I can transfer the chunks I already have over to a different machine and do it there with a new IP.

age gate > page not found

I messaged you on the other site; I'm currently getting a "Could not determine Content-Length (got None)" error.
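
In case it's useful while you wait: that error usually means the server isn't sending a Content-Length header at all and the script is treating the missing header as fatal. A minimal workaround sketch, assuming a Python requests-based downloader (the function name, URL, and filename are placeholders, not the actual script):

```python
import requests

def fetch(url, dest):
    # Stream the body to disk; don't assume Content-Length is present,
    # since chunked responses legitimately omit it.
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        if resp.headers.get("Content-Length") is None:
            print(f"no Content-Length for {url}; streaming anyway")
        with open(dest, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                f.write(chunk)

fetch("https://example.org/some/document", "document.bin")  # placeholder URL
```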

No worries, thank you!
edit: I'll start on that URL list (randomized) tomorrow; my run from the previously generated URL list is still going (currently 75.6k files).

this method is not working for me anymore

I'm waiting for /u/Kindly_District9380's version, but I've been slowly working backwards on this in the meantime: https://archive.org/details/dataset9_url_list

I've got that one too; maybe we should compare dataset 12 versions as well.

I'm using a partial download I already had rather than the 48 GB version, but I will be gathering as many chunks as I can as well. Thanks for making this!

I'll get the first set (42k files, 31 GB) uploading as soon as I get it zipped up. It's the one least likely to have any new files in it, since I started at the beginning like others did, but it's worth a shot.
edit 01FEB2026 12:08 AM EST - 6.4/30 GB uploaded to archive.org
edit 01FEB2026 04:30 AM EST - 13/30 GB uploaded to archive.org; the scrape using a different URL set, going backwards, is currently at 75.4k files
edit 01FEB2026 12:33 PM EST - had an internet outage overnight and lost all progress on the archive.org upload; currently back to 11/30 GB. The scrape using a previous URL set seems to be getting very few new files now, sitting at 77.9k at the moment.
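
Side note on the lost upload progress: the internetarchive Python package (the same tooling behind the ia CLI) can checksum-skip files that already made it to the item, so an outage only costs you the file that was in flight. A rough sketch; the item identifier, paths, and metadata here are made up, not the actual upload:

```python
from internetarchive import upload

# Assumes credentials are already set up with `ia configure`.
# With checksum=True, re-running after an outage skips any chunk whose
# md5 already matches what's on the item instead of re-uploading it.
upload(
    "dataset9-partial-chunks",  # placeholder item identifier
    files=["chunks/part000.zip", "chunks/part001.zip"],  # placeholder paths
    metadata={"title": "dataset 9 partial scrape", "mediatype": "data"},
    checksum=True,
    retries=5,
    verbose=True,
)
```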

Maybe archive.org? That way they can be torrented if others want to attempt their own merging techniques. Either way it will be a long upload; my speed is not especially good. I'm still churning through one set of URLs that is 1.2M lines; most are failing, but I have 65k files from that batch so far.

Looking forward to your torrent; will seed.
I have several incomplete sets of files from dataset 9 that I downloaded with a scraped set of URLs. Should I try to get them to you to compare as well?
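
If it helps with the comparison, something like this is all I had in mind for checking overlap between two partial sets: hash everything under each directory and diff the sets (directory names are placeholders):

```python
import hashlib
from pathlib import Path

def file_hash(path, chunk=1 << 20):
    # Hash in chunks so large files don't have to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def hashes(root):
    # Map sha256 digest -> path for every file under root.
    return {file_hash(p): p for p in Path(root).rglob("*") if p.is_file()}

mine = hashes("dataset9_mine")      # placeholder directory names
theirs = hashes("dataset9_theirs")
print(f"{len(mine.keys() - theirs.keys())} files only in mine, "
      f"{len(theirs.keys() - mine.keys())} files only in theirs")
```
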
yep, impossible to know