Yes it does. Where do you think they store those gigantic training datasets?
relative to the hard drive market in general, training data seems like a drop in the bucket. research labs like CERN write TBs per SECOND
quality data sets don’t even come close
They might write to faster storage first, then dump to slower, larger storage afterwards
they might, but the point is the volume of data rather than the speed… CERN is obviously an outlier, but not by as much as you’d think. copious amounts of data are kinda par for the course in a lot of cases, and training data doesn’t come close to the volume that large data users produce (data warehouses/lakes on the order of PB and EB are not that uncommon)