• 0 Posts
  • 65 Comments
Joined 3 months ago
Cake day: March 23rd, 2025

  • It’s not anthropomorphizing, it’s how new terms are created.

    Pretty much every new term ever draws on already existing terms.

    A car is called a car because that term was used for streetcars before that, for passenger train cars before that, for cargo train cars before that, for chariots before that, and originally for a two-wheeled Celtic war chariot. Not a lot of modern cars have two wheels and a horse.

    A plane is called a plane because it’s short for airplane, which derives from aeroplane, which meant the wing of an airplane, and that term first denoted the shell casings of a beetle’s wings. And not a lot of modern planes are actually made of beetle wing shell casings.

    You can do the same for almost all modern terms. Every term derives from a term that denotes something similar, often in another domain.

    Same with AI hallucinations. Nobody with half an education would think that the cause, effect and expression of AI hallucinations are the same as for humans. OpenAI doesn’t feed ChatGPT hallucinogens. It’s just a technical term that means something vaguely related to what the term originally meant for humans, same as “plane” and “beetle wing shell casing”.






  • You did not read your source. Some quotes you apparently missed:

    Scraping to violate the public’s privacy is bad, actually.

    Scraping to alienate creative workers’ labor is bad, actually.

    Please read your source before posting it and claiming it says something it doesn’t actually say.

    Now why does Doctorow distinguish between good scraping and bad scraping, and even between good LLM training and bad LLM training in his post?

    Because the good applications are actually covered by fair use while the bad parts aren’t.

    Because fair use isn’t actually about what is done (scraping, LLM training, …) but about who does it (researchers, non-profit vs. companies, for-profit) and for what purpose (research, critique, teaching, news reporting vs. making a profit by putting original copyright owners out of work).

    That’s the whole point of fair use. It’s even in the name. It’s about the use, and the use needs to be fair. It’s not called “Allowed techniques, don’t care if it’s fair”.


  • Tbh, this is not a question about scraping at all.

    Scraping is just a rather neutral tool that can be used for all sorts of purposes, legal and illegal.

    Neither does the technique justify the purpose nor does outlawing the technique fix the actual problem.

    Fair use only applies to a certain set of use cases and has a strict set of restrictions applied to it.

    The permitted use cases are: “criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research”.

    And the two relevant restrictions are:

    • “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;”
    • “the effect of the use upon the potential market for or value of the copyrighted work.”

    (Quoted from 17 U.S.C. § 107)

    And here the differences between archive.org and AI become obvious. While archive.org can be abused as some kind of file sharing system or to circumvent paywalls or ads, its intended purpose is for research, and it’s firmly non-profit and doesn’t compete with copyright holders.

    AI, on the other hand, is almost always commercial, and its main purpose is to replace human labour, specifically of the copyright owners. It might not be an actual problem for Disney’s bottom line, but it’s a massive problem for smaller artists, stock photographers, translators, and many other professions.

    So AI training clearly doesn’t fall under the permitted use cases for fair use, and it violates both restrictions.

    And for that, it doesn’t matter if the training data is acquired using scraping (without permission) or some other way (without permission to use it for AI training).






  • Zionism is the one thing where anti-semites and Jews (at least Zionist Jews) agree.

    Zionist Jews want it because it gives them their own country where they are not persecuted.

    Anti-semites want it, because it means that the Jews are not in their country.

    That’s why even the literal Nazis supported Zionism. Every Jew in Israel was one less Jew in Germany.

    You get the same thing still today with the most right-wing politicians supporting Zionism/Israel. On the one hand because it’s a way to keep Jews far away, and on the other hand because it can be used as an “I’m supporting Israel, so surely I can’t be a Nazi. Anyway, let’s go shoot some Muslims.”-kind of excuse.


  • Tbh, immigration isn’t the worst “solution”.

    We do have an overpopulation problem. Well, an overconsumption times overpopulation problem, really.

    We could fix that by either consuming less (which we apparently, as a species, really don’t want) or by having fewer people (which we apparently really want).

    So, in the end, reducing population isn’t a real problem. Even if the population shrinks by 50% each generation (~25 years, for the sake of the argument), starting from roughly 8 billion there will still be 250 million people left even after 5 generations. The trend should probably be reversed sometime then, but until then it’s really not an issue for the survival of the species and it would actually be really good for the planet and our long-term survival.
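
    For anyone who wants to check that arithmetic, here is a quick sketch. The starting figure of 8 billion is my assumption (roughly today’s world population), and `population_after` is just a made-up helper name, not anything official.

```python
def population_after(start: int, generations: int, shrink: float = 0.5) -> float:
    """Population left after shrinking by `shrink` once per generation."""
    return start * (1 - shrink) ** generations


# Assumed starting population (roughly today's world population).
START = 8_000_000_000

# Print the population after 0 to 5 generations of halving.
for gen in range(6):
    print(gen, f"{population_after(START, gen):,.0f}")
```

    With those assumptions, five halvings (about 125 years at ~25 years per generation) leave 250 million people.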

    But until then we have mainly one problem: our economic system is based on infinite growth, which can’t work. So again there are two main solutions: either we bring in people from other countries, who benefit from a higher standard of living here while supporting our economic system, or we get rid of the real parasites and freeloaders in our societies: the ultra rich. And again, for some reason we really don’t want to get rid of the rich.