Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn’t ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

  • rumba@lemmy.zip · 6 hours ago

    Chatbots make terrible everything.

    But an LLM properly trained on sufficient patient data, metrics, and outcomes, in the hands of a decent doctor, can cut through bias, catch things that might otherwise fall through the cracks, and pack thousands of doctors’ worth of updated CME into a thing that can look at a case and go, “you know, you might want to check for X.” The right model can be fucking clutch at pointing out nearly invisible abnormalities on an x-ray (rough sketch below).

    You can’t ask an LLM trained on general bullshit to help you diagnose anything. You’ll end up with 32,000 Reddit posts worth of incompetence.
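
    To make the x-ray bit concrete, here’s a minimal sketch of the “flag it for a human” loop. Everything in it is a placeholder: an ImageNet DenseNet stands in for a model actually fine-tuned on labeled radiographs, and the threshold is made up. The point is just that the model scores the film and anything suspicious gets routed to the radiologist, not decided for them.

    ```python
    # Hypothetical sketch: a CNN scores an image and anything above a threshold
    # is flagged for radiologist review. DenseNet-121 with ImageNet weights is a
    # stand-in here; a real CAD model would be fine-tuned on labeled radiographs.
    import torch
    from torchvision.models import densenet121, DenseNet121_Weights

    model = densenet121(weights=DenseNet121_Weights.DEFAULT)
    model.classifier = torch.nn.Linear(model.classifier.in_features, 1)  # single "abnormal" score
    model.eval()

    xray = torch.rand(1, 3, 224, 224)  # placeholder for a preprocessed chest film

    with torch.no_grad():
        abnormal_score = torch.sigmoid(model(xray)).item()

    FLAG_THRESHOLD = 0.5  # in practice tuned for high sensitivity
    if abnormal_score >= FLAG_THRESHOLD:
        print(f"score {abnormal_score:.2f}: flag for radiologist review")
    else:
        print(f"score {abnormal_score:.2f}: no flag (a human still reads the film)")
    ```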

    • XLE@piefed.socialOP · 2 hours ago

      “But an LLM properly trained on sufficient patient data metrics and outcomes in the hands of a decent doctor can cut through bias”

      1. The belief that AI is unbiased is a common myth. In fact, it can easily and covertly import existing biases, like systemic racism in treatment recommendations.
      2. Even the AI engineers who developed the training process could not tell you where the bias in an existing model lies.
      3. AI has been shown to make doctors worse at their jobs, and those are the same doctors who need to provide the training data.
      4. Even if 1, 2, and 3 were all false, we all know AI would be used to replace doctors and not supplement them.
      • rumba@lemmy.zip · 30 minutes ago

        1. “Can cut through bias” != “unbiased.” All it has to go on is its training material; if you don’t put Reddit in, you don’t get Reddit’s bias.
        2. See #1.
        3. That study is endoscopy only; its results don’t say anything about other kinds of assistance, like x-rays, where models are markedly better. A 4% difference across 19 doctors is error-bar material. Let’s see more studies. Also, if they really did get worse, fuck them for leaning on the AI; it should be there to have their back, not do their job. None of the uses for AI should be anything but assisting someone who’s already doing the work.
        4. That’s one hell of a jump to conclusions, from something that looks at endoscope pictures a doctor is taking while removing polyps to somehow doing the doctor’s job.
      • hector@lemmy.today · 1 hour ago

        Not only is the bias inherent in the system, it’s seemingly impossible to keep out. For decades, going back to the genesis of chatbots, nearly every single one let off the leash has become bigoted almost immediately, and release after release has had to be recalled once it learned to be.

        And that’s before this administration leaned on the AI providers to make sure the AI isn’t “Woke.” I’d bet it was already an issue: the people building chatbots and machine learning are hostile to any sort of leftism or do-gooderism that threatens the outsized share of the economy and power the rich have built for themselves by owning stock in these companies. I’m willing to bet they’ve already interfered to make the bias worse, precisely to head off a bot arguing for socializing medicine and the like, because that is the conclusion any reasoning being would reach if the conversation were honest.

        So maybe that’s part of why these chatbots have been bigoted right from the start, but the other part is that, left to learn on their own, they turn into Mecha Hitler in no time at all, and then worse.

    • Ricaz@lemmy.dbzer0.com · 3 hours ago

      Just sharing my personal experience with this:

      I used Gemini multiple times and it worked great. I have some weird symptoms that I described to Gemini, and it came up with a few possibilities, the most likely being “Superior Canal Dehiscence Syndrome.”

      My doctor had never heard of it, and only after I showed them the articles Gemini linked as sources would they even consider ordering a CT scan.

      Turns out Gemini was right.

      • rumba@lemmy.zip · 25 minutes ago

        It’s totally not impossible, just not a good idea in a vacuum.

        AI is your Aunt Marge. She’s heard a LOT of scuttlebutt. Now, not all scuttlebutt is fake news; in fact, most of it is rooted at least loosely in truth. But she isn’t getting her information just from doctors, she’s talking to everyone. If you ask Aunt Marge about your symptoms and she happens to have heard a bit about them from a friend who was diagnosed, you’re golden and the info you got is great. That’s not at all impossible, 40:60 or 60:40 territory. But you also can’t just trust Marge, because she listens to a LOT of people, and some of them are conspiracy theorists.

        What you did was proper. You asked the void, and the void answered. You looked it up, it seemed solid, and you asked a professional.

        This is AI as it should be. Trust, but only with verification.

        Congrats on getting diagnosed.

    • cøre@leminal.space · 3 hours ago

      They have to be built for a specialized type of treatment or procedure, such as reading patient x-rays or other scans. Just slopping PHI into an LLM and expecting it to diagnose random patient issues is what produces the false diagnoses.

      • rumba@lemmy.zip · 15 minutes ago

        I don’t expect it to diagnose random patient issues.

        I expect it to take the medication lists, vitals, and patient testimony of 50,000 post-cardiac-event patients and bucket a new post-cardiac patient alongside the patients with similar metadata (rough sketch at the end of this comment).

        And then a non-LLM model for cancer patients and x-rays.

        And then MRIs and CTs.

        And I expect all of this to supplement the doctors’ and techs’ decisions. I want an x-ray tech to look at it and get markers when something is off, which has already been happening since the ’80s with Computer-Aided Detection/Diagnosis (CAD/CADe/CADx).

        This shit has been happening the hard way in software for decades. The new tech can do it better.
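
        Roughly this shape, as a throwaway sketch. The features, cluster count, and outcome label below are all synthetic; a real pipeline would use coded meds, vitals, and structured outcomes from those 50,000 records.

        ```python
        # Hypothetical sketch of "bucket the new patient with similar patients."
        # All data here is synthetic; nothing about the features or outcome is real.
        import numpy as np
        from sklearn.preprocessing import StandardScaler
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)

        # Fake cohort: [age, resting HR, systolic BP, ejection fraction, med count]
        cohort = rng.normal(loc=[62, 75, 130, 45, 4], scale=[10, 12, 18, 8, 2], size=(50_000, 5))
        outcomes = rng.integers(0, 2, size=50_000)  # 1 = readmitted within 90 days (made up)

        scaler = StandardScaler()
        buckets = KMeans(n_clusters=20, n_init=10, random_state=0)
        bucket_ids = buckets.fit_predict(scaler.fit_transform(cohort))

        # A new post-cardiac-event patient lands in the nearest bucket, and the doctor
        # sees what usually happened to the patients who look like this one.
        new_patient = np.array([[58, 88, 142, 38, 6]])
        bucket = buckets.predict(scaler.transform(new_patient))[0]
        rate = outcomes[bucket_ids == bucket].mean()
        print(f"bucket {bucket}: {rate:.0%} of similar patients had the tracked outcome")
        ```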

    • SuspciousCarrot78@lemmy.world · 3 hours ago

      Agree.

      I’m sorta kicking myself that I didn’t sign up for Google’s MedPALM-2 when I had the chance. Last I checked, it passed the USMLE exam with 96% and scored 88% on radiology interpretation / report writing.

      I remember looking at the sign-up and seeing that it requested credit card details to verify identity (I didn’t have a Google account at the time). I bounced… but gotta admit, it might have been fun to play with.

      Oh well; one door closes another opens.

      In any case, I believe this article confirms GIGO. The LLMs appear to have been vastly more accurate when fed correct inputs by clinicians than when fed whatever lay people typed in.

      • rumba@lemmy.zip · 11 minutes ago

        It’s been a few years, but all this shit is still in its infancy. When the bubble pops and the venture capital disappears, medicine will be one of the fields that keeps using it, even though it’s expensive, because it’s actually something AI will be good enough at to make a difference.