"AI" rule
SoftestSapphic@lemmy.world to 196@lemmy.world · 2 months ago
FooBarrington@lemmy.world · 1 month ago
That’s also true, though it’s important to remember that the “experts” aren’t experts in the classical sense of each specializing in a topic. Say you have a word made up of 3 tokens: each token can be routed to a different expert. It’s just a model architecture.
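To make the per-token routing concrete, here's a minimal toy sketch in plain Python. Everything in it is an assumption for illustration: the 3-token split of the word, the 4-dimensional token vectors, the hand-picked router weights, and the `route_tokens` helper are all hypothetical. Real models learn the router weights and typically route hidden states inside each layer, often to more than one expert per token.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_tokens(token_vectors, router_weights, top_k=1):
    """For each token, return the indices of the top_k experts picked by the router."""
    assignments = []
    for vec in token_vectors:
        # One router logit per expert: dot product of the token vector
        # with that expert's row of router weights.
        logits = [sum(w * x for w, x in zip(row, vec)) for row in router_weights]
        probs = softmax(logits)
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        assignments.append(ranked[:top_k])
    return assignments

# Hypothetical split of one word into 3 tokens (not a real tokenizer's output).
tokens = ["un", "believ", "able"]

# Hand-picked toy vectors and router weights (assumptions, not learned values).
token_vectors = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
router_weights = [
    [2.0, 0.0, 0.0, 0.0],  # expert 0
    [0.0, 2.0, 0.0, 0.0],  # expert 1
    [0.0, 0.0, 2.0, 0.0],  # expert 2
    [0.0, 0.0, 0.0, 2.0],  # expert 3
]

assignments = route_tokens(token_vectors, router_weights)
for tok, experts in zip(tokens, assignments):
    print(tok, "-> expert", experts[0])
```

With these made-up numbers, the three tokens of the same word each land on a different expert, which is the point: the router picks per token based on a score, not per word or per subject area.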