• 0 Posts
  • 18 Comments
Joined 3 years ago
Cake day: July 13th, 2023

  • I mean, do I want to give money to a billionaire? Not exactly.

    On the other hand, if we don’t start supporting premium software services and open source projects, the enshittification trend will get worse.

    (I know Facebook isn’t open source, but I’ve seen enough open source projects abandoned or enshittified due to lack of support from users.)

    I have to choose among my principles, and I’m strongly against user-hostile UI/UX paradigms, enshittification, and other downward quality trends in software/services.

    (I pay to support my Lemmy instance - do you?)

  • xcjs@programming.dev to Technology@lemmy.world · *Permanently Deleted*
    English · 3 upvotes, 1 downvote · 7 months ago

    That’s not how distillation works if I understand what you’re trying to explain.

    If you distill model A to a smaller model, you just get a smaller version of model A: a model with fewer parameters that is trained to approximate the same output distribution. You can’t distill Llama into DeepSeek R1.

    I’ve been able to run distillations of DeepSeek R1 up to 70B, and they’re all still censored. There is, however, a version of DeepSeek R1 “patched” with western values, called R1-1776, that will answer questions on topics censored by the Chinese government.
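
For anyone who wants the distillation point above in concrete terms, here is a minimal sketch of the classic soft-label distillation objective: the student is scored on how closely its output distribution matches the teacher's. The function names, the logits, and the temperature value are all made up for illustration; this is not the recipe any particular lab used, just the textbook idea.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: a lower value means the student imitates the
    teacher more closely on this one input.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits over a three-token vocabulary.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # ranks tokens the way the teacher does
far_student = [0.5, 1.0, 4.0]    # prefers a different token entirely

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

Training pushes this loss down, so the student drifts toward whatever the teacher would say, which fits the observation that the distilled models carry over the teacher's refusals.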