• Terrasque@infosec.pub
    3 days ago

    I know them, and have used them a bit. I even mentioned them in an earlier comment. The capabilities of OpenAI’s new model are on a different level in my experience.

    https://www.reddit.com/r/StableDiffusion/comments/1jlj8me/4o_vs_flux/ - read the comments there. That’s a community dedicated to running local diffusion models. They’re familiar with all the tricks. They’re pretty damn impressed too.

    I can’t help but feel that people here either haven’t tried the new OpenAI image model, or have never actually used any of the existing AI image generators before.

    • ZeroOne@lemmy.world
      2 days ago

      I cannot take you seriously with all those Reddit comments.

      But then why am I even surprised? You shill for proprietary AI.

      • Terrasque@infosec.pub
        2 days ago

        Ah yes, I forgot we live in a post-truth society where reality doesn’t matter and only your feelings are important. And since your feelings say AI bad, proprietary bad, and Reddit bad, you don’t have to actually think or take reality into consideration.

        • ZeroOne@lemmy.world
          2 days ago

          Truth in this case simply means your ill-informed opinions.

          And FYI, I like AIs that are fully open source.

          • Terrasque@infosec.pub
            2 days ago

            I’m sorry, but what is ill-informed or opinion about it? The fact is it can do things no other image generator can do, open source or not. It can also effortlessly do things that would require a lot of tinkering with ControlNet in ComfyUI, or even making custom LoRAs. It’s a multimodal model that handles both image and text as input and output, and does it well. All other useful image generators are diffusion-based; they don’t read a prompt in the same way, and are more about weighting patterns based on keywords than any real understanding of the prompt. That’s why they struggle with relatively simple things like “a full glass of wine” or “a horse riding an astronaut on the moon”. If I’m wrong about this, please prove me wrong. Nothing would make me happier than finding an open source model that can do what OpenAI’s new image model can do, really. I already run llama.cpp servers and ComfyUI locally, and I have my own AI server in the basement with a P40 and a 3090. Please, please prove me wrong here.

            I love open models, and I’ve been running them locally since the first LLaMA model, but that doesn’t mean I willfully ignore or pretend that what Claude, OpenAI, and Google develop doesn’t exist. Rather, I want awareness that it does exist, and I want an open source version of it.