Actually, really liked the Apple Intelligence announcement. It must be a very exciting time at Apple as they layer AI on top of the entire OS. A few of the major themes:

Step 1 Multimodal I/O. Enable text/audio/image/video capability, both read and write. These are the native human APIs, so to speak.

Step 2 Agentic. Allow all parts of the OS and apps to inter-operate via “function calling”; kernel process LLM that can schedule and coordinate work across them given user queries.

Step 3 Frictionless. Fully integrate these features in a highly frictionless, fast, “always on”, and contextual way. No going around copy-pasting information, prompt engineering, etc. Adapt the UI accordingly.

Step 4 Initiative. Don’t just perform a task given a prompt; anticipate the prompt, suggest, initiate.

Step 5 Delegation hierarchy. Move as much intelligence as you can on device (Apple Silicon very helpful and well-suited), but allow optional dispatch of work to cloud.

Step 6 Modularity. Allow the OS to access and support an entire and growing ecosystem of LLMs (e.g. ChatGPT announcement).

Step 7 Privacy. <3

We’re quickly heading into a world where you can open up your phone and just say stuff. It talks back and it knows you. And it just works. Super exciting and as a user, quite looking forward to it.

https://x.com/karpathy/status/1800242310116262150?s=46
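
For anyone wondering what Steps 2 and 5 might look like in practice, here is a minimal sketch. It is not Apple's implementation: the tool names, the `ToolCall` type, the stubbed data, and the prompt-length threshold are all made up for illustration. The idea is only that apps register functions, a coordinating model proposes calls against that registry, and a simple policy decides whether a request stays on device or goes to the cloud.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical registry: each "app" exposes functions the coordinator may call.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a name the coordinator can call."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("calendar.next_event")
def next_event() -> str:
    return "Lunch with Sam at 12:30"  # stubbed data, purely illustrative


@tool("messages.send")
def send_message(to: str, body: str) -> str:
    return f"sent to {to}: {body}"


@dataclass
class ToolCall:
    """What a model would emit instead of free text: a tool name plus arguments."""
    name: str
    args: dict


def dispatch(call: ToolCall) -> str:
    """Step 2: run a model-proposed function call against the registry."""
    if call.name not in TOOLS:
        raise KeyError(f"unknown tool: {call.name}")
    return TOOLS[call.name](**call.args)


def choose_backend(prompt: str, on_device_limit: int = 200) -> str:
    """Step 5: prefer the on-device model, fall back to the cloud for big requests.

    Prompt length is an assumed stand-in for a real complexity estimate.
    """
    return "on_device" if len(prompt) <= on_device_limit else "cloud"
```

A real system would have the model itself produce the `ToolCall` objects; here you would construct one by hand, e.g. `dispatch(ToolCall("messages.send", {"to": "Sam", "body": "running late"}))`.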

    • Z4rK@lemmy.world (OP)
      18 days ago

      He sort of invented it, so you have to think he’s commenting on the concept here, not the implementation.

      I have tried a lot of medium and small models, and there is just no good replacement for the larger ones for natural text output. And they won’t run on device.

      Still, fine-tuning smaller models can do wonders, so my guess would be that Apple Intelligence is really 20+ small, fine-tuned models that kick in based on which action you take.

      • gravitas_deficiency@sh.itjust.works
        18 days ago

        An LLM has no comprehension of what it says. It’s just a puppy that is really good at performing for treats. This will always yield nonsense a meaningful proportion of the time.

        I don’t care how statistically good your model can be under certain constraints and inputs. At the end of the day, all you’ve done is classically condition your computer.

        • Z4rK@lemmy.world (OP)
          18 days ago

          It goes a tad bit beyond classical conditioning… LLMs provide a much better semantic experience than any previous technology, and are great for relating input to meaningful content. Think of them as an improved search engine that gives you more relevant info / actions / tool suggestions etc. based on where and how you are using them.

          Here’s a great article that gives some insight into the knowledge features embedded into a larger model: https://transformer-circuits.pub/2024/scaling-monosemanticity/

          • gravitas_deficiency@sh.itjust.works
            17 days ago

            That’s great. But that’s not how it’s being marketed and sold to the public. It’s being sold as an oracle (as in crystal ball, not database). And it’s misleading and hurting people as a result.

            I’ll reiterate: An LLM has no comprehension of what it says.

            It’s a matter of engineering ethics, on multiple levels:

            • the training data in the vast majority of cases is outright stolen
            • it’s being sold as something that it’s not, and the result is causing real damage to people and society in a ton of ways we’re still discovering
            • most people deeply involved in developing LLMs, and basically all of the technical leadership, are categorically ignoring and abdicating any and all responsibility around this “magical” new system they’ve made. We’ve seen this before with social networking. We know where this road leads.

            I’m not saying the tech should be banned. That’s obviously idiotic. Neural nets can be, and are, used for tons of fascinating and excellent applications. It’s just that my staunch opinion is that LLMs are a terrible application of the tech at this stage of development, and it’s particularly terrible that OpenAI/Microsoft/etc. are aggressively foisting this technology on the public while refusing to take any ethical responsibility for it.

            • Z4rK@lemmy.world (OP)
              17 days ago

              To be honest, I’m not sure what we’re arguing about. We both seem to have a sound understanding of what an LLM is and what it is not.

              I’m not trying to defend or market LLMs, I’m just describing the usability of the current capabilities of typical LLMs.

              • gravitas_deficiency@sh.itjust.works
                17 days ago

                I’m saying that I wish that more people involved with the core development of the technology took the ethical considerations seriously, and communicated those concerns as a first-order issue when they talk about applications like this.

                It’s fascinating tech, but the way it’s being employed these days is deeply irresponsible.