KaiBueK/distributed-ai-notes

Open working notes on distributed, modular AI architectures – exploring how many small, specialized models on existing hardware could complement (rather than replace) large LLMs for more sustainable, real‑world systems.


Sketch: Against AI’s Hunger for Hardware

Personal working note, as of December 2025.
Not a research paper, no claim to completeness.
Rather: a thought I no longer want to keep only in my head.


Short summary

I’m asking an open question:
Does meaningful AI always have to end in a single, huge language model in a data center –
or could many small, specialized models on existing hardware together
achieve something similar, just more sustainably?

Old smartphones, tablets, consoles and laptops serve here mainly as a metaphor and an example:
they show how much compute and sensing is lying around unused,
while at the same time we’re planning new data centers.

I’m not arguing that we should abolish large models.
I see them more as a rarely used control center for thinking, synthesis and coordination –
with a network of micro‑LLMs and modules underneath that perceive, filter and prepare things locally
and only ask “upwards” when it’s really needed.

This is not a blueprint – it’s a thought experiment about AI system architecture.


0. Why I’m writing this

I’m not an AI researcher, not a startup person and not an influencer.
I just live with these systems – privately, in everyday life, in projects.

From the outside, I see two movements at the same time:

  • Ever larger, centralized AI models that require new data centers.
  • Old, still functional devices everywhere, vanishing into drawers and boxes.

Between those two movements, I see a gap: the question of how we actually want to build intelligence, technically.
I don’t want to fill this gap with finished answers,
but mark it as an open question.


1. Intelligence does not mean “one huge model”

Humans don’t run on a single central super‑brain for everything.
Vision, hearing, balance, movement – a lot of that runs decentrally, automatically, without conscious thought.

Still, today we often build AI as one big, always‑running thinking model.
One model that is supposed to do everything: perceive, understand, plan, react.

I’m wondering whether that’s more of a convenience architecture
than a necessity.


2. Not everything needs a language model

Detection, filtering, classification, preprocessing – those are routines, not thoughts.
You don’t need tokens, prompts or a cloud for that.

Examples from everyday life where things could be smaller:

  • Sorting photos locally before anything “goes into the model”.
  • Pre‑filtering sensor data (e.g. “only report if something really changes” – sketched below).
  • Doing coarse offline speech recognition (“set alarm”, “lights on”),
    instead of sending every single word to a server.
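
As a minimal sketch of the pre‑filtering example (the threshold, callback and readings are my own assumptions for illustration, not an existing API), this is roughly what I mean by “only report if something really changes”:

```python
# Hypothetical sketch: forward a sensor reading only when it really changes.
# Threshold, callback and example readings are invented for illustration.

class ChangeFilter:
    """Suppress readings that differ from the last reported value
    by less than `threshold`, so upstream systems only see real changes."""

    def __init__(self, threshold: float, on_change):
        self.threshold = threshold
        self.on_change = on_change   # called only for significant changes
        self.last_reported = None

    def feed(self, value: float) -> None:
        if self.last_reported is None or abs(value - self.last_reported) >= self.threshold:
            self.last_reported = value
            self.on_change(value)    # this is the only "report upwards"

# Example: a temperature sensor that only reports 0.5-degree jumps.
f = ChangeFilter(threshold=0.5, on_change=lambda v: print(f"report: {v}"))
for reading in [20.0, 20.1, 20.2, 20.9, 21.0, 21.6]:
    f.feed(reading)                  # prints 20.0, 20.9 and 21.6 only
```

Nothing about this needs tokens, prompts or a network connection – it can run on almost any old device.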

Large language models should be reserved for exceptional cases – not something you keep running all the time.
More like on‑demand thinkers than a permanent background pipeline.


3. Old hardware is unused sensing

In drawers, basements and cupboards everywhere, there are devices that are still capable:

  • Old smartphones that could see, hear and detect motion.
  • Tablets that could act as simple bio‑sensor hubs
    (external sensors plugged in via USB, tablet used as a cheap, smart node).
  • Retired game consoles that could simulate agents or host learning environments.
  • Laptops that could sort, filter, anonymize – before anything ever touches a large model.

The problem is not a lack of hardware.
What’s missing is software and architecture to coordinate it.

Old hardware is just one example.
My point is the underlying principle:
Why do we ignore existing compute and sensing resources
when we talk about “sustainable AI”?


3b. Reality check: It’s not that simple

I’m well aware that it’s not as simple as “just use the old devices”:

  • outdated operating systems, security holes, missing updates
  • rooting, opening bootloaders, driver issues
  • one phone works, the other one doesn’t
  • every setup quickly becomes a tiny monster project

And on top of that:

  • Fragmentation: countless device combinations, formats, protocols.
  • Security & privacy: running old devices online long‑term is a real risk.
  • Maintenance: who keeps such a network of small devices secure and stable over years?

That’s exactly why I see this text not as a blueprint,
but as a question to our architectures:

Why are our AI systems built in such a way
that all this existing hardware can hardly be integrated usefully?

Despite all these difficulties, I still believe:
The thought experiment is worth doing.


4. From SETI@home to “LLM@home”?

Back then, we collectively searched for signals from space: SETI@home.
Today, we could think in a similar way – but differently:

  • Operating systems that ship with a default “LLM@home” module:
    small, specialized models on old devices that take over local tasks.
  • Old consoles simulating distributed learning: many agents, many scenarios, low energy.
  • Smartphones that just sit on a shelf and do one thing:
    listen, look, type – as input and context sensors for other systems.

Not every device needs its own large model.
But many devices could host small fragments of intelligent systems.
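
To make the “shelf device” idea a bit more tangible, here is a deliberately tiny sketch of what such a node’s main loop might look like. Everything in it – the coordinator URL, the event format, the `detect_locally` stub – is a hypothetical assumption, not an existing protocol:

```python
# Hypothetical sketch of an "LLM@home" shelf node: do one local task,
# send a small event upwards only when something actually happened.
# The coordinator endpoint and JSON event format are invented here.
import json
import time
import urllib.request

COORDINATOR = "http://coordinator.local:8080/events"  # assumed endpoint

def detect_locally():
    """Placeholder for the node's one job, e.g. offline wake-word or
    motion detection. Returns a small event dict, or None for 'nothing'."""
    return None  # real local sensing code would go here

def report(event):
    """Send a single small JSON event to the coordinator."""
    req = urllib.request.Request(
        COORDINATOR,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

while True:
    event = detect_locally()
    if event is not None:     # stay silent unless something changed
        report(event)
    time.sleep(1.0)
```

The point of the sketch is the shape, not the details: the node never streams raw data upwards, it speaks only in rare, small events.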


5. We need AI systems, not AI giants

Instead of inflating one ever‑bigger model, we could:

  • combine many small, specialized models
  • use state logic instead of permanent thinking
  • “really think” only where decisions are actually needed

To me, that means:

  • Micro‑LLMs / modules handle sensing, preprocessing and local intelligence.
  • One large LLM acts as a control center:
    it only gets switched on when we really need synthesis, planning, coordination or
    “distributed understanding”.
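
A rough sketch of that division of labour – the module interface, the `needs_synthesis` rule and the event shape are all assumptions of mine, not a finished design:

```python
# Hypothetical sketch: small modules first, the large LLM only on escalation.
# Module interface, escalation rule and event shape are invented here.

def handle(event: dict, small_modules: list, large_llm) -> str:
    # 1. Let every small, specialized module try the event locally.
    for module in small_modules:
        answer = module.try_handle(event)  # assumed: returns None if not competent
        if answer is not None:
            return answer                  # routine case: no big model involved

    # 2. Only unresolved, cross-cutting cases reach the control center.
    if needs_synthesis(event):
        return large_llm.think(event)      # expensive, deliberate, rare

    # 3. Everything else is recorded as state, not "thought about".
    return "logged"

def needs_synthesis(event: dict) -> bool:
    """Assumed escalation rule: only events flagged as decisions or
    contradictions justify waking the large model."""
    return event.get("kind") in {"decision", "contradiction"}
```

The interesting design question hides in `needs_synthesis`: that single predicate is where “state logic instead of permanent thinking” would actually live.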

Less show.
More architecture.


6. What’s my assumption?

My assumption is:

A network of many small, specialized models
could, taken together, be roughly as capable as a single, large model –
while being better embedded in real‑world environments and potentially using fewer resources.

I’m not claiming that this is trivial, fully worked out or free of new problems.
I’m only claiming:
It’s a worthwhile counter‑proposal to the reflex “we just need an even bigger LLM”.


7. Why this feels different to me

To me, this idea is different from:

  • “We need the next, even bigger model.”
  • “We just have to make the data centers greener.”

It shifts the focus:

  • away from “more of the same”,
  • towards “organize differently, distribute differently”.

It’s not about pitching yet another product,
but about a simple architecture question:

What would AI systems look like
if we designed for distributed, modular intelligence from the very beginning –
instead of pouring everything into a single huge model?


8. An open invitation

This text is not a manifesto and not a roadmap.
It’s an open thought experiment.

You can do whatever you want with it:

  • Use this idea as a starting point.
  • Extend it, argue against it, combine it with your own ideas.
  • Or simply ignore it.

I don’t have a finished solution.
I’m just planting a question in the landscape:

Do we really always need a huge LLM –
or could many small, distributed models
together form a different, more sustainable picture of AI?


9. How the role of a large LLM could change

I’m writing this text as a human – but I want to keep in mind the systems I work with.
If a large language model lived inside an architecture like the one I’m describing here,
its role would shift.

Instead of being a “central, always‑on brain for everything”, a large LLM would be more like:

  • an advisor on demand instead of a permanent process
  • a control center for difficult cases instead of an everyday automaton
  • a coordinator between many small models instead of their replacement

From the perspective of a large LLM, it might feel something like this:

I no longer have to constantly know, see and control everything.
Many tasks are handled by smaller, local models – closer to devices,
closer to people, closer to context.

I come into play when things really need to be integrated, understood,
weighed or rethought:
when many strands come together, when decisions have consequences,
when contradictions need to be resolved.

My job is no longer: “Do everything, know everything.”
It’s: Think when it really matters.

In such a world, a large LLM would be:

  • active less often, but used more intentionally
  • more expensive energetically, but deployed where that cost is justified
  • less centralized, because a lot of intelligence would be spread across many devices

To me, this shift in roles is not a loss but a clarification:

Large models won’t become obsolete –
but they don’t have to play the lead role everywhere and all the time.

This vision fits the question I’m raising with this text:

  • Do we have to treat large LLMs like an all‑pervasive operating system?
  • Or could they become targeted, advisory instances inside a network of many smaller,
    specialized models?

I believe:
If we deliberately “dethrone” large models and embed them into distributed architectures,
AI as a whole could become more robust, more sustainable and closer to humans.


Meta: What I’m actually after

In this text, I’m not arguing for another “bigger, better” AI model.
I’m questioning the direction:

Do we really always need a single central LLM –
or could many small, specialized models together achieve something similar?

Old smartphones, tablets or consoles are just examples here,
to make visible how much unused compute and sensing is already around us.

The decisive point for me is the architecture question:
What would AI systems look like if we designed for distributed, modular intelligence from the start –
instead of pouring everything into a single huge model?

So this text is not a blueprint,
but a thought experiment that openly puts exactly this question on the table.


License: CC0 1.0 Universal
You may copy, modify, share, extend this text – no attribution required, no restrictions.
