Neural Architect — Junghwan Park

The project inverts the usual premise of deep learning. Backpropagation thrives when neurons are abundant because gradients can exploit a rich, redundant search space. This work asks the opposite question: when the neuron budget is brutally small, can an LLM use its semantic understanding of the data to design each neuron's role deliberately and match or beat gradient descent? The benchmark task is MNIST in PyTorch, and the governing metric is a Neuron Efficiency Score (accuracy divided by hidden-neuron count), with tiers from ultra-minimal (4–8 neurons) up to efficient (33–64 neurons).
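To make the metric concrete, here is a minimal sketch of the score as described (accuracy divided by hidden-neuron count). Expressing accuracy in percent and the function name `neuron_efficiency` are assumptions for illustration; the figures are the backprop reference points quoted in this write-up.

```python
def neuron_efficiency(accuracy_pct: float, hidden_neurons: int) -> float:
    """Accuracy (assumed to be in percent) divided by the hidden-neuron count."""
    if hidden_neurons <= 0:
        raise ValueError("hidden_neurons must be positive")
    return accuracy_pct / hidden_neurons


# Backprop reference sweep from this write-up: hidden neurons -> test accuracy (%).
baseline = {4: 82.1, 8: 92.5, 16: 95.6, 128: 97.8}

scores = {n: neuron_efficiency(acc, n) for n, acc in baseline.items()}
for n, score in scores.items():
    print(f"{n:>3} neurons: {score:.3f} accuracy points per neuron")
```

Because accuracy saturates while the denominator keeps growing, the score falls quickly with width, which is why the smallest tiers are the interesting ones under this metric.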

The architecture puts a frontier model in the role of "Neural Architect", talking to a tool server. Data tools let it inspect class distributions, render samples as ASCII art, compute class-similarity matrices, and run PCA; architecture tools let it construct models layer by layer with an explicit, natural-language role annotation per neuron; training and evaluation tools run the model and return accuracy, loss, and per-class breakdowns. Crucially, every tool call, architecture decision, neuron role, weight modification, and evaluation is auto-logged to a SQLite database, so the entire design process is reproducible and self-documenting. This is the explainable training the project set out to produce.
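One way such automatic logging could work is a decorator that wraps each tool and records its arguments and result. The sketch below is an assumption-laden illustration, not the project's actual schema: the `tool_calls` table, its columns, the `logged` decorator, and the `annotate_neuron` tool are all hypothetical names.

```python
import json
import sqlite3
import time


def open_log(path=":memory:"):
    """Open (or create) a SQLite log with a hypothetical tool_calls table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tool_calls (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts REAL NOT NULL,
               tool TEXT NOT NULL,
               args_json TEXT NOT NULL,
               result_json TEXT NOT NULL
           )"""
    )
    return conn


def logged(conn, tool_name):
    """Decorator: run the tool, then record its keyword arguments and result."""
    def wrap(fn):
        def inner(**kwargs):
            result = fn(**kwargs)
            conn.execute(
                "INSERT INTO tool_calls (ts, tool, args_json, result_json) "
                "VALUES (?, ?, ?, ?)",
                (time.time(), tool_name, json.dumps(kwargs), json.dumps(result)),
            )
            conn.commit()
            return result
        return inner
    return wrap


conn = open_log()


@logged(conn, "annotate_neuron")
def annotate_neuron(layer, index, role):
    # Hypothetical architecture tool: attach a natural-language role to a neuron.
    return {"layer": layer, "index": index, "role": role}


annotate_neuron(layer=1, index=0, role="responds to a closed loop in the upper half")
rows = conn.execute("SELECT tool, args_json, result_json FROM tool_calls").fetchall()
```

Logging at the tool boundary means the model cannot act without leaving a record, so the design history can be replayed from the database alone.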

The empirical findings are concrete and grounded in the experiment database. A reference backprop sweep establishes the ceiling per budget: 4 neurons give 82.1%, 8 give 92.5%, 16 give 95.6%, and 128 give 97.8%. Against that, the best gradient-free 8-neuron model the agent found reached 92.31% test accuracy, within 0.2 points of the 8-neuron backprop baseline and achieved without any gradient descent. A closed-form pipeline (LDA-derived first-layer directions plus a least-squares readout) reliably produced 82 to 86% with 8 to 9 neurons, and the agent scaled the same gradient-free recipe up to roughly 90 to 93% at 96 to 512 hidden units.
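The closed-form pipeline can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's code: the tanh activation, the unit-variance scaling of each neuron, the small ridge term, and the synthetic three-class data (standing in for MNIST) are all choices made here. Note that LDA yields at most one fewer informative direction than there are classes, so ten digit classes give at most 9, which is consistent with the 8 to 9 neuron range above.

```python
import numpy as np


def lda_directions(X, y, n_dirs, ridge=1e-6):
    """Top discriminant directions of the problem Sb w = lambda Sw w."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    # Ridge keeps Sw positive definite (e.g. for constant input features).
    Sw = Sw / len(X) + ridge * np.eye(d)
    Sb = Sb / len(X)
    # Whiten with the Cholesky factor of Sw to get a symmetric eigenproblem.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    _, evecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)  # ascending eigenvalues
    top = evecs[:, ::-1][:, :n_dirs]                 # largest eigenvalues first
    return L_inv.T @ top                             # back to input coordinates


def hidden(X, W1, b1):
    """Hidden activations plus a constant column for the readout bias."""
    H = np.tanh(X @ W1 + b1)  # tanh is an assumption of this sketch
    return np.hstack([H, np.ones((len(X), 1))])


def fit_closed_form(X, y, n_hidden):
    """First layer from LDA, readout from least squares; no gradient steps."""
    W1 = lda_directions(X, y, n_hidden)
    W1 = W1 / (X @ W1).std(axis=0)        # unit-variance pre-activations
    b1 = -X.mean(axis=0) @ W1             # center each neuron on the data mean
    Y = np.eye(int(y.max()) + 1)[y]       # one-hot targets
    W2, *_ = np.linalg.lstsq(hidden(X, W1, b1), Y, rcond=None)
    return W1, b1, W2


def predict(X, W1, b1, W2):
    return (hidden(X, W1, b1) @ W2).argmax(axis=1)


# Synthetic stand-in for MNIST: three well-separated Gaussian classes in 10-D.
rng = np.random.default_rng(0)
n_per, d, k = 200, 10, 3
means = 6.0 * np.eye(d)[:k]
X = np.vstack([rng.normal(means[c], 1.0, size=(n_per, d)) for c in range(k)])
y = np.repeat(np.arange(k), n_per)

W1, b1, W2 = fit_closed_form(X, y, n_hidden=k - 1)
train_acc = (predict(X, W1, b1, W2) == y).mean()
print(f"hidden neurons: {W1.shape[1]}, training accuracy: {train_acc:.3f}")
```

Both stages are solved directly (an eigendecomposition and a linear least-squares fit), which is what makes the recipe gradient-free. The accuracy printed here is training accuracy on toy data and says nothing about the MNIST figures reported above.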

Honest context: the most efficient solutions came not from hand-placing individual weights one at a time but from the LLM choosing principled, closed-form constructions (LDA plus least squares, prototype and data-patch features). That is an instructive result about where reasoning-driven design actually pays off. The scope is a single dataset (MNIST) and small fully-connected nets, and the gradient-free models match rather than dominate the backprop baseline at comparable size. Because this is a private research repo, its value lies in the demonstrated, auditable mechanism, an LLM building competitive tiny networks through reasoning and tools, and not in any claim of universal superiority over gradient descent.