Discussion about this post

Boy Kester

Thanks for summarizing and more importantly providing your own color, Robert. But a few things are taken a little too easily for granted.

What you deem a breakthrough thought from Claude is actually exactly what is wrong with LLMs. It seems plausible and clever, but it is actually… wrong. The difference between training an LLM on the corpus of human knowledge and how humans learn is that LLM training is a one-off pattern and sense-making exercise, based on whatever it is fed. That is not how you, I, or Marks synthesize data. We do it in real time, with learning and adaptation. Philosophically, an LLM is stateless. It will not learn or adapt after its training. You can spend an entire conversation teaching an LLM that your specific industry uses a term in a non-standard way, and it will adapt within that session. Open a new window the next day and it has forgotten everything. It has not learned. It has no continuity. That is what statelessness means, and that is what separates it from how Marks absorbed lessons from Graham and Buffett over decades of practice.

Another problem with LLMs is temperature sampling: the setting that controls how much ‘creativity’, meaning random variation, the model is allowed in its answers. In practice this means that the same prompt in a different conversation can produce different outcomes. You can test this yourself by opening ten chat windows and feeding ChatGPT the same riddle ten times. You will see differences. And even if it gets it right 8 times out of 10, that means it was wrong 20% of the time.
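To make the temperature point concrete, here is a minimal sketch of temperature sampling over a toy vocabulary. The token names and logit values are made up for illustration, and real models sample from tens of thousands of tokens with additional filters, so treat this as a picture of the mechanism rather than any vendor's implementation.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature, rng):
    """Draw one token at random according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidate answers and scores, chosen only for illustration.
tokens = ["answer_a", "answer_b", "answer_c"]
logits = [2.0, 1.0, 0.5]

rng = random.Random()
for temperature in (0.2, 1.0):
    draws = [sample_token(tokens, logits, temperature, rng) for _ in range(10)]
    print(temperature, draws)
```

At the lower temperature nearly every draw is the top-scoring answer; at 1.0 the other answers show up regularly, which is the run-to-run variation described above.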

This brings me to point two. LLMs are technically not suited for the kind of agentic Level 3 application that Marks describes. The transformer architecture is fundamentally a next-token predictor. It does not plan, verify, or course-correct. When you chain multiple steps together in an agentic workflow, each step carries a probability of error, and those errors compound multiplicatively. A 95% accuracy rate per step across twenty chained steps gives you roughly a 36% chance of getting the whole sequence right (0.95^20 ≈ 0.36, assuming the errors are independent). That is not a labor substitute. That is a liability. Autonomous agents need to know when they are wrong, and LLMs do not know what they know. They generate plausible outputs with equal confidence whether the output is correct or fabricated. In a chat window that produces a useful paragraph, hallucination is an inconvenience. In an autonomous agent making consequential decisions without human review, hallucination is a structural failure mode. Marks conflates the impressive fluency of the interface with the reliability of the underlying architecture. Level 3 requires persistent state, verifiable reasoning, and real-time error correction. Current LLMs offer none of these.
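The compounding arithmetic is easy to check. A minimal sketch, assuming every step succeeds independently with the same probability (a simplification, since real agent steps are neither independent nor equally reliable):

```python
def chain_success_probability(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain succeeds, assuming independent steps."""
    return per_step_accuracy ** steps

for steps in (1, 5, 10, 20):
    p = chain_success_probability(0.95, steps)
    print(f"{steps:>2} steps at 95% each: {p:.1%} chance the whole chain is right")
```

For twenty steps this comes out just under 36%, which is where the figure above comes from.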

The third point is the infrastructure buildout, which is treated rather simplistically. You refer to railroads. Those were indeed economically viable for years after installation. The problem is that data centers, and more specifically the GPUs inside them, have a technical lifespan of maybe two or three years. Nvidia is releasing new generations on an eighteen- to twenty-four-month cycle, and performance improves three- to five-fold from one generation to the next. Meanwhile, hyperscalers are depreciating these assets over five- or six-year spans. The problem this creates is serious. If I buy this generation and my competitor buys the next, my data center is suddenly far less viable, filled with assets I am still depreciating for another three years. Railroads do not become obsolete every two years. GPUs do. So we’re looking at investment patterns that resemble infrastructure projects like roads and bridges, with the unit economics of an iPhone.
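The mismatch can be sketched with straight-line depreciation. The purchase cost of 100 is a placeholder, and the six-year schedule and three-year technical life are taken from the upper end of the ranges above; actual accounting schedules vary by company.

```python
def straight_line_book_value(cost: float, schedule_years: float, age_years: float) -> float:
    """Remaining book value under straight-line depreciation with zero salvage value."""
    remaining_fraction = max(0.0, 1.0 - age_years / schedule_years)
    return cost * remaining_fraction

cost = 100.0            # placeholder purchase cost, illustration only
schedule_years = 6      # accounting schedule, upper end of the range above
technical_life = 3      # technical lifespan, upper end of the range above

still_on_books = straight_line_book_value(cost, schedule_years, technical_life)
print(f"Book value at end of technical life: {still_on_books:.0f} of {cost:.0f}")
```

Under these assumptions, half of the purchase cost is still on the books at the point where the hardware is technically outdated.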

Brian West

I think Marks is seduced by the charms of Claude. The sceptic vs proponent argument is basically “it’s statistics” vs “does it matter if the output is good”. Intelligence evolves from an evolutionary need; reasoning and intelligence evolve in the context of stakes, incentives, caring, and suffering (aka survival). The statistics have none of those things.
