Hugging Face Is the New GitHub for LLMs
Hugging Face is the new GitHub for LLMs. These are the famous words of Andrej Karpathy. Software has undergone a rapid transformation in the last few years. And now, English is the new programming language.

Introduction
Recently, I came across Andrej Karpathy's keynote speech, which he delivered on June 17, 2025, at Y Combinator's AI Startup School in San Francisco. I was amazed at how clearly Andrej showed the direction Software is headed in the age of AI. I found his ideas so compelling that I thought it would be worthwhile to summarize them here.
Journey from Software 1.0 to 3.0
Andrej started the talk by drawing our attention to the fact that Software is changing, and changing rapidly. For seventy years, Software and programming on a computer did not change much at a fundamental level (Software 1.0); then, in the last few years, they underwent two rapid transformations (Software 2.0 and Software 3.0). The good news is that this fundamental shift means there is a huge amount of software to be rewritten. In this blog post, I explore Andrej's assertion about what this shift means for Software developers and what opportunities lie ahead.
Software 1.0
Looking back, the traditional way of developing Software that we are all familiar with is where we, the programmers, are the brains. And we determine all the rules that the program should follow ahead of time. Then, we embark on imperative programming, writing step-by-step instructions for the computer to get specific answers when we provide input data to our programs. There are obvious limitations to this approach, as the program will encounter roadblocks when new situations arise for which the rules have not already been thought out and coded.

Software 2.0
Neural Networks allowed us to arrive at Software 2.0, where the paradigm of coding the business rules into a program is flipped on its head. In Software 2.0, you do not provide the rules; instead, you provide as much input data as possible, along with the answers if available, and let the computer derive the rules on its own. This new way of programming lets the computer arrive at the correct rules by starting with wild guesses, making giant mistakes, slowly correcting those mistakes, and repeating the process until the answers are acceptable; in doing so, it learns the rules that produce the correct answer. Instead of code a programmer writes, Software 2.0 arrives at the weights of a neural network. To enable a computer to achieve this, a programmer still needs to write a traditional program, just as in Software 1.0; however, this program is much smaller. It only defines the architecture of the Neural Network and a mechanism that allows the network to correct its mistakes (backpropagation), transitioning from wild guesses to weights that produce an acceptable outcome.
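The Software 2.0 loop described above can be sketched in a few lines. This is a toy illustration (not Karpathy's code): instead of hand-coding the rule y = 2x + 1, we give the computer examples and let gradient descent discover the "weights" w and b:

```python
# Software 2.0 in miniature: learn the rule y = 2x + 1 from examples
# instead of coding it. One "neuron", squared-error loss, gradient descent.

data = [(x, 2 * x + 1) for x in range(-5, 6)]  # inputs with known answers

w, b = 0.0, 0.0   # start with a "wild guess"
lr = 0.01         # learning rate: how much to correct each mistake

for step in range(2000):             # repeat: guess, measure error, correct
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y        # how wrong the current guess is
        grad_w += 2 * err * x / len(data)
        grad_b += 2 * err / len(data)
    w -= lr * grad_w                 # nudge the weights toward fewer mistakes
    b -= lr * grad_b

print(round(w, 2), round(b, 2))      # the learned "weights", ~2.0 and ~1.0
```

The program never contains the rule itself; the rule emerges in `w` and `b` by fitting to the data, which is exactly the flip Software 2.0 represents.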

Software 3.0
In recent years, there has been another transformation, into Software 3.0. This is where the same Neural Network, along with its weights, can be programmed to produce results for different use cases. The Neural Network has already been trained on a vast amount of data and has the capability of understanding different types of requests. Interestingly, the language used to program the LLM in this manner is English, and the programs are presented in the form of prompts that anyone who knows English can provide to the LLM.
Software 3.0 thus introduces a new programming paradigm: programs are prompts given to the LLM, and prompts are written in natural language, English. So anyone who knows English can program in Software 3.0.
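To make the "prompt as program" idea concrete, here is a small sketch of assembling such an English program from a template. The helper name and structure are illustrative, not from any real library; the resulting string is what would be sent to an LLM:

```python
# A "Software 3.0 program" is just structured English. This builds a
# few-shot prompt: task description, worked examples, then the new input.

def build_prompt(task, examples, query):
    """Assemble a few-shot English 'program' for an LLM."""
    lines = [f"Task: {task}", ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")          # the LLM completes from here
    return "\n".join(lines)

prompt = build_prompt(
    task="Classify the sentiment of each movie review as positive or negative.",
    examples=[("I loved every minute.", "positive"),
              ("A complete waste of time.", "negative")],
    query="The acting was superb.",
)
print(prompt)
```

Note that nothing here is machine code or even conventional logic; the "program" is readable by anyone who knows English, which is precisely the point.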

Andrej asserts, and I concur, that as a Software Engineer, one has to be fluent in all three Software development paradigms. Because a real-world problem will often require traditional programming, it may also warrant training a Neural Network on custom data (although the need for that is diminishing), and it will most definitely require prompt engineering utilizing well-trained large language models.
One interesting thing that Andrej Karpathy mentioned and that drew my attention was that he equated Hugging Face with GitHub for Software 2.0. Where thousands of programmers use GitHub to commit their projects, but more importantly, other programmers might fork a repository and modify it slightly to produce something different. This can also be done on Hugging Face with LLMs. Different labs and individuals release their LLMs (the trained weights, along with the architecture) on Hugging Face. Now, with techniques like LoRA or Transfer Learning, anyone can tune an LLM for their specific needs and create something akin to a fork or branch of that LLM.
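The LoRA idea mentioned above can be shown in miniature. Rather than fine-tuning the full weight matrix W of a layer, LoRA trains two small low-rank matrices A and B and uses W + BA; the "fork" of the model only needs to store A and B. This pure-Python sketch uses tiny made-up dimensions; real LoRA operates on transformer layers via libraries such as Hugging Face's PEFT:

```python
# LoRA in miniature: adapt a frozen weight matrix W with a low-rank
# update B @ A, where B is (m x r) and A is (r x n) with small rank r.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

m, n, r = 4, 4, 1                       # full layer is 4x4, rank r = 1
W = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(m)]  # frozen base

B = [[0.1] for _ in range(m)]           # m x r  (trainable, tiny)
A = [[0.2] * n]                         # r x n  (trainable, tiny)

delta = matmul(B, A)                    # m x n low-rank update
W_adapted = [[W[i][j] + delta[i][j] for j in range(n)] for i in range(m)]

# The "fork" stores only A and B: r*(m+n) numbers instead of m*n.
print(f"trainable params: {r * (m + n)} vs full: {m * n}")
```

Even in this toy, the adapted model shares almost everything with the base weights, which is why tuned variants on Hugging Face feel like branches of the original rather than whole new models.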

LLMs in the age of Software 3.0
In the last few years, LLMs have sprung up like mushrooms after a spring rain—each promising something new, yet all growing from the same fertile soil of transformer architectures and massive datasets. To fold LLMs into the new paradigm for Software development, Andrej presented how to think about LLMs in the age of AI.
LLMs have properties of Utilities
"AI is the new Electricity" - Andrew Ng
In 2017, Andrew Ng, a noted AI scientist, founder of Coursera, and a professor at Stanford University, referred to AI as the new Electricity. He observed that just as electricity transformed almost everything 100 years ago, AI is poised to have a similar impact. Drawing inspiration from this profound statement, Andrej takes this analogy a step further and observes that if AI is the new electricity, LLMs have properties akin to utilities.
If we consider the leading frontier LLMs, such as GPT-4, Claude 3.5 Sonnet, and Gemini, they were all trained by large LLM labs, companies with substantial resources, including OpenAI, Anthropic, and Google. These "LLM Labs" have incurred a significant amount of Capital Expenditure on training these LLMs.
And then they bear significant operational costs to bring the LLMs to us via API access. In turn, we get "metered" access to the LLMs, just as we do for utilities: the labs producing this new AI electricity via their LLM grids charge us, for example, a few cents per million tokens.
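The metering works much like a utility bill. Here is a small sketch of estimating a request's cost; the prices are made-up placeholders, not any provider's actual rates:

```python
# "Metered" LLM access: providers bill per million tokens, typically with
# different rates for input (prompt) and output (completion) tokens.
# The prices below are hypothetical placeholders.

PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}  # USD per 1M tokens

def request_cost(input_tokens, output_tokens):
    return (input_tokens / 1e6 * PRICE_PER_MTOK["input"]
            + output_tokens / 1e6 * PRICE_PER_MTOK["output"])

# e.g. a 2,000-token prompt with an 800-token reply:
cost = request_cost(2000, 800)
print(f"${cost:.4f}")   # $0.0180
```

As with electricity, the per-request cost is tiny, but at scale the meter keeps running, which is exactly why switching providers matters.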
We have similar demands for LLMs as we do for utilities, including low latency, high uptime, and consistent quality. Additionally, we can switch between AI providers by switching from one LLM to another. Platforms like OpenRouter make it seamless to switch between LLMs from different providers.
And if these LLMs go down for some reason, the world will have an intelligence brownout, just like we do during a thunderstorm when, for example, a transformer blows up.
LLMs have properties of fabs
Andrej further observes that LLMs have properties of fabs due to the significant CapEx required to train a model. Companies like Google are fabricating TPUs (Tensor Processing Units, custom ASICs designed to accelerate machine learning workloads), while others leverage Nvidia GPUs through cloud providers such as Azure or AWS.
LLMs Analogous to the Operating Systems of the 1960s
However, he ultimately believes that LLMs most resemble the operating systems of the 1960s, with their increasingly complex ecosystem. The AI ecosystem is shaping up in a similar way: there are a few closed-source providers with proprietary operating systems, such as Windows and macOS, and there are open-source alternatives, such as Linux. In the world of LLMs, there are likewise closed frontier models from OpenAI, Anthropic, and Google, alongside the LLaMA ecosystem, an open-source family of models that may evolve into something similar to Linux.
The Neural Network infrastructure (for example, the hidden layers and number of nodes) is akin to a CPU. The context window functions like program memory, and the LLM acts like the operating system running on the Neural Network, orchestrating memory and compute for problem-solving.
And since a massive amount of CapEx is required to train an LLM, compute is centralized with a handful of providers, much like in the 1960s, when the operating system ran on a central computer and users accessed it through dumb terminals in a time-sharing, batched manner. Similarly, today's LLMs are hosted by the providers, and we access them remotely through thin clients, time-sharing the LLMs.
But then came the personal computers, and the nature of OS shifted from centralized to everyone's desktops. The personal computing revolution has not yet occurred for AI, mainly because it is not cost-effective. However, some people are trying, and Mac Minis appear to be a good fit for this, particularly for batch 1 inference, which is memory-bound and works well on Mac Minis.

One more analogy that Andrej noted regarding LLMs' similarity to the OS circa 1960 is the current interface to the LLMs. Most LLMs offer a chat interface for interaction: the chat interface takes a prompt, and the LLM generates a response. This is akin to using a terminal to interact with the OS in the olden days. A general-purpose GUI for LLMs has not yet been developed, presenting an opportunity for developers to enter the industry and provide easier, application-specific interfaces to LLMs. The good news is that the LLMs are readily available to us, and it is our time to enter this industry and program these computers.

Psychology of LLMs
Next, Andrej turned to explaining the psychology of the current LLMs. This was an excellent observation because it highlighted the strengths of these LLMs while also exposing their deficits, thereby presenting an opportunity for developers to work around these deficits and make the LLMs more useful.
The first observation he made was that these LLMs have a people's spirit. And rightly so: after all, LLMs are trained on a vast amount of text produced by people, which in turn makes them stochastic simulations of the human spirit. This simulator is an autoregressive Transformer that processes sequential data in parallel, utilizing self-attention. Because it has derived its weights by fitting to text from the internet, text produced by humans, it develops a human-like spirit. At the same time, LLMs possess encyclopedic knowledge, having been trained on a vast amount of data, and they can remember many things.
They also have a bunch of cognitive deficits. They hallucinate, and they make up stuff. They lack an internal model of self-knowledge, and they exhibit jagged intelligence, as evidenced by mistakes that no human would make, such as counting the number of letters in a sentence.
Most importantly, LLMs suffer from anterograde amnesia. They are stuck at the time when they were trained and derived their weights, and they fail to learn new things as they work with you; there is no continuous learning. A human coworker, by contrast, improves over time, learning about your organization, discovering new things, and making daily improvements. In short, a human coworker accumulates a tremendous amount of context about the organization over time.
LLMs work on tokens, chunks of text that the LLM uses for processing and understanding language. The amount of text a model can consider at once when generating a response is its Context Window. Once the LLM generates a response, the context window is cleared. To maintain context, the chat interface sends the entire history of messages and responses with each new message in a session. It appears that the LLM remembers your conversation from the start of the session; in reality, it processes the entire conversation from inception every time you send a new message. The context window expands throughout the session as you chat. However, when you start a new chat, the context window is wiped clean, and the LLM has learned nothing from your previous conversation, which is what Andrej refers to as anterograde amnesia.
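The bookkeeping that creates this illusion of memory can be sketched in a few lines. Here, `fake_llm` is a placeholder standing in for a real provider API call; the point is that the entire transcript goes out on every turn:

```python
# Why the model "remembers" within a session: the client re-sends the
# whole transcript each turn. The model itself keeps no state between calls.

def fake_llm(messages):
    # placeholder: a real call would send `messages` to a provider's API
    return f"(reply #{sum(1 for m in messages if m['role'] == 'user')})"

history = []

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)        # the ENTIRE history goes out each turn
    history.append({"role": "assistant", "content": reply})
    return reply

chat("Hello!")
chat("What did I just say?")
n_messages = len(history)
print(n_messages)                    # 4: context grows within the session

history.clear()                      # a new chat: everything "learned" is gone
```

Each turn therefore costs more tokens than the last, and starting a fresh chat resets the model to its trained weights: anterograde amnesia, implemented as a list that gets cleared.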
There are several other deficits. LLMs are gullible; they are susceptible to prompt injection and may leak your data.
To put it simply, although LLMs possess superhuman capabilities, they also suffer from numerous deficits, which present opportunities for us to work efficiently around those deficits.
Opportunities
That brings us to the shining light of opportunities we have going forward. All or most of Software 1.0 will get transformed into a combination of traditional code, trained custom neural networks, and integration with frontier or open-source models. There is a tremendous amount of work to be done. Let's explore some of the ideas Andrej presented.
Partial Autonomy Apps
Partial autonomy apps are applications that integrate with AI, but humans remain in control. These applications have an interface that allows humans to perform their work in a traditional manner, but then utilize either a custom-trained Neural Network or integrate with Frontier or open-source models to enhance productivity and automate mundane tasks. An example of such an application is Cursor AI. This AI-powered code editor aims to boost developer productivity by integrating AI directly into the coding workflow, while maintaining the traditional interface, allowing programmers to write code from scratch if they choose to do so.
Some of the properties of such Partial Autonomy Apps must be highlighted.
- They do a ton of context management. They maintain the history of interaction with the AI and provide a seamless interface, ensuring that previous responses in a session influence the future outcome.
- They orchestrate and call multiple models. For example, under the hood, Cursor has embedding models for your files, chat models to interact with LLMs, models for applying diffs to a file, etc., and this is all orchestrated for you.
- A significant one that is not fully appreciated is the Application-Specific GUI. Text is a cumbersome and challenging way of dealing with an LLM. It is hard to read text, but if an application-specific GUI provides visual assistance, it is easy to work with. GUI allows humans to audit the work of these fallible systems and work more efficiently.
- Autonomy Slider. How much control do you want to give the LLM? You should have control over that.
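The autonomy slider in the last bullet can be expressed as code. This is one illustrative sketch of gating how much the AI may do on its own; the level names are invented for the example, not from Cursor's actual product:

```python
# An "autonomy slider" as a simple policy check: each action has a level,
# and the user chooses the maximum level the AI may act at unsupervised.

AUTONOMY_LEVELS = ["suggest", "edit_file", "edit_repo", "full_agent"]

def allowed(action_level, slider):
    """An action is allowed only up to the user's chosen autonomy level."""
    return AUTONOMY_LEVELS.index(action_level) <= AUTONOMY_LEVELS.index(slider)

# With the slider at "edit_file", single-file edits run on their own,
# but repo-wide changes still require a human in the loop:
print(allowed("edit_file", "edit_file"))   # True
print(allowed("edit_repo", "edit_file"))   # False
```

The design point is that autonomy is a user-controlled dial, not a binary: the same app can behave as a humble autocomplete or a full agent depending on where the slider sits.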
Exciting times ahead
It is an exciting time because many software systems will become partially autonomous. This presents an enormous opportunity for software engineers. Let us explore some of these.
Integration with LLMs
Since LLMs are set to be an integral part of partially autonomous applications, programmers must ensure seamless integration with these LLMs. Can an LLM see all that a human sees in the application? Can an LLM act in the way a human can act, like clicking a link? Can LLMs be supervised so that a human stays in the loop, since these systems are fallible and humans have to validate their work? All these elements of a traditional application need to be adapted: the GUI controls designed for humans, such as switches and clickable buttons, must be converted so that the same information can be shared with an LLM.
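One way to let an LLM "see what a human sees" is to export GUI state as plain text. The widget schema below is invented purely for illustration:

```python
# Exposing GUI state to an LLM: serialize the widgets a human sees into
# a textual description the model can reason about. (Hypothetical schema.)

ui_state = {
    "buttons": [{"id": "submit", "label": "Submit order", "enabled": True}],
    "toggles": [{"id": "express", "label": "Express shipping", "on": False}],
}

def describe_for_llm(state):
    lines = []
    for b in state["buttons"]:
        status = "enabled" if b["enabled"] else "disabled"
        lines.append(f'Button "{b["label"]}" (id={b["id"]}) is {status}.')
    for t in state["toggles"]:
        status = "on" if t["on"] else "off"
        lines.append(f'Toggle "{t["label"]}" (id={t["id"]}) is {status}.')
    return "\n".join(lines)

print(describe_for_llm(ui_state))
```

The same mapping can run in reverse: the LLM names a widget `id`, and the application performs the click on its behalf, keeping the human-facing GUI untouched.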
Cooperating with AI
In this new era of Software 3.0, there is a need for improved collaboration between humans and AI. LLMs can indeed generate an immense amount of information and do so very quickly. But it's imperative that the humans validate what the AI is generating. This becomes critical in light of all the deficits AI has, as discussed previously.
The opportunity lies in making the generation and validation cycle as fast as possible. A code assistant may produce a thousand lines of code in a second. But that code still needs to be validated by a human. The bottleneck is the human in this case. Any such partially autonomous app will be more successful if it has mechanisms in place to make the validation as easy and as expedient as possible for a human. Application-specific GUIs are a key component that enables this to happen.
Additionally, we must keep the AI in check by not allowing it to get too far ahead of us. We need mechanisms that prevent the AI from overwhelming humans by generating at speeds and scales that humans cannot keep up with for verification. Since these are fallible systems, we cannot blindly rely on them to do things perfectly; we have to validate and verify. So, if the generator produces a 15-page speech, a human still has to review the speech to ensure that the AI has not generated something ridiculous.
Develop best practices to keep AI in check. For example, if your prompt is vague or overly large, verification is more likely to fail, and you can end up going in circles with the LLM. Andrej emphasized the need for an Autonomy Slider, a mechanism that controls the level of freedom you want to grant to AI. Do you want it to act entirely without supervision, or do you want to verify and validate what the AI generates? The Autonomy Slider is a critical mechanism that applications in the age of Software 3.0 must have.
Building for Agents
There is a new category of consumers/manipulators of digital information. It used to be that humans would use a GUI, or another computer program, through APIs. However, we have entered the world of AI agents, and although these agents are computer-based, they exhibit human-like characteristics.
We must ensure that the systems we develop in the age of Software 3.0 can accommodate and assist agents. For example, just as a robots.txt file tells web crawlers which parts of a website they may not access, an llms.txt file could provide domain-specific information about a site directly to an LLM. This file should be in a format that LLMs can parse and understand.
Another example is documentation. Humans can read documents that have images, different fonts, or colors, but these are meaningless to an LLM. On the other hand, an LLM will understand documentation written in Markdown much better than a PDF file. Similarly, if the document is asking a user to take an action, such as clicking a link, that would be meaningless for an LLM. However, if instead of a link, the Partially Autonomous App converts all the links to equivalent cURL commands, then an LLM can take that action on your behalf.
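The link-to-curl idea above can be sketched as a small text transform. This is a toy regex rewrite, not a product feature: it finds Markdown links and replaces them with an equivalent `curl` command an agent could execute:

```python
# Making docs agent-friendly: rewrite human-oriented Markdown links
# ("click here") into curl commands an agent can actually run.
import re

def links_to_curl(markdown):
    return re.sub(
        r"\[([^\]]+)\]\((https?://[^)]+)\)",
        lambda m: f"`curl -L {m.group(2)}` ({m.group(1)})",
        markdown,
    )

doc = "To download the dataset, [click here](https://example.com/data.csv)."
print(links_to_curl(doc))
```

The human-readable label is preserved in parentheses, while the actionable part becomes a command rather than an instruction to click, which is the substitution the Partially Autonomous App would make.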
Efforts are already underway in this direction. Model Context Protocol from Anthropic is an open standard and protocol that enables AI systems, particularly LLMs, to connect with and utilize various external tools, services, and data sources. It acts as a bridge, allowing AI agents to access information and perform actions in the real world.
Vibe Coding
All this discussion brings us back to the statement Andrej made earlier, where he asserted that English is the hottest new programming language. With tools like Cursor AI, it is becoming easier to develop code in this new way. The term coined by Andrej for this kind of programming is Vibe Coding, an AI-assisted coding paradigm where you instruct AI, in English, what you want your code to do, and it produces the code. In the era of Software 3.0, this way of generating code will become more commonplace, and Andrej encourages you to try Vibe Coding. However, he also observes that generating the code is the easiest part; productionizing the code requires another level of effort. And this is where another opportunity lies for developers: to mature Vibe Coding to a level where an application can be developed, deployed, and tested using nothing more than English.
Conclusion: The Software Revolution Is Here — Step In and Build
Andrej Karpathy’s keynote isn’t just a map of where things are going — it’s an invitation to help shape the journey.
We’re standing at the dawn of a new era in computing. The rules are being rewritten, and the barrier between idea and implementation is collapsing faster than ever before. English is now a programming language. AI is your new pair programmer. Software is no longer just typed — it’s prompted, fine-tuned, and orchestrated.
This is our moment.
Whether you’re a seasoned engineer or just getting started, there’s never been a more exciting time to build. The old playbooks are fading. Programmers are writing new ones.
- Start prompting.
- Start building.
- Start collaborating with AI, not competing with it.
- Turn curiosity into code.
- Turn problems into products.
Software is no longer just something we write — it’s something we co-create. The tools are powerful. The opportunities are vast. And the future is wide open.
So let's not wait on the sidelines. Step into the arena.
Vibe code. Build boldly. Shape what’s next.
Author

Anupam Chandra (AC)
Tech strategist, AI explorer, and digital transformation architect. Driven by curiosity, powered by learning, and lit up by the beauty of simple things done exceptionally well.
Published
July 7, 2025