There is something of a move towards modularity going on, namely the agent-based approach. As I understand it, you run multiple sessions/discussions with the model and give each session a smaller, simpler job. So you might tell the first agent/session that its job is converting file types, that it should convert from XML to JSON, that it should output only the JSON, and so forth. The second agent you tell that its job is extracting data from JSON, that it should read the value of the field “tomato” and output only that value. Then you write some shoddy script which pumps your XML into the first agent, takes the response and pumps it into the second agent, and the response of that is what you’re looking for.
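Roughly, the wiring could look like this. A minimal sketch, assuming a placeholder call_agent() that stands in for whatever LLM client you actually use; the prompts and the pipeline function are just illustrations of the idea, not a real implementation.

```python
# Minimal sketch of the two-agent pipeline described above. call_agent()
# is a hypothetical stand-in for whatever LLM client you actually use.

def call_agent(system_prompt: str, user_input: str) -> str:
    """Send one request to the model and return its raw text reply."""
    raise NotImplementedError("plug in your LLM client of choice here")

CONVERTER_PROMPT = (
    "Your only job is converting file types. Convert the XML you are given "
    "to JSON and output only the JSON, nothing else."
)
EXTRACTOR_PROMPT = (
    "Your only job is extracting data from JSON. Read the value of the "
    "field 'tomato' and output only that value."
)

def pipeline(xml_text: str) -> str:
    # The "shoddy script": pump the XML into the first agent, then pump
    # its response into the second agent and return that response.
    json_text = call_agent(CONVERTER_PROMPT, xml_text)
    return call_agent(EXTRACTOR_PROMPT, json_text)
```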
And then you can individually optimize these agents for as high a success rate as possible, by repeatedly tweaking parameters, running them against all kinds of input data, and measuring whether your changes made things better or worse.
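The measuring part can be as simple as a fixed test set and a success-rate function. Again a sketch, reusing the hypothetical call_agent() from above; the exact-match check is an assumption that only makes sense for narrowly scoped agents like these.

```python
# Sketch of a tiny evaluation harness: run one agent configuration over a
# fixed set of (input, expected output) pairs and report the success rate,
# so you can tell whether a prompt or parameter tweak helped or hurt.

def success_rate(system_prompt: str, test_cases: list[tuple[str, str]]) -> float:
    hits = sum(
        1
        for user_input, expected in test_cases
        if call_agent(system_prompt, user_input).strip() == expected
    )
    return hits / len(test_cases)

# Example: compare two prompt variants on the same data, keep the better one.
# baseline  = success_rate(CONVERTER_PROMPT, test_cases)
# candidate = success_rate(CONVERTER_PROMPT + " Use compact JSON.", test_cases)
```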
Having said that, for the kind of modularity we have in open source, you effectively need 100% reliability. If you combine ten systems with 99% reliability each, you end up with 0.99^10 ≈ 0.904, i.e. roughly 90% reliability for the whole system. It is brutal. And to my knowledge, these agents don’t typically hit 99% either.
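Just to spell out the arithmetic (assuming the stages fail independently):

```python
# A chain of n stages, each succeeding with probability p, succeeds end to
# end with probability p**n (assuming independent failures).

def chain_reliability(p: float, n: int) -> float:
    return p ** n

print(chain_reliability(0.99, 10))  # ~0.904 -> ten 99% stages give ~90%
print(chain_reliability(0.95, 10))  # ~0.599 -> ten 95% stages give ~60%
```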
We’ll have to see how it works out. You will probably need fewer agents than you need libraries. And of course, you could combine these agents, or LLMs in general, with traditionally developed software. If 95% reliability is fine for you, then the LLM only needs to be 95% reliable, so it could handle the input/output side, for example.
Maybe these are just bad examples because they were the first thing to come to mind, but if you’re using multiple AI agents to convert XML to JSON and read a field, even on large datasets, then you are quite literally doing it the most wasteful way possible.
Yeah, very much the first thing that came to mind, and, well, just simpler to explain than going into the specifics of some use case.
But yeah, your point is still important. If you make what each agent is supposed to do simple enough, then it’s frequently more reliable, more predictable, and more verifiable to push it through a library instead.
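For contrast, the library version of the original example, deterministic and with no JSON detour needed at all. The XML shape here is just an assumption for illustration.

```python
# Reading the "tomato" field straight from the XML with the standard library:
# deterministic, verifiable, and effectively 100% reliable on well-formed input.

import xml.etree.ElementTree as ET

def read_tomato(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    node = root.find(".//tomato")  # first <tomato> element anywhere below the root
    if node is None:
        raise KeyError("no 'tomato' field in input")
    return node.text or ""

print(read_tomato("<doc><tomato>red</tomato></doc>"))  # -> red
```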