Compile with AI

o1i1wnkk@beehaw.org · 2 years ago

argv_minus_one@beehaw.org · 2 years ago

You’re still trusting whoever runs the compiler. If you rely on an AI to run the compiler, then you’re trusting the AI and whoever controls it.

Moreover, I don’t believe AI is intelligent enough to meaningfully comprehend how to compile any project it’s handed. Every project is different and has its own requirements, including libraries and tools that must be installed on the machine that is to compile the project.

There have been various attempts at standardizing the compilation of software, such that any standard-conforming project can be compiled in the same way as any other. F-Droid must have done that. But each of these standards makes assumptions about the nature of the project being compiled, which makes it infeasible to compile some projects with them. For example, the Linux distribution Debian has its own standards for how packages are to be compiled, and you can compile any Debian package from source code with the same sequence of commands, but you can only compile a Debian package this way, and not, for example, a Windows application.

There is value in what you’re proposing, but I don’t believe it’s possible at this point.

Em Adespoton · edit-2 2 years ago

There’s a more nefarious problem too — AI algorithms are a black box. This means it’s virtually impossible to trust an AI’s methodology. Also, if someone knows the algorithm used for the AI, they can exploit the training methodology to hide secrets in the AI model that nobody will find but can still be triggered to perform specific repeatable tasks.

Essentially, for the AI to be more trustworthy than the person you trust to compile your code, you’d have to build the AI, determine its algorithms, vet the source material and train the model yourself. This is MUCH harder than setting up a buildbot environment with some basic unit tests for privacy and security.

o1i1wnkk@beehaw.org · 2 years ago

I understand your concern about the black-box nature of AI and the potential for exploitation. It’s indeed a serious challenge, but I still believe it’s possible to work towards solutions.

As AI continues to evolve, there’s ongoing research into improving the transparency and interpretability of AI algorithms. Ideally, this could lead to AI models that can better explain their actions and decisions. We may not have reached this point yet, but it is an active area of research and progress is being made.

Furthermore, having open-source AI models could offer some degree of assurance. If an AI model is open source and has undergone rigorous audits, there’s a higher level of transparency and trustworthiness. The community could scrutinize and vet the code, which might help to mitigate some of the risks associated with hidden secrets and exploitation of the AI’s training methodology.

And about your point of building, training and vetting the AI ourselves being harder than setting up a buildbot environment: I agree, but the idea here is not to replace human compilers entirely… for now. Instead, the goal could be to have a tool that can aid in ensuring trustworthiness, especially for those of us without the technical background to compile code ourselves.

o1i1wnkk@beehaw.org · 2 years ago

I understand your point about the transfer of trust, and it is indeed a serious concern. However, I believe there are measures that could be taken. I’m not an expert myself and I won’t pretend to be one, but it occurs to me that eventually technology will evolve to the point where we could ask the AI to explain step by step how it arrived at the final result. We could also potentially perform audits by cherry-picking the final results from different software to assess their accuracy.

If we were to use open-source AI projects (like GPT4All, for example), maybe eventually we could run these models 100% locally and privately. Naturally, I understand that we are far from this scenario, either due to the resources required or the nature of the complexity involved. It’s just an idea.

I would never think of bothering a developer by asking them to compile code step by step in front of me. First, because their time is valuable, and second, because the level of my questions would be frustrating. And third - and most importantly - because no one would accept such a whim.

However, I am willing to go step by step with an AI in some key software applications, such as communication, for example. Journalists or people in jobs where they cannot afford to trust blindly but lack the technical background might find benefit in these possibilities.

Ultra980@lemmy.one · edit-2 2 years ago

Nix already compiles directly from the source code.

duncesplayed@lemmy.one · 2 years ago

I agree with the general worry. This is part of why maintainers matter. Communities like Debian have built up a lot of trust that they are packaging software correctly, and their efforts matter a lot. You should reject any sort of container or app “store” that isn’t built upon trustworthy maintainers.

An AI probably could do it…unreliably. The problem with most modern AI approaches is that they are fundamentally unreliable. People are familiar with ChatGPT these days and its “hallucinations”, where it invents things that aren’t true out of thin air. That’s fundamental to large neural networks and not easily fixed. So I wouldn’t take that as a good way forward if the whole point is about trust.

But some old-school AI techniques (expert systems) might do well.

Voynich@lemmy.one · 2 years ago

“I don’t know how to compile”

Literally run make. It’s not that hard.

GreyBeard@lemmy.one · 2 years ago

That really depends on what is being compiled. Maybe the ecosystem is a little more streamlined these days, but in yesteryears I spent many a frustrating hour trying to decipher what I needed to install to make the make command work.

nodiet@feddit.de · 2 years ago

In theory AI might be able to analyse the project files and figure out what kind of compiler and configuration are needed, which could then be executed automatically. Is this what you’re describing, some kind of AI-powered, user-friendly interface that lets you compile the project on your own machine? Because what you wrote sounds like you want the AI to actually just read the source code and produce machine code from it. Also, if you use e.g. Arch Linux, there is an entire user repository which consists of build scripts for software, which often let you compile the package with a simple command. This seems similar to what you are describing with regard to Flatpak. However, since these scripts are typically written by a third party, that adds another level of distrust.
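To make the "figure out what kind of compiler and configuration are needed" step concrete, here is a rough, non-AI sketch in Python that guesses build commands from well-known marker files. The `BUILD_MARKERS` table and the `guess_build_commands` name are assumptions made up for this illustration, and real projects often need extra dependencies and flags that a lookup like this cannot know about.

```python
from pathlib import Path

# Marker file -> typical build commands for that build system.
# Illustrative table only (an assumption of this sketch), checked in order,
# so a generated Makefile next to CMakeLists.txt does not win.
BUILD_MARKERS = {
    "CMakeLists.txt": ["cmake -S . -B build", "cmake --build build"],
    "meson.build": ["meson setup build", "meson compile -C build"],
    "Cargo.toml": ["cargo build --release"],
    "configure": ["./configure", "make"],
    "Makefile": ["make"],
}


def guess_build_commands(project_dir: Path) -> list[str]:
    """Return the usual build commands for the first marker file found.

    An empty list means the build system was not recognised and a human
    (or something smarter than a lookup table) has to take over.
    """
    for marker, commands in BUILD_MARKERS.items():
        if (project_dir / marker).is_file():
            return commands
    return []
```

The function only suggests commands and never runs them, so the person at the keyboard still decides what gets executed.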

o1i1wnkk@beehaw.org · 2 years ago

You’re correct, I suggest a user-friendly AI interface to assist with compilation, not for AI to produce machine code directly. The idea is to increase transparency and trust, especially for non-technical users. The Arch Linux scripts you mentioned are indeed similar to my thought, but as you noted, third-party involvement may raise trust issues. Hence, AI might add an extra layer of verification, making the process more understandable. It’s a complex issue worth exploring as technology continues to evolve.

within_epsilon@beehaw.org · 2 years ago

I develop software in C++, C# and Python. All the languages mentioned feature package managers to manage compilation and delivery of binaries. I can force them to compile from source in the case I do not trust binaries created by some other person. Recompiling is expensive with regard to time.

Conan, a package manager for C++, uses hashes of source code and packaged binaries for verifying integrity. I am of the opinion that even the most clever systems for maintaining integrity can be broken. I have no idea how AI fits into the problem of package management and trust.
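As a generic illustration of hash-based integrity checking, here is a minimal Python sketch that hashes a downloaded artifact and compares it with a published value. This is not Conan's actual implementation: the choice of SHA-256 and the function names `sha256_of_file` and `matches_published_hash` are assumptions for the example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, published_hex: str) -> bool:
    """Check a local artifact against a hash published by the packager."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

A check like this only shows that the file matches what was published. It still relies on trusting whoever published the hash.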

An AI to compile any repository sounds nice. I am the go-to build engineer on my current team. We have four projects with slightly different build processes. I wrote the CMake and Python to meet the needs of the developers. Some want flattened include hierarchies, others want hidden headers, and so on and so forth. The continuous integration is the same, however, so maybe we can standardize the DevOps work. I assume continuous delivery is where the AI would live. I am wary of taking control of the build process away from software developers.

o1i1wnkk@beehaw.org · edit-2 2 years ago

Your insights as a software developer are truly valuable. Thank you for explaining.

I agree with your points on the complexities of the build process and the potential pitfalls of taking control away from developers. However, the goal is not to replace the role of developers but to provide additional transparency for those lacking technical expertise. An AI could assist in clarifying this process, and while trust is a wider issue, AI could help in verifying package integrity. The idea is to automate and standardize some aspects of the build process, not to diminish developer control. It’s an idea that may become more feasible as technology advances.

within_epsilon@beehaw.org · 2 years ago

I now understand the goal a little better.

Installing F-Droid is spooky. I like the alleged functionality, but I am not certain the source code of the binary is what is running on my device. I also want better guarantees of integrity from F-Droid.

My software developer tendencies are itching. I will pitch some bad ideas on verifying integrity and creating trust.

The initially proposed AI could be a federation of build servers. Each build server compiles the source code and provides a hash of the binary. A hash that shows up more frequently implies that more of the federation produced the same binary. Bad binaries presenting a different hash could be filtered by the consumer based on consensus.
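The tallying step could be sketched roughly like this in Python. The server names, the `min_share` threshold, and both function names are hypothetical. The sketch also assumes the builds are reproducible, since honest servers only produce identical hashes when the build is deterministic.

```python
from collections import Counter


def tally_hashes(reports: dict[str, str]) -> Counter:
    """Count how many build servers reported each binary hash.

    `reports` maps a (hypothetical) server name to the hex hash it produced.
    """
    return Counter(reports.values())


def filter_by_consensus(reports: dict[str, str], min_share: float) -> list[str]:
    """Return hashes reported by at least `min_share` of servers, most common first."""
    if not reports:
        return []
    total = len(reports)
    tally = tally_hashes(reports)
    return [h for h, count in tally.most_common() if count / total >= min_share]


# Three of four servers agree, so only their hash survives a 50% threshold.
reports = {"server-a": "abc", "server-b": "abc", "server-c": "abc", "server-d": "def"}
print(filter_by_consensus(reports, min_share=0.5))  # ['abc']
```

Filtering only narrows the candidates. A majority of colluding or compromised servers would still pass a check like this.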

I am hesitant to make an AI-level decision like dropping less frequent hashes from consumers entirely. The possibility of the more frequent hashes being incorrect is worrying. A drawback is the lack of automation in forcing the consumer to choose a hash. Maybe the consumer can choose settings to make an AI-like decision to always accept the most frequent hash. That decision would be opt-in.
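One way the opt-in setting might look, as a minimal Python sketch: the function takes a count of reported hashes and only picks one automatically when the consumer has enabled it. The `choose_hash` name and the `auto_accept_most_frequent` flag are made up for illustration, and declining to decide on a tie is an extra assumption of this sketch.

```python
from collections import Counter
from typing import Optional


def choose_hash(tally: Counter, auto_accept_most_frequent: bool = False) -> Optional[str]:
    """Pick a hash to trust, or return None to leave the choice to the consumer.

    Automatic selection is off by default (opt-in) and is skipped when the
    two most frequent hashes are tied.
    """
    if not auto_accept_most_frequent or not tally:
        return None
    ranked = tally.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # Tie between the top two hashes: no clear leader.
    return ranked[0][0]
```

Returning `None` keeps the manual path as the default, so nothing is dropped or accepted on the consumer's behalf unless they asked for it.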