Hey folks!
I’ve released the first version of our new AI policy. Accepting it is now part of our issue and pull request templates. Feel free to discuss and provide feedback.
Certainly! Here’s a response to Zach Daniel’s post about the new AI policy in his Ash project:
Hello Mr. Da—
Naw, I’m kidding. It’s cool to see this.
If we want to talk to an AI, we will ask them ourselves. We expect all interactions to be human interactions.
I love this, Zach!
These policies are zero tolerance policies, meaning we reserve the right to ban or otherwise remove from participation anyone caught breaking them.
That’s beautiful in theory and a beast in practice. There is no way to be 100% sure that some text was generated by an AI, so:
- You allow false-positive cases if you rely on a detection tool with some minimum confidence threshold to verify contributions. A tool could be good at spotting GPT output and terrible at spotting others (see the sketch after this list).
- Otherwise it in fact becomes a rule for banning poor-quality contributions (like checking whether the code compiles or whether it contains fabricated information). Applying a rule with a completely different ratio legis is a common practice in various censorship systems, which does not look good. An explicit rule with exactly the same effect is perceived completely differently.
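To make the false-positive concern concrete, here is a minimal sketch of the base-rate problem. All numbers (detector accuracy, share of AI-generated submissions) are made up purely for illustration, not measurements of any real detector:

```elixir
# Hypothetical numbers, purely for illustration.
sensitivity = 0.95          # detector catches 95% of AI-generated text
false_positive_rate = 0.05  # ...but also flags 5% of human-written text
base_rate = 0.10            # assume 10% of contributions are AI-generated

p_flagged = sensitivity * base_rate + false_positive_rate * (1.0 - base_rate)
p_ai_given_flagged = sensitivity * base_rate / p_flagged

IO.puts("P(actually AI | flagged) ≈ #{Float.round(p_ai_given_flagged, 3)}")
# => P(actually AI | flagged) ≈ 0.679
```

So even with a seemingly accurate detector, under these assumptions roughly one in three flagged contributors would be accused unfairly, which is why a zero-tolerance rule based on tool output alone worries me.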
I believe that we as a community should first of all focus on declaring some “standards” / “common knowledge” about what an (at least basic) validation process looks like. A rule with such serious loopholes often leads to abuse. Consider: if we can’t define such a process, then how would we validate possible abuses?
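Just as a rough illustration of what I mean by a “basic validation process” (this is only my own sketch, not anything from the Ash policy, and the exact checks would be up to each project):

```elixir
# A sketch of a minimal, tool-agnostic validation step a maintainer could
# run on any contribution, AI-assisted or not. The command list is illustrative.
defmodule BasicValidation do
  @checks [
    {"mix", ["compile", "--warnings-as-errors"]},
    {"mix", ["format", "--check-formatted"]},
    {"mix", ["test"]}
  ]

  def run do
    Enum.all?(@checks, fn {cmd, args} ->
      {output, status} = System.cmd(cmd, args, stderr_to_stdout: true)
      if status != 0, do: IO.puts(output)
      status == 0
    end)
  end
end
```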
It is never acceptable to directly copy AI responses as if they were your own words!
Fully agree, but when combined with:
- All work is the responsibility of the human who makes the PR. We do not accept AI as co-contributors on commits.
It becomes the next problem. Standalone, each rule looks clear and good. However, what if some AI-generated code passes your checks? This rule explicitly declares that all responsibility is on the human who makes the PR, but I’m not sure that in all cases this can be true (unfortunately).
The problem is a very specific edge case where two things happen at the same time:
- An LLM generates text based on a licensed source. Unfortunately, so-called Big Tech has many problems with following licenses… Some people may say this is a completely standalone problem. Without LLMs brought into the topic I would agree, but they change way too much here to simply ignore it.
- Fortunately or unfortunately, people are lazy by nature, which often ends with assuming too much. Name one person you know who verifies every LLM response for possible license problems…
Yes, it’s a standalone problem that could be seen as off-topic if it were not in the context of LLMs. The point here is not probability but possibility. A 0.0000000001% probability is still too much when we talk about such a huge scale.
I believe that we as a community also need a good process for validating possible license violations.
I’m not sure about my last point. Is it fine to use a paid version of an LLM when submitting code? Again, depending on the legal system, I’m not sure about it. Let’s look at another example… If a video on some platform is available to everyone for free, then sharing it is obviously allowed. However, doing the very same thing with a video (or part of one) from a paid subscription is a completely different topic.
Now… let’s say that you have read the agreement and it allows such practices. What if the agreement changes? And what about terrible Big Tech practices? For example, on YouTube it does not matter when a video was uploaded, only whether it breaks a current YouTube rule.
In many countries it’s common knowledge that the law is not retroactive, but that does not stop Google from such practices. The question is whether you have enough resources for potentially many legal battles with such a big company. Depending on the country, such a process takes months if not years, and many courts would agree to “temporarily hide” your accounts and repositories.
In theory something may look good and still be a nightmare in practice. If you don’t believe that changes with such huge consequences happen, see what OpenAI has already done in the past:
I believe that such concerns should at least be common knowledge in our community.
It looks like people are finally starting to realise that LLMs are not just new possibilities or changes to the job market, but also new problems. I really appreciate that someone has started working on this seriously. What I wanted to say comes down to 3 things:
…not only by the Ash team explicitly, but by our whole community.
Of course, I’m not a law expert and I fully agree with the whole policy, but I really doubt that the power of this policy would remain stable without solving various core issues like those I explained above. As always… I have never advised against using LLMs, but rather to use them like any other tool, keeping in mind all the problems they cause.
Also, there is a tiny typo: the dot character should be at the end of the sentence, as in the previous one:
- When opening an issue, provide the appropriate issue template to your agent.
- When opening a pull request, provide the pull request template to your agen.t
This is true in theory but in practice it is usually extremely obvious. I have no doubt a skilled actor could generate realistic text but most LLM spam I see is very low-effort and easy to spot (something which has been true of spam in general basically forever).
Maybe this will change, of course. But people have been saying that for a while now, so who knows.
To be honest, I’m not sure what concretely you are suggesting here. This is not a legal document; it doesn’t have to hold up in court. People can have different interpretations of it. Ultimately it will be applied on a case-by-case basis, based on my judgement or the judgement of someone else on the core team, and people will just have to be okay with that.
FWIW: If I literally cannot tell that something was LLM generated, then I don’t care if it was LLM generated in the context of code.
To be clear, this document does not in any way change the rights to source code or attempt to relicense someone else’s source code. We cannot, nor will we ever, be in control of or be able to detect the fraudulent activities of those making PRs.
This feels very clear, reasonable, and well written to me! I feel tempted to pilfer from it for some of the projects that I help to maintain.
There will likely be a few revisions to answer some questions that folks have around things like commit messages, but feel free to use it any way you like.
I’m actually considering using some parts of it for Keila as well, so if you’re generally fine with other people using it, maybe you could add a note to the document itself?
Yes, will do
I have adapted it for our internal team.
Thanks!