Follow-up on Today's informal working group session on AI
Greetings Directors,

Earlier today, Eoghan Glynn, Phil Rob, Allison Randall, and I met as part of the informal working group invite I sent out earlier in the week. Discussion focused on listing the points we need to tackle, which evolved into the general idea that we need a three-part approach to move forward.

The first area of consensus was to draft a document laying out:

* Set a common understanding - We need to agree on the common words to be used and note that there are two basic extremes: in essence, a spectrum from autocomplete to entirely machine-generated output. We also have consensus that we expect code contributions to be made by humans (which may change as time moves on), outside of robotic process-driven changes such as requirements management or translation batch updates, which occur today as well-understood automated processes.

* Build a common identification system for changes - where and how a tool was used. Reach consensus on some sort of labeling for commit messages, for consistency across projects. There seems to be agreement that this needs to involve identification of the tool and most likely some level of quantity or percentage. The overall goal is to get this into general practice sooner rather than later, likely adopting the ASF-suggested Generated-By tag in addition to some other Generated-Percentage or Generated-Level tag. Naturally, we would expect the latter to change through the process of code review.

* Establish bounds/guidelines for projects on what we expect and what they can and cannot do. This likely includes alignment with the Four Opens, so reviewers can attempt to determine how a tool lines up and whether the contribution can be accepted, based upon the tagging. This is the area where we likely expect to explicitly name an example of what is not permissible, and possibly include a list of known tag values to aid consistency when it comes time to generate reporting.
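As a purely illustrative sketch of how such commit-message labeling could be consumed for reporting, the following parses Generated-* trailers from a commit message footer. The tag names follow the ASF-suggested Generated-By plus a hypothetical Generated-Percentage; the assistant name and exact format are placeholders, not settled guidance.

```python
def parse_ai_trailers(message: str) -> dict:
    """Collect Generated-* trailers (hypothetical naming) from a commit message."""
    trailers = {}
    for line in message.strip().splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            # Only keep trailers in the Generated-* family used for AI disclosure.
            if key.startswith("Generated-"):
                trailers[key] = value.strip()
    return trailers


msg = """Fix node cleanup race

Generated-By: ExampleAssistant v1.2
Generated-Percentage: 30
"""
print(parse_ai_trailers(msg))
# → {'Generated-By': 'ExampleAssistant v1.2', 'Generated-Percentage': '30'}
```

A tool like this could back the reporting mentioned above, counting tagged changes per project once tag values are standardized.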
The second area of consensus takes the form of two specific paths forward, above and beyond an initial guidance document. Allison made a really good point that we should not block on either of these areas.

* We likely want to see some form of attestation by the submitter of the change, along the lines of the DCO, that says "I've carefully reviewed this, I fully understand what it does, and I have the legal right to submit this along with a copyright assignment". In a sense, the DCO already serves this purpose for some projects, but the variable of AI is an interesting one, and the consensus is that there is a need for some cross-foundation collaboration on appropriate guidance, since this is bigger than just our OpenInfra ecosystem.

* We want to see a centralized review guide, both to provide guidance on things to look at/for and to help serve the pipeline of reviewers in general. It might have principles such as "does the style conform?" or "does CI pass?", but this is nowhere near an exhaustive list, and we expect some iteration may be necessary with our communities and projects to reach the right level/form of guidance.

Which leaves the next step. We'll provide a brief update during the upcoming board meeting[0], and issue a call for participants to work together on drafting the guidance document. Once we have written words in a reviewable form, we'll reach out to our legal counsel for their input and move forward from there.

Thanks,
-Julia

[0]: https://board.openinfra.dev/en/meetings/2023-11-07
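To make the attestation idea concrete, a DCO-style commit footer combining disclosure and sign-off might look something like the fragment below. All trailer names here other than the standard Signed-off-by are hypothetical placeholders, not agreed wording:

```text
Fix node cleanup race

This change was partially produced with ExampleAssistant v1.2; I have
carefully reviewed it, I fully understand what it does, and I have the
legal right to submit it.

Generated-By: ExampleAssistant v1.2
Signed-off-by: Jane Developer <jane@example.org>
```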
Greetings Directors,

On the AI contribution guidance front, the Linux Foundation just published their own guidance:
https://www.linuxfoundation.org/legal/generative-ai

It seems in line with the ASF guidance, just a bit shorter.

--
Thierry

Julia Kreger wrote:
_______________________________________________ Foundation-board mailing list -- foundation-board@lists.openinfra.dev To unsubscribe send an email to foundation-board-leave@lists.openinfra.dev
Resurrecting this thread since we've seated a new board.

After our meeting this morning, I had a brief discussion with one of the other directors, who noted they were made aware of some work to help generate unit tests, developed using AI technologies trained exclusively on our own pre-existing works, and the idea seemed intriguing to me. It adds a facet which never arose when we had our informal discussions on the subject last year.

I think it makes sense to schedule another round or two of informal discussions on this subject. Would anyone be interested in meeting next week or the week after to discuss further?

-Julia

On Mon, Nov 6, 2023 at 9:20 AM Thierry Carrez <thierry@openinfra.dev> wrote:
I'll be traveling back from FOSDEM on Tuesday. I could possibly do something Wednesday or Thursday morning, or Friday afternoon.

Amy

On Tue, Jan 30, 2024 at 11:26 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
I'm up for it.

From: Julia Kreger <juliaashleykreger@gmail.com>
Date: Tuesday, January 30, 2024 at 12:26 PM
To: Thierry Carrez <thierry@openinfra.dev>
Cc: foundation-board@lists.openinfra.dev
Subject: [Foundation Board] Re: Follow-up on Today's informal working group session on AI
Thank you everyone who joined today.

The discussion mainly focused on the idea that, along with a policy, it might be good to also leave a path to enable specific generative contributions which can be attested to have been generated using AI technology modeled and trained only on appropriately licensed content, at least until the OSI one day approves some approach as "Open Source AI". The discussion drifted into some potential collaborations and focuses, reasoning that if the input into the model can be attested to, then it would be reasonable to consider accepting the output. The major concern with generative tools is the unknown material and licenses the public models were trained upon, and the resulting risk of regurgitation of licensed content.

The next step seems to be to pick up the policy work from last year and see if we can wordsmith a possible path.

With that, would it make sense to meet again in two weeks to discuss a potential policy before moving forward towards implementation?

Thanks,
-Julia

On Wed, Jan 31, 2024 at 6:19 AM Mohammed Naser <mnaser@vexxhost.com> wrote:
Greetings folks,

I've scheduled an additional informal discussion call on this subject for this coming Thursday, from 1500 to 1600 UTC.

URL: https://us02web.zoom.us/j/85038734661

Thanks!
-Julia

On Thu, Feb 8, 2024 at 9:52 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
Thank you everyone who joined today.
The discussion mainly focused on that with a policy, it might be a good idea to also leave a path to enable specific generative contributions which can be attested to being generated using AI technology modeled and trained on only appropriately licensed content, at least until the OSI might one day approve some approach as "Open Source AI". The discussion drifted into some potential collaborations and focuses as reasoning that if the input into the model could be attested for, then it would be reasonable to consider accepting the output. The major concern with generative being the unknown material and licenses the public models were generated upon and the risk of licensed content regurgitation.
The next steps seems to be to pickup the policy work from last year and see if we can wordsmith a possible path.
With that, would it make sense to meet again in two weeks to discuss a potential policy before moving forward towards implementation?
Thanks,
-Julia
On Wed, Jan 31, 2024 at 6:19 AM Mohammed Naser <mnaser@vexxhost.com> wrote:
I’m up for it.
*From: *Julia Kreger <juliaashleykreger@gmail.com> *Date: *Tuesday, January 30, 2024 at 12:26 PM *To: *Thierry Carrez <thierry@openinfra.dev> *Cc: *foundation-board@lists.openinfra.dev < foundation-board@lists.openinfra.dev> *Subject: *[Foundation Board] Re: Follow-up on Today's informal working group session on AI
You don't often get email from juliaashleykreger@gmail.com. Learn why this is important <https://aka.ms/LearnAboutSenderIdentification>
Resurrecting this thread since we've seated a new board.
After our meeting this morning, I had a brief discussion with one of the other directors who noted they were made aware of some work to help generate unit tests which was developed utilizing AI technologies trained exclusively on our own pre-existing works, and the idea seemed intriguing to me. It adds an additional facet which never arose when we had some informal discussions last year on the subject.
I think it makes sense to schedule another round or two of informal discussions on this subject. Would anyone be interested in meeting next week or the week after to discuss further?
-Julia
On Mon, Nov 6, 2023 at 9:20 AM Thierry Carrez <thierry@openinfra.dev> wrote:
Greetings Directors,
On the AI contribution guidance front, the Linux Foundation just published their own guidance:
https://www.linuxfoundation.org/legal/generative-ai
It seems in line with the ASF guidance, just a bit shorter.
-- Thierry
Greetings Directors,
Earlier today, Eoghan Glynn, Phil Rob, Allison Randall, and I met following the informal working group meeting invite I sent out earlier in the week.
Discussion focused on creating bullet points of what we need to tackle, which evolved into the general idea that we need a three-part approach to move forward.
The first area of consensus was to draft a document laying out:
* Set Common Understanding - We need to agree on the common words to be used and note that there are two basic extremes: in essence, a spectrum from autocomplete to entirely machine-generated output. We also have consensus that we expect code contributions to be made by humans (which may change as time moves on), outside of robotic process-driven changes, for example requirements management or translation batch updates, which occur today as well-understood automated processes.
* Build a common identification system for changes - where and how AI was used. Reach consensus on some form of labeling for commit messages, for consistency across projects. There seems to be agreement that this needs to involve identification of the tool and most likely some level of quantity or percentage. The overall goal is to get this into general practice sooner rather than later, and likely to adopt the ASF-suggested Generated-By tag, in addition to some other Generated-Percentage or Generated-Level tag. Naturally we would expect the latter to change through the process of code review.
* Establish bounds/guidelines for projects on what we expect, and what they can and cannot do. This likely includes alignment with the four opens, so reviewers can attempt to determine how a tool lines up and whether the contribution can be accepted based upon the tagging. This is the area where we likely expect to explicitly name an example of what is not permissible, and possibly include a list of known tag values to aid consistency when it comes time to generate reporting.
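As a rough sketch of what the tagging described above might look like in practice, here is a hypothetical commit message. The Generated-By trailer follows the ASF suggestion; the Generated-Percentage trailer, its value, and the tool name are placeholders for illustration only, since no tag values have been agreed yet:

```
Fix node cleanup race during conductor shutdown

A human-written summary of the change goes here as usual,
explaining what was done and why.

Generated-By: ExampleAITool v1.2
Generated-Percentage: 40
Signed-off-by: Jane Developer <jane@example.com>
```

The percentage would presumably be updated by the submitter as the change evolves through code review, consistent with the expectation above that the latter tag changes over time.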
The second area of consensus takes the form of two specific paths forward, above and beyond an initial guidance document. Allison made a really good point that we should not block on either of these areas.
* We likely want to see some form of Attestation by the submitter of the change, along the lines of the DCO, that suggests "I've carefully reviewed this, I fully understand what it does, I have the legal right to submit this along with a copyright assignment". In a sense, the DCO already sort of serves this purpose for some projects, but the variable of AI is an interesting one, and the consensus is that we expect there is a need for some cross-foundation collaboration on appropriate guidance since it is more than just our OpenInfra ecosystem.
* We want to see a centralized review guide, both to provide guidance on things to look at/for, and also to help serve the pipeline of reviewers in general. It might include principles such as "does style conform?" and "does CI pass?", but this is nowhere near an exhaustive list, and we expect some iteration may be necessary with our communities and projects to arrive at the right level/form of guidance.
Which leaves the next step. We'll provide a brief update during the upcoming board meeting[0], and have a call for participants to work together on drafting the guidance document. Once we have written words in a reviewable form, we'll reach out to our legal counsel for their input, and move forward from there.
Thanks,
-Julia
[0]: https://board.openinfra.dev/en/meetings/2023-11-07
_______________________________________________ Foundation-board mailing list -- foundation-board@lists.openinfra.dev To unsubscribe send an email to foundation-board-leave@lists.openinfra.dev
participants (4)
- Amy Marrich
- Julia Kreger
- Mohammed Naser
- Thierry Carrez