[openstack-community] [User-committee] [OpenStack Marketing] qyjohn's quarterly report of the size and health of the 4 open source projects is out

Qingye Jiang (John) qingye.jiang at eucalyptus.com
Tue Jul 9 02:06:53 UTC 2013


I agree with Stefano that comparison across multiple projects is risky and can be misleading in many ways, especially when it is done by an entity such as the OpenStack Foundation or Eucalyptus as a company. This is why I put a safe-harbor statement in my report saying that it represents my own opinion rather than the opinion of my employers. (Ah… yes, I work for Eucalyptus.) The reason I can continue this effort is that the project was started long before I came to Eucalyptus, and it would be a pity to stop it.

This project was started in CY11-Q4, with the objective of comparing the community activity (forums and mailing lists) of various open source IaaS projects. I agree with Stefano that CloudStack and OpenStack are oranges and apples, but oranges and apples are both fruit and should be comparable in some way. In my view, population (or active population) represents the size of a community, while the communication between community members represents its activeness. Different communities use different communication channels in different ways, which may result in different communication behavior and gradually shape the characteristics of each community. By tracking these parameters we will be able to analyze how communities grow and decline, as well as study other general sociology subjects, using open source IaaS communities as examples.

The git metrics were introduced into this project in CY13-Q1. Frankly speaking, I am not sure how the git metrics will affect the analysis results. Several people have told me that my git numbers were off, and I will try to fix this problem in my CY13-Q3 report. The git analysis tool I use is a self-developed Java program. I am doing some housecleaning on this program and will make it available on GitHub in a couple of days so that you can also use it (and fix it).

The program that tracks forums and mailing lists is not fully automatic and needs a lot of manual work to run. The major challenges include (1) dealing with old data, such as EOL'ed mailing lists and forums, and (2) de-duplication of membership. I am still in the process of fixing it.
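
To illustrate the de-duplication challenge only: the sketch below is not the actual tracking program, and every name in it (normalize, active_population, the sample channels and addresses) is hypothetical. It assumes a poster can be identified by e-mail address alone and that archives obfuscate addresses with " at ", which is a simplification; one person posting from two different addresses would still be counted twice.

```python
def normalize(address):
    """Lowercase an address and undo the ' at ' obfuscation used by list archives."""
    return address.strip().lower().replace(" at ", "@")


def active_population(posters_by_channel):
    """Count distinct posters across all channels (lists, forums) of one project."""
    unique = set()
    for posters in posters_by_channel.values():
        # The same person may appear on several channels; the set keeps one copy.
        unique.update(normalize(p) for p in posters)
    return len(unique)


# Hypothetical sample data: Alice posts on both lists under two spellings.
sample = {
    "users-list": ["Alice at example.org", "bob@example.org"],
    "dev-list": ["alice@example.org", "carol@example.org"],
}
print(active_population(sample))  # 3
```

Merging people who use several unrelated addresses would need extra information (names, affiliations), which is where the manual work comes in.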

Best regards,

Qingye Jiang (John)



On 2013-7-9, at 6:47 AM, Stefano Maffulli <stefano at openstack.org> wrote:

> On 07/08/2013 08:24 PM, Joshua McKenty wrote:
>> I believe that the OpenStack marketing community sees comparisons to
>> other open source cloud frameworks as significant competitive
>> positioning. Accuracy in that data would be valuable to the whole community.
> 
> If you're arguing that Activity Board should include some data from
> other cloud frameworks let's discuss what questions you'd like to see
> answered/what raw data. Keep in mind that comparing different projects
> is like comparing oranges and apples: cloudstack and openstack are not
> comparable. The work that Qingye Jiang does is IMHO valuable when it
> highlights trends across different metrics for separate projects but it
> opens to all sorts of criticisms when it creates indexes like the
> Activeness Index and when it compares absolute numbers across projects
> (for example, the way openstack uses its -dev mailing list is different
> from cloudstack's, making the comparison irrelevant; nor can you
> compare discussions on gerrit with mlist traffic).
> 
> I wouldn't want the Foundation to produce anything like a comparative
> analysis for public consumption. IMHO public comparative reports would
> create way too much noise and risk of distracting our marketing
> resources.
> 
> For internal reports I'd be open to start tracking some significant
> metrics from other projects: let me know which ones you care about and
> I'll be happy to work on producing a periodic report for staff and board.
> 
>> I *know* that a number of OpenStack member companies use their
>> "position" in terms of ATC contributions as a marketing point, and
>> having an accurate baseline for those numbers might also be valuable.
> 
> All that data is public on the OpenStack Activity Board: data may be
> wrong though and if you spot mistakes please let me know so I can
> correct them. How companies decide to use public data gathered from
> gerrit, git/github etc is their decision to make.
> 
>> For example, DreamHost has suddenly become the most substantial
>> contributor to Quantum *ever*. :)
> 
> I see the smile ... but for the record, your link refers to a report
> limited to havana only and counts 'Lines of code' (added/removed? not
> clear) which is a very poor metric when quoted out of context: I'm sure
> you know, and I'd expect to be able to count on people who know not to
> quote such data points out of context.
> 
>> As for myself, I often use the count of individual members, corporate
>> members, and total committers in sales and marketing materials - and
>> I've found a number of discrepancies in the user database that I find
>> concerning (duplicate names, etc.). 
> 
> BI is hard :) At the moment the database of people+affiliation as
> cleaned up by Bitergia is what I consider the most reliable produced by
> the Foundation. It's built by merging the Foundation db and the lists
> included in the git-dm tables and some extra manual cleanup.  I can have
> that one published if you think it's needed. You can also look at the
> JSON files and the database dumps linked from
> http://activity.openstack.org/dash/browser/index.html which are the
> results of elaboration. We can discuss on -dev under the [metrics] topic
> more about the technical details.
> 
>> Solid, official data is valuable for
>> everyone - and I think inviting these other projects to join the
>> activity board effort, by making it an openstack project itself, could
>> be a great way to get there. 
> 
> Definitely, I have already invited Mirantis to join the current efforts.
> I'm waiting to see their code in order to judge if and how it can be
> merged with the Activity Board. I definitely like their UI, although it
> has fewer dimensions than I need to see.
> 
> I always loved the idea of having *one* place for all OpenStack-related
> data and I've learned that no matter what I wish, there will always be
> somebody with his/her own itch to scratch who decides to create a new
> source of data and reports.
> 
> /stef



