How many Chromium users in Ubuntu?

As the Chromium maintainer for Ubuntu, I often wonder how many users installed the different packages I maintain. It’s obviously difficult, not to say impossible, to tell how many of those are active users, mostly because it would imply adding some ping back mechanism that would hurt the privacy of some of those users, a line I am not willing to cross. So to try to answer this question, I depend on publicly available data.

To start with, I need to see where users are taking their chromium packages from.

The packages I maintain are distributed both in the official Ubuntu repositories and in 4 dedicated PPAs, that I call Channels, matching the upstream jargon used for the brother browser Google Chrome. So I start with those:

Official repositories

  • lucid/universe
  • maverick/universe
  • natty/universe

PPA/channels

  • stable
  • beta
  • dev
  • trunk (daily builds)

Obviously, all are optional, so having any of those installed means the user did it willingly.

Looking for more sources of the chromium-browser debs turned up little, besides some obscure user PPAs cloning my own builds, which I seriously doubt are driving a lot of traffic. I’ll just ignore them from here on (if you feel that’s wrong, just add a comment).

For the first category, it’s not possible to get stats. The reason is that most people get their updates from Ubuntu mirrors, for which we have no stats. Some users even go through proxies or caches, which makes matters worse.

One option we do have is the popularity-contest tool, called popcon. This is an opt-in tool, meaning that all it can report is purely statistical, i.e. it covers a subset of the whole population. So the first question it raises is whether that subset is representative or not.

The second option is the PPA stats coming from Launchpad. Obviously, it covers only the channels but it’s supposed to be accurate. But is it really?

Let’s review both to see what we can conclude, if anything.

popcon

Today, it tells me that the “chromium-browser” package has been installed by 146 162 users out of 1 926 962 (or 7.59%). Is that satisfactory? Not quite. Let me explain.

As a matter of comparison, for its brother google-chrome, available as a deb in 3 channels, I get google-chrome-stable: 4.70%, google-chrome-beta: 4.16% and google-chrome-unstable: 1.60%. So Chromium beats Chrome stable by a large margin. The sum of the 3 Chrome packages is higher (10.46%), but it’s difficult to say whether the Chrome figures overlap and by how much.

Can we even conclude anything about those percentages?

I used to produce various charts to track the number of installs reported by popcon. Here is the first type I started two years ago:

Those numbers didn’t mean much so I decided to track percentages instead, along with more browsers.

But it raised even more questions than it answered. Is konqueror really declining that fast? Or is it KDE? Why is epiphany so high? Was it installed by default at some point? Why is google-chrome-beta declining so fast here while it was still growing in the previous chart? The number of popcon users seems to be growing faster, so maybe it’s really declining, hidden by the growth of popcon. I really need better figures.

Also, those are percentages, not the number of users (or more precisely, installs) I wanted to know. What can we do about this?

Back in November 2010, someone from Canonical estimated the Ubuntu user base at 12 million. So I dug into the popcon data for the total number of reporters at that time: it was about 1 760 000 (Nov 1st). Today’s 1 926 962 is roughly a 9.5% increase in 6 months. Does it mean there are about 10% more Ubuntu users now? Difficult to say for sure. I assume popcon attracts more tech-savvy users than regular users and that the use of popcon per Ubuntu release is not stable (*), so it may be all wrong to extrapolate this to the whole population. Yet, as we have nothing else, let’s assume for a minute it’s representative. That would mean the Ubuntu user base is now at 13.2 million, and that popcon represents about 15% of it. Scaling Chromium up (from its 7.59%) would then mean 1 million users. If it’s true, wow.

(*) To put this in perspective, 2 other examples: a/ firefox is at 93.62%, which seems to mean that most of the {k,x,l}ubuntu users have it. Well, maybe, it’s an unremovable dependency in ubuntu, but that seems a bit too high as a whole. b/ gwibber is at 20.09%, yet it has been installed by default since lucid. That seems awfully low to me. It seems to imply that either 1/ the crowd is still running pre-lucid ubuntu, or 2/ ubuntu is now a minority in the *buntu family, or 3/ most people uninstalled gwibber, or 4/ popcon is used mostly by pre-lucid users, or finally 5/ popcon is not good for anything. My guess is that 1 and 2 are unlikely and that it’s a mix of the remaining options, with 5 most likely taking its share.
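
The extrapolation above can be written down as a few lines of Python. This is only a sketch of the arithmetic, under the same shaky assumptions as the text: popcon is taken as representative, the whole user base is assumed to have grown at the same rate as popcon, and the 12 million starting point is an unverified estimate. The function name `extrapolate` and its parameters are made up for this illustration.

```python
def extrapolate(installs, popcon_now, popcon_then, users_then):
    """Scale a popcon install count up to the whole user base.

    Assumes popcon is representative and that the user base grew
    at the same rate as the number of popcon reporters.
    """
    growth = popcon_now / popcon_then - 1
    users_now = users_then * (1 + growth)
    share = installs / popcon_now
    return {
        "growth": growth,
        "users_now": users_now,
        "popcon_coverage": popcon_now / users_now,
        "package_users": share * users_now,
    }


result = extrapolate(
    installs=146_162,       # chromium-browser installs in popcon today
    popcon_now=1_926_962,   # popcon reporters today
    popcon_then=1_760_000,  # popcon reporters on Nov 1st, 2010
    users_then=12_000_000,  # unverified estimate quoted above
)
for key, value in result.items():
    print(f"{key}: {value:,.3f}")
```

Without rounding the growth up to 10%, this gives a user base of about 13.1 million and just under 1 million Chromium users, in line with the rounded figures above.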

PPA stats

This looks more promising. Or at least, it will once the Launchpad devs sort a few things out, as we will see.

A few months ago, I played with the Launchpad python API to extract some stats for my numerous PPAs. I concluded at that time that the chromium PPAs were losing subscribers, most probably in favor of the official Ubuntu repositories. I have since updated my tool to dig further into this. Did it confirm the trend? Well, not really. Here is what I first saw:

That’s the global chart showing the chromium-browser deb in the daily PPA (trunk builds). Obviously something is wrong. Some unexpected spikes in March 2010 and July 2010 are difficult to explain but, worse, the whole chart jumped in February 2011. Why? Big mystery. I asked the Launchpad team a few times in the last few weeks and got no answer. I ended up filing bug 767258, still without answer. So while still in doubt, I will consider that the stats are correct now (everything post Feb 2011). That completely changes my previous conclusion: it’s not declining at all.

Let’s see what the 4 channels did in the last 2 months:

Does this mean the channels have respectively 15000 (trunk), 1500 (dev), 2000 (beta) and 7000 (stable) subscribers? Once again, not quite. Far from it.

The reason the charts are misleading is that they show both new installs and upgrades, and users don’t upgrade every day. So those charts represent only a subset of the total subscribers of those channels. The longer a package stays without being updated, the closer its download count gets to the information we need.

Charts are not the best tool to obtain that information. They are nice to give trends, but here, we need numbers. Here is what I got for the daily PPA:

To solve the problem of packages being updated before all the subscribers got a chance to update, we need a more stable hint. A dependency in the same PPA could do the trick. In the case of Chromium, I can track a lib called libvpx0 (used to provide WebM support in the <video> tag).

It seems far better. libvpx 0.9.6 is otherwise only available in the official repository for natty, and it’s unlikely that anyone would use this PPA just to get the lib without taking chromium. Also, this version has been in the PPA for 7 weeks, which should be enough. We could then consider that this PPA has 34600 maverick + 21258 lucid + 7245 karmic + 1148 hardy + an unknown number of natty subscribers. That unknown number of natty users could be estimated using the previous table at roughly 30% of the number of maverick users, or ~10 000. All in all, that means the PPA has about 75000 subscribers, 5 times more than the 15000 shown in the graph.
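
For those who want to redo the sum, here is a small sketch in Python. The per-release counts are the ones quoted above; the 30% natty-to-maverick ratio is my rough estimate, not a measured value, and the function name is invented for the example.

```python
# Download counts for libvpx0 0.9.6 per Ubuntu release, as quoted above.
downloads = {"maverick": 34600, "lucid": 21258, "karmic": 7245, "hardy": 1148}

# Rough assumption: natty subscribers are about 30% of maverick ones.
NATTY_TO_MAVERICK = 0.30


def estimate_subscribers(per_release, natty_ratio=NATTY_TO_MAVERICK):
    """Return (known, estimated natty, total) subscriber counts."""
    known = sum(per_release.values())
    natty = round(per_release["maverick"] * natty_ratio)
    return known, natty, known + natty


known, natty, total = estimate_subscribers(downloads)
print(f"known releases: {known}")
print(f"natty (estimated): {natty}")
print(f"total: {total}")
print(f"ratio to the 15000 read off the chart: {total / 15000:.1f}")
```

Changing `NATTY_TO_MAVERICK` shows how much the total depends on that guess: natty is only the estimated part, so the four measured releases dominate the result.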

There’s another piece of information we can get using this libvpx0.

Here, we can see the update pattern: 0.9.6 replaced a previous version on Mar 8. After the initial rush, it quickly decreased as expected, but after 3 weeks, there’s still a background of 500 to 600 installs per day (still not counting natty). They represent the number of new subscribers. That’s an impressive number for a PPA, even more so for one containing daily builds.

When I do the same thing for the other channels, here is what I get:

Overall, that’s about 6 times the first estimation using the graphs: 136k PPA subscribers, with about 1000 more every day. You can check for yourself.

Also worth noting is that with karmic and hardy reaching end of life soon, dropping those means leaving 10k orphans behind.

After a lot of estimations and hypotheses, we get 1 million users, give or take (hm, a lot), out of which 136k come from one of the 4 PPAs, without any overlap.

What do you think? Does it look realistic or is it pure fiction?

  1. Jef Spaleta
    April 29, 2011 at 21:09

    Note,
    Ubuntu’s popcon is not run in the same manner that Debian’s popcon is. Debian’s popcon cleans out stale UUIDs that have not reported in, on an approximately monthly basis.

    Based on my testing of the Ubuntu popcon, I have concluded that Ubuntu’s popcon is not doing this cleaning of stale UUIDs. Look at the less popular arches like HPPA or Sparc: the installed stats are monotonically increasing and are flat over the span of months. This is evidence that Ubuntu’s popcon is not cleaning. Some of the sparc and HPPA counts are my own hand-crafted reports into the system, and they were never discarded even though I’ve only reported using the UUID once.

    You can also see the effect when you compare top “voted” and “installed” packages. “Voted” has a clear meaning with regard to recent use. In the Debian popcon, top “installed” and top “voted” numbers match up pretty well in the raw numbers. Not so for the Ubuntu popcon. I can go into why this implies a lack of UUID cleaning in more detail if it’s not clear to you when looking at the numbers.

    With that said, I don’t think you can look at the Ubuntu popcon counts as “installed” in the same way that Debian intended for popcon to mean. I have no idea how large the impact of aggregate accumulation of stale UUIDs is, but it does change the interpretation in a non-subtle way. Please be aware of that when quoting the numbers. The lack of UUID cleaning is not _wrong_ or _malicious_, it’s just _different_ and you have to take it into account when looking at the numbers.

    Also please note that no one from Canonical has ever gone on record with an explanation of how they got their 12 million user estimate number. Until the methodology is articulated, you should avoid requoting the number. As of this summer Mark Shuttleworth said the number is whatever they want it to be.

    http://irclogs.ubuntu.com/2010/10/14/%23ubuntu-classroom.html

    popey asked: We often see figures for how many Ubuntu installs there are, 8 million here, 12 million there. Can you give us definitive (near enough) figures and tell us how you arrive at them? This would help dispell some naysayers who claim we’re making these numbers up.
    sabdfl no, i have no definitive answer
    sabdfl there are stats
    sabdfl but we can make those say whatever we want

    -jef

    • fta
      April 29, 2011 at 21:21

      Those 12 million came from Wikipedia, which has a link to an interview of a Canonical employee making that claim. Yet, I agree that there’s no methodology exposed in that interview, even less something that could be verified. Note that I never said I trusted this number, please re-read my article, it’s all conditional.

      otoh, I think I clearly explained my own methodology, including all the doubts I have wrt popcon and the Launchpad bugs. Until there’s something better, that’s all I have.

  2. Charles Bowman
    April 29, 2011 at 21:33

    Well I’m just one user but installing Chromium & your ppa is the first thing I do after installing a new system. I can’t imagine using the web without it.

    Thank you.

  3. Jef Spaleta
    April 29, 2011 at 21:57

    I’m not suggesting that you trust it. I’m suggesting that it should never be repeated without pointing to the irclog reference I have provided where Shuttleworth states “we can make it anything we want.” Because even if you don’t trust it…just repeating it..lends more credence than it deserves for a number that is pulled out of thin air.

    I’m still chewing my way through your ppa stats. As this is the first time I’ve seen someone attempt to hold up PPA stats, I’ve not yet done my own analysis to validate whether it’s a good methodology or not. I will refrain from commenting on the PPA analysis until I’ve done that.

    With respect to popcon, you can use the “vote” column to re-orient your results.
    Looking at the top “vote” packages in Ubuntu, you see that “installed” and “vote” are off by an order of magnitude. In debian they are quite close.

    To rescale Ubuntu’s popcon take the top “voted” package and come up with a scaling factor for UUID deadloss for all other packages like this:

    debianutils voted/installed = 191772/1925165 = 0.0996
    dpkg voted/installed = 191469/1925004 = 0.0995
    perl-base voted/installed = 190592/1925110 = 0.0990

    So be generous and say that the Ubuntu popcon install column is on average ten times larger than the number of actual active UUIDs. Ubuntu popcon has an order of magnitude of stale UUID deadloss polluting its installed stats. Now you can take that 0.1 factor and, looking at any package’s install count, use the higher of either 0.1*installed or voted as the estimate for live UUIDs with that package.

    You can do the same scale factor for debian. And you get something close to .999 so there’s effectively no correction factor needed for UUID deadloss, as debian corrects for that every month when they clean their stale UUIDs.

    -jef
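
The rescaling described in the comment above can be sketched in Python as follows. The (voted, installed) pairs are the ones quoted in the comment. The function names are invented for the example, and taking the larger of the scaled install count and the vote count is one reading of the comment, so treat it as an assumption.

```python
# (voted, installed) pairs quoted in the comment above.
top_packages = {
    "debianutils": (191772, 1925165),
    "dpkg": (191469, 1925004),
    "perl-base": (190592, 1925110),
}

# Ratio of recently active reporters to all accumulated reporters.
ratios = {name: voted / installed
          for name, (voted, installed) in top_packages.items()}

# "Be generous": keep the highest ratio, which is close to 0.1.
scale = max(ratios.values())


def estimate_live(installed, voted=0, factor=scale):
    """Estimate live UUIDs for a package.

    Assumed reading of the comment: the larger of the scaled
    install count and the vote count.
    """
    return max(factor * installed, voted)


for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
print(f"scaled chromium-browser installs: {estimate_live(146162):.0f}")
```

The vote count for chromium-browser is not quoted in this thread, so the last line only shows the scaled install count, which acts as a lower bound in this reading.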

    • fta
      April 29, 2011 at 22:30

      Your 0.1 ratio is obviously bogus. It may work for some packages but not as a general ratio for everything. I track a large number of packages, including my own, some of which I phased out, and their popcon “installed” figures consequently went stale then shrank to almost nothing. I have no reason to believe everything popcon is reporting is bogus.
      Also, PPA stats gave me 136k. Even with a 10% error margin, it’s not possible to say popcon with its 146k “installs” is ten times overestimated. That doesn’t compute. It would mean no one is using the package in the official repositories and I have reasons to believe it’s not true.

      Also, see my comment about gwibber being abnormally low. It seems to mean that popcon is not representative of the whole ubuntu population. Maybe no one has installed it since lucid and all we get is stats from people who did their initial install long ago. I have no way of knowing that. At least in Debian, there’s a way to track popcon itself per version, giving a hint of which distro is represented. We should do the same in Ubuntu. We should also break down all the results per dist. It’s way too limited in its current state.

  4. Jef Spaleta
    April 29, 2011 at 22:49

    I’m not suggesting popcon is representative of the general user population. Nor does my factor attempt to suggest that. My factor attempts to correct for the accumulation of stale UUIDs that Ubuntu is not culling, to give an accurate picture of the subpopulation who are using popcon and reporting in on a monthly or more frequent basis.

    I make no claim that the subpopulation of popcon is representative and I make no suggestion as to a way to scale the popcon-using subpopulation up to an overall userbase estimate. The scale factor I have mentioned is simply a way to account for the accumulation of inactive UUIDs in the Ubuntu popcon installed numbers. Similar factors could be broken out by release for a higher accuracy factor for each release, but what is on the Ubuntu popcon site is across all releases historically and the factor is meant in that context with all the caveats that implies. The stale UUID accumulation in the public graphs on that site is significant in terms of interpreting the graphs correctly.

    Nor do I claim that everything in Ubuntu popcon is bogus. As far as I am aware from doing my own analysis, the only column which has to be interpreted differently than originally intended by Debian when they introduced popcon is the installed column. The voted, recent and old columns all appear to be valid based on my analysis and my understanding of the upstream scripts which produce the data tables.

    -jef

  5. April 30, 2011 at 01:07

    I have Chrome and Chromium, but I use firefox.

  6. molecule-eye
    April 30, 2011 at 01:24

    I often install Chrome from the dev channel, but I rarely use it. I’m mostly using opera and firefox, and when I need a fast-loading webkit browser I use rekonq (since I’m on KDE). So I’m one of those “inactive” downloaders of Chrome. Why do I install it? Well, once in a while I like to see how it compares to other browsers. On non-linux platforms it’s nice, but on linux ones it just doesn’t integrate AT ALL. It uses its own graphics engine, its own widgets, etc. They all work well, but still, it seems like mostly a bastard child on linux. A shame, really.

  7. Daniel
    April 30, 2011 at 02:19

    Sorry if this is a bit off topic, but as you are the Chromium maintainer for Ubuntu I guess you are the best person to ask: Why use Chromium over Chrome?

  8. Lincoln
    April 30, 2011 at 11:56

    Which tools did you use for the charts?

    • fta
      April 30, 2011 at 21:55

      for the PPA stats? it’s something i wrote. A collector in python using the LP API; and a javascript module for the web page, using jquery flot lib.

      I initially wanted to release the code shortly but there’s no point until bug 767258 is clarified. Also, LP is too slow (bug 774259).

  9. May 28, 2011 at 06:47

    Wow, this is really cool! I’m the creator of UGR, an Ubuntu remix based on GNOME 3, and I think this might be the solution for our stats problem. I see you recently set up stats for the GNOME 3 team, would you mind doing the same for us? We do everything out of PPAs and can set up a static empty package for counting. Thanks!
