Raw popcon data

Bug #293132 reported by Siegfried Gevatter
Affects: Ubuntu Website - OBSOLETE
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Hi,

I'm thinking about parsing the popcon data to a) get a list of packages that might be worth including in Ubuntu, and b) list the popcon of each package on REVU (as they may already be available from PPAs, getdeb, etc.), but I've found two problems with this so far:

1. File http://popcon.ubuntu.com/all-popcon-results.txt.gz seems to be broken (geany complains about it if I open it after gunzip'ing it and Python fails trying to extract it, too).

2. The raw file doesn't seem to include the number of installations of each package, though the website shows it.

Siegfried Gevatter (rainct) wrote :

Update: 1 only happens if I use Python to download the file («tmpfile.write(urllib2.urlopen(popcon_url).read())»). Reading the file saved like that gives: «IOError: CRC check failed 0xfdd47f52 != 0x92460f0L».
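For anyone reproducing this on Python 3, where urllib2.urlopen has become urllib.request.urlopen, here is a minimal sketch of a download that streams the response to disk, flushes it before the file is read back, and then checks the gzip trailer. The report does not say why the original download came out incomplete, so the flush is only a precaution, not a confirmed fix. The helper names download_to_tempfile and gzip_is_intact are made up for this example.

```python
import gzip
import shutil
import tempfile
import urllib.request
import zlib


def download_to_tempfile(url):
    """Stream `url` into a named temporary file and return the open file object."""
    tmpfile = tempfile.NamedTemporaryFile(suffix=".gz")
    with urllib.request.urlopen(url) as response:
        # Copy in chunks instead of holding the whole body in memory.
        shutil.copyfileobj(response, tmpfile)
    # Make sure buffered bytes reach the disk before anything reopens the file by name.
    tmpfile.flush()
    tmpfile.seek(0)
    return tmpfile


def gzip_is_intact(path):
    """Return True if `path` decompresses to the end, including the CRC check."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read everything; the CRC is only verified once the stream ends.
            while fh.read(1024 * 1024):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

Calling gzip_is_intact(tmpfile.name) right after the download would show whether the saved copy is complete before any parsing starts.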

Siegfried Gevatter (rainct) wrote :

19:19 < Ng> RainCT: I just downloaded the file and unzipped it and I can read it fine?
19:20 < Ng> I just used gunzip and less
19:21 < RainCT> Ng: right, nevermind. Seems it was just a problem with how I downloaded it, and I've just found a
                way that works :)
19:21 < Ng> I would assume that the file takes a little while to generate, so maybe it's a case of grabbing it
            before it's ready
19:21 < Ng> ah
19:22 < RainCT> tmpfile.write(urllib2.urlopen(popcon_url).read()) failed to get the complete file, but now I've
                found out about urllib.urlretrieve(popcon_url, tmpfile.name) which works
19:23 < RainCT> there's still the problem of inst not being there
19:24 < RainCT> eg, the file contains «Package: tar 63901 885477 23056 135», but the website shows «18 tar 972569 63901 885477 23056 135 (Bdale Garbee)», where the 972569 is the number of people who
                installed the package
19:25 < RainCT> ah, but that is 885477+63901+23056, so nevermind too :P
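
Checking the arithmetic on the tar example: 63901 + 885477 + 23056 comes to 972434, and adding the last column (135) gives the 972569 shown on the website, so the installation count appears to be the sum of all four numeric columns. The sketch below parses a "Package:" line on that basis. The field names vote, old, recent and no_files are an assumption taken from the column order popcon normally uses; they do not appear in the raw file itself, and PopconEntry and parse_package_line are hypothetical names.

```python
from typing import NamedTuple, Optional


class PopconEntry(NamedTuple):
    """One 'Package:' line; field names are assumed from popcon's usual column order."""
    name: str
    vote: int
    old: int
    recent: int
    no_files: int

    @property
    def inst(self) -> int:
        # Installation count as the sum of all four numeric columns.
        return self.vote + self.old + self.recent + self.no_files


def parse_package_line(line: str) -> Optional[PopconEntry]:
    """Parse a line such as 'Package: tar 63901 885477 23056 135'; return None otherwise."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "Package:":
        return None
    try:
        vote, old, recent, no_files = (int(value) for value in fields[2:])
    except ValueError:
        return None
    return PopconEntry(fields[1], vote, old, recent, no_files)


entry = parse_package_line("Package: tar 63901 885477 23056 135")
print(entry.name, entry.inst)
```

Lines that do not match the expected six-field shape return None, so header or summary lines in the raw file can simply be skipped.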

Changed in ubuntu-website:
status: New → Invalid