Librarian files uploaded with addFile() can not be read from the librarian til the transaction is committed (librarian is intermingled with LP)

Bug #113993 reported by James Henstridge
Affects           Status   Importance  Assigned to  Milestone
Launchpad itself  Triaged  High        Unassigned

Bug Description

There are a number of cases in Launchpad where we want to upload a file to the librarian and then read it back within the same request. There are two methods of uploading a file to the librarian, and both are problematic at present.

The first method is addFile(), where the metadata about the file is created on the client side and the data is then sent to the librarian, along with the database IDs of the metadata, for storage. This is useful because the metadata is usable within the request's transaction. However, the content cannot be read back from the librarian until the request's transaction is committed.

The second method is with remoteAddFile(), where the metadata is written to the database on the librarian side. The content can be read from the librarian immediately, but the metadata in the database is unavailable within the request's transaction, due to transactional isolation.
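
The isolation effect described above can be reproduced in miniature. The following is only a sketch using two SQLite connections, not Launchpad's actual database setup; the table name `alias` and the function name are hypothetical. One connection stands in for the request's transaction and the other for the librarian, and a row written by one is invisible to the other until it is committed.

```python
import os
import sqlite3
import tempfile


def alias_visibility_across_connections():
    """Return (rows seen before commit, rows seen after commit)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        request = sqlite3.connect(path)    # stands in for the request's transaction
        librarian = sqlite3.connect(path)  # stands in for the librarian's connection
        try:
            # Hypothetical metadata table; committed so both sides can see it.
            request.execute(
                "CREATE TABLE alias (id INTEGER PRIMARY KEY, content_id INTEGER)"
            )
            request.commit()

            # Metadata written inside the request's still-open transaction.
            request.execute("INSERT INTO alias (content_id) VALUES (42)")

            # The other connection cannot see the uncommitted row.
            before = librarian.execute("SELECT COUNT(*) FROM alias").fetchall()[0][0]

            request.commit()

            # Once committed, the row becomes visible.
            after = librarian.execute("SELECT COUNT(*) FROM alias").fetchall()[0][0]
            return before, after
        finally:
            request.close()
            librarian.close()
```

The same effect applies in both directions: with addFile() the librarian cannot see the client's uncommitted metadata, and with remoteAddFile() the request cannot see metadata committed by the librarian after the request's snapshot was taken.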

The proposed solution to this problem is to add another method usable from the backend librarian port (which is not exposed to the internet) to download the content for a given content ID, without the librarian checking for its presence in the database. This would make it possible to fetch the content for a file uploaded with addFile() from the request/transaction where it was uploaded. The librarian would not have access to the mime type or file alias when servicing one of these requests.

As this feature would only be available from the backend port, it should not raise any security concerns. The client library would need to be responsible for any additional checks that the librarian currently does, such as checking the deleted flag.
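
A minimal sketch of what such a client-side check could look like follows. Everything here is an assumption for illustration: `ContentRecord`, `ContentDeleted`, and `read_content` are hypothetical names rather than part of the existing client library, and `fetch` stands for whatever function retrieves bytes by content ID from the backend port.

```python
from dataclasses import dataclass
from typing import Callable


class ContentDeleted(Exception):
    """Raised when the metadata marks the content as deleted."""


@dataclass
class ContentRecord:
    # Hypothetical stand-in for the metadata row that is visible
    # inside the request's own transaction.
    content_id: int
    deleted: bool = False


def read_content(record: ContentRecord, fetch: Callable[[int], bytes]) -> bytes:
    """Apply the check the librarian would normally make, then fetch by ID."""
    if record.deleted:
        # The backend port would not consult the database, so the
        # client has to refuse deleted content itself.
        raise ContentDeleted(record.content_id)
    return fetch(record.content_id)
```

Passing `fetch` in as a parameter keeps the check independent of how the backend request is actually made.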

Andrew Bennetts (spiv) wrote :

This solution sounds good to me. +1

James Henstridge (jamesh) wrote :

From talking with Andrew on IRC, it looks like it'd be easier to add a second backend port served by another twisted.web class, rather than trying to rejig the upload protocol to handle a download verb too.

This second one could handle requests as simple as:

    C: GET /$content_id HTTP/1.1
    C:
    S: HTTP/1.1 200 OK
    S: Content-Type: application/octet-stream
    S: Content-Length: xxx
    S:
    S: ...

It can then look up the content ID directly in its store and serve it. As no alias ID is involved, the server can not know what the mime type is, so I've used "application/octet-stream" above.
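
As a rough illustration of that exchange, here is a sketch of such a download-only endpoint. It uses Python's standard `http.server` in place of twisted.web purely to stay self-contained, and it assumes content is stored as one file per content ID in a flat directory, which may not match the librarian's real storage layout. The names `make_handler` and `serve_in_background` are hypothetical.

```python
import http.server
import os
import threading


def make_handler(storage_dir):
    """Build a handler class that serves raw content by numeric ID."""

    class ContentHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            content_id = self.path.lstrip("/")
            # Accept only plain ASCII digits, so the path cannot escape storage_dir.
            if not (content_id.isascii() and content_id.isdigit()):
                self.send_error(404)
                return
            try:
                with open(os.path.join(storage_dir, content_id), "rb") as f:
                    data = f.read()
            except FileNotFoundError:
                self.send_error(404)
                return
            self.send_response(200)
            # No alias is involved, so the real mime type is unknown.
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ContentHandler


def serve_in_background(storage_dir):
    """Start the server on a free loopback port and return it."""
    server = http.server.ThreadingHTTPServer(
        ("127.0.0.1", 0), make_handler(storage_dir)
    )
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Binding to the loopback address mirrors the requirement that this port is never exposed to the internet; no database lookup happens anywhere in the request path.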

Changed in launchpad:
importance: Undecided → Medium
status: New → Confirmed
visibility: private → public
Stuart Bishop (stub) wrote :

We could serve the file at this point without the mime type, but more of a concern now is the file's privacy settings. Without access to the LibraryFileAlias record in the database, the Librarian does not know whether the file can be retrieved from the public-facing port or not.

Changed in launchpad-foundations:
status: Triaged → Won't Fix
Aaron Bentley (abentley) wrote :

The suggested approaches don't involve retrieving the file from a public-facing port, so the rationale for setting this bug to "won't fix" doesn't apply.

"The proposed solution to this problem is to add another method usable from the backend librarian port (which is not exposed to the internet) to download the content for a given content ID, without the librarian checking for its presence in the database."

"it looks like it'd be easier to add a second backend port served by another twisted.web class, rather than trying to rejig the upload protocol to handle a download verb too."

Changed in launchpad-foundations:
status: Won't Fix → Triaged
Stuart Bishop (stub) wrote :

@Aaron - do you have a use case for this?

If we do this, I'd just allow retrieving the file via HTTP from the private port (without the content type header, or with a dummy content type).

Robert Collins (lifeless) wrote :

So, we want to move the librarian to a separate service with its own DB; this will address that.

Changed in launchpad:
importance: Medium → High
tags: added: soa
summary: Librarian files uploaded with addFile() can not be read from the
- librarian til the transaction is committed
+ librarian til the transaction is committed (librarian is intermingled
+ with LP)
Jeff Lane (bladernr)
tags: removed: soa