tar

Wrapper around a tar reader that provides the same interface as Pimarc.

This means we can deprecate the use of tar files, but keep backwards compatibility for a time, whilst moving over to direct use of Pimarc objects.

class PimarcTarBackend(archive_filename)

Bases: object

open()
close()
iter_filenames()

Just iterate over the filenames (decoded if necessary). Used to create metadata, check for file existence, etc.

Not as fast as with Pimarc, as we need to pass over the whole archive file to read all the names.
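To illustrate why listing names means passing over the whole archive, here is a minimal sketch using only Python's standard tarfile module. It is not the PimarcTarBackend implementation; build_demo_tar and iter_tar_filenames are hypothetical helpers written for this example. A tar file has no central index, so each member header must be read in turn to collect the names.

```python
import io
import tarfile


def build_demo_tar(files):
    """Build an in-memory tar archive from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


def iter_tar_filenames(fileobj):
    """Yield member names by scanning the archive from start to end.

    Hypothetical stand-in for iter_filenames(): tar has no index, so
    every member header is visited to find out the names.
    """
    with tarfile.open(fileobj=fileobj, mode="r") as archive:
        for member in archive:
            yield member.name


demo = build_demo_tar({"doc1.txt": b"first", "doc2.txt": b"second"})
names = list(iter_tar_filenames(demo))
print(names)
```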

iter_metadata()

Iterate over all files in the archive, yielding just the metadata, skipping over the data.

iter_files(skip=None, start_after=None)

Iterate over files, together with their JSON metadata, which includes their name (as “name”).

Parameters:
  • start_after – skips all files that come before the one with the given name, which is expected to be in the archive
  • skip – skips over the first portion of the archive, until this number of documents has been seen. Ignored if start_after is given.
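The interaction between the two parameters can be sketched over a plain list of (name, data) pairs. This is an illustrative stand-in, not the library's code: iter_with_offsets is a hypothetical function, and it assumes that the file named by start_after is itself skipped, as the parameter name suggests. The documentation above does not state this explicitly.

```python
def iter_with_offsets(files, skip=None, start_after=None):
    """Yield (name, data) pairs, honouring skip and start_after.

    Hypothetical sketch of the parameter semantics described above.
    Assumption: the file named by start_after is not yielded itself.
    """
    if start_after is not None:
        # start_after takes precedence, so skip is ignored entirely
        started = False
        for name, data in files:
            if started:
                yield name, data
            elif name == start_after:
                started = True
        return

    for index, (name, data) in enumerate(files):
        # Pass over the first `skip` documents without yielding them
        if skip is not None and index < skip:
            continue
        yield name, data


docs = [("a", b"1"), ("b", b"2"), ("c", b"3")]
after_skip = [name for name, _ in iter_with_offsets(docs, skip=1)]
after_name = [name for name, _ in iter_with_offsets(docs, start_after="a")]
both_given = [name for name, _ in iter_with_offsets(docs, skip=2, start_after="a")]
```

In this sketch all three lists come out as the files "b" and "c": the last call shows skip being ignored once start_after is supplied.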