pimlico.core.modules.map.threaded module

Just like multiprocessing, but using threading instead. If you’re not sure which you should use, it’s probably multiprocessing.

class ThreadingMapThread(input_queue, output_queue, exception_queue, executor)[source]

Bases: threading.Thread, pimlico.core.modules.map.DocumentMapProcessMixin

notify_no_more_inputs()[source]

Called when there aren’t any more inputs to come.

run()[source]

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

shutdown(timeout=3.0)[source]
class ThreadingMapPool(executor, processes)[source]

Bases: pimlico.core.modules.map.DocumentProcessorPool

THREAD_TYPE = None
start_worker()[source]
static create_queue(maxsize=None)[source]
shutdown()[source]
class ThreadingMapModuleExecutor(module_instance_info, **kwargs)[source]

Bases: pimlico.core.modules.map.DocumentMapModuleExecutor

POOL_TYPE = None
create_pool(processes)[source]

Should return an instance of the pool to be used for document processing. Should generally be a subclass of DocumentProcessorPool.

Always called after preprocess().

postprocess(error=False)[source]

Allows subclasses to define a finishing procedure to be called after corpus processing if finished.

threading_executor_factory(process_document_fn, preprocess_fn=None, postprocess_fn=None, worker_set_up_fn=None, worker_tear_down_fn=None)[source]

Factory function for creating an executor that uses the threading-based implementations of document-map pools and worker processes.

process_document_fn should be a function that takes the following arguments:

  • the worker process instance (allowing access to things set during setup)
  • archive name
  • document name
  • the rest of the args are the document itself, from each of the input corpora

If proprocess_fn is given, it is called from the main thread once before execution begins, with the executor as an argument.

If postprocess_fn is given, it is called from the main thread at the end of execution, including on the way out after an error, with the executor as an argument and a kwarg error which is True if execution failed.

If worker_set_up_fn is given, it is called within each worker before execution begins, with the worker thread instance as an argument. Likewise, worker_tear_down_fn is called from within the worker thread before it exits.

Alternatively, you can supply a worker type, a subclass of :class:.ThreadingMapThread, as the first argument. If you do this, worker_set_up_fn and worker_tear_down_fn will be ignored.