pimlico.core.modules.map.singleproc module
Sometimes the simple multiprocessing-based approach to map module parallelization just isn’t suitable. This module provides an equivalent set of implementations and convenience functions that don’t use multiprocessing, but conform to the pool-based execution pattern by creating a single-thread pool.
class pimlico.core.modules.map.singleproc.MultiprocessingMapModuleExecutor(module_instance_info, **kwargs)
Bases: pimlico.core.modules.map.DocumentMapModuleExecutor
- POOL_TYPE = None
class pimlico.core.modules.map.singleproc.SingleThreadMapPool(executor)
Bases: pimlico.core.modules.map.DocumentProcessorPool
A base implementation of document map parallelization using a single thread.
- THREAD_TYPE = None
class pimlico.core.modules.map.singleproc.SingleThreadMapWorker(input_queue, output_queue, exception_queue, executor)
Bases: threading.Thread, pimlico.core.modules.map.DocumentMapProcessMixin
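The worker's constructor arguments suggest a queue-based design: documents arrive on an input queue, results leave on an output queue, and failures are reported on an exception queue. The following is only a rough sketch of that general pattern using the standard library. `SketchWorker` and `run_single_thread` are hypothetical names, the sentinel convention is an assumption, and none of this is taken from Pimlico's actual implementation.

```python
import queue
import threading


class SketchWorker(threading.Thread):
    """Hypothetical stand-in for a single worker thread (not Pimlico code)."""

    def __init__(self, input_queue, output_queue, exception_queue, process_fn):
        super().__init__(daemon=True)
        self.input_queue = input_queue
        self.output_queue = output_queue
        self.exception_queue = exception_queue
        self.process_fn = process_fn

    def run(self):
        while True:
            item = self.input_queue.get()
            if item is None:
                # Assumed sentinel: no more documents to process
                break
            archive_name, doc_name, doc = item
            try:
                result = self.process_fn(archive_name, doc_name, doc)
            except Exception as exc:
                # Report the failure to the caller and stop processing
                self.exception_queue.put(exc)
                break
            self.output_queue.put((archive_name, doc_name, result))


def run_single_thread(documents, process_fn):
    """Feed documents to one worker thread and collect results in order."""
    in_q, out_q, exc_q = queue.Queue(), queue.Queue(), queue.Queue()
    worker = SketchWorker(in_q, out_q, exc_q, process_fn)
    worker.start()
    for item in documents:
        in_q.put(item)
    in_q.put(None)  # tell the worker to shut down
    worker.join()
    if not exc_q.empty():
        # Re-raise the worker's exception in the calling thread
        raise exc_q.get()
    results = []
    while not out_q.empty():
        results.append(out_q.get())
    return results
```

Because there is only one worker, results come back in input order, which is one reason a single-thread pool can be easier to debug than a multiprocessing one.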
pimlico.core.modules.map.singleproc.single_process_executor_factory(process_document_fn, preprocess_fn=None, postprocess_fn=None)
Factory function for creating an executor that uses the single-process implementations of document-map pools and workers. This is an easy way to implement a non-parallelized executor.
process_document_fn should be a function that takes the following arguments:
- the executor instance (allowing access to things set during setup)
- archive name
- document name
- the rest of the args are the document itself, from each of the input corpora
If preprocess_fn is given, it is called once before execution begins, with the executor as an argument.
If postprocess_fn is given, it is called at the end of execution, including on the way out after an error, with the executor as an argument and a kwarg error which is True if execution failed.
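As an illustration of the three callback signatures described above, here is a minimal sketch that runs without Pimlico. The executor is replaced by a plain `SimpleNamespace`, and `simulate_run` is a hypothetical driver that only imitates the documented call order. What `process_document_fn` should return is not stated above, so returning a token count here is an assumption.

```python
from types import SimpleNamespace


def preprocess(executor):
    # Called once before execution begins: set up shared state on the executor
    executor.token_counts = {}


def process_document(executor, archive_name, doc_name, *docs):
    # The remaining positional args are the document from each input corpus;
    # this sketch assumes a single corpus of plain-text documents
    text = docs[0]
    n_tokens = len(text.split())
    executor.token_counts[(archive_name, doc_name)] = n_tokens
    return n_tokens


def postprocess(executor, error=False):
    # Called at the end of execution, including after a failure (error=True)
    executor.finished = True
    executor.failed = error


def simulate_run(documents):
    """Hypothetical driver imitating the documented call order (not Pimlico code)."""
    executor = SimpleNamespace()
    preprocess(executor)
    error = True
    results = []
    try:
        for archive_name, doc_name, text in documents:
            results.append(process_document(executor, archive_name, doc_name, text))
        error = False
    finally:
        # Runs on success and on the way out after an exception
        postprocess(executor, error=error)
    return executor, results
```

With Pimlico installed, the same functions would be passed to the factory according to the signature given above: `single_process_executor_factory(process_document, preprocess_fn=preprocess, postprocess_fn=postprocess)`.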