cobra_db.scripts.stage_1_ingest_images

usage: cobra_ingest [-h] -dn DRIVE_NAMES [DRIVE_NAMES ...] -dp DRIVE_PATHS [DRIVE_PATHS ...] [-ho HOST] [-p PORT] [-u USERNAME] -db DATABASE_NAME [-n NUM_PROCESSES] [--project_name PROJECT_NAME]

Recursively ingest a set of directories of dicom files into a mongo database.

optional arguments:
-h, --help

show this help message and exit

-dn DRIVE_NAMES [DRIVE_NAMES ...], --drive_names DRIVE_NAMES [DRIVE_NAMES ...]

the name of the drives to ingest

-dp DRIVE_PATHS [DRIVE_PATHS ...], --drive_paths DRIVE_PATHS [DRIVE_PATHS ...]

the path where the drives are mounted (in the same order as drive_names)

-ho HOST, --host HOST

MongoDB host

-p PORT, --port PORT

MongoDB port

-u USERNAME, --username USERNAME

MongoDB username

-db DATABASE_NAME, --database_name DATABASE_NAME

MongoDB database

-n NUM_PROCESSES, --num_processes NUM_PROCESSES

Number of processes to use. It runs single threaded by default so that it can be easily debugged

--project_name PROJECT_NAME

A project name that will be stored in the _metadata field of the created ImageMetadata docs
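
To illustrate how the options above combine, here is a minimal sketch that assembles one invocation as an argument list and prints it as a shell command. The drive names, mount points, database name and project name are all hypothetical placeholders; only the flags themselves come from the usage text.

```python
import shlex

# Hypothetical drive names and mount points; drive_paths must follow
# the same order as drive_names.
drive_names = ["drive_a", "drive_b"]
drive_paths = ["/mnt/drive_a", "/mnt/drive_b"]

cmd = [
    "cobra_ingest",
    "-dn", *drive_names,
    "-dp", *drive_paths,
    "-db", "cobra_demo",               # hypothetical database name
    "-n", "4",                         # use four processes
    "--project_name", "demo_project",  # hypothetical project name
]

# Printable form of the command. Passing cmd to subprocess.run(cmd, check=True)
# would launch it, provided cobra_ingest is installed and on PATH.
command_line = shlex.join(cmd)
print(command_line)
```

Omitting -n leaves the script in its default single-process mode, which the help text recommends for debugging.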

Module Contents

Functions

parse_arguments(→ Tuple[str, dict, int, str])

Convert the console arguments into python variables.

ingest_metadata(→ int)

Ingest a single dicom file's metadata into the database.

process_files_batch(→ Tuple[int, int])

Process a batch of files.

multiproc_drive(args)

multiproc(mount_paths, connector_kwargs, n_proc, ...)

single_proc(mount_paths, connector_kwargs, project_name)

Run the same ingestion as multiproc, but in a single process

main(mount_paths, connector_kwargs, n_proc, project_name)

cli()

cobra_db.scripts.stage_1_ingest_images.parse_arguments(raw_args: List[str]) → Tuple[str, dict, int, str]

Convert the console arguments into python variables.

Parameters:

raw_args – the list of commands (usually sys.argv[1:])

Returns:

mount_paths, connector_kwargs, n_proc, project_name
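
As a rough illustration of this conversion, the sketch below rebuilds a parser from the usage text with argparse. It is not the module's actual source: the function name parse_arguments_sketch, the default host and port, the keys of connector_kwargs, and the pairing of drive names with drive paths into a dict are all assumptions.

```python
import argparse
from typing import Dict, List, Optional, Tuple


def parse_arguments_sketch(
    raw_args: List[str],
) -> Tuple[Dict[str, str], dict, int, Optional[str]]:
    """Hypothetical reconstruction of the documented command-line interface."""
    parser = argparse.ArgumentParser(prog="cobra_ingest")
    parser.add_argument("-dn", "--drive_names", nargs="+", required=True)
    parser.add_argument("-dp", "--drive_paths", nargs="+", required=True)
    parser.add_argument("-ho", "--host", default="localhost")     # assumed default
    parser.add_argument("-p", "--port", type=int, default=27017)  # assumed default
    parser.add_argument("-u", "--username", default=None)
    parser.add_argument("-db", "--database_name", required=True)
    parser.add_argument("-n", "--num_processes", type=int, default=1)
    parser.add_argument("--project_name", default=None)
    args = parser.parse_args(raw_args)

    # The help text says drive_paths follow the order of drive_names,
    # so the two lists are paired position by position.
    if len(args.drive_names) != len(args.drive_paths):
        parser.error("drive_names and drive_paths must have the same length")
    mount_paths = dict(zip(args.drive_names, args.drive_paths))

    # Key names are assumptions; the real connector may expect different ones.
    connector_kwargs = {
        "host": args.host,
        "port": args.port,
        "username": args.username,
        "db_name": args.database_name,
    }
    return mount_paths, connector_kwargs, args.num_processes, args.project_name
```

Calling parse_arguments_sketch(["-dn", "a", "-dp", "/mnt/a", "-db", "demo"]) yields a one-entry mount_paths dict and leaves the process count at 1, matching the single-process default described in the help text.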

cobra_db.scripts.stage_1_ingest_images.ingest_metadata(filepath: str, im_dao: cobra_db.mongo_dao.ImageMetadataDao, mount_paths: dict, project_name) → int

Ingest a single dicom file's metadata into the database.

Returns:

1 if it was ingested properly.

cobra_db.scripts.stage_1_ingest_images.process_files_batch(args) → Tuple[int, int]

Process a batch of files.

Parameters:

args – a single tuple of (filepaths, im_dao, mount_paths). Packing everything into one argument lets the function be used with multiprocessing.

Returns:

number of seen files and number of correctly processed files
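
The single-tuple-argument pattern described above can be sketched as follows. This is an illustration only: chunked, looks_like_dicom and process_batch_sketch are hypothetical names, the tuple is simplified to (filepaths, suffix) instead of the documented (filepaths, im_dao, mount_paths), and a suffix check stands in for the real database ingestion.

```python
from multiprocessing import Pool
from typing import List, Sequence, Tuple


def chunked(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def looks_like_dicom(filepath: str, suffix: str) -> int:
    """Hypothetical stand-in for ingest_metadata: 1 on success, 0 otherwise."""
    return 1 if filepath.endswith(suffix) else 0


def process_batch_sketch(args: Tuple[List[str], str]) -> Tuple[int, int]:
    """Take one tuple so the function can be handed directly to Pool.map."""
    filepaths, suffix = args  # unpack the single tuple argument
    n_ok = sum(looks_like_dicom(path, suffix) for path in filepaths)
    return len(filepaths), n_ok  # (files seen, files processed correctly)


if __name__ == "__main__":
    files = ["a.dcm", "b.dcm", "notes.txt", "c.dcm", "d.dcm"]  # hypothetical paths
    jobs = [(batch, ".dcm") for batch in chunked(files, 2)]
    with Pool(processes=2) as pool:
        results = pool.map(process_batch_sketch, jobs)
    seen = sum(r[0] for r in results)
    ok = sum(r[1] for r in results)
    print(f"seen={seen} ok={ok}")
```

Pool.map passes exactly one item from the iterable to each call, which is why the per-batch inputs are bundled into a tuple and unpacked inside the worker.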

cobra_db.scripts.stage_1_ingest_images.multiproc_drive(args)

cobra_db.scripts.stage_1_ingest_images.multiproc(mount_paths: dict, connector_kwargs: dict, n_proc: int, project_name: str)

cobra_db.scripts.stage_1_ingest_images.single_proc(mount_paths: dict, connector_kwargs: dict, project_name: str)

Run the same ingestion as multiproc, but in a single process

Parameters:
  • mount_paths – Where the drives are

  • connector_kwargs – How to connect to the db

  • project_name – name of the project that the images belong to
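
A single-process run of this kind might look like the sketch below, which walks every mounted drive recursively and hands each file to a callback. The names iter_files, single_proc_sketch and ingest_one are hypothetical, and the callback replaces the real database access, so no connector_kwargs are needed here.

```python
import os
from typing import Callable, Dict, Iterator, Tuple


def iter_files(mount_paths: Dict[str, str]) -> Iterator[str]:
    """Yield every file path found under each mounted drive, recursively."""
    for root in mount_paths.values():
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                yield os.path.join(dirpath, name)


def single_proc_sketch(
    mount_paths: Dict[str, str],
    ingest_one: Callable[[str], int],
) -> Tuple[int, int]:
    """Process files one at a time; return (files seen, files ingested)."""
    seen = 0
    ok = 0
    for filepath in iter_files(mount_paths):
        seen += 1
        ok += ingest_one(filepath)  # expected to return 1 on success, 0 otherwise
    return seen, ok
```

Because everything happens in one process, an exception raised by ingest_one surfaces with a plain traceback, and a debugger can step straight into it.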

cobra_db.scripts.stage_1_ingest_images.main(mount_paths, connector_kwargs, n_proc, project_name)

cobra_db.scripts.stage_1_ingest_images.cli()