trainer.lib package¶

Subpackages¶

trainer.lib.tgui package

Submodules¶

trainer.lib.DslSemantics module¶

class trainer.lib.DslSemantics.DslSemantics(max_resources=10000)¶

Bases: abc.ABC

Defines execution context and utility functionality for running python statements from strings.

Child classes define methods that can be called inside of the DSL constructs.

bind_object(o: Any)¶

compile_prog(prog: str)¶

execute_program(state: Dict) → Generator¶

static gen_wrapper(f: Callable) → _GenericAlias¶

For example, converts a function f: bool -> int to a generator f: Generator[bool] -> Generator[int].

Parameters: f – A callable
Returns: A generator with the semantics of f
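The lifting described for gen_wrapper can be illustrated with a small standalone sketch. It does not use the trainer package; the name lift_to_generator is hypothetical, and the real gen_wrapper may differ in how it handles multiple arguments.

```python
from typing import Callable, Iterable, Iterator


def lift_to_generator(f: Callable) -> Callable[..., Iterator]:
    """Turn f: A -> B into a function mapping a stream of A to a stream of B."""
    def wrapped(*arg_streams: Iterable) -> Iterator:
        # Consume the argument streams in lockstep and apply f to each tuple.
        for args in zip(*arg_streams):
            yield f(*args)
    return wrapped


# Hypothetical usage: int maps bool -> int, so the lifted version
# maps a generator of bools to a generator of ints.
to_int = lift_to_generator(int)
converted = list(to_int(iter([True, False, True])))
```

The lifted function stays lazy: values are only computed when the resulting generator is consumed.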

static generate_enum(e: enum.EnumMeta) → Generator¶

static is_callable(f: Any) → bool¶

trainer.lib.config module¶

The config of trainer is stored in the user directory in a folder ‘.trainer’.

The database connection string should be stored in the config file under the key ‘db_con’, for example: postgresql+psycopg2://postgres:password@127.0.0.1:5432/db_name
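As a rough illustration of such a config, the sketch below writes and reads a JSON file containing the ‘db_con’ key. The file name config.json and both helper functions are assumptions made for this example; they are not the package's load_config_json or save_config_json. A temporary directory stands in for the ‘.trainer’ folder in the user directory.

```python
import json
import tempfile
from pathlib import Path


def save_config(config_dir: Path, obj: dict, file_name: str = "config.json") -> Path:
    """Write the config dict as JSON (file name is a hypothetical choice)."""
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / file_name
    path.write_text(json.dumps(obj, indent=2))
    return path


def load_config(config_dir: Path, file_name: str = "config.json") -> dict:
    """Read the config dict back from disk."""
    return json.loads((config_dir / file_name).read_text())


# A temporary directory is used so the example never touches a real config.
with tempfile.TemporaryDirectory() as tmp:
    cfg_dir = Path(tmp) / ".trainer"
    save_config(cfg_dir, {
        "db_con": "postgresql+psycopg2://postgres:password@127.0.0.1:5432/db_name"
    })
    loaded = load_config(cfg_dir)
```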

trainer.lib.config.load_config_json()¶

trainer.lib.config.save_config_json(obj=None)¶

trainer.lib.data_model module¶

Data Model¶

The data model aims to simplify machine learning on complex data structures. For example, it supports classifying a subject (a medical patient) by both its gender and between one and four ultrasound videos.

A dataset contains:

Subjects (Which are the training examples)
Model Weights
JSON config files (class definitions and segmentation mask definitions)

A Subject contains:

Images & masks
Classes
JSON config files

Multiple different types of binaries are supported.

Image stacks are used for images, videos and 3D images. Shape of an image stack: [#frames, width, height, #channels]

Segmentation Masks (‘img_mask’) are used to store every annotated structure for one frame of an imagestack. Shape of a mask: [width, height, #structures]

Miscellaneous objects are general pickled objects.

class trainer.lib.data_model.ClassDefinition(**kwargs)¶

Bases: sqlalchemy.ext.declarative.api.Base

classmethod build_new(name: str, cls_type: trainer.lib.data_model.ClassType, values: List[str])¶

cls_type¶

id¶

name¶

values¶

class trainer.lib.data_model.ClassType(value)¶

Bases: enum.Enum

An enumeration.

Binary = 'binary'¶

Nominal = 'nominal'¶

Ordinal = 'ordinal'¶

class trainer.lib.data_model.Classifiable¶

Bases: object

classes = Column(None, JSONB(astext_type=Text()), table=None)¶

get_class(class_name: str)¶

classmethod query_all_with_class(session: sqlalchemy.orm.session.Session, class_name: str)¶

remove_class(class_name: str)¶

set_class(class_name: str, class_val: str)¶

class trainer.lib.data_model.Dataset(**kwargs)¶

Bases: sqlalchemy.ext.declarative.api.Base

A dataset is a collection of splits.

add_split(split_name: str) → trainer.lib.data_model.Split ¶

classmethod build_new(name: str)¶

get_split_by_name(split_name: str)¶

get_summary() → str¶

id¶

name¶

splits: List[Split]¶

class trainer.lib.data_model.ImStack¶

Bases: trainer.lib.data_model.Classifiable, trainer.lib.data_model.NumpyBinary, sqlalchemy.ext.declarative.api.Base

add_ss_mask(gt_arr: numpy.ndarray, sem_seg_tpl: trainer.lib.data_model.SemSegTpl, for_frame=0, ignore_shape_mismatch=False)¶

binary¶

classmethod build_new(src_im: numpy.ndarray, extra_info: Optional[Dict] = None)¶

Only adds single images, not volumes or videos, unless the array already has the shape (frames, width, height, channels). Multi-channel images are assumed to be channels last. Grayscale images are assumed to be of shape (width, height).

The array is saved using type np.uint8 and is expected to have intensities in the range of [0, 255]

Parameters

src_im – Numpy Array. Can be of shape (W, H), (W, H, #C) or (#F, W, H, #C)
extra_info – Extra info for a human. Must contain only standard types to be json serializable
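The accepted input shapes can be summarised with a small sketch that maps each one to the four-dimensional image stack layout. This is only an interpretation of the docstring above; normalize_stack_shape is a hypothetical helper and the actual reshaping inside build_new is not shown in this documentation.

```python
from typing import Sequence, Tuple


def normalize_stack_shape(shape: Sequence[int]) -> Tuple[int, int, int, int]:
    """Map (W, H), (W, H, #C) or (#F, W, H, #C) to (#F, W, H, #C)."""
    if len(shape) == 2:        # grayscale image: one frame, one channel
        w, h = shape
        return (1, w, h, 1)
    if len(shape) == 3:        # multi-channel image, channels last: one frame
        w, h, c = shape
        return (1, w, h, c)
    if len(shape) == 4:        # already a full image stack
        f, w, h, c = shape
        return (f, w, h, c)
    raise ValueError(f"Unsupported shape: {tuple(shape)}")


grayscale = normalize_stack_shape((640, 480))
rgb = normalize_stack_shape((640, 480, 3))
video = normalize_stack_shape((25, 640, 480, 3))
```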

classes¶

dtype¶

extra_info¶

file_path¶

id¶

sbjt_id¶

semseg_masks: List[SemSegMask]¶

shape¶

stored_in_db¶

class trainer.lib.data_model.MaskType(value)¶

Bases: enum.Enum

Possible types that a mask can have.

blob: straightforward region. Is used for most segmentation tasks
A point is usually segmented as a small circle and then postprocessed to be the center of that circle
A line is usually segmented as a sausage and then postprocessed to a single response-line
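The point postprocessing mentioned above can be pictured as reducing a small segmented blob to its center. The sketch below computes the centroid of a binary mask given as nested lists; blob_center is a hypothetical helper, and the postprocessing actually used by the package is not documented here.

```python
from typing import List, Optional, Tuple


def blob_center(mask: List[List[int]]) -> Optional[Tuple[float, float]]:
    """Return the (row, column) centroid of all non-zero pixels, or None."""
    coords = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row)
              if value]
    if not coords:
        return None
    n = len(coords)
    return (sum(r for r, _ in coords) / n, sum(c for _, c in coords) / n)


# A small plus-shaped blob standing in for a segmented "circle".
circle = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
center = blob_center(circle)
```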

Blob = 'blob'¶

Line = 'line'¶

Point = 'point'¶

Unknown = 'unknown'¶

class trainer.lib.data_model.NumpyBinary¶

Bases: object

binary = Column(None, LargeBinary(), table=None)¶

dtype = Column(None, String(), table=None)¶

file_path = Column(None, String(), table=None)¶

get_bin_disk_path()¶

init_on_load()¶

Does the job of the constructor in case of an object which is loaded from the database. See https://docs.sqlalchemy.org/en/13/orm/constructors.html for details.

set_array(arr: numpy.ndarray) → None¶

shape = Column(None, String(), table=None)¶

stored_in_db = Column(None, Boolean(), table=None)¶

values() → numpy.ndarray¶

class trainer.lib.data_model.SemSegClass(**kwargs)¶

Bases: sqlalchemy.ext.declarative.api.Base

classmethod build_new(name: str, ss_type: trainer.lib.data_model.MaskType)¶

id¶

name¶

ss_type¶

tpl_id¶

class trainer.lib.data_model.SemSegMask¶

Bases: trainer.lib.data_model.Classifiable, trainer.lib.data_model.NumpyBinary, sqlalchemy.ext.declarative.api.Base

binary¶

classes¶

dtype¶

file_path¶

for_frame¶

id¶

im_stack_id¶

mtype¶

shape¶

stored_in_db¶

tpl¶

tpl_id¶

class trainer.lib.data_model.SemSegTpl(**kwargs)¶

Bases: sqlalchemy.ext.declarative.api.Base

classmethod build_new(tpl_name: str, seg_types: Dict[str, trainer.lib.data_model.MaskType])¶

id¶

name¶

ss_classes: List[SemSegClass]¶

class trainer.lib.data_model.Split(**kwargs)¶

Bases: sqlalchemy.ext.declarative.api.Base

dataset_id¶

id¶

name¶

sbjts: List[Subject]¶

class trainer.lib.data_model.Subject(**kwargs)¶

Bases: trainer.lib.data_model.Classifiable, sqlalchemy.ext.declarative.api.Base

In a medical context a subject is concerned with the data of one patient. For example, a patient has classes (disease_1, …), imaging (US video, CT volumetric data, x-ray image, …), text (symptom description, history) and structured data (date of birth, nationality…).

The extra_info attribute can be used freely for a json dict.

In future releases a complete changelog will be saved in a format suitable for process mining.

classmethod build_new(name: str, extra_info: Optional[Dict] = None)¶

classes¶

extra_info¶

id¶

ims: List[ImStack]¶

name¶

trainer.lib.data_model.reset_data_model()¶

trainer.lib.gen_utils module¶

class trainer.lib.gen_utils.GenCacher(*args, **kwds)¶

Bases: typing.Generic

Wrapper around a generator that stores the already yielded values and therefore allows indexing.

fill_cache(idx: int)¶

get_cache_len() → int¶

is_exhausted() → bool¶
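The idea of a generator wrapper that remembers yielded values can be sketched as follows. CachedGen is a standalone illustration that reuses the method names listed above; the indexing via __getitem__ and all internals are assumptions, not the package's implementation.

```python
from typing import Generic, Iterable, List, TypeVar

T = TypeVar("T")


class CachedGen(Generic[T]):
    """Wrap an iterable and cache yielded values so they can be indexed."""

    def __init__(self, gen: Iterable[T]):
        self._gen = iter(gen)
        self._cache: List[T] = []
        self._exhausted = False

    def fill_cache(self, idx: int) -> None:
        # Pull values until position idx is cached or the source runs out.
        while len(self._cache) <= idx and not self._exhausted:
            try:
                self._cache.append(next(self._gen))
            except StopIteration:
                self._exhausted = True

    def __getitem__(self, idx: int) -> T:
        # Intended for non-negative indices; raises IndexError past the end.
        self.fill_cache(idx)
        return self._cache[idx]

    def get_cache_len(self) -> int:
        return len(self._cache)

    def is_exhausted(self) -> bool:
        return self._exhausted


squares = CachedGen(x * x for x in range(5))
fourth = squares[3]   # consumes only the first four values
```

Note that exhaustion is only detected once a value beyond the end is requested, so is_exhausted can still be False right after the last value was read.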

trainer.lib.gen_utils.product(gens: List[Generator]) → Generator¶

Utility to compute the cartesian product between an arbitrary number of generators. Developed to handle the case of a possible mix of finite and infinite generators. The built-in itertools.product can only compute the cartesian product between finite generators.

The exploration strategy can be visualized using the following code block:

import matplotlib.pyplot as plt
import trainer.demo_data as dd
import trainer.lib as lib

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlim3d([0.0, 10.0])
ax.set_xlabel('X')
ax.set_ylim3d([0.0, 10.0])
ax.set_ylabel('Y')
ax.set_zlim3d([0.0, 10.0])
ax.set_zlabel('Z')
xs, ys, zs = [], [], []

gens = [
    dd.finite_test_gen(start=0, end=3),
    dd.infinite_test_gen(first=0),
    dd.finite_test_gen(start=0, end=3)
]

for c in lib.product(gens):
    xs.append(c[0])
    ys.append(c[1])
    zs.append(c[2])
    ax.plot(xs=xs[-2:], ys=ys[-2:], zs=zs[-2:])
    fig.show()
    plt.pause(0.01)

[Figure: 3D line plot tracing the order in which product visits the tuples]

Parameters: gens – Between 1 and N generators
Returns: One generator that returns all N-tuples, built from the input generators
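One way to handle a mix of finite and infinite inputs is to cache the values seen so far and visit index tuples in order of increasing index sum, so that every tuple is reached after finitely many steps. The sketch below follows that idea; it is an assumption about a workable strategy, not necessarily the exploration order used by trainer.lib.gen_utils.product, and the names index_tuples and diagonal_product are hypothetical.

```python
from itertools import count, islice
from typing import Iterable, Iterator, List, Tuple


def index_tuples(total: int, parts: int) -> Iterator[Tuple[int, ...]]:
    """Yield every tuple of `parts` non-negative ints that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in index_tuples(total - first, parts - 1):
            yield (first,) + rest


def diagonal_product(gens: List[Iterable]) -> Iterator[Tuple]:
    """Cartesian product that also works when some inputs never end."""
    if not gens:
        return
    iters = [iter(g) for g in gens]
    caches: List[list] = [[] for _ in gens]  # values seen so far, per input
    done = [False] * len(gens)               # True once an input has ended

    def available(i: int, idx: int) -> bool:
        # Pull from input i until position idx is cached or the input ends.
        while len(caches[i]) <= idx and not done[i]:
            try:
                caches[i].append(next(iters[i]))
            except StopIteration:
                done[i] = True
        return idx < len(caches[i])

    for total in count():
        # An empty input makes the whole product empty.
        if any(d and not c for d, c in zip(done, caches)):
            return
        # Stop once every input has ended and all index sums were visited.
        if all(done) and total > sum(len(c) - 1 for c in caches):
            return
        for idxs in index_tuples(total, len(gens)):
            if all(available(i, k) for i, k in enumerate(idxs)):
                yield tuple(caches[i][k] for i, k in enumerate(idxs))


# Finite, infinite, finite: take only the first 100 tuples.
first_100 = list(islice(diagonal_product([range(3), count(), range(3)]), 100))
```

The sketch favours clarity over speed: index tuples that point past the end of a finite input are generated and then discarded.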

trainer.lib.gen_utils.sample_randomly(gens: Union[List[Generator], List[Iterator]], probas: List[float], use_softmax=False)¶

Draws a value from one generator in a list, chosen at random according to the given probabilities.

Parameters

gens – A list of generators
probas – List of generator probabilities, must correspond to the list of generators
use_softmax – Apply softmax so that the priorities sum to one

Returns

Randomly drawn value from one of the generators
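The drawing step can be sketched with the standard library alone. The helper names softmax and draw_from_one are hypothetical, and details such as what happens when the chosen generator is exhausted are not covered by the documentation above; here the StopIteration simply propagates.

```python
import math
import random
from typing import Iterator, List, Sequence


def softmax(xs: Sequence[float]) -> List[float]:
    """Rescale arbitrary scores to positive weights that sum to one."""
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def draw_from_one(gens: List[Iterator], probas: Sequence[float],
                  use_softmax: bool = False):
    """Pick one generator according to probas and return its next value."""
    if len(gens) != len(probas):
        raise ValueError("gens and probas must have the same length")
    weights = softmax(probas) if use_softmax else list(probas)
    (chosen,) = random.choices(gens, weights=weights, k=1)
    return next(chosen)


# With all weight on the first generator, its next value is always returned.
value = draw_from_one([iter([1, 2, 3]), iter("abc")], [1.0, 0.0])
```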

trainer.lib.gen_utils.summations(sum_to: int, ls: List[int]) → Generator[Tuple, None, None]¶

trainer.lib.grammar module¶

class trainer.lib.grammar.Grammar(*args, **kwds)¶

Bases: typing.Generic

append_semantics(f: Callable, prio: float)¶

get_rule(nts: NTS) → List[Tuple[List[Union[TS, NTS]], float]]¶

read_program(start_symbol: NTS) → Generator[Union[List[TS], None]]¶

sample_prog_strings(sym: NTS)¶

trainer.lib.grammar.analyse_function_type(f: Callable) → Tuple[List[type], type]¶

trainer.lib.import_utils module¶

Collection of utility functions which can be used to add content from other file formats and sources to the convenient trainer dataset format.

trainer.lib.import_utils.add_image_folder(split: trainer.lib.data_model.Split, folder_path: str, progress=True, sess=<sqlalchemy.orm.session.Session object>) → None¶

Iterates through a folder and adds its contents to a split.

If a file is found, a new subject is created with only that file. If a directory is found, a new subject is created with all files that live within that directory. If a DICOM file is found, the image is appended to the subject with that patient_id.

Supported file formats:

DICOM (no extension or .dcm)
Standard image files
B8 files (.b8)

Parameters

split – The dataset split this data is appended to. The split knows its dataset.
folder_path – Top level folder path
progress – If true, displays a progress bar
sess – database session, defaults to a new session

trainer.lib.import_utils.add_imagestack(s: trainer.lib.data_model.Subject, file_path: str, binary_id='') → None¶

Takes an image path and tries to deduce the type of image from the path ending. No path ending is assumed to be a DICOM file (not a DICOM folder).

trainer.lib.import_utils.add_import_folder(split: trainer.lib.data_model.Split, folder_path: str, semsegtpl: trainer.lib.data_model.SemSegTpl)¶

trainer.lib.import_utils.export_to_folder(split: trainer.lib.data_model.Split, folder_path: str)¶

trainer.lib.import_utils.import_dicom(dicom_path: str)¶

trainer.lib.import_utils.import_subject(split: trainer.lib.data_model.Split, subject_path: str, semsegtpl: trainer.lib.data_model.SemSegTpl)¶

Imports a subject from the format created when exporting datasets.

trainer.lib.logging module¶

class trainer.lib.logging.Experiment(**kwargs)¶

Bases: sqlalchemy.ext.declarative.api.Base

Allows tracking solutions over time. Uses the database for different types of logs.

add_result(result_name: str, flag='success', sess=None, auto_commit=True) → None¶

Add a new result to the current run.

Parameters

result_name – Value of the result.
flag – Flag of the result. For example ‘success’ or ‘fail’. It is case-sensitive.
sess –
auto_commit –

classmethod build_new(experiment_name: str, sess=None)¶

experiment_name¶

static get_all_with_flag(sess: sqlalchemy.orm.session.Session, exp_name: str, flag='success') → List[str]¶

Computes all results with a certain flag from the history, i.e. all runs.

get_results(flag='success') → List[str]¶

id¶

is_in(result_name: str, flag='success') → bool¶

Returns whether the result was already added to the current experiment.

results: List[ExperimentResult]¶

start_date¶

class trainer.lib.logging.ExperimentResult(**kwargs)¶

Bases: sqlalchemy.ext.declarative.api.Base

The semantics of one instance of this class might be: data point #223 was correctly classified.

classmethod build_new(name: str, flag: str)¶

exp_id¶

flag¶

id¶

name¶

class trainer.lib.logging.LogWriter(log_dir: str = './logs', id_hint='log')¶

Bases: object

debug_var(o: Any) → None¶

Allows inspecting an arbitrary Python object on disk.

To visualize an array with a description, pass a (np.ndarray, str) tuple. To visualize multiple arrays with one description each, pass a list of (np.ndarray, str) tuples.

Parameters: o – Any variable

get_absolute_run_folder() → str¶

get_parent_log_folder() → str¶

trainer.lib.misc module¶

trainer.lib.misc.create_identifier(hint: str = '') → str¶

Can be used to create unique names for files by exploiting the uniqueness of the current date. Be aware that if two identifiers are created during the same second they are equal! Follows the form YYYY_MM_DD__hh_mm_ss.

Returns: YYYY_MM_DD__hh_mm_ss_{hint}
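A timestamp-based identifier of this form can be produced with datetime.strftime. The sketch below is standalone: make_identifier and its now parameter are hypothetical, and the behaviour for an empty hint (no trailing underscore) is an assumption rather than something stated above.

```python
from datetime import datetime
from typing import Optional


def make_identifier(hint: str = "", now: Optional[datetime] = None) -> str:
    """Build an identifier of the form YYYY_MM_DD__hh_mm_ss_{hint}."""
    now = now or datetime.now()
    stamp = now.strftime("%Y_%m_%d__%H_%M_%S")
    # Assumption: without a hint, only the timestamp is returned.
    return f"{stamp}_{hint}" if hint else stamp


# A fixed time makes the result reproducible for this example.
identifier = make_identifier("run", now=datetime(2020, 1, 2, 3, 4, 5))
```

Because the resolution is one second, two calls within the same second return the same string, which matches the caveat above.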

trainer.lib.misc.delete_dir(dir_path: str, blocking=True, verbose=True)¶

trainer.lib.misc.download_and_extract(online_url: str, parent_dir='./', dir_name: str = None) → str¶

Can be used to download and extract a .zip dataset file hosted online. Assumes the zip contains a single directory.

Parameters

online_url – The URL that points directly to a .zip file containing folders with the dataset files
parent_dir – The directory that is used to store the temporary zip and the final extracted folder
dir_name – If provided, the function checks whether the directory already exists and does not download it again

Returns

The absolute local path to the directory

trainer.lib.misc.get_img_from_fig(fig: matplotlib.pyplot.figure, dpi=180) → numpy.ndarray¶

Converts a matplotlib figure into a numpy array.

Parameters

fig – A matplotlib figure
dpi – image quality, higher is better and takes longer

Returns

np.ndarray with the image content

trainer.lib.misc.load_b8(file_path: str) → numpy.ndarray¶

Loads b8 file used by some ultrasound machines.

Parameters: file_path – The direct path to the b8 file
Returns: Numpy array with the image data

trainer.lib.misc.load_grayscale_from_disk(path: str) → numpy.ndarray¶

trainer.lib.misc.make_converter_dict_for_enum(e)¶

trainer.lib.misc.pick_from_list(ls: List[T], title='Title', rows=- 1, columns=- 1) → T¶

Spawns a small GUI which allows the user to select from a list.

Be aware that this works only if the $DISPLAY environment variable is set.

trainer.lib.misc.slugify(value)¶

Normalizes string, converts to lowercase, removes non-alpha characters, and converts spaces to hyphens.
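The normalization steps named above can be sketched with the standard library. This follows a common slugify recipe and is not necessarily the exact code in trainer.lib.misc; the function name slugify_sketch is hypothetical.

```python
import re
import unicodedata


def slugify_sketch(value: str) -> str:
    """Normalize, lowercase, drop special characters, hyphenate spaces."""
    # Decompose accented characters and drop everything outside ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove characters that are not word characters, whitespace or hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


slug = slugify_sketch("Hello World!")
```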

trainer.lib.misc.standalone_foldergrab(folder_not_file: bool = False, optional_inputs: List[Tuple[str, str]] = None, optional_choices: List[Tuple[str, str, List[str]]] = None, title='Select a folder') → Tuple[str, Dict[str, str]]¶

Uses a simple blocking GUI for prompting the user for a file or folder path. Optionally prompts for additional text inputs as well.

Parameters

folder_not_file – if True asks for a folder, if False asks for a file
optional_inputs – A list of (description, key) pairs
optional_choices – A list of (description, key, list of options) pairs
title –

Returns

Tuple with path as first entry and a dictionary with the optional text inputs as second entry

trainer.lib.syn module¶

Given the rules of a grammar, the syn package searches for syntactically correct solutions for example input/output pairs.

class trainer.lib.syn.NonTerminalSym(rule: trainer.lib.syn.SubstitutionRule)¶

Bases: object

class trainer.lib.syn.State(pair: Pair)¶

Bases: object

is_final() → bool¶

Can be used to determine for one

Returns: True if the state is a valid solution to the output

visualize(ax1, ax2)¶

class trainer.lib.syn.SubstitutionRule(left_side, right_side)¶

Bases: object

class trainer.lib.syn.TerminalSym¶

Bases: object

apply_on_state(s: trainer.lib.syn.State)¶

semantics()¶

Module contents¶

trainer.lib.reset_complete_database()¶
