trainer.cg package

Submodules

trainer.cg.Dsl module

class trainer.cg.Dsl.Context

Bases: abc.ABC

Holds the context-dependent semantics and is meant to be subclassed for a specific problem. For example, for image classification it might define a function context.get_img() returning the input image.

set_state(state: Dict)
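
A minimal sketch of such a subclass, following the image-classification example above. ImgContext, get_img() and the 'img' state key are illustrative names, not part of the package; only Context and set_state are documented here.

from typing import Dict

import numpy as np

from trainer.cg.Dsl import Context


class ImgContext(Context):
    """Hypothetical context for an image task (names are illustrative)."""

    def __init__(self):
        self._state: Dict = {}

    def set_state(self, state: Dict):
        # Keep a reference to the current state; depending on the base
        # implementation, super().set_state(state) may also be required.
        self._state = state

    def get_img(self) -> np.ndarray:
        # Return the input image of the current state ('img' key is assumed).
        return self._state["img"]
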
class trainer.cg.Dsl.Dsl(c: trainer.cg.Dsl.Context, sample_dict: Dict, samplers: List[trainer.cg.samplers.Sampler], sampler_weight=2.0)

Bases: object

Wrapper around grammar for generating computational graphs

add_function(f: Callable, prio: float) → None

Adds an arbitrary Python function to your DSL (a usage sketch follows this class entry).

Parameters
  • f – A well-typed callable

  • prio – The corresponding priority

sample_n_words(r_type: Any, max_n=10)
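
A hedged sketch of how the pieces above might fit together. ctx is the hypothetical ImgContext from the Context sketch, invert is an illustrative candidate function, and passing numpy.ndarray as r_type is an assumption about how return types are matched.

import numpy as np

from trainer.cg.Dsl import Dsl

ctx = ImgContext()  # a Context subclass, e.g. the sketch above
dsl = Dsl(c=ctx, sample_dict={}, samplers=[], sampler_weight=2.0)


def invert(img: np.ndarray) -> np.ndarray:
    """A well-typed candidate function for the DSL (illustrative)."""
    return img.max() - img


# Register the function together with its sampling priority.
dsl.add_function(invert, prio=1.0)

# Sample up to 10 programs that evaluate to the requested return type.
words = dsl.sample_n_words(r_type=np.ndarray, max_n=10)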

trainer.cg.DslFunc module

class trainer.cg.DslFunc.CNode(semantics: Union[Callable, enum.Enum], node_type: trainer.cg.DslFunc.CNodeType, name='')

Bases: object

execute(store_last_result=False)

Executes the program using strict evaluation.

classmethod from_json(d: Dict, sem: Dict) → trainer.cg.DslFunc.CNode
get_dot(dot_id: int, p_id=-1) → Tuple[str, int]
class trainer.cg.DslFunc.CNodeType(value)

Bases: enum.Enum

An enumeration.

EnumNode = 2
FuncNode = 0
ParamNode = 1
class trainer.cg.DslFunc.DslFunc(root: trainer.cg.DslFunc.CNode)

Bases: object

execute(store_result=False) → Any
visualize(f_name='', dir_path='', delete_dot_after=True, instance_id=None) → str
trainer.cg.DslFunc.val_to_label(v: Any, max_length=100, vis_depth=4) → str

Attempts to visualize an arbitrary value as a string displayable in the Graphviz dot language

Parameters
  • v – The value to be visualized

  • max_length – Maximum length of the resulting string

  • vis_depth – Maximum number of lines the string may occupy

Returns

A string representation of v suitable for a Graphviz dot label

trainer.cg.DtDataset module

A module which contains glue code.

Loads two program pools from disk: one for computing features, one for exploring actions. The pure output of the sampled programs is then computed for a given game.

Actions are scored by the number of cells that can be explained by that specific action.

class trainer.cg.DtDataset.DtDataset(pp_f: trainer.cg.ProgPool.ProgPool, pp_a: trainer.cg.ProgPool.ProgPool)

Bases: object

static build_dt_set(raw_data: Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]
get_data(game: trainer.demo_data.arc.Game) → Tuple[Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray], Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]]
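
A hedged sketch of the glue described above, assuming the two pools come from trainer.cg.sym_lang.get_pps and that game is a trainer.demo_data.arc.Game loaded elsewhere; interpreting the two returned tuples as train and test splits is an assumption.

from trainer.cg.DtDataset import DtDataset
from trainer.cg.sym_lang import get_pps

# One pool for computing features, one for exploring actions.
pp_f, pp_a = get_pps(max_f=200, max_a=100)
dataset = DtDataset(pp_f, pp_a)

# game: a trainer.demo_data.arc.Game instance obtained elsewhere.
train_raw, test_raw = dataset.get_data(game)

# Turn the raw per-game arrays into a tabular decision-tree dataset.
dt_set = DtDataset.build_dt_set(train_raw)
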
trainer.cg.DtDataset.vis_dec_tree(dec_tree: sklearn.tree._classes.DecisionTreeClassifier, name: str, prog_names: numpy.ndarray, class_names: List[str], dir_path='')

trainer.cg.ProgPool module

class trainer.cg.ProgPool.ProgInstance(architecture_id: int)

Bases: object

duplicate() → trainer.cg.ProgPool.ProgInstance
classmethod from_json(state: Dict, s_dict: Dict[str, trainer.cg.samplers.Sampler])
get_architecture_id() → int
get_instance_id() → int
get_json()
get_node_dict(n_id: int)
get_value(n_id: int)
increment_used()
is_locked()
lock()
pick_node_id() → Optional[int]
resample_node(n_id: int) → Any
revert_resampling(n_id)
set_instance_id(new_id: int) → None
set_value(n_id, new_val: Any, s: trainer.cg.samplers.Sampler) → Any
unlock()
class trainer.cg.ProgPool.ProgPool(r_type: Any, fs: List[Tuple[Callable, float]], context: trainer.cg.Dsl.Context, samplers: List[trainer.cg.samplers.Sampler], max_nodes=25, append_standard_samplers=True)

Bases: object

append_instance(new_instance: trainer.cg.ProgPool.ProgInstance)
compute_features(states: List[Dict], store=True, visualize_after=False) → Tuple[numpy.ndarray, numpy.ndarray]

Given a list of states, computes the output of every program instance for each state. Output shape: (#ProgramInstances, #states, Width, Height)

duplicate_instance(instance_id: int)
execute(instance_id: int, state: Optional[Dict] = None, store=True, visualize_after=False)

Executes a computational graph, specified by its instance_id in the current pool, on a given state.

Parameters
  • instance_id – Id of the graph-instance in the current pool.

  • state – The state that the graph is computed upon. It will be applied to the context object.

  • store – If true, stores intermediate values at each node. Slow, but improves visualization and debugging

  • visualize_after – If true, writes the graph to disk after executing it

Returns

Result of the graph

classmethod from_disk(p: str, r_type: Any, fs: List[Tuple[Callable, float]], c: trainer.cg.Dsl.Context, samplers: List[trainer.cg.samplers.Sampler])
get_locks()
initialize_instances()
optim_move(temperature: float, dim_factor=3.5)

Resamples one node of every program that is not locked among the current instantiations. Performs birth and death moves randomly.

remove_instance(instance_id: int, p_locked=0.9) → Optional[trainer.cg.ProgPool.ProgInstance]

Removes an instance from the pool (Death Move).

remove_unused(remove_all_zero=True)
revert_last_step()
sample_instance() → trainer.cg.ProgPool.ProgInstance
sample_words(max_n=10) → None
set_init_num(n: int)
set_samplers(samplers: List[trainer.cg.samplers.Sampler])
to_disk(identifier: str, parent_dir='') → str
visualize_instance(instance_id: int, parent_dir='', f_name='') → None
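
A hedged sketch of an optimisation loop over a feature pool. It is a greedy simplification of the annealing-style search for brevity; states is a problem-specific list of state dicts, score is a stand-in objective, whether get_pps already initialises instances is an assumption, and the meaning of the second array returned by compute_features is not documented here.

from trainer.cg.sym_lang import get_pps

pp_f, _ = get_pps(max_f=50, max_a=10)
pp_f.initialize_instances()


def score(feats):
    # Stand-in objective for illustration only.
    return float(feats.sum())


temperature = 1.0
feats, _ = pp_f.compute_features(states, store=False)  # (#instances, #states, W, H)
best = score(feats)

for step in range(100):
    # Resample one unlocked node per program; random birth/death moves.
    pp_f.optim_move(temperature=temperature, dim_factor=3.5)
    feats, _ = pp_f.compute_features(states, store=False)
    if score(feats) < best:
        pp_f.revert_last_step()   # reject the proposal
    else:
        best = score(feats)       # accept the proposal
    temperature *= 0.99           # cool down

pp_f.to_disk("features", parent_dir="runs")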

trainer.cg.dsl_utils module

Utility functions that support rapid development of domain-specific computational graphs.

trainer.cg.dsl_utils.general_transform(grid: numpy.ndarray, labels: numpy.ndarray, f: Callable[[numpy.ndarray], numpy.ndarray])

Applies a transformation f on every labelled region in grid independently
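
A hedged sketch of general_transform; whether f receives each region as a cropped array or as the full grid masked by the label is an implementation detail not stated here.

import numpy as np
from scipy import ndimage

from trainer.cg.dsl_utils import general_transform

grid = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 2]])

# Connected-component labels of the non-zero cells.
labels, _ = ndimage.label(grid > 0)

# Flip every labelled region upside down, each region handled independently.
flipped = general_transform(grid, labels, np.flipud)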

trainer.cg.dsl_utils.get_lbl_lu(obj: numpy.ndarray) → Tuple[int, int]
trainer.cg.dsl_utils.objects_from_labels(labels: numpy.ndarray) → List[numpy.ndarray]
trainer.cg.dsl_utils.select_objects(objts: Tuple[numpy.ndarray, numpy.ndarray], lbl_to_float: Callable, count_index: int, reverted: bool)
trainer.cg.dsl_utils.zoom_in(arr: numpy.ndarray, factor: int)

trainer.cg.dt_seq module

class trainer.cg.dt_seq.ArcTransformation(f_inst: numpy.ndarray, a_inst: numpy.ndarray, steps: Optional[List] = None)

Bases: object

Stores one transformation step as a tuple (action_index, decision_tree), and a full transformation as a list of such tuples.

append_step(t: Tuple[int, sklearn.tree._classes.DecisionTreeClassifier]) → trainer.cg.dt_seq.ArcTransformation
static enumerate_consistent_moves(x: numpy.ndarray, y: numpy.ndarray, f_inst: numpy.ndarray, a_inst: numpy.ndarray, reduce_duplicated_actions=False) → List[Tuple[int, sklearn.tree._classes.DecisionTreeClassifier, numpy.ndarray, numpy.ndarray]]

Enumerates all moves that might make sense according to the heuristic of pick_action

Parameters
  • reduce_duplicated_actions

  • a_inst

  • f_inst

  • x – Array of shape (rows, number_of_features)

  • y – Array of shape (rows, actions)

Returns

List of (action_index, decision_tree, remaining_x_rows, remaining_y_rows)

static find_consistent_solutions(x: numpy.ndarray, y: numpy.ndarray, f_s: numpy.ndarray, a_s: numpy.ndarray, max_steps=10) → Optional[List[trainer.cg.dt_seq.ArcTransformation]]

Gets called with the full dataset for one subject. Returns a list of consistent ArcTransformations for this task. Use max_steps for early stopping, but keep in mind that the first steps are the expensive ones.
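
A hedged sketch, assuming x, y, f_inst, a_inst and the test-side arrays come from the DtDataset pipeline above; the exact layout of those arrays is an assumption and they are only placeholders here.

from trainer.cg.dt_seq import ArcTransformation

# x: (rows, #features), y: (rows, #actions); f_inst / a_inst describe the
# feature and action program instances behind the columns.
solutions = ArcTransformation.find_consistent_solutions(
    x, y, f_s=f_inst, a_s=a_inst, max_steps=5
)

if solutions is None:
    print("no consistent transformation found")
else:
    for t in solutions:
        ok, feedback = t.test_generalization(x_test, y_test)
        if ok:
            predictions = t.predict(x_test, values)  # one image per test example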

get_node_count()
get_used_cgs() → List[Tuple[int, Set[int]]]
static increment_pool(pool: List[Tuple[trainer.cg.dt_seq.ArcTransformation, numpy.ndarray, numpy.ndarray]], f_s, a_s) → List[Tuple[trainer.cg.dt_seq.ArcTransformation, numpy.ndarray, numpy.ndarray]]
predict(x_test: numpy.ndarray, values: numpy.ndarray) → List[numpy.ndarray]

Given a tabular decision tree dataset, returns the predictions made by this dt_seq as images.

If there are multiple test examples, multiple images will be in the results list.

test_generalization(x_test: numpy.ndarray, y_test: numpy.ndarray) → Tuple[bool, List[Tuple[int, Set[int]]]]

Judges the generalization performance given a test set in dt-format.

Provides a feedback object of the following form: [(action_instance_name, {f_inst})]

visualize(parent_path='', f_vis: Optional[Callable] = None, a_vis: Optional[Callable] = None, folder_appendix='', name='unknown')

Visualizes the whole sequence including the features and actions that are used by this dt sequence.

Parameters
  • parent_path

  • f_vis – A callable that takes a feature id and a disk location and visualizes that feature

  • a_vis – A callable that takes an action id and a disk location and visualizes that action

  • folder_appendix

  • name

Returns

class trainer.cg.dt_seq.DtSeq

Bases: object

Encapsulates a DtSeq model; its public methods are inspired by sklearn.

fit(train_x: numpy.ndarray, train_y: numpy.ndarray, train_vals: numpy.ndarray) → None
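
A minimal sketch mirroring the sklearn-style interface; the three training arrays are assumed to come from the DtDataset pipeline and are placeholders here.

from trainer.cg.dt_seq import DtSeq

model = DtSeq()
# train_x: feature table, train_y: action table, train_vals: per-row values (assumed layout).
model.fit(train_x, train_y, train_vals)
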
trainer.cg.dt_seq.get_leave_impurity(dt: sklearn.tree._classes.DecisionTreeClassifier) → float
trainer.cg.dt_seq.get_leave_indices(dt: sklearn.tree._classes.DecisionTreeClassifier) → numpy.ndarray
trainer.cg.dt_seq.order_actions(x: numpy.ndarray, ys: numpy.ndarray, initial_ignore=None) → Generator[Tuple[int, sklearn.tree._classes.DecisionTreeClassifier], None, None]

An action is assumed to be easily explainable if the corresponding decision tree has a low number of nodes.

Because all decision trees are computed anyway for ordering the actions, the corresponding decision tree for an action is returned along with its index.

Parameters
  • x – Features (N x #F)

  • ys – All actions (N x #A)

  • initial_ignore – Optional boolean array indicating which actions should be considered (#A)

Returns

A generator of (action_index, decision_tree) pairs, ordered from the easiest to the hardest action to explain

trainer.cg.dt_seq.pred_equal(preds1: List[numpy.ndarray], preds2: List[numpy.ndarray]) → bool

Simple utility for comparing two lists of numpy arrays using exact equality

>>> pred_equal([np.ones(3)], [np.zeros(3)])
False
>>> pred_equal([np.zeros(3), np.ones(4)], [np.zeros(3), np.ones(4)])
True

trainer.cg.grid_dsl module

class trainer.cg.grid_dsl.B(value)

Bases: enum.Enum

An enumeration.

F = 1
T = 0
class trainer.cg.grid_dsl.OneShotTransform(value)

Bases: enum.Enum

An enumeration.

FlipLR = 10
FlipUD = 11
Identity = 0
Rot180 = 2
Rot270 = 3
Rot90 = 1
class trainer.cg.grid_dsl.Orientation(value)

Bases: enum.Enum

An enumeration.

diagonal_dota = 2
diagonal_other = 3
horizontal = 0
vertical = 1
class trainer.cg.grid_dsl.RFilters(value)

Bases: enum.Enum

An enumeration.

Majority = 1
Modal = 0
class trainer.cg.grid_dsl.RegionProp(value)

Bases: enum.Enum

An enumeration.

area = 0
bbox_area = 1
centroid_column = 10
centroid_row = 9
convex_area = 2
eccentricity = 3
equivalent_diameter = 4
euler_number = 5
extent = 6
filled_area = 7
major_axis_length = 11
minor_axis_length = 12
orientation = 13
perimeter = 14
solidity = 15
class trainer.cg.grid_dsl.Structs(value)

Bases: enum.Enum

An enumeration.

Full = 2
Orthogonal = 1
trainer.cg.grid_dsl.apply_boolf(im: NewType.<locals>.new_type, filt: NewType.<locals>.new_type) → NewType.<locals>.new_type

Filters 3x3 patches using the given filter

trainer.cg.grid_dsl.arr_from_line(line: NewType.<locals>.new_type, orientation: trainer.cg.grid_dsl.Orientation) → NewType.<locals>.new_type
trainer.cg.grid_dsl.bool_to_real(b_arr: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.compute_region_prop(prop, rp: trainer.cg.grid_dsl.RegionProp) → float
trainer.cg.grid_dsl.coord(grid: NewType.<locals>.new_type, modulo: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.different_cells_in_line(grid: NewType.<locals>.new_type, orientation: trainer.cg.grid_dsl.Orientation) → NewType.<locals>.new_type
trainer.cg.grid_dsl.direction_step(step_size: NewType.<locals>.new_type, direction: trainer.cg.grid_dsl.Orientation) → NewType.<locals>.new_type
trainer.cg.grid_dsl.filter_3x3(middle: trainer.cg.grid_dsl.B, ortho: trainer.cg.grid_dsl.B, diag: trainer.cg.grid_dsl.B) → NewType.<locals>.new_type
trainer.cg.grid_dsl.get_obj_lu(obj: NewType.<locals>.new_type) → NewType.<locals>.new_type
Parameters

obj – Zero-padded object

Returns

(x, y) coordinate of the upper-left corner of the zero-padded object

trainer.cg.grid_dsl.hist(grid: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.ident_neigh(arr: NewType.<locals>.new_type) → NewType.<locals>.new_type

Outputs the number of identical values in the input for each neighbourhood

trainer.cg.grid_dsl.int_to_real(int_arr: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.is_value(arr: NewType.<locals>.new_type, v: trainer.demo_data.arc.Value) → NewType.<locals>.new_type
trainer.cg.grid_dsl.lbl_by_bg(arr: NewType.<locals>.new_type, structure: trainer.cg.grid_dsl.Structs, background: trainer.demo_data.arc.Value) → NewType.<locals>.new_type
trainer.cg.grid_dsl.lbl_by_bool(grid: NewType.<locals>.new_type, arr: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.lbl_connected(arr: NewType.<locals>.new_type, structure: trainer.cg.grid_dsl.Structs, background: trainer.demo_data.arc.Value) → NewType.<locals>.new_type
trainer.cg.grid_dsl.make_offset(x: NewType.<locals>.new_type, y: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.measure_grid(grid: NewType.<locals>.new_type, orientation: trainer.cg.grid_dsl.Orientation) → NewType.<locals>.new_type
trainer.cg.grid_dsl.measure_grid_b(grid: NewType.<locals>.new_type, orientation: trainer.cg.grid_dsl.Orientation) → NewType.<locals>.new_type
trainer.cg.grid_dsl.move_to(obj: NewType.<locals>.new_type, location: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.negated_arr(arr: NewType.<locals>.new_type, negated: trainer.cg.grid_dsl.B) → NewType.<locals>.new_type
trainer.cg.grid_dsl.non_zero_num(x: NewType.<locals>.new_type, b: trainer.cg.grid_dsl.B) → NewType.<locals>.new_type
trainer.cg.grid_dsl.obj_to_boolgrid(obj: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.obj_to_valgrid(obj: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.object_by_ordering(objts: NewType.<locals>.new_type, prop: trainer.cg.grid_dsl.RegionProp, count_index: NewType.<locals>.new_type, reverted: trainer.cg.grid_dsl.B) → NewType.<locals>.new_type
trainer.cg.grid_dsl.object_by_spatial(objts: NewType.<locals>.new_type, count_index: NewType.<locals>.new_type, reverted: trainer.cg.grid_dsl.B) → NewType.<locals>.new_type
trainer.cg.grid_dsl.origin() → NewType.<locals>.new_type
trainer.cg.grid_dsl.pick_from_values(values: List[trainer.demo_data.arc.Value], count_index: NewType.<locals>.new_type, reverted: trainer.cg.grid_dsl.B) → trainer.demo_data.arc.Value
trainer.cg.grid_dsl.reg_quantity(labels: NewType.<locals>.new_type, rp: trainer.cg.grid_dsl.RegionProp) → NewType.<locals>.new_type
trainer.cg.grid_dsl.rfilt(arr: NewType.<locals>.new_type, rfilter: trainer.cg.grid_dsl.RFilters, s: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.shift_val_arr(grid: NewType.<locals>.new_type, offset: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.sorted_values(grid: NewType.<locals>.new_type) → List[trainer.demo_data.arc.Value]
trainer.cg.grid_dsl.tiled(arr: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.transform(labelling: NewType.<locals>.new_type, t_type: trainer.cg.grid_dsl.OneShotTransform) → NewType.<locals>.new_type

Every coherent region is transformed independently.

If objects are classified by their color, the input can be transformed directly. Otherwise, the foreground first needs to be separated from the background by a labelling strategy (see the sketch below).

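A hedged sketch of the two-step use described above; treating Value(0) as the background colour and passing an ARC value grid as grid are assumptions, and grid itself is a placeholder here.

from trainer.cg.grid_dsl import OneShotTransform, Structs, lbl_connected, transform
from trainer.demo_data.arc import Value

# Step 1: separate foreground from background via a labelling strategy.
labelling = lbl_connected(grid, Structs.Orthogonal, Value(0))

# Step 2: flip every coherent region independently.
result = transform(labelling, OneShotTransform.FlipLR)
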
trainer.cg.grid_dsl.value_to_arr(v: trainer.demo_data.arc.Value) → NewType.<locals>.new_type
trainer.cg.grid_dsl.zoom_boolgrid(grid: NewType.<locals>.new_type, factor: NewType.<locals>.new_type) → NewType.<locals>.new_type
trainer.cg.grid_dsl.zoom_valgrid(grid: NewType.<locals>.new_type, factor: NewType.<locals>.new_type) → NewType.<locals>.new_type

trainer.cg.samplers module

class trainer.cg.samplers.EnumSampler(*args, **kwds)

Bases: trainer.cg.samplers.Sampler

from_json_repr(vals: List[str]) → List[V]

To load the state of a sampler from disk, this method needs to be implemented.

get_json_repr(vals: List[V]) → List[str]

Given a list of actual values that this node can hold, computes their string representation.

The function can be used to store the sampled values on disk.

resample(last_value: Optional[V] = None) → V

Randomly samples a new value for this node.

Parameters

last_value – The current value of the node. None if the node does not yet have a value.

Returns

The new value that will be assigned to this node

class trainer.cg.samplers.FloatSampler(*args, **kwds)

Bases: trainer.cg.samplers.Sampler

from_json_repr(vals: List[str]) → List[V]

To load the state of a sampler from disk, this method needs to be implemented.

resample(last_value: Optional[V] = None) → V

Randomly samples a new value for this node.

Parameters

last_value – The current value of the node. None if the node does not yet have a value.

Returns

The new value that will be assigned to this node

class trainer.cg.samplers.NumberSampler(*args, **kwds)

Bases: trainer.cg.samplers.Sampler

from_json_repr(vals: List[str]) → List[V]

To load the state of a sampler from disk, this method needs to be implemented.

resample(last_value: Optional[V] = None) → V

Randomly samples a new value for this node.

Parameters

last_value – The current value of the node. None if the node does not yet have a value.

Returns

The new value that will be assigned to this node

class trainer.cg.samplers.Sampler(*args, **kwds)

Bases: abc.ABC, typing.Generic

Template for implementing nodes that can be sampled using MCMC

abstract from_json_repr(vals: List[str]) → List[V]

To load the state of a sampler from disk, this method needs to be implemented.

get_json_repr(vals: List[V]) → List[str]

Given a list of actual values that this node can hold, computes their string representation.

The function can be used to store the sampled values on disk.

abstract resample(last_value: Optional[V] = None) → V

Randomly samples a new value for this node.

Parameters

last_value – The current value of the node. None if the node does not yet have a value.

Returns

The new value that will be assigned to this node

sample(node_id: int) → V
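
A hedged sketch of a custom sampler. It assumes Sampler can be subclassed without extra constructor arguments and that the default get_json_repr stores each value via str(); SignSampler is purely illustrative.

import random
from typing import List, Optional

from trainer.cg.samplers import Sampler


class SignSampler(Sampler):
    """Illustrative sampler that assigns either -1.0 or 1.0 to a node."""

    def resample(self, last_value: Optional[float] = None) -> float:
        # Ignore the current value and draw a fresh sign.
        return random.choice([-1.0, 1.0])

    def from_json_repr(self, vals: List[str]) -> List[float]:
        # Inverse of the string representation used when saving to disk.
        return [float(v) for v in vals]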

trainer.cg.sym_lang module

class trainer.cg.sym_lang.ArcContext

Bases: trainer.cg.Dsl.Context

get_colours() → List[trainer.demo_data.arc.Value]
sit() → NewType.<locals>.new_type
trainer.cg.sym_lang.arc_specifics()
trainer.cg.sym_lang.get_pps(max_f=200, max_a=100) → Tuple[trainer.cg.ProgPool.ProgPool, trainer.cg.ProgPool.ProgPool]
trainer.cg.sym_lang.load_pps_from_disk(f_csv_path: str, a_csv_path: str) → Tuple[trainer.cg.ProgPool.ProgPool, trainer.cg.ProgPool.ProgPool]
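
A hedged sketch of persisting and reloading the ARC pools; that ProgPool.to_disk returns exactly the csv paths expected by load_pps_from_disk is an assumption.

from trainer.cg.sym_lang import get_pps, load_pps_from_disk

# Build the feature and action pools for ARC.
pp_f, pp_a = get_pps(max_f=200, max_a=100)

# Persist both pools and reload them later.
f_path = pp_f.to_disk("features", parent_dir="runs")
a_path = pp_a.to_disk("actions", parent_dir="runs")
pp_f2, pp_a2 = load_pps_from_disk(f_path, a_path)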

Module contents

Computational graphs package. It contributes:

  • A greedy, decision-tree-based image transformation module

  • A module for defining computational graphs from general Python functions, trained using simulated annealing