SyntheticBuilder Class
- class ds_discovery.components.synthetic_builder.SyntheticBuilder(property_manager: Any, intent_model: Any, default_save: bool | None = None, reset_templates: bool | None = None, template_path: str | None = None, template_module: str | None = None, template_source_handler: str | None = None, template_persist_handler: str | None = None, align_connectors: bool | None = None)
- add_column_description(column_name: str, description: str, save: bool | None = None)
adds a description note that is included in the ‘report_column_catalog’ report
- add_connector_contract(connector_name: str, connector_contract: ConnectorContract, template_aligned: bool | None = None, save: bool | None = None)
Sets a named connector contract
- Parameters:
connector_name – the name or label to identify and reference the connector
connector_contract – a Connector Contract for the properties persistence
template_aligned – (optional) whether the connector is aligned with the template, so that changes to the template also apply to this connector
save – override of the default save action set at initialisation.
- Returns:
if load is True, returns a Pandas.DataFrame else None
- add_connector_from_template(connector_name: str, uri_file: str, template_name: str, save: bool | None = None, **kwargs)
Adds a connector using settings from a template connector. By default, self.TEMPLATE_SOURCE and self.TEMPLATE_PERSIST are added at initialisation.
- Parameters:
connector_name – the name or label to identify and reference the connector
uri_file – the name of the file to append to the end of the default path
template_name – the name of the template connector
save – override of the default save action set at initialisation.
kwargs – any kwargs to add to the default connector
- Returns:
- add_connector_persist(connector_name: str, uri_file: str, save: bool | None = None, **kwargs)
Adds a connector using settings from the self.TEMPLATE_PERSIST template connector. self.TEMPLATE_PERSIST is added at initialisation.
- Parameters:
connector_name – the name or label to identify and reference the connector
uri_file – the name of the file to append to the end of the default path
save – override of the default save action set at initialisation.
kwargs – any kwargs to add to the default connector
- Returns:
- add_connector_source(connector_name: str, uri_file: str, save: bool | None = None, **kwargs)
Adds a connector using settings from the self.TEMPLATE_SOURCE template connector.
- Parameters:
connector_name – the name or label to identify and reference the connector
uri_file – the name of the file to append to the end of the default path
save – override of the default save action set at initialisation.
kwargs – any kwargs to add to the default connector
- Returns:
- add_connector_uri(connector_name: str, uri: str, save: bool | None = None, template_aligned: bool | None = None, **kwargs)
Sets the contract giving the full uri path. This is a shortcut of set_source_contract(…), not requiring a ConnectorContract to be set up and using the default module and handler values.
- Parameters:
connector_name – the name or label to identify and reference the connector
uri – a fully qualified uri of the source data
template_aligned – (optional) whether the connector is aligned with the template, so that changes to the template also apply to this connector
save – (optional) if True, save to file. Default is True
- add_intent_level_description(level: [int, str], text: str, save: bool | None = None)
adds a description of an intent level to the augmented knowledge ‘intent’
- Parameters:
level – the intent level to add the comment to
text – the description text
save – (optional) override of the default save action set at initialisation.
- add_notes(catalog: str, label: [str, list], text: str, constraints: list | None = None, save=None)
adds a note to the augmented knowledge.
If no label is given, a journal date of ‘year-month’ is used. If no catalog is given, the default catalogue name is used.
- Parameters:
catalog – a catalog name
label – a sub key label or list of labels to separate different information strands
text – the text to add
constraints – (optional) a list of allowed label values, if None then any value allowed
save – if True, save to file. Default is True
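The defaulting rules described for add_notes can be sketched with plain dictionaries. This is a hypothetical stand-in, not the library's implementation: the helper `add_note`, the nested `catalog -> label -> [text]` layout, the default catalog name `'general'` and the choice to raise on a disallowed label are all assumptions made for illustration.

```python
from datetime import date


def add_note(notes, text, catalog=None, label=None, constraints=None, today=None):
    """Hypothetical sketch of the label/catalog defaulting described above."""
    today = today or date.today()
    catalog = catalog or "general"            # assumed default catalog name
    label = label or today.strftime("%Y-%m")  # journal date of 'year-month'
    labels = label if isinstance(label, list) else [label]
    for item in labels:
        # assumed behaviour: a label outside the allowed values is rejected
        if constraints is not None and item not in constraints:
            raise ValueError(f"label '{item}' is not one of {constraints}")
        notes.setdefault(catalog, {}).setdefault(item, []).append(text)
    return notes


# No label and no catalog: both fall back to their defaults.
notes = add_note({}, "first pass at the schema", today=date(2024, 3, 5))
```

Passing `today` explicitly keeps the example deterministic; in normal use it would be left as None.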
- add_run_book(run_levels: [str, list], book_name: str | None = None, save: bool | None = None)
sets a named run book. The run levels are a list of levels in the order in which they are run
- Parameters:
run_levels – the name or list of levels to be run
book_name – (optional) the name of the run_book. defaults to ‘primary_run_book’
save – (optional) override of the default save action set at initialisation.
- add_run_book_level(run_level: str, book_name: str | None = None, save: bool | None = None)
adds a single run level to the end of a run_book. If the name already exists it will be replaced
- Parameters:
run_level – the run_level to add.
book_name – (optional) the name of the run_book. defaults to ‘primary_run_book’
save – (optional) override of the default save action set at initialisation.
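A run book is described as an ordered list of run levels, and add_run_book_level appends one level, replacing it if the name already exists. The sketch below shows one possible reading of that rule, in which an existing entry is removed and the level is re-added at the end. The helper `append_run_level` and the level names are hypothetical, and the real method may instead replace the entry in place.

```python
def append_run_level(run_book, run_level):
    """Hypothetical sketch: append run_level, dropping any earlier entry of the same name."""
    updated = [level for level in run_book if level != run_level]
    updated.append(run_level)
    return updated


book = append_run_level(["base", "enrich"], "mask")  # new level goes to the end
book = append_run_level(book, "base")                # existing level is moved to the end
```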
- backup_canonical(connector_name: str, canonical: Any, uri: str, **kwargs)
persists the canonical to the referenced connector as a backup using the URI to replace the current Connector Contract URI.
- Parameters:
connector_name – the name or label to identify and reference the connector
canonical – the canonical data to persist
uri – an alternative uri to the one in the ConnectorContract
kwargs – arguments to be passed to the handler on persist
- static canonical_report(canonical, stylise: bool = True, inc_next_dom: bool = False, report_header: str | None = None, condition: str | None = None)
The Canonical Report is a data dictionary of the canonical providing a reference view of the dataset’s attribute properties
- Parameters:
canonical – the DataFrame to view
stylise – if True present the report stylised.
inc_next_dom – (optional) whether to include the next dominant element column
report_header – (optional) filter on a header where the condition is true. Condition must exist
condition – (optional) the condition to apply to the header. Header must exist. examples: ‘ > 0.95’, “.str.contains(‘shed’)”
- Returns:
- create_snapshot(suffix: str | None = None, version: str | None = None, save: bool | None = None)
creates a snapshot of the contract's configuration. The name format will be <contract_name>_#<suffix>.
- Parameters:
suffix – (optional) adds the suffix to the end of the contract name. if None then date & time used
version – (optional) changes the version number of the current contract
save – override of the default save action set at initialisation.
- Returns:
a list of current contract snapshots
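The snapshot naming rule `<contract_name>_#<suffix>` can be illustrated as follows. This is only a sketch: the helper `snapshot_name` is hypothetical, and the timestamp layout used when no suffix is given is an assumption, since the documentation only states that the date and time are used.

```python
from datetime import datetime


def snapshot_name(contract_name, suffix=None, now=None):
    """Hypothetical sketch of the <contract_name>_#<suffix> naming rule."""
    if suffix is None:
        now = now or datetime.now()
        suffix = now.strftime("%Y%m%d_%H%M%S")  # assumed date & time layout
    return f"{contract_name}_#{suffix}"


named = snapshot_name("members", suffix="before_refactor")
dated = snapshot_name("members", now=datetime(2024, 1, 2, 3, 4, 5))
```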
- delete_snapshot(snapshot_name: str, save: bool | None = None)
deletes a snapshot
- Parameters:
snapshot_name – the name of the snapshot
save – override of the default save action set at initialisation.
- Returns:
True if successful, False if not found or not deleted
- property discover: DataDiscovery
The components instance
- classmethod discovery_pad() DataDiscovery
A class method to use the Components discovery methods as a scratch pad
- classmethod from_env(task_name: str, default_save=None, reset_templates: bool | None = None, align_connectors: bool | None = None, default_save_intent: bool | None = None, default_intent_level: bool | None = None, order_next_available: bool | None = None, default_replace_intent: bool | None = None, uri_pm_repo: str | None = None, has_contract: bool | None = None, **kwargs)
Class Factory Method that builds the connector handlers, taking the property contract path from os.environ['HADRON_PM_PATH'] or, if not found, using the system default:
for Linux and IOS, ‘/tmp/components/contracts’
for Windows, os.environ['AppData']\components\contracts
- The following environment variables can be set:
HADRON_PM_PATH: the property contract path, if not found, uses the system default
HADRON_PM_REPO: the property contract should be initially loaded from a read only repo site such as github
HADRON_PM_TYPE: a file type for the property manager. If not found sets as json
HADRON_PM_MODULE: a default module package, if not set uses component default
HADRON_PM_HANDLER: a default handler. if not set uses component default
This method calls the Factory Method from_uri(...), returning the initialised class instance.
- Parameters:
task_name – The reference name that uniquely identifies a task or subset of the property manager
default_save – (optional) if the configuration should be persisted
reset_templates – (optional) reset connector templates from environ variables. Default True
align_connectors – (optional) resets aligned connectors to the template. Default True
default_save_intent – (optional) The default action for saving intent in the property manager
default_intent_level – (optional) the default level intent should be saved at
order_next_available – (optional) if the default behaviour for the order should be next available order
default_replace_intent – (optional) the default replace existing intent behaviour
uri_pm_repo – The read only repo link that points to the raw data path to the contracts repo directory
has_contract – (optional) indicates the instance should have a property manager domain contract
kwargs – to pass to the property ConnectorContract as its kwargs
- Returns:
the initialised class instance
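The path lookup described for from_env can be sketched as a small stand-alone function. This is an illustration of the documented fallback order only: the helper `resolve_pm_path` is hypothetical, it takes the environment and platform name as arguments so that it can be exercised without touching the real environment, and how the library itself detects the platform is not stated in the documentation.

```python
import ntpath


def resolve_pm_path(environ, system):
    """Hypothetical sketch of the HADRON_PM_PATH fallback described above."""
    if "HADRON_PM_PATH" in environ:
        return environ["HADRON_PM_PATH"]
    if system == "Windows":
        # os.environ['AppData']\components\contracts
        return ntpath.join(environ["AppData"], "components", "contracts")
    return "/tmp/components/contracts"


explicit = resolve_pm_path({"HADRON_PM_PATH": "/data/contracts"}, "Linux")
linux_default = resolve_pm_path({}, "Linux")
windows_default = resolve_pm_path({"AppData": "C:\\Users\\me\\AppData\\Roaming"}, "Windows")
```

In a live session the arguments would typically be `dict(os.environ)` and `platform.system()`.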
- classmethod from_memory(has_contract: bool | None = None, default_save_intent: bool | None = None, default_intent_level: bool | None = None, order_next_available: bool | None = None, default_replace_intent: bool | None = None, **kwargs)
Class Factory Method that creates a light touch in memory instance that leaves no residue when closed. This factory method can load a reference contract from a remote repo as a foundation.
- Parameters:
default_save_intent – (optional) The default action for saving intent in the property manager
default_intent_level – (optional) the default level intent should be saved at
order_next_available – (optional) if the default behaviour for the order should be next available order
default_replace_intent – (optional) the default replace existing intent behaviour
has_contract – (optional) indicates the instance should have a property manager domain contract
kwargs – to pass to the property ConnectorContract as its kwargs
- Returns:
the initialised class instance
- classmethod from_uri(task_name: str, uri_pm_path: str, creator: str, uri_pm_repo: str | None = None, pm_file_type: str | None = None, pm_module: str | None = None, pm_handler: str | None = None, pm_kwargs: dict | None = None, default_save=None, reset_templates: bool | None = None, template_path: str | None = None, template_module: str | None = None, template_source_handler: str | None = None, template_persist_handler: str | None = None, align_connectors: bool | None = None, default_save_intent: bool | None = None, default_intent_level: bool | None = None, order_next_available: bool | None = None, default_replace_intent: bool | None = None, has_contract: bool | None = None) SyntheticBuilder
Class Factory Method to instantiate the component application. The Factory Method handles the instantiation of the Properties Manager, the Intent Model and the persistence of the uploaded properties. See class inline docs for an example method
- Parameters:
task_name – The reference name that uniquely identifies a task or subset of the property manager
uri_pm_path – A URI that identifies the resource path for the property manager.
creator – A user name for this task activity.
uri_pm_repo – (optional) A repository URI to initially load the property manager but not save to.
pm_file_type – (optional) defines a specific file type for the property manager
pm_module – (optional) the module or package name where the handler can be found
pm_handler – (optional) the handler for retrieving the resource
pm_kwargs – (optional) a dictionary of kwargs to pass to the property manager
default_save – (optional) if the configuration should be persisted. default to ‘True’
reset_templates – (optional) reset connector templates from environ variables. Default True (see report_environ())
template_path – (optional) a template path to use if the environment variable does not exist
template_module – (optional) a template module to use if the environment variable does not exist
template_source_handler – (optional) a template source handler to use if no environment variable
template_persist_handler – (optional) a template persist handler to use if no environment variable
align_connectors – (optional) resets aligned connectors to the template. Default True
default_save_intent – (optional) The default action for saving intent in the property manager
default_intent_level – (optional) the default level intent should be saved at
order_next_available – (optional) if the default behaviour for the order should be next available order
default_replace_intent – (optional) the default replace existing intent behaviour
has_contract – (optional) indicates the instance should have a property manager domain contract
- Returns:
the initialised class instance
- get_persist_contract() ConnectorContract
gets the persist connector contract that can be used as the next chain source. If the uri contains environment variables it is NOT parsed at load
- get_persist_uri() str
gets the persist connector contract uri that can be used as the next chain source. If the uri contains environment variables it is parsed
- property intent_model: SyntheticIntentModel
The intent model instance
- load_canonical(connector_name: str, reset_changed: bool | None = None, has_changed: bool | None = None, return_empty: bool | None = None, **kwargs) DataFrame
returns the canonical of the referenced connector
- Parameters:
connector_name – the name or label to identify and reference the connector
reset_changed – (optional) resets the has_changed boolean to True
has_changed – (optional) tests whether the underlying canonical has changed since the last load; if it has not, an error is returned
return_empty – (optional) if has_changed is set, returns an empty canonical if set to True
kwargs – arguments to be passed to the handler on load
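The interaction between has_changed and return_empty can be read as a guard around the load. The sketch below is one interpretation of the parameter descriptions, not the library's code: the helper `load_if_changed`, the use of a plain list in place of a DataFrame, and the choice of RuntimeError as the error type are all assumptions.

```python
def load_if_changed(data, source_changed, has_changed=False, return_empty=False):
    """Hypothetical sketch of the has_changed / return_empty guard."""
    if has_changed and not source_changed:
        if return_empty:
            return []  # stands in for an empty canonical
        raise RuntimeError("the canonical has not changed since the last load")
    return data


rows = [{"id": 1}, {"id": 2}]
fresh = load_if_changed(rows, source_changed=True, has_changed=True)
empty = load_if_changed(rows, source_changed=False, has_changed=True, return_empty=True)
```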
- load_persist_canonical(reset_changed: bool | None = None, has_changed: bool | None = None, return_empty: bool | None = None, **kwargs) DataFrame
loads the clean pandas.DataFrame from the clean folder for this contract
- Parameters:
reset_changed – (optional) resets the has_changed boolean to True
has_changed – (optional) tests whether the underlying canonical has changed since the last load; if it has not, an error is returned
return_empty – (optional) if has_changed is set, returns an empty canonical if set to True
kwargs – arguments to be passed to the handler on load
- load_source_canonical(reset_changed: bool | None = None, has_changed: bool | None = None, return_empty: bool | None = None, **kwargs) DataFrame
returns the contracted source data as a DataFrame
- Parameters:
reset_changed – (optional) resets the has_changed boolean to True
has_changed – (optional) tests whether the underlying canonical has changed since the last load; if it has not, an error is returned
return_empty – (optional) if has_changed is set, returns an empty canonical if set to True
kwargs – arguments to be passed to the handler on load
- persist_canonical(connector_name: str, canonical: Any, **kwargs)
persists the canonical to the referenced connector. same as save_canonical
- Parameters:
connector_name – the name or label to identify and reference the connector
canonical – the canonical data to persist
kwargs – arguments to be passed to the handler on persist
- property pm: SyntheticPropertyManager
The properties manager instance
- pm_persist(save=None)
Saves the current configuration to file
- pm_reset(save: bool | None = None)
resets the contract back to a default. This does not remove the Property Manager Connector Contract or any snapshots
- Parameters:
save – override of the default save action set at initialisation.
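Many methods in this class take a `save` argument described as an override of the default save action set at initialisation. A minimal sketch of that rule follows, assuming that `None` means "use the default" (which the `bool | None = None` signatures suggest). The helper `resolve_save` is hypothetical and not part of the class.

```python
def resolve_save(save, default_save):
    """Hypothetical sketch: an explicit True/False wins, None falls back to the default."""
    return default_save if save is None else save


use_default = resolve_save(None, default_save=True)   # falls back to the default
overridden = resolve_save(False, default_save=True)   # explicit value overrides it
```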
- pm_transfer(transfer_connector: [str, ConnectorContract])
Takes a copy of the pm contract and saves it to a new location defined by the connector contract. This can be used to publish a property manager to a new location, change its format or as a backup
- Parameters:
transfer_connector – the name of an existing connector contract or a ConnectorContract
- recover_snapshot(snapshot_name: str, overwrite: bool | None = None, save: bool | None = None) bool
recovers a snapshot back to the current contract. The snapshot must be from this root contract. By default the original root contract is overwritten unless overwrite is set to False, in which case a timestamped snapshot is created
- Parameters:
snapshot_name – the name of the snapshot (use self.contract_snapshots to get the list of names)
overwrite – (optional) if the original contract should be overwritten. Default to True
save – override of the default save action set at initialisation.
- Returns:
True if the contract was recovered, else False
- remove_canonical(connector_name: str, **kwargs)
removes the current persisted canonical.
- Parameters:
connector_name – the name or label to identify and reference the connector
kwargs – arguments to be passed to the handler on remove
- remove_connector_contract(connector_name: str, save: bool | None = None)
removes a named connector contract
- Parameters:
connector_name – the name or label to identify and reference the connector
save – override of the default save action set at initialisation.
- remove_intent(intent_param: [str, dict] = None, level: [int, str] = None, save: bool = None)
removes part or all of the intent contract.
If no params are given, all intent is removed.
If only intent_param is given, all references to that named intent are removed.
If only level is given, that level is removed.
If both level and intent_param are given, that specific intent on that level is removed.
- Parameters:
intent_param – (optional) removes the method contract
level – (optional) removes the level contract
save – (optional) override of the default save action set at initialisation.
- Returns:
True if removed, False if not
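The four removal cases for remove_intent can be sketched against a plain nested dictionary. This is an illustration under stated assumptions rather than the library's implementation: the helper `remove_intent_sketch`, the `{level: {intent_name: params}}` layout and the intent names are hypothetical, "all references" is read as "on every level", and only the string form of intent_param is handled.

```python
def remove_intent_sketch(contract, intent_param=None, level=None):
    """Hypothetical sketch of the four removal cases on {level: {intent: params}}."""
    if intent_param is None and level is None:
        removed = bool(contract)
        contract.clear()                      # no params: remove everything
        return removed
    if level is None:
        removed = False
        for intents in contract.values():     # intent only: remove it on every level
            if intent_param in intents:
                del intents[intent_param]
                removed = True
        return removed
    if intent_param is None:
        return contract.pop(level, None) is not None   # level only
    intents = contract.get(level, {})
    if intent_param in intents:               # both: one intent on one level
        del intents[intent_param]
        return True
    return False


contract = {"A": {"make_ids": {}, "make_dates": {}}, "B": {"make_ids": {}}}
removed_everywhere = remove_intent_sketch(contract, intent_param="make_ids")
```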
- remove_notes(catalog: str, label: str | None = None, save=None)
removes all entries for a labelled note
- Parameters:
catalog – the type of note to delete, if left empty all notes removed
label – (Optional) the name of the label to be removed
save – (Optional) if True, save to file. Default is True
- Returns:
True if successful, False if not
- remove_run_book(book_name: str | None = None, save: bool | None = None) bool
removes named run book. If no runbook is given then all run books are removed
- Parameters:
book_name – (optional) the name of the run_book. defaults to ‘primary_run_book’
save – (optional) override of the default save action set at initialisation.
- Returns:
True if removed, False if not
- static report2dict(report: str, file_type: str = None, versioned: bool = None, stamped: str = None, prefix: str = None, suffix: str = None, path: [str, list] = None) dict
a utility method to help build analytics conditions by aligning method parameters with dictionary format.
- Parameters:
report – The name of the report
file_type – (optional) an alternative file extension to the default ‘json’ format
versioned – (optional) if the component version should be included as part of the pattern
stamped – (optional) A string of the timestamp options [‘days’, ‘hours’, ‘minutes’, ‘seconds’, ‘ns’]
prefix – (optional) a prefix to put at the front of the file pattern to replace the default
suffix – (optional) a suffix to put at the end of the file pattern and extension
path – (optional) a file path that precedes the prefix and file pattern. uses os.path.join so takes a list
- Returns:
a dictionary for an individual element
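To make the idea of "aligning method parameters with dictionary format" concrete, the sketch below builds a report dictionary from the same parameter names. It is a hypothetical stand-in for illustration only: the function `report2dict_sketch` is not the library's method, and dropping parameters that are left as None is an assumption.

```python
def report2dict_sketch(report, file_type=None, versioned=None, stamped=None,
                       prefix=None, suffix=None, path=None):
    """Hypothetical sketch: collect the parameters that were set into a dict."""
    params = {
        "report": report,
        "file_type": file_type,
        "versioned": versioned,
        "stamped": stamped,
        "prefix": prefix,
        "suffix": suffix,
        "path": path,
    }
    # assumed behaviour: parameters left as None are omitted
    return {key: value for key, value in params.items() if value is not None}


element = report2dict_sketch("schema", file_type="csv", versioned=True, stamped="days")
```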
- report_canonical_schema(schema: [str, dict] = None, roots: [str, list] = None, sections: [str, list] = None, elements: [str, list] = None, stylise: bool = True)
presents the current canonical schema
- Parameters:
schema – (optional) the name of the schema
roots – (optional) one or more tree roots
sections – (optional) the section under the root
elements – (optional) the element in the section
stylise – if True present the report stylised.
- Returns:
pd.DataFrame
- report_column_catalog(column_name: [str, list] = None, stylise: bool = True)
generates a report on the column catalog
- Parameters:
column_name – (optional) filters on specific column names.
stylise – (optional) returns a stylised DataFrame with formatting
- Returns:
pd.DataFrame
- report_connectors(connector_filter: [str, list] = None, inc_pm: bool = None, inc_template: bool = None, stylise: bool = True)
generates a report on the connector contracts
- Parameters:
connector_filter – (optional) filters on the connector name.
inc_pm – (optional) include the property manager connector
inc_template – (optional) include the template connectors
stylise – (optional) returns a stylised DataFrame with formatting
- Returns:
pd.DataFrame
- report_environ(hide_not_set: bool = True, stylise: bool = True)
generates a report on the environment variables
- Parameters:
hide_not_set – hide environ keys that are not set.
stylise – returns a stylised DataFrame with formatting
- Returns:
pd.DataFrame
- report_intent(levels: [str, int, list] = None, stylise: bool = True)
generates a report on all the intent
- Parameters:
levels – (optional) a filter on the levels. passing a single value will report a single parameterised view
stylise – (optional) returns a stylised DataFrame with formatting
- Returns:
pd.DataFrame
- report_notes(catalog: [str, list] = None, labels: [str, list] = None, regex: [str, list] = None, re_ignore_case: bool = False, stylise: bool = True, drop_dates: bool = False)
generates a report on the notes
- Parameters:
catalog – (optional) the catalog to filter on
labels – (optional) a label or list of labels to filter on
regex – (optional) a regular expression on the notes
re_ignore_case – (optional) if the regular expression should ignore case
stylise – (optional) returns a stylised DataFrame with formatting
drop_dates – (optional) excludes the ‘date’ column from the report
- Returns:
pd.DataFrame
- report_run_book(stylise: bool = True)
generates a report on the run books
- Parameters:
stylise – returns a stylised DataFrame with formatting
- Returns:
pd.DataFrame
- report_task(stylise: bool = True)
generates a report on the source contract
- Parameters:
stylise – (optional) returns a stylised DataFrame with formatting
- Returns:
pd.DataFrame
- reset_template_connectors(save: bool | None = None)
resets connector contracts with template path and handler where they are template aligned. (see set_connector_aligned)
- Parameters:
save – override of the default save action set at initialisation.
- run_component_pipeline(canonical: Any = None, intent_levels: [str, int, list] = None, run_book: str = None, use_default: bool = None, seed: int = None, **kwargs)
runs the synthetic component pipeline. Passing an int value as the canonical generates a synthetic file of that size
- Parameters:
canonical – an optional canonical to start the pipeline or a size of the synthetic build
intent_levels – a single or list of intent levels to run
run_book – a saved runbook to run
use_default – if the default runbook should be used if it exists
seed – a seed value for this run
kwargs – any additional kwargs
- save_canonical(connector_name: str, canonical: Any, **kwargs)
saves the canonical to the referenced connector. Same as persist_canonical
- Parameters:
connector_name – the name or label to identify and reference the connector
canonical – the canonical data to persist
kwargs – arguments to be passed to the handler on persist
- save_canonical_schema(schema_name: str | None = None, canonical: DataFrame | None = None, schema_tree: list | None = None, exclude_associate: list | None = None, detail_numeric: bool | None = None, strict_typing: bool | None = None, category_limit: int | None = None, save: bool | None = None)
Saves the canonical schema to the Property contract. The default loads the clean canonical but optionally a canonical can be passed to base the schema on and optionally a name given other than the default
- Parameters:
schema_name – (optional) the name of the schema to save
canonical – (optional) the canonical to base the schema on
schema_tree – (optional) an analytics dict (see Discovery.analyse_association(…))
exclude_associate – (optional) a list of dot notation tree of items to exclude from iteration (e.g. [‘age.gender.salary’] will cut ‘salary’ branch from gender and all sub branches)
detail_numeric – (optional) if numeric columns should have detail stats, slowing analysis. default False
strict_typing – (optional) stops objects and string types being seen as categories. default True
category_limit – (optional) a global cap on categories captured. default is 10
save – (optional) if True, save to file. Default is True
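The dot notation used by exclude_associate can be illustrated with a nested dictionary standing in for the association tree. This is a sketch under assumptions: the helper `prune_branch` and the dictionary layout are hypothetical and only demonstrate how a path such as 'age.gender.salary' identifies the 'salary' branch under 'gender', together with everything below it.

```python
def prune_branch(tree, dotted_path):
    """Hypothetical sketch: cut the branch named by a dot-notation path."""
    *parents, leaf = dotted_path.split(".")
    node = tree
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return tree        # path does not exist, nothing to cut
    node.pop(leaf, None)       # removes the branch and all of its sub-branches
    return tree


tree = {"age": {"gender": {"salary": {"bonus": {}}, "city": {}}}}
prune_branch(tree, "age.gender.salary")
```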
- save_persist_canonical(canonical, auto_connectors: bool | None = None, **kwargs)
Saves the canonical to the clean files folder, auto creating the connector from template if not set
- save_report_canonical(reports: [str, list], report_canonical: [dict, DataFrame], replace_connectors: bool | None = None, auto_connectors: bool | None = None, save: bool | None = None, **kwargs)
saves one or a list of reports using the TEMPLATE_PERSIST connector contract. Though a report can be of any name, for convention and consistency each component has a set of REPORT constants <Component>.REPORT_<NAME> where <Component> is the component Class name and <NAME> is the name of the report.
The reports can be a simple string name or a list of names. Each item in the list can be a string or a dictionary providing more detailed parameters on how to represent the report. These parameter keys are:
key report: the name of the report
key file_type: (optional) a file type other than the default .json
key versioned: (optional) if the filename should be versioned
key stamped: (optional) A string of the timestamp options [‘days’, ‘hours’, ‘minutes’, ‘seconds’, ‘ns’]
Some examples:
self.REPORT_SCHEMA
[self.REPORT_NOTES, self.REPORT_SCHEMA]
[self.REPORT_NOTES, {'report': self.REPORT_SCHEMA, 'uri_file': '<file_name>'}]
[{'report': self.REPORT_NOTES, 'file_type': 'json'}]
[{'report': self.REPORT_SCHEMA, 'file_type': 'csv', 'versioned': True, 'stamped': 'days'}]
- Parameters:
reports – a report name or list of report names to save
report_canonical – a relating canonical to base the report on
auto_connectors – (optional) if a connector should be created automatically
replace_connectors – (optional) replace any existing report connectors with these reports
save – (optional) if True, save to file. Default is True
kwargs – additional kwargs to pass to a Connector Contract
- classmethod scratch_pad() SyntheticIntentModel
A class method to use the Components intent methods as a scratch pad
- set_connector_aligned(connector_names: [str, list], aligned: bool, save: bool | None = None)
sets whether the named connector contracts are aligned with the template connector contract
- Parameters:
connector_names – a name or list of names of connector contract to modify
aligned – if the connector contract is aligned to the template connector contract
save – override of the default save action set at initialisation.
- set_connector_version(connector_names: [str, list], version: str, save: bool | None = None)
modifies the uri of a connector contract and resets
- Parameters:
connector_names – a name or list of names of connector contract to modify
version – the new version number
save – override of the default save action set at initialisation.
- set_description(description: str, save=None)
sets the description of this component task
- Parameters:
description – a brief description of this component task
save – override of the default save action set at initialisation.
- set_persist(uri_file: str | None = None, save: bool | None = None, **kwargs)
sets the persist contract CONNECTOR_PERSIST using the TEMPLATE_PERSIST connector contract
- Parameters:
uri_file – (optional) the uri_file is appended to the template path
save – (optional) if True, save to file. Default is True
- set_persist_contract(connector_contract: ConnectorContract, save: bool | None = None)
Sets the persist contract.
- Parameters:
connector_contract – a Connector Contract for the persisted data
save – (optional) if True, save to file. Default is True
- set_persist_uri(uri: str, save: bool | None = None, template_aligned: bool | None = None, **kwargs)
Sets the persist contract giving the full uri path. This is a shortcut of set_persist_contract(…), not requiring a ConnectorContract to be set up and using the default module and handler values.
- Parameters:
uri – a fully qualified uri of the persist data
template_aligned – (optional) whether the connector is aligned with the template, so that changes to the template also apply to this connector
save – (optional) if True, save to file. Default is True
- set_report_persist(reports: [str, list], project: str = None, path: [str, list] = None, prefix: str = None, suffix: str = None, file_type: str = None, versioned: bool = None, stamped: str = None, save: bool = None, **kwargs) list
sets the report persist using the TEMPLATE_PERSIST connector contract. There are preset constants that should be used. These constants can be in the form <class>.REPORT_<NAME> or <instance>.REPORT_<NAME> where <NAME> is the name of the report and can be found in this class. Examples of reports might be:
Transition.REPORT_SCHEMA
[self.REPORT_NOTES, self.REPORT_SCHEMA]
[builder.REPORT_NOTES, {'report': builder.REPORT_SCHEMA, 'uri_file': '<file_name>'}]
[{'report': Wrangle.REPORT_NOTES, 'file_type': 'json'}]
[{'report': self.REPORT_SCHEMA, 'file_type': 'csv', 'versioned': True, 'stamped': 'days'}]
If a report is presented as a dict, the method signature parameters will be overwritten by the report values. This allows global parameters to apply generally while individual reports are modified at a granular level.
To ensure dict reports have the correct keys, the util method ‘report2dict(…)’ can be used.
- Parameters:
reports – (optional) the name(s) of the report connector to set (see class REPORT_* constants)
project – (optional) an alternative project string that replaces ‘hadron’
path – (optional) a file path that precedes the prefix and file pattern. uses os.path.join so takes a list
prefix – (optional) a prefix to put at the front of the file pattern to replace the default
suffix – (optional) a suffix to put at the end of the file pattern and extension
file_type – (optional) a global file extension to the default ‘json’ format
versioned – (optional) if all reports should include a version
stamped – (optional) A string of the timestamp options [‘days’, ‘hours’, ‘minutes’, ‘seconds’, ‘ns’]
save – (optional) if True, save to file. Default is True
kwargs – (optional) additional parameters to send as kwargs for the Connector Contract
- Returns:
a list of connector names created from the reports
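The rule that values in a dict report take precedence over the method-level parameters can be sketched as a simple merge. The helper `resolve_report` and the report names are hypothetical and only illustrate the precedence; how the library turns the merged settings into connector names and file patterns is not shown here.

```python
def resolve_report(report, **global_params):
    """Hypothetical sketch: per-report dict values override the global parameters."""
    settings = {key: value for key, value in global_params.items() if value is not None}
    if isinstance(report, dict):
        settings.update(report)        # granular values win over global ones
    else:
        settings["report"] = report    # a plain name just uses the globals
    return settings


reports = ["notes", {"report": "schema", "file_type": "csv"}]
resolved = [resolve_report(item, file_type="json", versioned=True) for item in reports]
```

Here the first report keeps the global 'json' file type, while the second overrides it with 'csv' and still inherits `versioned=True`.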
- set_source(uri_file: str, save: bool | None = None, **kwargs)
sets the source contract CONNECTOR_SOURCE using the TEMPLATE_SOURCE connector contract
- Parameters:
uri_file – the uri_file is appended to the template path
save – (optional) if True, save to file. Default is True
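Several methods describe uri_file as a file name appended to the template path. A minimal sketch of that composition follows, assuming a '/'-separated join. The helper `template_uri` and the example paths are hypothetical, and the library may build the URI differently.

```python
import posixpath


def template_uri(template_path, uri_file):
    """Hypothetical sketch: append the file name to the template path."""
    return posixpath.join(template_path, uri_file)


local_uri = template_uri("/tmp/components/data/", "members.csv")
remote_uri = template_uri("s3://bucket/data", "members.csv")
```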
- set_source_contract(connector_contract: ConnectorContract, template_aligned: bool | None = None, save: bool | None = None)
Sets the source contract using the class CONNECTOR_SOURCE constant
- Parameters:
connector_contract – a Connector Contract for the source data
template_aligned – (optional) whether the connector is aligned with the template, so that changes to the template also apply to this connector
save – (optional) if True, save to file. Default is True
- set_source_uri(uri: str, save: bool | None = None, template_aligned: bool | None = None, **kwargs)
Sets the source contract giving the full uri path. This is a shortcut of set_source_contract(…), not requiring a ConnectorContract to be set up and using the default module and handler values.
- Parameters:
uri – a fully qualified uri of the source data
template_aligned – (optional) whether the connector is aligned with the template, so that changes to the template also apply to this connector
save – (optional) if True, save to file. Default is True
- set_status(status: str, save=None)
sets the status of this component task. Suggested status might be ‘discovery’, ‘stable’, ‘production’
- Parameters:
status – the status to be set
save – override of the default save action set at initialisation.
- set_version(version: str, save=None)
sets the version
- Parameters:
version – the version to be set
save – override of the default save action set at initialisation.
- setup_bootstrap(domain: str | None = None, project_name: str | None = None, path: str | None = None, file_type: str | None = None, description: str | None = None)
Creates a bootstrap Transition setup. Note this does not set the source
- Parameters:
domain – (optional) The domain this simulator sits within e.g. ‘Healthcare’ or ‘Financial Services’
project_name – (optional) a project name that will replace the hadron naming on file prefix
path – (optional) a path added to the template path default
file_type – (optional) a file_type for the persisted file, default is ‘parquet’
description – (optional) a description of the component instance to overwrite the default
- upload_notes(canonical: dict, catalog: str, label_key: str, text_key: str, constraints: list | None = None, save=None)
Allows bulk upload of notes.
- Parameters:
canonical – a dictionary where the key is the label and the value is the text
catalog – (optional) the section these notes should be put in
label_key – the dictionary key name for the labels
text_key – the dictionary key name for the text
constraints – (optional) the limited list of acceptable labels. If not in list then ignored
save – if True, save to file. Default is True
- property visual: Visualisation
The visualisation instance
- classmethod visual_pad() Visualisation
A class method to use the Components visualisation methods as a scratch pad