EntityRecognizer¶

class arcgis.learn.text.EntityRecognizer(data, lang='en', backbone='spacy', **kwargs)¶

Creates an entity recognition model to extract text entities from unstructured text documents.

Argument	Description
data	Requires data object returned from `prepare_data` function.
lang	Optional string. Language-specific code, named according to the language’s ISO code. The default value is ‘en’ for English.
backbone	Optional string. Specify spacy or the HuggingFace transformer model name to be used to train the entity recognizer model. Default is spacy. Entity recognition via spaCy is based on <https://spacy.io/api/entityrecognizer>. To learn more about the available transformer models or to choose models that are suitable for your dataset, visit https://huggingface.co/transformers/pretrained_models.html

kwargs

Argument	Description
verbose	Optional string. Log level, which controls the amount of information displayed while training or calling the methods of this class. Allowed values are debug, info, warning, error and critical. Default is error. Applicable only for models with HuggingFace transformer backbones.
seq_len	Optional Integer. Default set to 512. Maximum sequence length (at sub-word level after tokenization) of the training data to be considered for training the model. Applicable only for models with HuggingFace transformer backbones.
mixed_precision	Optional Bool. Default set to False. If set True, then mixed precision training is used to train the model. Applicable only for models with HuggingFace transformer backbones.
pretrained_path	Optional String. Path where pre-trained model is saved. Accepts a Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.

Returns: EntityRecognizer Object
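
A minimal sketch of how the optional keyword arguments above might be collected and passed to the constructor. The helper names `build_recognizer_kwargs` and `create_recognizer` are hypothetical, and `bert-base-cased` is only an assumed example of a HuggingFace model name; `data` is assumed to be the object returned by `prepare_data`.

```python
ALLOWED_LOG_LEVELS = ("debug", "info", "warning", "error", "critical")


def build_recognizer_kwargs(verbose="error", seq_len=512, mixed_precision=False):
    """Collect the documented optional keyword arguments into a dict (hypothetical helper)."""
    if verbose not in ALLOWED_LOG_LEVELS:
        raise ValueError(f"verbose must be one of {ALLOWED_LOG_LEVELS}, got {verbose!r}")
    if seq_len <= 0:
        raise ValueError("seq_len must be a positive integer")
    return {"verbose": verbose, "seq_len": seq_len, "mixed_precision": bool(mixed_precision)}


def create_recognizer(data, backbone="bert-base-cased", **overrides):
    """Create a transformer-backed EntityRecognizer (hypothetical helper).

    `data` is assumed to be the object returned by `prepare_data`.
    """
    # Imported lazily so the kwargs helper works without arcgis installed.
    from arcgis.learn.text import EntityRecognizer

    kwargs = build_recognizer_kwargs(**overrides)
    return EntityRecognizer(data, lang="en", backbone=backbone, **kwargs)


# The kwargs helper can be checked on its own, without a prepared data object.
print(build_recognizer_kwargs(verbose="info", seq_len=256))
```

These keyword arguments are documented as applying only to HuggingFace transformer backbones, which is why the sketch does not default to spacy.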

classmethod available_backbone_models(architecture)¶

Get available models for the given entity recognition backbone

Argument	Description
architecture	Required string. Name of the architecture one wishes to use. To learn more about the available models or to choose models that are suitable for your dataset, visit https://huggingface.co/transformers/pretrained_models.html

Returns: a tuple containing the available models for the given entity recognition backbone

extract_entities(text_list, drop=True, batch_size=4)¶

Extracts the entities from the documents in the given path or in text_list.

The field defined as ‘address_tag’ in the class mapping attribute of the prepare_data() function is treated as a location. When the trained model extracts multiple locations from a single document, that document is replicated for each location in the resulting dataframe.

Argument	Description
text_list	Required string (path) or list (documents). Path to the documents OR list of documents for entity extraction.
drop	Optional bool. Whether documents without an address are dropped from the results. Default is True.
batch_size	Optional integer. Number of items to process at once. Reduce it if you get CUDA out-of-memory errors. Default is 4. Not applicable for models with spaCy backbone.

Returns: Pandas DataFrame
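
The replication of a document for each extracted location, and the effect of `drop`, can be illustrated with a small pure-Python sketch. This is not the library's implementation; the function `expand_by_location`, the dictionary keys and the sample documents are all made up for illustration.

```python
def expand_by_location(documents, drop=True):
    """Return one row per (document, location) pair.

    `documents` is a list of dicts with the hypothetical keys "text" and "locations".
    """
    rows = []
    for doc in documents:
        locations = doc.get("locations", [])
        if not locations:
            # Documents without an address are kept only when drop is False.
            if not drop:
                rows.append({"text": doc["text"], "location": None})
            continue
        # A document with several locations is repeated once per location.
        for location in locations:
            rows.append({"text": doc["text"], "location": location})
    return rows


docs = [
    {"text": "Theft at 12 Main St, suspect fled to 4 Oak Ave.",
     "locations": ["12 Main St", "4 Oak Ave"]},
    {"text": "No address was reported.", "locations": []},
]

for row in expand_by_location(docs, drop=True):
    print(row)
```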

f1_score()¶: Calculate F1 score of the trained model

fit(epochs=20, lr=None, one_cycle=True, early_stopping=False, checkpoint=True, **kwargs)¶

Train the model for the specified number of epochs and using the specified learning rates

Argument	Description
epochs	Optional integer. Number of cycles of training on the data. Increase it if underfitting. The default value is 20.
lr	Optional float or slice of floats. Learning rate to be used for training the model. If `lr=None`, an optimal learning rate is automatically deduced for training the model. Note: Passing a slice of floats as the lr value is not supported for models with spaCy backbone.
one_cycle	Optional boolean. Parameter to select 1cycle learning rate schedule. If set to False, no learning rate schedule is used. Note: Not applicable for models with spaCy backbone.
early_stopping	Optional boolean. Parameter to add early stopping. If set to True, training will stop if validation loss stops improving for 5 epochs. Note: Not applicable for models with spaCy backbone.
checkpoint	Optional boolean. Parameter to save the best model during training. If set to True, the best model based on validation loss will be saved during training. Note: Not applicable for models with spaCy backbone.
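
The early-stopping rule described above (stop once validation loss has not improved for 5 epochs) can be sketched as follows. This is an illustration of the rule under the assumption that "improving" means a strictly lower validation loss; the function name and the loss values are hypothetical, and the library's own callback may differ in detail.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # A new best validation loss resets the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses: the best value is reached at epoch 3.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.7, 0.75, 0.73, 0.9]
print(early_stopping_epoch(losses))
```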

classmethod from_model(emd_path, data=None)¶

Creates an EntityRecognizer model object from a Deep Learning Package(DLPK) or Esri Model Definition (EMD) file.

Argument	Description
emd_path	Required string. Path to Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.
data	Required DatabunchNER object or None. Returned data object from prepare_data function or None for inferencing.

Returns: EntityRecognizer Object

load(name_or_path)¶

Loads a saved EntityRecognizer model from disk.

Argument	Description
name_or_path	Required string. Path to Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.

lr_find(allow_plot=True)¶: Runs the Learning Rate Finder, and displays the graph of its output. Helps in choosing the optimum learning rate for training the model.

metrics_per_label()¶: Calculate precision, recall and F1 scores for each label/entity on which the model was trained.

plot_losses(show=True)¶

Plot training and validation losses.

Argument	Description
show	Optional bool. Defaults to True. If set to False, the figure is not plotted but is returned; if set to True, the function plots the figure and returns nothing.

Returns: matplotlib.figure.Figure

precision_score()¶: Calculate precision score of the trained model

recall_score()¶: Calculate recall score of the trained model

save(name_or_path, **kwargs)¶

Saves the model weights, creates an Esri Model Definition and Deep Learning Package zip for deployment to Image Server or ArcGIS Pro.

Argument	Description
name_or_path	Required string. Name of the model to save. It stores it at the pre-defined location. If path is passed then it stores at the specified path with model name as directory name and creates all the intermediate directories.
publish	Optional boolean. Publishes the DLPK as an item. Default is set to False.
gis	Optional GIS Object. Used for publishing the item. If not specified then active gis user is taken.
compute_metrics	Optional boolean. Used for computing model metrics. Default is set to True.
save_optimizer	Optional boolean. Used for saving the model-optimizer state along with the model. Default is set to False. Not applicable for models with spaCy backbone.
kwargs	Optional parameters. overwrite: Boolean; if True, overwrites the item on ArcGIS Online/Enterprise (default False). zip_files: Boolean; if True, creates the Deep Learning Package (DLPK) file while saving the model.

show_results(ds_type='valid')¶

Runs entity extraction on a random batch from the mentioned ds_type.

Argument	Description
ds_type	Optional string, defaults to valid.

Returns: Pandas DataFrame

supported_backbones = ['spacy', 'BERT', 'RoBERTa', 'DistilBERT', 'ALBERT', 'CamemBERT', 'MobileBERT', 'XLNet', 'XLM', 'XLM-RoBERTa', 'FlauBERT', 'ELECTRA', 'Longformer']¶

unfreeze()¶: Unfreezes the earlier layers of the model for fine-tuning.

TextClassifier¶

class arcgis.learn.text.TextClassifier(data, backbone='bert-base-cased', **kwargs)¶

Creates a TextClassifier Object. Based on the Hugging Face transformers library

Argument	Description
data	Required text data object, returned from prepare_textdata function.
backbone	Optional string. Specifying the HuggingFace transformer model name to be used to train the classifier. Default set to bert-base-cased. To learn more about the available models or choose models that are suitable for your dataset, kindly visit:- https://huggingface.co/transformers/pretrained_models.html

kwargs

Argument	Description
verbose	Optional string. Default set to error. The log level you want to set. It means the amount of information you want to display while training or calling the various methods of this class. Allowed values are - debug, info, warning, error and critical.
seq_len	Optional Integer. Default set to 512. Maximum sequence length (at sub-word level after tokenization) of the training data to be considered for training the model.
thresh	Optional Float. This parameter is used to set the threshold value to pick labels in case of multi-label text classification problem. Default value is set to 0.25
mixed_precision	Optional Bool. Default set to False. If set True, then mixed precision training is used to train the model
pretrained_path	Optional String. Path where pre-trained model is saved. Accepts a Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.

Returns: TextClassifier Object

accuracy()¶

Calculates the following metric:

accuracy: the number of correctly predicted labels in the validation set divided by the total number of items in the validation set

Returns: a floating point number depicting the accuracy of the classification model.
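
The accuracy definition above amounts to the following arithmetic. This is a stand-alone sketch with made-up labels, not the method's actual implementation.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of items whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one item is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Three of the four made-up predictions match the true labels.
print(accuracy(["a", "b", "a", "c"], ["a", "b", "c", "c"]))
```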

classmethod available_backbone_models(architecture)¶

Get available models for the given transformer backbone

Argument	Description
architecture	Required string. Name of the transformer backbone one wishes to use. To learn more about the available models or to choose models that are suitable for your dataset, visit https://huggingface.co/transformers/pretrained_models.html

Returns: a tuple containing the available models for the given transformer backbone

fit(epochs=10, lr=None, one_cycle=True, early_stopping=False, checkpoint=True, tensorboard=False, **kwargs)¶

Train the model for the specified number of epochs and using the specified learning rates

Argument	Description
epochs	Optional integer. Number of cycles of training on the data. Increase it if underfitting. The default value is 10.
lr	Optional float or slice of floats. Learning rate to be used for training the model. If `lr=None`, an optimal learning rate is automatically deduced for training the model.
one_cycle	Optional boolean. Parameter to select 1cycle learning rate schedule. If set to False no learning rate schedule is used.
early_stopping	Optional boolean. Parameter to add early stopping. If set to ‘True’ training will stop if validation loss stops improving for 5 epochs.
checkpoint	Optional boolean. Parameter to save the best model during training. If set to True the best model based on validation loss will be saved during training.
tensorboard	Optional boolean. Parameter to write the training log. If set to True, the log is saved at <dataset-path>/training_log, which can be visualized in tensorboard. Requires tensorboardx version 2.1. The default value is False. Note: Not applicable for text models.

classmethod from_model(emd_path, data=None)¶

Creates a TextClassifier model object from a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

Argument	Description
emd_path	Required string. Path to Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.
data	Required fastai Databunch or None. Returned data object from prepare_textdata function or None for inferencing.

Returns: TextClassifier model Object

get_misclassified_records()¶

Returns: the misclassified records for this classification model.

load(name_or_path)¶

Loads a saved TextClassifier model from disk.

Argument	Description
name_or_path	Required string. Path to Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.

lr_find(allow_plot=True)¶: Runs the Learning Rate Finder, and displays the graph of its output. Helps in choosing the optimum learning rate for training the model.

metrics_per_label()¶

Returns: precision, recall and f1 score for each label in the classification model.

plot_losses()¶: Plot validation and training losses after fitting the model.

predict(text_or_list, show_progress=True, thresh=None)¶

Predicts the class label(s) for the input text

Argument	Description
text_or_list	Required String or List. text or a list of texts for which we wish to find the class label(s).
show_progress	Optional bool. If set to True, will display a progress bar depicting the items processed so far. Applicable only when a list of texts is passed.
thresh	Optional Float. The threshold value set to get the class label(s). Applicable only for multi-label classification task. Default is the value set during the model creation time, otherwise the value of 0.25 is set.

Returns

In case of a single-label classification problem, a tuple containing the text, its predicted class label and the confidence score.

In case of a multi-label classification problem, a tuple containing the text, its predicted class labels, a list containing 1’s for the predicted labels and 0’s otherwise, and a list containing a score for each label.
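
For the multi-label case, the role of `thresh` can be sketched as below. The sketch assumes a label is picked when its score is greater than or equal to the threshold; whether the library uses a strict or non-strict comparison is not stated in this reference. The function name, label names and scores are made up.

```python
def pick_labels(scores, labels, thresh=0.25):
    """Return the chosen labels and a 1/0 indicator list for the given scores."""
    indicators = [1 if score >= thresh else 0 for score in scores]
    chosen = [label for label, flag in zip(labels, indicators) if flag]
    return chosen, indicators


labels = ["flood", "fire", "storm"]
scores = [0.81, 0.10, 0.40]

chosen, indicators = pick_labels(scores, labels, thresh=0.25)
print(chosen, indicators)
```

Raising `thresh` makes the prediction more conservative: with the same made-up scores and `thresh=0.5`, only the first label would be kept.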

save(name_or_path, framework='PyTorch', publish=False, gis=None, compute_metrics=True, save_optimizer=False, **kwargs)¶

Saves the model weights, creates an Esri Model Definition and Deep Learning Package zip for deployment.

Argument	Description
name_or_path	Required string. Folder path to save the model.
framework	Optional string. Defines the framework of the model. (Only supported by `SingleShotDetector`, currently.) If framework used is `TF-ONNX`, `batch_size` can be passed as an optional keyword argument. Framework choice: ‘PyTorch’ and ‘TF-ONNX’
publish	Optional boolean. Publishes the DLPK as an item.
gis	Optional GIS Object. Used for publishing the item. If not specified then active gis user is taken.
compute_metrics	Optional boolean. Used for computing model metrics.
save_optimizer	Optional boolean. Used for saving the model-optimizer state along with the model. Default is set to False.
kwargs	Optional Parameters: Boolean overwrite if True, it will overwrite the item on ArcGIS Online/Enterprise, default False. Boolean zip_files if True, it will create the Deep Learning Package (DLPK) file while saving the model.

Returns: the qualified path at which the model is saved

show_results(rows=5, **kwargs)¶

Prints the rows of the dataframe with target and prediction columns.

Argument	Description
rows	Optional Integer. Number of rows to print.

Returns: dataframe

supported_backbones = ['BERT', 'RoBERTa', 'DistilBERT', 'ALBERT', 'FlauBERT', 'CamemBERT', 'XLNet', 'XLM', 'XLM-RoBERTa', 'Bart', 'ELECTRA', 'Longformer', 'MobileBERT']¶

unfreeze()¶: Unfreezes the earlier layers of the model for fine-tuning.

SequenceToSequence¶

class arcgis.learn.text.SequenceToSequence(data, backbone='t5-base', **kwargs)¶

Creates a SequenceToSequence Object. Based on the Hugging Face transformers library

Argument	Description
data	Required text data object, returned from prepare_textdata function.
backbone	Optional string. Specifying the HuggingFace transformer model name to be used to train the model. Default set to ‘t5-base’. To learn more about the available models or choose models that are suitable for your dataset, kindly visit:- https://huggingface.co/transformers/pretrained_models.html

kwargs

Argument	Description
verbose	Optional string. Default set to error. The log level you want to set. It means the amount of information you want to display while training or calling the various methods of this class. Allowed values are - debug, info, warning, error and critical.
seq_len	Optional Integer. Default set to 512. Maximum sequence length (at sub-word level after tokenization) of the training data to be considered for training the model.
mixed_precision	Optional Bool. Default set to False. If set True, then mixed precision training is used to train the model
pretrained_path	Optional String. Path where pre-trained model is saved. Accepts a Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.

Returns: SequenceToSequence model object for sequence_translation task.

classmethod available_backbone_models(architecture)¶

Get available models for the given transformer backbone

Argument	Description
architecture	Required string. Name of the transformer backbone one wishes to use. To learn more about the available models or to choose models that are suitable for your dataset, visit https://huggingface.co/transformers/pretrained_models.html

Returns: a tuple containing the available models for the given transformer backbone

fit(epochs=10, lr=None, one_cycle=True, early_stopping=False, checkpoint=True, tensorboard=False, **kwargs)¶

Train the model for the specified number of epochs and using the specified learning rates

Argument	Description
epochs	Optional integer. Number of cycles of training on the data. Increase it if underfitting. The default value is 10.
lr	Optional float or slice of floats. Learning rate to be used for training the model. If `lr=None`, an optimal learning rate is automatically deduced for training the model.
one_cycle	Optional boolean. Parameter to select 1cycle learning rate schedule. If set to False no learning rate schedule is used.
early_stopping	Optional boolean. Parameter to add early stopping. If set to ‘True’ training will stop if validation loss stops improving for 5 epochs.
checkpoint	Optional boolean. Parameter to save the best model during training. If set to True the best model based on validation loss will be saved during training.
tensorboard	Optional boolean. Parameter to write the training log. If set to True, the log is saved at <dataset-path>/training_log, which can be visualized in tensorboard. Requires tensorboardx version 2.1. The default value is False. Note: Not applicable for text models.

classmethod from_model(emd_path, data=None, **kwargs)¶

Creates a SequenceToSequence model object from a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

Argument	Description
emd_path	Required string. Path to Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.
data	Optional fastai Databunch. Returned data object from prepare_textdata function or None for inferencing. Default value: None

Returns: SequenceToSequence Object

get_model_metrics()¶

Calculates the following metrics:

accuracy: the number of correctly predicted labels in the validation set divided by the total number of items in the validation set
bleu-score: this value indicates the similarity between model predictions and the ground truth text. Maximum value is 1

Returns: a dictionary containing the metrics for the model.
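
To give a feel for how a BLEU-style score measures similarity between a prediction and the ground truth, here is a simplified single-reference sketch using clipped unigram and bigram precision with a brevity penalty. It is an assumption-laden illustration (whitespace tokenization, n-grams up to 2, no smoothing) and not necessarily how this method computes its bleu-score.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def simple_bleu(reference, candidate, max_n=2):
    """Simplified BLEU of `candidate` against one `reference` string (range 0 to 1)."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: candidates shorter than the reference are penalised.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_precision_sum / max_n)


print(simple_bleu("the cat sat on the mat", "the cat sat on a mat"))
```

An exact match scores 1, which is consistent with the stated maximum value, and a prediction that shares no words with the ground truth scores 0.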

load(name_or_path)¶

Loads a saved SequenceToSequence model from disk.

Argument	Description
name_or_path	Required string. Path to Deep Learning Package (DLPK) or Esri Model Definition(EMD) file.

lr_find(allow_plot=True)¶: Runs the Learning Rate Finder, and displays the graph of its output. Helps in choosing the optimum learning rate for training the model.

plot_losses(show=True)¶

Plot training and validation losses.

Argument	Description
show	Optional bool. Defaults to True. If set to False, the figure is not plotted but is returned; if set to True, the function plots the figure and returns nothing.

Returns: matplotlib.figure.Figure

predict(text_or_list, batch_size=64, show_progress=True, **kwargs)¶

Predicts the translated outcome.

Argument	Description
text_or_list	Required input string or list of input strings.
batch_size	Optional integer. Number of inputs to be processed at once. Try reducing the batch size in case of out-of-memory errors. Default value is 64.
show_progress	Optional bool. Whether to show the progress of the prediction task. Default value is True.

kwargs

Argument	Description
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to 20.
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to 10.

Returns: list of tuples (input, predicted output strings).
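
A minimal sketch of passing the generation keyword arguments to `predict`. The helpers `generation_kwargs` and `predict_all` are hypothetical, and `EchoModel` is only a stand-in with the same `predict` signature so the sketch runs without a trained model; in practice `model` would be a trained SequenceToSequence object.

```python
def generation_kwargs(num_beams=1, max_length=20, min_length=10):
    """Validate and collect the documented generation keyword arguments."""
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    if min_length > max_length:
        raise ValueError("min_length must not exceed max_length")
    return {"num_beams": num_beams, "max_length": max_length, "min_length": min_length}


def predict_all(model, texts, batch_size=64, **overrides):
    """Call model.predict with validated generation settings."""
    return model.predict(
        texts, batch_size=batch_size, show_progress=False, **generation_kwargs(**overrides)
    )


class EchoModel:
    """Stand-in for a trained model: upper-cases the input instead of translating it."""

    def predict(self, text_or_list, batch_size=64, show_progress=True, **kwargs):
        return [(text, text.upper()) for text in text_or_list]


results = predict_all(EchoModel(), ["main st", "oak ave"], batch_size=2, num_beams=4)
print(results)
```

If predictions run out of memory, lowering `batch_size` is the first setting to try, as noted in the argument table.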

save(name_or_path, framework='PyTorch', publish=False, gis=None, compute_metrics=True, save_optimizer=False, **kwargs)¶

Saves the model weights, creates an Esri Model Definition and Deep Learning Package zip for deployment.

Argument	Description
name_or_path	Required string. Folder path to save the model.
framework	Optional string. Defines the framework of the model. (Only supported by `SingleShotDetector`, currently.) If framework used is `TF-ONNX`, `batch_size` can be passed as an optional keyword argument. Framework choice: ‘PyTorch’ and ‘TF-ONNX’
publish	Optional boolean. Publishes the DLPK as an item.
gis	Optional GIS Object. Used for publishing the item. If not specified then active gis user is taken.
compute_metrics	Optional boolean. Used for computing model metrics.
save_optimizer	Optional boolean. Used for saving the model-optimizer state along with the model. Default is set to False.
kwargs	Optional Parameters: Boolean overwrite if True, it will overwrite the item on ArcGIS Online/Enterprise, default False. Boolean zip_files if True, it will create the Deep Learning Package (DLPK) file while saving the model.

Returns: the qualified path at which the model is saved

show_results(rows=5, **kwargs)¶

Prints the rows of the dataframe with target and prediction columns.

Argument	Description
rows	Optional Integer. Number of rows to print.

Returns: dataframe

supported_backbones = ['T5', 'Bart', 'Marian']¶

unfreeze()¶: Unfreezes the earlier layers of the model for fine-tuning.

Inference Only Models¶

FillMask¶

class arcgis.learn.text.FillMask(backbone=None)¶

Creates a FillMask Object. Based on the Hugging Face transformers library

Argument	Description
backbone	Optional string. Specify the HuggingFace transformer model name which will be used to generate the suggestion token. To learn more about the available models for fill-mask task, kindly visit:- https://huggingface.co/models?filter=lm-head

Returns: FillMask Object

predict_token(text_or_list, num_suggestions=5)¶

Generates suggestions for the masked token in the given text or list of texts

Argument	Description
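text_or_list	Required string or list. A text/sentence or a list of texts/sentences for which one wishes to generate the recommendations for the masked token.
num_suggestions	Optional integer. Number of suggestions to return for the masked token. Default value is 5.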

Returns

A list or a list of lists of dicts: each result comes as a list of dictionaries with the following keys:

sequence (str) – The corresponding input with the mask token prediction.
score (float) – The corresponding probability.
token_str (str) – The predicted token (to replace the masked one).
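
Working with the result structure described above might look like the following sketch. The sample suggestions are invented for illustration and are not real model output; `best_suggestion` is a hypothetical helper.

```python
def best_suggestion(suggestions):
    """Return the suggestion dictionary with the highest score."""
    return max(suggestions, key=lambda item: item["score"])


# Made-up result for a single input sentence, using the documented keys.
sample = [
    {"sequence": "The capital of France is Lyon.", "score": 0.03, "token_str": "Lyon"},
    {"sequence": "The capital of France is Paris.", "score": 0.92, "token_str": "Paris"},
]

top = best_suggestion(sample)
print(top["token_str"], top["score"])
```

When a list of texts is passed, the result is a list of such lists, so the helper would be applied to each inner list in turn.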

supported_backbones = []¶: supported transformer backbones

QuestionAnswering¶

class arcgis.learn.text.QuestionAnswering(backbone=None)¶

Creates a QuestionAnswering Object. Based on the Hugging Face transformers library

Argument	Description
backbone	Optional string. Specify the HuggingFace transformer model name which will be used to extract the answers from a given passage/context. To learn more about the available models for question-answering task, kindly visit:- https://huggingface.co/models?filter=question-answering

Returns: QuestionAnswering Object

get_answer(text_or_list, context, **kwargs)¶

Find answers for the asked questions from the given passage/context

Argument	Description
text_or_list	Required string or list. Questions or a list of questions one wishes to seek an answer for.
context	Required string. The context associated with the question(s) which contains the answers.

kwargs

Argument	Description
num_answers	Optional integer. The number of answers to return. The answers will be chosen by order of likelihood. Default value is set to 1.
max_answer_length	Optional integer. The maximum length of the predicted answers. Default value is set to 15.
max_question_length	Optional integer. The maximum length of the question after tokenization. Questions will be truncated if needed. Default value is set to 64.
impossible_answer	Optional bool. Whether or not we accept impossible as an answer. Default value is set to False

Returns: a list or a list of lists containing the answer(s) for the input question(s)

supported_backbones = []¶: supported transformer backbones

TextGenerator¶

class arcgis.learn.text.TextGenerator(backbone=None)¶

Creates a TextGenerator Object. Based on the Hugging Face transformers library

Argument	Description
backbone	Optional string. Specifying the HuggingFace transformer model name which will be used to generate the text. To learn more about the available models for text-generation task, kindly visit:- https://huggingface.co/models?search=&filter=lm-head

Returns: TextGenerator Object

generate_text(text_or_list, **kwargs)¶

Generate text(s) for a text or a list of incomplete sentence(s)

Argument	Description
text_or_list	Required string or list. A text/sentence or a list of texts/sentences to complete.

kwargs

Argument	Description
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to the min_length parameter of the model config.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to max_length parameter of the model config.
num_return_sequences	Optional integer. The number of independently computed returned sequences for each element in the batch. Default value is set to 1.
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
length_penalty	Optional float. Exponential penalty to the length. 1.0 means no penalty. Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in order to encourage the model to produce longer sequences. Default value is set to 1.0.
early_stopping	Optional bool. Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not. Default value is set to False.

Returns: a list or a list of lists containing the generated text for the input prompt(s) / sentence(s)
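
The effect of `length_penalty` can be worked through numerically. The sketch assumes the convention used in Hugging Face beam search, where a finished hypothesis is scored as its summed log-probability divided by its length raised to `length_penalty`; the two hypotheses and their numbers are made up.

```python
def beam_score(sum_log_prob, length, length_penalty=1.0):
    """Length-normalised score of a finished beam hypothesis (higher is better)."""
    return sum_log_prob / (length ** length_penalty)


# Two hypothetical finished hypotheses: (summed log-probability, token count).
short_hyp = (-4.0, 4)
long_hyp = (-6.0, 8)

for penalty in (0.5, 1.0, 2.0):
    short_score = beam_score(*short_hyp, length_penalty=penalty)
    long_score = beam_score(*long_hyp, length_penalty=penalty)
    winner = "short" if short_score > long_score else "long"
    print(f"length_penalty={penalty}: short={short_score:.3f} long={long_score:.3f} -> {winner}")
```

Because log-probabilities are negative, dividing by a larger power of the length shrinks the penalty on long sequences, so values above 1.0 favour longer outputs and values below 1.0 favour shorter ones.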

supported_backbones = []¶: supported transformer backbones

TextSummarizer¶

class arcgis.learn.text.TextSummarizer(backbone=None)¶

Creates a TextSummarizer Object. Based on the Hugging Face transformers library

Argument	Description
backbone	Optional string. Specify the HuggingFace transformer model name which will be used to summarize the text. To learn more about the available models for summarization task, kindly visit:- https://huggingface.co/models?filter=summarization

Returns: TextSummarizer Object

summarize(text_or_list, **kwargs)¶

Summarize the given text or list of text

Argument	Description
text_or_list	Required string or list. A text/passage or a list of texts/passages to generate the summary for.

kwargs

Argument	Description
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to the min_length parameter of the model config.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to the max_length parameter of the model config.
num_return_sequences	Optional integer. The number of independently computed returned sequences for each element in the batch. Default value is set to 1.
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
length_penalty	Optional float. Exponential penalty to the length. 1.0 means no penalty. Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in order to encourage the model to produce longer sequences. Default value is set to 1.0.
early_stopping	Optional bool. Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not. Default value is set to False.

Returns: a list or a list of lists containing the summary/summaries for the input prompt(s) / sentence(s)

supported_backbones = []¶: supported transformer backbones

TextTranslator¶

class arcgis.learn.text.TextTranslator(source_language='es', target_language='en')¶

Creates a TextTranslator Object. Based on the Hugging Face transformers library. To learn more about the available models for the translation task, visit https://huggingface.co/models?filter=translation

Argument	Description
source_language	Optional string. The language of the text to be translated. Default value is ‘es’ (Spanish).
target_language	Optional string. The language into which one wishes to translate the input text. Default value is ‘en’ (English).

Returns: TextTranslator Object

supported_backbones = ['MarianMT']¶: supported transformer backbones

translate(text_or_list, **kwargs)¶

Translate the given text or list of text into the target language

Argument	Description
text_or_list	Required string or list. A text/passage or a list of texts/passages to translate.

kwargs

Argument	Description
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to the min_length parameter of the model config.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to the max_length parameter of the model config.
num_return_sequences	Optional integer. The number of independently computed returned sequences for each element in the batch. Default value is set to 1.
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
length_penalty	Optional float. Exponential penalty to the length. 1.0 means no penalty. Set to values < 1.0 in order to encourage the model to generate shorter sequences, to a value > 1.0 in order to encourage the model to produce longer sequences. Default value is set to 1.0.
early_stopping	Optional bool. Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not. Default value is set to False.

Returns: a list or a list of lists containing the translation of the input prompt(s) / sentence(s) to the target language

ZeroShotClassifier¶

class arcgis.learn.text.ZeroShotClassifier(backbone=None)¶

Creates a ZeroShotClassifier Object. Based on the Hugging Face transformers library

Argument	Description
backbone	Optional string. Specifying the HuggingFace transformer model name which will be used to predict the answers from a given passage/context. To learn more about the available models for zero-shot-classification task, kindly visit:- https://huggingface.co/models?search=nli

Returns: ZeroShotClassifier Object

predict(text_or_list, candidate_labels, **kwargs)¶

Predicts the class label(s) for the input text

Argument	Description
text_or_list	Required string or list. The sequence or a list of sequences to classify.
candidate_labels	Required string or list. The set of possible class labels to classify each sequence into. Can be a single label, a string of comma-separated labels, or a list of labels.

kwargs

Argument	Description
multi_class	Optional boolean. Whether or not multiple candidate labels can be true. Default value is set to False.
hypothesis	Optional string. The template used to turn each label into an NLI-style hypothesis. This template must include a {} or similar syntax for the candidate label to be inserted into the template. Default value is set to “This example is {}.”.

Returns

a list of dict: Each result comes as a dictionary with the following keys:

sequence (str) – The sequence for which this is the output.
labels (List[str]) – The labels sorted by order of likelihood.
scores (List[float]) – The probabilities for each of the labels.
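
How `candidate_labels` and the `hypothesis` template combine can be sketched in plain Python. The helpers `normalize_labels` and `build_hypotheses` are hypothetical and only illustrate the documented input forms (a single label, a comma-separated string, or a list) and the `{}` placeholder; they are not part of the library.

```python
def normalize_labels(candidate_labels):
    """Accept a single label, a comma-separated string, or a list of labels."""
    if isinstance(candidate_labels, str):
        return [label.strip() for label in candidate_labels.split(",") if label.strip()]
    return list(candidate_labels)


def build_hypotheses(candidate_labels, hypothesis="This example is {}."):
    """Turn each candidate label into an NLI-style hypothesis sentence."""
    if "{}" not in hypothesis:
        raise ValueError("hypothesis template must contain a {} placeholder")
    return [hypothesis.format(label) for label in normalize_labels(candidate_labels)]


for sentence in build_hypotheses("flood, fire"):
    print(sentence)
```

A domain-specific template such as "This report is about {}." may read more naturally for some datasets than the default; that choice is left to the caller.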

supported_backbones = []¶: supported transformer backbones