amazonka-sagemaker-2.0: Amazon SageMaker Service SDK.
Copyright: (c) 2013-2023 Brendan Hay
License: Mozilla Public License, v. 2.0.
Maintainer: Brendan Hay
Stability: auto-generated
Portability: non-portable (GHC extensions)
Safe Haskell: Safe-Inferred
Language: Haskell2010

Amazonka.SageMaker.CreateLabelingJob

Description

Creates a job that uses workers to label the data objects in your input dataset. You can use the labeled data to train machine learning models.

You can select your workforce from one of three providers:

  • A private workforce that you create. It can include employees, contractors, and outside experts. Use a private workforce when you want the data to stay within your organization or when a specific set of skills is required.
  • One or more vendors that you select from the Amazon Web Services Marketplace. Vendors provide expertise in specific areas.
  • The Amazon Mechanical Turk workforce. This is the largest workforce, but it should only be used for public data or data that has been stripped of any personally identifiable information.

You can also use automated data labeling to reduce the number of data objects that need to be labeled by a human. Automated data labeling uses active learning to determine if a data object can be labeled by machine or if it needs to be sent to a human worker. For more information, see Using Automated Data Labeling.

The data objects to be labeled are contained in an Amazon S3 bucket. You create a manifest file that describes the location of each object. For more information, see Using Input and Output Data.

The output can be used as the manifest file for another labeling job or as training data for your machine learning models.

You can use this operation to create a static labeling job or a streaming labeling job. A static labeling job stops once all data objects in the input manifest file identified in ManifestS3Uri have been labeled. A streaming labeling job runs until it is manually stopped or remains idle for 10 days. You can send new data objects to an active (InProgress) streaming labeling job in real time. To learn how to create a static labeling job, see Create a Labeling Job (API) in the Amazon SageMaker Developer Guide. To learn how to create a streaming labeling job, see Create a Streaming Labeling Job.

Creating a Request

data CreateLabelingJob Source #

See: newCreateLabelingJob smart constructor.

Constructors

CreateLabelingJob' 

Fields

  • labelCategoryConfigS3Uri :: Maybe Text

    The S3 URI of the file, referred to as a label category configuration file, that defines the categories used to label the data objects.

    For 3D point cloud and video frame task types, you can add label category attributes and frame attributes to your label category configuration file. To learn how, see Create a Labeling Category Configuration File for 3D Point Cloud Labeling Jobs.

    For named entity recognition jobs, in addition to "labels", you must provide worker instructions in the label category configuration file using the "instructions" parameter: "instructions": {"shortInstruction":"<h1>Add header</h1><p>Add Instructions</p>", "fullInstruction":"<p>Add additional instructions.</p>"}. For details and an example, see Create a Named Entity Recognition Labeling Job (API).

    For all other built-in task types and custom tasks, your label category configuration file must be a JSON file in the following format. Identify the labels you want to use by replacing label_1, label_2,...,label_n with your label categories.

    {
    "document-version": "2018-11-28",
    "labels": [{"label": "label_1"},{"label": "label_2"},...{"label": "label_n"}]
    }

    Note the following about the label category configuration file:

    • For image classification and text classification (single and multi-label) you must specify at least two label categories. For all other task types, the minimum number of label categories required is one.
    • Each label category must be unique; you cannot specify duplicate label categories.
    • If you create a 3D point cloud or video frame adjustment or verification labeling job, you must include auditLabelAttributeName in the label category configuration. Use this parameter to enter the LabelAttributeName of the labeling job whose annotations you want to adjust or verify.
  • labelingJobAlgorithmsConfig :: Maybe LabelingJobAlgorithmsConfig

    Configures the information required to perform automated data labeling.

  • stoppingConditions :: Maybe LabelingJobStoppingConditions

    A set of conditions for stopping the labeling job. If any of the conditions are met, the job is automatically stopped. You can use these conditions to control the cost of data labeling.

  • tags :: Maybe [Tag]

    An array of key/value pairs. For more information, see Using Cost Allocation Tags in the Amazon Web Services Billing and Cost Management User Guide.

  • labelingJobName :: Text

    The name of the labeling job. This name is used to identify the job in a list of labeling jobs. Labeling job names must be unique within an Amazon Web Services account and region. LabelingJobName is not case sensitive. For example, Example-job and example-job are considered the same labeling job name by Ground Truth.

  • labelAttributeName :: Text

    The attribute name to use for the label in the output manifest file. This is the key for the key/value pair formed with the label that a worker assigns to the object. The LabelAttributeName must meet the following requirements.

    • The name can't end with "-metadata".
    • If you are using one of the following built-in task types, the attribute name must end with "-ref". If the task type you are using is not listed below, the attribute name must not end with "-ref".

      • Image semantic segmentation (SemanticSegmentation), and adjustment (AdjustmentSemanticSegmentation) and verification (VerificationSemanticSegmentation) labeling jobs for this task type.
      • Video frame object detection (VideoObjectDetection), and adjustment and verification (AdjustmentVideoObjectDetection) labeling jobs for this task type.
      • Video frame object tracking (VideoObjectTracking), and adjustment and verification (AdjustmentVideoObjectTracking) labeling jobs for this task type.
      • 3D point cloud semantic segmentation (3DPointCloudSemanticSegmentation), and adjustment and verification (Adjustment3DPointCloudSemanticSegmentation) labeling jobs for this task type.
      • 3D point cloud object tracking (3DPointCloudObjectTracking), and adjustment and verification (Adjustment3DPointCloudObjectTracking) labeling jobs for this task type.

    If you are creating an adjustment or verification labeling job, you must use a different LabelAttributeName than the one used in the original labeling job. The original labeling job is the Ground Truth labeling job that produced the labels that you want verified or adjusted. To learn more about adjustment and verification labeling jobs, see Verify and Adjust Labels.

  • inputConfig :: LabelingJobInputConfig

    Input data for the labeling job, such as the Amazon S3 location of the data objects and the location of the manifest file that describes the data objects.

    You must specify at least one of the following: S3DataSource or SnsDataSource.

    • Use SnsDataSource to specify an SNS input topic for a streaming labeling job. If you do not specify an SNS input topic ARN, Ground Truth will create a one-time labeling job that stops after all data objects in the input manifest file have been labeled.
    • Use S3DataSource to specify an input manifest file for both streaming and one-time labeling jobs. Adding an S3DataSource is optional if you use SnsDataSource to create a streaming labeling job.

    If you use the Amazon Mechanical Turk workforce, your input data should not include confidential information, personal information or protected health information. Use ContentClassifiers to specify that your data is free of personally identifiable information and adult content.

  • outputConfig :: LabelingJobOutputConfig

    The location of the output data and the Amazon Web Services Key Management Service key ID for the key used to encrypt the output data, if any.

  • roleArn :: Text

    The Amazon Resource Name (ARN) of the role that Amazon SageMaker assumes to perform tasks on your behalf during data labeling. You must grant this role the necessary permissions so that Amazon SageMaker can successfully complete data labeling.

  • humanTaskConfig :: HumanTaskConfig

    Configures the labeling task and how it is presented to workers, including, but not limited to, price, keywords, and batch size (task count).

Instances

All instances are defined in Amazonka.SageMaker.CreateLabelingJob.

  • ToJSON CreateLabelingJob
  • ToHeaders CreateLabelingJob
  • ToPath CreateLabelingJob
  • ToQuery CreateLabelingJob
  • AWSRequest CreateLabelingJob

    Associated Types: type AWSResponse CreateLabelingJob

  • Generic CreateLabelingJob

    Associated Types: type Rep CreateLabelingJob :: Type -> Type

  • Read CreateLabelingJob
  • Show CreateLabelingJob
  • NFData CreateLabelingJob

    Methods: rnf :: CreateLabelingJob -> ()

  • Eq CreateLabelingJob
  • Hashable CreateLabelingJob
  • type AWSResponse CreateLabelingJob
  • type Rep CreateLabelingJob

newCreateLabelingJob Source #

Create a value of CreateLabelingJob with all optional fields omitted.

Use generic-lens or optics to modify other optional fields.

The following record fields are available, with the corresponding lenses provided for backwards compatibility:

$sel:labelCategoryConfigS3Uri:CreateLabelingJob', createLabelingJob_labelCategoryConfigS3Uri - The S3 URI of the file, referred to as a label category configuration file, that defines the categories used to label the data objects.

For 3D point cloud and video frame task types, you can add label category attributes and frame attributes to your label category configuration file. To learn how, see Create a Labeling Category Configuration File for 3D Point Cloud Labeling Jobs.

For named entity recognition jobs, in addition to "labels", you must provide worker instructions in the label category configuration file using the "instructions" parameter: "instructions": {"shortInstruction":"<h1>Add header</h1><p>Add Instructions</p>", "fullInstruction":"<p>Add additional instructions.</p>"}. For details and an example, see Create a Named Entity Recognition Labeling Job (API).

For all other built-in task types and custom tasks, your label category configuration file must be a JSON file in the following format. Identify the labels you want to use by replacing label_1, label_2,...,label_n with your label categories.

{
"document-version": "2018-11-28",
"labels": [{"label": "label_1"},{"label": "label_2"},...{"label": "label_n"}]
}

Note the following about the label category configuration file:

  • For image classification and text classification (single and multi-label) you must specify at least two label categories. For all other task types, the minimum number of label categories required is one.
  • Each label category must be unique; you cannot specify duplicate label categories.
  • If you create a 3D point cloud or video frame adjustment or verification labeling job, you must include auditLabelAttributeName in the label category configuration. Use this parameter to enter the LabelAttributeName of the labeling job whose annotations you want to adjust or verify.

$sel:labelingJobAlgorithmsConfig:CreateLabelingJob', createLabelingJob_labelingJobAlgorithmsConfig - Configures the information required to perform automated data labeling.

$sel:stoppingConditions:CreateLabelingJob', createLabelingJob_stoppingConditions - A set of conditions for stopping the labeling job. If any of the conditions are met, the job is automatically stopped. You can use these conditions to control the cost of data labeling.

CreateLabelingJob, createLabelingJob_tags - An array of key/value pairs. For more information, see Using Cost Allocation Tags in the Amazon Web Services Billing and Cost Management User Guide.

CreateLabelingJob, createLabelingJob_labelingJobName - The name of the labeling job. This name is used to identify the job in a list of labeling jobs. Labeling job names must be unique within an Amazon Web Services account and region. LabelingJobName is not case sensitive. For example, Example-job and example-job are considered the same labeling job name by Ground Truth.

$sel:labelAttributeName:CreateLabelingJob', createLabelingJob_labelAttributeName - The attribute name to use for the label in the output manifest file. This is the key for the key/value pair formed with the label that a worker assigns to the object. The LabelAttributeName must meet the following requirements.

  • The name can't end with "-metadata".
  • If you are using one of the following built-in task types, the attribute name must end with "-ref". If the task type you are using is not listed below, the attribute name must not end with "-ref".

    • Image semantic segmentation (SemanticSegmentation), and adjustment (AdjustmentSemanticSegmentation) and verification (VerificationSemanticSegmentation) labeling jobs for this task type.
    • Video frame object detection (VideoObjectDetection), and adjustment and verification (AdjustmentVideoObjectDetection) labeling jobs for this task type.
    • Video frame object tracking (VideoObjectTracking), and adjustment and verification (AdjustmentVideoObjectTracking) labeling jobs for this task type.
    • 3D point cloud semantic segmentation (3DPointCloudSemanticSegmentation), and adjustment and verification (Adjustment3DPointCloudSemanticSegmentation) labeling jobs for this task type.
    • 3D point cloud object tracking (3DPointCloudObjectTracking), and adjustment and verification (Adjustment3DPointCloudObjectTracking) labeling jobs for this task type.

If you are creating an adjustment or verification labeling job, you must use a different LabelAttributeName than the one used in the original labeling job. The original labeling job is the Ground Truth labeling job that produced the labels that you want verified or adjusted. To learn more about adjustment and verification labeling jobs, see Verify and Adjust Labels.

CreateLabelingJob, createLabelingJob_inputConfig - Input data for the labeling job, such as the Amazon S3 location of the data objects and the location of the manifest file that describes the data objects.

You must specify at least one of the following: S3DataSource or SnsDataSource.

  • Use SnsDataSource to specify an SNS input topic for a streaming labeling job. If you do not specify an SNS input topic ARN, Ground Truth will create a one-time labeling job that stops after all data objects in the input manifest file have been labeled.
  • Use S3DataSource to specify an input manifest file for both streaming and one-time labeling jobs. Adding an S3DataSource is optional if you use SnsDataSource to create a streaming labeling job.

If you use the Amazon Mechanical Turk workforce, your input data should not include confidential information, personal information or protected health information. Use ContentClassifiers to specify that your data is free of personally identifiable information and adult content.

CreateLabelingJob, createLabelingJob_outputConfig - The location of the output data and the Amazon Web Services Key Management Service key ID for the key used to encrypt the output data, if any.

CreateLabelingJob, createLabelingJob_roleArn - The Amazon Resource Name (ARN) of the role that Amazon SageMaker assumes to perform tasks on your behalf during data labeling. You must grant this role the necessary permissions so that Amazon SageMaker can successfully complete data labeling.

$sel:humanTaskConfig:CreateLabelingJob', createLabelingJob_humanTaskConfig - Configures the labeling task and how it is presented to workers, including, but not limited to, price, keywords, and batch size (task count).

Request Lenses

createLabelingJob_labelCategoryConfigS3Uri :: Lens' CreateLabelingJob (Maybe Text) Source #

The S3 URI of the file, referred to as a label category configuration file, that defines the categories used to label the data objects.

For 3D point cloud and video frame task types, you can add label category attributes and frame attributes to your label category configuration file. To learn how, see Create a Labeling Category Configuration File for 3D Point Cloud Labeling Jobs.

For named entity recognition jobs, in addition to "labels", you must provide worker instructions in the label category configuration file using the "instructions" parameter: "instructions": {"shortInstruction":"<h1>Add header</h1><p>Add Instructions</p>", "fullInstruction":"<p>Add additional instructions.</p>"}. For details and an example, see Create a Named Entity Recognition Labeling Job (API).

For all other built-in task types and custom tasks, your label category configuration file must be a JSON file in the following format. Identify the labels you want to use by replacing label_1, label_2,...,label_n with your label categories.

{
"document-version": "2018-11-28",
"labels": [{"label": "label_1"},{"label": "label_2"},...{"label": "label_n"}]
}

Note the following about the label category configuration file:

  • For image classification and text classification (single and multi-label) you must specify at least two label categories. For all other task types, the minimum number of label categories required is one.
  • Each label category must be unique; you cannot specify duplicate label categories.
  • If you create a 3D point cloud or video frame adjustment or verification labeling job, you must include auditLabelAttributeName in the label category configuration. Use this parameter to enter the LabelAttributeName of the labeling job whose annotations you want to adjust or verify.
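
The rules above for the plain "labels" format can be checked before the file is uploaded. The following is a minimal sketch that uses only base; the names TaskKind, validateLabels, and renderConfig are hypothetical helpers and are not part of amazonka. It covers only the generic JSON format shown above, not the extra "instructions" or attribute fields, and it quotes labels with show, which is only adequate for plain ASCII label text.

```haskell
import Data.List (intercalate, nub)

-- | Hypothetical split of task types: image/text classification needs
-- at least two label categories, every other task type needs at least one.
data TaskKind = Classification | OtherTask
  deriving (Eq, Show)

-- | Check a label list against the uniqueness and minimum-count rules.
validateLabels :: TaskKind -> [String] -> Either String [String]
validateLabels kind labels
  | nub labels /= labels = Left "duplicate label categories"
  | length labels < minimumCount =
      Left ("need at least " ++ show minimumCount ++ " label categories")
  | otherwise = Right labels
  where
    minimumCount = if kind == Classification then 2 else 1

-- | Render the configuration document in the format shown above.
-- Assumption: labels are plain ASCII, so 'show' yields valid JSON strings.
renderConfig :: [String] -> String
renderConfig labels =
  "{\"document-version\": \"2018-11-28\", \"labels\": ["
    ++ intercalate "," ["{\"label\": " ++ show l ++ "}" | l <- labels]
    ++ "]}"

main :: IO ()
main = do
  -- Rejected: classification needs two categories.
  print (validateLabels Classification ["cat"])
  -- Accepted: prints the rendered JSON document.
  putStrLn (either id renderConfig (validateLabels Classification ["cat", "dog"]))
```

The rendered string would still need to be written to S3 by other means; its URI is what this lens expects.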

createLabelingJob_labelingJobAlgorithmsConfig :: Lens' CreateLabelingJob (Maybe LabelingJobAlgorithmsConfig) Source #

Configures the information required to perform automated data labeling.

createLabelingJob_stoppingConditions :: Lens' CreateLabelingJob (Maybe LabelingJobStoppingConditions) Source #

A set of conditions for stopping the labeling job. If any of the conditions are met, the job is automatically stopped. You can use these conditions to control the cost of data labeling.

createLabelingJob_tags :: Lens' CreateLabelingJob (Maybe [Tag]) Source #

An array of key/value pairs. For more information, see Using Cost Allocation Tags in the Amazon Web Services Billing and Cost Management User Guide.

createLabelingJob_labelingJobName :: Lens' CreateLabelingJob Text Source #

The name of the labeling job. This name is used to identify the job in a list of labeling jobs. Labeling job names must be unique within an Amazon Web Services account and region. LabelingJobName is not case sensitive. For example, Example-job and example-job are considered the same labeling job name by Ground Truth.
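
Because names are compared without regard to case, a client-side check for clashes should normalise case first. A small sketch, assuming that simple lowercasing with Data.Char.toLower approximates the service's comparison; sameJobName and clashesWith are hypothetical helpers, not amazonka functions.

```haskell
import Data.Char (toLower)

-- | Compare two labeling job names the way the text describes:
-- case is ignored. Assumption: plain lowercasing is a close enough model.
sameJobName :: String -> String -> Bool
sameJobName a b = map toLower a == map toLower b

-- | True when the candidate name collides with any existing job name.
clashesWith :: [String] -> String -> Bool
clashesWith existing candidate = any (sameJobName candidate) existing

main :: IO ()
main = do
  print (sameJobName "Example-job" "example-job")
  print (clashesWith ["Example-job", "other-job"] "new-job")
```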

createLabelingJob_labelAttributeName :: Lens' CreateLabelingJob Text Source #

The attribute name to use for the label in the output manifest file. This is the key for the key/value pair formed with the label that a worker assigns to the object. The LabelAttributeName must meet the following requirements.

  • The name can't end with "-metadata".
  • If you are using one of the following built-in task types, the attribute name must end with "-ref". If the task type you are using is not listed below, the attribute name must not end with "-ref".

    • Image semantic segmentation (SemanticSegmentation), and adjustment (AdjustmentSemanticSegmentation) and verification (VerificationSemanticSegmentation) labeling jobs for this task type.
    • Video frame object detection (VideoObjectDetection), and adjustment and verification (AdjustmentVideoObjectDetection) labeling jobs for this task type.
    • Video frame object tracking (VideoObjectTracking), and adjustment and verification (AdjustmentVideoObjectTracking) labeling jobs for this task type.
    • 3D point cloud semantic segmentation (3DPointCloudSemanticSegmentation), and adjustment and verification (Adjustment3DPointCloudSemanticSegmentation) labeling jobs for this task type.
    • 3D point cloud object tracking (3DPointCloudObjectTracking), and adjustment and verification (Adjustment3DPointCloudObjectTracking) labeling jobs for this task type.

If you are creating an adjustment or verification labeling job, you must use a different LabelAttributeName than the one used in the original labeling job. The original labeling job is the Ground Truth labeling job that produced the labels that you want verified or adjusted. To learn more about adjustment and verification labeling jobs, see Verify and Adjust Labels.
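
The suffix rules above can be sketched as a small validator. This uses only base; refTaskTypes and checkAttributeName are hypothetical names, the task type is passed as a plain string, and "BoundingBox" in the demo is only a stand-in for a task type that is not in the list. The requirement that adjustment and verification jobs use a different name from the original job is not covered here.

```haskell
import Data.List (isSuffixOf)

-- | Task type identifiers listed above whose attribute name must end with "-ref".
refTaskTypes :: [String]
refTaskTypes =
  [ "SemanticSegmentation"
  , "AdjustmentSemanticSegmentation"
  , "VerificationSemanticSegmentation"
  , "VideoObjectDetection"
  , "AdjustmentVideoObjectDetection"
  , "VideoObjectTracking"
  , "AdjustmentVideoObjectTracking"
  , "3DPointCloudSemanticSegmentation"
  , "Adjustment3DPointCloudSemanticSegmentation"
  , "3DPointCloudObjectTracking"
  , "Adjustment3DPointCloudObjectTracking"
  ]

-- | Check a label attribute name against the two suffix rules.
checkAttributeName :: String -> String -> Either String String
checkAttributeName taskType name
  | "-metadata" `isSuffixOf` name = Left "name must not end with -metadata"
  | needsRef && not hasRef = Left "name must end with -ref for this task type"
  | not needsRef && hasRef = Left "name must not end with -ref for this task type"
  | otherwise = Right name
  where
    needsRef = taskType `elem` refTaskTypes
    hasRef = "-ref" `isSuffixOf` name

main :: IO ()
main = do
  print (checkAttributeName "SemanticSegmentation" "mask-ref")
  print (checkAttributeName "BoundingBox" "box-ref")
```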

createLabelingJob_inputConfig :: Lens' CreateLabelingJob LabelingJobInputConfig Source #

Input data for the labeling job, such as the Amazon S3 location of the data objects and the location of the manifest file that describes the data objects.

You must specify at least one of the following: S3DataSource or SnsDataSource.

  • Use SnsDataSource to specify an SNS input topic for a streaming labeling job. If you do not specify an SNS input topic ARN, Ground Truth will create a one-time labeling job that stops after all data objects in the input manifest file have been labeled.
  • Use S3DataSource to specify an input manifest file for both streaming and one-time labeling jobs. Adding an S3DataSource is optional if you use SnsDataSource to create a streaming labeling job.

If you use the Amazon Mechanical Turk workforce, your input data should not include confidential information, personal information or protected health information. Use ContentClassifiers to specify that your data is free of personally identifiable information and adult content.
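
The data source rules above amount to a small decision table: no source is an error, an SNS topic means a streaming job, and a manifest alone means a one-time job. A sketch with stand-in types follows; DataSource, JobKind, and classify are hypothetical and do not mirror the real LabelingJobInputConfig record, and the URIs and ARNs in the demo are placeholders.

```haskell
-- | The two kinds of job the text distinguishes.
data JobKind = OneTime | Streaming
  deriving (Eq, Show)

-- | Stand-in for the data source part of the input configuration:
-- an optional manifest S3 URI and an optional SNS input topic ARN.
data DataSource = DataSource
  { s3ManifestUri :: Maybe String
  , snsTopicArn :: Maybe String
  }

-- | Decide which kind of job the given sources would produce.
classify :: DataSource -> Either String JobKind
classify (DataSource Nothing Nothing) =
  Left "specify at least one of S3DataSource or SnsDataSource"
classify (DataSource _ (Just _)) = Right Streaming -- S3 manifest is optional here
classify (DataSource (Just _) Nothing) = Right OneTime

main :: IO ()
main = do
  print (classify (DataSource (Just "s3://example-bucket/input.manifest") Nothing))
  print (classify (DataSource Nothing (Just "arn:aws:sns:us-east-1:111122223333:example-topic")))
```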

createLabelingJob_outputConfig :: Lens' CreateLabelingJob LabelingJobOutputConfig Source #

The location of the output data and the Amazon Web Services Key Management Service key ID for the key used to encrypt the output data, if any.

createLabelingJob_roleArn :: Lens' CreateLabelingJob Text Source #

The Amazon Resource Name (ARN) of the role that Amazon SageMaker assumes to perform tasks on your behalf during data labeling. You must grant this role the necessary permissions so that Amazon SageMaker can successfully complete data labeling.

createLabelingJob_humanTaskConfig :: Lens' CreateLabelingJob HumanTaskConfig Source #

Configures the labeling task and how it is presented to workers, including, but not limited to, price, keywords, and batch size (task count).
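
To illustrate how lenses of the shape Lens' s a are typically used to fill in optional fields after a smart constructor, here is a self-contained sketch with a stand-in record. It assumes the usual van Laarhoven encoding and hand-writes view and set so that it needs only base; Job, newJob, job_name, and job_categoryConfigUri are hypothetical and merely imitate the naming pattern above. With the real library you would normally import these operations from a lens package instead.

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- | Stand-in record with one required and one optional field.
data Job = Job
  { jobName :: String
  , categoryConfigUri :: Maybe String
  }
  deriving (Eq, Show)

-- | Smart constructor: required fields as arguments, optional fields omitted.
newJob :: String -> Job
newJob name = Job name Nothing

-- | Lens onto the required name field (van Laarhoven encoding).
job_name :: Functor f => (String -> f String) -> Job -> f Job
job_name f j = fmap (\n -> j {jobName = n}) (f (jobName j))

-- | Lens onto the optional configuration URI.
job_categoryConfigUri :: Functor f => (Maybe String -> f (Maybe String)) -> Job -> f Job
job_categoryConfigUri f j =
  fmap (\u -> j {categoryConfigUri = u}) (f (categoryConfigUri j))

-- | Read the focused value by instantiating the functor to Const.
view :: ((a -> Const a a) -> s -> Const a s) -> s -> a
view l = getConst . l Const

-- | Replace the focused value by instantiating the functor to Identity.
set :: ((a -> Identity a) -> s -> Identity s) -> a -> s -> s
set l a = runIdentity . l (const (Identity a))

main :: IO ()
main = do
  let job = set job_categoryConfigUri (Just "s3://example-bucket/labels.json") (newJob "example-job")
  print (view job_name job)
  print (view job_categoryConfigUri job)
```

Starting from the constructor and layering optional fields with set keeps the required arguments explicit while leaving everything else at its default.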

Destructuring the Response

data CreateLabelingJobResponse Source #

See: newCreateLabelingJobResponse smart constructor.

Constructors

CreateLabelingJobResponse' 

Fields

  • httpStatus :: Int

    The response's HTTP status code.

  • labelingJobArn :: Text

    The Amazon Resource Name (ARN) of the labeling job. You use this ARN to identify the labeling job.

Instances

All instances are defined in Amazonka.SageMaker.CreateLabelingJob.

  • Generic CreateLabelingJobResponse

    Associated Types: type Rep CreateLabelingJobResponse :: Type -> Type

  • Read CreateLabelingJobResponse
  • Show CreateLabelingJobResponse
  • NFData CreateLabelingJobResponse
  • Eq CreateLabelingJobResponse
  • type Rep CreateLabelingJobResponse

type Rep CreateLabelingJobResponse = D1 ('MetaData "CreateLabelingJobResponse" "Amazonka.SageMaker.CreateLabelingJob" "amazonka-sagemaker-2.0-9SyrKZ4KqhsL1qX9u3ILA3" 'False) (C1 ('MetaCons "CreateLabelingJobResponse'" 'PrefixI 'True) (S1 ('MetaSel ('Just "httpStatus") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Int) :*: S1 ('MetaSel ('Just "labelingJobArn") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Text)))

newCreateLabelingJobResponse Source #

Create a value of CreateLabelingJobResponse with all optional fields omitted.

Use generic-lens or optics to modify other optional fields.

The following record fields are available, with the corresponding lenses provided for backwards compatibility:

$sel:httpStatus:CreateLabelingJobResponse', createLabelingJobResponse_httpStatus - The response's HTTP status code.

CreateLabelingJobResponse, createLabelingJobResponse_labelingJobArn - The Amazon Resource Name (ARN) of the labeling job. You use this ARN to identify the labeling job.

Response Lenses

createLabelingJobResponse_labelingJobArn :: Lens' CreateLabelingJobResponse Text Source #

The Amazon Resource Name (ARN) of the labeling job. You use this ARN to identify the labeling job.