| Copyright | (c) 2013-2023 Brendan Hay | 
|---|---|
| License | Mozilla Public License, v. 2.0. | 
| Maintainer | Brendan Hay | 
| Stability | auto-generated | 
| Portability | non-portable (GHC extensions) | 
| Safe Haskell | Safe-Inferred | 
| Language | Haskell2010 | 
Amazonka.SageMaker.Types.TransformInput
Description
Synopsis
- data TransformInput = TransformInput' {}
- newTransformInput :: TransformDataSource -> TransformInput
- transformInput_compressionType :: Lens' TransformInput (Maybe CompressionType)
- transformInput_contentType :: Lens' TransformInput (Maybe Text)
- transformInput_splitType :: Lens' TransformInput (Maybe SplitType)
- transformInput_dataSource :: Lens' TransformInput TransformDataSource
Documentation
data TransformInput
Describes the input source of a transform job and the way the transform job consumes it.
See: newTransformInput smart constructor.
Constructors

TransformInput'

Fields:

- compressionType :: Maybe CompressionType
- contentType :: Maybe Text
- splitType :: Maybe SplitType
- dataSource :: TransformDataSource
newTransformInput :: TransformDataSource -> TransformInput

Create a value of TransformInput with all optional fields omitted.
Use generic-lens or optics to modify other optional fields.
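Below is a minimal construction sketch. It assumes the `(&)` and `(?~)` combinators from the lens package, the `newTransformDataSource` and `newTransformS3DataSource` smart constructors from this library, and a placeholder bucket URI:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.SageMaker.Types
import Control.Lens ((&), (?~))

-- Build a TransformInput from an S3 prefix, then fill in the optional
-- fields through the generated lenses. Only the data source is required;
-- every other field defaults to Nothing.
exampleInput :: TransformInput
exampleInput =
  newTransformInput
    (newTransformDataSource
       (newTransformS3DataSource S3DataType_S3Prefix "s3://my-bucket/batch-input/"))
    & transformInput_contentType ?~ "text/csv"
    & transformInput_compressionType ?~ CompressionType_Gzip
    & transformInput_splitType ?~ SplitType_Line
```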
The following record fields are available, with the corresponding lenses provided for backwards compatibility:
$sel:compressionType:TransformInput', transformInput_compressionType - If your transform data is compressed, specify the compression type.
 Amazon SageMaker automatically decompresses the data for the transform
 job accordingly. The default value is None.
$sel:contentType:TransformInput', transformInput_contentType - The Multipurpose Internet Mail Extensions (MIME) type of the data.
 Amazon SageMaker uses the MIME type with each HTTP call to transfer data
 to the transform job.
$sel:splitType:TransformInput', transformInput_splitType - The method to use to split the transform job's data files into smaller
 batches. Splitting is necessary when the total size of each object is
 too large to fit in a single request. You can also use data splitting to
 improve performance by processing multiple concurrent mini-batches. The
 default value for SplitType is None, which indicates that input data
 files are not split, and request payloads contain the entire contents of
 an input object. Set the value of this parameter to Line to split
 records on a newline character boundary. SplitType also supports a
 number of record-oriented binary data formats. Currently, the supported
 record formats are:
- RecordIO
- TFRecord
When splitting is enabled, the size of a mini-batch depends on the
 values of the BatchStrategy and MaxPayloadInMB parameters. When the
 value of BatchStrategy is MultiRecord, Amazon SageMaker sends the
 maximum number of records in each request, up to the MaxPayloadInMB
 limit. If the value of BatchStrategy is SingleRecord, Amazon
 SageMaker sends individual records in each request.
Some data formats represent a record as a binary payload wrapped with
 extra padding bytes. When splitting is applied to a binary data format,
 padding is removed if the value of BatchStrategy is set to
 SingleRecord. Padding is not removed if the value of BatchStrategy
 is set to MultiRecord.
For more information about RecordIO, see
 Create a Dataset Using RecordIO
 in the MXNet documentation. For more information about TFRecord, see
 Consuming TFRecord data
 in the TensorFlow documentation.
$sel:dataSource:TransformInput', transformInput_dataSource - Describes the location of the channel data, that is, the S3 location of
 the input data that the model can consume.
transformInput_compressionType :: Lens' TransformInput (Maybe CompressionType)
If your transform data is compressed, specify the compression type.
 Amazon SageMaker automatically decompresses the data for the transform
 job accordingly. The default value is None.
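For example, clearing the field restores the default behaviour. A sketch, reusing `exampleInput` from the `newTransformInput` section above:

```haskell
import Control.Lens ((&), (.~))

-- Setting the field back to Nothing means the input is treated as
-- uncompressed (the None default).
rawInput :: TransformInput
rawInput = exampleInput & transformInput_compressionType .~ Nothing
```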
transformInput_contentType :: Lens' TransformInput (Maybe Text)
The Multipurpose Internet Mail Extensions (MIME) type of the data. Amazon SageMaker uses the MIME type with each HTTP call to transfer data to the transform job.
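A small sketch of reading the field back through the lens, reusing `exampleInput` from above:

```haskell
import Control.Lens ((^.))
import Data.Text (Text)

-- Inspect the MIME type currently set on a TransformInput.
mimeType :: Maybe Text
mimeType = exampleInput ^. transformInput_contentType
```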
transformInput_splitType :: Lens' TransformInput (Maybe SplitType)
The method to use to split the transform job's data files into smaller
 batches. Splitting is necessary when the total size of each object is
 too large to fit in a single request. You can also use data splitting to
 improve performance by processing multiple concurrent mini-batches. The
 default value for SplitType is None, which indicates that input data
 files are not split, and request payloads contain the entire contents of
 an input object. Set the value of this parameter to Line to split
 records on a newline character boundary. SplitType also supports a
 number of record-oriented binary data formats. Currently, the supported
 record formats are:
- RecordIO
- TFRecord
When splitting is enabled, the size of a mini-batch depends on the
 values of the BatchStrategy and MaxPayloadInMB parameters. When the
 value of BatchStrategy is MultiRecord, Amazon SageMaker sends the
 maximum number of records in each request, up to the MaxPayloadInMB
 limit. If the value of BatchStrategy is SingleRecord, Amazon
 SageMaker sends individual records in each request.
Some data formats represent a record as a binary payload wrapped with
 extra padding bytes. When splitting is applied to a binary data format,
 padding is removed if the value of BatchStrategy is set to
 SingleRecord. Padding is not removed if the value of BatchStrategy
 is set to MultiRecord.
For more information about RecordIO, see
 Create a Dataset Using RecordIO
 in the MXNet documentation. For more information about TFRecord, see
 Consuming TFRecord data
 in the TensorFlow documentation.
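For instance, switching a RecordIO-encoded input to record-boundary splitting. A sketch, reusing `exampleInput` from above; the `SplitType_RecordIO` pattern name is assumed from the library's generated enum style:

```haskell
import Control.Lens ((&), (?~))

-- Split on RecordIO record boundaries rather than newlines.
recordIOInput :: TransformInput
recordIOInput = exampleInput & transformInput_splitType ?~ SplitType_RecordIO
```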
transformInput_dataSource :: Lens' TransformInput TransformDataSource
Describes the location of the channel data, that is, the S3 location of the input data that the model can consume.
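A sketch of retargeting the S3 location through composed lenses. The inner lens names (`transformDataSource_s3DataSource`, `transformS3DataSource_s3Uri`) are assumptions based on the library's generated naming scheme:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.SageMaker.Types
import Control.Lens ((&), (.~))

-- Point an existing TransformInput at a different S3 prefix.
retarget :: TransformInput -> TransformInput
retarget input =
  input
    & transformInput_dataSource
    . transformDataSource_s3DataSource
    . transformS3DataSource_s3Uri
    .~ "s3://my-bucket/other-input/"
```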