Safe Haskell | None |
---|---|
Language | Haskell2010 |
Creates a DataSource object. A DataSource references data that can be used to
perform CreateMLModel, CreateEvaluation, or CreateBatchPrediction operations.
CreateDataSourceFromS3 is an asynchronous operation. In response to
CreateDataSourceFromS3, Amazon Machine Learning (Amazon ML) immediately returns
and sets the DataSource status to PENDING. After the DataSource is created and
ready for use, Amazon ML sets the Status parameter to COMPLETED. A DataSource in
COMPLETED or PENDING status can only be used to perform CreateMLModel,
CreateEvaluation, or CreateBatchPrediction operations.
If Amazon ML cannot accept the input source, it sets the Status parameter to
FAILED and includes an error message in the Message attribute of the
GetDataSource operation response.
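The status rules above can be sketched as a small pure model. This is an illustration only, not part of this module: the names Status, parseStatus, acceptsOperations, isTerminal and firstTerminal are hypothetical, and only the three statuses named in the text are modelled (the service may report others).

```haskell
import Data.Char (toUpper)
import Data.List (find)

-- Hypothetical model of the statuses named in the text above.
data Status = Pending | Completed | Failed
  deriving (Eq, Show)

-- Parse the status string as it appears in the text (case-insensitive).
parseStatus :: String -> Maybe Status
parseStatus s = case map toUpper s of
  "PENDING"   -> Just Pending
  "COMPLETED" -> Just Completed
  "FAILED"    -> Just Failed
  _           -> Nothing

-- Per the text, a DataSource in PENDING or COMPLETED status can be used
-- for CreateMLModel, CreateEvaluation, or CreateBatchPrediction.
acceptsOperations :: Status -> Bool
acceptsOperations Pending   = True
acceptsOperations Completed = True
acceptsOperations Failed    = False

-- A poller can stop once the status is no longer PENDING.
isTerminal :: Status -> Bool
isTerminal Pending = False
isTerminal _       = True

-- Given the statuses observed by successive polls, return the first
-- terminal one, if any.
firstTerminal :: [Status] -> Maybe Status
firstTerminal = find isTerminal
```

For example, `firstTerminal [Pending, Pending, Completed]` evaluates to `Just Completed`. When the result is `Just Failed`, the Message attribute of the GetDataSource response is the place to look for the reason.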
The observation data used in a DataSource should be ready to use; that is,
it should have a consistent structure, and missing data values should be kept
to a minimum. The observation data must reside in one or more CSV files in an
Amazon Simple Storage Service (Amazon S3) bucket, along with a schema that
describes the data items by name and type. The same schema must be used for
all of the data files referenced by the DataSource.
After the DataSource has been created, it's ready to use in evaluations and
batch predictions. If you plan to use the DataSource to train an MLModel, the
DataSource requires another item: a recipe. A recipe describes the observation
variables that participate in training an MLModel and how each input variable
will be used in training. Will the variable be included or excluded from
training? Will the variable be manipulated, for example, combined with another
variable, or split apart into word combinations? The recipe provides answers to
these questions. For more information, see the Amazon Machine Learning
Developer Guide.
- data CreateDataSourceFromS3
- createDataSourceFromS3 :: Text -> S3DataSpec -> CreateDataSourceFromS3
- cdsfsComputeStatistics :: Lens' CreateDataSourceFromS3 (Maybe Bool)
- cdsfsDataSourceId :: Lens' CreateDataSourceFromS3 Text
- cdsfsDataSourceName :: Lens' CreateDataSourceFromS3 (Maybe Text)
- cdsfsDataSpec :: Lens' CreateDataSourceFromS3 S3DataSpec
- data CreateDataSourceFromS3Response
- createDataSourceFromS3Response :: CreateDataSourceFromS3Response
- cdsfsrDataSourceId :: Lens' CreateDataSourceFromS3Response (Maybe Text)
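The synopsis above follows a common pattern: required fields go through a smart constructor, and optional fields are set afterwards through lenses. The sketch below illustrates that pattern using only base. Everything in it is a local stand-in, not the library's own definitions: String replaces Text, the one-field S3DataSpec is hypothetical, and view and set are minimal helpers written here for the example.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Van Laarhoven lens type, the same shape as the Lens' in the synopsis.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Hypothetical stand-in; the real S3DataSpec has more fields.
newtype S3DataSpec = S3DataSpec { dataLocationS3 :: String }
  deriving (Eq, Show)

-- Stand-in request record with the four fields listed in the synopsis.
data CreateDataSourceFromS3 = CreateDataSourceFromS3
  { _computeStatistics :: Maybe Bool
  , _dataSourceId      :: String
  , _dataSourceName    :: Maybe String
  , _dataSpec          :: S3DataSpec
  } deriving (Eq, Show)

-- Smart constructor: required fields as arguments, optional ones Nothing.
createDataSourceFromS3 :: String -> S3DataSpec -> CreateDataSourceFromS3
createDataSourceFromS3 i spec = CreateDataSourceFromS3 Nothing i Nothing spec

cdsfsComputeStatistics :: Lens' CreateDataSourceFromS3 (Maybe Bool)
cdsfsComputeStatistics f r =
  fmap (\v -> r { _computeStatistics = v }) (f (_computeStatistics r))

cdsfsDataSourceName :: Lens' CreateDataSourceFromS3 (Maybe String)
cdsfsDataSourceName f r =
  fmap (\v -> r { _dataSourceName = v }) (f (_dataSourceName r))

-- Minimal getter and setter built on the lens type.
view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l v = runIdentity . l (const (Identity v))

-- A request intended for MLModel training, so statistics are enabled.
-- The identifier, name, and S3 path are made-up example values.
trainingRequest :: CreateDataSourceFromS3
trainingRequest =
  set cdsfsComputeStatistics (Just True)
    . set cdsfsDataSourceName (Just "example data source")
    $ createDataSourceFromS3
        "ds-example-001"
        (S3DataSpec "s3://example-bucket/observations.csv")
```

Loaded in GHCi, `view cdsfsComputeStatistics trainingRequest` evaluates to `Just True`. With a lens library in scope, the same request would typically be written with operators such as `&` and `.~` instead of the hand-rolled set.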
Request
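data CreateDataSourceFromS3
Request constructor
createDataSourceFromS3 :: Text -> S3DataSpec -> CreateDataSourceFromS3
CreateDataSourceFromS3 constructor.
The fields accessible through corresponding lenses are:
- cdsfsComputeStatistics :: Maybe Bool
- cdsfsDataSourceId :: Text
- cdsfsDataSourceName :: Maybe Text
- cdsfsDataSpec :: S3DataSpec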
Request lenses
cdsfsComputeStatistics :: Lens' CreateDataSourceFromS3 (Maybe Bool)
The compute statistics for a DataSource. The statistics are generated from
the observation data referenced by a DataSource. Amazon ML uses the
statistics internally during MLModel training. This parameter must be set
to true if the DataSource needs to be used for MLModel training.
cdsfsDataSourceId :: Lens' CreateDataSourceFromS3 Text
A user-supplied identifier that uniquely identifies the DataSource.
cdsfsDataSourceName :: Lens' CreateDataSourceFromS3 (Maybe Text)
A user-supplied name or description of the DataSource.
cdsfsDataSpec :: Lens' CreateDataSourceFromS3 S3DataSpec
The data specification of a DataSource:
- DataLocationS3 - Amazon Simple Storage Service (Amazon S3) location of the observation data.
- DataSchemaLocationS3 - Amazon S3 location of the DataSchema.
- DataSchema - A JSON string representing the schema. This is not required if DataSchemaLocationS3 is specified.
- DataRearrangement - A JSON string representing the splitting requirement of a DataSource.
Sample - {"randomSeed":"some-random-seed","splitting":{"percentBegin":10,"percentEnd":60}}
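A DataRearrangement string in the shape of the sample above can be assembled with a small helper. This is a minimal sketch using only base: Split, mkSplit and renderRearrangement are hypothetical names, and the range check (0 <= percentBegin < percentEnd <= 100) is an assumption made for the example rather than a documented service rule.

```haskell
-- Hypothetical description of one split of the observation data.
data Split = Split
  { randomSeed   :: String
  , percentBegin :: Int
  , percentEnd   :: Int
  } deriving (Eq, Show)

-- Assumed validation: percentages lie in 0..100 and begin precedes end.
mkSplit :: String -> Int -> Int -> Maybe Split
mkSplit seed b e
  | 0 <= b && b < e && e <= 100 = Just (Split seed b e)
  | otherwise                   = Nothing

-- Quote a string for JSON. Only double quotes and backslashes are
-- escaped; control characters are not handled in this sketch.
quote :: String -> String
quote s = "\"" ++ concatMap esc s ++ "\""
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc c    = [c]

-- Render the JSON text in the same key order as the sample.
renderRearrangement :: Split -> String
renderRearrangement (Split seed b e) =
  "{\"randomSeed\":" ++ quote seed
    ++ ",\"splitting\":{\"percentBegin\":" ++ show b
    ++ ",\"percentEnd\":" ++ show e ++ "}}"
```

With these definitions, `renderRearrangement <$> mkSplit "some-random-seed" 10 60` yields the sample string wrapped in `Just`, while `mkSplit "s" 60 10` is `Nothing`. In real code a JSON library would be a safer way to produce the string.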
Response
Response constructor
createDataSourceFromS3Response :: CreateDataSourceFromS3Response
CreateDataSourceFromS3Response constructor.
The fields accessible through corresponding lenses are:
- cdsfsrDataSourceId :: Maybe Text
Response lenses
cdsfsrDataSourceId :: Lens' CreateDataSourceFromS3Response (Maybe Text)
A user-supplied ID that uniquely identifies the DataSource. This value should
be identical to the value of the DataSourceId in the request.