!X      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~                                                                                                                                                                   ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~                                                                                   !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~None&'.P4datasetsAReadAs is a datatype to describe data formats that hold data setsdatasetsA H contains metadata for loading, caching, preprocessing and parsing data.datasetsDataset sourcedatasetsTemporary directory (optional)datasets)Dataset preprocessing function (optional) datasetsA [ source can be either a URL (for remotely-hosted datasets) or the filepath of a local file. datasets?Load a dataset, using the system temporary directory as a cachedatasets*Get a ByteString from the specified Sourcedatasets0Parse a ByteString into a list of Haskell valuesdatasetsQA CSV record with default decoding options (i.e. columns are separated by commas)datasets-Define a dataset from a source for a CSV filedatasetsGDefine a dataset from a source for a CSV file, skipping the header linedatasetsADefine a dataset from a source for a CSV file with a known headerdatasetsODefine a dataset from a source for a CSV file with a known header and separatordatasets/Define a dataset from a source for a JSON file datasetstInclude a preprocessing stage to a Dataset: each field in the raw data will be preprocessed with the given function.datasetsdInclude a temporary directory for caching the dataset after this has been downloaded one first time.datasetsTurn dashes to CamelCasedatasets0Parse a field, first turning dashes to CamelCasedatasets-Parse a CSV field, based on its read instancedatasetsDrop lines from a bytestringdatasetsGTurn US-style decimals starting with a period (e.g. .2) into something cassava can parse (e.g. 0.2)datasets%Convert a Fixed-width format to a CSVdatasets-Filter out escaped double quotes from a fielddatasetsnConvert a fractional year to UTCTime with second-level precision (due to not taking into account leap seconds)datasets#The UMass machine learning database 5http://mlr.cs.umass.edu/ml/machine-learning-databasesdatasets!The UCI machine learning database| 9https://archive.ics.uci.edu/ml/machine-learning-databasesdatasetsCache directorydatasets!How to parse the raw data string datasetsThe data string      None.7Q !*)('&%$#"+-,./+-,. !*)('&%$#"/None.7RPF;<KJIHGDCBA@?>=FELNMOQPRWUTSVX^]\[ZY_mlkjihgfedcba`nutsrqpov~}|{zyxwFv~}|{zyxwnutsrqpo_mlkjihgfedcba`X^]\[ZYRWUTSVOQPLNM;<KJIHGDCBA@?>=FESafe7TNone.7U<None.7V..None.7WNone.7XF#$+*)('&%,.-/0432158769=<;:>9=<;:587604321,.-/#$+*)('&%> None.7Yt]^_]_^ None.7Y dekjihfgl dekjihfgl None7Zx qrwvutsx{zy| x{zyqrwvuts| None7[. None.7_`datasets ring-number: none=n,one=o,two=tdatasetsIs the mushroom edible?datasetsWhether a mushroom sample is poisonous or edible; the classification tasks involves predicting this label. data Classification = Poisonous | Edible deriving (Eq, Read, Show, Ord, Enum, Bounded, Generic)      !"#$%&'()*+,-./01233 !"#$%&'()*+,-./012     Nonedatasets$A type for date-tagged movie ratingsdatasets Test set itemdatasetsMovie dataset itemdatasetsMovie IDdatasetsTraining set itemdatasetsUser ID (anonymized)datasetsA date-tagged movie ratingdatasetsIThe training set (a set of text files) is assumed to be in the directory `datafiles/netflix/training/` relative to the repository rootdatasets?The test set (one text file) is assumed to be in the directory `datafiles/netflix/test/` relative to the repository rootdatasetsEThe movies dataset (one text file) is assumed to be in the directory `datafiles/netflix/movies/` relative to the repository rootdatasets^Parse the whole training set, convert to coordinate format and concatenate into a single list.datasetsvParse the whole training set and convert to coordinate format (each dataset file is parsed into a distinct inner list)datasetsZParse the whole test set, convert to coordinate format and concatenate into a single list.datasets9Parse the whole test set and convert to coordinate formatdatasets]Parse the whole movies file, convert to coordinate format and concatenate into a single list.datasetsThe first line of each training set file contains the movie id followed by a colon. Each subsequent line in the file corresponds to a rating from a customer and its date in the following format:CustomerID,Rating,Date,MovieIDs range from 1 to 17770 sequentially.GCustomerIDs range from 1 to 2649429, with gaps. There are 480189 users.8Ratings are on a five star (integral) scale from 1 to 5.!Dates have the format YYYY-MM-DD.datasets1The test set ("qualifying") file consists of lines indicating a movie id, followed by a colon, and then customer ids and rating dates, one per line for that movie id. The movie and customer ids are contained in the training set. Of course the ratings are withheld. There are no empty lines in the file.datasets-Movie information is in the following format:MovieID,YearOfRelease,TitleHMovieID do not correspond to actual Netflix movie ids or IMDB movie ids.YearOfRelease can range from 1890 to 2005 and may correspond to the release of corresponding DVD, not necessarily its theaterical release.qTitle is the Netflix movie title and may not correspond to titles used on other sites. Titles are in English.None7None.7 datasets waiting time until next eruption datasetsduration of eruption in minutes      None.7-None.7 #"! $ #"! $None.7a)*,+-)*,+-None.7_AdatasetsDid the passenger survive ?Bdatasets+The Titanic dataset, to be downloaded from Fhttps://raw.githubusercontent.com/JackStat/6003Data/master/Titanic.txtCdatasets-The Titanic dataset, parsed from a local copyDdatasetsThe AgeE field requires a custom FromField instance because its value may be NA23456789:;<=>?@ABCBC<=>?@A789:;56234None.7Z[^]\_Z[^]\_None.7 deihgfjlkm jlkdeihgfmNone.7yz~}|{yz~}|{None.7 !"#$%&'()*+,-./01234567889:;<=>?@ABCDEFGHIJKLMNOPQRRSTUVWXYZ[9\]^_`abcBdefghijklmnopqrstuvwxyz{|}~GIJKLMNS      !!"#$%&'())*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`a b c d e f g h i i j k l m n o p q r s t u u v w x y z { | } ~                                                                                                                                                                    ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~                                                                              c     3  "!"#$%&Bde''()*+,--./0123456789:;<=>?KJILNM@ABCDDjEFGHIJKLLk9VMBdeNGOIJKLMNPQRSSTUVWXYZ[\]^_`abcdefgghijklmnopqUrstuvwxyz {|}~%datasets-0.3.0-7Z79tuLu7Qn9R081MfsFatNumeric.DatasetsNumeric.Datasets.AbaloneNumeric.Datasets.AdultNumeric.Datasets.AnscombeNumeric.Datasets.BostonHousing&Numeric.Datasets.BreastCancerWisconsinNumeric.Datasets.CO2Numeric.Datasets.CarNumeric.Datasets.CoalNumeric.Datasets.GapminderNumeric.Datasets.IrisNumeric.Datasets.MichelsonNumeric.Datasets.MushroomNumeric.Datasets.NetflixNumeric.Datasets.NightingaleNumeric.Datasets.OldFaithfulNumeric.Datasets.QuakesNumeric.Datasets.StatesNumeric.Datasets.SunspotsNumeric.Datasets.TitanicNumeric.Datasets.UNNumeric.Datasets.VocabularyNumeric.Datasets.WineNumeric.Datasets.WineQualityReadAsJSON CSVRecordCSVNamedRecordDatasetsourcetemporaryDirectory preProcessreadAsSourceURLFile getDataset readDataset csvRecord csvDatasetcsvDatasetSkipHdr csvHdrDatasetcsvHdrDatasetSep jsonDatasetwithPreprocess withTempDirparseDashToCamelFieldparseReadField dropLinesfixAmericanDecimalsfixedWidthToCSVremoveEscQuotes yearToUTCTime umassMLDBuciMLDBAbalonesex abaloneLengthdiameterheight wholeWeight shuckedWeight visceraWeight shellWeightringsSexMFIabalone$fFromFieldSex$fFromRecordAbalone $fShowSex $fReadSex$fEqSex $fGenericSex $fBoundedSex $fEnumSex $fShowAbalone $fReadAbalone$fGenericAbaloneAdultage workClass finalWeight education educationNum maritalStatus occupation relationshiprace capitalGain capitalLoss hoursPerWeek nativeCountryincomeIncomeGT50KLE50KFemaleMaleRaceWhiteAsianPacIslanderAmerIndianEskimoOtherBlack RelationshipWifeOwnChildHusband NotInFamily OtherRelative Unmarried Occupation TechSupport CraftRepair OtherServiceSalesExecManagerial ProfSpecialtyHandlersCleanersMachineOpInspct AdmClericalFarmingFishingTransportMoving PrivHouseServProtectiveServ ArmedForces MaritalStatusMarriedCivSpouseDivorced NeverMarried SeparatedWidowedMarriedSpouseAbsentMarriedAFSpouse WorkClassPrivate SelfEmpNotInc SelfEmpInc FederalGovLocalGovStateGov WithoutPay NeverWorkedadult adultTestSet$fFromFieldWorkClass$fFromFieldMaritalStatus$fFromFieldOccupation$fFromFieldRelationship$fFromFieldRace$fFromFieldIncome$fFromRecordAdult$fShowWorkClass$fReadWorkClass $fEqWorkClass$fGenericWorkClass$fBoundedWorkClass$fEnumWorkClass$fShowMaritalStatus$fReadMaritalStatus$fEqMaritalStatus$fGenericMaritalStatus$fBoundedMaritalStatus$fEnumMaritalStatus$fShowOccupation$fReadOccupation$fEqOccupation$fGenericOccupation$fBoundedOccupation$fEnumOccupation$fShowRelationship$fReadRelationship$fEqRelationship$fGenericRelationship$fBoundedRelationship$fEnumRelationship $fShowRace $fReadRace$fEqRace $fGenericRace $fBoundedRace $fEnumRace $fShowIncome $fReadIncome $fEqIncome$fGenericIncome$fBoundedIncome $fEnumIncome $fShowAdult $fReadAdult$fGenericAdultanscombe anscombe1 anscombe2 anscombe3 anscombe4 BostonHousing crimeRatezoned industrial charlesRiver nitricOxidesroomsdistance radialHwytaxptRatiob lowerStatus medianValue bostonHousing$fFromRecordBostonHousing$fShowBostonHousing$fReadBostonHousing$fGenericBostonHousing CellFeaturesradius perimeterarea smoothness compactness concavity concavePointssymmetryfractalDimensionPrognosticBreastCancer prognosticID prognosisprognosticCellsDiagnosticBreastCancer diagnosticID diagnosisdiagnosticCellsBreastCancerEntrysampleCodeNumberclumpThicknessuniformityCellSizeuniformityCellShapemarginalAdhesionsingleEpithelialCellSize bareNucleiblandChromatinnormalNucleolimitosis sampleClass Prognosis Recurrent Nonrecurrent Diagnosis MalignantBenignintToDiagnosisbreastCancerDatabasecharToDiagnosischarToPrognosisdiagnosticBreastCancerprognosticBreastCancer$fFromRecordBreastCancerEntry$fFromRecordCellFeatures"$fFromRecordPrognosticBreastCancer"$fFromRecordDiagnosticBreastCancer$fShowDiagnosis$fReadDiagnosis $fEqDiagnosis$fGenericDiagnosis$fBoundedDiagnosis$fEnumDiagnosis$fShowPrognosis$fReadPrognosis $fEqPrognosis$fGenericPrognosis$fBoundedPrognosis$fEnumPrognosis$fShowBreastCancerEntry$fReadBreastCancerEntry$fGenericBreastCancerEntry$fShowCellFeatures$fReadCellFeatures$fGenericCellFeatures$fShowPrognosticBreastCancer$fReadPrognosticBreastCancer$fGenericPrognosticBreastCancer$fShowDiagnosticBreastCancer$fReadDiagnosticBreastCancer$fGenericDiagnosticBreastCancerCO2timevalue maunaLoaCO2$fFromNamedRecordCO2 $fShowCO2 $fReadCO2 $fGenericCO2Carbuying maintenancedoorspersons luggageBootsafety acceptabilityCountNNOrMoreMore Acceptability Unacceptable AcceptableGoodVeryGoodRelSizeSmallMediumBigRelScoreLowMedHighVeryHighcar$fFromFieldRelScore$fFromFieldRelSize$fFromFieldAcceptability$fFromFieldCount$fFromRecordCar$fShowRelScore$fReadRelScore $fEqRelScore$fGenericRelScore$fBoundedRelScore$fEnumRelScore $fShowRelSize $fReadRelSize $fEqRelSize$fGenericRelSize$fBoundedRelSize $fEnumRelSize$fShowAcceptability$fReadAcceptability$fEqAcceptability$fGenericAcceptability$fBoundedAcceptability$fEnumAcceptability $fShowCount $fReadCount $fEqCount$fGenericCount $fShowCar $fReadCar $fGenericCarCoaldatecoal$fFromRecordCoal $fShowCoal $fReadCoal $fGenericCoal Gapmindercountryyearpop continentlifeExp gdpPercap gapminder$fFromNamedRecordGapminder$fShowGapminder$fReadGapminder$fGenericGapminderIris sepalLength sepalWidth petalLength petalWidth irisClass IrisClassSetosa Versicolor Virginicairis$fFromFieldIrisClass$fFromRecordIris$fShowIrisClass$fReadIrisClass $fEqIrisClass$fOrdIrisClass$fGenericIrisClass$fEnumIrisClass$fBoundedIrisClass $fShowIris $fReadIris $fGenericIris michelsonHabitatGrassesLeavesMeadowsPathsUrbanWasteWoods PopulationAbundant ClusteredNumerous ScatteredSeveralSolitarySporePrintColorSPCBlackSPCBrownSPCBuff SPCChocolateSPCGreen SPCOrange SPCPurpleSPCWhite SPCYellowRingType RTCobwebby RTEvanescent RTFlaringRTLargeRTNone RTPendant RTSheathingRTZone RingNumberRNNoneRNOneRNTwo VeilColorVCBrownVCOrangeVCWhiteVCYellowVeilTypePartial UniversalStalkColorBelowRing SCBRBrownSCBRBuff SCBRCinnamonSCBRGray SCBROrangeSCBRPinkSCBRRed SCBRWhite SCBRYellowStalkColorAboveRing SCARBrownSCARBuff SCARCinnamonSCARGray SCAROrangeSCARPinkSCARRed SCARWhite SCARYellowStalkSurfaceBelowRing SSBRFibrous SSBRScaly SSBRSilky SSBRSmoothStalkSurfaceAboveRing SSARFibrous SSARScaly SSARSilky SSARSmooth StalkRootBulbousClubCupEqual RhizomorphsRooted StalkShape EnlargingTapering GillColorGCBlackGCBrownGCBuff GCChocolateGCGrayGCGreenGCOrangeGCPinkGCPurpleGCRedGCWhiteGCYellowGillSizeBroadNarrow GillSpacingCloseCrowdedDistantGillAttachmentAttached DescendingFreeNotchedOdorAlmondAniseCreosoteFishyFoulMustyNonePungentSpicyCapColorCCBrownCCBuff CCCinnamonCCGrayCCGreenCCPinkCCPurpleCCRedCCWhiteCCYellow CapSurface CSFibrous CSGroovesCSScalyCSSmoothCapShapeBellConicalConvexFlatKnobbedSunken MushroomEntryediblecapShape capSurfacecapColorbruisesodorgillAttachment gillSpacinggillSize gillColor stalkShape stalkRootstalkSurfaceAboveRingstalkSurfaceBelowRingstalkColorAboveRingstalkColorBelowRingveilType veilColor ringNumberringTypesporePrintColor populationhabitatmushroom$fFromRecordMushroomEntry $fEqCapShape$fReadCapShape$fShowCapShape $fOrdCapShape$fEnumCapShape$fBoundedCapShape$fGenericCapShape$fEqCapSurface$fReadCapSurface$fShowCapSurface$fOrdCapSurface$fEnumCapSurface$fBoundedCapSurface$fGenericCapSurface $fEqCapColor$fReadCapColor$fShowCapColor $fOrdCapColor$fEnumCapColor$fBoundedCapColor$fGenericCapColor$fEqOdor $fReadOdor $fShowOdor $fOrdOdor $fEnumOdor $fBoundedOdor $fGenericOdor$fEqGillAttachment$fReadGillAttachment$fShowGillAttachment$fOrdGillAttachment$fEnumGillAttachment$fBoundedGillAttachment$fGenericGillAttachment$fEqGillSpacing$fReadGillSpacing$fShowGillSpacing$fOrdGillSpacing$fEnumGillSpacing$fBoundedGillSpacing$fGenericGillSpacing $fEqGillSize$fReadGillSize$fShowGillSize $fOrdGillSize$fEnumGillSize$fBoundedGillSize$fGenericGillSize $fEqGillColor$fReadGillColor$fShowGillColor$fOrdGillColor$fEnumGillColor$fBoundedGillColor$fGenericGillColor$fEqStalkShape$fReadStalkShape$fShowStalkShape$fOrdStalkShape$fEnumStalkShape$fBoundedStalkShape$fGenericStalkShape $fEqStalkRoot$fReadStalkRoot$fShowStalkRoot$fOrdStalkRoot$fEnumStalkRoot$fBoundedStalkRoot$fGenericStalkRoot$fEqStalkSurfaceAboveRing$fReadStalkSurfaceAboveRing$fShowStalkSurfaceAboveRing$fOrdStalkSurfaceAboveRing$fEnumStalkSurfaceAboveRing$fBoundedStalkSurfaceAboveRing$fGenericStalkSurfaceAboveRing$fEqStalkSurfaceBelowRing$fReadStalkSurfaceBelowRing$fShowStalkSurfaceBelowRing$fOrdStalkSurfaceBelowRing$fEnumStalkSurfaceBelowRing$fBoundedStalkSurfaceBelowRing$fGenericStalkSurfaceBelowRing$fEqStalkColorAboveRing$fReadStalkColorAboveRing$fShowStalkColorAboveRing$fOrdStalkColorAboveRing$fEnumStalkColorAboveRing$fBoundedStalkColorAboveRing$fGenericStalkColorAboveRing$fEqStalkColorBelowRing$fReadStalkColorBelowRing$fShowStalkColorBelowRing$fOrdStalkColorBelowRing$fEnumStalkColorBelowRing$fBoundedStalkColorBelowRing$fGenericStalkColorBelowRing $fEqVeilType$fReadVeilType$fShowVeilType $fOrdVeilType$fEnumVeilType$fBoundedVeilType$fGenericVeilType $fEqVeilColor$fReadVeilColor$fShowVeilColor$fOrdVeilColor$fEnumVeilColor$fBoundedVeilColor$fGenericVeilColor$fEqRingNumber$fReadRingNumber$fShowRingNumber$fOrdRingNumber$fEnumRingNumber$fBoundedRingNumber$fGenericRingNumber $fEqRingType$fReadRingType$fShowRingType $fOrdRingType$fEnumRingType$fBoundedRingType$fGenericRingType$fEqSporePrintColor$fReadSporePrintColor$fShowSporePrintColor$fOrdSporePrintColor$fEnumSporePrintColor$fBoundedSporePrintColor$fGenericSporePrintColor$fEqPopulation$fReadPopulation$fShowPopulation$fOrdPopulation$fEnumPopulation$fBoundedPopulation$fGenericPopulation $fEqHabitat $fReadHabitat $fShowHabitat $fOrdHabitat $fEnumHabitat$fBoundedHabitat$fGenericHabitat$fShowMushroomEntry$fReadMushroomEntry$fGenericMushroomEntryRDrdRatingrdDateTest testRatingMoviemovieId releaseYear movieTitleMovieIdTrain trainRatingratingUserId RatingDateuserId ratingDate trainingSettestSetmoviesparseTrainingSet parseTestSet parseMovies $fShowUserId $fShowMovieId $fEqUserId$fEqRatingDate$fShowRatingDate $fEqTrain $fShowTrain $fEqMovieId $fEqMovie $fShowMovie$fEqTest $fShowTest$fEqCol $fShowCol $fEqTrainCol$fShowTrainCol $fEqTestCol $fShowTestCol$fEqRD$fShowRD Nightingale army_sizediseasewoundsother nightingale$fFromJSONNightingale$fShowNightingale$fReadNightingale$fGenericNightingale OldFaithfulwaitingduration oldFaithful$fFromRecordOldFaithful$fShowOldFaithfulQuakelatlongdepthmagstationsquakes$fFromNamedRecordQuake $fShowQuake $fReadQuake$fGenericQuakeStateEdustateregion satVerbalsatMath satPercent dollarSpend teacherPaystates$fFromNamedRecordStateEdu$fShowStateEdu$fReadStateEdu$fGenericStateEduSunspot sunspotMonthsunspots$fFromNamedRecordSunspot $fShowSunspot $fReadSunspot$fGenericSunspotAgeClassFirstSecondThirdCrew TitanicEntrytClasstAgetSex tSurvived titanicRemote titanicLocal$fFromFieldAge$fFromNamedRecordTitanicEntry $fEqClass $fReadClass $fShowClass$fGenericClass $fEnumClass$fBoundedClass$fEqAge $fReadAge $fShowAge $fGenericAge$fEqTitanicEntry$fReadTitanicEntry$fShowTitanicEntry$fGenericTitanicEntry GdpMortalityinfantMortalitygdpgdpMortalityUN$fFromNamedRecordGdpMortality$fShowGdpMortality$fReadGdpMortality$fGenericGdpMortalityVocab vocabularyvocab$fFromNamedRecordVocab $fShowVocab $fReadVocab$fGenericVocabWine wineClassalcohol malicAcidash ashAlcalinity magnesium totalPhenols flavanoidsnonflavanoidPhenolsproanthocyaninscolorIntensityhuedilutedOD280toOD315prolinewine$fFromRecordWine $fShowWine $fReadWine $fGenericWine WineQuality fixedAcidityvolatileAcidity citricAcid residualSugar chloridesfreeSulfurDioxidetotalSulfurDioxidedensitypH sulphatesqualityredWineQualitywhiteWineQuality$fFromNamedRecordWineQuality$fShowWineQuality$fReadWineQuality$fGenericWineQualitygetFileFromSourcedashToCamelCasecharToClassificationparseTrainingSet' parseTestSet'trainingSetParser testSetParser moviesParser