Safe Haskell | None |
---|---|
Language | Haskell2010 |
Streaming decoders for the IDX format used in the MNIST handwritten digit recognition dataset [1].
Both sparse and dense decoders are provided. In either case, the range of the data is the same as the raw data (one unsigned byte per pixel).
Links
Synopsis
- sourceIdxLabels :: MonadResource m => (ByteString -> Either e o) -> FilePath -> ConduitT i (Either e o) m r
- mnistLabels :: ByteString -> Either String Int
- sourceIdx :: MonadResource m => FilePath -> ConduitT a (Vector Word8) m ()
- sourceIdxSparse :: MonadResource m => FilePath -> ConduitT a (Sparse Word8) m ()
- data Sparse a
- sBufSize :: Sparse a -> Int
- sNzComponents :: Sparse a -> Vector (Int, a)
Labels
sourceIdxLabels
:: MonadResource m | |
=> (ByteString -> Either e o) | parser for the labels, where the bytestring buffer contains exactly one unsigned byte |
-> FilePath | filepath of uncompressed IDX labels file |
-> ConduitT i (Either e o) m r |
Outputs the labels corresponding to the data
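A minimal usage sketch, under stated assumptions: the module name `Data.IDX.Conduit` is not given on this page and is assumed here, the `Conduit` module comes from the conduit package, and the helper name `firstLabels` is hypothetical. It reads at most `n` labels from an uncompressed IDX labels file and collects them into a list.

```haskell
import Conduit (runConduitRes, sinkList, takeC, (.|))
-- Assumption: the module name below is not stated on this page.
import Data.IDX.Conduit (mnistLabels, sourceIdxLabels)

-- | Hypothetical helper: collect at most @n@ parsed labels from the file.
-- Each element is either a parse error (Left) or a label (Right).
firstLabels :: FilePath -> Int -> IO [Either String Int]
firstLabels fp n =
  runConduitRes $
    sourceIdxLabels mnistLabels fp
      .| takeC n
      .| sinkList
```

Bounding the stream with `takeC` keeps the sketch independent of how the source behaves at the end of the file, which this page does not specify.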
mnistLabels :: ByteString -> Either String Int
Parser for the labels; it can be passed as the first argument to sourceIdxLabels.
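To illustrate the shape of a parser that fits the first argument of `sourceIdxLabels`, here is a hypothetical sketch. It is not the library's `mnistLabels`; the name `parseDigitLabel` and the 0-9 range check are assumptions made for the example. It also assumes a strict `ByteString`; if the library uses the lazy variant, only the import would need to change.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS

-- | Hypothetical label parser (not the library's implementation).
-- Accepts a buffer holding exactly one byte whose value is a digit 0-9.
parseDigitLabel :: ByteString -> Either String Int
parseDigitLabel bs = case BS.unpack bs of
  [w]
    | w <= 9    -> Right (fromIntegral w)
    | otherwise -> Left ("label out of range: " ++ show w)
  ws -> Left ("expected exactly 1 byte, got " ++ show (length ws))
```

Returning `Either` lets the consumer decide whether a malformed label should stop the stream or simply be skipped.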
Data
Dense data buffers
sourceIdx
:: MonadResource m | |
=> FilePath | filepath of uncompressed IDX data file |
-> ConduitT a (Vector Word8) m () |
Outputs dense data buffers in the 0-255 range
In the case of the MNIST dataset, 0 corresponds to the background of the image.
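Since the buffers keep the raw 0-255 byte range, a consumer that needs floating-point features has to rescale them itself. A minimal sketch, assuming the boxed `Data.Vector` type (this page does not say which `Vector` flavour the library uses) and a hypothetical helper name `normalize`:

```haskell
import qualified Data.Vector as V
import Data.Word (Word8)

-- | Hypothetical helper: rescale raw pixel bytes (0-255) to Doubles in [0, 1].
-- A background pixel (0) maps to 0.0 and a full-intensity pixel (255) to 1.0.
normalize :: V.Vector Word8 -> V.Vector Double
normalize = V.map (\w -> fromIntegral w / 255)
```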
Sparse data buffers
sourceIdxSparse
:: MonadResource m | |
=> FilePath | filepath of uncompressed IDX data file |
-> ConduitT a (Sparse Word8) m () |
Outputs sparse data buffers (i.e. without zero components)
This incurs at least one additional data copy of each vector, but the resulting vectors take up less space.
data Sparse a
Sparse buffer (containing only nonzero entries)
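The relationship between a dense buffer and its nonzero components can be sketched as follows. This is an illustration and not the library's implementation: the helper names `toNonzero` and `toDense` are hypothetical, the boxed `Data.Vector` type is assumed, and the `Int` in each pair is assumed to be the position of the entry in the dense buffer.

```haskell
import qualified Data.Vector as V
import Data.Word (Word8)

-- | Hypothetical helper: keep only the nonzero entries of a dense buffer,
-- each paired with its index in that buffer.
toNonzero :: V.Vector Word8 -> V.Vector (Int, Word8)
toNonzero = V.filter ((/= 0) . snd) . V.indexed

-- | Hypothetical helper: rebuild a dense buffer of the given size from
-- (index, value) pairs, filling every other position with 0.
toDense :: Int -> V.Vector (Int, Word8) -> V.Vector Word8
toDense n nz = (V.replicate n 0) V.// (V.toList nz)
```

Under those assumptions, something like `toDense (sBufSize s) (sNzComponents s)` would recover the dense form of a sparse buffer `s`.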