GZIp and MMap IO for ByteStrings, encoding utilities, and miscellaneous functions for Data.ByteString
- unsafeWithInternals :: ByteString -> (Ptr Word8 -> Int -> IO a) -> IO a
- unpackPSFromUTF8 :: ByteString -> String
- packStringToUTF8 :: String -> ByteString
- gzReadFilePS :: FilePath -> IO ByteString
- mmapFilePS :: FilePath -> IO ByteString
- gzWriteFilePS :: FilePath -> ByteString -> IO ()
- gzWriteFilePSs :: FilePath -> [ByteString] -> IO ()
- gzReadStdin :: IO ByteString
- isGZFile :: FilePath -> IO (Maybe Int)
- gzDecompress :: Maybe Int -> ByteString -> ([ByteString], Bool)
- dropSpace :: ByteString -> ByteString
- breakSpace :: ByteString -> (ByteString, ByteString)
- linesPS :: ByteString -> [ByteString]
- unlinesPS :: [ByteString] -> ByteString
- hashPS :: ByteString -> Int32
- breakFirstPS :: Char -> ByteString -> Maybe (ByteString, ByteString)
- breakLastPS :: Char -> ByteString -> Maybe (ByteString, ByteString)
- substrPS :: ByteString -> ByteString -> Maybe Int
- readIntPS :: ByteString -> Maybe (Int, ByteString)
- isFunky :: ByteString -> Bool
- fromHex2PS :: ByteString -> ByteString
- fromPS2Hex :: ByteString -> ByteString
- betweenLinesPS :: ByteString -> ByteString -> ByteString -> Maybe ByteString
- breakAfterNthNewline :: Int -> ByteString -> Maybe (ByteString, ByteString)
- breakBeforeNthNewline :: Int -> ByteString -> (ByteString, ByteString)
- intercalate :: ByteString -> [ByteString] -> ByteString
- decodeLocale :: ByteString -> String
- encodeLocale :: String -> ByteString
- decodeString :: String -> String
Do something with the internals of a PackedString. Beware of altering the contents!
ByteString containing UTF-8 to a
String. Decoding errors are
flagged with the U+FFFD character.
Read an entire file, which may or may not be gzip compressed, directly
Like readFilePS, this reads an entire file directly into a
ByteString, but it is even more efficient. It involves directly
mapping the file to memory. This has the advantage that the contents of
the file never need to be copied. Also, under memory pressure the page
may simply be discarded, wile in the case of readFilePS it would need to
be written to swap. If you read many small files, mmapFilePS will be
less memory-efficient than readFilePS, since each mmapFilePS takes up a
separate page of memory. Also, you can run into bus errors if the file
is modified. NOTE: as with
readFilePS, the string representation in
the file is assumed to be ISO-8859-1.
Read standard input, which may or may not be gzip compressed, directly
Decompress the given bytestring into a lazy list of chunks, along with a boolean flag indicating (if True) that the CRC was corrupted. Inspecting the flag will cause the entire list of chunks to be evaluated (but if you throw away the list immediately this should run in constant space).
dropSpace efficiently returns the
ByteString argument with
white space Chars removed from the front. It is more efficient than
calling dropWhile for removing whitespace. I.e.
dropWhile isSpace == dropSpace
breakSpace returns the pair of ByteStrings when the argument is
broken at the first whitespace byte. I.e.
break isSpace == breakSpace
This function acts exactly like the Prelude unlines function, or like
unlines, but with one important difference: it will
produce a string which may not end with a newline! That is:
unlinesPS ["foo", "bar"]
evaluates to "foo\nbar", not "foo\nbar\n"! This point should hold true for
linesPS as well.
TODO: rename this function.
readIntPS skips any whitespace at the beginning of its argument, and reads an Int from the beginning of the PackedString. If there is no integer at the beginning of the string, it returns Nothing, otherwise it just returns the int read, along with a B.ByteString containing the remainder of its input.
betweenLinesPS returns the B.ByteString between the two lines given, or Nothing if they do not appear.
Decode a ByteString to a String according to the current locale unsafePerformIO in the locale function is ratified by the fact that GHC 6.12 and above also supply locale conversion with functions with a pure type. Unrecognized byte sequences in the input are skipped.
Encode a String to a ByteString according to the current locale