Safe Haskell | Safe-Inferred |
---|
Functionality for manipulating KEGG annotations.
KEGG data is a bit hard to find, but there exist species-specific tables. Available organisms are listed in the table at
ftp://ftp.genome.jp/pub/kegg/genes/etc/all_species.tab
Data for each organism is stored in its own subdirectory under
ftp://ftp.genome.jp/pub/kegg/genes/organisms/
These subdirectories contain tables that link everything together, including external resources like UniProt, PDB, or NCBI.
- genReadKegg :: FilePath -> IO [(ByteString, ByteString)]
- newtype KO = KO ByteString
- decodeUP :: ByteString -> UniProtAcc
- decodeKO :: ByteString -> KO
- removePrefix :: String -> String -> (ByteString -> a) -> ByteString -> a
Documentation
genReadKegg :: FilePath -> IO [(ByteString, ByteString)]
Most KEGG files that contain associations have one association per line, consisting of two items separated by whitespace. This is a generalized reader function for such files.
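The line format described above can be read with a few lines of code. The following is a minimal sketch, not the library's actual implementation: the names parseAssociations and readAssociations are hypothetical, and skipping lines with fewer than two fields is an assumption about how malformed input might be handled.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical stand-in for the parsing step of genReadKegg.
-- Each line is split on whitespace; the first two fields form the pair.
-- Lines with fewer than two fields are skipped (an assumption, the
-- real reader may behave differently).
parseAssociations :: B.ByteString -> [(B.ByteString, B.ByteString)]
parseAssociations contents =
  [ (a, b) | ln <- B.lines contents, (a:b:_) <- [B.words ln] ]

-- Hypothetical file-based wrapper with the same type as genReadKegg.
readAssociations :: FilePath -> IO [(B.ByteString, B.ByteString)]
readAssociations path = parseAssociations <$> B.readFile path
```

Keeping the parsing step pure makes it easy to check against in-memory input before pointing it at a downloaded table.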
decodeUP :: ByteString -> UniProtAcc
Convert UniProt IDs (up:xxxxxx) to the UniProtAcc type.
decodeKO :: ByteString -> KO
Convert KO IDs (ko:xxxxx) to the KO data type.
removePrefix :: String -> String -> (ByteString -> a) -> ByteString -> a
KEGG uses strings with an identifying prefix for IDs. This helper function checks for the prefix and removes it to construct native values.
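The idea of checking and stripping a prefix can be illustrated with a simplified helper. This is a sketch under stated assumptions: withPrefix and decodeKOMaybe are hypothetical names, the signature is deliberately simpler than removePrefix (one String argument and a Maybe result), and the KO newtype is redeclared locally with derived Eq and Show instances so the example is self-contained. How the real removePrefix reacts to a mismatched prefix is not documented here.

```haskell
import qualified Data.ByteString.Char8 as B

-- Local copy of the KO newtype from the synopsis; the derived
-- instances are an assumption added for the example.
newtype KO = KO B.ByteString deriving (Eq, Show)

-- Hypothetical simplified variant of removePrefix: if the input starts
-- with the given prefix, drop it and wrap the remainder; otherwise
-- return Nothing.
withPrefix :: String -> (B.ByteString -> a) -> B.ByteString -> Maybe a
withPrefix prefix wrap raw
  | p `B.isPrefixOf` raw = Just (wrap (B.drop (B.length p) raw))
  | otherwise            = Nothing
  where
    p = B.pack prefix

-- Hypothetical decoder for KO IDs of the form ko:xxxxx.
decodeKOMaybe :: B.ByteString -> Maybe KO
decodeKOMaybe = withPrefix "ko:" KO
```

Returning Maybe keeps the mismatch case explicit for the caller; a decoder with the same type as decodeKO would instead have to fail in some other way on unexpected input.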