Follow
Follow is a Haskell library for building recipes that let you
follow the content published about any subject you are interested in.
Below is a quick tutorial you can follow. Just run the
snippets of code in the REPL.
:set -XOverloadedStrings
import Follow
import Data.Time (LocalTime)
import Control.Monad (join)
import qualified Data.Text.IO as T (writeFile)
import Data.Yaml (decodeFileThrow)
Subject
A subject is just a bunch of information about what is being
followed. It consists of a title, a description and a list of tags:
haskell =
  Subject
    { sTitle = "Haskell"
    , sDescription = "Some resources about Haskell"
    , sTags = ["haskell", "programming"]
    }
Directory
A directory is just a subject and a list of entries.
An entry is an item meant to contain a URI whose content relates to
the associated subject, along with some information about that content.
manualDirectory =
  Directory
    { dSubject = haskell
    , dEntries =
        [ Entry
            { eURI =
                Just "https://bartoszmilewski.com/2013/06/19/basics-of-haskell/"
            , eGUID = Just "basics-of-haskell"
            , eTitle = Just "Basics of Haskell"
            , eDescription = Just "Introductory material for Haskell"
            , eAuthor = Just "Bartosz Milewski"
            , ePublishDate = Just (read "2013-06-19 14:14:00" :: LocalTime)
            }
        ]
    }
Fetchers
Of course, building a list of entries by hand is not very
useful. Fetchers are functions which usually reach the outside world
to return a list of entries and which can throw an error.
Any fetcher can be used, but Follow tries to ship with common
ones. Right now there are two fetchers available:
- Feed: Takes entries from an RSS or Atom feed.
- Web Scraping: Takes entries by scraping the HTML of a web page.
The function directoryFromFetched can be used to glue a subject with
some fetched content:
import qualified Follow.Fetchers.Feed as Feed
directory =
  directoryFromFetched (Feed.fetch "https://bartoszmilewski.com/feed/") haskell
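Since any fetcher can be used, writing your own mostly comes down to producing a list of entries inside some effectful context. The following is a minimal, self-contained sketch of that idea and not Follow's actual API: the Entry and Directory types are simplified stand-ins, the example.org URL is a placeholder, and staticFetch, failingFetch, directoryFrom and safeDirectory are hypothetical names invented for illustration.

```haskell
import Control.Exception (IOException, throwIO, try)

-- Simplified stand-ins (assumptions, not Follow's real types).
data Entry = Entry
  { eTitle :: Maybe String
  , eURI   :: Maybe String
  } deriving (Show, Eq)

data Directory = Directory
  { dSubject :: String   -- the real library uses a richer Subject record
  , dEntries :: [Entry]
  } deriving (Show, Eq)

-- A fetcher is modelled as an IO action that yields entries or throws.
type Fetched = IO [Entry]

-- Hypothetical fetcher that needs no network: it returns fixed entries.
staticFetch :: Fetched
staticFetch =
  pure [Entry (Just "Basics of Haskell") (Just "https://example.org/basics")]

-- Hypothetical fetcher that always fails, e.g. an unreachable source.
failingFetch :: Fetched
failingFetch = throwIO (userError "source unreachable")

-- Glue a subject with fetched content, in the spirit of directoryFromFetched.
directoryFrom :: Fetched -> String -> IO Directory
directoryFrom fetched subject = Directory subject <$> fetched

-- Turn a thrown IOException into a value the caller can inspect.
safeDirectory :: Fetched -> String -> IO (Either IOException Directory)
safeDirectory fetched subject = try (directoryFrom fetched subject)
```

With these definitions, safeDirectory staticFetch "Haskell" produces a Right holding a one-entry directory, while safeDirectory failingFetch "Haskell" produces a Left carrying the exception instead of aborting the program.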
Middlewares
Fetched content may need some further processing in order to fit what
is actually desired. A middleware is a function which transforms a
directory into another directory, allowing us to do any kind of
transformation.
The aim of Follow is to provide some common middlewares. For now,
these are available:
- Filter: Filters entries according to some predicate.
- Sort: Sorts entries.
- Decode: Decodes entries from UTF-8 or other encodings.
import qualified Follow.Middlewares.Sort as Sort
sortedDirectory =
  Sort.apply (Sort.byGetter eTitle) <$> directory
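Because a middleware is just a function from a directory to a directory, a custom one can be written as an ordinary pure function. The sketch below illustrates this under stated assumptions: the Entry and Directory types are simplified local stand-ins rather than the library's definitions, the example.org URLs are placeholders, and titleContains and keepFirst are hypothetical middlewares that do not ship with Follow.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Simplified stand-ins (assumptions, not Follow's real types).
data Entry = Entry
  { eTitle :: Maybe String
  , eURI   :: Maybe String
  } deriving (Show, Eq)

data Directory = Directory
  { dSubject :: String
  , dEntries :: [Entry]
  } deriving (Show, Eq)

-- A middleware is modelled as a plain function between directories.
type Middleware = Directory -> Directory

-- Hypothetical middleware: keep entries whose title mentions the keyword
-- (case-insensitive). Entries without a title are dropped.
titleContains :: String -> Middleware
titleContains keyword dir =
  dir { dEntries = filter matches (dEntries dir) }
  where
    needle = map toLower keyword
    matches entry = maybe False (isInfixOf needle . map toLower) (eTitle entry)

-- Hypothetical middleware: keep at most the first n entries.
keepFirst :: Int -> Middleware
keepFirst n dir = dir { dEntries = take n (dEntries dir) }

-- Sample data with placeholder URLs.
sample :: Directory
sample =
  Directory
    "Haskell"
    [ Entry (Just "Basics of Haskell") (Just "https://example.org/basics")
    , Entry (Just "Category Theory") (Just "https://example.org/ct")
    , Entry Nothing (Just "https://example.org/untitled")
    , Entry (Just "Haskell and monads") Nothing
    ]
```

Middlewares compose like any other functions, so (keepFirst 1 . titleContains "haskell") sample first filters by title and then truncates the result to a single entry.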
Digesters
Once you have your distilled content, you need some way to consume
it. A digester is a function which transforms a directory into
anything that can be consumed by an end user.
As before, Follow aims to provide useful ones out of the box, such as
the simple text digester:
import qualified Follow.Digesters.SimpleText as SimpleText
content = SimpleText.digest <$> sortedDirectory
Now, for example, you are ready to save the content to a file:
join $ T.writeFile "/your/path/haskell.txt" <$> content
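A digester can target any output type, since it is only a function out of a directory. As a rough illustration (with simplified local types standing in for the library's, and titlesDigest and countDigest being hypothetical names), two small digesters might look like this:

```haskell
import Data.Maybe (fromMaybe)

-- Simplified stand-ins (assumptions, not Follow's real types).
data Entry = Entry
  { eTitle :: Maybe String
  , eURI   :: Maybe String
  }

data Directory = Directory
  { dSubject :: String
  , dEntries :: [Entry]
  }

-- A digester is modelled as a function from a directory to any result type.
type Digester a = Directory -> a

-- Hypothetical digester: the subject on the first line, then one bullet
-- per entry with its title and, when present, its URI in angle brackets.
titlesDigest :: Digester String
titlesDigest dir = unlines (dSubject dir : map line (dEntries dir))
  where
    line entry =
      "- "
        ++ fromMaybe "(untitled)" (eTitle entry)
        ++ maybe "" (\uri -> " <" ++ uri ++ ">") (eURI entry)

-- Hypothetical digester with a non-textual result: the number of entries.
countDigest :: Digester Int
countDigest = length . dEntries
```

The result of titlesDigest is a plain String, so it can be passed to writeFile or putStr in the same way the SimpleText output is saved above.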
Recipes: Combining sources and middlewares
Content does not have to be fetched from a single source. Instead, a
directory can be built by merging the entries fetched from different
sources. Also, the stack of middlewares to be applied to each source can be
given in a single shot.
This whole process specification is called a Recipe, and it contains
all the information needed to follow a subject.
To build a recipe you need to provide three fields:
- The subject being followed.
- A list of two-field tuples where:
  - The first field is some fetched content.
  - The second field is a list of middlewares to apply to the fetched content in the first field.
- A list of middlewares to apply to the directory that results from merging all the fetched and processed content.
haskellRecipe =
  Recipe
    { rSubject = haskell
    , rSteps =
        [ ( Feed.fetch "https://bartoszmilewski.com/feed/"
          , [Sort.apply (Sort.byGetter eTitle)])
        , (Feed.fetch "https://planet.haskell.org/rss20.xml", [])
        ]
    , rMiddlewares = []
    }
You can combine the function directoryFromRecipe and some digester
to quickly consume a recipe:
SimpleText.digest <$> directoryFromRecipe haskellRecipe
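To make the three fields concrete, here is a small model of how a recipe could be turned into a directory. It is only a sketch of the process described above and not the implementation of directoryFromRecipe: the types are simplified local stand-ins, runRecipe and applyAll are hypothetical names, and applying middlewares from left to right is an assumption.

```haskell
import Data.Functor.Identity (Identity (..))

-- Simplified stand-ins (assumptions, not Follow's real types).
newtype Entry = Entry { eTitle :: Maybe String }
  deriving (Show, Eq)

data Directory = Directory
  { dSubject :: String
  , dEntries :: [Entry]
  } deriving (Show, Eq)

type Middleware = Directory -> Directory

-- One step: some fetched content plus the middlewares for that source.
type Step m = (m [Entry], [Middleware])

data Recipe m = Recipe
  { rSubject     :: String
  , rSteps       :: [Step m]
  , rMiddlewares :: [Middleware]
  }

-- Apply middlewares in list order (left to right; an assumption).
applyAll :: [Middleware] -> Directory -> Directory
applyAll middlewares dir = foldl (\acc middleware -> middleware acc) dir middlewares

-- Run every step, merge the resulting entries, then apply the
-- recipe-level middlewares to the merged directory.
runRecipe :: Monad m => Recipe m -> m Directory
runRecipe recipe = do
  perStep <- mapM runStep (rSteps recipe)
  let merged = Directory (rSubject recipe) (concatMap dEntries perStep)
  pure (applyAll (rMiddlewares recipe) merged)
  where
    runStep (fetched, middlewares) = do
      entries <- fetched
      pure (applyAll middlewares (Directory (rSubject recipe) entries))

-- Demo with in-memory "fetchers" so no network access is needed.
firstOnly, reverseEntries :: Middleware
firstOnly dir = dir { dEntries = take 1 (dEntries dir) }
reverseEntries dir = dir { dEntries = reverse (dEntries dir) }

demo :: Recipe Identity
demo =
  Recipe
    { rSubject = "Haskell"
    , rSteps =
        [ (Identity [Entry (Just "a"), Entry (Just "b")], [firstOnly])
        , (Identity [Entry (Just "c")], [])
        ]
    , rMiddlewares = [reverseEntries]
    }
```

Evaluating runIdentity (runRecipe demo) keeps only "a" from the first source, appends "c" from the second, and then reverses the merged list, so the final entries are "c" followed by "a".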
Collecting recipes
One nice thing about Follow is that you don't need to create the recipes
programmatically each time you need them. Instead, you can store them
in a YAML file and just parse them when needed.
For example, the previous recipe can be represented in a file
recipe.yml as follows:
subject:
  title: Haskell
  description: Some resources about Haskell
  tags: [haskell, programming]
steps:
  -
    - type: feed
      options:
        url: "https://bartoszmilewski.com/feed/"
    -
      - type: sort
        options:
          function:
            type: by_field
            options:
              field: title
middlewares: []
You can now use the decode functions in Data.Yaml to get the
recipe back:
recipe' <- decodeFileThrow "/your/path/recipe.yml" :: IO (Recipe IO)
directory' = directoryFromRecipe recipe'
Look at src/Follow/Parser.hs for details about
encoding each kind of fetcher and middleware.
Contributing
Bug reports and pull requests are welcome on GitHub at
https://github.com/waiting-for-dev/follow. This project is
intended to be a safe, welcoming space for collaboration, and
contributors are expected to adhere to the Contributor
Covenant code of conduct.
License
The package is available as open source under the terms of the GNU
LGPLv3 License.