| Safe Haskell | None |
|---|---|
| Language | Haskell2010 |
Web.Feed.Collect
Contents
Description
Intro
This module allows you to run a loop which visits web feeds (RSS, Atom, etc.) periodically and reports new items using a function you provide. You can also insert control commands into the loop, such as adding and removing a feed, so that stopping and restarting it isn't required.
Both http and https URLs are supported.
The original use case which motivated the creation of this library is FunBot.
Running
The run function runs the loop, and it takes a collector callback function
(as one of its parameters) to be called when new feed items are found.
Here is a simple usage example.
import Data.Default.Class (def)
import Data.Time.Interval
import Data.Time.Units
import Web.Feed.Collect
collect :: Label -> URL -> Feed -> Item -> IO ()
collect label url feed item = do
putStrLn $ label ++ " : " ++ url
putStrLn "Got a new feed item!"
putStrLn $ showFeed feed
putStrLn $ showItem item
logError :: Label -> Error -> IO ()
logError l e = putStrLn $ l ++ " : " ++ show e
feeds :: [(Label, Url)]
feeds = [("democ-now", "http://www.democracynow.org/democracynow.rss")]
main :: IO ()
main = run def
{ wcCollect = collect
, wcLogError = logError
, wcVisitInterval = time (1 :: Minute)
, wcMaxItems = 3
, wcFeeds = fromPairs feeds
}For quick testing, you can use one of the predefined collector functions.
For example, there is collectorNull which discards the feed items and does
nothing (you can use it e.g. when testing success of HTTP requests or
debugging the library itself), and collectorPretty which writes a short
nicely formatted entry to stdout for each new feed item. And there are more.
If new items aren't being detected correctly, you can enable debugging
through the wcDebug field. You can get detailed logs of the detection
process. You can tail -f a log file from your terminal to watch new log
entries get appended to it.
Using Control Commands
Now let's see how to push control commands into the loop. The
wcCommandQueue field is an optional command queue for making changes
while the watcher runs. In the example above we didn't provide one. Now
let's provide a queue and use it.
Suppose we are writing a program with a command-line interface. It watches news feeds in the background, and can take commands from the user at the same time on the command-line. Feeds can be added, removed, etc. without restarting the program.
Assume we've written a function named parseCommand, which takes a line of
user input and returns a command ready to push into the queue. The parsed
commands are of type Command and are created using functions like
addFeed, removeFeed and so on, which this module provides. Using our
parseCommand we can write the program like this:
import Control.Concurrent (forkIO)
import Data.Default.Class (def)
import Data.Time.Interval
import Data.Time.Units
import System.IO
import Web.Feed.Collect
collect :: Label -> URL -> Feed -> Item -> IO ()
collect = collectPretty
feed :: FeedConfig
feed = mkFeed "fsf-news" "https://www.fsf.org/static/fsforg/rss/news.xml"
parseCommand :: String -> Maybe Command
parseCommand line = {- ... - }
main :: IO ()
main = do
cqueue <- newCommandQueue
forkIO $ run def
{ wcCollect = collect
, wcCommandQueue = Just cqueue
, wcFeeds = [feed]
}
let loop = do
line <- getLine
if line == "quit"
then putStrLn "Bye!"
else do
case parseCommand line of
Just cmd -> sendCommand cqueue cmd
Nothing -> putStrLn "Invalid input"
loop
loop- type Label = String
- type Url = String
- data Error
- data FeedConfig
- fcLabel :: FeedConfig -> Label
- fcUrl :: FeedConfig -> Url
- fcActive :: FeedConfig -> Bool
- fcDebug :: FeedConfig -> Bool
- mkFeed :: Label -> Url -> FeedConfig
- fromPairs :: [(Label, Url)] -> [FeedConfig]
- data WatcherConfig m
- wcCollect :: WatcherConfig m -> Label -> Url -> Feed -> Item -> m ()
- wcCollectMany :: WatcherConfig m -> Maybe (Label -> Url -> Feed -> [Item] -> m ())
- wcLogError :: WatcherConfig m -> Label -> Error -> m ()
- wcCommandQueue :: WatcherConfig m -> Maybe CommandQueue
- wcVisitInterval :: WatcherConfig m -> TimeInterval
- wcMaxItems :: WatcherConfig m -> Int
- wcFeeds :: WatcherConfig m -> [FeedConfig]
- wcDebug :: WatcherConfig m -> DebugConfig
- collectorNull :: Label -> Url -> Feed -> Item -> IO ()
- collectorPrint :: Label -> Url -> Feed -> Item -> IO ()
- collectorPretty :: Label -> Url -> Feed -> Item -> IO ()
- collectorLog :: (Label -> FilePath) -> Label -> Url -> Feed -> Item -> IO ()
- run :: MonadIO m => WatcherConfig m -> m ()
- data CommandQueue
- newCommandQueue :: IO CommandQueue
- sendCommand :: CommandQueue -> Command -> IO ()
- sendCommands :: CommandQueue -> [Command] -> IO ()
- data Command
- addFeed :: FeedConfig -> Command
- removeFeed :: Label -> Command
- setFeedActive :: Label -> Bool -> Command
- setInterval :: TimeUnit t => t -> Command
- setMaxPerVisit :: Int -> Command
- showFeed :: Feed -> String
- showItem :: Item -> String
- data DebugConfig = DebugConfig {
- dcDebugCycle :: Bool
- dcGeneralLog :: FilePath
- dcFeedLog :: Label -> FilePath
- setFeedDebug :: Label -> Bool -> Command
- setGeneralDebug :: Bool -> Command
Misc Types
An error occuring while reading from a news feed.
Constructors
| HttpError HttpException | Error while creating an HTTP request or receiving a response. |
| FeedParsingFailed Url | Error while parsing the HTTP response body into feed content |
Specifying Feeds
data FeedConfig Source
Details of a news feed to be watched. Some of these are just initial settings which you can change while the watcher runs, using the command queue.
You can use def and override fields a needed. In that case you should
specify at least the label and URL! Alternatively, use mkFeed or
fromPairs.
Instances
fcLabel :: FeedConfig -> Label Source
A short name tag by which you can refer to the feed.
fcUrl :: FeedConfig -> Url Source
The HTTP or HTTPS URL of the feed.
fcActive :: FeedConfig -> Bool Source
Whether the feed should initially be active. Disabled feeds aren't visited (except once to remember the old feed items). You can enable and disable feeds later, while the watcher runs, using the command queue.
fcDebug :: FeedConfig -> Bool Source
Whether per-feed debug logs should be written for this feed. These are detailed messages listing the results of each step of the new item detection process. You can later toggle this setting per feed using the command queue.
mkFeed :: Label -> Url -> FeedConfig Source
Initialize feed details from a feed label and its URL.
fromPairs :: [(Label, Url)] -> [FeedConfig] Source
Create a list of feed configs from a list of label-url pairs.
Watcher Configuration
data WatcherConfig m Source
Feed watcher behavior description. To create one, use def and override
fields as needed.
Instances
| MonadIO m => Default (WatcherConfig m) Source |
wcCollect :: WatcherConfig m -> Label -> Url -> Feed -> Item -> m () Source
Collector, i.e. action to perform when receiving a feed item.
wcCollectMany :: WatcherConfig m -> Maybe (Label -> Url -> Feed -> [Item] -> m ()) Source
wcLogError :: WatcherConfig m -> Label -> Error -> m () Source
Error logging action. It will be called when an error occurs while trying to download and parse a feed. Such an error doesn't cause anything to stop. Execution simply goes on to read the next feed, and will try the erronous feed again in the next round.
wcCommandQueue :: WatcherConfig m -> Maybe CommandQueue Source
A command queue you can use to change settings while the program
runs. For example, you can add a new feed or change the interval
between polls without relaunching run.
wcVisitInterval :: WatcherConfig m -> TimeInterval Source
Time interval between visits of a watched feed.
wcMaxItems :: WatcherConfig m -> Int Source
Maximal number of items to collect per visit (if more are available, they will be collected in the next visit).
wcFeeds :: WatcherConfig m -> [FeedConfig] Source
List of feeds and their configuration details.
wcDebug :: WatcherConfig m -> DebugConfig Source
Debugging options. Useful when new items aren't being detected as expected.
Collectors
collectorNull :: Label -> Url -> Feed -> Item -> IO () Source
An item collector which discards the item, i.e. does nothing.
collectorPrint :: Label -> Url -> Feed -> Item -> IO () Source
An item collector which prints feed and item fields to stdout.
collectorPretty :: Label -> Url -> Feed -> Item -> IO () Source
An item collector which prints short friendly feed and item descriptions
to stdout.
collectorLog :: (Label -> FilePath) -> Label -> Url -> Feed -> Item -> IO () Source
An item collector which writes friendly descriptions into a log file, determining the log file's name using the function given as the first argument (the feed label, i.e. second argument, is also used as the argument for that function).
Running
run :: MonadIO m => WatcherConfig m -> m () Source
Watch feeds and perform the given action on received feed items.
Control Commands
data CommandQueue Source
newCommandQueue :: IO CommandQueue Source
Create a new empty command queue.
sendCommand :: CommandQueue -> Command -> IO () Source
Send a command into the queue.
sendCommands :: CommandQueue -> [Command] -> IO () Source
Send a series of commands into the queue.
addFeed :: FeedConfig -> Command Source
Add a new feed to watch.
removeFeed :: Label -> Command Source
Remove a previously added feed.
setFeedActive :: Label -> Bool -> Command Source
Set whether a given feed should be watched (active) or not (inactive).
setInterval :: TimeUnit t => t -> Command Source
Set the interval, in microseconds, between feed scans.
setMaxPerVisit :: Int -> Command Source
Set the maximal number of feed items to be collected per feed per scan. If more new items are found, they wait for the next scan.
Utilities and Debugging
data DebugConfig Source
Debugging options. Useful when new items aren't being detected as
expected. The Default instance has all debugging off. These are just the
initial settings: you can change some of them while the feed watcher runs,
using the command queue.
Since debugging is expected to be used only during development, changes to the fields of this type will cause only a micro version change in the package version. If you think some options here are useful for more than debugging, please contact the author.
Constructors
| DebugConfig | |
Fields
| |
Instances
setFeedDebug :: Label -> Bool -> Command Source
Enable or disable debug logs for a given feed.
setGeneralDebug :: Bool -> Command Source
Enable or disable general (i.e. not feed-specific) debug logs.