Posted By

softmechanics on 01/21/10

Last Edited at 01/21/10 04:00pm

Simple Broadcatcher in Haskell/HSH

Published in: Haskell

HSH is a cool Haskell library that lets you leverage your shell scripting prowess in Haskell programs. In this simple broadcatcher, I use curl for the HTTP GET and other standard Unix tools for tracking history (so we don't download the same file twice). The feed parsing and filtering are done in Haskell using the Text.Feed and Text.Regex libraries.

Note: if you decide to use this in real life, be sure to respect your feed's time to live (ttl) when you schedule it in your crontab, so that you don't poll the feed more often than it allows.
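The history tracking is done with sort, diff, sed and tee in the script below. As a rough illustration of what that step computes, here is a minimal pure-Haskell sketch of the same idea: keep only the links that are not yet in the history. The name `unseenLinks` and the example URLs are hypothetical, and unlike the shell pipeline this sketch also drops duplicate links.

```haskell
import Data.List (nub, sort)

-- Hypothetical helper (not part of the script): given the links already
-- recorded in the history and the links found in the feed, return the
-- feed links that have not been seen yet, sorted and without duplicates.
unseenLinks :: [String] -> [String] -> [String]
unseenLinks history links = filter (`notElem` history) (nub (sort links))

main :: IO ()
main = do
  let history = ["http://example.com/a.mp3"]
      links   = ["http://example.com/b.mp3", "http://example.com/a.mp3"]
  -- Prints only http://example.com/b.mp3
  mapM_ putStrLn (unseenLinks history links)
```

In the real script the result would also be appended to the history file, which is what `tee -a` does there.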

#!/usr/bin/env runhaskell
 
import Data.Char
import Data.List
import HSH
import Data.Maybe
import Text.Feed.Import
import Text.Feed.Query
import Text.Regex.Posix
 
-- CONFIGURATION --
dlDir = "/path/to/download/dir/"
historyFile = "/path/to/download/history.log"
 
any_patterns = ["some.*thing", "something.*else", "etc"]
all_patterns = ["every.*thing"]
none_patterns = ["some.*boring.*thing"]
 
feed_url = "http://my/feed.rss"
 
-- curl cli flags (see man curl)
curl_opts = ""
 
-- END CONFIGURATION --
 
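curl = "curl -s " ++ curl_opts
-- leading space keeps the url separate from any curl_opts
fetchFeed = curl ++ " \"" ++ feed_url ++ "\""
-- -n 1: one curl per url, since -O only applies to a single url
fetchFiles = "(cd " ++ dlDir ++ " && xargs -r -n 1 " ++ curl ++ " -O)"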
 
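-- compose a one-argument function f after a two-argument function g
withCurry f g           = curry $ f . uncurry g
-- one Bool per pattern: does the title match it?
matches patterns title  = map (\p -> title =~ p :: Bool) patterns
match_any               = any id `withCurry` matches
match_all               = all id `withCurry` matches
match_none              = all not `withCurry` matches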
 
filters = [match_any any_patterns, match_all all_patterns, match_none none_patterns]
 
-- filter using a list of predicates
allPreds fs      = flip all fs . flip ($)
 
filterSubscriptions lines =
  case parseFeedString $ unlines lines of
       Just feed  -> map link $ doFilter $ mapMaybe titleAndLink (getFeedItems feed)
       Nothing    -> error "feed parse failed"
  where title (x, _) = x
        link (_,x) = x
        titleAndLink item = do title <- getItemTitle item
                               link  <- getItemLink item
                               return (title, link)
        doFilter          = filter (allPreds filters . map toLower . title)
 
-- keep only the links not already in the history file, and append them to it
checkHistory = "bash -c \"sort | diff <(sort " ++ historyFile ++ ") - | sed -n 's/^> //p' | tee -a " ++ historyFile ++ "\""
 
test = runIO $ "cat /tmp/feed.xml" -|- filterSubscriptions
main = runIO $ fetchFeed -|- "tee /tmp/feed.xml" -|- filterSubscriptions -|- checkHistory -|- fetchFiles
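The point-free filter combinators above can be hard to read at first. The following is a minimal sketch of the same any/all/none logic written with explicit arguments. It assumes plain substring matching (`isInfixOf`) in place of the regex test so that it runs with nothing beyond base; the camelCase names, `keep`, and the example titles are hypothetical.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Stand-in for the regex test: plain substring matching (an assumption,
-- chosen so the sketch needs no regex library).
matches :: [String] -> String -> [Bool]
matches patterns title = map (`isInfixOf` title) patterns

matchAny, matchAll, matchNone :: [String] -> String -> Bool
matchAny  ps = or . matches ps        -- at least one pattern matches
matchAll  ps = and . matches ps       -- every pattern matches
matchNone ps = not . or . matches ps  -- no pattern matches

-- True when every predicate in the list accepts x.
allPreds :: [a -> Bool] -> a -> Bool
allPreds fs x = all ($ x) fs

-- Hypothetical filter: lower-case the title, then apply all three checks.
keep :: String -> Bool
keep = allPreds [matchAny ["linux", "bsd"], matchAll ["iso"], matchNone ["beta"]]
     . map toLower

main :: IO ()
main = mapM_ (print . keep) ["Linux ISO release", "Linux ISO beta", "Windows ISO"]
-- prints True, False, False (one per line)
```

One consequence worth knowing: with an empty pattern list, the "any" check is always False, so leaving `any_patterns` empty in the script would filter out every item, while empty `all_patterns` or `none_patterns` lists accept everything.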
