I've been playing around with Clojure a lot recently. Here is a little example of adding parallelism to your Clojure code.
Let's say you want to get some weather data from Weatherbug.
We define a function raw-weatherdata-from-zipcode that will:
- construct a URL string from the zipcode parameter (and my API key, which I've left out)
- parse that data into an XML structure
- coerce the XML structure into a sequence
```clojure
(require '[clojure.xml :as cxml])

;; apikey is defined elsewhere (left out of the post)
(defn raw-weatherdata-from-zipcode [^String zipcode]
  (-> (str "http://api.wxbug.net/getLiveCompactWeatherRSS.aspx?ACode=" apikey
           "&zipcode=" zipcode "&unittype=0")
      cxml/parse
      xml-seq))
```
Now that we have a sequence, we can easily filter out the XML elements we want via their tag:
```clojure
(defn struct-from-weathertag [tag xs]
  (first (filter #(= tag (:tag %)) xs)))
```
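To see why filtering on :tag works, here is a minimal sketch of what the sequence looks like, using a small hand-written XML string in place of a live Weatherbug response (the station name and condition here are made up for illustration):

```clojure
(require '[clojure.xml :as cxml])

(defn struct-from-weathertag [tag xs]
  (first (filter #(= tag (:tag %)) xs)))

(def sample-xml
  "<aws:weather><aws:station name=\"Example Station\"/><aws:current-condition>Sunny</aws:current-condition></aws:weather>")

(def parsed
  (xml-seq (cxml/parse (java.io.ByteArrayInputStream. (.getBytes sample-xml)))))

;; xml-seq yields every element as a map of :tag, :attrs, and :content,
;; so filtering on :tag picks out the element we want:
(struct-from-weathertag :aws:station parsed)
;; => {:tag :aws:station, :attrs {:name "Example Station"}, :content nil}
```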
Putting it all together in a function:
```clojure
(defn city-condition-by-zip [^String zipcode]
  (let [weather (raw-weatherdata-from-zipcode zipcode)
        station (struct-from-weathertag :aws:station weather)
        cc      (struct-from-weathertag :aws:current-condition weather)]
    (str (-> station :attrs :name) ": " (-> cc :content first))))
```
We can now get the weather for individual zipcodes:
```clojure
user=> (city-condition-by-zip "01602")
"Worcester Regional Airport: Rain Showers"
```
But what if we need to get the weather for 100 zipcodes? We can do that simply with the map function; so, if we wanted to get the weather conditions for 100 of the same zipcode sequentially, we could:

```clojure
(time (let [result (map city-condition-by-zip (repeat 100 "01602"))]
        (println result)))
...
"Elapsed time: 6287.561 msecs"
```
That doesn't seem terribly efficient, because each result is independent. It would certainly take a lot of code to parallelize this process, right? Nope. One measly letter gets us to where we need to be:

```clojure
(time (let [result (pmap city-condition-by-zip (repeat 100 "01602"))]
        (println result)))
...
"Elapsed time: 857.029 msecs"
```
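The win comes from pmap computing elements in parallel on futures, running ahead of consumption; it pays off when the per-element work (here, an HTTP round-trip) dwarfs the coordination overhead. A toy sketch of the same effect, assuming a hypothetical slow-io stand-in for the real request that just sleeps for 100 ms:

```clojure
;; slow-io is a stand-in for city-condition-by-zip: each "request"
;; sleeps 100 ms and returns its input unchanged.
(defn slow-io [x]
  (Thread/sleep 100)
  x)

;; Sequential: 20 calls run one after another.
;; (doall forces the lazy sequence so the work happens inside time.)
(time (doall (map slow-io (range 20))))

;; Parallel: pmap stays ahead of the consumer by roughly
;; (+ 2 available-processors) in-flight futures.
(time (doall (pmap slow-io (range 20))))
```

Note that both map and pmap are lazy, which is why the original example forces the result with println (and this sketch with doall) before timing means anything.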