So I realize my previous post was less descriptive than it could’ve been. And it still will be, but here is the story.
The “project” I’m working on is to open a Fiddler session archive — that tool captures HTTP traffic, and can save the results…
The session archive I am interested in holds Tracer results from PegaSystems PRPC tools (PegaRules Process Commander): a series of XML responses from the server, requested by the IE window open on my desktop.
So unless you’re extracting XML from a Fiddler session archive, this code won’t, by itself, be meaningful to you, but a smart programmer could adapt some of this for various purposes. And, hopefully, the concepts may be interesting.
That said, let’s get back to the code:
I would’ve liked to replace iterEltFn with something like map, but the SingleNodeIterator won’t coerce to a sequence… Someone want to wrap it for me, I’d take that.
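For what it's worth, here is a minimal sketch of such a wrapper. It assumes only that the iterator exposes hasMoreNodes and nextNode, as the code in this post does. NodeSource and fake-iterator are hypothetical stand-ins so the sketch runs without the parser library, and node-seq is a name made up for illustration.

```clojure
;; Hypothetical stand-in for the iterator's interface, so this runs on its own.
(definterface NodeSource
  (hasMoreNodes [])
  (nextNode []))

(defn fake-iterator
  "Builds a stateful NodeSource over items (demo only)."
  [items]
  (let [remaining (atom (seq items))]
    (reify NodeSource
      (hasMoreNodes [_] (boolean (seq @remaining)))
      (nextNode [_]
        (let [x (first @remaining)]
          (swap! remaining rest)
          x)))))

(defn node-seq
  "Lazily walks any object that has hasMoreNodes/nextNode methods."
  [it]
  (lazy-seq
    (when (.hasMoreNodes it)
      (cons (.nextNode it) (node-seq it)))))

;; With a seq in hand, map and friends work as usual.
(println (map count (node-seq (fake-iterator ["a" "bb" "ccc"]))))
```

The underlying iterator is stateful, so the resulting seq should be built once per iterator and not shared with code that also calls nextNode directly.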
The main function, parseIndex, has changed to just this:
(save_xml_doc (iter2addXml nil (.elements t)) "myResults" nil)
The function iter2addXml returns the combined xml document.
(defn iter2addXml
  "If the SingleNodeIterator [sni] has more nodes, call_readXml on nextNodes,
  then if doc is nil, add_events to a new_xml_result with the contents,
  else add_events to the existing doc
  otherwise (sni has no more nodes), return the doc as-is."
  [d sni]
  (if (.hasMoreNodes sni)
    (let [n (.nextNode sni)
          nxt (call_readXml n)
          i 0]
      (if (nil? nxt)
        (recur d sni)
        (let [d2 (if (nil? d)
                   (new_xml_result)
                   d)
              c (:content nxt)
              n (add_events d2 c)]
          ; here we will do size checking things
          (recur n sni))))
    ;; else (no more nodes) just return the doc itself, we're done
    d))
(I realize the code highlighter I’d used gets messed up by WordPress.) The logic is: if there are more nodes, take the next one and run call_readXml on it. If the result is nil, recur without it; otherwise add_events to either a new document (via new_xml_result) or the existing doc, then recur. If there are no more nodes, return the existing doc.
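The same accumulate-or-skip logic can also be written as a reduce, assuming the nodes are already available as an ordinary sequence. This is only a sketch: combine-docs is a made-up name, and the three function arguments are stand-ins for call_readXml, new_xml_result and add_events so that the example runs by itself.

```clojure
(defn combine-docs
  "Folds nodes into one doc: skips nodes that read as nil, creates the doc
  on first use, and returns nil if nothing was readable."
  [read-node new-doc add-content nodes]
  (reduce (fn [doc node]
            (if-let [parsed (read-node node)]
              (add-content (or doc (new-doc)) (:content parsed))
              doc))
          nil
          nodes))

;; Demo with toy stand-ins: odd numbers play the part of unreadable nodes.
(def result
  (combine-docs
    (fn [n] (when (even? n) {:content [n]}))
    (fn [] {:content []})
    (fn [doc c] (update-in doc [:content] into c))
    [1 2 3 4]))

(println result)
```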
The function call_readXml is, in my opinion, a somewhat elegant “one-liner.”
(defn call_readXml [e] (readXmlFile (uri_to_filename (.. e (getFirstChild) (getChildren) (extractAllNodesThatMatch (new HasChildFilter (new StringFilter "S"))) (elementAt 0)(extractLink)))))
Granted, most of this is basically a string of Java method calls (“e.getFirstChild().getChildren().extractAllNodesThatMatch(...).elementAt(0).extractLink()”); the result is then passed to uri_to_filename, and that result to readXmlFile.
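To show what the .. macro is doing in that chain, here is a small sketch using plain java.lang.String methods as stand-ins (the parser classes aren't needed to see the shape). The -> form is an alternative spelling of the same chain, not something the code above uses.

```clojure
;; (.. x (a) (b arg)) reads like the Java chain x.a().b(arg)
(def with-dotdot
  (.. "  Hello  " (trim) (toUpperCase) (substring 1)))

;; The same chain written with the threading macro.
(def with-thread
  (-> "  Hello  " .trim .toUpperCase (.substring 1)))

(println with-dotdot with-thread)
```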
;; read an XML file; more to do here
;; N.B. "fn" as a param means "filename" not "function"
(defn readXmlFile [fn]
  ;; read the filename
  (let [buffRd (new java.io.BufferedReader (new java.io.FileReader fn))
        lns (slurp buffRd)]
    ;; if the file contents (lns = lines [of the file])
    ;; starts with an HTTP <<status>>, continue
    (if (. lns startsWith "HTTP")
      ;; split the next line (the hard way), find the whitespace of this (the status) line;
      ;; what remains is the "status," number and text;
      ;; and the linebreak of the next line
      (let [lnbrk (.indexOf lns "\n")
            dt (.substring lns (+ lnbrk 1))
            idxSp (.indexOf lns " ")
            status (. lns substring (+ 1 idxSp) (- lnbrk 1))
            eol (.indexOf dt "\n")]
        ; (println "HTTP status:" (. lns substring (+ 1 idxSp) (- lnbrk 1 )))
        (if (not (= status "302 Found"))
          ;; call the hdr function…
          (let [[hdr r2] (hdr (struct-map header
                                :status status
                                :date (.substring dt 6 (- eol 1)))
                              (.substring dt (+ eol 1)))]
            (println "header is" hdr)
            ;; then parse the remaining data from the file
            (clojure.xml/parse (new java.io.ByteArrayInputStream (.getBytes r2))))
          ;; if this was a "302 Found" status, we’re dealing with a VIP exchange,
          ;; there’s nothing useful here…
          (println "Location is:" (. dt substring 10 eol)))))))
Similar to the previous version, but condensed twice. I could do more, like “(slurp (buffRd (new java.io.BufferedReader…”, or use duck-streams; I’m just thinking out loud.
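Still thinking out loud: one possible further condensation is to let a regular expression split off the status line instead of the indexOf/substring arithmetic. The sketch below is an assumption about how that might look, not what the code above does; split-status is a made-up name, and it works on a string that has already been read (slurp will also take a file name directly, which would drop the reader pair).

```clojure
(defn split-status
  "Returns [status remainder] when raw starts with an HTTP status line,
  otherwise nil. Tolerates both \\r\\n and \\n line endings."
  [raw]
  (when-let [[_ status remainder]
             (re-find #"(?s)\AHTTP/\S+ ([^\r\n]*)\r?\n(.*)" raw)]
    [status remainder]))

(println (split-status "HTTP/1.1 200 OK\r\nDate: today\r\n\r\n<a/>"))
(println (split-status "<a/>"))
```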
And the hdr function has been re-structured for easier reading:
(defn hdr [sm str]
  (let [[ln rem] (nlr str)]
    (cond
      (.startsWith ln "Cache-Control:")
      (recur (add_val :control (.substring ln 15) sm) rem)

      (.startsWith ln "Keep-Alive:")
      (recur (add_val :keep-alive (.substring ln 12) sm) rem)

      (.startsWith ln "Content-Type:")
      (recur (add_val :type (.substring ln 14) sm) rem)

      (.startsWith ln "Connection:")
      (recur (add_val :connection (.substring ln 12) sm) rem)

      (.startsWith ln "Content-Length:")
      (recur (add_val :length (.substring ln 16) sm) rem)

      (.startsWith ln "Content-Language:")
      (recur (add_val :language (.substring ln 18) sm) rem)

      ;; blank line: headers are done, return the map and the remaining text
      (= 0 (count (.trim ln)))
      (list sm rem)

      true
      (println "no match and length wasn't zero (i.e. unknown, non-blank line):" (count ln) "'" ln "'"))))
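Since every branch follows the same "prefix, offset, key" pattern, the header names could live in a lookup table instead. The following is a sketch of that idea under a few assumptions: it takes a sequence of lines where hdr takes a string plus nlr, it uses a plain map in place of the struct and add_val, and it silently skips unknown headers where hdr prints a message. header-keys, parse-header-line and parse-headers are made-up names.

```clojure
(def header-keys
  {"Cache-Control"    :control
   "Keep-Alive"       :keep-alive
   "Content-Type"     :type
   "Connection"       :connection
   "Content-Length"   :length
   "Content-Language" :language})

(defn parse-header-line
  "Returns [key value] for a recognised 'Name: value' line, else nil."
  [line]
  (when-let [[_ hname value] (re-matches #"([^:]+):\s*(.*)" line)]
    (when-let [k (header-keys hname)]
      [k value])))

(defn parse-headers
  "Consumes lines up to the first blank one.
  Returns [header-map remaining-lines]."
  [lines]
  (loop [acc {} [ln & more] lines]
    (cond
      (nil? ln)           [acc nil]
      (empty? (.trim ln)) [acc more]
      :else (recur (if-let [[k v] (parse-header-line (.trim ln))]
                     (assoc acc k v)
                     acc)
                   more))))

(println (parse-headers ["Content-Type: text/xml" "Content-Length: 42" "" "<a/>"]))
```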
This is a new function:
(defn add_events [d events] ; add these trace events (contents) to the xml doc
  ;; debug: (println "[" (type events) "]")
  (if (nil? events)
    d ; if events was nil, just return the doc unchanged
    ;; otherwise append the events and bump the :count attribute
    (let [a (:attrs d)
          r (reduce (fn [v i] (conj v i)) (or (:content d) []) events)]
      (assoc d :content r
               :attrs (assoc a :count (+ (:count a) (count events)))))))
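A more compact variant might lean on into and update-in. This is a sketch rather than a drop-in replacement: add-events* is a made-up name, fnil needs a reasonably recent Clojure, and it treats a missing :count as 0, which is an assumption about what new_xml_result provides.

```clojure
(defn add-events*
  "Appends events to the doc's :content and adds their number to
  [:attrs :count]. Returns doc unchanged when there are no events."
  [doc events]
  (if (empty? events)
    doc
    (-> doc
        (update-in [:content] (fnil into []) events)
        (update-in [:attrs :count] (fnil + 0) (count events)))))

(println (add-events* {:tag :results :attrs {:count 1} :content [:a]}
                      [:b :c]))
```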
I realize I’m writing far too much code here. Sorry.
I did wind up re-writing the XML emitting code, but that’s for another day.