EOF, System.in, Java Streams, and the Windows Command Line
What is EOF? For a file, once every character has been read, the next read returns EOF, which is numerically -1. Take the following Clojure code, for example:
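(require 'clojure.string)

;; A PushbackReader over the bytes of "hello\n", with a 100-char pushback buffer
(def pbr
  (java.io.PushbackReader.
   (java.io.InputStreamReader.
    (java.io.ByteArrayInputStream. (.getBytes "hello\n")))
   100))

;; Read until .read returns -1; chars are prepended to acc, so reverse at the end
(def foo
  (clojure.string/reverse
   (loop [ch (.read pbr) acc ""]
     (if (= ch -1)
       acc
       (recur (.read pbr) (str (char ch) acc))))))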
Here EOF serves as the loop's termination condition. When processing a stream, the two most important markers are the newline character and EOF.
What if the stream is System.in instead? Under what circumstances do we read EOF? A typical example is the Clojure REPL.
If we read from the Clojure REPL's standard input, we find that EOF never arrives:
(defn is-char-letter [ch]
  (if (or (and (>= ch (int \a)) (<= ch (int \z)))
          (and (>= ch (int \A)) (<= ch (int \Z))))
    true
    nil))

(defn auto-paren [s]
  (let [first-ch (.read s)]
    (.unread s first-ch) ;; push it back no matter what
    (println "first char is " (char first-ch))
    (if (= first-ch (int \())
      nil ;; do nothing in this case
      (if (is-char-letter first-ch) ;; else if it starts with a-z A-Z
        ;; make sure the buffer holds only one line, then read the whole line,
        ;; wrap it in parentheses, and push it back
        (let [command-line-no-paren
              (clojure.string/reverse
               (loop [ch (.read s) acc ""]
                 (println "a char from System.in" (if (= ch -1) "-1" (char ch)) " the number is" ch)
                 (if (= ch -1)
                   acc
                   (recur (.read s) (str (char ch) acc)))))
              _ (println "command-line-no-paren is " command-line-no-paren)]
          (.unread s (char-array (str "(" command-line-no-paren ")"))))))))

(defn skip-whitespace
  "Skips whitespace characters on stream s. Returns :line-start, :stream-end,
  or :body to indicate the relative location of the next character on s.
  Interprets comma as whitespace and semicolon as comment to end of line.
  Does not interpret #! as comment to end of line because only one
  character of lookahead is available. The stream must either be an
  instance of LineNumberingPushbackReader or duplicate its behavior of both
  supporting .unread and collapsing all of CR, LF, and CRLF to a single
  \\newline."
  [s]
  (loop [c (.read s)]
    (cond
      (= c (int \newline)) :line-start
      (= c -1) :stream-end
      (= c (int \;)) (do (.readLine s) :line-start)
      (or (Character/isWhitespace (char c)) (= c (int \,))) (recur (.read s))
      :else (do (.unread s c) (auto-paren s) :body))))
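In fact, once all the available characters have been read from the input, the read method blocks instead of returning -1. After each expression finishes evaluating, the REPL sits in exactly this blocked state.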
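The difference between "no more data yet" and "end of stream" can be reproduced without a console. The following is only a minimal sketch, assuming a PipedInputStream is a fair stand-in for System.in; the names pout, pin, and pending are made up for this illustration. While the writing end is open, a read with nothing buffered just waits, and -1 shows up only after the writing end is closed.

```clojure
(import '(java.io PipedInputStream PipedOutputStream))

;; Assumption: a pipe stands in for System.in.
;; `pout` plays the keyboard, `pin` is the stream we read from.
(def pout (PipedOutputStream.))
(def pin (PipedInputStream. pout))

(.write pout (.getBytes "hi"))
(.flush pout)

(def first-byte (.read pin))   ; the byte for \h
(def second-byte (.read pin))  ; the byte for \i

;; Nothing is buffered now and the writer is still open, so this read
;; blocks. It runs on another thread so the current one stays free.
(def pending (future (.read pin)))
(def still-blocked? (= ::timeout (deref pending 200 ::timeout)))

;; Closing the writing end is what finally lets the read return -1.
(.close pout)
(def eof-value @pending)
```

Here still-blocked? records that the third read had not returned after 200 ms, and eof-value holds what that same read returned once the writer was closed.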
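Pressing Ctrl + C exits the JVM outright; that was the first thing I thought of. The actual way to send EOF on Windows is Ctrl + Z (followed by Enter), which corresponds to Ctrl + D on Linux.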
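To try the key sequence out, a loop like the one at the top of this post can be pointed at any Reader. This is a rough sketch rather than anything taken from the REPL source; read-until-eof and from-string are hypothetical names introduced here.

```clojure
(import '(java.io Reader StringReader InputStreamReader))

(defn read-until-eof
  "Reads characters from rdr until .read returns -1, then returns them as a string."
  [^Reader rdr]
  (let [sb (StringBuilder.)]
    (loop [ch (.read rdr)]
      (if (= ch -1)
        (str sb)
        (do (.append sb (char ch))
            (recur (.read rdr)))))))

;; A StringReader ends by itself, so EOF arrives without any key press.
(def from-string (read-until-eof (StringReader. "hello\n")))

;; With the console, the call below only returns once the EOF key
;; sequence has been typed:
;; (read-until-eof (InputStreamReader. System/in))
```

The commented-out call is best run from a standalone script rather than inside a REPL, since the REPL is itself reading from System.in.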