jStyleParser: assignDOM never returns

While running a program built on jStyleParser, I noticed abnormally high JVM memory usage. The logs showed that for certain URLs, a call to CSSFactory.assignDOM never returns.

The following code reproduces the problem.

 
package test;
 
import java.net.MalformedURLException;
import java.net.URL;
 
import org.w3c.dom.Document;
 
import cz.vutbr.web.css.CSSFactory;
import cz.vutbr.web.domassign.StyleMap;
 
public class TestAssignDOM {
    public static void main(String[] args) {
 
        String url = "http://jayconrod.com/posts/55/a-tour-of-v8-garbage-collection";
 
        // Download the page source
        String postContent = DownloadPage.DownloadPage(url);
 
        // System.out.println(postContent);
 
        // Build the Document object
        Document doc = MatchSelectorTest.GetDoc(postContent);
 
        URL urlBase;
        try {
            urlBase = new URL(url);
            StyleMap decl = CSSFactory.assignDOM(doc, urlBase, "screen", true);
            System.out.println("returned");
        } catch (MalformedURLException e) {
            // The URL is hard-coded and well-formed, so this is not expected to happen
            e.printStackTrace();
        }
 
    }
}
 
 

Enabling jStyleParser's log output produces the following:

 
22:06:46.347 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - FILE: "css/main.css"
22:06:46.347 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - Will import file "css/main.css" with media: 
22:06:46.347 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - BASE: http://jayconrod.com/posts/55/a-tour-of-v8-garbage-collection
22:06:46.347 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - Actually, will try to import file "http://jayconrod.com/posts/55/css/main.css"
22:06:47.944 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - File "http://jayconrod.com/posts/55/css/main.css" was imported.
22:06:47.945 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - ANTLR: line 2:2 mismatched character 'D' expecting '-'
22:06:47.948 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - FILE: "css/main.css"
22:06:47.948 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - Will import file "css/main.css" with media: 
22:06:47.948 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - BASE: http://jayconrod.com/posts/55/css/main.css
22:06:47.948 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - Actually, will try to import file "http://jayconrod.com/posts/55/css/css/main.css"
22:06:50.376 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - File "http://jayconrod.com/posts/55/css/css/main.css" was imported.
22:06:50.376 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - ANTLR: line 2:2 mismatched character 'D' expecting '-'
22:06:50.379 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - FILE: "css/main.css"
22:06:50.379 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - Will import file "css/main.css" with media: 
22:06:50.379 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - BASE: http://jayconrod.com/posts/55/css/css/main.css
22:06:50.379 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - Actually, will try to import file "http://jayconrod.com/posts/55/css/css/css/main.css"
22:06:52.194 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - File "http://jayconrod.com/posts/55/css/css/css/main.css" was imported.
22:06:52.194 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - ANTLR: line 2:2 mismatched character 'D' expecting '-'
22:06:52.197 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - FILE: "css/main.css"
22:06:52.197 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - Will import file "css/main.css" with media: 
22:06:52.197 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - BASE: http://jayconrod.com/posts/55/css/css/css/main.css
22:06:52.197 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - Actually, will try to import file "http://jayconrod.com/posts/55/css/css/css/css/main.css"
22:06:54.282 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - File "http://jayconrod.com/posts/55/css/css/css/css/main.css" was imported.
22:06:54.283 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - ANTLR: line 2:2 mismatched character 'D' expecting '-'
22:06:54.285 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - FILE: "css/main.css"
22:06:54.285 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - Will import file "css/main.css" with media: 
22:06:54.285 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - BASE: http://jayconrod.com/posts/55/css/css/css/css/main.css
22:06:54.286 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - Actually, will try to import file "http://jayconrod.com/posts/55/css/css/css/css/css/main.css"
22:06:55.489 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - File "http://jayconrod.com/posts/55/css/css/css/css/css/main.css" was imported.
22:06:55.489 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - ANTLR: line 2:2 mismatched character 'D' expecting '-'
22:06:55.491 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - FILE: "css/main.css"
22:06:55.491 [main] INFO  cz.vutbr.web.csskit.antlr.CSSLexer - Will import file "css/main.css" with media: 
22:06:55.491 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - BASE: http://jayconrod.com/posts/55/css/css/css/css/css/main.css
22:06:55.491 [main] DEBUG cz.vutbr.web.csskit.antlr.CSSLexer - Actually, will try to import file "http://jayconrod.com/posts/55/css/css/css/css/css/css/main.css"
...
 

The relative-path resolution has fallen into an infinite loop. The page source is as follows.

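The URL is also unusual: as long as it starts with http://jayconrod.com/posts/55/, the server returns the same page no matter how many path levels follow, so every such page references the relative path "css/main.css". Each of those references in turn resolves to a page that references the same relative path again, and the recursion never ends. The longer the program runs, the longer the URL string grows and the more memory is used.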
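
The growth of the URL can be reproduced without jStyleParser at all. The sketch below is only an illustration: it uses the standard java.net.URI.resolve to resolve "css/main.css" against the previous result over and over, which is the same pattern the BASE / "will try to import" pairs in the log show. The class name RelativeImportDemo, the method resolveChain, and the maxDepth cap are names made up for this example; they are not part of jStyleParser, and the cap is just one possible way a caller could bound this kind of recursion.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class RelativeImportDemo {

    /**
     * Resolves `relative` against `base`, then against the result, and so on,
     * stopping after `maxDepth` steps. Without the cap the loop would never
     * end, because every resolved URL is new and longer than the last one.
     * (maxDepth is a hypothetical guard, not a jStyleParser option.)
     */
    static List<String> resolveChain(String base, String relative, int maxDepth) {
        List<String> chain = new ArrayList<>();
        URI current = URI.create(base);
        for (int depth = 0; depth < maxDepth; depth++) {
            // The last path segment of the base is dropped and the
            // relative path is appended in its place.
            current = current.resolve(relative);
            chain.add(current.toString());
        }
        return chain;
    }

    public static void main(String[] args) {
        String base = "http://jayconrod.com/posts/55/a-tour-of-v8-garbage-collection";
        for (String url : resolveChain(base, "css/main.css", 3)) {
            System.out.println(url);
        }
        // Prints:
        // http://jayconrod.com/posts/55/css/main.css
        // http://jayconrod.com/posts/55/css/css/main.css
        // http://jayconrod.com/posts/55/css/css/css/main.css
    }
}
```

Note that a "have I seen this URL before" check would not stop the loop here, since every resolved URL is different from all the earlier ones; a depth limit of some kind is what bounds it.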