Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


named entity recognition - Stanford NER tagger generates 'file not found' exception with provided models

I downloaded Stanford NER 3.4.1, unpacked it, and tried to run named entity recognition on a local file using the default (provided) trained model. I got this exception:

 `java.io.FileNotFoundException: /u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters (No such file or directory) at edu.stanford.nlp.io.IOUtils.inputStreamFromFile(IOUtils.java:481)`

What's wrong and how can I fix it?



1 Reply


It turns out that the provided models use "distributional similarity features", which require a .clusters file at a location recorded inside the compressed model file, so the path is tricky to change. If you're on the Stanford network, the required files are presumably there. If not, I found two options:

  1. Download Stanford NER without the distributional similarity features (slightly degrades performance, but runs faster). Disclaimer: I haven't actually tried this, but it should work.
  2. Download the distsim file (look here) from Stanford and create a symlink to it so that it appears to be in the expected location. In my case, on a Mac, I did this:
    • created a hierarchy of folders u/nlp/data/pos_tags_are_useless/ somewhere,
    • copied the downloaded egw4-reut.512.clusters file there,
    • then ran `cd /; sudo ln -s <somewhere>/u`.
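
The three steps of option 2 can be sketched as a small shell function. This is only an illustration of the idea, not a tested installer: the function name `link_clusters` and its three arguments are hypothetical, and the `root` argument exists only so the steps can be tried in a scratch directory before touching `/`.

```shell
# Sketch: make the clusters file visible at the path recorded in the model.
# link_clusters and its argument names are hypothetical, chosen for illustration.
link_clusters() {
    clusters_file=$1   # path to the downloaded egw4-reut.512.clusters
    staging=$2         # any writable directory ("<somewhere>" in the answer)
    root=$3            # "/" for real use; a scratch directory for a dry run

    link_path="${root%/}/u"

    # Refuse to touch an existing /u (or <root>/u) rather than nest a link in it.
    if [ -e "$link_path" ] || [ -L "$link_path" ]; then
        echo "refusing to overwrite existing $link_path" >&2
        return 1
    fi

    # Recreate the directory hierarchy the model expects and copy the file in.
    mkdir -p "$staging/u/nlp/data/pos_tags_are_useless" || return 1
    cp "$clusters_file" "$staging/u/nlp/data/pos_tags_are_useless/" || return 1

    # Point <root>/u at the staged tree (needs root privileges when root is "/").
    ln -s "$staging/u" "$link_path"
}
```

Calling it with a scratch directory as the third argument shows the resulting layout without needing sudo. With `/` as the third argument, the final `ln -s` needs elevated privileges, as in the original command.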
