INSTALLATION:
Please read README.TXT first to familiarize yourself with the
binaries (shell scripts) in this software package. Start by getting
the basic command-line clients working. The Troubleshooting section
of the home page (doc/index.html in this distribution; identical to
the ParsCit home page at http://wing.comp.nus.edu.sg/parsCit/#t) also
contains important notes.
Please note: The software currently has no automated installation
package. We expect a UNIX-knowledgeable person to be able to get the
installation working within 30 minutes to an hour.
------------------------------------------------------------
**Prerequisites**
You must have Ruby, Perl, and a working C++ compiler on your local
machine.
ParsCit uses the following Perl libraries, available from CPAN:
- Class::Struct
- Getopt::Long
- Getopt::Std
- File::Basename
- File::Spec
- FindBin
- HTML::Entities
- IO::File
- POSIX
- XML::Parser
- XML::Twig
- XML::Writer
- XML::Writer::String
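The list above can be checked before going any further. The following
is a minimal sketch, assuming perl is on your PATH; the function name
missing_perl_modules is made up for this example and is not part of
ParsCit. It prints the name of each module that perl fails to load and
prints nothing when all of them are available.

```shell
# Print the name of every module that the perl on PATH cannot load.
# (missing_perl_modules is a hypothetical helper, not shipped with ParsCit.)
missing_perl_modules() {
    for module in "$@"; do
        # "perl -MSome::Module -e 1" exits non-zero if the module is absent
        perl -M"$module" -e 1 >/dev/null 2>&1 || printf '%s\n' "$module"
    done
}

missing_perl_modules \
    Class::Struct Getopt::Long Getopt::Std File::Basename File::Spec \
    FindBin HTML::Entities IO::File POSIX XML::Parser XML::Twig \
    XML::Writer XML::Writer::String
```

Any module it reports still needs to be installed from CPAN.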
------------------------------------------------------------
**Command-line client**
There are two command-line scripts: citeExtract.pl and
parseRefStrings.pl. The second provides a subset of the first's
functionality and should be the first one you get running.
1) You will first need to rebuild the CRF++ package and place the
binaries in crfpp/. We are using CRF++ version 0.51. CRF++, written
by Taku Kudo, is the core conditional random field learner and is
redistributed with ParsCit. To rebuild it, follow these instructions:
$ cd crfpp
# rebuild CRF++ 0.51 from the bundled tarball
$ rm -Rf CRF++-0.51
$ tar -xvzf CRF++-0.51.tar.gz
$ cd CRF++-0.51
$ ./configure
# important! the first "make" command fails for me (not sure why),
# so it is followed by "make clean" and a second "make"
$ make
$ make clean
$ make
# optional: install the CRF++ binaries system-wide (with sudo or as root) so that other applications can use them. Thanks to Priya Venkateshan for this.
$ sudo make install
# by this point you should have a successful build of both crf_test and crf_learn
# move executables to where parscit expects to find them
$ cp crf_learn crf_test ..
# on Windows you may have to do this instead, as the executables are named with .exe
$ copy crf_learn.exe ../crf_learn
$ copy crf_test.exe ../crf_test
$ cd .libs
$ cp -Rf * ../../.libs
2) Once the binaries are placed properly, you may need to edit the
lib/ParsCit/Config.pm file to point to the proper directories on your
machine.
3) Edit the shebang lines (first line) of the scripts in the bin/
directory to point to the proper version of perl.
4) Try running parseRefStrings.pl and citeExtract.pl on the .txt data
found in demodata. They should produce the same results as the
.out files. In the bin/ directory, try:
./citeExtract.pl -m extract_all ../demodata/sample2.txt
./citeExtract.pl -i xml -m extract_all ../demodata/E06-1050.xml
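Steps 1 to 4 above can be sanity-checked with a few small shell
functions. This is only a sketch: the function names are made up for
this example and are not part of ParsCit, and the directory and file
arguments are whatever applies to your checkout.

```shell
# Report whether crf_learn and crf_test exist and are executable in a directory.
# Returns non-zero if either one is missing.
check_crf_binaries() {
    dir=$1
    status=0
    for name in crf_learn crf_test; do
        if [ -x "$dir/$name" ]; then
            echo "ok: $dir/$name"
        else
            echo "missing or not executable: $dir/$name"
            status=1
        fi
    done
    return "$status"
}

# Print the first line (the shebang) of every .pl script in a directory.
print_shebangs() {
    for script in "$1"/*.pl; do
        [ -f "$script" ] || continue
        printf '%s: %s\n' "$script" "$(head -n 1 "$script")"
    done
}

# Compare a freshly generated output file with a stored reference file.
matches_reference() {
    if diff "$1" "$2" >/dev/null; then
        echo "same"
    else
        echo "different"
    fi
}
```

From the ParsCit top-level directory, "check_crf_binaries crfpp" covers
step 1 and "print_shebangs bin" covers step 3. For step 4, save the
output of a demo run to a file (assuming the script prints its result
to standard output) and pass that file and the corresponding .out file
to matches_reference.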
------------------------------------------------------------
**Web Service**
These are the remaining parts of the system. There are two sets of
client/server pairs (Perl ones tested at IST and Ruby ones tested at
NUS). You can use either set; the installation does not require both.
1) In order to use the ParsCit web service you will need the following
modules in your Perl library:
Log::Log4perl
Log::Dispatch
2) Edit lib/ParsCit/Config.pm to provide values appropriate for your
environment. Also edit wsdl/ParsCit.wsdl to reflect any changes to
your service URL.