-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathinstall.html
199 lines (186 loc) · 13.6 KB
/
install.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<meta name="author" content="October 18, 2018" />
<title>Savio installation training</title>
<style type="text/css">code{white-space: pre;}</style>
</head>
<body>
<div id="header">
<h1 class="title">Savio installation training</h1>
<h2 class="author">October 18, 2018</h2>
</div>
<h1 id="introduction">Introduction</h1>
<p>We'll spend about 30 minutes going through this material as a demonstration. I encourage you to login to your account and try out the various examples yourself as we go through them. The remainder of the scheduled time will be open time for you to work on your installations with each other and with help from us.</p>
<p>Some of this material is based on the extensive Savio documention we have prepared and continue to prepare, available at <a href="http://research-it.berkeley.edu/services/high-performance-computing/user-guide" class="uri">http://research-it.berkeley.edu/services/high-performance-computing/user-guide</a>. In particular we have some detailed installation instructions under the "Installing your own" tab at <a href="http://research-it.berkeley.edu/services/high-performance-computing/accessing-and-installing-software" class="uri">http://research-it.berkeley.edu/services/high-performance-computing/accessing-and-installing-software</a>.</p>
<p>The materials for this tutorial are available using git at <a href="https://github.com/ucberkeley/savio-training-install-2018" class="uri">https://github.com/ucberkeley/savio-training-install-2018</a> or simply as a <a href="https://github.com/ucberkeley/savio-training-install-2018/archive/master.zip">zip file</a>.</p>
<p>These <em>install.html</em> and <em>install_slides.html</em> files were created from <em>install.md</em> by running 'make' within the repository directory (see <em>Makefile</em> for details on how that creates the html files).</p>
<p>Please see this <a href="https://github.com/ucberkeley/savio-training-intro-2018/archive/master.zip">zip file</a> for materials from our introductory training on September 18, including accessing Savio, data transfer, and basic job submission.</p>
<h1 id="outline">Outline</h1>
<p>This training session will cover the following topics:</p>
<ul>
<li>Software installation
<ul>
<li>Installing third-party software</li>
<li>Installing Python and R packages that rely on third-party software
<ul>
<li>Python example</li>
<li>R example</li>
</ul></li>
</ul></li>
<li>Hands-on software installation help</li>
</ul>
<h1 id="modules">Modules</h1>
<p>Recall that very little software is available except via the modules system.</p>
<p>Simple tools like gnuplot, newer versions of git, cmake, gcc, etc... are all available in the software module farm, not loaded by default in user environments.</p>
<h1 id="third-party-software-installation---overview">Third-party software installation - overview</h1>
<p>In general, third-party software will provide installation instructions on a webpage, Github README, or install file inside the package source code.</p>
<p>The key for installing on Savio is making sure everything gets installed in your own home, project, or scratch directory and making sure you have the packages on which the software depends on also installed by you or loaded from the Savio modules.</p>
<p>A common installation approach is the GNU build system (Autotools), which involves three steps: configure, make, and make install.</p>
<ul>
<li><em>configure</em>: this queries your system to find out what tools (e.g., compilers and other packages) you have available to use in building and using the software</li>
<li><em>make</em>: this compiles the source code in the software package</li>
<li><em>make install</em>: this moves the compiled code (library files and executables) and header files and the like to their permanent home</li>
</ul>
<h1 id="installation-logistics">Installation logistics</h1>
<p>We recommend you set up a directory structure that mimics how we install software on Savio. We'll assume you are installing in a group directory, but you could equally well do this in your home directory or even on scratch (albeit with the danger the installation might be purged).</p>
<p>If you haven't already set up the directory structure for your group, here are the steps:</p>
<pre><code>my_group=co_stat
cd /global/home/groups/${my_group}
mkdir software
mkdir software/sources
mkdir software/modules
mkdir software/scripts
mkdir software/modfiles
chmod -R g+rwX software</code></pre>
<p>However, you can manage your directories as you like; the above is just a suggestion.</p>
<h1 id="third-party-software-installation---examples">Third-party software installation - examples</h1>
<p>Here are a couple examples of installing a piece of software in your home directory.</p>
<h3 id="yaml-package-example">yaml package example</h3>
<pre><code># install yaml, an optional dependency for Python yaml package
PKG=yaml
V=0.1.7
DIR=/global/home/groups/${my_group}/software
SRCDIR=${DIR}/sources/${PKG}/${V}
INSTALLDIR=${DIR}/modules/${PKG}/${V}
mkdir -p ${INSTALLDIR}
mkdir -p ${SRCDIR}
cd ${SRCDIR}
wget http://pyyaml.org/download/libyaml/${PKG}-${V}.tar.gz
tar -xvzf ${PKG}-${V}.tar.gz
cd ${PKG}-${V}
module load gcc/6.3.0
# --prefix is key to install in directory you have access to
./configure --prefix=$INSTALLDIR | tee ../configure-${V}.log
make | tee ../make-${V}.log
make install | tee ../install-${V}.log</code></pre>
<p>To save the information on how you installed the software, we recommend you save the code above. In this case you would save it in <code>software/scripts/yaml/install.sh</code>.</p>
<h1 id="third-party-software-installation---examples-1">Third-party software installation - examples</h1>
<p>Here are a couple examples of installing a piece of software in your home directory.</p>
<h3 id="geos-package-example">geos package example</h3>
<pre><code>PKG=geos
V=3.6.2
DIR=/global/home/groups/${my_group}/software
SRCDIR=${DIR}/sources/${PKG}/${V}
INSTALLDIR=${DIR}/modules/${PKG}/${V}
mkdir -p ${INSTALLDIR}
mkdir -p ${SRCDIR}
cd ${SRCDIR}
wget http://download.osgeo.org/${PKG}/${PKG}-${V}.tar.bz2
tar -xvjf ${PKG}-${V}.tar.bz2
cd ${PKG}-${V}
module load gcc/4.8.5 ## R module depends on 4.8.5
# --prefix is key to install in directory you have access to
./configure --prefix=$INSTALLDIR | tee ../configure-${V}.log
make | tee ../make-${V}.log
make install | tee ../install-${V}.log</code></pre>
<p>For Cmake, the following may work (this is not a worked example, just some template code):</p>
<pre><code>PKG=foo
INSTALLDIR=~/software/${PKG}
cmake -DCMAKE_INSTALL_PREFIX=${INSTALLDIR} . | tee ../cmake.log</code></pre>
<h1 id="third-party-software-installation---final-steps">Third-party software installation - final steps</h1>
<p>To use the software you just installed as a dependency of other software you want to install (e.g., Python or R packages), you often need to make Linux aware of the location of the following in your newly-installed software:</p>
<ul>
<li>library files (for other software to link against)</li>
<li>header files (for other software to compile against).</li>
</ul>
<p>If a library file can't be found, you'll likely see an error message such as <em><code>libfoo.so</code> not found</em>. In this case make sure make sure the compiler can find the <em>lib</em> or <em>lib64</em> directory that contains the file(s).</p>
<p>If a header file can't be found, you'll likely see comments about <em>.h</em> files not being found. In this case make sure make sure the compiler can find the <em>include</em> directory that contains the file(s).</p>
<h1 id="installing-python-and-r-packages">Installing Python and R packages</h1>
<p>This first example illustrates how to install a Python package under a circumstance (which is somewhat contrived in this case) where we need to tell pip where to find some dependency files.</p>
<p>Note if we install pyyaml outside a virtualenv, we don't need to do all this because Savio uses Anaconda and has the yaml system package installed and the installation of pyyaml knows where to find yaml.h and libyaml.so. So this is primarily an illustration and not what would be needed in practice (plus pyyaml is already installed on Savio).</p>
<pre><code>module load python/3.6
virtualenv /global/home/groups/${my_group}/pyyaml
source /global/home/groups/${my_group}/pyyaml/bin/activate
PYPKG=pyyaml
V=0.1.7
PKGDIR=/global/home/groups/${my_group}/software/modules/yaml/${V}
# This fails:
which python
module purge # otherwise demo fails as pyyaml is found from system python
pip install --global-option="--with-libyaml" ${PYPKG} # use faster libyaml bindings
# can't find yaml.h file
# This succeeds:
pip install -v --global-option="--with-libyaml" --global-option=build_ext \
--global-option="-I/${PKGDIR}/include" ${PYPKG}
ls /global/home/groups/${my_group}/pyyaml/lib/python3.6/site-packages</code></pre>
<p>Note that if you install a Python package not in a virtualenv or conda environment, you'll need:</p>
<pre><code>pip install --user package_name</code></pre>
<p>to install it in your home directory. Otherwise pip will try to install it in a system directory that you don't have permissions to write to.</p>
<h1 id="installing-python-and-r-packages-1">Installing Python and R packages</h1>
<p>Here's an R example. In this case compilation would go ok, except R needs access to a geos-related binary for configuration and to the libgeos library file when it test runs the installed rgeos package at the end of the process.</p>
<p>Note: in the following we'll install in our user directory; it would take some additional wrangling to install in a location available to other group members.</p>
<pre><code>V=3.6.2
PKGDIR=/global/home/groups/${my_group}/software/modules/geos/${V}
module load r/3.4.2
Rscript -e "install.packages('rgeos', repos = 'https://cran.cnr.berkeley.edu')"
# that fails as it can't find geos-config,
# an executable installed as part of geos
export PATH=${PKGDIR}/bin:${PATH}
Rscript -e "install.packages('rgeos', repos = 'https://cran.cnr.berkeley.edu')"
# that installs (almost), but fails to run when tested because
# libgoes-3.6.2.so can't be found
export LD_LIBRARY_PATH=${INSTALLDIR}/lib:${LD_LIBRARY_PATH}
Rscript -e "install.packages('rgeos', repos = 'https://cran.cnr.berkeley.edu')"</code></pre>
<p>You may sometimes need to use the <code>configure.args</code> or <code>configure.vars</code> arguments to <code>install.packages()</code> to provide information on the location of <code>include</code> and <code>lib</code> directories of dependencies (similar to what we did for pyyaml).</p>
<h1 id="accessing-executables-and-libraries-at-run-time">Accessing executables and libraries at run-time</h1>
<p>The issues we saw installing rgeos were not really compilation issues, but rather issues in finding binaries and libraries at run time.</p>
<p>For binaries, Linux only looks in certain directories for executables. These directories are in your PATH variable. So in the previous example we added the directory containing geos-config to our PATH.</p>
<p>For library (.so) files Linux only looks in certain directories for the location of .so library files. To tell Linux to look in additional directories, you can add them to your LD_LIBRARY_PATH variable.</p>
<p>In the rgeos case, anytime we want to use the rgeos package, we need to make sure LD_LIBRARY_PATH is set so that rgeos can find <code>libgeos-3.6.2.so</code>.</p>
<h1 id="installation-for-an-entire-group">Installation for an entire group</h1>
<p>Optionally if you'd like your group members to have write access to the directories for the software you installed, you would do this:</p>
<pre><code>PKG=rgeos
cd /global/home/groups/${my_group}/software
chmod -R g+rwX {sources,modfiles,modules,scripts}/${PKG}</code></pre>
<h1 id="setting-up-a-module-for-your-software">Setting up a module for your software</h1>
<p>You may also want to set up a module for the software you just installed. This allows you and your group members to easily set your environment so that the software is accessible for you. To do this you need to do the following.</p>
<p>First we'll need a directory in which to store our module files:</p>
<pre><code>V=3.6.2
MYMODULEPATH=/global/home/groups/${my_group}/software/modfiles
mkdir -p ${MYMODULEPATH}/geos
export MODULEPATH=${MODULEPATH}:${MYMODULEPATH} # good to put this in your .bashrc</code></pre>
<p>Now we create a module file for the version (or one each for multiple versions) of the software we have installed. E.g., for our geos installation we would edit <code>${MYMODULEPATH}/geos/3.6.2</code> based on looking at examples of other module files. An example module file for our geos example is <em>example-modulefile</em>.</p>
<pre><code>cp example-modulefile ${MYMODULEPATH}/geos/3.6.2</code></pre>
<p>Or see some of the Savio system-level modules in <code>/global/software/sl-7.x86_64/modfiles</code>.</p>
<pre><code>cat /global/software/sl-7.x86_64/modfiles/apps/ml/tensorflow/1.7.0-py36</code></pre>
<h1 id="how-to-get-additional-help">How to get additional help</h1>
<ul>
<li>For technical issues and questions about using Savio:
<ul>
<li>[email protected]</li>
</ul></li>
<li>For questions about computing resources in general, including cloud computing:
<ul>
<li>[email protected]</li>
</ul></li>
<li>For questions about data management (including HIPAA-protected data):
<ul>
<li>[email protected]</li>
</ul></li>
</ul>
</body>
</html>