Trouble building swivel/fastprep.cc on Mac? #127

Closed
moscow25 opened this issue May 19, 2016 · 14 comments

@moscow25

Hey guys, love the Swivel library. That said, prep.py is too slow (on a 1B-line text dataset), so I'm trying to build the fastprep version.

I get stuck on the "rebuild TensorFlow from source" part. It says to build a pip package, but then I get this error:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
cp: bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/tensorflow: No such file or directory
cp: bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/external: No such file or directory

Indeed, those files do not exist. I could not find this error in other people's bug reports. A pointer would be appreciated!

Also... if I have vocab and co-occurrences cached from GloVe, can I skip the prep phase?

Thanks!

@moscow25
Author

To be clear, I have bazel and all the other dependencies installed. I ran the first command, and it returned without error:
bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
...
Target //tensorflow/tools/pip_package:build_pip_package up-to-date:
bazel-bin/tensorflow/tools/pip_package/build_pip_package

And bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/ exists; it just does not include bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/tensorflow.

I'm working off of these instructions. Is it possible that something isn't synced?
https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#installation-for-mac-os-x

@waterson
Contributor

Hey there! Apologies for being so slow to respond! :-/

I have not tried building TF from source on Mac; let me take a look at that today and I'll update the issue.

@waterson
Contributor

Also... if I have vocab and co-occurrences cached from GloVe, can I skip the prep phase?

And, FWIW, yes, you can definitely do this! Basically, you just need to rearrange GloVe's co-occurrence matrix into tf.Examples that follow the format that prep.py produces.
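Something like the sketch below should get you most of the way there. To be clear, this is only a rough illustration, not the exact conversion script: the tf.Example feature names (global_row, global_col, sparse_local_row, sparse_local_col, sparse_value) are assumed to match what prep.py writes, and the reader assumes GloVe's default binary record layout of two int32 word ids (1-based) plus a float64 count -- double-check both against prep.py and your GloVe build.

import struct
import tensorflow as tf

def read_glove_cooccurrences(path):
    # Assumed GloVe cooccur record layout: int32 word1, int32 word2, float64 count.
    rec = struct.Struct('iid')
    with open(path, 'rb') as f:
        while True:
            buf = f.read(rec.size)
            if len(buf) < rec.size:
                break
            w1, w2, count = rec.unpack(buf)
            yield w1 - 1, w2 - 1, count  # GloVe ids are 1-based; shift to 0-based

def shard_example(row_ids, col_ids, cooc):
    # Pack the submatrix (row_ids x col_ids) into one tf.Example; `cooc` maps
    # (global_row, global_col) -> count for the pairs that fall in this shard.
    local_rows, local_cols, values = [], [], []
    for i, gr in enumerate(row_ids):
        for j, gc in enumerate(col_ids):
            count = cooc.get((gr, gc), 0.0)
            if count:
                local_rows.append(i)
                local_cols.append(j)
                values.append(count)
    def int64s(xs):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=xs))
    return tf.train.Example(features=tf.train.Features(feature={
        'global_row': int64s(row_ids),
        'global_col': int64s(col_ids),
        'sparse_local_row': int64s(local_rows),
        'sparse_local_col': int64s(local_cols),
        'sparse_value': tf.train.Feature(
            float_list=tf.train.FloatList(value=values)),
    }))

Each serialized example then goes into its own shard file (and you still need the row/column marginals), following whatever naming convention prep.py uses so the trainer can find them.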

@waterson waterson self-assigned this May 25, 2016
waterson pushed a commit to waterson/models that referenced this issue May 25, 2016
@waterson
Contributor

I get stuck on the "rebuild TensorFlow from source" part. It says to build a pip package, but then I get this error:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
cp: bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/tensorflow: No such file or directory
cp: bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/external: No such file or directory

Indeed, those files do not exist. I could not find this error in other people's bug reports. A pointer would be appreciated!

Okay, I just tried building from source on Mac and... I at least got past this part.

Here is (more or less) what I did; YMMV because it's hard to know how your system is configured. I followed the OS/X "install from source" directions. This involved installing JDK 8 and Homebrew, then using Homebrew to install bazel and swig, and easy_install to install six, numpy, and wheel. Then I pulled the TF source repo, configured it, and built the pip package. After that I built and installed protocol buffers. Finally, I was able to compile fastprep.cc but failed to link it; I suspect I need to do some libtool magic to make that work.

brew install bazel swig
sudo easy_install -U six
sudo easy_install -U numpy
sudo easy_install wheel
git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow
./configure
bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
cd google/protobuf
brew install automake libtool
./autogen.sh
./configure --prefix=${HOME} 
make
make install
cd ~/src/swivel/models
make -f fastprep.mk  # this compiles but can't link. so... that's a bug!

LMK if you can get that far; if not, it might make sense to open another issue on the core TF project. In the meantime, there is certainly a real bug here in that fastprep.cc can't link.

Also, for what it's worth, note that TF may not support GPUs on OS/X. If that's the case, you may find that Swivel is not going to provide much of a speedup over GloVe training.

@waterson
Contributor

Okay, by changing

 LDLIBS=-lprotos_all_cc -lprotobuf -lpthread -lm

to

LDLIBS=-lprotos_all_cc.pic -lprotobuf -lpthread -lm

...I can get fastprep.cc to compile and link. So I need to figure out why TF on OS/X names its libraries with ".pic" and see if there's a clean way to have fastprep.mk choose the right one.

martinwicke pushed a commit that referenced this issue Jun 2, 2016
@girving girving added the triaged label Jun 8, 2016
@moscow25
Author

moscow25 commented Jul 5, 2016

Sorry for the late reply, @waterson. Your change LDLIBS=-lprotos_all_cc.pic -lprotobuf -lpthread -lm seems to have done it! I finally have fastprep installed, and it runs! Now I'll have to see if Swivel trains from here :-)

Thanks!

@moscow25
Author

moscow25 commented Jul 5, 2016

And thanks, @waterson, for sharing the script to port GloVe co-occurrences to Swivel. I'll try that as well. Not sure how I managed to miss the notification for your response -- gotta check my GitHub settings. I really appreciate it. Excited to try these comparisons!

@moscow25 moscow25 closed this as completed Jul 5, 2016
@moscow25 moscow25 reopened this Jul 6, 2016
@moscow25
Author

moscow25 commented Jul 6, 2016

Hi @waterson, unfortunately, even though fastprep now compiles, it seems to crash in the co-occurrence phase when writing shards. Have you seen this error before?

For comparison, prep.py works -- albeit very, very slowly, so I ran it on a 10% sample of this text (which is itself a sample) -- as long as ulimit -n is set correctly.

I will also separately search for this protobuf error. Thanks!

$ ./fastprep --output_dir /tmp/swivel --input /Volumes/xxx/rawtweets_10pct.txt --shard_size 16384 --min_count 100 --window_size 15
Computing vocabulary: 100.0% complete...56016451 distinct tokens
Generating Swivel co-occurrence data into /tmp/swivel
Shard size: 16384x16384
Vocab size: 98304
Computing co-occurrences: 100.0% complete...done.
writing marginals...
writing shards...
writing shard 1/36
[libprotobuf FATAL google/protobuf/message_lite.cc:68] CHECK failed: (bytes_produced_by_serialization) == (byte_size_before_serialization): Byte size calculation and serialization were inconsistent. This may indicate a bug in protocol buffers or it may be caused by concurrent modification of the message.
libc++abi.dylib: terminating with uncaught exception of type google::protobuf::FatalException: CHECK failed: (bytes_produced_by_serialization) == (byte_size_before_serialization): Byte size calculation and serialization were inconsistent. This may indicate a bug in protocol buffers or it may be caused by concurrent modification of the message.
Abort trap: 6

@aselle aselle removed the triaged label Jul 28, 2016
@suharshs
Contributor

@waterson Any update on this?

@waterson
Contributor

Sorry, no update.

I notice that 16K x 16K shards seem very large; you may be blowing out a protocol buffer limit somewhere, especially if the shards are dense. Have you tried using a smaller shard size, e.g. the default of 4096 x 4096?
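To put rough numbers on that (just a back-of-the-envelope sketch; the per-entry byte costs below are assumptions for illustration, not measured values): a fully dense 16K x 16K shard lands right around protobuf's 2 GB per-message ceiling, while a 4K x 4K shard stays far below it.

def approx_dense_shard_bytes(shard_dim,
                             bytes_per_local_index=2,  # assumed varint cost for ids < 16384
                             bytes_per_value=4):       # assumed packed float32
    # Worst case: every cell in the shard is non-zero.
    cells = shard_dim * shard_dim
    return cells * (2 * bytes_per_local_index + bytes_per_value)

for dim in (4096, 16384):
    print('%5d x %-5d -> ~%.2f GiB' % (dim, dim, approx_dense_shard_bytes(dim) / 2**30))
# ~0.12 GiB for 4096, ~2.00 GiB for 16384 -- right at protobuf's 2 GB limit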

If you want to point me at the data you're using, I'm happy to try to reproduce the problem.

(And, FWIW, to confirm: this is OS/X-specific, correct?)

@suharshs suharshs added the stat:awaiting response Waiting on input from the contributor label Aug 16, 2016
@moscow25
Author

moscow25 commented Aug 24, 2016

Interesting. Yes, this is OS-X specific, though I did not try running fastprep on a Linux box.

I would like to get back to this at some point soon, or will ask whether a colleague has time for it. We've found that the original, vanilla Word2Vec works pretty well on our large corpus (several billion lines of short text), while GloVe runs into several issues. I don't doubt that Swivel might solve some of these, but that becomes impossible if it can't handle large amounts of data and a vocabulary of 5-10 million words.

Please ping me if any solutions emerge or there is an update to address it. I will also keep my eyes open. Thanks for your help, and sorry it did not work the last time around, even with your helpful fixes. I definitely got further, though.

@aselle aselle removed the stat:awaiting response Waiting on input from the contributor label Nov 2, 2016
@girving

girving commented Feb 6, 2017

@waterson What's the status of this issue?

@waterson
Contributor

So, my understanding is that there seems to be a problem with very large protocol buffers, and that a few pretty easy workarounds exist (e.g., use a smaller shard size or use Linux). I'm happy to accept PRs, but I'm not spending any cycles on it at the moment. If I've misunderstood, please let me know.

@girving

girving commented Feb 13, 2017

I'll close for now due to age, but I'm happy to reopen if new information surfaces.

@girving girving closed this as completed Feb 13, 2017