Commit
* Initial classes and loading data for poincare model
* Initial implementation of training using autograd
* faster negative sampling, bugfix in vector updates
* allows poincare dist function to be differentiable by autograd
* batched gradient descent initial implementation
* minor changes to batch poincare distance computation
* Adds calculation of gradients for poincare model
* Correct implementation of clipping of updated vectors
* Fixes error in gradient computation
* Better messages while training
* Renames PoincareDistance to PoincareExample for clarity
* Compares computed gradients to autograd gradients every few iterations
* Avoids doing some numpy computations twice
* Avoids creating copies of numpy vectors
* Only calls nan_to_num when gamma has at least one value equal to 1
* Simply sets nan gradients to zero instead of nan_to_num
* Adds batch-wise implementation of training and gradient computations
* Minor correction in clipping
* Fixes typo in clip_vectors
* Prints average loss every few iterations instead of current loss
* Adds weighted negative sampling
* Ensures positive edges are not returned by negative sampling
* Poincare model stores node indices in relations instead of node keys
* Minor renaming; uses node indices for batch training instead of node keys
* Changes shapes of vectors passed to PoincareBatch
* Minor bugfixes related to batch size
* Corrects implementation of negative sampling for batch training
* Adds option to check gradients in batchwise training
* Checks gradients only every few iterations
* Handles multiple occurrence of same node across and within batches
* Removes unused section of code
* Implements slightly different clipping method
* Fixes bugs with wrong reshape in batchwise training
* Example-wise training takes into account multiple occurrences of same node in an example too
* Batchwise training prints average loss over many iterations instead of current batch
* Fixes bug in updating vector for batchwise training
* Faster implementation of negative sampling
* Negative sampling for a node follows different paths depending on fraction of positive relations
* Uses a buffer for negative samples to reduce calls to np.random.choice
* Cleans up poincare.py, removes unused code
* Adds shapes to PoincareBatch, more documentation
* Adds more documentation to PoincareModel
* Stores indices for nodes in a batch in PoincareBatch for better encapsulation
* More documentation for poincare module
* Implements burn-in for poincare model
* Slightly better logging for poincare model
* Uses np.random.random and np.searchsorted for random sampling rather than np.random.choice
* Removes duplicates in negative samples
* Moves helper classes in poincare after PoincareModel
* Change in PoincareModel API to allow initializing from an iterable, separate class for streaming from file
* Adds failing test for handling encoding in PoincareData
* Fixes encoding handling in PoincareData
* Adds docstrings to PoincareData, PoincareData streams tuples now
* More unittests for PoincareModel
* Changes handle_duplicates to staticmethod, adds test
* Adds batch size and print_every parameters to train method
* Renames print_check to should_print
* Adds separate parameter for checking gradients
* Minor fixes for coding style
* Removes default values from docstrings, redundant
* Adds example to PoincareModel init docstring
* Extracts buffer for negatives out into a separate class
* More detailed logging, fix to check_gradients
* Minor fixes to documentation in poincare.py
* Adds tests for gradients checking
* Raise AssertionError if gradients check fails
* Adds failing tests for saving/loading PoincareModel instances
* Fixes bug with saving/loading PoincareModel to disk
* Adds test and fix for raising error on invalid input data
* Adds test and fix for no duplicates and positives in negative sample
* Bugfix with NegativesBuffer having less than items left
* Uses larger data for poincare tests, adds data files
* Bugfix with incorrect use of random state
* Minor fixes in documentation style
* Renames PoincareData to PoincareRelations
* Change in the order of conditions checked before resampling
* Imports datapath from test.utils instead of defining own
* Adds working examples and a more detailed description in docstring
* Renames term_relations to node_relations
* Removes unused imports
* Moves iter parameter to train instead of __init__, renames to epochs
* Fixes term_relations in tests
* Adds option to disable gradient check, disabled by default
* Extracts gradient checking code into a separate method
* Conditionally import autograd only if gradient checking is enabled
* Marks private methods in poincare module with leading underscore
* Adds init_range as an API parameter to PoincareModel
* Marks private properties with a leading underscore
* Fixes bug with burn-in happening on subsequent calls to train
* Adds test for training multiple times
* Adds autograd to test dependencies
* Renames wv to kv in PoincareModel
* add numpy==1.12 as test dependency
* add missing quote
* try to run tests without autograd
* fix PEP8 in poincare.py
* fix PEP8 in test_poincare
* PoincareRelations handles python2 correctly
* Bugfix with int division for python2
* Imports mock module for tests correctly in python2
* Cleaner implementation of __iter__ for PoincareRelations
* Adds rst file and updates apiref.rst for poincare module
* Adds clarifying comment to PoincareRelations.__iter__
* Updates rst file for poincare
* Renames hypernym pair to relations everywhere
* Simpler way of detecting duplicates
* Minor documentation updates in poincare.py
* Skips gradients test if autograd not installed, adds test for bytes input data
* Fix flake8 (noqa + remove unused var)
* Fix missing mock dependency for win
* Fix links in docstrings
* Changes error message for negative sampling failing
* Adds option to specify dtype for PoincareModel and corresponding unittest
* Extends test for dtype to check after training, updates docstring
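
For readers unfamiliar with the pieces this changelog keeps returning to (the Poincare distance, clipping of updated vectors, and weighted negative sampling via np.random.random and np.searchsorted), the sketch below may help. It is only an illustration under stated assumptions, not the code from this commit: the function names `poincare_distance`, `clip_to_ball` and `sample_negatives`, their parameters, and the `epsilon` default are all hypothetical, and the distance is the standard Poincare ball formula rather than anything quoted from poincare.py.

```python
import numpy as np


def poincare_distance(u, v):
    """Standard Poincare ball distance between two points with norm < 1."""
    sq_diff = np.sum((u - v) ** 2)
    alpha = 1.0 - np.sum(u ** 2)
    beta = 1.0 - np.sum(v ** 2)
    return np.arccosh(1.0 + 2.0 * sq_diff / (alpha * beta))


def clip_to_ball(vector, epsilon=1e-5):
    """Rescale a vector whose norm reached 1 so it lies just inside the ball.

    The epsilon default is an assumption made for this sketch.
    """
    norm = np.linalg.norm(vector)
    if norm < 1.0:
        return vector
    return vector / norm * (1.0 - epsilon)


def sample_negatives(weights, positives, n_samples, rng):
    """Draw distinct node indices in proportion to weight, skipping positives.

    Hypothetical helper: inverse-CDF sampling with uniform draws and
    np.searchsorted, which is the idea the changelog mentions.
    """
    weights = np.asarray(weights, dtype=float)
    cumulative = np.cumsum(weights)
    excluded = {int(index) for index in positives}
    candidates = {int(index) for index in np.flatnonzero(weights > 0)}
    if n_samples > len(candidates - excluded):
        raise ValueError("not enough candidate nodes to sample from")

    chosen = []
    while len(chosen) < n_samples:
        # Uniform draws in [0, total weight) mapped through the cumulative sum.
        draws = rng.random_sample(n_samples) * cumulative[-1]
        indices = np.searchsorted(cumulative, draws, side="right")
        # Guard against a draw rounding up to the total weight.
        indices = np.minimum(indices, len(weights) - 1)
        for index in indices:
            index = int(index)
            # Skip positives and anything already chosen (no duplicates).
            if index not in excluded and len(chosen) < n_samples:
                excluded.add(index)
                chosen.append(index)
    return chosen


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    print(poincare_distance(np.array([0.0, 0.0]), np.array([0.5, 0.0])))
    print(np.linalg.norm(clip_to_ball(np.array([3.0, 4.0]))))
    print(sample_negatives([1.0, 2.0, 0.0, 3.0, 4.0], [1], 3, rng))
```

Resampling until enough distinct indices are found is the simplest way to honor the "no duplicates, no positives" requirement, but it slows down when most of the weight sits on excluded nodes, which is presumably why the changelog describes different sampling paths depending on the fraction of positive relations.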