diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4f4c2013..b7ff31cb 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -14,7 +14,7 @@ Kenneth Reitz has also written an [essay](https://www.kennethreitz.org/essays/be
 
 As the [Requests Code Of Conduct](http://docs.python-requests.org/en/master/dev/contributing/#be-cordial) states, **all contributions are welcome**, as long as everyone involved is treated with respect.
 
-## Your First Contribution
+## Your first contribution
 
 A great way to start contributing to Camelot is to pick an issue tagged with the [Contributor Friendly](https://github.com/socialcopsdev/camelot/labels/Contributor%20Friendly) tag or the [Level: Easy](https://github.com/socialcopsdev/camelot/labels/Level%3A%20Easy) tag. If you're unable to find a good first issue, feel free to contact the maintainer.
 
@@ -26,19 +26,17 @@ To install the dependencies needed for development, you can use pip:
 
 $ pip install camelot-py[dev]
 
-### Alternatively
-
-You can clone the project repository, and install using pip:
+Alternatively, you can clone the project repository, and install using pip:
-$ pip install .[dev]
+$ pip install ".[dev]"
 
 ## Pull Requests
 
-### Submit a Pull Request
+### Submit a pull request
 
-The preferred workflow for contributing to Camelot is to fork the [project repository](https://github.com/socialcopsdev/camelot) on GitHub, clone, develop on a branch and then finally submit a pull request. Steps:
+The preferred workflow for contributing to Camelot is to fork the [project repository](https://github.com/socialcopsdev/camelot) on GitHub, clone, develop on a branch and then finally submit a pull request. Here are the steps:
 
 1. Fork the project repository. Click on the ‘Fork’ button near the top of the page. This creates a copy of the code under your account on the GitHub.
 
@@ -73,7 +71,7 @@ $ git push -u origin my-feature
 
 Now it's time to go to the your fork of Camelot and create a pull request! You can [follow these instructions](https://help.github.com/articles/creating-a-pull-request-from-a-fork/) to do this.
 
-### Work on your Pull Request
+### Work on your pull request
 
 We recommend that your pull request complies with the following rules:
 
@@ -81,7 +79,7 @@ We recommend that your pull request complies with the following rules:
 
 - In case your pull request contains function docstrings, make sure you follow the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html) format. All function docstrings in Camelot follow this format. Moreover, following the format will make sure that the API documentation is generated flawlessly.
 
-- Make sure your commit messages follow [the seven rules of a great git commit message](https://chris.beams.io/posts/git-commit/).
+- Make sure your commit messages follow [the seven rules of a great git commit message](https://chris.beams.io/posts/git-commit/):
     - Separate subject from body with a blank line
     - Limit the subject line to 50 characters
     - Capitalize the subject line
@@ -104,15 +102,15 @@ Writing documentation, function docstrings, examples and tutorials is a great wa
 
 It is written in [reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText), with [Sphinx](http://www.sphinx-doc.org/en/master/) used to generate these lovely HTML files that you're currently reading (unless you're reading this on GitHub). You can edit the documentation using any text editor and then generate the HTML output by running `make html` in the `docs/` directory.
 
-The function docstrings are written using the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html) extension for Sphinx. Make sure you check out its format guidelines, before you start writing one.
+The function docstrings are written using the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html) extension for Sphinx. Make sure you check out its format guidelines before you start writing one.
 
 ## Filing Issues
 
-We use [GitHub issues](https://docs.pytest.org/en/latest/) to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), it is advisable to use GitHub search to look for existing issues (both open and closed) that may be similar.
+We use [GitHub issues](https://docs.pytest.org/en/latest/) to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), please use GitHub search to look for existing issues (both open and closed) that may be similar.
 
 ### Questions
 
-Please don't use GitHub issues for support questions, a better place for them would be [Stack Overflow](http://stackoverflow.com). Make sure you tag them using the `python-camelot` tag.
+Please don't use GitHub issues for support questions. A better place for them would be [Stack Overflow](http://stackoverflow.com). Make sure you tag them using the `python-camelot` tag.
 
 ### Bug Reports
 
diff --git a/README.md b/README.md
index ffdb09ae..8c9fd565 100644
--- a/README.md
+++ b/README.md
@@ -7,11 +7,11 @@
 [![Build Status](https://travis-ci.org/socialcopsdev/camelot.svg?branch=master)](https://travis-ci.org/socialcopsdev/camelot) [![codecov.io](https://codecov.io/github/socialcopsdev/camelot/badge.svg?branch=master&service=github)](https://codecov.io/github/socialcopsdev/camelot?branch=master)
 [![image](https://img.shields.io/pypi/v/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/l/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/pyversions/camelot-py.svg)](https://pypi.org/project/camelot-py/)
 
-**Camelot** is a Python library which makes it easy for *anyone* to extract tables from PDF files!
+**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
 
 ---
 
-**Here's how you can extract tables from PDF files.** Check out the PDF used in this example, [here](https://github.com/socialcopsdev/camelot/blob/master/docs/_static/pdf/foo.pdf).
+**Here's how you can extract tables from PDF files.** Check out the PDF used in this example [here](https://github.com/socialcopsdev/camelot/blob/master/docs/_static/pdf/foo.pdf).
 >>> import camelot
@@ -43,14 +43,14 @@
 
 There's a [command-line interface](https://camelot-py.readthedocs.io/en/latest/user/cli.html) too!
 
-**Note:** Camelot only works with text-based PDFs and not scanned documents. If you can click-and-drag to select text in your table in a PDF viewer, then your PDF is text-based.
+**Note:** Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.
 
 ## Why Camelot?
 
-- **You are in control**: Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (Since everything in the real world, including PDF table extraction, is fuzzy.)
-- **Metrics**: *Bad* tables can be discarded based on metrics like accuracy and whitespace, without ever having to manually look at each table.
-- Each table is a **pandas DataFrame**, which enables seamless integration into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873).
-- **Export** to multiple formats, including json, excel and html.
+- **You are in control.** Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
+- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
+- Each table is a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873).
+- **Export** to multiple formats, including JSON, Excel and HTML.
 
 See [comparison with other PDF table extraction libraries and tools](https://github.com/socialcopsdev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
 
diff --git a/docs/dev/contributing.rst b/docs/dev/contributing.rst
index 16f2a42d..e0602f13 100644
--- a/docs/dev/contributing.rst
+++ b/docs/dev/contributing.rst
@@ -24,7 +24,7 @@ As the `Requests Code Of Conduct`_ states, **all contributions are welcome**, as
 
 .. _Requests Code Of Conduct: http://docs.python-requests.org/en/master/dev/contributing/#be-cordial
 
-Your First Contribution
+Your first contribution
 -----------------------
 
 A great way to start contributing to Camelot is to pick an issue tagged with the `Contributor Friendly`_ or the `Easy`_ tags. If you're unable to find a good first issue, feel free to contact the maintainer.
@@ -39,13 +39,17 @@ To install the dependencies needed for development, you can use pip::
 
     $ pip install camelot-py[dev]
 
+Alternatively, you can clone the project repository, and install using pip::
+
+    $ pip install ".[dev]"
+
 Pull Requests
 -------------
 
-Submit a Pull Request
+Submit a pull request
 ^^^^^^^^^^^^^^^^^^^^^
 
-The preferred workflow for contributing to Camelot is to fork the `project repository`_ on GitHub, clone, develop on a branch and then finally submit a pull request. Steps:
+The preferred workflow for contributing to Camelot is to fork the `project repository`_ on GitHub, clone, develop on a branch and then finally submit a pull request. Here are the steps:
 
 .. _project repository: https://github.com/socialcopsdev/camelot
 
@@ -76,7 +80,7 @@ Now it's time to go to the your fork of Camelot and create a pull request! You c
 
 .. _follow these instructions: https://help.github.com/articles/creating-a-pull-request-from-a-fork/
 
-Work on your Pull Request
+Work on your pull request
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
 We recommend that your pull request complies with the following guidelines:
@@ -89,7 +93,7 @@ We recommend that your pull request complies with the following guidelines:
 
 .. _numpydoc: https://numpydoc.readthedocs.io/en/latest/format.html
 
-- Make sure your commit messages follow `the seven rules of a great git commit message`_.
+- Make sure your commit messages follow `the seven rules of a great git commit message`_:
     - Separate subject from body with a blank line
     - Limit the subject line to 50 characters
     - Capitalize the subject line
@@ -119,7 +123,7 @@ Writing documentation, function docstrings, examples and tutorials is a great wa
 
 The documentation is written in `reStructuredText`_, with `Sphinx`_ used to generate these lovely HTML files that you're currently reading (unless you're reading this on GitHub). You can edit the documentation using any text editor and then generate the HTML output by running `make html` in the ``docs/`` directory.
 
-The function docstrings are written using the `numpydoc`_ extension for Sphinx. Make sure you check out how its format guidelines, before you start writing one.
+The function docstrings are written using the `numpydoc`_ extension for Sphinx. Make sure you check out its format guidelines before you start writing one.
 
 .. _reStructuredText: https://en.wikipedia.org/wiki/ReStructuredText
 .. _Sphinx: http://www.sphinx-doc.org/en/master/
@@ -128,14 +132,14 @@ The function docstrings are written using the `numpydoc`_ extension for Sphinx.
 Filing Issues
 -------------
 
-We use `GitHub issues`_ to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), it is advisable to use GitHub search to look for existing issues (both open and closed) that may be similar.
+We use `GitHub issues`_ to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), please use GitHub search to look for existing issues (both open and closed) that may be similar.
 
 .. _GitHub issues: https://docs.pytest.org/en/latest/
 
 Questions
 ^^^^^^^^^
 
-Please don't use GitHub issues for support questions, a better place for them would be `Stack Overflow`_. Make sure you tag them using the ``python-camelot`` tag.
+Please don't use GitHub issues for support questions. A better place for them would be `Stack Overflow`_. Make sure you tag them using the ``python-camelot`` tag.
 
 .. _Stack Overflow: http://stackoverflow.com
 
diff --git a/docs/index.rst b/docs/index.rst
index 00605e41..e2d5857b 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -23,11 +23,11 @@ Release v\ |version|. (:ref:`Installation `)
 .. image:: https://img.shields.io/pypi/pyversions/camelot-py.svg
     :target: https://pypi.org/project/camelot-py/
 
-**Camelot** is a Python library which makes it easy for *anyone* to extract tables from PDF files!
+**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!
 
 ----
 
-**Here's how you can extract tables from PDF files.** Check out the PDF used in this example, `here`_.
+**Here's how you can extract tables from PDF files.** Check out the PDF used in this example `here`_.
 
 .. _here: _static/pdf/foo.pdf
 
@@ -55,15 +55,15 @@ Release v\ |version|. (:ref:`Installation `)
 
 There's a :ref:`command-line interface ` too!
 
-.. note:: Camelot only works with text-based PDFs and not scanned documents. If you can click-and-drag to select text in your table in a PDF viewer, then your PDF is text-based.
+.. note:: Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.
 
 Why Camelot?
 ------------
 
-- **You are in control**: Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (Since everything in the real world, including PDF table extraction, is fuzzy.)
-- **Metrics**: *Bad* tables can be discarded based on metrics like accuracy and whitespace, without ever having to manually look at each table.
-- Each table is a **pandas DataFrame**, which enables seamless integration into `ETL and data analysis workflows`_.
-- **Export** to multiple formats, including json, excel and html.
+- **You are in control.** Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
+- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
+- Each table is a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_.
+- **Export** to multiple formats, including JSON, Excel and HTML.
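As a rough illustration of the metrics bullet above, the sketch below discards tables whose accuracy is too low or whose whitespace is too high. It works on plain dictionaries rather than real Camelot objects; the keys ``accuracy`` and ``whitespace``, the percentage scale, the thresholds, and the helper name ``keep_good_tables`` are all assumptions made for this example.

```python
def keep_good_tables(reports, min_accuracy=90.0, max_whitespace=30.0):
    """Return only the reports that pass both (assumed) thresholds."""
    return [
        report
        for report in reports
        if report["accuracy"] >= min_accuracy
        and report["whitespace"] <= max_whitespace
    ]


# Hypothetical per-table reports; the numbers are made up for illustration.
reports = [
    {"page": 1, "accuracy": 99.02, "whitespace": 12.24},
    {"page": 2, "accuracy": 61.50, "whitespace": 48.10},
]

good = keep_good_tables(reports)
print([report["page"] for report in good])
```

The point is only that a numeric filter can replace manual inspection; suitable thresholds depend on the documents being parsed.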
 
 See `comparison with other PDF table extraction libraries and tools`_.
 
@@ -73,7 +73,7 @@ See `comparison with other PDF table extraction libraries and tools`_.
 The User Guide
 --------------
 
-This part of the documentation, begins with some background information about why Camelot was created, takes a small dip into the implementation details and then focuses on step-by-step instructions for getting the most out of Camelot.
+This part of the documentation begins with some background information about why Camelot was created, takes a small dip into the implementation details and then focuses on step-by-step instructions for getting the most out of Camelot.
 
 .. toctree::
    :maxdepth: 2
@@ -85,8 +85,8 @@ This part of the documentation, begins with some background information about wh
    user/advanced
    user/cli
 
-The API Documentation / Guide
------------------------------
+The API Documentation/Guide
+---------------------------
 
 If you are looking for information on a specific function, class, or method,
 this part of the documentation is for you.
diff --git a/docs/user/advanced.rst b/docs/user/advanced.rst
index e73d3e43..fd65005a 100644
--- a/docs/user/advanced.rst
+++ b/docs/user/advanced.rst
@@ -8,7 +8,7 @@ This page covers some of the more advanced configurations for :ref:`Lattice ` needs the lines that make the table, to be in foreground. Here's an example of a table with lines in background.
+To detect line segments, :ref:`Lattice ` needs the lines that make the table to be in the foreground. Here's an example of a table with lines in the background:
 
 .. figure:: ../_static/png/background_lines.png
     :scale: 50%
@@ -68,16 +68,16 @@ Let's plot all the text present on the table's PDF page.
     :alt: A plot of all text on a PDF page
     :align: left
 
-This, as we shall later see, is very helpful with :ref:`Stream `, for noting table areas and column separators, in case Stream does not guess them correctly.
+This, as we shall later see, is very helpful with :ref:`Stream ` for noting table areas and column separators, in case Stream does not guess them correctly.
 
-.. note:: The *x-y* coordinates shown aboe change as you move your mouse cursor on the image, which can help you note coordinates.
+.. note:: The *x-y* coordinates shown above change as you move your mouse cursor on the image, which can help you note coordinates.
 
 .. _geometry_table:
 
 table
 ^^^^^
 
-Let's plot the table (to see if it was detected correctly or not). This geometry type, along with contour, line and joint is useful for debugging and improving the extraction output, in case the table wasn't detected correctly. More on that later.
+Let's plot the table (to see if it was detected correctly or not). This geometry type, along with contour, line and joint, is useful for debugging and improving the extraction output, in case the table wasn't detected correctly. (More on that later.)
 
 ::
 
@@ -170,9 +170,9 @@ In cases like `these <../_static/pdf/column_separators.pdf>`__, where the text i
 
 You can pass the column separators as a list of comma-separated strings to :meth:`read_pdf() `, using the ``columns`` keyword argument.
 
-In case you passed a single column separators string list, and no table area is specified, the separators will be applied to the whole page. When a list of table areas is specified and there is a need to specify column separators as well, **the length of both lists should be equal**. Each table area will be mapped to each column separators' string using their indices.
+If you pass a list containing a single column separators string and specify no table area, the separators will be applied to the whole page. When a list of table areas is specified and you need to specify column separators as well, **the length of both lists should be equal**. Each table area is mapped to the column separators string at the same index.
 
-For example, if you have specified two table areas, ``table_areas=['12,23,43,54', '20,33,55,67']``, and only want to specify column separators for the first table, you can pass an empty string for the second table in the column separators' list, like this, ``columns=['10,120,200,400', '']``.
+For example, if you have specified two table areas, ``table_areas=['12,23,43,54', '20,33,55,67']``, and only want to specify column separators for the first table, you can pass an empty string for the second table in the column separators' list, like this: ``columns=['10,120,200,400', '']``.
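The index-based mapping described above can be sketched in plain Python. This is a hypothetical helper, not Camelot's own code: the function name ``pair_areas_with_columns`` and the reading of the four area numbers as ``x1, y1, x2, y2`` are assumptions for illustration.

```python
def pair_areas_with_columns(table_areas, columns):
    """Pair each table area with the column separators at the same index."""
    if len(table_areas) != len(columns):
        raise ValueError("table_areas and columns must have the same length")
    pairs = []
    for area, separators in zip(table_areas, columns):
        # Assumed layout of an area string: "x1,y1,x2,y2".
        x1, y1, x2, y2 = (float(value) for value in area.split(","))
        # An empty string means no separators were given for this table.
        xs = [float(value) for value in separators.split(",")] if separators else []
        pairs.append(((x1, y1, x2, y2), xs))
    return pairs


pairs = pair_areas_with_columns(
    ["12,23,43,54", "20,33,55,67"],
    ["10,120,200,400", ""],
)
for area, xs in pairs:
    print(area, xs)
```

The second table ends up with an empty separator list, which is the effect the empty string is meant to have.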
 
 Let's get back to the *x* coordinates we got from :ref:`plotting text ` that exists on this `PDF <../_static/pdf/column_separators.pdf>`__, and get the table out!
 
@@ -188,12 +188,12 @@ Let's get back to the *x* coordinates we got from :ref:`plotting text `_ merged the strings, "NUMBER", "TYPE" and "DBA NAME"; all of them were assigned to the same cell. Let's see how we can fix this in the next section.
+Ah! Since `PDFMiner `_ merged the strings "NUMBER", "TYPE" and "DBA NAME", all of them were assigned to the same cell. Let's see how we can fix this in the next section.
 
 Split text along separators
 ---------------------------
 
-To deal with cases like the output from the previous section, you can pass ``split_text=True`` to :meth:`read_pdf() `, which will split any strings that lie in different cells but have been assigned to the a single cell (as a result of being merged together by `PDFMiner `_).
+To deal with cases like the output from the previous section, you can pass ``split_text=True`` to :meth:`read_pdf() `, which will split any strings that lie in different cells but have been assigned to a single cell (as a result of being merged together by `PDFMiner `_).
 
 ::
 
@@ -210,13 +210,13 @@ To deal with cases like the output from the previous section, you can pass ``spl
 Flag superscripts and subscripts
 --------------------------------
 
-There might be cases where you want to differentiate between the text, and superscripts or subscripts, like this `PDF <../_static/pdf/superscript.pdf>`_.
+There might be cases where you want to differentiate between the text and superscripts or subscripts, like this `PDF <../_static/pdf/superscript.pdf>`_.
 
 .. figure:: ../_static/png/superscript.png
     :alt: A PDF with superscripts
     :align: left
 
-In this case, the text that `other tools`_ return, will be ``24.912``. This is harmless as long as there is that decimal point involved. But when it isn't there, you'll be left wondering why the results of your data analysis were 10x bigger!
+In this case, the text that `other tools`_ return will be ``24.912``. This is relatively harmless when that decimal point is involved. But when it isn't there, you'll be left wondering why the results of your data analysis are 10x bigger!
 
 You can solve this by passing ``flag_size=True``, which will enclose the superscripts and subscripts with ````, based on font size, as shown below.
 
@@ -327,7 +327,7 @@ Voila! Camelot can now see those lines. Let's get our table.
 Shift text in spanning cells
 ----------------------------
 
-By default, the :ref:`Lattice ` method shifts text in spanning cells, first to the left and then to the top, as you can observe in the output table above. However, this behavior can be changed using the ``shift_text`` keyword argument. Think of it as setting the *gravity* for a table, it decides the direction in which the text will move and finally come to rest.
+By default, the :ref:`Lattice ` method shifts text in spanning cells, first to the left and then to the top, as you can observe in the output table above. However, this behavior can be changed using the ``shift_text`` keyword argument. Think of it as setting the *gravity* for a table — it decides the direction in which the text will move and finally come to rest.
 
 ``shift_text`` expects a list with one or more characters from the following set: ``('', l', 'r', 't', 'b')``, which are then applied *in order*. The default, as we discussed above, is ``['l', 't']``.
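The *gravity* idea can be pictured with a toy model. The sketch below is not Camelot's implementation: it assumes a spanning cell is described by inclusive grid indices and that each direction moves the text all the way to that edge of the span; the name ``resting_cell`` is hypothetical.

```python
def resting_cell(span, start, shift_text):
    """Return the (row, col) where text settles inside a spanning cell.

    span is (top_row, left_col, bottom_row, right_col), inclusive.
    start is the (row, col) the text was first assigned to.
    """
    top, left, bottom, right = span
    row, col = start
    # Directions are applied in the order given.
    for direction in shift_text:
        if direction == "l":
            col = left
        elif direction == "r":
            col = right
        elif direction == "t":
            row = top
        elif direction == "b":
            row = bottom
        elif direction != "":
            raise ValueError(f"unknown direction: {direction!r}")
    return row, col


span = (0, 0, 2, 3)  # a cell spanning three rows and four columns
start = (1, 2)       # where the text happened to land
print(resting_cell(span, start, ["l", "t"]))
print(resting_cell(span, start, ["r", "b"]))
print(resting_cell(span, start, [""]))
```

In this model ``['l', 't']`` sends the text to the top-left corner of the span, ``['r', 'b']`` to the bottom-right, and ``['']`` leaves it where it was.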
 
@@ -356,7 +356,7 @@ We'll use the `PDF <../_static/pdf/short_lines.pdf>`__ from the previous example
     "Knowledge &Practices on HTN &","2400","Men (≥ 18 yrs)","-","-","-","1728"
     "DM","2400","Women (≥ 18 yrs)","-","-","-","1728"
 
-No surprises there, it did remain in place (observe the strings "2400" and "All the available individuals"). Let's pass ``shift_text=['r', 'b']``, to set the *gravity* to right-bottom, and move the text in that direction.
+No surprises there — it did remain in place (observe the strings "2400" and "All the available individuals"). Let's pass ``shift_text=['r', 'b']`` to set the *gravity* to right-bottom and move the text in that direction.
 
 ::
 
@@ -380,7 +380,7 @@ No surprises there, it did remain in place (observe the strings "2400" and "All
 Copy text in spanning cells
 ---------------------------
 
-You can copy text in spanning cells when using :ref:`Lattice `, in either horizontal or vertical direction, or both. This behavior is disabled by default.
+You can copy text in spanning cells when using :ref:`Lattice `, in either the horizontal or vertical direction, or both. This behavior is disabled by default.
 
 ``copy_text`` expects a list with one or more characters from the following set: ``('v', 'h')``, which are then applied *in order*.
 
diff --git a/docs/user/cli.rst b/docs/user/cli.rst
index a90d8f91..f96ceae9 100644
--- a/docs/user/cli.rst
+++ b/docs/user/cli.rst
@@ -1,11 +1,11 @@
 .. _cli:
 
-Command-line interface
+Command-Line Interface
 ======================
 
 Camelot comes with a command-line interface.
 
-You can print the help for the interface, by typing ``camelot --help`` in your favorite terminal program, as shown below. Furthermore, you can print the help for each command, by typing ``camelot  --help``, try it out!
+You can print the help for the interface by typing ``camelot --help`` in your favorite terminal program, as shown below. Furthermore, you can print the help for each command by typing ``camelot  --help``. Try it out!
 
 ::
 
diff --git a/docs/user/how-it-works.rst b/docs/user/how-it-works.rst
index 5329e385..385b393c 100644
--- a/docs/user/how-it-works.rst
+++ b/docs/user/how-it-works.rst
@@ -3,9 +3,9 @@
 How It Works
 ============
 
-This part of the documentation details a high-level explanation of how Camelot extracts tables from PDF files.
+This part of the documentation includes a high-level explanation of how Camelot extracts tables from PDF files.
 
-You can choose between two table parsing methods, *Stream* and *Lattice*. The naming for parsing methods inside Camelot (i.e. Stream and Lattice) was inspired from `Tabula`_.
+You can choose between two table parsing methods, *Stream* and *Lattice*. These names for parsing methods inside Camelot were inspired by `Tabula`_.
 
 .. _Tabula: https://github.com/tabulapdf/tabula
 
@@ -16,7 +16,7 @@ Stream
 
 Stream can be used to parse tables that have whitespaces between cells to simulate a table structure. It looks for these spaces between text to form a table representation.
 
-It is built on top of PDFMiner's functionality of grouping characters on a page into words and sentences, using `margins`_. After getting the words given on a page, it groups them into rows based on their *y* coordinates and tries to guess the number of columns the table might have by calculating the mode of the number of words in each row. This mode is used to calculate *x* ranges for the table's columns. It then adds columns to this column range list based on any words that may lie outside or inside the current column *x* ranges.
+It is built on top of PDFMiner's functionality of grouping characters on a page into words and sentences, using `margins`_. After getting the words on a page, it groups them into rows based on their *y* coordinates. It then tries to guess the number of columns the table might have by calculating the mode of the number of words in each row. This mode is used to calculate *x* ranges for the table's columns. It then adds columns to this column range list based on any words that may lie outside or inside the current column *x* ranges.
 
 .. _margins: https://euske.github.io/pdfminer/#tools
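The row-grouping and column-guessing steps described for Stream can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: words are given as ``(text, x, y)`` tuples, *y* grows upward as in PDF coordinates, and the ``y_tolerance`` parameter and both function names are hypothetical.

```python
from statistics import mode


def group_rows(words, y_tolerance=2.0):
    """Group (text, x, y) words into rows, top to bottom, by y coordinate."""
    rows = []
    # Larger y is assumed to be higher on the page, so sort descending.
    for text, x, y in sorted(words, key=lambda word: -word[2]):
        if rows and abs(rows[-1]["y"] - y) <= y_tolerance:
            rows[-1]["words"].append((x, text))
        else:
            rows.append({"y": y, "words": [(x, text)]})
    # Within each row, order the words left to right.
    return [[text for _, text in sorted(row["words"])] for row in rows]


def guess_column_count(rows):
    """Guess the column count as the mode of words per row."""
    return mode([len(row) for row in rows])


words = [
    ("Name", 10, 100), ("Qty", 60, 100), ("Price", 110, 100),
    ("Apple", 10, 80), ("3", 60, 80), ("1.20", 110, 80),
    ("Total", 10, 60), ("3.60", 110, 60),
]
rows = group_rows(words)
print(rows)
print(guess_column_count(rows))
```

Here two of the three rows have three words, so the guess is three columns even though the last row has only two.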
 
@@ -29,9 +29,9 @@ It is built on top of PDFMiner's functionality of grouping characters on a page
 Lattice
 -------
 
-Lattice is more deterministic in nature, and does not rely on guesses. It can be used to parse tables that have demarcated lines between cells, and can automatically parse multiple tables present on a page.
+Lattice is more deterministic in nature, and it does not rely on guesses. It can be used to parse tables that have demarcated lines between cells, and it can automatically parse multiple tables present on a page.
 
-It starts by converting the PDF page to an image using ghostscript and then processing it to get horizontal and vertical line segments by applying a set of morphological transformations (erosion and dilation) using OpenCV.
+It starts by converting the PDF page to an image using ghostscript, and then processes it to get horizontal and vertical line segments by applying a set of morphological transformations (erosion and dilation) using OpenCV.
 
 Let's see how Lattice processes the second page of `this PDF`_, step-by-step.
 
@@ -55,7 +55,7 @@ Let's see how Lattice processes the second page of `this PDF`_, step-by-step.
     :scale: 50%
     :align: left
 
-3. Table boundaries are computed, by overlapping the detected line segments again, this time by "`or`_"ing their pixel intensities.
+3. Table boundaries are computed by overlapping the detected line segments again, this time by "`or`_"ing their pixel intensities.
 
 .. _or: https://en.wikipedia.org/wiki/Logical_disjunction
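To make the "or"ing step concrete, the sketch below overlaps two tiny binary masks, one for horizontal and one for vertical line segments. It uses nested lists instead of real images and does not call OpenCV; the AND variant is included only for contrast and is an assumption about how an earlier overlap might find intersections.

```python
from operator import and_, or_


def combine(mask_a, mask_b, op):
    """Apply a pixel-wise binary operation to two equally sized masks."""
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(mask_a, mask_b)
    ]


horizontal = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
vertical = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

outline = combine(horizontal, vertical, or_)     # every pixel on either line
crossings = combine(horizontal, vertical, and_)  # only where the lines meet
print(outline)
print(crossings)
```

The OR result keeps the full outline of the lines, which is what a boundary computation needs, while the AND result keeps only the single pixel where they cross.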
 
@@ -65,7 +65,7 @@ Let's see how Lattice processes the second page of `this PDF`_, step-by-step.
     :scale: 50%
     :align: left
 
-4. Since dimensions of the PDF page and its image vary; the detected table boundaries, line intersections and line segments are scaled and translated to the PDF page's coordinate space, and a representation of the table is created.
+4. Since dimensions of the PDF page and its image vary, the detected table boundaries, line intersections, and line segments are scaled and translated to the PDF page's coordinate space, and a representation of the table is created.
 
 .. image:: ../_static/png/table.png
     :height: 674
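The scaling and translation in step 4 can be sketched as a small coordinate transform. This is an illustration under assumptions, not the library's code: the image origin is taken to be the top-left corner with *y* growing downward, the PDF origin the bottom-left corner with *y* growing upward, and ``image_to_pdf`` is a hypothetical name.

```python
def image_to_pdf(point, image_size, pdf_size):
    """Map an (x, y) pixel position to the PDF page's coordinate space."""
    x, y = point
    image_width, image_height = image_size
    pdf_width, pdf_height = pdf_size
    # Scale factors between the two spaces.
    scale_x = pdf_width / image_width
    scale_y = pdf_height / image_height
    # Flip the y axis (assumed: image y grows down, PDF y grows up).
    return x * scale_x, (image_height - y) * scale_y


# A 1200x1600 pixel image of a 600x800 point page.
print(image_to_pdf((100, 50), (1200, 1600), (600, 800)))
```

A pixel near the top of the image maps to a large *y* value on the page, which is why both a scale and a flip are needed in this model.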
diff --git a/docs/user/install.rst b/docs/user/install.rst
index 17dfe86a..d07cb891 100644
--- a/docs/user/install.rst
+++ b/docs/user/install.rst
@@ -3,12 +3,12 @@
 Installation of Camelot
 =======================
 
-This part of the documentation covers the installation of Camelot. First, you'll need to install the dependencies, which include `tk`_ and `ghostscript`_.
+This part of the documentation covers how to install Camelot. First, you'll need to install the dependencies, which include `tk`_ and `ghostscript`_.
 
 .. _tk: https://packages.ubuntu.com/trusty/python-tk
 .. _ghostscript: https://www.ghostscript.com/
 
-These can be installed using your system's package manager. You can run the following based on your OS.
+These can be installed using your system's package manager. You can run one of the following, based on your OS.
 
 For Ubuntu::
 
@@ -27,17 +27,17 @@ After installing the dependencies, you can simply use pip to install Camelot::
 
     $ pip install camelot-py
 
-Get the Source Code
+Get the source code
 -------------------
 
-Alternatively, you can install from source by:
+Alternatively, you can install from the source by:
 
 1. Cloning the GitHub repository.
 ::
 
     $ git clone https://www.github.com/socialcopsdev/camelot
 
-2. And then simply using pip again.
+2. Then simply using pip again.
 ::
 
     $ cd camelot
diff --git a/docs/user/intro.rst b/docs/user/intro.rst
index 59c666ea..a0bcd65b 100644
--- a/docs/user/intro.rst
+++ b/docs/user/intro.rst
@@ -6,20 +6,20 @@ Introduction
 The Camelot Project
 -------------------
 
-The Portable Document Format (PDF) was born out of `The Camelot Project`_ when a need was felt for "a universal way to communicate documents across a wide variety of machine configurations, operating systems and communication networks". The goal was to make these documents viewable on any display and printable on any modern printers. The invention of the `PostScript`_ page description language, which enabled the creation of *fixed-layout* flat documents (with text, fonts, graphics, images encapsulated), solved the problem.
+The PDF (Portable Document Format) was born out of `The Camelot Project`_ to create "a universal way to communicate documents across a wide variety of machine configurations, operating systems and communication networks". The goal was to make these documents viewable on any display and printable on any modern printer. The invention of the `PostScript`_ page description language, which enabled the creation of *fixed-layout* flat documents (with text, fonts, graphics, images encapsulated), solved this problem.
 
-At a very high level, PostScript defines instructions, such as, "place this character at this x,y coordinate on a plane". Spaces can be *simulated* by placing characters relatively far apart. Extending from that, tables can be *simulated* by placing characters (which constitute words) in two-dimensional grids. A PDF viewer just takes these instructions and draws everything for the user to view. Since it's just characters on a plane, there is no table data structure which can be extracted and used for analysis!
+At a high level, PostScript defines instructions, such as "place this character at this *x,y* coordinate on a plane". Spaces can be *simulated* by placing characters relatively far apart. Extending from that, tables can be *simulated* by placing characters (which constitute words) in two-dimensional grids. A PDF viewer just takes these instructions and draws everything for the user to view. Since a PDF is just characters on a plane, there is no table data structure that can be extracted and used for analysis!
 
-Sadly, a lot of open data is given out as tables which are trapped inside PDF files.
+Sadly, a lot of today's open data is trapped in PDF tables.
 
 .. _PostScript: http://www.planetpdf.com/planetpdf/pdfs/warnock_camelot.pdf
 
-Why another PDF Table Extraction library?
+Why another PDF table extraction library?
 -----------------------------------------
 
-There are both open (`Tabula`_, `pdf-table-extract`_) and closed-source (`smallpdf`_, `PDFTables`_) tools that are widely used, to extract tables from PDF files. They either give a nice output, or fail miserably. There is no in-between. This is not helpful, since everything in the real world, including PDF table extraction, is fuzzy, leading to creation of adhoc table extraction scripts for each different type of PDF that the user wants to parse.
+There are both open-source (`Tabula`_, `pdf-table-extract`_) and closed-source (`smallpdf`_, `PDFTables`_) tools that are widely used to extract tables from PDF files. They either give a nice output or fail miserably. There is no in-between. This is not helpful since everything in the real world, including PDF table extraction, is fuzzy. This leads to the creation of ad-hoc table extraction scripts for each type of PDF table.
 
-Camelot was created with the goal of offering its users complete control over table extraction. If the users are not able to get the desired output with the default configuration, they should be able to tweak it and get the job done!
+We created Camelot to offer users complete control over table extraction. If you can't get your desired output with the default settings, you can tweak them and get the job done!
 
 Here is a `comparison`_ of Camelot's output with outputs from other open-source PDF parsing libraries and tools.
 
@@ -34,7 +34,7 @@ What's in a name?
 
 As you can already guess, this library is named after `The Camelot Project`_.
 
-Fun fact: "Camelot" is the name of the castle in the British comedy film `Monty Python and the Holy Grail`_ (and in the `Arthurian legend`_, which the film depicts), where Arthur leads his men, the Knights of the Round Table, and then sets off elsewhere after deciding that it is "a silly place". Interestingly, the language in which this library is written (Python) was named after Monty Python.
+Fun fact: In the British comedy film `Monty Python and the Holy Grail`_ (and in the `Arthurian legend`_ depicted in the film), "Camelot" is the name of the castle where Arthur leads his men, the Knights of the Round Table, and then sets off elsewhere after deciding that it is "a silly place". Interestingly, the language in which this library is written (Python) was named after Monty Python.
 
 .. _The Camelot Project: http://www.planetpdf.com/planetpdf/pdfs/warnock_camelot.pdf
 .. _Monty Python and the Holy Grail: https://en.wikipedia.org/wiki/Monty_Python_and_the_Holy_Grail
diff --git a/docs/user/quickstart.rst b/docs/user/quickstart.rst
index d7137465..f7c2863d 100644
--- a/docs/user/quickstart.rst
+++ b/docs/user/quickstart.rst
@@ -3,7 +3,7 @@
 Quickstart
 ==========
 
-In a hurry to extract tables from PDFs? This document gives a good introduction to help you get started with using Camelot.
+In a hurry to extract tables from PDFs? This document gives a good introduction to help you get started with Camelot.
 
 Read the PDF
 ------------
@@ -14,7 +14,7 @@ Begin by importing the Camelot module::
 
     >>> import camelot
 
-Now, let's try to read a PDF. You can check out the PDF used in this example, `here`_. Since the PDF has a table with clearly demarcated lines, we will use the :ref:`Lattice ` method here. To do that we will set the ``mesh`` keyword argument to ``True``.
+Now, let's try to read a PDF. (You can check out the PDF used in this example `here`_.) Since the PDF has a table with clearly demarcated lines, we will use the :ref:`Lattice ` method here. To do that, we will set the ``mesh`` keyword argument to ``True``.
 
 .. note:: :ref:`Lattice ` is used by default. You can use :ref:`Stream ` with ``flavor='stream'``.
 
@@ -47,7 +47,7 @@ Let's print the parsing report.
         'page': 1
     }
 
-Woah! The accuracy is top-notch and whitespace is less, that means the table was extracted correctly (most probably). You can access the table as a pandas DataFrame by using the :class:`table ` object's ``df`` property.
+Woah! The accuracy is top-notch and there is less whitespace, which means the table was most likely extracted correctly. You can access the table as a pandas DataFrame by using the :class:`table ` object's ``df`` property.
 
 ::
 
@@ -56,7 +56,7 @@ Woah! The accuracy is top-notch and whitespace is less, that means the table was
 .. csv-table::
   :file: ../_static/csv/foo.csv
 
-Looks good! You can be export the table as a CSV file using its :meth:`to_csv() ` method. Alternatively you can use :meth:`to_json() `, :meth:`to_excel() ` or :meth:`to_html() ` methods to export the table as JSON, Excel and HTML files respectively.
+Looks good! You can now export the table as a CSV file using its :meth:`to_csv() ` method. Alternatively, you can use the :meth:`to_json() `, :meth:`to_excel() ` or :meth:`to_html() ` methods to export the table as JSON, Excel and HTML files, respectively.
 
 ::
 
@@ -85,7 +85,7 @@ By default, Camelot only uses the first page of the PDF to extract tables. To sp
 
     >>> camelot.read_pdf('your.pdf', pages='1,2,3')
 
-The ``pages`` keyword argument accepts pages as comma-separated string of page numbers. You can also specify page ranges, for example ``pages=1,4-10,20-30`` or ``pages=1,4-10,20-end``.
+The ``pages`` keyword argument accepts pages as a comma-separated string of page numbers. You can also specify page ranges, for example ``pages='1,4-10,20-30'`` or ``pages='1,4-10,20-end'``.
 
 ------------------------