Fix doc format. #173

Merged · 4 commits · Jan 24, 2025

2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -21,7 +21,7 @@
# -- Project information -----------------------------------------------------

project = 'rLLM'
-copyright = '2024, Rllm Team'
+copyright = '2024, rLLM Team'
author = 'Zheng Wang, Weichen Li, Xiaotong Huang, Enze Zhang'
version = '1.0'
# The full version, including alpha/beta/rc tags
14 changes: 7 additions & 7 deletions docs/source/introduce/table_data_handle.rst
@@ -4,12 +4,12 @@ Table Data Handle
Data Handling of Tables
-----------------------

-A table contains many columns of different types. Each column type in Rllm is described by a certain semantic type, i.e., ColType. Rllm supports two basic column types so far:
+A table contains many columns of different types. Each column type in rLLM is described by a certain semantic type, i.e., ColType. rLLM supports two basic column types so far:

- :obj:`ColType.CATEGORICAL`: represents categorical or discrete data, such as grade levels in a student dataset and diabetes types in a diabetes dataset.
- :obj:`ColType.NUMERICAL`: represents numerical or continuous data, such as temperature in a weather dataset and income in a salary dataset.
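
As a quick illustration of these two types, here is a minimal sketch of the kind of column-type mapping used throughout this page; the column names ``Pclass`` and ``Age`` are taken from the Titanic example further down, and only the two enum values listed above are assumed.

.. code-block:: python

    # Minimal sketch: map column names to their semantic types.
    # Only ColType.CATEGORICAL and ColType.NUMERICAL are documented above;
    # the column names come from the Titanic example on this page.
    from rllm.types import ColType

    col_types = {
        "Pclass": ColType.CATEGORICAL,  # discrete passenger class (1, 2, 3)
        "Age": ColType.NUMERICAL,       # continuous age in years
    }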

-A table in Rllm is described by an instance of :class:`~rllm.data.table_data.TableData` with many default attributes:
+A table in rLLM is described by an instance of :class:`~rllm.data.table_data.TableData` with many default attributes:

- :obj:`df`: A `pandas.DataFrame`_ storing the raw tabular data.
- :obj:`col_types`: A dictionary indicating the :class:`~rllm.types.ColType` of each column.
@@ -73,10 +73,10 @@ A table in Rllm is described by an instance of :class:`~rllm.data.table_data.Tab
dataset.y
>>> tensor([0, 1, 1, ..., 0, 1, 0])

-dataset.stats_dict[ColType.CATEGORICAL][0]
+dataset.stats_dict[ColType.CATEGORICAL][0]
>>> {<StatType.COUNT: 'COUNT'>: 3, <StatType.MOST_FREQUENT: 'MOST_FREQUENT'>: 2, <StatType.COLNAME: 'COLNAME'>: 'Pclass'}

-dataset.stats_dict[ColType.NUMERICAL][0]
+dataset.stats_dict[ColType.NUMERICAL][0]
>>> {<StatType.MEAN: 'MEAN'>: 29.69911766052246, <StatType.MAX: 'MAX'>: 80.0, <StatType.MIN: 'MIN'>: 0.41999998688697815, <StatType.STD: 'STD'>: 14.526496887207031, <StatType.QUANTILES: 'QUANTILES'>: [0.41999998688697815, 20.125, 28.0, 38.0, 80.0], <StatType.COLNAME: 'COLNAME'>: 'Age'}

Also, an instance of :class:`~rllm.data.table_data.TableData` contains many basic properties:
@@ -105,9 +105,9 @@ We support transferring the data in a :class:`~rllm.data.table_data.TableData` t
Common Benchmark Datasets (Table Part)
---------------------------------------

-Rllm contains a large number of common benchmark datasets. The list of all datasets is available in :mod:`~rllm.datasets`. Our datasets include graph datasets and tabular datasets; we use tabular data for the demonstration.
+rLLM contains a large number of common benchmark datasets. The list of all datasets is available in :mod:`~rllm.datasets`. Our datasets include graph datasets and tabular datasets; we use tabular data for the demonstration.

-Initializing tabular datasets is straightforward in Rllm. A dataset automatically downloads its raw files and processes its columns when initialized.
+Initializing tabular datasets is straightforward in rLLM. A dataset automatically downloads its raw files and processes its columns when initialized.

In the example below, we use one of the pre-loaded datasets, which contains the Titanic passenger data.
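
For orientation, a hedged sketch of what such an initialization might look like; the class name ``Titanic`` and its root-directory argument are assumptions rather than the verbatim API, so consult :mod:`~rllm.datasets` for the exact call.

.. code-block:: python

    # Sketch only: the dataset class name and its argument are assumed,
    # not copied from the rLLM API reference.
    from rllm.datasets import Titanic

    dataset = Titanic("./data")   # raw files are downloaded and processed here
    print(dataset.df.head())      # df holds the raw table as a pandas.DataFrame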

Expand Down Expand Up @@ -138,7 +138,7 @@ In the below example, we will use one of the pre-loaded datasets, containing the

[5 rows x 11 columns]

-Rllm also supports custom datasets, so that you can use Rllm for your own problems. Assume you prepare your `pandas.DataFrame`_ as :obj:`df` with five columns: :obj:`cat1`, :obj:`cat2`, :obj:`num1`, :obj:`num2`, and :obj:`y`. Creating a :class:`~rllm.data.table_data.TableData` object is very easy.
+rLLM also supports custom datasets, so that you can use rLLM for your own problems. Assume you prepare your `pandas.DataFrame`_ as :obj:`df` with five columns: :obj:`cat1`, :obj:`cat2`, :obj:`num1`, :obj:`num2`, and :obj:`y`. Creating a :class:`~rllm.data.table_data.TableData` object is very easy.

.. _pandas.DataFrame: http://pandas.pydata.org/pandas-docs/dev/reference/api/pandas.DataFrame.html#pandas.DataFrame
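
A possible sketch of that construction is shown below; the keyword arguments (``df``, ``col_types``, ``target_col``) are assumptions inferred from the attributes listed earlier, so the real :class:`~rllm.data.table_data.TableData` signature may differ.

.. code-block:: python

    # Sketch under assumptions: keyword names mirror the attributes
    # described above and may not match the actual constructor.
    import pandas as pd
    from rllm.data.table_data import TableData
    from rllm.types import ColType

    df = pd.DataFrame({
        "cat1": [0, 1, 1], "cat2": [2, 0, 1],
        "num1": [0.3, 1.5, 0.7], "num2": [10.0, 3.2, 8.8],
        "y": [0, 1, 0],
    })
    data = TableData(
        df=df,
        col_types={
            "cat1": ColType.CATEGORICAL, "cat2": ColType.CATEGORICAL,
            "num1": ColType.NUMERICAL, "num2": ColType.NUMERICAL,
        },
        target_col="y",
    )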

6 changes: 3 additions & 3 deletions docs/source/tutorial/gnns.rst
@@ -1,9 +1,9 @@
Design of GNNs
===============

-What is a GNN?
+What is GNN?
----------------
-In machine learning, **Graph Neural Networks (GNNs)** are a class of neural networks specifically designed to process graph-structured data. In a GNN, the input is represented as a graph, where nodes (vertices) correspond to entities and edges represent the relationships or interactions between these entities. A typical GNN architecture consists of an initial Transform followed by multiple Convolution layers, as detailed in *Understanding Transform* and *Understanding Convolution*.
+In machine learning, **Graph Neural Networks (GNNs)** are a class of neural networks specifically designed to process graph-structured data. In a GNN, the input is represented as a graph, where nodes (vertices) correspond to entities and edges represent the relationships or interactions between these entities. A typical GNN architecture consists of an initial Transform followed by multiple Convolution layers, as detailed in :doc:`Understanding Transforms <transforms>` and :doc:`Understanding Convolutions <convolutions>`.
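
To make the Transform-then-Convolution idea concrete, here is a minimal, framework-agnostic sketch in plain PyTorch; it is not rLLM's own GCN implementation, whose module names and signatures are not shown in this diff.

.. code-block:: python

    # Conceptual sketch of a two-layer GCN, independent of rLLM's modules.
    import torch
    import torch.nn.functional as F

    class TinyGCN(torch.nn.Module):
        def __init__(self, in_dim, hidden_dim, num_classes):
            super().__init__()
            self.lin1 = torch.nn.Linear(in_dim, hidden_dim)
            self.lin2 = torch.nn.Linear(hidden_dim, num_classes)

        def forward(self, x, adj):
            # adj: dense, symmetrically normalized adjacency with self-loops,
            # i.e. the output of the initial "Transform" step described above.
            x = F.relu(adj @ self.lin1(x))  # first convolution layer
            return adj @ self.lin2(x)       # second convolution layer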


Construct a GCN
@@ -91,4 +91,4 @@ Finally, we need to implement a :obj:`train()` function and a :obj:`test()` func
acc = correct / int(data.test_mask.sum())

print(f"Accuracy: {acc:.4f}")
->>> 0.8150
+>>> 0.8150
5 changes: 3 additions & 2 deletions docs/source/tutorial/rtls.rst
@@ -3,7 +3,7 @@ Design of RTLs

What is RTL?
----------------
-In machine learning, **Relational Table Learnings (RTLs)** typically refer to the learning of relational table data, which consists of multiple interconnected tables with significant heterogeneity. In an RTL, the input comprises multiple table signals that are interrelated. A typical RTL architecture consists of one or more Transforms followed by multiple Convolution layers, as detailed in **Understanding Transforms** and **Understanding Convolutions**.
+In machine learning, **Relational Table Learnings (RTLs)** typically refer to the learning of relational table data, which consists of multiple interconnected tables with significant heterogeneity. In an RTL, the input comprises multiple table signals that are interrelated. A typical RTL architecture consists of one or more Transforms followed by multiple Convolution layers, as detailed in :doc:`Understanding Transforms <transforms>` and :doc:`Understanding Convolutions <convolutions>`.
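
To give a feel for how several table signals can be combined, here is a hedged, plain-PyTorch sketch of the general pattern (per-table Transforms feeding node features into a shared graph convolution); it is not rLLM's BRIDGE implementation, whose modules are not shown in this diff.

.. code-block:: python

    # Conceptual sketch: encode two related tables, then propagate the row
    # features over their relation graph. Not rLLM's actual BRIDGE modules.
    import torch
    import torch.nn.functional as F

    class TinyRTL(torch.nn.Module):
        def __init__(self, dim_a, dim_b, hidden_dim, num_classes):
            super().__init__()
            self.encode_a = torch.nn.Linear(dim_a, hidden_dim)  # Transform for table A
            self.encode_b = torch.nn.Linear(dim_b, hidden_dim)  # Transform for table B
            self.out = torch.nn.Linear(hidden_dim, num_classes)

        def forward(self, feats_a, feats_b, adj):
            # Rows of both tables become nodes of one joint graph; adj is its
            # normalized adjacency built from the inter-table (foreign-key) links.
            h = torch.cat([F.relu(self.encode_a(feats_a)),
                           F.relu(self.encode_b(feats_b))], dim=0)
            return adj @ self.out(h)   # one convolution over the joint graph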


Construct a BRIDGE
@@ -132,5 +132,6 @@ Finally, we jointly train the model and evaluate the results on the test set.
)
preds = logits.argmax(dim=1)
acc = (preds[test_mask] == y[test_mask]).sum(dim=0) / test_mask.sum()

print(f'Accuracy: {acc:.4f}')
->>> 0.3860
+>>> 0.3860
4 changes: 3 additions & 1 deletion docs/source/tutorial/tnns.rst
@@ -2,7 +2,8 @@ Design of TNNs
===============
What is TNN?
----------------
-In machine learning, **Table/Tabular Neural Networks (TNNs)** are recently emerging neural networks specifically designed to process tabular data. In a TNN, the input is structured tabular data, usually organized in rows and columns. A typical TNN architecture consists of an initial Transform followed by multiple Convolution layers, as detailed in *Understanding Transforms* and *Understanding Convolutions*.
+In machine learning, **Table/Tabular Neural Networks (TNNs)** are recently emerging neural networks specifically designed to process tabular data. In a TNN, the input is structured tabular data, usually organized in rows and columns. A typical TNN architecture consists of an initial Transform followed by multiple Convolution layers, as detailed in :doc:`Understanding Transforms <transforms>` and :doc:`Understanding Convolutions <convolutions>`.
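
As a rough illustration of the initial Transform step described above, the sketch below embeds two categorical columns into token vectors in plain PyTorch; it is a conceptual stand-in, not the TabTransformer code used later in this tutorial.

.. code-block:: python

    # Conceptual sketch: one embedding table per categorical column.
    import torch

    num_categories_per_col = [3, 5]        # e.g. a 3-class and a 5-class column
    emb_dim = 8
    embeddings = torch.nn.ModuleList(
        torch.nn.Embedding(n, emb_dim) for n in num_categories_per_col
    )

    cat_values = torch.tensor([[0, 4], [2, 1]])            # 2 rows, 2 columns
    tokens = torch.stack(
        [emb(cat_values[:, i]) for i, emb in enumerate(embeddings)], dim=1
    )                                                      # [rows, columns, emb_dim]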



Construct a TabTransformer
@@ -108,5 +109,6 @@ Finally, we train our model and get the classification results on the test set.
pred_class = pred.argmax(dim=-1)
correct += (y == pred_class).sum()
acc = int(correct) / len(test_dataset)

print(f'Accuracy: {acc:.4f}')
>>> 0.8082