[ENH] Add ability to determine whether an inducing path exists between two nodes #78

Merged
merged 36 commits into from
Jun 21, 2023
Changes from 31 commits
Commits
8b7a1b6
Add function definition for inducing path algorithm
aryan26roy May 22, 2023
317c0a2
Changed function definition and handled multiple graph types as inputs
aryan26roy May 24, 2023
827f770
Add helper functions and test
aryan26roy May 25, 2023
6c3b2f9
Add some more unit tests
aryan26roy May 26, 2023
9d74925
Add implementation
aryan26roy May 26, 2023
cd5cb51
Fixed bugs in tests and implementation
aryan26roy May 26, 2023
e0f1e47
Update Changelog
aryan26roy May 26, 2023
a6ab296
Merge branch 'main' into inducing_path
aryan26roy May 26, 2023
3eec132
Fixed docstring
aryan26roy May 26, 2023
8a413f1
Fixed default arguments
aryan26roy May 26, 2023
ae6cad4
Added unit tests covering corner cases
aryan26roy May 26, 2023
641f06d
Update pywhy_graphs/algorithms/generic.py
aryan26roy Jun 1, 2023
c5255e1
Update pywhy_graphs/algorithms/generic.py
aryan26roy Jun 1, 2023
1aa0c59
Update pywhy_graphs/algorithms/generic.py
aryan26roy Jun 1, 2023
13fe758
Added some tests and incoporated review suggestions
aryan26roy Jun 1, 2023
efa2837
changed docstrings
aryan26roy Jun 1, 2023
30f7a91
Clarified return type
aryan26roy Jun 1, 2023
799bfa3
Corrected implementation
aryan26roy Jun 6, 2023
2282aa7
linting
aryan26roy Jun 6, 2023
f71670a
Changed ancestor check in implementation
aryan26roy Jun 6, 2023
bd658b8
Patched is_collider bug and updated function names
aryan26roy Jun 10, 2023
f3ed973
Updated docstring
aryan26roy Jun 10, 2023
3b7a0de
Patched ancestors bug
aryan26roy Jun 11, 2023
529a2a6
Fixed docstring
aryan26roy Jun 11, 2023
1f9ddac
Linting
aryan26roy Jun 11, 2023
433fd22
Fix docstring
aryan26roy Jun 11, 2023
a4c34d9
Update pywhy_graphs/algorithms/generic.py
aryan26roy Jun 13, 2023
cb8d57a
Linting
aryan26roy Jun 13, 2023
b23a6bd
Fixed typing
aryan26roy Jun 17, 2023
2cabbfc
Added a minimal working example
aryan26roy Jun 17, 2023
0568b4c
Linting
aryan26roy Jun 17, 2023
2919d13
Added refrences section in docstring
aryan26roy Jun 18, 2023
f8eef02
Fix circleCI issues
adam2392 Jun 20, 2023
ead96b0
Merge branch 'main' into inducing_path
adam2392 Jun 20, 2023
81dbd69
Compelte merge
adam2392 Jun 20, 2023
83852da
Removed inducing path example
aryan26roy Jun 21, 2023
1 change: 1 addition & 0 deletions docs/api.rst
@@ -43,6 +43,7 @@ causal graph operations.
.. autosummary::
:toctree: generated/

inducing_path
is_valid_mec_graph
possible_ancestors
possible_descendants
2 changes: 2 additions & 0 deletions docs/whats_new/v0.1.rst
@@ -45,6 +45,8 @@ Changelog
- |Feature| Implement pre-commit hooks for development, by `Jaron Lee`_ (:pr:`68`)
- |Feature| Implement a new submodule for converting graphs to a functional model, with :func:`pywhy_graphs.functional.make_graph_linear_gaussian`, by `Adam Li`_ (:pr:`75`)
- |Feature| Implement a multidomain linear functional graph, with :func:`pywhy_graphs.functional.make_graph_multidomain`, by `Adam Li`_ (:pr:`77`)
- |Feature| Implement and test functions to find inducing paths between two nodes, by `Aryan Roy`_ (:pr:`78`)


Code and Documentation Contributors
-----------------------------------
60 changes: 60 additions & 0 deletions examples/inducingpath/inducing_path.py
@@ -0,0 +1,60 @@
"""
======================================================
An introduction to Inducing Paths and how to find them
======================================================

A path p between two nodes X and Y is called an ``inducing path``
relative to <L, S> on an ancestral graph if every non-endpoint vertex
on p is either in L or a collider, and every collider on p is an
ancestor of either X, Y, or a member of S.


In other words, it is a path between two nodes that cannot be
d-separated, making it active regardless of what variables
we condition on.
"""

import pywhy_graphs
from pywhy_graphs import ADMG
from pywhy_graphs.viz import draw

# construct a causal graph that will result in
# X <- Y <-> Z <-> H; Z -> X
G = ADMG()
G.add_edge("Y", "X", G.directed_edge_name)
G.add_edge("Z", "X", G.directed_edge_name)
G.add_edge("Z", "Y", G.bidirected_edge_name)
G.add_edge("Z", "H", G.bidirected_edge_name)


dot_graph = draw(G)
dot_graph.render(outfile="admg.png", view=True)


# L contains the list of non-colliders in the path
L = {"Y"}

# Since the graph doesn't have a collider which is not
# an ancestor of any of the end-points, S is empty.
# Note that ``{}`` creates a dict, so an empty set is written ``set()``.
S = set()


print(pywhy_graphs.inducing_path(G, "X", "H", L, S))


# Construct a causal graph that will result in:
# X <-> Y <-> Z <-> H; Z -> X
G = ADMG()
G.add_edge("Y", "X", G.bidirected_edge_name)
G.add_edge("Z", "X", G.directed_edge_name)
G.add_edge("Z", "Y", G.bidirected_edge_name)
G.add_edge("Z", "H", G.bidirected_edge_name)

# There are no non-colliders in the path.
L = {}

# Y is a collider that not an ancestor of X or Y.
S = {"Y"}


print(pywhy_graphs.inducing_path(G, "X", "H", L, S))
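
The definition in the module docstring can also be checked by hand on one explicit path. The sketch below is not the implementation from this PR: the helper name `is_inducing_path` is hypothetical, and it assumes the graph is given as two plain networkx objects, an `nx.DiGraph` for the `->` edges and an `nx.Graph` for the `<->` edges. It treats a node as an ancestor of itself, so members of S and the endpoints count as allowed colliders.

```python
import networkx as nx


def is_inducing_path(directed, bidirected, path, L=frozenset(), S=frozenset()):
    """Check one explicit path against the inducing-path definition.

    directed   : nx.DiGraph holding the -> edges (assumed representation)
    bidirected : nx.Graph holding the <-> edges (assumed representation)
    path       : list of nodes, from one endpoint to the other
    """
    x, y = path[0], path[-1]

    # Colliders must be X, Y, a member of S, or a directed ancestor of one.
    targets = {x, y} | set(S)
    allowed = set(targets)
    for node in targets:
        if node in directed:
            allowed |= nx.ancestors(directed, node)

    def adjacent(a, b):
        return (
            directed.has_edge(a, b)
            or directed.has_edge(b, a)
            or bidirected.has_edge(a, b)
        )

    def arrowhead_at(node, other):
        # True for other -> node and for other <-> node.
        return directed.has_edge(other, node) or bidirected.has_edge(other, node)

    # Consecutive nodes must actually be joined by an edge.
    if not all(adjacent(a, b) for a, b in zip(path, path[1:])):
        return False

    # Inspect every non-endpoint node together with its two path neighbors.
    for prev, cur, nxt in zip(path, path[1:], path[2:]):
        if arrowhead_at(cur, prev) and arrowhead_at(cur, nxt):
            if cur not in allowed:  # collider, but not an allowed one
                return False
        elif cur not in L:  # non-collider that is not in L
            return False
    return True


# First graph of the example: X <- Y <-> Z <-> H; Z -> X
directed = nx.DiGraph([("Y", "X"), ("Z", "X")])
bidirected = nx.Graph([("Z", "Y"), ("Z", "H")])
print(is_inducing_path(directed, bidirected, ["X", "Y", "Z", "H"], L={"Y"}))
```

With `L={"Y"}` the non-collider Y is permitted and the collider Z is an ancestor of X, so the check succeeds. With an empty `L` the same path is rejected at Y.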
259 changes: 258 additions & 1 deletion pywhy_graphs/algorithms/generic.py
@@ -1,4 +1,4 @@
-from typing import List, Union
+from typing import List, Set, Union

import networkx as nx

@@ -12,6 +12,7 @@
"is_node_common_cause",
"set_nodes_as_latent_confounders",
"is_valid_mec_graph",
"inducing_path",
]


@@ -333,3 +334,259 @@ def _single_shortest_path_early_stop(G, firstlevel, paths, cutoff, join, valid_p
nextlevel[w] = 1
level += 1
return paths


def _directed_sub_graph_ancestors(G, node: Node):
    """Finds the ancestors of a node in the directed subgraph.

    Parameters
    ----------
    G : Graph
        The graph.
    node : Node
        The node for which we have to find the ancestors.

    Returns
    -------
    out : set
        The ancestors of the provided node.
    """

    return nx.ancestors(G.sub_directed_graph(), node)


def _directed_sub_graph_parents(G, node: Node):
    """Finds the parents of a node in the directed subgraph.

    Parameters
    ----------
    G : Graph
        The graph.
    node : Node
        The node for which we have to find the parents.

    Returns
    -------
    out : set
        The parents of the provided node.
    """

    return set(G.sub_directed_graph().predecessors(node))


def _bidirected_sub_graph_neighbors(G, node: Node):
    """Finds the neighbors of a node in the bidirected subgraph.

    Parameters
    ----------
    G : Graph
        The graph.
    node : Node
        The node for which we have to find the neighbors.

    Returns
    -------
    out : set
        The neighbors of the provided node in the bidirected subgraph.
        Empty if ``G`` is a CPDAG.
    """
    bidirected_parents = set()

    if not isinstance(G, CPDAG):
        bidirected_parents = set(G.sub_bidirected_graph().neighbors(node))

    return bidirected_parents


def _is_collider(G, prev_node: Node, cur_node: Node, next_node: Node):
    """Checks if the given node is a collider or not.

    Parameters
    ----------
    G : graph
        The graph.
    prev_node : node
        The previous node in the path.
    cur_node : node
        The node to be checked.
    next_node : node
        The next node in the path.

    Returns
    -------
    iscollider : bool
        Bool is set true if the node is a collider, false otherwise.
    """
    parents = _directed_sub_graph_parents(G, cur_node)
    parents = parents.union(_bidirected_sub_graph_neighbors(G, cur_node))

    if prev_node in parents and next_node in parents:
        return True

    return False


def _shortest_valid_path(
    G,
    node_x: Node,
    node_y: Node,
    L: Set,
    S: Set,
    visited: Set,
    all_ancestors: Set,
    cur_node: Node,
    prev_node: Node,
):
    """Recursively explores a graph to find a path.

    Finds a path that complies with the inducing path requirements.

    Parameters
    ----------
    G : graph
        The graph.
    node_x : node
        The source node.
    node_y : node
        The destination node.
    L : Set
        Set containing all the non-colliders.
    S : Set
        Set containing all the colliders.
    visited : Set
        Set containing all the nodes already visited.
    all_ancestors : Set
        Set containing all the ancestors a collider needs to be checked against.
    cur_node : node
        The current node.
    prev_node : node
        The previous node in the path.

    Returns
    -------
    path : Tuple[bool, path]
        A tuple containing a bool and a path which is empty if the bool is false.
    """
    path_exists = False
    path = []
    visited.add(cur_node)
    neighbors = G.neighbors(cur_node)

    # Compare nodes by equality: equal labels need not be the same object.
    if cur_node == node_y:
        return (True, [node_y])

    for elem in neighbors:
        if elem in visited:
            continue

        else:
            # If the current node is a collider, check that it is either an
            # ancestor of X, Y or any element of S or that it is
            # the destination node itself.
            if (
                _is_collider(G, prev_node, cur_node, elem)
                and (cur_node not in all_ancestors)
                and (cur_node not in S)
                and (cur_node != node_y)
            ):
                continue

            # If the current node is not a collider, check that it is
            # either in L or the destination node itself.

            elif (
                not _is_collider(G, prev_node, cur_node, elem)
                and (cur_node not in L)
                and (cur_node != node_y)
            ):
                continue

            # if it is a valid node and not the destination node,
            # check if it has a path to the destination node

            path_exists, temp_path = _shortest_valid_path(
                G, node_x, node_y, L, S, visited, all_ancestors, elem, cur_node
            )

            if path_exists:
                path.append(cur_node)
                path.extend(temp_path)
                break

    return (path_exists, path)
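
A note on how the search compares the current node with the destination node: graph nodes are arbitrary hashable objects, so two equal labels are not guaranteed to be the same object in memory. The snippet below is a small standalone illustration of that point, not code from this PR; the tuple labels are made up for the example.

```python
# Two equal node labels built at runtime are distinct objects.
node_a = tuple(["gene", 1])
node_b = tuple(["gene", 1])

same_value = node_a == node_b   # equality compares contents
same_object = node_a is node_b  # identity compares the objects themselves

# Dict-based containers, which is how networkx stores nodes, look keys up
# by hash and equality, so node_b finds the entry stored under node_a.
adjacency = {node_a: {"x"}}
found = node_b in adjacency

print(same_value, same_object, found)  # True False True
```

A destination check written with `==` therefore agrees with how the graph itself resolves nodes, while a check written with `is` can miss a destination that is equal but separately constructed.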


def inducing_path(G, node_x: Node, node_y: Node, L: Set = None, S: Set = None):
    """Checks if an inducing path exists between two nodes as defined in :footcite:`Zhang2008`.

    Parameters
    ----------
    G : Graph
        The graph.
    node_x : node
        The source node.
    node_y : node
        The destination node.
    L : Set
        Nodes that are ignored on the path. Defaults to an empty set. See Notes for details.
    S : Set
        Nodes that are always conditioned on. Defaults to an empty set. See Notes for details.

    Returns
    -------
    path : Tuple[bool, path]
        A tuple containing a bool and a path if the bool is true, an empty list otherwise.

    Notes
    -----
    An inducing path intuitively is a path between two non-adjacent nodes that
    cannot be d-separated. Therefore, the path is always "active" regardless of
    what variables we condition on. L contains all the non-colliders; these nodes
    are ignored along the path. S contains nodes that are always conditioned on
    (hence if a collider is in S or is an ancestor of a node in S, then the
    path through that collider is always "active").
    """
    if L is None:
        L = set()

    if S is None:
        S = set()

    nodes = set(G.nodes)

    if node_x not in nodes or node_y not in nodes:
        raise ValueError("The provided nodes are not in the graph.")

    if node_x == node_y:
        raise ValueError("The source and destination nodes are the same.")

    path = []  # this will contain the path.

    x_ancestors = _directed_sub_graph_ancestors(G, node_x)
    y_ancestors = _directed_sub_graph_ancestors(G, node_y)

    xy_ancestors = x_ancestors.union(y_ancestors)

    s_ancestors: set[Node] = set()

    for elem in S:
        s_ancestors = s_ancestors.union(_directed_sub_graph_ancestors(G, elem))

    # ancestors of X, Y and all the elements of S

    all_ancestors = xy_ancestors.union(s_ancestors)
    x_neighbors = G.neighbors(node_x)

    path_exists = False
    for elem in x_neighbors:

        visited = {node_x}
        if elem not in visited:
            path_exists, temp_path = _shortest_valid_path(
                G, node_x, node_y, L, S, visited, all_ancestors, elem, node_x
            )
            if path_exists:
                path.append(node_x)
                path.extend(temp_path)
                break

    return (path_exists, path)
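
The ancestor set built at the start of `inducing_path` can be reproduced with plain networkx. The sketch below is an illustration under assumptions, not part of the PR: the helper name `ancestors_of_all` and the toy graph are made up, and a bare `nx.DiGraph` stands in for the directed subgraph. It shows that `nx.ancestors` does not include the queried node itself, which is consistent with the path search testing membership in S separately from membership in the ancestor set.

```python
import networkx as nx


def ancestors_of_all(directed, nodes):
    """Union of the directed ancestors of every node in `nodes`."""
    result = set()
    for node in nodes:
        result |= nx.ancestors(directed, node)
    return result


# Toy directed part: A -> X, C -> Y, P -> Q
directed = nx.DiGraph([("A", "X"), ("C", "Y"), ("P", "Q")])
S = {"Q"}

all_ancestors = ancestors_of_all(directed, {"X", "Y"} | S)

# Q is conditioned on, but it is not an ancestor of X, Y, or itself,
# so it only qualifies through the explicit "in S" test.
print(sorted(all_ancestors))
print("Q" in all_ancestors, "Q" in S)
```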