top_order.Rd
Find the Topological Ordering of parameters, which are assumed to come from a linear SEM (also known as SCM) where it is assumed that the independent noise terms have equal error variance.
top_order(X, method = "TD", ...)
X | A matrix containing the oberved variables. |
---|---|
method | The estimation method. Possible choises are "TD", "BU", "HTD" and "HBU". |
... | Terms passed to the method specific estimation steps. If the method "HBU" is specified it is possible to specify a tuning
parameter If the method "HTD" is specified it is posible to specify a
The "full" search method simply looks at all possible subsets of the
current ansestreal set of size |
A vector of length equal to the number of parameters with the estimated topological ordering is returned.
The function top_order
estimated the topological order of the causal
graph. The method is based on the theory and pseudo code from Chen, Drton and
Wang (2018). The 4 methods "TD" (Top Down), "BU" (Bottom Up), "HTD" (High
dimensional Top Down) and "HBU" (High dimensional Bottom Up) are implemented
in this function.
The "TD" and "BU" methods only work when the number of observations
n
is larger then the number of parameters p
. Moreover the
consistency results shown in the preprint only hold if \(n > p^2 log(p)\).
The "HTD" and "HBU" both require extra assumption either on the maximal in-degree or on the maximal size of markov blankets in the graph respectivly. Hence these methods both has parameters that allows the user to tune and use different sub procedures that fit the assumptions on the graph.
The "HBU" needs a tuning parameter M
which is used in the organic
lasso to estimate the conditional variance:
$$\hat{\sigma}^2_{j,\Theta} = \min((1/n)||X_j - X_{V \ (\Theta \cup \{j\})}\beta||_2^2 + 2\lambda||\beta||_1^2)$$
where \(\lambda = (2M(1/n)log(p))^{1/2}\).
The "HTD" needs both a max.degree
equal to the assumed maximal
in-degree in the graph and a search
parameter. One of three different
search
procedures "full", "B&B" or "OMP" must be used. All three
procedures seek to find a subset of variables of size max.degree
that
minimize the conditional variance of the current variable in question given
this subset.
The "full" search method simply goes through all poissible subsets of size
max.degree
. The "B&B" searched by bound and branch methods. Lastly the
"OMP" procedure findes max.degree
variables via forward selection. -
Chen, W., Drton, M. & Wang, Y. S. (2018). On Causal Discovery with Equal Variance Assumption. arXiv preprint arXiv:1807.03419.
The wrapper function graph_est
combines top_order
and the model selection function graph_from_top
into one
function that estimates the causal graph from the data.
n <- 1000 p <- 5 sigma <- 1 # an example where all nodes in the graph are root nodes X <- matrix(rnorm(n * p, 0, sigma), ncol = p) order <- top_order(X) # an example where pa(X_i) = X_j for all j < i for (i in seq_len(p)[-1]) { X[ ,i] <- X[ ,i-1] + X[ ,i] } top_order(X, method = "TD")#> [1] 1 2 3 4 5top_order(X, method = "BU")#> [1] 1 2 3 4 5top_order(X, method = "HTD", max.degree = 2L, search = "B&B")#> [1] 1 2 3 4 5top_order(X, method = "HBU")#> [1] 1 3 4 2 5