distance_histogram
- RNAdist.sampling.ed_sampling.distance_histogram(fc: RNA.fold_compound, nr_samples: int = 1000, i: int = None, j: int = None, return_samples: bool = False)
Samples structures for a sequence and returns the histogram of (all) pairwise distances.
Uses a much faster implementation if both i and j are specified; otherwise it computes the histograms for all pairs.
- Parameters:
fc (RNA.fold_compound) – ViennaRNA fold compound.
nr_samples (int) – Number of samples to draw.
i (int) – Only use starting index i.
j (int) – Only use target index j.
return_samples (bool) – If True, also returns the samples as a dictionary with bit-compressed structures as keys and counts as values. Structures can be decompressed using bit_to_structure().
- Returns:
N x N x N matrix or length-N array, depending on whether i and j are specified.
- Without i and j, the full matrix contains the histogram of distances from nucleotide i to j at matrix[i][j].
- Return type:
np.ndarray
dict: Dictionary containing the bytes representation of structures, if return_samples is True
It is possible to sample expected distances using the ViennaRNA fold compound as follows. Please make sure to enable unique multiloop decomposition via uniq_ML=1.
>>> import RNA
>>> from RNAdist.sampling.ed_sampling import distance_histogram
>>> seq = "GGGCUAUUAGCUC"
>>> fc = RNA.fold_compound(seq, RNA.md(uniq_ML=1))
>>> x = distance_histogram(fc)
>>> x[0, -1]
array([ 0, 867, 1, 109, 0, 14, 0, 0, 0, 0, 0, 0, 9], dtype=int32)
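A histogram such as the x[0, -1] output above can be reduced to a single expected distance. The following is a minimal sketch, not part of RNAdist: the helper name expected_distance is hypothetical, and it assumes that position d of the histogram holds the number of sampled structures in which the two nucleotides are at distance d. It uses plain Python so it runs without RNAdist or ViennaRNA installed, and it should work equally on a row of the returned NumPy array.

```python
def expected_distance(histogram):
    """Mean distance implied by a histogram of sampled distances.

    Assumption: histogram[d] is the number of samples with distance d.
    """
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram contains no samples")
    # Weight each distance by how often it was sampled.
    return sum(d * count for d, count in enumerate(histogram)) / total


# Counts copied from the x[0, -1] output in the example above.
counts = [0, 867, 1, 109, 0, 14, 0, 0, 0, 0, 0, 0, 9]

n_samples = sum(counts)          # should equal nr_samples (default 1000)
mean_dist = expected_distance(counts)
print(n_samples, mean_dist)
```

Because the structures are sampled stochastically, the counts, and therefore the estimate, will differ slightly between runs; increasing nr_samples should reduce that variation.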