Fasta map class
- First, we create a named tuple that holds the sample information needed for both sorting and filtering.
- We go through the CSV table once to build a dictionary: each key is a region, and each value is a list to which we append the named tuples of the samples registered in that region.
- Once the dictionary is built, we pass each region's list to the function get_average_row, which returns the sample row holding the average value among the samples of that region.
- Finally, a list comprehension filters the CSV rows down to the selected samples.
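The steps above can be sketched as follows. This is only an illustration under stated assumptions: the column names `accession`, `region` and `length` are hypothetical, "average" is read here as the median sequence length, and the body of `get_average_row` is a guess at its behaviour rather than the actual implementation.

```python
import csv
import io
from collections import defaultdict, namedtuple

# Hypothetical fields: the original text does not list them.
Sample = namedtuple("Sample", ["accession", "region", "length"])


def get_average_row(samples):
    """Return the sample with the median length (assumed meaning of 'average')."""
    ordered = sorted(samples, key=lambda sample: sample.length)
    return ordered[len(ordered) // 2]


def select_average_samples(rows):
    # Single pass over the table: group the samples by region.
    by_region = defaultdict(list)
    for row in rows:
        by_region[row["region"]].append(
            Sample(row["accession"], row["region"], int(row["length"]))
        )
    # One representative accession per region.
    chosen = {get_average_row(samples).accession for samples in by_region.values()}
    # The list comprehension keeps only the rows of the chosen samples.
    return [row for row in rows if row["accession"] in chosen]


# Made-up data, used only to exercise the sketch.
CSV_TEXT = """accession,region,length
A1,Europe,100
A2,Europe,300
A3,Europe,200
B1,Asia,50
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
selected = select_average_samples(rows)
```

With this made-up table, `selected` keeps the row of `A3` for Europe and the row of `B1` for Asia.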
class FastaMap
Represents a map that stores RNA sequences.
A fasta map is initialised using:
fasta_map = FastaMap(file_path)  # file_path: the path to the .fasta file
_read_fasta
Reads a FASTA file and returns a dict whose keys are the accessions and whose values are the RNA sequences.
:param file_path: the path to the .fasta file
:return: sequences: Dict[str, str]
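A minimal sketch of how such a reader might work, not the class's actual code. It assumes that the accession is the first whitespace-delimited token of each `>` header line; the helper `parse_fasta_lines` is a hypothetical name introduced so the parsing can be tried without a file on disk.

```python
from typing import Dict, Iterable, List


def parse_fasta_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Build {accession: sequence} from the lines of a FASTA file."""
    chunks: Dict[str, List[str]] = {}
    accession = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Assumption: the accession is the first token of the header.
            tokens = line[1:].split()
            accession = tokens[0] if tokens else ""
            chunks[accession] = []
        elif accession is not None:
            # Sequence lines are collected until the next header.
            chunks[accession].append(line)
    return {acc: "".join(pieces) for acc, pieces in chunks.items()}


def read_fasta(file_path: str) -> Dict[str, str]:
    """Read a .fasta file and return its sequences keyed by accession."""
    with open(file_path) as handle:
        return parse_fasta_lines(handle)
```

Sequences that span several lines are joined into a single string per accession.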
_get_rna
Gets the header and the RNA sequence from a string.
:param: a string that contains information about the genome
:return: An id-value pair: Tuple[str, str]
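As an illustration only, the split into id and sequence could look like the sketch below. It assumes the input string is a single FASTA record (a header line followed by sequence lines) and that the id is the first token of the header; the parameter name `record` is hypothetical.

```python
from typing import Tuple


def get_rna(record: str) -> Tuple[str, str]:
    """Split one FASTA record into (id, sequence)."""
    # The first line is the header; everything after it is sequence text.
    header, _, body = record.strip().partition("\n")
    tokens = header.lstrip(">").split()
    accession = tokens[0] if tokens else ""
    # Remove line breaks and any other whitespace inside the sequence.
    sequence = "".join(body.split())
    return accession, sequence


pair = get_rna(">MN908947.3 example header\nAUUA\nAAGG\n")
```

Here `pair` is `("MN908947.3", "AUUAAAGG")`.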
group_samples
Creates the sets of "family samples".
:param csv_table:
:return: tuple of relations: Tuple[List[set]]
compare_multi
Function to parallelize comparisons
:param ids:
:return: Tuple of relations: Tuple[str, str, float]
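The sketch below shows one way pairwise comparisons could be distributed over a worker pool. It is not the original implementation: the similarity measure (`difflib.SequenceMatcher.ratio`) is a stand-in, the names `compare_pair` and `compare_all` are hypothetical, and a thread pool is used only to keep the example runnable anywhere. For CPU-bound comparisons a process pool would usually be the better choice.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher
from functools import partial
from itertools import combinations
from typing import Dict, List, Tuple


def compare_pair(ids: Tuple[str, str], sequences: Dict[str, str]) -> Tuple[str, str, float]:
    """Compare two samples and return (id_a, id_b, similarity)."""
    first, second = ids
    # Stand-in similarity in [0, 1]; the real measure is not given in the text.
    matcher = SequenceMatcher(None, sequences[first], sequences[second], autojunk=False)
    return first, second, matcher.ratio()


def compare_all(sequences: Dict[str, str], max_workers: int = 4) -> List[Tuple[str, str, float]]:
    """Run every pairwise comparison through a worker pool."""
    pairs = list(combinations(sorted(sequences), 2))
    worker = partial(compare_pair, sequences=sequences)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the results in the same order as `pairs`.
        return list(pool.map(worker, pairs))


# Made-up sequences, used only to exercise the sketch.
results = compare_all({"s1": "AUGC", "s2": "AUGC", "s3": "AUGG"})
```

Each element of `results` has the `Tuple[str, str, float]` shape described above, one per pair of sample ids.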
generate_relations
Generates a tuple of lists in which each list stores the sample names.
:param compares:
:return: Tuple of relations between samples: Tuple[List[set]]
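One possible reading of this step is that the pairwise results are merged into sets of related sample names. The sketch below follows that reading under loud assumptions: the `threshold` parameter is hypothetical, two samples count as related when their score reaches it, and the function returns a plain list of sets rather than the exact return type listed above.

```python
from typing import Iterable, List, Set, Tuple


def generate_relations(
    compares: Iterable[Tuple[str, str, float]], threshold: float = 0.9
) -> List[Set[str]]:
    """Merge (id_a, id_b, score) comparisons into sets of related samples."""
    groups: List[Set[str]] = []
    for first, second, score in compares:
        # Related samples enter as one set, unrelated ones as singletons.
        new_sets = [{first, second}] if score >= threshold else [{first}, {second}]
        for members in new_sets:
            # Absorb every existing group that shares a sample with `members`.
            for group in [g for g in groups if g & members]:
                members |= group
                groups.remove(group)
            groups.append(members)
    return groups


# Made-up comparison results, used only to exercise the sketch.
relations = generate_relations(
    [("s1", "s2", 1.0), ("s1", "s3", 0.75), ("s2", "s3", 0.75)]
)
```

With these made-up scores, `s1` and `s2` end up in the same set and `s3` stays alone.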