The objective of the project is to perform an enrichment analysis of fusions and their associated chromosomal regions that contain SNPs linked to a particular disease or trait (data that we already have), and to see whether they relate in some way to Alzheimer's disease.
In stage 1, we would like to assign the trait/disease genes in the data to a medical classification (which will be very helpful and will advance the project) and divide them into categories that I will define based on my knowledge of the field. Stage 2, the enrichment analysis itself, will consist of:
- Counting how many SNPs belong to each category and deciding whether or not there is a valid enrichment in certain DNA regions.
- Counting how many SNPs lie close to the fusion/junction boundaries.
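The two counting steps above can be sketched as follows. This is only an illustration, written in Python for brevity rather than Perl. The toy SNP records, the category names, the boundary positions and the 500 bp window are all hypothetical placeholders, since the actual data format and the distance that counts as "close" have not been fixed yet.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical toy data: (snp_id, chromosome, position, category)
snps = [
    ("rs1", "chr1", 1_000, "neurological"),
    ("rs2", "chr1", 50_000, "metabolic"),
    ("rs3", "chr2", 7_500, "neurological"),
    ("rs4", "chr2", 90_000, "immune"),
]

# Hypothetical fusion/junction boundary positions per chromosome
boundaries = {"chr1": [1_200, 80_000], "chr2": [7_000]}


def count_by_category(snps):
    """Count how many SNPs fall into each category."""
    return Counter(category for _, _, _, category in snps)


def snps_near_boundaries(snps, boundaries, window):
    """Return the IDs of SNPs within `window` bp of any boundary."""
    sorted_bounds = {chrom: sorted(pos) for chrom, pos in boundaries.items()}
    near = []
    for snp_id, chrom, pos, _ in snps:
        positions = sorted_bounds.get(chrom, [])
        i = bisect_left(positions, pos)
        # The nearest boundary is the one just before or just after the SNP.
        candidates = positions[max(i - 1, 0): i + 1]
        if any(abs(pos - b) <= window for b in candidates):
            near.append(snp_id)
    return near


category_counts = count_by_category(snps)
near_ids = snps_near_boundaries(snps, boundaries, window=500)  # assumed window
print(category_counts)
print(near_ids)
```

Sorting the boundaries once and using a binary search keeps the lookup fast when there are many SNPs and many junctions.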
This will be done with tools such as a Perl script (to calculate the enrichment) and an accompanying Venn diagram to show the overlap. We expect that the larger the DNA region, i.e. the more base pairs it contains, the wider the spectrum of relations we will find.
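The statistical test for the enrichment has not been chosen yet. One common option, assumed here purely for illustration, is a one-sided hypergeometric test: given N SNPs in total, K of which belong to a category, and n SNPs lying near junctions, it gives the probability of seeing at least k category SNPs near junctions by chance. The sketch below is in Python rather than Perl, and all names and numbers in it are hypothetical. The set intersection at the end is the quantity a Venn diagram would display.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N total, K in category, n drawn)."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical numbers: 10 SNPs in total, 5 in the category of interest,
# 5 SNPs near junctions, all 5 of which are in the category.
p_value = hypergeom_enrichment_p(k=5, n=5, K=5, N=10)
print(p_value)

# Overlap between two hypothetical SNP sets, as shown in a Venn diagram
category_snps = {"rs1", "rs3", "rs7"}
junction_snps = {"rs1", "rs3", "rs4"}
overlap = category_snps & junction_snps
print(sorted(overlap))
```

A small p-value suggests that the category is over-represented near the junctions. With several categories tested, a multiple-testing correction would also be needed.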
Ultimately, the main aim is to understand how these SNPs are associated with specific diseases as a result of lying close to the novel fusions/junctions.