WO2015171457A1 - Methods of identifying biomarkers associated with or causative of the progression of disease, in particular for use in prognosticating primary open angle glaucoma
- Publication number
- WO2015171457A1 (PCT/US2015/028833)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- hsa
- mir
- disease
- poag
- genes
- Prior art date
Links
- 206010030348 Open-Angle Glaucoma Diseases 0.000 title claims abstract description 154
- 201000010099 disease Diseases 0.000 title claims abstract description 154
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 title claims abstract description 154
- 201000006366 primary open angle glaucoma Diseases 0.000 title claims abstract description 146
- 238000000034 method Methods 0.000 title claims abstract description 145
- 239000000090 biomarker Substances 0.000 title claims abstract description 54
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 230
- 230000014509 gene expression Effects 0.000 claims abstract description 114
- 108700028369 Alleles Proteins 0.000 claims abstract description 81
- 238000012360 testing method Methods 0.000 claims abstract description 24
- 238000011282 treatment Methods 0.000 claims abstract description 23
- 238000007482 whole exome sequencing Methods 0.000 claims abstract description 19
- 238000012937 correction Methods 0.000 claims abstract description 7
- -1 ABBBP Proteins 0.000 claims description 206
- 108700011259 MicroRNAs Proteins 0.000 claims description 65
- 210000001519 tissue Anatomy 0.000 claims description 40
- 239000002679 microRNA Substances 0.000 claims description 33
- 108091053841 Homo sapiens miR-483 stem-loop Proteins 0.000 claims description 31
- 108091055511 Homo sapiens miR-548ah stem-loop Proteins 0.000 claims description 30
- 210000000349 chromosome Anatomy 0.000 claims description 25
- 238000012163 sequencing technique Methods 0.000 claims description 23
- 108091070395 Homo sapiens miR-31 stem-loop Proteins 0.000 claims description 21
- 150000007523 nucleic acids Chemical class 0.000 claims description 21
- 230000001105 regulatory effect Effects 0.000 claims description 21
- 108091044923 Homo sapiens miR-1226 stem-loop Proteins 0.000 claims description 20
- 108091069019 Homo sapiens miR-124-1 stem-loop Proteins 0.000 claims description 20
- 108091069008 Homo sapiens miR-124-2 stem-loop Proteins 0.000 claims description 20
- 108091069007 Homo sapiens miR-124-3 stem-loop Proteins 0.000 claims description 20
- 108091069015 Homo sapiens miR-138-2 stem-loop Proteins 0.000 claims description 20
- 108091067580 Homo sapiens miR-214 stem-loop Proteins 0.000 claims description 20
- 108091069517 Homo sapiens miR-224 stem-loop Proteins 0.000 claims description 20
- 108091080306 Homo sapiens miR-2277 stem-loop Proteins 0.000 claims description 20
- 108091055648 Homo sapiens miR-4423 stem-loop Proteins 0.000 claims description 20
- 108091032542 Homo sapiens miR-452 stem-loop Proteins 0.000 claims description 20
- 108091023276 Homo sapiens miR-4640 stem-loop Proteins 0.000 claims description 20
- 108091023130 Homo sapiens miR-4677 stem-loop Proteins 0.000 claims description 20
- 108091064365 Homo sapiens miR-505 stem-loop Proteins 0.000 claims description 20
- 108091060465 Homo sapiens miR-767 stem-loop Proteins 0.000 claims description 20
- 102000039446 nucleic acids Human genes 0.000 claims description 20
- 108020004707 nucleic acids Proteins 0.000 claims description 20
- 108091068853 Homo sapiens miR-100 stem-loop Proteins 0.000 claims description 19
- 108091067617 Homo sapiens miR-139 stem-loop Proteins 0.000 claims description 19
- 108091056757 Homo sapiens miR-3613 stem-loop Proteins 0.000 claims description 19
- 108091056649 Homo sapiens miR-3622a stem-loop Proteins 0.000 claims description 19
- 108091063807 Homo sapiens miR-545 stem-loop Proteins 0.000 claims description 19
- 239000002773 nucleotide Substances 0.000 claims description 18
- 125000003729 nucleotide group Chemical group 0.000 claims description 18
- 108091069002 Homo sapiens miR-145 stem-loop Proteins 0.000 claims description 17
- 108091063813 Homo sapiens miR-455 stem-loop Proteins 0.000 claims description 17
- 108091089467 Homo sapiens miR-5584 stem-loop Proteins 0.000 claims description 17
- 108091063808 Homo sapiens miR-574 stem-loop Proteins 0.000 claims description 17
- 108091086709 Homo sapiens miR-675 stem-loop Proteins 0.000 claims description 17
- 238000005259 measurement Methods 0.000 claims description 17
- 102000004169 proteins and genes Human genes 0.000 claims description 17
- 108091070522 Homo sapiens let-7a-2 stem-loop Proteins 0.000 claims description 16
- 108091072933 Homo sapiens miR-3117 stem-loop Proteins 0.000 claims description 15
- 108091044881 Homo sapiens miR-1246 stem-loop Proteins 0.000 claims description 11
- 108091044886 Homo sapiens miR-1250 stem-loop Proteins 0.000 claims description 11
- 108091069022 Homo sapiens miR-130a stem-loop Proteins 0.000 claims description 11
- 108091070490 Homo sapiens miR-18a stem-loop Proteins 0.000 claims description 11
- 108091031921 Homo sapiens miR-18b stem-loop Proteins 0.000 claims description 11
- 108091092301 Homo sapiens miR-193b stem-loop Proteins 0.000 claims description 11
- 108091080307 Homo sapiens miR-2276 stem-loop Proteins 0.000 claims description 11
- 108091072662 Homo sapiens miR-3182 stem-loop Proteins 0.000 claims description 11
- 108091065451 Homo sapiens miR-34b stem-loop Proteins 0.000 claims description 11
- 108091055376 Homo sapiens miR-4448 stem-loop Proteins 0.000 claims description 11
- 108091032929 Homo sapiens miR-449a stem-loop Proteins 0.000 claims description 11
- 108091045258 Homo sapiens miR-513c stem-loop Proteins 0.000 claims description 11
- 239000012472 biological sample Substances 0.000 claims description 11
- 108091068943 Homo sapiens miR-105-1 stem-loop Proteins 0.000 claims description 10
- 108091068938 Homo sapiens miR-105-2 stem-loop Proteins 0.000 claims description 10
- 108091067642 Homo sapiens miR-129-1 stem-loop Proteins 0.000 claims description 10
- 108091069093 Homo sapiens miR-129-2 stem-loop Proteins 0.000 claims description 10
- 108091067471 Homo sapiens miR-211 stem-loop Proteins 0.000 claims description 10
- 108091069021 Homo sapiens miR-30b stem-loop Proteins 0.000 claims description 10
- 108091072697 Homo sapiens miR-323b stem-loop Proteins 0.000 claims description 10
- 108091067563 Homo sapiens miR-376a-1 stem-loop Proteins 0.000 claims description 10
- 108091069003 Homo sapiens miR-9-1 stem-loop Proteins 0.000 claims description 10
- 108091068996 Homo sapiens miR-9-2 stem-loop Proteins 0.000 claims description 10
- 108091069001 Homo sapiens miR-9-3 stem-loop Proteins 0.000 claims description 10
- 108091065457 Homo sapiens miR-99b stem-loop Proteins 0.000 claims description 10
- 108091070511 Homo sapiens let-7c stem-loop Proteins 0.000 claims description 9
- 108091044695 Homo sapiens miR-1248 stem-loop Proteins 0.000 claims description 9
- 108091069092 Homo sapiens miR-138-1 stem-loop Proteins 0.000 claims description 9
- 108091067654 Homo sapiens miR-148a stem-loop Proteins 0.000 claims description 9
- 108091070493 Homo sapiens miR-21 stem-loop Proteins 0.000 claims description 9
- 108091070400 Homo sapiens miR-27a stem-loop Proteins 0.000 claims description 9
- 108091007773 MIR100 Proteins 0.000 claims description 9
- 108091008065 MIR21 Proteins 0.000 claims description 9
- 108091008051 MIR27A Proteins 0.000 claims description 9
- 108091007772 MIRLET7C Proteins 0.000 claims description 9
- 102000054765 polymorphisms of proteins Human genes 0.000 claims description 9
- 108091070519 Homo sapiens miR-19b-1 stem-loop Proteins 0.000 claims description 8
- 230000002401 inhibitory effect Effects 0.000 claims description 8
- 101100396232 Bombyx mori EN03 gene Proteins 0.000 claims description 7
- 102100033781 Collagen alpha-2(IV) chain Human genes 0.000 claims description 7
- 101000710876 Homo sapiens Collagen alpha-2(IV) chain Proteins 0.000 claims description 7
- 101000984192 Homo sapiens Leukocyte immunoglobulin-like receptor subfamily B member 3 Proteins 0.000 claims description 7
- 101000972286 Homo sapiens Mucin-4 Proteins 0.000 claims description 7
- 101000736906 Homo sapiens Protein prune homolog 2 Proteins 0.000 claims description 7
- 101000798707 Homo sapiens Transmembrane protease serine 13 Proteins 0.000 claims description 7
- 108091067602 Homo sapiens miR-181b-1 stem-loop Proteins 0.000 claims description 7
- 108091065989 Homo sapiens miR-181b-2 stem-loop Proteins 0.000 claims description 7
- 102100025582 Leukocyte immunoglobulin-like receptor subfamily B member 3 Human genes 0.000 claims description 7
- 102100022693 Mucin-4 Human genes 0.000 claims description 7
- 102100036040 Protein prune homolog 2 Human genes 0.000 claims description 7
- 102100032467 Transmembrane protease serine 13 Human genes 0.000 claims description 7
- 238000007481 next generation sequencing Methods 0.000 claims description 7
- 238000000528 statistical test Methods 0.000 claims description 7
- 102100022031 39S ribosomal protein L23, mitochondrial Human genes 0.000 claims description 6
- 238000000729 Fisher's exact test Methods 0.000 claims description 6
- 101001107433 Homo sapiens 39S ribosomal protein L23, mitochondrial Proteins 0.000 claims description 6
- 101000972548 Homo sapiens Leucine-rich repeat-containing protein 37A Proteins 0.000 claims description 6
- 101000984189 Homo sapiens Leukocyte immunoglobulin-like receptor subfamily B member 2 Proteins 0.000 claims description 6
- 101001093937 Homo sapiens SEC14-like protein 1 Proteins 0.000 claims description 6
- 101000976446 Homo sapiens Zinc finger protein 594 Proteins 0.000 claims description 6
- 101000964789 Homo sapiens Zinc finger protein 83 Proteins 0.000 claims description 6
- 102100022672 Leucine-rich repeat-containing protein 37A Human genes 0.000 claims description 6
- 102100025583 Leukocyte immunoglobulin-like receptor subfamily B member 2 Human genes 0.000 claims description 6
- 102100035214 SEC14-like protein 1 Human genes 0.000 claims description 6
- 102100023641 Zinc finger protein 594 Human genes 0.000 claims description 6
- 102100040639 Zinc finger protein 83 Human genes 0.000 claims description 6
- 230000004770 neurodegeneration Effects 0.000 claims description 6
- 208000015122 neurodegenerative disease Diseases 0.000 claims description 6
- 230000008707 rearrangement Effects 0.000 claims description 6
- 102100024626 5'-AMP-activated protein kinase subunit gamma-2 Human genes 0.000 claims description 5
- 102100028247 Abl interactor 1 Human genes 0.000 claims description 5
- 102100034564 Ankyrin repeat domain-containing protein 36A Human genes 0.000 claims description 5
- 102100034566 Ankyrin repeat domain-containing protein 36B Human genes 0.000 claims description 5
- 102000014811 CACNA1E Human genes 0.000 claims description 5
- 102100039534 Calcium-activated chloride channel regulator 4 Human genes 0.000 claims description 5
- 102100034952 Coiled-coil domain-containing protein 66 Human genes 0.000 claims description 5
- 102100036727 Deformed epidermal autoregulatory factor 1 homolog Human genes 0.000 claims description 5
- 102100032565 Golgin subfamily A member 3 Human genes 0.000 claims description 5
- 101000760987 Homo sapiens 5'-AMP-activated protein kinase subunit gamma-2 Proteins 0.000 claims description 5
- 101000724225 Homo sapiens Abl interactor 1 Proteins 0.000 claims description 5
- 101000924343 Homo sapiens Ankyrin repeat domain-containing protein 36A Proteins 0.000 claims description 5
- 101000924345 Homo sapiens Ankyrin repeat domain-containing protein 36B Proteins 0.000 claims description 5
- 101000888577 Homo sapiens Calcium-activated chloride channel regulator 4 Proteins 0.000 claims description 5
- 101000946606 Homo sapiens Coiled-coil domain-containing protein 66 Proteins 0.000 claims description 5
- 101000929421 Homo sapiens Deformed epidermal autoregulatory factor 1 homolog Proteins 0.000 claims description 5
- 101001014634 Homo sapiens Golgin subfamily A member 3 Proteins 0.000 claims description 5
- 101000608935 Homo sapiens Leukosialin Proteins 0.000 claims description 5
- 101001005668 Homo sapiens Mastermind-like protein 3 Proteins 0.000 claims description 5
- 101000589016 Homo sapiens Myomegalin Proteins 0.000 claims description 5
- 101000741895 Homo sapiens POTE ankyrin domain family member C Proteins 0.000 claims description 5
- 101000735217 Homo sapiens Paralemmin-2 Proteins 0.000 claims description 5
- 101000775052 Homo sapiens Protein AHNAK2 Proteins 0.000 claims description 5
- 101000942742 Homo sapiens Protein lin-7 homolog A Proteins 0.000 claims description 5
- 101000685298 Homo sapiens Protein sel-1 homolog 3 Proteins 0.000 claims description 5
- 101000654679 Homo sapiens Semaphorin-5B Proteins 0.000 claims description 5
- 101000830563 Homo sapiens Trinucleotide repeat-containing gene 18 protein Proteins 0.000 claims description 5
- 101000607865 Homo sapiens Ubiquitin carboxyl-terminal hydrolase 20 Proteins 0.000 claims description 5
- 101000860430 Homo sapiens Versican core protein Proteins 0.000 claims description 5
- 101000867844 Homo sapiens Voltage-dependent R-type calcium channel subunit alpha-1E Proteins 0.000 claims description 5
- 101000955141 Homo sapiens WASH complex subunit 1 Proteins 0.000 claims description 5
- 101000931048 Homo sapiens Zinc finger protein DPF3 Proteins 0.000 claims description 5
- 101710043141 KIAA0930 Proteins 0.000 claims description 5
- 108010025026 Ku Autoantigen Proteins 0.000 claims description 5
- 102100025134 Mastermind-like protein 3 Human genes 0.000 claims description 5
- 102100032966 Myomegalin Human genes 0.000 claims description 5
- 208000022873 Ocular disease Diseases 0.000 claims description 5
- 102100038763 POTE ankyrin domain family member C Human genes 0.000 claims description 5
- 102100035032 Paralemmin-2 Human genes 0.000 claims description 5
- 102100031838 Protein AHNAK2 Human genes 0.000 claims description 5
- 102100032928 Protein lin-7 homolog A Human genes 0.000 claims description 5
- 102100023163 Protein sel-1 homolog 3 Human genes 0.000 claims description 5
- 102100032780 Semaphorin-5B Human genes 0.000 claims description 5
- 102100024597 Trinucleotide repeat-containing gene 18 protein Human genes 0.000 claims description 5
- 102100039920 Ubiquitin carboxyl-terminal hydrolase 20 Human genes 0.000 claims description 5
- 102100025766 Uncharacterized protein KIAA0930 Human genes 0.000 claims description 5
- 102100028437 Versican core protein Human genes 0.000 claims description 5
- 102100038956 WASH complex subunit 1 Human genes 0.000 claims description 5
- 102100036296 Zinc finger protein DPF3 Human genes 0.000 claims description 5
- 238000012217 deletion Methods 0.000 claims description 5
- 230000037430 deletion Effects 0.000 claims description 5
- 238000003780 insertion Methods 0.000 claims description 5
- 230000037431 insertion Effects 0.000 claims description 5
- 238000012417 linear regression Methods 0.000 claims description 5
- QHXIQBNUQFLDAB-UHFFFAOYSA-N 4-N,6-N-dimethyl-2-N-propan-2-yl-1,3,5-triazine-2,4,6-triamine Chemical compound CNC1=NC(NC)=NC(NC(C)C)=N1 QHXIQBNUQFLDAB-UHFFFAOYSA-N 0.000 claims description 4
- 102100036512 7-dehydrocholesterol reductase Human genes 0.000 claims description 4
- 102100032309 A disintegrin and metalloproteinase with thrombospondin motifs 15 Human genes 0.000 claims description 4
- 108091005672 ADAMTS15 Proteins 0.000 claims description 4
- 102100034526 AP-1 complex subunit mu-1 Human genes 0.000 claims description 4
- 102100039675 Adenylate cyclase type 2 Human genes 0.000 claims description 4
- 102100036793 Adhesion G protein-coupled receptor L3 Human genes 0.000 claims description 4
- 102100032964 Alpha-actinin-2 Human genes 0.000 claims description 4
- 102100027708 Astrotactin-1 Human genes 0.000 claims description 4
- 102100032850 Beta-1-syntrophin Human genes 0.000 claims description 4
- 102100040840 C-type lectin domain family 7 member A Human genes 0.000 claims description 4
- 102100032976 CCR4-NOT transcription complex subunit 6 Human genes 0.000 claims description 4
- 108010009992 CD163 antigen Proteins 0.000 claims description 4
- 102100025493 CUGBP Elav-like family member 5 Human genes 0.000 claims description 4
- 102100024154 Cadherin-13 Human genes 0.000 claims description 4
- 102100029758 Cadherin-4 Human genes 0.000 claims description 4
- 102100028797 Calsyntenin-2 Human genes 0.000 claims description 4
- 102100037988 Cartilage acidic protein 1 Human genes 0.000 claims description 4
- 102100038165 Chromodomain-helicase-DNA-binding protein 8 Human genes 0.000 claims description 4
- 102100030781 Collagen alpha-1(XXIII) chain Human genes 0.000 claims description 4
- 102100033234 Cyclin-dependent kinase 17 Human genes 0.000 claims description 4
- 102100025620 Cytochrome b-245 light chain Human genes 0.000 claims description 4
- 102100025508 Cytoplasmic tRNA 2-thiolation protein 2 Human genes 0.000 claims description 4
- 102100020756 D(2) dopamine receptor Human genes 0.000 claims description 4
- 102100033934 DNA repair protein RAD51 homolog 2 Human genes 0.000 claims description 4
- 102100027043 Discoidin, CUB and LCCL domain-containing protein 2 Human genes 0.000 claims description 4
- 102100037927 DnaJ homolog subfamily B member 11 Human genes 0.000 claims description 4
- 102100034745 E3 ubiquitin-protein ligase HERC2 Human genes 0.000 claims description 4
- 102100029712 E3 ubiquitin-protein ligase TRIM58 Human genes 0.000 claims description 4
- 102100030787 ERI1 exoribonuclease 2 Human genes 0.000 claims description 4
- 102100024604 Endoribonuclease LACTB2 Human genes 0.000 claims description 4
- 102100030341 Ethanolaminephosphotransferase 1 Human genes 0.000 claims description 4
- 102100036315 FAD-dependent oxidoreductase domain-containing protein 2 Human genes 0.000 claims description 4
- 102100026117 Ferredoxin-2, mitochondrial Human genes 0.000 claims description 4
- 102100037043 Forkhead box protein D4 Human genes 0.000 claims description 4
- 102100021383 Guanine nucleotide exchange factor DBS Human genes 0.000 claims description 4
- 102100035786 Guanine nucleotide-binding protein G(I)/G(S)/G(O) subunit gamma-7 Human genes 0.000 claims description 4
- 102100028972 HLA class I histocompatibility antigen, A alpha chain Human genes 0.000 claims description 4
- 108010075704 HLA-A Antigens Proteins 0.000 claims description 4
- 101000843360 Haloferax mediterranei (strain ATCC 33500 / DSM 1411 / JCM 8866 / NBRC 14739 / NCIMB 2177 / R-4) Gas vesicle structural protein Proteins 0.000 claims description 4
- 101000928720 Homo sapiens 7-dehydrocholesterol reductase Proteins 0.000 claims description 4
- 101000924643 Homo sapiens AP-1 complex subunit mu-1 Proteins 0.000 claims description 4
- 101000959347 Homo sapiens Adenylate cyclase type 2 Proteins 0.000 claims description 4
- 101000928176 Homo sapiens Adhesion G protein-coupled receptor L3 Proteins 0.000 claims description 4
- 101000797275 Homo sapiens Alpha-actinin-2 Proteins 0.000 claims description 4
- 101000936741 Homo sapiens Astrotactin-1 Proteins 0.000 claims description 4
- 101000952934 Homo sapiens Atrial natriuretic peptide-converting enzyme Proteins 0.000 claims description 4
- 101000868444 Homo sapiens Beta-1-syntrophin Proteins 0.000 claims description 4
- 101000749325 Homo sapiens C-type lectin domain family 7 member A Proteins 0.000 claims description 4
- 101000942595 Homo sapiens CCR4-NOT transcription complex subunit 6 Proteins 0.000 claims description 4
- 101000914302 Homo sapiens CUGBP Elav-like family member 5 Proteins 0.000 claims description 4
- 101000762243 Homo sapiens Cadherin-13 Proteins 0.000 claims description 4
- 101000794580 Homo sapiens Cadherin-4 Proteins 0.000 claims description 4
- 101000916406 Homo sapiens Calsyntenin-2 Proteins 0.000 claims description 4
- 101000878940 Homo sapiens Cartilage acidic protein 1 Proteins 0.000 claims description 4
- 101000883545 Homo sapiens Chromodomain-helicase-DNA-binding protein 8 Proteins 0.000 claims description 4
- 101000920176 Homo sapiens Collagen alpha-1(XXIII) chain Proteins 0.000 claims description 4
- 101000944358 Homo sapiens Cyclin-dependent kinase 17 Proteins 0.000 claims description 4
- 101000856723 Homo sapiens Cytochrome b-245 light chain Proteins 0.000 claims description 4
- 101000856509 Homo sapiens Cytoplasmic tRNA 2-thiolation protein 2 Proteins 0.000 claims description 4
- 101000931901 Homo sapiens D(2) dopamine receptor Proteins 0.000 claims description 4
- 101000911787 Homo sapiens Discoidin, CUB and LCCL domain-containing protein 2 Proteins 0.000 claims description 4
- 101000805858 Homo sapiens DnaJ homolog subfamily B member 11 Proteins 0.000 claims description 4
- 101000872516 Homo sapiens E3 ubiquitin-protein ligase HERC2 Proteins 0.000 claims description 4
- 101000795365 Homo sapiens E3 ubiquitin-protein ligase TRIM58 Proteins 0.000 claims description 4
- 101000938751 Homo sapiens ERI1 exoribonuclease 2 Proteins 0.000 claims description 4
- 101001051467 Homo sapiens Endoribonuclease LACTB2 Proteins 0.000 claims description 4
- 101000938340 Homo sapiens Ethanolaminephosphotransferase 1 Proteins 0.000 claims description 4
- 101000930979 Homo sapiens FAD-dependent oxidoreductase domain-containing protein 2 Proteins 0.000 claims description 4
- 101000912981 Homo sapiens Ferredoxin-2, mitochondrial Proteins 0.000 claims description 4
- 101001029302 Homo sapiens Forkhead box protein D4 Proteins 0.000 claims description 4
- 101000615232 Homo sapiens Guanine nucleotide exchange factor DBS Proteins 0.000 claims description 4
- 101001073247 Homo sapiens Guanine nucleotide-binding protein G(I)/G(S)/G(O) subunit gamma-7 Proteins 0.000 claims description 4
- 101001059713 Homo sapiens Inner nuclear membrane protein Man1 Proteins 0.000 claims description 4
- 101001015006 Homo sapiens Integrin beta-4 Proteins 0.000 claims description 4
- 101001082070 Homo sapiens Interferon alpha-inducible protein 6 Proteins 0.000 claims description 4
- 101000942713 Homo sapiens Liprin-alpha-2 Proteins 0.000 claims description 4
- 101001043596 Homo sapiens Low-density lipoprotein receptor-related protein 3 Proteins 0.000 claims description 4
- 101001043598 Homo sapiens Low-density lipoprotein receptor-related protein 4 Proteins 0.000 claims description 4
- 101001090688 Homo sapiens Lymphocyte cytosolic protein 2 Proteins 0.000 claims description 4
- 101000572820 Homo sapiens MICOS complex subunit MIC60 Proteins 0.000 claims description 4
- 101000957559 Homo sapiens Matrin-3 Proteins 0.000 claims description 4
- 101001056160 Homo sapiens Methylcrotonoyl-CoA carboxylase subunit alpha, mitochondrial Proteins 0.000 claims description 4
- 101001000090 Homo sapiens Methyltransferase N6AMT1 Proteins 0.000 claims description 4
- 101000573441 Homo sapiens Misshapen-like kinase 1 Proteins 0.000 claims description 4
- 101000955255 Homo sapiens Multiple epidermal growth factor-like domains protein 11 Proteins 0.000 claims description 4
- 101001030232 Homo sapiens Myosin-9 Proteins 0.000 claims description 4
- 101000672316 Homo sapiens Netrin receptor UNC5B Proteins 0.000 claims description 4
- 101001024598 Homo sapiens Neuroblastoma breakpoint family member 15 Proteins 0.000 claims description 4
- 101000759168 Homo sapiens Palmitoyltransferase ZDHHC7 Proteins 0.000 claims description 4
- 101000616502 Homo sapiens Phosphatidylinositol 3,4,5-trisphosphate 5-phosphatase 1 Proteins 0.000 claims description 4
- 101000734572 Homo sapiens Phosphoenolpyruvate carboxykinase, cytosolic [GTP] Proteins 0.000 claims description 4
- 101000799554 Homo sapiens Protein AATF Proteins 0.000 claims description 4
- 101000573199 Homo sapiens Protein PML Proteins 0.000 claims description 4
- 101000757241 Homo sapiens Protein angel homolog 2 Proteins 0.000 claims description 4
- 101000931682 Homo sapiens Protein furry homolog-like Proteins 0.000 claims description 4
- 101000971400 Homo sapiens Protein kinase C eta type Proteins 0.000 claims description 4
- 101000702391 Homo sapiens Protein sprouty homolog 1 Proteins 0.000 claims description 4
- 101000743768 Homo sapiens R3H domain-containing protein 1 Proteins 0.000 claims description 4
- 101000707951 Homo sapiens Ras and Rab interactor 3 Proteins 0.000 claims description 4
- 101001075565 Homo sapiens Rho GTPase-activating protein 30 Proteins 0.000 claims description 4
- 101000716748 Homo sapiens SR-related and CTD-associated factor 8 Proteins 0.000 claims description 4
- 101000654697 Homo sapiens Semaphorin-5A Proteins 0.000 claims description 4
- 101000829212 Homo sapiens Serine/arginine repetitive matrix protein 2 Proteins 0.000 claims description 4
- 101000835541 Homo sapiens Target of Nesh-SH3 Proteins 0.000 claims description 4
- 101000909641 Homo sapiens Transcription factor COE2 Proteins 0.000 claims description 4
- 101000643895 Homo sapiens Ubiquitin carboxyl-terminal hydrolase 6 Proteins 0.000 claims description 4
- 101000670973 Homo sapiens V-type proton ATPase subunit E 2 Proteins 0.000 claims description 4
- 101000787286 Homo sapiens Valine-tRNA ligase Proteins 0.000 claims description 4
- 101000787276 Homo sapiens Valine-tRNA ligase, mitochondrial Proteins 0.000 claims description 4
- 101000964611 Homo sapiens Zinc finger protein 155 Proteins 0.000 claims description 4
- 101000964759 Homo sapiens Zinc finger protein 573 Proteins 0.000 claims description 4
- 102100028799 Inner nuclear membrane protein Man1 Human genes 0.000 claims description 4
- 102100033000 Integrin beta-4 Human genes 0.000 claims description 4
- 102100027354 Interferon alpha-inducible protein 6 Human genes 0.000 claims description 4
- 102100032894 Liprin-alpha-2 Human genes 0.000 claims description 4
- 102100021917 Low-density lipoprotein receptor-related protein 3 Human genes 0.000 claims description 4
- 102100021918 Low-density lipoprotein receptor-related protein 4 Human genes 0.000 claims description 4
- 102100034709 Lymphocyte cytosolic protein 2 Human genes 0.000 claims description 4
- 102100026639 MICOS complex subunit MIC60 Human genes 0.000 claims description 4
- 102100038645 Matrin-3 Human genes 0.000 claims description 4
- 102100038354 Metabotropic glutamate receptor 4 Human genes 0.000 claims description 4
- 102100026552 Methylcrotonoyl-CoA carboxylase subunit alpha, mitochondrial Human genes 0.000 claims description 4
- 102100036543 Methyltransferase N6AMT1 Human genes 0.000 claims description 4
- 102100026287 Misshapen-like kinase 1 Human genes 0.000 claims description 4
- 102100039008 Multiple epidermal growth factor-like domains protein 11 Human genes 0.000 claims description 4
- 102100038938 Myosin-9 Human genes 0.000 claims description 4
- 102100031455 NAD-dependent protein deacetylase sirtuin-1 Human genes 0.000 claims description 4
- 102100040289 Netrin receptor UNC5B Human genes 0.000 claims description 4
- 102100037031 Neuroblastoma breakpoint family member 15 Human genes 0.000 claims description 4
- 102100023402 Palmitoyltransferase ZDHHC7 Human genes 0.000 claims description 4
- 102100021797 Phosphatidylinositol 3,4,5-trisphosphate 5-phosphatase 1 Human genes 0.000 claims description 4
- 102100034796 Phosphoenolpyruvate carboxykinase, cytosolic [GTP] Human genes 0.000 claims description 4
- 102100034180 Protein AATF Human genes 0.000 claims description 4
- 102100026375 Protein PML Human genes 0.000 claims description 4
- 102100022990 Protein angel homolog 2 Human genes 0.000 claims description 4
- 102100020916 Protein furry homolog-like Human genes 0.000 claims description 4
- 102100021556 Protein kinase C eta type Human genes 0.000 claims description 4
- 102100030399 Protein sprouty homolog 1 Human genes 0.000 claims description 4
- 102100038382 R3H domain-containing protein 1 Human genes 0.000 claims description 4
- 101710018890 RAD51B Proteins 0.000 claims description 4
- 102100031439 Ras and Rab interactor 3 Human genes 0.000 claims description 4
- 102100020887 Rho GTPase-activating protein 30 Human genes 0.000 claims description 4
- 108091006556 SLC30A8 Proteins 0.000 claims description 4
- 102100020875 SR-related and CTD-associated factor 8 Human genes 0.000 claims description 4
- 102100023363 Sarcosine dehydrogenase, mitochondrial Human genes 0.000 claims description 4
- 101150028021 Sardh gene Proteins 0.000 claims description 4
- 102100025831 Scavenger receptor cysteine-rich type 1 protein M130 Human genes 0.000 claims description 4
- 102100032782 Semaphorin-5A Human genes 0.000 claims description 4
- 102100023657 Serine/arginine repetitive matrix protein 2 Human genes 0.000 claims description 4
- 108010041191 Sirtuin 1 Proteins 0.000 claims description 4
- 102100026544 Target of Nesh-SH3 Human genes 0.000 claims description 4
- 102100024204 Transcription factor COE2 Human genes 0.000 claims description 4
- 102100021015 Ubiquitin carboxyl-terminal hydrolase 6 Human genes 0.000 claims description 4
- 102100039464 V-type proton ATPase subunit E 2 Human genes 0.000 claims description 4
- 102100025607 Valine-tRNA ligase Human genes 0.000 claims description 4
- 101000585017 Vipera ammodytes meridionalis Acidic phospholipase A2 homolog vipoxin A chain Proteins 0.000 claims description 4
- 102100040783 Zinc finger protein 155 Human genes 0.000 claims description 4
- 102100040656 Zinc finger protein 573 Human genes 0.000 claims description 4
- 239000013068 control sample Substances 0.000 claims description 4
- 230000007614 genetic variation Effects 0.000 claims description 4
- 108010038422 metabotropic glutamate receptor 4 Proteins 0.000 claims description 4
- 102100032599 Adhesion G protein-coupled receptor B3 Human genes 0.000 claims description 3
- 102000014836 CACNA1I Human genes 0.000 claims description 3
- 102100025580 Calmodulin-1 Human genes 0.000 claims description 3
- 102100030652 Glutamate receptor 1 Human genes 0.000 claims description 3
- 102100022193 Glutamate receptor ionotropic, delta-1 Human genes 0.000 claims description 3
- 102100040485 HLA class II histocompatibility antigen, DRB1 beta chain Human genes 0.000 claims description 3
- 108010039343 HLA-DRB1 Chains Proteins 0.000 claims description 3
- 101000796801 Homo sapiens Adhesion G protein-coupled receptor B3 Proteins 0.000 claims description 3
- 101000984164 Homo sapiens Calmodulin-1 Proteins 0.000 claims description 3
- 101001010445 Homo sapiens Glutamate receptor 1 Proteins 0.000 claims description 3
- 101000900493 Homo sapiens Glutamate receptor ionotropic, delta-1 Proteins 0.000 claims description 3
- 101000993455 Homo sapiens Metal transporter CNNM2 Proteins 0.000 claims description 3
- 101000581507 Homo sapiens Methyl-CpG-binding domain protein 1 Proteins 0.000 claims description 3
- 101001137535 Homo sapiens Nuclear ubiquitous casein and cyclin-dependent kinase substrate 1 Proteins 0.000 claims description 3
- 101001134861 Homo sapiens Pericentriolar material 1 protein Proteins 0.000 claims description 3
- 101000665449 Homo sapiens RNA binding protein fox-1 homolog 1 Proteins 0.000 claims description 3
- 101001026870 Homo sapiens Serine/threonine-protein kinase D1 Proteins 0.000 claims description 3
- 101000759876 Homo sapiens Tetraspanin-11 Proteins 0.000 claims description 3
- 101000844686 Homo sapiens Thioredoxin reductase 1, cytoplasmic Proteins 0.000 claims description 3
- 101000662686 Homo sapiens Torsin-1A Proteins 0.000 claims description 3
- 101000932785 Homo sapiens Voltage-dependent T-type calcium channel subunit alpha-1I Proteins 0.000 claims description 3
- 101000915470 Homo sapiens Zinc finger MYND domain-containing protein 11 Proteins 0.000 claims description 3
- 102100031677 Metal transporter CNNM2 Human genes 0.000 claims description 3
- 102100027383 Methyl-CpG-binding domain protein 1 Human genes 0.000 claims description 3
- 102100021007 Nuclear ubiquitous casein and cyclin-dependent kinase substrate 1 Human genes 0.000 claims description 3
- 102100038188 RNA binding protein fox-1 homolog 1 Human genes 0.000 claims description 3
- 102100037310 Serine/threonine-protein kinase D1 Human genes 0.000 claims description 3
- 102100024987 Tetraspanin-11 Human genes 0.000 claims description 3
- 102100031208 Thioredoxin reductase 1, cytoplasmic Human genes 0.000 claims description 3
- 102100037454 Torsin-1A Human genes 0.000 claims description 3
- 102100028551 Zinc finger MYND domain-containing protein 11 Human genes 0.000 claims description 3
- 238000007477 logistic regression Methods 0.000 claims description 3
- 108091037787 miR-19b stem-loop Proteins 0.000 claims description 3
- 230000002018 overexpression Effects 0.000 claims description 3
- 230000009452 underexpression Effects 0.000 claims description 3
- 208000023275 Autoimmune disease Diseases 0.000 claims description 2
- 208000024172 Cardiovascular disease Diseases 0.000 claims description 2
- 101001132733 Homo sapiens Rab GTPase-activating protein 1 Proteins 0.000 claims description 2
- 206010028980 Neoplasm Diseases 0.000 claims description 2
- 102100033883 Rab GTPase-activating protein 1 Human genes 0.000 claims description 2
- 201000011510 cancer Diseases 0.000 claims description 2
- 208000026278 immune system disease Diseases 0.000 claims description 2
- 208000027866 inflammatory disease Diseases 0.000 claims description 2
- 108091091807 let-7a stem-loop Proteins 0.000 claims description 2
- 108091057746 let-7a-4 stem-loop Proteins 0.000 claims description 2
- 108091028376 let-7a-5 stem-loop Proteins 0.000 claims description 2
- 108091024393 let-7a-6 stem-loop Proteins 0.000 claims description 2
- 108091091174 let-7a-7 stem-loop Proteins 0.000 claims description 2
- 108091041042 miR-18 stem-loop Proteins 0.000 claims description 2
- 108091062221 miR-18a stem-loop Proteins 0.000 claims description 2
- 108091063841 miR-219 stem-loop Proteins 0.000 claims description 2
- 102100036973 X-ray repair cross-complementing protein 5 Human genes 0.000 claims 2
- 235000005039 Brassica rapa var. dichotoma Nutrition 0.000 claims 1
- 101100004157 Drosophila melanogaster bab1 gene Proteins 0.000 claims 1
- 244000130745 brown sarson Species 0.000 claims 1
- 108091046933 miR-18b stem-loop Proteins 0.000 claims 1
- 108091047549 miR-323b stem-loop Proteins 0.000 claims 1
- 208000010412 Glaucoma Diseases 0.000 abstract description 86
- 230000008901 benefit Effects 0.000 abstract description 12
- 238000003745 diagnosis Methods 0.000 abstract description 7
- 238000002560 therapeutic procedure Methods 0.000 abstract description 6
- 238000004393 prognosis Methods 0.000 abstract description 5
- 238000012216 screening Methods 0.000 abstract description 5
- 210000001328 optic nerve Anatomy 0.000 description 65
- 238000013456 study Methods 0.000 description 53
- 210000001525 retina Anatomy 0.000 description 34
- 108020004705 Codon Proteins 0.000 description 27
- 238000004458 analytical method Methods 0.000 description 24
- 230000004410 intraocular pressure Effects 0.000 description 21
- 108020004414 DNA Proteins 0.000 description 20
- 239000000523 sample Substances 0.000 description 20
- 108700024394 Exon Proteins 0.000 description 16
- 230000002068 genetic effect Effects 0.000 description 14
- 235000018102 proteins Nutrition 0.000 description 14
- 210000001585 trabecular meshwork Anatomy 0.000 description 14
- 238000012552 review Methods 0.000 description 13
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 12
- 210000003733 optic disk Anatomy 0.000 description 12
- 238000013459 approach Methods 0.000 description 11
- 230000008569 process Effects 0.000 description 11
- 101710196550 Myocilin Proteins 0.000 description 10
- 210000003161 choroid Anatomy 0.000 description 10
- 238000013507 mapping Methods 0.000 description 10
- 102100029839 Myocilin Human genes 0.000 description 9
- 108091023045 Untranslated Region Proteins 0.000 description 9
- 230000006378 damage Effects 0.000 description 9
- 238000001914 filtration Methods 0.000 description 8
- 238000010197 meta-analysis Methods 0.000 description 7
- 230000000717 retained effect Effects 0.000 description 7
- 108091092195 Intron Proteins 0.000 description 6
- 108091030146 MiRBase Proteins 0.000 description 6
- 238000004422 calculation algorithm Methods 0.000 description 6
- 230000003247 decreasing effect Effects 0.000 description 6
- 230000007717 exclusion Effects 0.000 description 6
- 208000030533 eye disease Diseases 0.000 description 6
- 238000011835 investigation Methods 0.000 description 6
- 239000003550 marker Substances 0.000 description 6
- 201000004569 Blindness Diseases 0.000 description 5
- 108091008064 CDKN2B-AS1 Proteins 0.000 description 5
- 241000124008 Mammalia Species 0.000 description 5
- 206010056677 Nerve degeneration Diseases 0.000 description 5
- 230000001364 causal effect Effects 0.000 description 5
- 210000004027 cell Anatomy 0.000 description 5
- 210000004240 ciliary body Anatomy 0.000 description 5
- 108020004999 messenger RNA Proteins 0.000 description 5
- 230000001225 therapeutic effect Effects 0.000 description 5
- 102100031493 Growth arrest-specific protein 7 Human genes 0.000 description 4
- 101000923044 Homo sapiens Growth arrest-specific protein 7 Proteins 0.000 description 4
- 102100031822 Optineurin Human genes 0.000 description 4
- 210000001766 X chromosome Anatomy 0.000 description 4
- 238000004590 computer program Methods 0.000 description 4
- 238000001514 detection method Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 238000003205 genotyping method Methods 0.000 description 4
- 230000036541 health Effects 0.000 description 4
- 238000011160 research Methods 0.000 description 4
- 238000007619 statistical method Methods 0.000 description 4
- 230000004382 visual function Effects 0.000 description 4
- 108091026890 Coding region Proteins 0.000 description 3
- 102100027417 Cytochrome P450 1B1 Human genes 0.000 description 3
- 102100029279 Homeobox protein SIX1 Human genes 0.000 description 3
- 102100025448 Homeobox protein SIX6 Human genes 0.000 description 3
- 101000725164 Homo sapiens Cytochrome P450 1B1 Proteins 0.000 description 3
- 101000634171 Homo sapiens Homeobox protein SIX1 Proteins 0.000 description 3
- 101000835956 Homo sapiens Homeobox protein SIX6 Proteins 0.000 description 3
- 102000015335 Ku Autoantigen Human genes 0.000 description 3
- 108091028043 Nucleic acid sequence Proteins 0.000 description 3
- 206010030043 Ocular hypertension Diseases 0.000 description 3
- 101710131459 Optineurin Proteins 0.000 description 3
- 208000018737 Parkinson disease Diseases 0.000 description 3
- 210000002593 Y chromosome Anatomy 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 230000001684 chronic effect Effects 0.000 description 3
- 230000000295 complement effect Effects 0.000 description 3
- 238000012350 deep sequencing Methods 0.000 description 3
- 238000013461 design Methods 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 238000012268 genome sequencing Methods 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 239000000203 mixture Substances 0.000 description 3
- 230000035772 mutation Effects 0.000 description 3
- 230000000324 neuroprotective effect Effects 0.000 description 3
- 230000036961 partial effect Effects 0.000 description 3
- 108091033319 polynucleotide Proteins 0.000 description 3
- 102000040430 polynucleotide Human genes 0.000 description 3
- 239000002157 polynucleotide Substances 0.000 description 3
- 230000000750 progressive effect Effects 0.000 description 3
- 238000003860 storage Methods 0.000 description 3
- 230000002459 sustained effect Effects 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 230000000007 visual effect Effects 0.000 description 3
- 101150037123 APOE gene Proteins 0.000 description 2
- 102100029470 Apolipoprotein E Human genes 0.000 description 2
- 102000004219 Brain-derived neurotrophic factor Human genes 0.000 description 2
- 108090000715 Brain-derived neurotrophic factor Proteins 0.000 description 2
- 102100035888 Caveolin-1 Human genes 0.000 description 2
- 102100038909 Caveolin-2 Human genes 0.000 description 2
- 208000017667 Chronic Disease Diseases 0.000 description 2
- 102100031457 Collagen alpha-1(V) chain Human genes 0.000 description 2
- 102000009512 Cyclin-Dependent Kinase Inhibitor p15 Human genes 0.000 description 2
- 108010009356 Cyclin-Dependent Kinase Inhibitor p15 Proteins 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- 206010012565 Developmental glaucoma Diseases 0.000 description 2
- 101000715467 Homo sapiens Caveolin-1 Proteins 0.000 description 2
- 101000740981 Homo sapiens Caveolin-2 Proteins 0.000 description 2
- 101000941708 Homo sapiens Collagen alpha-1(V) chain Proteins 0.000 description 2
- 101001054649 Homo sapiens Latent-transforming growth factor beta-binding protein 2 Proteins 0.000 description 2
- 101001054646 Homo sapiens Latent-transforming growth factor beta-binding protein 3 Proteins 0.000 description 2
- 101000701154 Homo sapiens Transcription factor ATOH7 Proteins 0.000 description 2
- 101000955110 Homo sapiens WD repeat-containing protein 36 Proteins 0.000 description 2
- 206010020772 Hypertension Diseases 0.000 description 2
- 102100027017 Latent-transforming growth factor beta-binding protein 2 Human genes 0.000 description 2
- 108091027974 Mature messenger RNA Proteins 0.000 description 2
- 208000012902 Nervous system disease Diseases 0.000 description 2
- 208000025966 Neurological disease Diseases 0.000 description 2
- 101100285899 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) SSE2 gene Proteins 0.000 description 2
- 102100029372 Transcription factor ATOH7 Human genes 0.000 description 2
- 102100038944 WD repeat-containing protein 36 Human genes 0.000 description 2
- 150000001413 amino acids Chemical class 0.000 description 2
- 210000001742 aqueous humor Anatomy 0.000 description 2
- 210000001130 astrocyte Anatomy 0.000 description 2
- 210000003050 axon Anatomy 0.000 description 2
- 210000004556 brain Anatomy 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 230000022131 cell cycle Effects 0.000 description 2
- 230000033077 cellular process Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 230000000875 corresponding effect Effects 0.000 description 2
- 238000007405 data analysis Methods 0.000 description 2
- 206010012601 diabetes mellitus Diseases 0.000 description 2
- 230000009274 differential gene expression Effects 0.000 description 2
- 238000009826 distribution Methods 0.000 description 2
- 230000004438 eyesight Effects 0.000 description 2
- 210000003917 human chromosome Anatomy 0.000 description 2
- OCYSGIYOVXAGKQ-UHFFFAOYSA-N hydron;3-[1-hydroxy-2-(methylamino)ethyl]phenol;chloride Chemical compound Cl.CNCC(O)C1=CC=CC(O)=C1 OCYSGIYOVXAGKQ-UHFFFAOYSA-N 0.000 description 2
- 210000000554 iris Anatomy 0.000 description 2
- 230000002427 irreversible effect Effects 0.000 description 2
- 230000000366 juvenile effect Effects 0.000 description 2
- 230000003278 mimic effect Effects 0.000 description 2
- 230000008450 motivation Effects 0.000 description 2
- 230000001537 neural effect Effects 0.000 description 2
- 125000001805 pentosyl group Chemical group 0.000 description 2
- 230000008092 positive effect Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 210000003994 retinal ganglion cell Anatomy 0.000 description 2
- 230000002441 reversible effect Effects 0.000 description 2
- 238000002864 sequence alignment Methods 0.000 description 2
- 230000009897 systematic effect Effects 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 230000004393 visual impairment Effects 0.000 description 2
- OJHZNMVJJKMFGX-RNWHKREASA-N (4r,4ar,7ar,12bs)-9-methoxy-3-methyl-1,2,4,4a,5,6,7a,13-octahydro-4,12-methanobenzofuro[3,2-e]isoquinoline-7-one;2,3-dihydroxybutanedioic acid Chemical compound OC(=O)C(O)C(O)C(O)=O.O=C([C@@H]1O2)CC[C@H]3[C@]4([H])N(C)CC[C@]13C1=C2C(OC)=CC=C1C4 OJHZNMVJJKMFGX-RNWHKREASA-N 0.000 description 1
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 1
- 108020005345 3' Untranslated Regions Proteins 0.000 description 1
- 102100033051 40S ribosomal protein S19 Human genes 0.000 description 1
- 102100021548 5-methylcytosine rRNA methyltransferase NSUN4 Human genes 0.000 description 1
- 102100024049 A-kinase anchor protein 13 Human genes 0.000 description 1
- 101710159080 Aconitate hydratase A Proteins 0.000 description 1
- 101710159078 Aconitate hydratase B Proteins 0.000 description 1
- 101100328883 Arabidopsis thaliana COL1 gene Proteins 0.000 description 1
- 102000007370 Ataxin2 Human genes 0.000 description 1
- 108010032951 Ataxin2 Proteins 0.000 description 1
- 102100033948 Basic salivary proline-rich protein 4 Human genes 0.000 description 1
- 102100022794 Bestrophin-1 Human genes 0.000 description 1
- 102100040647 Beta-galactosidase-1-like protein 3 Human genes 0.000 description 1
- 102100024505 Bone morphogenetic protein 4 Human genes 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 102100040785 CUB and sushi domain-containing protein 2 Human genes 0.000 description 1
- 241000282465 Canis Species 0.000 description 1
- 102100033040 Carbonic anhydrase 12 Human genes 0.000 description 1
- 102100024646 Cell adhesion molecule 2 Human genes 0.000 description 1
- 102100034744 Cell division cycle 7-related protein kinase Human genes 0.000 description 1
- 102100037828 Charged multivesicular body protein 7 Human genes 0.000 description 1
- 101710153987 Charged multivesicular body protein 7 Proteins 0.000 description 1
- 102100023459 Chloride channel protein ClC-Kb Human genes 0.000 description 1
- 102100028757 Chondroitin sulfate proteoglycan 4 Human genes 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- 102100023337 Chymotrypsin-like elastase family member 3A Human genes 0.000 description 1
- 102100021967 Coiled-coil domain-containing protein 33 Human genes 0.000 description 1
- 102100033601 Collagen alpha-1(I) chain Human genes 0.000 description 1
- 102100031162 Collagen alpha-1(XVIII) chain Human genes 0.000 description 1
- 102100040496 Collagen alpha-2(VIII) chain Human genes 0.000 description 1
- 206010010356 Congenital anomaly Diseases 0.000 description 1
- 206010018325 Congenital glaucomas Diseases 0.000 description 1
- 102100024342 Contactin-2 Human genes 0.000 description 1
- 102100040499 Contactin-associated protein-like 2 Human genes 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 108010009392 Cyclin-Dependent Kinase Inhibitor p16 Proteins 0.000 description 1
- 102100039683 Cyclin-G-associated kinase Human genes 0.000 description 1
- 102100031515 D-ribitol-5-phosphate cytidylyltransferase Human genes 0.000 description 1
- 102100036218 DNA replication complex GINS protein PSF2 Human genes 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 101710096438 DNA-binding protein Proteins 0.000 description 1
- 102100035784 Decorin Human genes 0.000 description 1
- 206010061818 Disease progression Diseases 0.000 description 1
- 108010044266 Dopamine Plasma Membrane Transport Proteins Proteins 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- 108010092408 Eosinophil Peroxidase Proteins 0.000 description 1
- 102100028471 Eosinophil peroxidase Human genes 0.000 description 1
- 102100021600 Ephrin type-A receptor 10 Human genes 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 102100034552 Fanconi anemia group M protein Human genes 0.000 description 1
- 241000282324 Felis Species 0.000 description 1
- 102100031387 Fibrillin-3 Human genes 0.000 description 1
- 102100031812 Fibulin-1 Human genes 0.000 description 1
- 108010010285 Forkhead Box Protein L2 Proteins 0.000 description 1
- 102100021084 Forkhead box protein C1 Human genes 0.000 description 1
- 102100020855 Forkhead box protein E3 Human genes 0.000 description 1
- 102100035137 Forkhead box protein L2 Human genes 0.000 description 1
- 102100028496 Galactocerebrosidase Human genes 0.000 description 1
- 102100031364 Galectin-9C Human genes 0.000 description 1
- 230000010558 Gene Alterations Effects 0.000 description 1
- 102100038073 General transcription factor II-I Human genes 0.000 description 1
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 1
- 102000058062 Glucose Transporter Type 3 Human genes 0.000 description 1
- 102100026801 Glycophorin-E Human genes 0.000 description 1
- 102100034153 Golgin subfamily A member 6B Human genes 0.000 description 1
- 102100032804 Histone-lysine N-methyltransferase SMYD3 Human genes 0.000 description 1
- 102000006947 Histones Human genes 0.000 description 1
- 108010033040 Histones Proteins 0.000 description 1
- 101000733040 Homo sapiens 40S ribosomal protein S19 Proteins 0.000 description 1
- 101001108645 Homo sapiens 5-methylcytosine rRNA methyltransferase NSUN4 Proteins 0.000 description 1
- 101000833679 Homo sapiens A-kinase anchor protein 13 Proteins 0.000 description 1
- 101001068637 Homo sapiens Basic salivary proline-rich protein 4 Proteins 0.000 description 1
- 101000903449 Homo sapiens Bestrophin-1 Proteins 0.000 description 1
- 101001039066 Homo sapiens Beta-galactosidase-1-like protein 3 Proteins 0.000 description 1
- 101000762379 Homo sapiens Bone morphogenetic protein 4 Proteins 0.000 description 1
- 101000978379 Homo sapiens C-C motif chemokine 13 Proteins 0.000 description 1
- 101000892047 Homo sapiens CUB and sushi domain-containing protein 2 Proteins 0.000 description 1
- 101000867855 Homo sapiens Carbonic anhydrase 12 Proteins 0.000 description 1
- 101000760622 Homo sapiens Cell adhesion molecule 2 Proteins 0.000 description 1
- 101000945740 Homo sapiens Cell division cycle 7-related protein kinase Proteins 0.000 description 1
- 101000906654 Homo sapiens Chloride channel protein ClC-Kb Proteins 0.000 description 1
- 101000916489 Homo sapiens Chondroitin sulfate proteoglycan 4 Proteins 0.000 description 1
- 101000907964 Homo sapiens Chymotrypsin-like elastase family member 3A Proteins 0.000 description 1
- 101000897106 Homo sapiens Coiled-coil domain-containing protein 33 Proteins 0.000 description 1
- 101000940068 Homo sapiens Collagen alpha-1(XVIII) chain Proteins 0.000 description 1
- 101000749886 Homo sapiens Collagen alpha-2(VIII) chain Proteins 0.000 description 1
- 101000909516 Homo sapiens Contactin-2 Proteins 0.000 description 1
- 101000749877 Homo sapiens Contactin-associated protein-like 2 Proteins 0.000 description 1
- 101000886209 Homo sapiens Cyclin-G-associated kinase Proteins 0.000 description 1
- 101000994204 Homo sapiens D-ribitol-5-phosphate cytidylyltransferase Proteins 0.000 description 1
- 101000736065 Homo sapiens DNA replication complex GINS protein PSF2 Proteins 0.000 description 1
- 101001000206 Homo sapiens Decorin Proteins 0.000 description 1
- 101000722054 Homo sapiens Dynamin-like 120 kDa protein, mitochondrial Proteins 0.000 description 1
- 101000898673 Homo sapiens Ephrin type-A receptor 10 Proteins 0.000 description 1
- 101000848187 Homo sapiens Fanconi anemia group M protein Proteins 0.000 description 1
- 101000846888 Homo sapiens Fibrillin-3 Proteins 0.000 description 1
- 101001065276 Homo sapiens Fibulin-1 Proteins 0.000 description 1
- 101000818310 Homo sapiens Forkhead box protein C1 Proteins 0.000 description 1
- 101000931489 Homo sapiens Forkhead box protein E3 Proteins 0.000 description 1
- 101000860395 Homo sapiens Galactocerebrosidase Proteins 0.000 description 1
- 101001130153 Homo sapiens Galectin-9C Proteins 0.000 description 1
- 101001032427 Homo sapiens General transcription factor II-I Proteins 0.000 description 1
- 101000972850 Homo sapiens Glutamate receptor ionotropic, NMDA 2B Proteins 0.000 description 1
- 101000833785 Homo sapiens Glycophorin-E Proteins 0.000 description 1
- 101001070507 Homo sapiens Golgin subfamily A member 6B Proteins 0.000 description 1
- 101000708574 Homo sapiens Histone-lysine N-methyltransferase SMYD3 Proteins 0.000 description 1
- 101000838011 Homo sapiens Ion channel TACAN Proteins 0.000 description 1
- 101001006886 Homo sapiens Krueppel-like factor 12 Proteins 0.000 description 1
- 101001043321 Homo sapiens Lysyl oxidase homolog 1 Proteins 0.000 description 1
- 101001043352 Homo sapiens Lysyl oxidase homolog 2 Proteins 0.000 description 1
- 101000614988 Homo sapiens Mediator of RNA polymerase II transcription subunit 12 Proteins 0.000 description 1
- 101000587058 Homo sapiens Methylenetetrahydrofolate reductase Proteins 0.000 description 1
- 101000613610 Homo sapiens Monocyte to macrophage differentiation factor Proteins 0.000 description 1
- 101001116514 Homo sapiens Myotubularin-related protein 13 Proteins 0.000 description 1
- 101000654298 Homo sapiens N-terminal kinase-like protein Proteins 0.000 description 1
- 101001109451 Homo sapiens NACHT, LRR and PYD domains-containing protein 9 Proteins 0.000 description 1
- 101001109472 Homo sapiens NKG2-F type II integral membrane protein Proteins 0.000 description 1
- 101001024608 Homo sapiens Neuroblastoma breakpoint family member 3 Proteins 0.000 description 1
- 101000822103 Homo sapiens Neuronal acetylcholine receptor subunit alpha-7 Proteins 0.000 description 1
- 101000996663 Homo sapiens Neurotrophin-4 Proteins 0.000 description 1
- 101000970403 Homo sapiens Nuclear pore complex protein Nup153 Proteins 0.000 description 1
- 101000603404 Homo sapiens Nuclear pore complex-interacting protein family member B15 Proteins 0.000 description 1
- 101001121143 Homo sapiens Olfactory receptor 2L3 Proteins 0.000 description 1
- 101000988395 Homo sapiens PDZ and LIM domain protein 4 Proteins 0.000 description 1
- 101001095073 Homo sapiens PRAME family member 2 Proteins 0.000 description 1
- 101001095308 Homo sapiens Periostin Proteins 0.000 description 1
- 101001073025 Homo sapiens Peroxisomal targeting signal 1 receptor Proteins 0.000 description 1
- 101001113717 Homo sapiens Phenazine biosynthesis-like domain-containing protein Proteins 0.000 description 1
- 101000595669 Homo sapiens Pituitary homeobox 2 Proteins 0.000 description 1
- 101000595674 Homo sapiens Pituitary homeobox 3 Proteins 0.000 description 1
- 101000691478 Homo sapiens Placenta-specific protein 4 Proteins 0.000 description 1
- 101001067187 Homo sapiens Plexin-A2 Proteins 0.000 description 1
- 101000687549 Homo sapiens Prickle-like protein 4 Proteins 0.000 description 1
- 101000580713 Homo sapiens Probable RNA-binding protein 23 Proteins 0.000 description 1
- 101001123963 Homo sapiens Protein O-mannosyl-transferase 1 Proteins 0.000 description 1
- 101000688348 Homo sapiens Protein phosphatase 1 regulatory subunit 14C Proteins 0.000 description 1
- 101000824415 Homo sapiens Protocadherin Fat 3 Proteins 0.000 description 1
- 101000741892 Homo sapiens Putative POTE ankyrin domain family member M Proteins 0.000 description 1
- 101000777204 Homo sapiens Putative ubiquitin carboxyl-terminal hydrolase 41 Proteins 0.000 description 1
- 101000779418 Homo sapiens RAC-alpha serine/threonine-protein kinase Proteins 0.000 description 1
- 101000738765 Homo sapiens Receptor-type tyrosine-protein phosphatase N2 Proteins 0.000 description 1
- 101001095783 Homo sapiens Ribonucleoside-diphosphate reductase subunit M2 B Proteins 0.000 description 1
- 101000704149 Homo sapiens SRC kinase signaling inhibitor 1 Proteins 0.000 description 1
- 101000740205 Homo sapiens Sal-like protein 1 Proteins 0.000 description 1
- 101000632056 Homo sapiens Septin-9 Proteins 0.000 description 1
- 101000777277 Homo sapiens Serine/threonine-protein kinase Chk2 Proteins 0.000 description 1
- 101000885321 Homo sapiens Serine/threonine-protein kinase DCLK1 Proteins 0.000 description 1
- 101000987295 Homo sapiens Serine/threonine-protein kinase PAK 5 Proteins 0.000 description 1
- 101000665442 Homo sapiens Serine/threonine-protein kinase TBK1 Proteins 0.000 description 1
- 101000869448 Homo sapiens Solute carrier family 35 member E2A Proteins 0.000 description 1
- 101000826397 Homo sapiens Sulfotransferase 1A2 Proteins 0.000 description 1
- 101000821263 Homo sapiens Synapsin-3 Proteins 0.000 description 1
- 101000891898 Homo sapiens Synaptotagmin-3 Proteins 0.000 description 1
- 101000845013 Homo sapiens Thioredoxin reductase 2, mitochondrial Proteins 0.000 description 1
- 101000659879 Homo sapiens Thrombospondin-1 Proteins 0.000 description 1
- 101000635938 Homo sapiens Transforming growth factor beta-1 proprotein Proteins 0.000 description 1
- 101000635958 Homo sapiens Transforming growth factor beta-2 proprotein Proteins 0.000 description 1
- 101000674804 Homo sapiens Transmembrane protein 191B Proteins 0.000 description 1
- 101000713936 Homo sapiens Tudor domain-containing protein 7 Proteins 0.000 description 1
- 101000611183 Homo sapiens Tumor necrosis factor Proteins 0.000 description 1
- 101000785698 Homo sapiens Zinc finger protein 276 Proteins 0.000 description 1
- 101000915642 Homo sapiens Zinc finger protein 469 Proteins 0.000 description 1
- 101000744939 Homo sapiens Zinc finger protein 492 Proteins 0.000 description 1
- 101000781856 Homo sapiens Zinc finger protein 512B Proteins 0.000 description 1
- 101000857270 Homo sapiens Zinc finger protein GLIS1 Proteins 0.000 description 1
- 101000857276 Homo sapiens Zinc finger protein GLIS3 Proteins 0.000 description 1
- 208000001953 Hypotension Diseases 0.000 description 1
- 206010061218 Inflammation Diseases 0.000 description 1
- 108091029795 Intergenic region Proteins 0.000 description 1
- 102100028548 Ion channel TACAN Human genes 0.000 description 1
- 201000006336 Juvenile glaucoma Diseases 0.000 description 1
- 108010038888 KCNQ3 Potassium Channel Proteins 0.000 description 1
- 102100027792 Krueppel-like factor 12 Human genes 0.000 description 1
- 101150031598 LCY1 gene Proteins 0.000 description 1
- 101150099142 LOXL1 gene Proteins 0.000 description 1
- 241000283953 Lagomorpha Species 0.000 description 1
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 1
- 102100021958 Lysyl oxidase homolog 1 Human genes 0.000 description 1
- 102100021948 Lysyl oxidase homolog 2 Human genes 0.000 description 1
- 102100021070 Mediator of RNA polymerase II transcription subunit 12 Human genes 0.000 description 1
- 101000859568 Methanobrevibacter smithii (strain ATCC 35061 / DSM 861 / OCM 144 / PS) Carbamoyl-phosphate synthase Proteins 0.000 description 1
- 102100029684 Methylenetetrahydrofolate reductase Human genes 0.000 description 1
- 208000009795 Microphthalmos Diseases 0.000 description 1
- 102100040849 Monocyte to macrophage differentiation factor Human genes 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 102100024960 Myotubularin-related protein 13 Human genes 0.000 description 1
- 102100031703 N-terminal kinase-like protein Human genes 0.000 description 1
- 102100022694 NACHT, LRR and PYD domains-containing protein 9 Human genes 0.000 description 1
- 102100022700 NKG2-F type II integral membrane protein Human genes 0.000 description 1
- 102100036999 Neuroblastoma breakpoint family member 3 Human genes 0.000 description 1
- 102100021511 Neuronal acetylcholine receptor subunit alpha-7 Human genes 0.000 description 1
- 102100033857 Neurotrophin-4 Human genes 0.000 description 1
- 206010067013 Normal tension glaucoma Diseases 0.000 description 1
- 102100021706 Nuclear pore complex protein Nup153 Human genes 0.000 description 1
- 102100038872 Nuclear pore complex-interacting protein family member B15 Human genes 0.000 description 1
- 102100026576 Olfactory receptor 2L3 Human genes 0.000 description 1
- 206010061323 Optic neuropathy Diseases 0.000 description 1
- 108010032788 PAX6 Transcription Factor Proteins 0.000 description 1
- 102100029178 PDZ and LIM domain protein 4 Human genes 0.000 description 1
- 102100036996 PRAME family member 2 Human genes 0.000 description 1
- 102100037506 Paired box protein Pax-6 Human genes 0.000 description 1
- 102100037765 Periostin Human genes 0.000 description 1
- 102100036598 Peroxisomal targeting signal 1 receptor Human genes 0.000 description 1
- 102100023743 Phenazine biosynthesis-like domain-containing protein Human genes 0.000 description 1
- 102100036090 Pituitary homeobox 2 Human genes 0.000 description 1
- 102100036088 Pituitary homeobox 3 Human genes 0.000 description 1
- 102100026184 Placenta-specific protein 4 Human genes 0.000 description 1
- 102100034381 Plexin-A2 Human genes 0.000 description 1
- 102100034360 Potassium voltage-gated channel subfamily KQT member 3 Human genes 0.000 description 1
- 102100024857 Prickle-like protein 4 Human genes 0.000 description 1
- 208000024777 Prion disease Diseases 0.000 description 1
- 102100027483 Probable RNA-binding protein 23 Human genes 0.000 description 1
- 102100028120 Protein O-mannosyl-transferase 1 Human genes 0.000 description 1
- 102220638487 Protein PML_F645L_mutation Human genes 0.000 description 1
- 102100024145 Protein phosphatase 1 regulatory subunit 14C Human genes 0.000 description 1
- 102100022134 Protocadherin Fat 3 Human genes 0.000 description 1
- 102100038764 Putative POTE ankyrin domain family member M Human genes 0.000 description 1
- 102100031285 Putative ubiquitin carboxyl-terminal hydrolase 41 Human genes 0.000 description 1
- 102100033810 RAC-alpha serine/threonine-protein kinase Human genes 0.000 description 1
- 102000044126 RNA-Binding Proteins Human genes 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 101710105008 RNA-binding protein Proteins 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 238000003559 RNA-seq method Methods 0.000 description 1
- 102000004912 RYR2 Human genes 0.000 description 1
- 108060007241 RYR2 Proteins 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 102100037404 Receptor-type tyrosine-protein phosphatase N2 Human genes 0.000 description 1
- 201000007737 Retinal degeneration Diseases 0.000 description 1
- 102100038013 Ribonucleoside-diphosphate reductase subunit M2 B Human genes 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 108091006298 SLC2A3 Proteins 0.000 description 1
- 108091006262 SLC4A4 Proteins 0.000 description 1
- 108060007757 SLC6A18 Proteins 0.000 description 1
- 102000005026 SLC6A18 Human genes 0.000 description 1
- 102000005029 SLC6A3 Human genes 0.000 description 1
- 102100031876 SRC kinase signaling inhibitor 1 Human genes 0.000 description 1
- 102100037204 Sal-like protein 1 Human genes 0.000 description 1
- 102100028024 Septin-9 Human genes 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- 102100031075 Serine/threonine-protein kinase Chk2 Human genes 0.000 description 1
- 102100039758 Serine/threonine-protein kinase DCLK1 Human genes 0.000 description 1
- 102100027941 Serine/threonine-protein kinase PAK 5 Human genes 0.000 description 1
- 102100038192 Serine/threonine-protein kinase TBK1 Human genes 0.000 description 1
- 240000003801 Sigesbeckia orientalis Species 0.000 description 1
- 235000003407 Sigesbeckia orientalis Nutrition 0.000 description 1
- 206010040844 Skin exfoliation Diseases 0.000 description 1
- 102000006633 Sodium-Bicarbonate Symporters Human genes 0.000 description 1
- 102100032276 Solute carrier family 35 member E2A Human genes 0.000 description 1
- 102100023984 Sulfotransferase 1A2 Human genes 0.000 description 1
- 102100021920 Synapsin-3 Human genes 0.000 description 1
- 102100040757 Synaptotagmin-3 Human genes 0.000 description 1
- 102100031241 Thioredoxin reductase 2, mitochondrial Human genes 0.000 description 1
- 102100036034 Thrombospondin-1 Human genes 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 102100033663 Transforming growth factor beta receptor type 3 Human genes 0.000 description 1
- 102100030742 Transforming growth factor beta-1 proprotein Human genes 0.000 description 1
- 102100030737 Transforming growth factor beta-2 proprotein Human genes 0.000 description 1
- 102100021219 Transmembrane protein 191B Human genes 0.000 description 1
- 102100036455 Tudor domain-containing protein 7 Human genes 0.000 description 1
- 102100040247 Tumor necrosis factor Human genes 0.000 description 1
- 102100033254 Tumor suppressor ARF Human genes 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 206010047571 Visual impairment Diseases 0.000 description 1
- 102100026335 Zinc finger protein 276 Human genes 0.000 description 1
- 102100029042 Zinc finger protein 469 Human genes 0.000 description 1
- 102100039969 Zinc finger protein 492 Human genes 0.000 description 1
- 102100036647 Zinc finger protein 512B Human genes 0.000 description 1
- 102100025883 Zinc finger protein GLIS1 Human genes 0.000 description 1
- 102100025879 Zinc finger protein GLIS3 Human genes 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 230000032683 aging Effects 0.000 description 1
- 108010029483 alpha 1 Chain Collagen Type I Proteins 0.000 description 1
- 125000003275 alpha amino acid group Chemical group 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 210000002159 anterior chamber Anatomy 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 238000012098 association analyses Methods 0.000 description 1
- 238000012093 association test Methods 0.000 description 1
- 238000011888 autopsy Methods 0.000 description 1
- 230000004009 axon guidance Effects 0.000 description 1
- 108010079292 betaglycan Proteins 0.000 description 1
- 230000002146 bilateral effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 238000010241 blood sampling Methods 0.000 description 1
- 230000021164 cell adhesion Effects 0.000 description 1
- 230000004640 cellular pathway Effects 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 210000003483 chromatin Anatomy 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 238000009223 counseling Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000006735 deficit Effects 0.000 description 1
- 230000001934 delay Effects 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000006866 deterioration Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 230000005750 disease progression Effects 0.000 description 1
- 208000037765 diseases and disorders Diseases 0.000 description 1
- 238000002224 dissection Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 230000004406 elevated intraocular pressure Effects 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 238000012854 evaluation process Methods 0.000 description 1
- 238000004299 exfoliation Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000010195 expression analysis Methods 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 238000012252 genetic analysis Methods 0.000 description 1
- 102000054766 genetic haplotypes Human genes 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- 208000021822 hypotensive Diseases 0.000 description 1
- 230000001077 hypotensive effect Effects 0.000 description 1
- 230000002757 inflammatory effect Effects 0.000 description 1
- 230000004054 inflammatory process Effects 0.000 description 1
- 230000028709 inflammatory response Effects 0.000 description 1
- 208000028507 juvenile open angle glaucoma Diseases 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 201000002978 low tension glaucoma Diseases 0.000 description 1
- 208000002780 macular degeneration Diseases 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 108091079013 miR-34b Proteins 0.000 description 1
- 108091084018 miR-34b stem-loop Proteins 0.000 description 1
- 108091063470 miR-34b-1 stem-loop Proteins 0.000 description 1
- 108091049916 miR-34b-2 stem-loop Proteins 0.000 description 1
- 108091057222 miR-34b-3 stem-loop Proteins 0.000 description 1
- 108091092639 miR-34b-4 stem-loop Proteins 0.000 description 1
- 201000010478 microphthalmia Diseases 0.000 description 1
- 210000003470 mitochondria Anatomy 0.000 description 1
- 230000002438 mitochondrial effect Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000007479 molecular analysis Methods 0.000 description 1
- 230000003988 neural development Effects 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 235000015097 nutrients Nutrition 0.000 description 1
- 208000020911 optic nerve disease Diseases 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 230000008506 pathogenesis Effects 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 239000013610 patient sample Substances 0.000 description 1
- 230000000144 pharmacologic effect Effects 0.000 description 1
- 150000004713 phosphodiesters Chemical group 0.000 description 1
- 238000013439 planning Methods 0.000 description 1
- 230000035935 pregnancy Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 235000004252 protein component Nutrition 0.000 description 1
- 208000020016 psychiatric disease Diseases 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000004258 retinal degeneration Effects 0.000 description 1
- 230000002207 retinal effect Effects 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 210000003786 sclera Anatomy 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 238000007841 sequencing by ligation Methods 0.000 description 1
- 238000013517 stratification Methods 0.000 description 1
- 238000011477 surgical intervention Methods 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 230000009885 systemic effect Effects 0.000 description 1
- 230000000699 topical effect Effects 0.000 description 1
- 238000012384 transportation and delivery Methods 0.000 description 1
- 238000011277 treatment modality Methods 0.000 description 1
- 238000011269 treatment regimen Methods 0.000 description 1
- 208000001072 type 2 diabetes mellitus Diseases 0.000 description 1
- 108700026220 vif Genes Proteins 0.000 description 1
- 208000029257 vision disease Diseases 0.000 description 1
- 238000012070 whole genome sequencing analysis Methods 0.000 description 1
- 238000012049 whole transcriptome sequencing Methods 0.000 description 1
- 239000002023 wood Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P27/00—Drugs for disorders of the senses
- A61P27/02—Ophthalmic agents
- A61P27/06—Antiglaucoma agents or miotics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/118—Prognosis of disease development
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/178—Oligonucleotides characterized by their use miRNA, siRNA or ncRNA
Definitions
- Glaucoma is one of the most prevalent causes of blindness in the United States. Types of glaucoma can be grouped as open-angle, angle closure, and secondary. It is estimated that in the United States in 2010, of those over age 40, open-angle glaucoma affected nearly 2.8 million people, and worldwide caused bilateral blindness in more than 4.4 million people [1].
- POAG Primary open-angle glaucoma
- IOP intraocular pressure
- identifying genes whose alleles are associative with or causative of the progression of a disease comprising:
- genes having one or more site variants in the exomes from patients who have been diagnosed with the disease with one or more properties, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 properties, selected from:
- site variant is found in one or more patients
- site variant is found in three or more patients.
- one or more reference exomes have the major allele
- site variant is the minor allele in reference exomes
- site variant has only one alternate allele
- site is within genome region with balanced G+C and A+T content
- site is located outside low complexity genome regions;
- ix) site is located in genome region with no paralog within 95% identity; and
- site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased;
- xi) site was measured in 25 or more patients
- xii) site variant frequency in patients differs from general populations by more than expected measurement error, e.g., 0.05 (on a frequency scale from 0.00 - 1.00);
- xiii) site variant frequency in patients exceeds general populations, e.g., by more than 0.10;
- xiv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
- xv) site variant is within or near a gene expressed in tissues relevant to disease
- xvii) frequency of site variant in patients is above a line fitted to filtered sites represented as datapoints where X is reference general population frequency and Y is patient frequency, e.g., fit with least squares linear regression;
- a p-value calculated with a 2x2 statistical test, e.g., Fisher's Exact Test, from numbers of alternate and reference alleles observed for the site in patients and in general population remains significant after correction for multiple testing.
- the methods comprise selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease, wherein the selecting is carried out with nine or more properties, or twelve or more properties, or fifteen or more properties, or all eighteen of the properties identified above (i) to (xviii).
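The 2x2 test named in the last listed property can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it computes a two-sided Fisher's Exact Test p-value from alternate and reference allele counts using only the Python standard library. The source says only "correction for multiple testing"; the Bonferroni correction, the function names, and the 0.05 significance level used here are assumptions.

```python
from math import comb


def fisher_exact_two_sided(alt_pat, ref_pat, alt_pop, ref_pop):
    """Two-sided Fisher's Exact Test p-value for the 2x2 table
    [[alt_pat, ref_pat], [alt_pop, ref_pop]] of allele counts
    (rows: patients, general population)."""
    row_pat = alt_pat + ref_pat
    row_pop = alt_pop + ref_pop
    col_alt = alt_pat + alt_pop
    total_tables = comb(row_pat + row_pop, col_alt)

    def table_probability(x):
        # Hypergeometric probability of x alternate alleles among patients.
        return comb(row_pat, x) * comb(row_pop, col_alt - x) / total_tables

    observed = table_probability(alt_pat)
    lowest = max(0, col_alt - row_pop)
    highest = min(row_pat, col_alt)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        p
        for p in map(table_probability, range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def bonferroni_significant(p_values, alpha=0.05):
    """Assumed correction: a site stays significant if p * (number of tests) <= alpha."""
    n_tests = len(p_values)
    return [p * n_tests <= alpha for p in p_values]
```

Bonferroni is the most conservative common choice; a false discovery rate procedure would retain more sites and is equally compatible with the wording of the property.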
- identifying genes whose alleles are associative with or causative of the onset and/or progression and/or severity and/or recurrence of a disease comprising: a) sequencing or reviewing multiple exomes from patients who have been diagnosed with the disease and one or more exomes from one or more individuals known not to have the disease, wherein the one or more exomes from one or more individuals known not to have the disease comprise one or more reference exomes;
- genes having one or more site variants in the exomes from patients who have been diagnosed with the disease, wherein the genes have one or more properties, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 properties, selected from:
- i) site variant is found in one or more patients
- iii) site variant is found in three or more patients
- one or more reference exomes have the major allele
- v) site variant is the minor allele in reference exomes
- site is within genome region with balanced G+C and A+T content
- site is located outside low complexity genome regions
- ix) site is located in genome region with no paralog within 95% identity
- site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased.
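Selecting on "one or more" of the listed properties amounts to counting how many properties a site satisfies and comparing the count with a chosen minimum. The sketch below illustrates that counting step under stated assumptions: the dictionary keys, the property names, and the reading of "minor allele" as a reference frequency below 0.5 are all hypothetical, and only four of the listed properties are encoded.

```python
# Each property is a named predicate over a site record (a plain dict here).
# Key names such as "n_patients" are illustrative, not taken from the source.
PROPERTIES = {
    "found_in_three_or_more_patients": lambda s: s["n_patients"] >= 3,
    "minor_allele_in_reference": lambda s: s["reference_frequency"] < 0.5,
    "single_alternate_allele": lambda s: len(s["alternate_alleles"]) == 1,
    "no_paralog_within_95_percent": lambda s: s["max_paralog_identity"] < 0.95,
}


def satisfied_properties(site):
    """Return the names of the properties that the site satisfies."""
    return [name for name, test in PROPERTIES.items() if test(site)]


def select_site(site, min_properties):
    """Keep the site if it satisfies at least min_properties properties."""
    return len(satisfied_properties(site)) >= min_properties
```

Keeping the predicates in a table makes it straightforward to raise the minimum count or to add the remaining properties without changing the selection logic.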
- genes having one or more site variants in the exomes from patients who have been diagnosed with the disease, wherein the genes have one or more properties, e.g., 1, 2, 3, 4, 5, 6, 7, or 8 properties, selected from:
- i) site was measured in 25 or more patients
- site variant frequency in patients differs from general populations by more than expected measurement error, e.g., 0.05 (on a frequency scale from 0.00 - 1.00);
- iii) site variant frequency in patients exceeds general populations, e.g., by more than 0.10;
- iv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
- v) site variant is within or near a gene expressed in tissues relevant to disease;
- vi) odds ratio 95% confidence interval lower bound calculated for the site from patient and reference general population frequencies is above 1.00;
- frequency of site variant in patients is above a line fitted to filtered sites represented as datapoints where X is reference general population frequency and Y is patient frequency, e.g., fit with least squares linear regression;
- a p-value calculated with a 2x2 statistical test, e.g., Fisher's Exact Test, from numbers of alternate and reference alleles observed for the site in patients and in general population remains significant after correction for multiple testing.
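Two of the properties above are simple calculations: the lower bound of the odds ratio 95% confidence interval, and whether a site lies above a least squares line fitted to (general population frequency, patient frequency) datapoints. The sketch below is one possible reading. The source does not say how the interval is computed, so the Woolf (log odds ratio) approximation used here is an assumption, as are the function names; zero allele counts would need a continuity correction that is not shown.

```python
from math import exp, log, sqrt


def odds_ratio_ci_lower(alt_pat, ref_pat, alt_pop, ref_pop, z=1.96):
    """Lower bound of the approximate 95% CI of the odds ratio
    (Woolf log method, assumed; all four counts must be non-zero)."""
    odds_ratio = (alt_pat * ref_pop) / (ref_pat * alt_pop)
    standard_error = sqrt(1 / alt_pat + 1 / ref_pat + 1 / alt_pop + 1 / ref_pop)
    return exp(log(odds_ratio) - z * standard_error)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def above_line(pop_freq, patient_freq, slope, intercept):
    """True if the site's patient frequency lies above the fitted line."""
    return patient_freq > slope * pop_freq + intercept
```

A site would pass property vi) when `odds_ratio_ci_lower(...)` exceeds 1.00, and the regression property when `above_line(...)` is true for the line fitted to all filtered sites.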
- the disease is, for example, a systematic, chronic disease, such as, for example, a neurodegenerative disease, a cancer, a cardiovascular disease, an ocular disease, an immune disease, an autoimmune disease, an endocrinologic disease (e.g., diabetes), or an inflammatory disease (including chronic inflammatory).
- the disease is a neurodegenerative disease.
- the disease is an ocular disease.
- the disease is primary open angle glaucoma (POAG).
- the patients are symptomatic for the disease.
- the method is computer implemented.
- the site variants are selected from single nucleotide polymorphisms (SNPs), insertions, deletions and rearrangements.
- the methods further comprise determining the expression levels of the genes from patient exomes and reference exomes.
- the methods further comprise determining the expression levels of the microRNA from patient exomes and reference exomes.
- the sequencing step comprises employing a next-generation sequencing (NGS) technique or method.
- the methods further comprise selecting exomes sequenced and read with a fidelity of 4, 3, 2, 1 or fewer (e.g., no) mismatches per 100 bases.
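The fidelity criterion can be read as a mismatch rate per 100 aligned bases. The following sketch assumes that a read has already been aligned to a reference segment of equal length with no gaps; the function names and the default limit of one mismatch per 100 bases are illustrative choices, not values fixed by the source.

```python
def mismatches_per_100_bases(read, reference):
    """Mismatch rate per 100 bases for an ungapped, equal-length alignment."""
    if not read or len(read) != len(reference):
        raise ValueError("read and reference must be non-empty and equal in length")
    mismatches = sum(1 for r, g in zip(read, reference) if r != g)
    return 100.0 * mismatches / len(read)


def passes_fidelity(read, reference, max_per_100=1.0):
    """True if the read has at most max_per_100 mismatches per 100 bases."""
    return mismatches_per_100_bases(read, reference) <= max_per_100
```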
- the general population exome dataset is selected from or derived from one or more of 1000 Genomes (1000genomes.org), the Exome Sequencing Project (evs.gs.washington.edu/EVS/) datasets, UK10K (uk10k.org/), UCSC Genome Bioinformatics Site (genome.ucsc.edu/), other available public datasets, and proprietary datasets made available for comparison.
- the methods further comprise weighting said selected genes according to predictive power rankings of the collection of signature biomarkers.
- methods for predicting onset and/or progression and/or severity and/or recurrence of primary open angle glaucoma (POAG) in a subject comprising:
- allelic information and/or expression levels of a collection of signature biomarkers from a biological sample taken from said subject suspected of suffering POAG wherein said collection of signature biomarkers comprises one or more genes and/or microRNAs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, or more or all, selected from the group consisting of: AATF, ABI1, ABI3BP, ACTN2, ADAMTS15, ADCY2, AHNAK2, ANGEL2, ANKRD36, ANKRD36B, ANO5, AP1M1, ARHGAP30, ASTN1, ATP6V1E2, BAI3, CACNA1E, CACNA1I,
- said collection of signature biomarkers comprises one or more genes selected from the biomarkers listed in Tables 4, 5 and/or 6.
- collection of signature biomarkers comprises one or more genes selected from the group consisting of: AATF, ABI1, ABI3BP, ACTN2, ADAMTS15, ADCY2, AHNAK2, ANGEL2, ANKRD36, ANKRD36B, ANO5, AP1M1, ARHGAP30, ASTN1, ATP6V1E2, BAI3, CACNA1E, CACNA1I, CALM1, CCDC66, CD163, CDH13, CDH4, CDK17, CELF5, CHD8, CLCA4, CLEC7A, CLSTN2, CNNM2, CNOT6, COL23A1, COL4A2, CRTAC1, CTU2, CYBA, DCBLD2, DHCR7, DNAJB11, DPF3, DRD2, EBF2, ENO3, EPT1, ERI2,
- the methods further comprise administering to the subject an inhibitory nucleic acid that reduces or inhibits the expression of one or more microRNAs selected from hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-mi
- the methods further comprise administering to the subject one or more microRNAs or one or more mimics of microRNAs selected from hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a,
- the methods further comprise administering to the subject an inhibitory nucleic acid that reduces or inhibits the expression of one or more microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-
- the methods further comprise administering to the subject one or more microRNAs or one or more mimics of microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-
- the individual is symptomatic for POAG. In some embodiments, the individual has a family history of POAG. In some embodiments, said output of the predictive model predicts a likelihood of recurrence of POAG in the individual after said individual has undergone treatment for POAG. In some embodiments, the methods further comprise providing a report having a prediction of clinical recurrence of POAG of said individual. In some embodiments, the methods further comprise combining the allelic information and/or gene expression levels of said signature biomarkers with one or more other biomarkers to predict onset and/or progression and/or severity and/or recurrence of POAG in said individual. In some embodiments, the expression levels of a collection of signature biomarkers comprise gene expression levels measured at multiple times.
- the methods further comprise using the dynamics of the gene expression levels measured at multiple times to predict onset and/or progression and/or severity and/or recurrence of disease (e.g., HPG/POAG) in said subject.
- the methods further comprise evaluating the output of the predictive model to determine whether or not the individual falls in a high risk group.
- the methods further comprise developing said predictive model using stability selection or logistic regression.
- the methods further comprise developing said predictive model using stability selection.
- the methods further comprise developing said predictive model using logistic regression.
- applying said allelic information and/or expression levels of the collection of signature biomarkers to said predictive model comprises weighting said expression levels according to stability rankings or predictive power rankings of the collection of signature biomarkers.
- applying said allelic information and/or expression levels of the collection of signature biomarkers to said predictive model comprises weighting said expression levels according to stability rankings of the collection of signature biomarkers. In some embodiments, applying said allelic information and/or expression levels of the collection of signature biomarkers to said predictive model comprises weighting said expression levels according to predictive power rankings of the collection of signature biomarkers.
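The weighting and high-risk evaluation described above can be illustrated with a small logistic model. This is a sketch under stated assumptions rather than the disclosed model: the source does not specify how rankings become weights, so the linear rank weighting, the zero intercept, the 0.5 cutoff, and all function names are hypothetical. Expression values are assumed to be standardized so that zero represents the control level.

```python
from math import exp


def rank_weights(ranked_biomarkers):
    """Weights that decrease linearly with rank (first = most predictive)
    and sum to 1. The linear scheme is an assumption."""
    n = len(ranked_biomarkers)
    total = n * (n + 1) / 2
    return {name: (n - i) / total for i, name in enumerate(ranked_biomarkers)}


def risk_probability(expression, weights, intercept=0.0):
    """Logistic model output for one subject's weighted expression levels."""
    score = intercept + sum(weights[g] * expression[g] for g in weights)
    return 1.0 / (1.0 + exp(-score))


def in_high_risk_group(probability, cutoff=0.5):
    """Assumed decision rule: high risk when the probability reaches the cutoff."""
    return probability >= cutoff
```

In practice the weights and intercept would be estimated from training data, for example by fitting a logistic regression; the rank-based weights here only show where such values enter the calculation.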
- One embodiment is a method of identifying genes whose alleles are associative with or causative of the progression of a disease, comprising:
- i) site variant is present in 25 or more patients
- ii) site variant has only one alternate allele
- the one or more reference exomes have the major allele;
- site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
- site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased;
- site variant has a frequency of ⁇ 0.95 in patients
- site variant is within general population exome dataset
- site variant has approximately the same frequency within the general population as the frequency of the disease within the general population; and
- ix) site variant occurs in patients with a frequency greater than in the general population.
- Another embodiment is a method of identifying genes whose alleles are associative with or causative of the progression of a disease, comprising:
- i) site variant is present in two or more patients
- ii) site variant has only one alternate allele
- the one or more reference exomes have the major allele; and
- iv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
- genes having one or more site variants in the exomes from patients who have been diagnosed with the disease wherein the genes have one or more properties selected from:
- i) site variant is present in 25 or more patients
- site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased;
- iii) site variant has a frequency of ⁇ 0.95 in patients
- site variant is within general population exome dataset;
- v) site variant has approximately the same frequency within the general population as the frequency of the disease within the general population; and
- vi) site variant occurs in patients with a frequency greater than in the general population.
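Several of the properties in this embodiment are direct comparisons on per-site counts and frequencies, so a filter over site records can be sketched briefly. The example below encodes only four of them (presence in 25 or more patients, chromosome restriction, presence in the general population dataset, and higher frequency in patients); the remaining properties are left out. The `SiteVariant` record, its field names, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

AUTOSOMES = {str(n) for n in range(1, 23)}


@dataclass
class SiteVariant:
    gene: str
    chromosome: str
    patients_with_variant: int
    patient_frequency: float
    # None means the site is absent from the general population dataset.
    population_frequency: Optional[float]


def passes_filters(site, min_patients=25, gender_biased=False):
    """Apply the four encoded properties; X and Y are allowed only when
    disease incidence is gender-biased."""
    on_allowed_chromosome = site.chromosome in AUTOSOMES or (
        gender_biased and site.chromosome in {"X", "Y"}
    )
    return (
        site.patients_with_variant >= min_patients
        and on_allowed_chromosome
        and site.population_frequency is not None
        and site.patient_frequency > site.population_frequency
    )


def candidate_genes(sites, **filter_options):
    """Sorted, de-duplicated names of genes with at least one passing site."""
    return sorted({s.gene for s in sites if passes_filters(s, **filter_options)})
```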
- Methods for diagnosis, prognosis, and/or therapy for the diseases described herein, including glaucoma and POAG are generally known in the art and can be combined with the methods of gene and biomarker identification described herein. For example, a patient can be tested for having or not having the identified genetic marker as described herein. One or more samples can be taken from the patient, and the samples analyzed.
- additional diagnosis, prognosis, and therapy can be carried out with the patient. For example, one can analyze for onset, progression, severity, and/or recurrence of the disease. Methods known in the art can be used. See, for example, US Patent Publication
- kits designed and configured for practicing methods are also provided herein as known in the art of diagnostic and testing kits and devices. The use of kits is generally known in the art. See, for example, US Patent Publication 2011/0177509, which is incorporated herein by reference in its entirety. Kits can include, for example, appropriate genetic materials, indicators, instructions, and/or packaging.
- methods for identifying a patient or subject using the methods described herein, which can include kits.
- One or more genetic tests can be used to identify the patient or subject.
- the patient or subject can then be given a prognosis and/or treatment.
- exome refers to the part of the genome formed by exons, the sequences which when transcribed remain within the mature RNA after introns are removed by RNA splicing. It differs from a transcriptome in that it consists of all DNA that is transcribed into mature RNA in cells of any type.
- the exome includes coding exons, non-coding exons, 5' untranslated regions (UTR), 3' UTR, flanking introns, microRNA, and proximal promoters.
- threshold level refers to a representative or predetermined expression level of a gene or microRNA.
- the threshold level can represent expression detected in a sample from a normal control, i.e., from non-diseased tissue or non-diseased subject.
- the normal control is of the same tissue type of the biological sample subject to testing.
- the threshold level can be determined from an individual or from a population of individuals.
- the expression levels of a gene or microRNA from a diseased tissue or subject may be above (increased) or below (decreased) in comparison to a control level.
- the terms “increased expression level” or “overexpression” interchangeably refer to an expression level above a predetermined threshold level or a level of expression from a normal or non-diseased control.
- An increased expression level is determined when the level of expression in the test biological sample is at least about 10%, 25%, 50%, 75%, 100% (i.e., 1-fold), 2-fold, 3-fold, 4-fold or greater, in comparison to the predetermined threshold level of expression or the level of expression from a normal or non-diseased control tissue. In determining an increased level of expression, usually the same tissue types are compared.
- the terms “decreased expression level” or “underexpression” interchangeably refer to an expression level below a predetermined threshold level or a level of expression from a normal or non-diseased control.
- a decreased expression level is determined when the level of expression in the test biological sample is at least about 10%, 25%, 50%, 75%, 100% (i.e., 1-fold), 2-fold, 3-fold, 4-fold or less or lower, in comparison to the predetermined threshold level of expression or the level of expression from a normal or non-diseased control tissue. In determining a decreased level of expression, usually the same tissue types are compared.
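The increased and decreased expression definitions reduce to a relative difference between a sample level and a threshold level. A minimal sketch follows, assuming both levels are positive values measured on the same scale in the same tissue type; the function name, the "unchanged" label, and the default minimum change of 10% (the smallest value listed) are illustrative.

```python
def classify_expression(sample_level, threshold_level, min_change=0.10):
    """Label a sample as increased, decreased, or unchanged relative to a
    threshold level, using a minimum relative change (0.10 means 10%)."""
    if threshold_level <= 0:
        raise ValueError("threshold_level must be positive")
    change = (sample_level - threshold_level) / threshold_level
    if change >= min_change:
        return "increased"
    if change <= -min_change:
        return "decreased"
    return "unchanged"
```

For example, a sample level of 150 against a threshold of 100 is a 50% increase, so it is classified as increased at any of the 10%, 25%, or 50% minimum changes.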
- the terms "individual," "patient," and "subject" interchangeably refer to a mammal, for example, a human, a non-human primate, a domesticated mammal (e.g., a canine or a feline), an agricultural mammal (e.g., equine, bovine, ovine, porcine), or a laboratory mammal (e.g., rattus, murine, lagomorpha, hamster).
- composition or method comprising
- elements are included, but other elements (e.g., unnamed signature genes) may be added and still represent a composition or method within the scope of the claim.
- transitional phrase "consisting essentially of" means that the associated composition or method encompasses additional elements, including, for example, additional signature genes, that do not affect the basic and novel characteristics of the disclosure.
- the term "signature gene” refers to a gene whose expression is correlated, either positively or negatively, with disease extent or outcome or with another predictor of disease extent or outcome.
- a gene expression score can be statistically derived from the expression levels of a set of signature genes and used to diagnose a condition or to predict clinical course.
- the expression levels of the signature genes may be used to predict onset and/or progression and/or severity and/or recurrence of disease (e.g., POAG or HPG) without relying on a
- a "signature nucleic acid” is a nucleic acid comprising or corresponding to, in case of cDNA, the complete or partial sequence of an RNA transcript encoded by a signature gene, or the complement of such complete or partial sequence.
- a signature protein is encoded by or corresponding to a signature gene of the disclosure.
- the predictive methods of the present disclosure also can provide valuable tools in predicting if a patient is likely to respond favorably to a treatment regimen, such as surgical intervention and/or pharmacological intervention.
- the term "plurality" refers to more than one element.
- the term is used herein in reference to a number of nucleic acid molecules or sequence tags that are sufficient to identify significant differences in copy number variations in test samples and qualified samples using the methods disclosed herein.
- at least about 3 x 10^6 sequence tags of between about 20 and 40 bp are obtained for each test sample.
- each test sample provides data for at least about 5 x 10^6, 8 x 10^6, 10 x 10^6, 15 x 10^6, 20 x 10^6, 30 x 10^6, 40 x 10^6, or 50 x 10^6 sequence tags, each sequence tag comprising between about 20 and 40 bp.
- nucleic acid refers to a covalently linked sequence of nucleotides (i.e., ribonucleotides for RNA and deoxyribonucleotides for DNA) in which the 3' position of the pentose of one nucleotide is joined by a phosphodiester group to the 5' position of the pentose of the next.
- nucleotides include sequences of any form of nucleic acid, including, but not limited to RNA and DNA molecules.
- polynucleotide includes, without limitation, single- and double-stranded polynucleotide.
- microRNA mimic and “mimics of microRNA” are well known in the art. See e.g., Wang, Z., 2009, Chapter on “miRNA Mimic Technology,” pages 93-100, MicroRNA Interference Technologies, Springer-Verlag.
- it can refer to synthetic sequences that are nearly identical or identical to microRNAs found in cells. They can be, for example, sometimes modified chemically in some way for stability (e.g., to make it through the liver) or with a nucleotide or two changed for delivery or manufacturing purposes.
- microRNAs or short synthetic RNAs nearly identical to the microRNAs can be used, e.g., 90% identical or closer, possibly with chemical modifications to the nucleotides. Double stranded miRNA mimics can be used.
- NGS Next Generation Sequencing
- the term "read” refers to a sequence read from a portion of a nucleic acid sample. Typically, though not necessarily, a read represents a short sequence of contiguous base pairs in the sample. The read may be represented symbolically by the base pair sequence (in ATCG) of the sample portion. It may be stored in a memory device and processed as appropriate to determine whether it matches a reference sequence or meets other criteria.
- a read may be obtained directly from a sequencing apparatus or indirectly from stored sequence information concerning the sample.
- a read is a DNA sequence of sufficient length (e.g., at least about 25 bp) that can be used to identify a larger sequence or region, e.g., that can be aligned and specifically assigned to a chromosome or genomic region or gene.
- the terms “aligned,” “alignment,” or “aligning” refer to the process of comparing a read or tag to a reference sequence and thereby determining whether the reference sequence contains the read sequence. If the reference sequence contains the read, the read may be mapped to the reference sequence or, in certain embodiments, to a particular location in the reference sequence. In some cases, alignment simply tells whether or not a read is a member of a particular reference sequence (i.e., whether the read is present or absent in the reference sequence). For example, the alignment of a read to the reference sequence for human chromosome 13 will tell whether the read is present in the reference sequence for chromosome 13. A tool that provides this information may be called a set membership tester.
- an alignment additionally indicates a location in the reference sequence where the read or tag maps to. For example, if the reference sequence is the whole human genome sequence, an alignment may indicate that a read is present on chromosome 13, and may further indicate that the read is on a particular strand and/or site of chromosome 13.
- Aligned reads or tags are one or more sequences that are identified as a match in terms of the order of their nucleic acid molecules to a known sequence from a reference genome. Alignment can be done manually, although it is typically implemented by a computer algorithm, as it would be impossible to align reads in a reasonable time period for implementing the methods disclosed herein.
- an algorithm for aligning sequences is the Efficient Local Alignment of Nucleotide Data (ELAND) computer program distributed as part of the Illumina Genomics Analysis pipeline.
- ELAND Efficient Local Alignment of Nucleotide Data
- a Bloom filter or similar set membership tester may be employed to align reads to reference genomes.
- an indexing algorithm such as that implemented in versions of the BowTie computer program may be employed to align reads to reference genomes.
- the matching of a sequence read in aligning can be a 100% sequence match or less than 100% (non-perfect match).
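The set membership view of alignment described above can be sketched with a small Bloom filter. This is an illustrative toy rather than the implementation used by ELAND or BowTie: the class `BloomFilter`, the helper `index_reference`, the bit-array size, and the SHA-256 based hashing are all assumptions. It handles only exact (100%) matches for one fixed read length, and it can report rare false positives, though never false negatives.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: answers 'possibly present' or 'definitely absent'."""

    def __init__(self, n_bits: int = 1 << 16, n_hashes: int = 3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive n_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def index_reference(reference: str, read_length: int) -> BloomFilter:
    """Add every substring of length read_length from the reference."""
    bloom = BloomFilter()
    for start in range(len(reference) - read_length + 1):
        bloom.add(reference[start:start + read_length])
    return bloom
```

A read is then tested with `read in index_reference(reference, len(read))`. Locating where the read maps, or tolerating mismatches, requires an indexing aligner of the kind mentioned above.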
- mapping refers to specifically assigning a sequence read to a larger sequence, e.g., a reference genome, by alignment.
- reference genome refers to any particular known genome sequence, whether partial or complete, of any organism or virus which may be used to reference identified sequences from a subject.
- reference genome used for human subjects as well as many other organisms is found at the National Center for Biotechnology Information at ncbi.nlm.nih.gov.
- a "genome” refers to the complete genetic information of a mammal expressed in nucleic acid sequences.
- the reference sequence is significantly larger than the reads that are aligned to it.
- it may be at least about 100 times larger, or at least about 1000 times larger, or at least about 10,000 times larger, or at least about 10^5 times larger, or at least about 10^6 times larger, or at least about 10^7 times larger.
- chromosome refers to the heredity-bearing gene carrier of a living cell, which is derived from chromatin strands comprising DNA and protein components (especially histones).
- the conventional internationally recognized individual human genome chromosome numbering system is employed herein.
- condition herein refers to "medical condition” as a broad term that includes all diseases and disorders, but can include [injuries] and normal health situations, such as pregnancy, that might affect a person's health, benefit from medical assistance, or have implications for medical treatments.
- sensitivity is equal to the number of true positives divided by the sum of true positives and false negatives.
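Written as a formula, with TP and FN denoting true positives and false negatives (the counts in the worked example are hypothetical):

```latex
\text{sensitivity} = \frac{TP}{TP + FN},
\qquad \text{e.g.}\quad \frac{90}{90 + 10} = 0.90
```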
- Figure 1 illustrates strategies for high fidelity identification of SNPs, insertions/deletions (indels), and genome rearrangements associated with disease causation and/or progression.
- SNPs @ 3x: Large rectangles represent ranges of genome nucleotides to which sequence reads, represented by smaller lines, were mapped.
- To identify SNPs, reads with 0 to 3 mismatches per 100 bases are aligned to the reference genome and their bases are compared. Mismatches between reference nucleotides and read nucleotides, represented by dark dots on the reads, designate a variant site. Generally, 3+ sequence reads are needed to determine whether a site has a variant.
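The SNP strategy described for Figure 1 can be sketched as follows. The function `call_snp_sites` and its input layout (ungapped reads given as 0-based start plus sequence) are hypothetical. Requiring three reads that carry the same non-reference base is one reading of the "3+ sequence reads" rule, adopted here as an assumption.

```python
from collections import Counter, defaultdict


def call_snp_sites(reference, aligned_reads,
                   max_mismatch_rate=0.03, min_reads=3):
    """Return {position: (ref_base, alt_base, supporting_reads)}.

    aligned_reads: iterable of (start, sequence) pairs, ungapped,
    with 0-based start positions on the reference.
    """
    alt_counts = defaultdict(Counter)
    for start, seq in aligned_reads:
        ref_span = reference[start:start + len(seq)]
        if len(ref_span) != len(seq):
            continue  # read runs off the end of the reference
        mismatches = [(start + i, base)
                      for i, (ref_base, base) in enumerate(zip(ref_span, seq))
                      if ref_base != base]
        # Keep only reads with at most 3 mismatches per 100 bases.
        if len(mismatches) > max_mismatch_rate * len(seq):
            continue
        for pos, base in mismatches:
            alt_counts[pos][base] += 1

    calls = {}
    for pos, counts in alt_counts.items():
        base, n = counts.most_common(1)[0]
        if n >= min_reads:  # assumption: 3+ reads must agree on the base
            calls[pos] = (reference[pos], base, n)
    return calls
```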
- FIG. 1 illustrates genes with their strength of expression in human eye tissues. Left: Dark to light color represents high to low overall expression in eye tissues for a non-exhaustive list of genes detected as expressed in eye tissues by RNA sequencing; genes were selected to range from high to low expression.
- Three genes previously associated with glaucoma are noted: GAS7, HLA-DRB1, and COL4A2.
- TM trabecular meshwork
- CB ciliary bodies
- CH choroid
- OD optic disk
- RT retina
- Blue lines (top) denote gene exons.
- Black vertical lines denote RNA sequence reads.
- Figure 3 illustrates expression of four genes in 6 eye tissues, for each gene including trabecular meshwork (TM), ciliary body (CB), choroid (CH), optic disk (OD), optic nerve (ON) and retina (RT). Each gene has distinct tissue-specific expression.
- TM trabecular meshwork
- CB ciliary body
- CH choroid
- OD optic disk
- ON optic nerve
- RT retina
- Figure 5 illustrates microRNA overexpressed in diseased optic nerve (i.e., optic nerve from patients having primary open angle glaucoma).
- Overexpressed microRNAs include hsa-miR-483-5p, hsa-miR-483-3p, hsa-miR-214-3p, hsa-miR-452-5p, hsa-miR-4448, hsa-miR-224-5p, hsa-miR-1246, hsa-miR-130a-3p, hsa-miR-9-3p, hsa-miR-767-5p, and hsa-miR-449a.
- FIG. 6 illustrates microRNA (miRNA) underexpressed in diseased optic nerve (i.e., optic nerve from patients having primary open angle glaucoma).
- Underexpressed microRNAs include hsa-miR-34b-3p, hsa-miR-3182, hsa-miR-4640-3p, hsa-miR-2276, hsa-miR-4423-5p, hsa-miR-2277-3p, hsa-miR-513c-5p, hsa-miR-1250, hsa-miR-18a-3p, hsa-miR-505-5p, hsa-miR-138-2-3p, hsa-miR-548ah-3p, hsa-miR-4677-3p, hsa-miR-1226-3p, hsa-miR-193b-5p, and hsa-miR-18b.
- kits for identification of disease-associated genome variants in coding or regulatory regions of genes are provided herein.
- the methods are exemplified in a preferred embodiment by the identification of genes that are associated with and/or promote onset or progression of a type of primary open-angle glaucoma.
- Other methods such as predictive, diagnostic, prognostic, and therapeutic methods are also provided herein.
- the methods are based, in part, on the definition and use of a logic-based method to rank variants and genes based on clinical properties of disease.
- the methods are exemplified by application to variants from a cohort of patients with primary open angle glaucoma (POAG) and with elevated eye pressure. In this embodiment, the method revealed 140 genes with variants over-represented in this disease. Genes were further ranked within the method based on gene expression patterns in tissues relevant to the disease process, which in the case of POAG can be retina, optic disk, optic nerve, ciliary body, choroid, trabecular meshwork, iris, sclera, and lamina cribrosa. Additional genes associated with the ranked genes were identified within the method as potential regulators of RNA and protein expression levels whose regulatory performance is disrupted or altered by highly ranked variants.
- the method implements technical and clinical filters that reflect occurrence of disease in general populations. These filters reduced thousands of potential variants to under 150 for the preferred embodiment.
- the method incorporates gene expression information from tissues relevant to disease to refine ranked genes.
- the method incorporates information about potential microRNA, DNA-binding protein, and RNA-binding protein regulators of genes identified by the clinical ranking parameters.
- the genes identified by the analysis are potential targets or members of cellular pathways or processes that may be effective therapeutic targets in treating or curing the disease of interest (e.g., POAG). More particularly, disease onset, progression, severity, and/or recurrence can be addressed. Currently, for example, there is no cure for POAG and the only treatment is reduction of pressure in the eye to slow disease progression. Many variants found are in regulatory regions of genes and may control production of mRNA and/or protein. Molecules that bind to DNA or RNA at sites disrupted or altered by variants are further therapeutic targets.
- a key advantage in at least some embodiments is that a patient can receive earlier treatment for the disease such as POAG by use of the methods, screenings, and predictions described herein.
- Another key advantage in at least some embodiments is that a patient can receive more personalized or particular treatment for the disease such as POAG by use of the methods, screenings, and predictions described herein.
- the medical community is provided with a method to identify the genetic changes in a genome that are associated with a disease state, where those changes are not findable by standard GWAS or exome analysis methods.
- the newly identified sites provide a new patient management tool.
- the approach described and claimed herein for glaucoma did find several genes previously associated with glaucoma, which puts new focus on those genes. Within those genes, the approach found sites that were not previously found in other studies because those studies focused on marker sites, whereas the presently described and claimed methods focus on finding causal sites inside the genes. Even further, it was found that the general population frequencies of sites associated with glaucoma varied from very rare (<0.01) to very common (nearly 0.50).
- microRNAs in optic nerve differed from microRNAs in retina and even optic disc. This was a large surprise because the optic nerve comprises axons of retinal ganglion cells whose nuclei are within the retina.
- because the resulting sites passed clinical utility thresholds, they can be used directly for biomarker tests.
- the patient frequency of each final site ranges from 0.18 to 0.98 with an average of 0.55. That is, large numbers of the HPG patients in which an allele was measured harbored each variant allele. The final sites are thus worth a clinician's time to consider and use in planning a patient's treatment.
- RNA expression data were gathered herein for assessing sites found through our analysis.
- Surgical skill is required for the fine dissection of ocular tissues to find and harvest distinct tissues, e.g., optic nerve vs. optic disk, optic disk vs. retina and trabecular meshwork vs. iris and choroid.
- computational skill is required to analyze and interpret sequence reads obtained from tissue RNA and to note differential expression of genes and of the microRNAs that control availability of those genes to make protein.
- the complementary and necessary surgical and computational skill resulted in assembly of a glaucoma-specific gene expression catalog which is and will continue to be a critical component to assess variants over-represented in HPG patients.
- the group of patients are checked for relatives (e.g., brother and sister in the patient cohort), repeated patients (e.g., a patient who moved from one study center to another), and population stratification (e.g., a number of patients with Mexican ancestry among Caucasian patients recruited from a southern state).
- Population features are corrected by eliminating subjects from the cohort or applying statistical corrections.
- This procedure generates a list of markers that each point to a nearest gene or genes. Each plurality of markers near a given gene is subjected to additional statistical analysis to identify the gene as associated with disease. As multiple studies of the same disease are published, meta-analysis can be performed, in which case cohorts are combined, as are control cohorts; the larger numbers of cases and controls confer additional discovery power.
- (1) Markers are chosen for the measurement platform to cover the genome evenly and completely. They do not indicate cause. (2) Markers may be over- or under-represented in the cases. Under-representation (Odds Ratio (OR) < 1) indicates that a causal variant is likely to be nearby and over-represented in patients by virtue of being on a different version of the gene, i.e., a different haplotype. (3) Measured markers are restricted to known variants and may be restricted to those with general population frequency >0.05, depending on the platform, so variants rare in the population remain unmeasured. They can be inferred through statistical analysis of deeply sequenced genomes from general populations and assessment of local recurring combinations of markers (a process called imputation).
- Genome sequencing aims to identify variants in a person's genome through direct DNA sequencing and assembly of DNA reads into contiguous stretches.
- Some considerations of this include: (1) 30x coverage leaves random areas sparsely covered, so 100x is generally used for clinical purposes, more than tripling the cost to ~$10,000. (2) Rearrangements and repeats are more numerous between genes and make data analysis for variant discovery more complex.
- Exome sequencing uses DNA capture technology to sequence only the parts of genes that make molecules used in cells, e.g., exons that are protein coding or generate functional non-coding RNAs after an RNA transcribed from the genome has been spliced.
- Captured exonic DNA is sequenced and mapped to a reference genome to find differences between a person's genome and the reference.
- the resulting variants may be causal of disease and are subjected to filtering to identify causal variants.
- Standard filters reject intronic and intergenic sites as off-target.
- Successful exome searches have focused on novel variants new in a small number (e.g., 10) of patients with disease, as in [22].
- one advantage for at least some embodiments is that every variant detected in one or more patients is considered for disease association.
- standard GWAS or exome analysis requires variant alleles to be found in a larger number of patients.
- Another advantage for at least some embodiments is that statistical analysis is applied to sites observed in 25 or more patients, and each site is statistically tested based on its number of observations in the patient cohort.
- standard GWAS methods require uniform numbers of observations for all sites tested, e.g., measurement in 95% of cases and controls.
- frequencies calculated from patients are compared to more than one available reference population.
- frequencies measured in HPG patients are compared with 1000 Genomes, Phase 1, since it is the most broadly used in the community, and then against the more recent release 1000 Genomes, Phase 3, with restriction to the subset of subjects of similar ancestry, and then against the Exome Sequencing Project, again with restriction to similar ancestry.
- standard GWAS uses control cohorts measured along with the case cohorts; GWAS meta-analysis combines case cohorts for multiple studies into one and compares with one combined control cohort.
- Another advantage for at least some embodiments is that since the majority of sites measured in patients are concordant with general population frequencies, outliers are identified in two steps that are clinically motivated rather than statistically motivated.
- an absolute difference threshold is applied (>0.10, in the example). This recognizes the clinical motivation that in a well-phenotyped patient population that harbors genetic causes of disease, the disease-causing variants should occur at vastly higher frequency than in general populations. This is in contrast to findings in GWAS studies, where frequency deviations may be as small as 2% but have strong p-values. By restricting sites to those with large differences, final sites will be clinically significant.
- GWAS and meta-analysis identify outliers based on p-values and genome-wide significance thresholds, thus accepting as disease-associated variants that do little to explain disease and with little or no clinical utility.
- Another advantage of at least some embodiments is that false positives are minimized through a novel series of filters so that variant detection can be more sensitive. As a result, more variants, including many deep inside introns or upstream of genes in promoter regions can be considered for relationship to disease. Problematic variants are identified in two steps.
- mapping bias is identified directly and captured as two exclusion lists. These lists hold sites for which (i) the reference base is the minor allele in the reference genome used for mapping; and (ii) the alternate allele found in patients is also the minor allele in general populations. In the example, these two exclusion lists eliminated from further consideration 1,188,903 and 127,620 variant sites, respectively.
- Every candidate variant site is screened against a constructed list of sites genome-wide that have anomalies within the genome region. Such anomalies can introduce false positive variant calls.
- the approach here relies on three exclusion lists that were constructed to implement three sequence-based filters. These lists hold sites computed to occur within 100-200 bases with (i) GC/AT bias; (ii) replicates elsewhere in the genome; and (iii) tandemly repeated motifs. In the example, the exclusion lists were used to reject 77,149 sites within regions of GC/AT bias, 56,905 sites within sequences repeated elsewhere in the genome, and 124 sites with tandem repeats.
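As an illustration of the first sequence-based filter above, the sketch below flags a site whose flanking window has skewed G+C content. The function names, the 100-base flank (taken from the low end of the 100-200 base range mentioned above), and the 0.30/0.70 cutoffs are assumptions; the text does not state the thresholds used.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of called bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def has_gc_at_bias(genome: str, site: int, flank: int = 100,
                   low: float = 0.30, high: float = 0.70) -> bool:
    """True if the window around a 0-based site is GC- or AT-skewed.

    The low/high cutoffs are hypothetical values for illustration.
    """
    window = genome[max(0, site - flank): site + flank + 1]
    fraction = gc_fraction(window)
    return fraction < low or fraction > high
```

Sites for which `has_gc_at_bias` returns True would be added to the corresponding exclusion list. The repeat-based filters would need separate logic.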
- GWAS studies are limited to sites represented on commercial genotyping platforms and do not include variants novel in a patient, and exome studies are limited to sites with uniformly deep coverage across the exome.
- variants that cause chronic, systemic diseases in the general population at rates higher than, say, 1%, i.e., common diseases. Such variants are unlikely to be novel within patient populations. Otherwise the disease would be far less common. However, combinations of lower frequency variants may together explain disease across a patient population. Here, variants are considered for disease association regardless of their frequency in general populations, and all variants detected in patients are considered.
- the source material sequences of use in the present methods have been sequenced with high fidelity, e.g., the sequences determined with 4 or fewer mismatches per 100 bases, e.g., with 4, 3 or 2 or fewer mismatches per 100 bases.
- Table 2 provides a summary of steps that can be taken in the inventive methods for the preferred embodiment of POAG.
- One skilled in the art can vary the order of steps as needed for a particular application.
- One skilled in the art also can eliminate one or more steps as needed for a particular application.
- One or more technical, clinical, gene-based, and/or statistical constraints listed in Table 2 are applied for the selection of genes associated with or causative of a disease condition.
- sites are excluded if the base in the hg19 reference genome was the minor allele base in 1000G.
- sites are included only if the alternate allele remained the minor allele in general populations of similar ethnic descent as the patient cohort.
- Sixth, sites found to have more than one alternative base are set aside for future consideration.
- Seventh, eighth and ninth, sites are restricted to those in genome regions with balanced G+C and A+T content; located outside low complexity regions; and located in genome regions without nearly identical, e.g., within 95% identity, paralogs elsewhere.
- Tenth, any sites located on the X-chromosome or the Y-chromosome are unlikely to contribute to a target disease (e.g., high pressure glaucoma) unless the disease has a clear gender predilection, and therefore can be excluded (e.g., limit selection to genes expressed from chromosomes 1-22). See, Ederer, et al, 1994 [23]. Thus sites on the X and Y chromosomes are excluded from further analysis.
- a SNP site must be observed in enough patients to calculate its importance in disease. Because sequencing does not always capture a given site in all samples, the denominator for frequency calculation for a SNP site becomes twice the number of samples with reads at that site. In varying embodiments, sites are excluded from consideration if they are measured in fewer than 25 patients. Twelfth, a genomic aberration is not likely to be important as a primary cause of a target disease (e.g., high pressure glaucoma) if it occurs with frequency close to that in the normal population.
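The frequency denominator described above can be written out directly. The function name `patient_allele_frequency` is hypothetical; the factor of two reflects two alleles per sample at an autosomal site, and the default of 25 is the minimum patient count given in the text.

```python
def patient_allele_frequency(alt_allele_count: int,
                             samples_with_reads: int,
                             min_patients: int = 25):
    """Alternate-allele frequency at one site, or None if under-measured.

    The denominator is twice the number of samples with reads at the
    site (two alleles per sample), not twice the full cohort size.
    """
    if samples_with_reads < min_patients:
        return None  # site excluded from consideration
    return alt_allele_count / (2 * samples_with_reads)
```

For example, 30 alternate alleles observed across 50 samples with reads gives 30 / 100 = 0.3, whereas a site measured in only 20 patients is excluded.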
- sites with patient frequencies within measurement error, e.g., 0.05, of the 1000 Genomes Phase 1 general population frequency are set aside, as are sites with patient frequencies within measurement error of the European subset of the 1000 Genomes Phase 3 subjects.
- sites with patient frequencies within measurement error of the European subset of the Exome Sequencing Project (ESP) are set aside.
- SNP sites with allele frequencies of greater than the prevalence of the target disease, e.g., high pressure glaucoma, which occurs in about 2 to 4% of the adult general population
- sites are kept if their patient allele frequency substantially exceeds general population frequency, e.g., by 0.10 or greater in any adult general population used for comparison.
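A sketch of the two clinically motivated frequency checks, using the example values from the text (0.05 measurement error, 0.10 excess). The function `classify_site`, the dictionary layout, and the label strings are hypothetical. Checking the "set aside" rule before the "kept" rule is an assumed ordering.

```python
def classify_site(patient_freq: float, population_freqs: dict,
                  measurement_error: float = 0.05,
                  min_excess: float = 0.10) -> str:
    """Compare a patient allele frequency with reference populations.

    population_freqs maps a reference population name (e.g., a
    1000 Genomes or ESP subset) to its allele frequency at the site.
    """
    freqs = list(population_freqs.values())
    # Set aside sites that match any reference population within error.
    if any(abs(patient_freq - f) <= measurement_error for f in freqs):
        return "set_aside"
    # Keep sites that exceed any reference population by min_excess.
    if any(patient_freq - f >= min_excess for f in freqs):
        return "kept"
    return "not_kept"
```

Differences that land exactly on a cutoff are subject to floating-point rounding, so a production version would likely compare with a small tolerance.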
- two gene-based criteria are applied. Fourteenth, sites outside of a gene or regulatory regions influencing its expression as RNA or protein are excluded from further analysis as off target. Fifteenth, sites within or near genes expressed in tissues relevant to disease are retained.
- odds ratio and confidence interval are calculated for each site based on the number of patients in whom the site was measured, the number of alternate alleles observed, and the number of measured and alternate alleles in the 1000G Phase 3 database. Sites with a 95% odds ratio confidence interval lower bound above 1.0 are retained. Seventeenth, sites are further retained if their frequency in patients is above a statistical fit of a line to datapoints where x is reference general population frequency and y is patient frequency. In some embodiments, the fit is performed with a least-squares linear estimate function. Eighteenth, a 2x2 statistical test is applied to obtain p-values. In some embodiments, Fisher's Exact Test is used.
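The odds ratio and line-fit steps above can be sketched with the standard library alone. The text does not say how the confidence interval is computed; the log-odds (Woolf) interval with a 0.5 continuity correction for empty cells is an assumption, as are the function names and argument layout.

```python
import math


def odds_ratio_ci(patient_alt, patient_total, ref_alt, ref_total, z=1.96):
    """Odds ratio and approximate 95% CI from allele counts.

    *_total is the number of measured alleles; *_alt the alternate alleles.
    """
    a, b = patient_alt, patient_total - patient_alt
    c, d = ref_alt, ref_total - ref_alt
    if min(a, b, c, d) == 0:
        # Continuity correction so the ratio and its log are defined.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def sites_above_fit(points):
    """Indices of sites above the least-squares line y = slope*x + b.

    points: list of (population_freq, patient_freq) pairs.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("population frequencies must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [i for i, (x, y) in enumerate(points)
            if y > slope * x + intercept]
```

A site would be retained at the sixteenth step when the returned lower bound exceeds 1.0.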
- a significance threshold is calculated for each measurement group.
- the Bonferroni formula (0.05/N) is used to calculate the threshold maximum p-value to determine significance under multiple testing. SNP sites passing these constraints indicate genes important in the target disease (e.g., high pressure glaucoma, ocular diseases and disorders, Alzheimer's, Parkinson's, Prion Disease (PRNP) and other misfolded protein diseases).
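A sketch of the 2x2 test and the Bonferroni threshold (0.05/N) using only the standard library. The function names are hypothetical. The two-sided p-value sums the probabilities of all tables that are no more likely than the observed one, which is a common convention but an assumption here, since the text does not specify sidedness.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    For example: a, b = alternate and reference alleles in patients;
    c, d = alternate and reference alleles in the reference population.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x alternate alleles in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        p for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Maximum p-value deemed significant when n_tests sites are tested."""
    return alpha / n_tests
```

A site in a measurement group of N tested sites would pass when `fisher_exact_two_sided(...) < bonferroni_threshold(N)`.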
- Embodiments disclosed herein also relate to apparatus for performing these operations.
- This apparatus may be specially constructed for the required purposes, or it may be a general-purpose computer (or a group of computers) selectively activated or reconfigured by a computer program and/or data structure stored in the computer.
- a group of processors performs some or all of the recited analytical operations collaboratively (e.g., via a network or cloud computing) and/or in parallel.
- a processor or group of processors for performing the methods described herein may be of various types including microcontrollers and microprocessors such as
- certain embodiments relate to tangible and/or non-transitory computer readable media or computer program products that include program instructions and/or data (including data structures) for performing various computer-implemented operations. See, for example, WO 2014/080323 for use of non-transitory computer readable or storage media in the genomic context.
- Examples of computer-readable media include, but are not limited to, semiconductor memory devices, magnetic media such as disk drives, magnetic tape, optical media such as CDs, magneto-optical media, and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM) and random access memory (RAM).
- the computer readable media may be directly controlled by an end user or the media may be indirectly controlled by the end user. Examples of directly controlled media include the media located at a user facility and/or media that are not shared with other entities.
- Examples of indirectly controlled media include media that is indirectly accessible to the user via an external network and/or via a service providing shared resources such as the "cloud.”
- Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
- the data or information employed in the disclosed methods and apparatus is provided in an electronic format.
- Such data or information may include reads and tags derived from a nucleic acid sample, counts or densities of such tags that align with particular regions of a reference sequence (e.g., that align to a chromosome or chromosome segment), reference sequences (including reference sequences providing solely or primarily polymorphisms), counseling recommendations, diagnoses, and the like.
- data or other information provided in electronic format is available for storage on a machine and transmission between machines.
- data in electronic format is provided digitally and may be stored as bits and/or bytes in various data structures, lists, databases, etc. The data may be embodied electronically, optically, etc.
- 3. Identified Biomarkers Causing Onset or Affecting Progression of Primary Open Angle Glaucoma (POAG) or high pressure glaucoma (HPG)
- biomarkers including genes and microRNAs, determined to be associated with and/or causative of POAG and/or HPG are provided in Tables 4, 5, and 6.
- the alternative (ALT) allele is associated with disease.
- Tables 5 and 6 summarize microRNAs that are overexpressed or underexpressed in tissues from patients having POAG and/or HPG.
- expression of any of the listed biomarkers in Tables 4, 5, and 6 can be determined in the various ocular tissues, including without limitation trabecular meshwork (TM), ciliary body (CB), choroid (CH), optic disk (OD), optic nerve (ON) and retina (RT). Methods known in the art can be used to determine expression levels.
- the POAG/HPG associative and/or causative genes discovered herein can be evaluated and/or monitored with genes known to be associated with and/or causative of glaucoma and/or other eye diseases.
- Prior genome-wide association and linkage-based studies have identified loci with contribution to glaucoma including myocilin, CYP1B1, optineurin, WDR36, TBK1, TBK2, and GALC.
- Loci contributing to POAG found through GWAS include TMCO1, CAV1/CAV2, CDKN2B-AS1, SIX1/SIX6, TXNRD2, ATXN2, FOXC1, an 8q22 intergenic region, and GAS7.
- Loci associated with optic disk area, a phenotype relevant to POAG include
- Loci associated with vertical cup to disk ratio (CDR), a useful measurement to monitor progression of optic neuropathy in POAG, include SCYL1/LTBP3, CHEK2, ATOH7, DCLK1, SIX1/SIX6, CDKN2A/B, and CDKN2B-AS1.
- CCT central corneal thickness
- Several genes are strongly associated with central corneal thickness (CCT), including FOXO1, COL5A1, ZNF469, AKAP13, AVGR8, and COL8A2; however, recent genetic studies indicate CCT may not be directly associated with susceptibility to POAG.
- Molecular studies of differential gene expression in tissues relevant to glaucoma revealed genes up- or down-regulated in trabecular meshwork, lamina cribrosa, and optic nerve head astrocytes from glaucomatous eyes compared to eyes without disease.
- OMIM database of diseases and genes maintained at NCBI aims to provide a comprehensive list of disease-related genes for all human diseases.
- OMIM lists 29 genes indirectly related to glaucoma: APOE, BEST1, BMP4, CA12, CANT1, CNTNAP2, CRB1, EPO, FOXE3, FOXL2, GJA1, GLIS3, ISPD, LMX1B, LOXL1, MTHFR, PAX6, PEX5, PITX2, PITX3, POMT1, RPS19, RRM2B, SLC4A4, TDRD7, TGFB2, TNF, and TTR, as well as TMCO1 listed above.
- the National Eye Institute's EyeGene project maintains a database of genes involved in any eye disease and their variants causing disease.
- One skilled in the art can combine prior art knowledge with the inventive features described and claimed herein to address disease.
- Another important aspect is a method for predicting onset and/or progression and/or severity and/or recurrence of disease (e.g., primary open angle glaucoma (POAG)) in a subject, the method including receiving allelic information and/or expression levels of a collection of signature biomarkers from a biological sample taken from the subject suspected of developing or suffering a disease such as POAG, wherein said collection of signature biomarkers comprises one or more genes and/or microRNA selected from a group developed using the methods described herein.
- POAG primary open angle glaucoma
- kits can be used for testing of subjects.
- HPG high-pressure POAG
- The DNA samples for this study are a subset of the de-identified samples from patients enrolled in the NEIGHBOR GWAS.
- Patients with primary open angle glaucoma (POAG) were enrolled in NEIGHBOR after confirmation of reliable visual field (VF) tests with characteristic defects on two or more tests, or with a single qualifying VF test accompanied by a vertical cup-disc ratio of 0.7 or more in at least one eye.
- Examination of the ocular anterior segment disclosed no signs of secondary causes for elevated IOP.
- the approach to the filtration structures in the anterior chamber angle was wide open on gonioscopic examination. All patients selected for the present study had a documented, confirmed history of IOP >22 mm Hg and were classified as HPG [8,27].
- Paired DNA sequences (readpairs) of length 100 bases (2x100) were determined for enriched DNA to generate a minimum of 50 million readpairs per sample.
- The hg19 reference genome 14 contains 21,210 genes with HUGO identifiers and 464,698 exons annotated in the Refseq database at NCBI.
- the Nimblegen V2 probes were designed to cover 44,070,352 bases in 392,771 Refseq exons and 18,804 genes with HUGO identifiers.
- the Nimblegen V3 probes were designed to cover an expanded target region with 64,148,113 bases in 410,269 exons and 19,721 genes.
- FIG. 1 illustrates the read mapping strategy. Mapped reads were converted from a text-based sequence alignment/map (SAM) format to a binary (BAM) format with Samtools [30]. [0109] Sequence data quality filtering and genotyping. The BAM files for each sample were reviewed to determine whether reads were sufficient to determine genotypes at variant sites across the targeted capture regions. Any sample with insufficient breadth of coverage was excluded from further analysis. This yielded 295 samples with sufficient sequencing (Table 1). Each remaining BAM file was treated as follows: All sequence data were analyzed with respect to the forward strand of the hg19 reference genome. The Samtools "pileup" algorithm 16 was called to extract bases from reads at every sequenced site to produce a list of bases ("pileup") and a consensus base at each site. Each pileup was separated into evidence agreeing with the hg19 reference base and evidence for an alternate base at that site.
- reads were required to be from both forward and reverse DNA strands, with at least three high quality reads per base for the genotype to be considered heterozygous (two or more differing nucleotides) or four high quality reads to be considered homozygous (two copies of one nucleotide).
- The ratio of reads supporting each nucleotide had to be between 0.5 and 2, indicating the reads were balanced between both chromosomes. If this analysis found evidence that supported either the hg19 reference or an alternate base yet did not meet the criteria for a call, the site was designated as "no call" for the sample, and the observation of the site in the patient flagged as
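The calling rules above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function name `call_genotype`, the `(base, strand)` input layout, and the reading of the thresholds (a base counts only if it is seen on both strands, each base of a heterozygous call needs at least three high quality reads, a homozygous call needs at least four) are assumptions made for the example.

```python
from collections import Counter


def call_genotype(pileup, min_het=3, min_hom=4, ratio_lo=0.5, ratio_hi=2.0):
    """Return a genotype as a 2-tuple of bases, or None for "no call".

    pileup: list of (base, strand) pairs for the high quality reads at one
    site, with strand given as "+" or "-" (hypothetical input layout).
    """
    counts = Counter(base for base, _ in pileup)
    strands = {}
    for base, strand in pileup:
        strands.setdefault(base, set()).add(strand)

    # Assumption: a base is usable only if reads on both strands support it.
    both = {b: n for b, n in counts.items() if strands[b] >= {"+", "-"}}
    ranked = sorted(both.items(), key=lambda kv: (-kv[1], kv[0]))

    # Heterozygous candidate: a second base with enough supporting reads.
    if len(ranked) >= 2 and ranked[1][1] >= min_het:
        (b1, n1), (b2, n2) = ranked[:2]
        if ratio_lo <= n1 / n2 <= ratio_hi:
            return tuple(sorted((b1, b2)))
        return None  # evidence for two bases, but the reads are unbalanced

    # Homozygous candidate: one base with enough supporting reads.
    if ranked and ranked[0][1] >= min_hom:
        return (ranked[0][0], ranked[0][0])
    return None  # evidence present but below the calling criteria


reads = [("A", "+"), ("A", "-"), ("A", "+"), ("A", "-"),
         ("G", "+"), ("G", "-"), ("G", "+")]
print(call_genotype(reads))
```

Treating a stray second base with fewer than three reads as noise is a simplification chosen for the sketch; a stricter reading would return "no call" whenever any conflicting evidence is present.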
- IOP treated intraocular pressure
- the table included every site observed with an allele call different from the reference genome in at least one patient. SeattleSeq returned annotations for each site with gene names, dbSNP database identifiers for known SNPs, whether a SNP changes a protein amino acid, likely impact of the change on the protein using the PolyPhen2 and SIFT2 algorithms [32,33,34], distance to nearest exon-intron splice site, distance to stop codon for SNPs in untranslated regions, distance to nearest gene for intergenic SNPs, relative conservation of DNA around the SNP across mammalian genomes, and any known clinical or disease association.
- the annotations were added to the Master Variant Table to support further analysis and search for genes associated with HPG.
- Sites were excluded from consideration if they were measured in fewer than 25 patients. Twelfth, sites with patient frequencies within measurement error, e.g., 0.05, of the 1000 Genomes Phase 1 general population frequency were set aside, as were sites with patient frequencies within measurement error of the European subset of the 1000 Genomes Phase 3 subjects. Likewise, sites with patient frequencies within measurement error of the European subset of the Exome Sequencing Project (ESP) were set aside. Thirteenth, since POAG occurs in about 2 to 4% of the adult general population, sites were kept if their patient allele frequency substantially exceeded general population, e.g., by 0.10 or greater in a comparison adult general population.
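The frequency-based filters just described can be written as a single predicate. The sketch below is illustrative only: the function name `keep_site`, the dictionary of comparison populations, and the choice to require the 0.10 excess against every supplied population are assumptions; the 25-patient minimum and the example thresholds of 0.05 and 0.10 come from the text.

```python
def keep_site(n_patients_measured, patient_freq, population_freqs,
              min_patients=25, measurement_error=0.05, min_excess=0.10):
    """Decide whether a variant site survives the frequency filters.

    population_freqs maps a dataset label (hypothetical keys) to the
    alternate allele frequency in that comparison population.
    """
    # Sites measured in too few patients are excluded.
    if n_patients_measured < min_patients:
        return False
    for freq in population_freqs.values():
        # Set aside sites whose patient frequency is within measurement
        # error of a comparison population.
        if abs(patient_freq - freq) <= measurement_error:
            return False
        # Keep only sites whose patient frequency substantially exceeds
        # the comparison population (assumed here: every population).
        if patient_freq - freq < min_excess:
            return False
    return True


populations = {"1000G_phase1": 0.12, "1000G_EUR": 0.15, "ESP_EA": 0.14}
print(keep_site(40, 0.30, populations))
```

With the example thresholds the 0.10 excess test subsumes the 0.05 measurement-error test; both are kept so that the two steps in the text remain visible and can be tuned separately.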
- ESP Exome Sequencing Project
- Constraint 2 Of these, 2,748,984 were variant in 3 or more HPG patients (Constraint 3). Some of the sites in the reference genome had the minor allele in the comparison database, 1000G, potentially causing reference bias during analysis, and were eliminated from consideration; 1,560,081 sites had the major allele as the reference base (Constraint 4). For some sites, the alternate allele, although minor in the 1000G Phase 1 general population, became the major allele in the European population; these sites were eliminated, yielding 1,432,461 sites (Constraint 5). Next, 1,423,956 of the sites remaining after the previous constraint had no more than one alternate allele in the HPG patients (Constraint 6).
- 1,350,492 had balanced G+C content (Constraint 7); 1,350,455 were located outside low complexity regions (e.g., tandem repeats) (Constraint 8); and 1,302,588 had no identical or nearly identical paralogs (Constraint 9). After restricting sites to Chromosomes 1 - 22 (Constraint 10), 1,279,295 sites remained. [0123] Second, a series of five constraints based on clinical criteria were applied as prerequisites for association with disease. The number of sites fell to 455,413 when restricted to those measured in at least 25 of the HPG patients (Constraint 11).
- HPG patients and 107 (67%) each occurred in at least 50 of the HPG patients. Due to fluctuation in DNA capture efficiency, sites located in introns farther from exon splice sites tended to have smaller numbers of observations.
- the 160 SNP sites are found in 140 genes. While 12 genes contained 2 SNP sites and 4 genes contained 3 SNP sites, 124 of the 140 genes contained a single SNP site. The genes are distributed across the genome. See, Tables 3 and 4. The nomenclature and sequence identification of these genes and other biomarkers described herein are known in the art and incorporated herein by reference (e.g., HUGO Gene Nomenclature Committee, National Center for Biotechnology Information, NCBI; GenBank accession numbers).
- a. SNPs per gene, b. location in gene, c. codon effect, d. distance to splice site, e. proximal SNPs within genes, f. proximal SNPs in adjacent genes, g. genes with functions relevant to glaucoma, h. prior glaucoma related genes, i. glaucoma related and relevant functions.
- HPG high pressure glaucoma
- Chromosome Chromosome; REF, hg19 reference base; ALT, alternate base observed in HPG patients; dbSNP, NCBI identifier for SNP site; missense, site position in codon, amino acid changes in sequence translated from mRNA upon replacement of REF base with ALT base; synonym, site position in codon, no change in amino acid sequence translated from mRNA upon replacement of REF base with ALT base; utr-3p, transcribed but untranslated region (UTR) of mRNA in final (3') exon; utr-5p, UTR in first (5') exon; utr-NC, UTR in internal exon; SS DIST, distance to splice site; OR, odds ratio; Conf. Int., confidence interval; pValue, probability that HPG and KG allele distributions are not different.
- microRNAs differentially regulated in glaucoma optic nerve (GON) vs normal optic nerve (ON) and targeting HPG genes, with microRNA name and the mature arm with strongest differential expression.
- Groups 1 and 2: 11 microRNA elevated in GON.
- Groups 3 and 4: 11 microRNA decreased in GON.
- Groups 5 and 6: 16 microRNA present in ON and absent or very low in GON. microRNA names, miRbase [38], Ambros, et al., 2002 [39].
- RT, RT n, retina; ON, ON n, optic nerve; GON, ON g, glaucomatous optic nerve; A » B, "A significantly higher than B".
- microRNAs differentially regulated in glaucoma vs normal optic nerve and targeting HPG genes, with microRNA name and the mature arm with strongest differential expression, evaluated through maximum and total expression levels.
- Group 1: 13 microRNA elevated in GON, lower or absent in RT.
- Group 2: microRNA decreased in GON, lower in RT.
- RT, RT n, retina; ON, ON n, optic nerve; GON, ON g, glaucomatous optic nerve; A » B, level in A significantly higher than B.
- Inhibitory nucleic acids or small inhibitory nucleic acids can be used in therapy treatments in combination with measurement of expression levels.
- Tables 5 and 6 list microRNA differentially expressed in glaucomatous optic nerve (GON) versus normal optic nerve (ON or NON). microRNA underexpressed in GON can be neuroprotective when administered to a glaucoma patient. Targeting microRNA
- microRNA overexpressed in GON can be pathological and thus targeted, e.g., with an inhibitory nucleic acid, in a glaucoma patient; microRNA overexpressed in NON can be neuroprotective when administered to a glaucoma patient.
- POAG Primary open-angle glaucoma
- IOP intraocular pressure
- aqueous humor dynamics They have hampered outflow from the eye of the nutrient-containing aqueous humor. This is associated with nearly constant rate of aqueous production, no matter what the steady state IOP.
- Sustained, above-normal levels of IOP constitute the largest risk factor for developing characteristic damage to visual function, the clinical basis for glaucoma diagnosis. This damage affects the retinal ganglion cells, their axons, and the optic nerve in a diagnostic manner.
- MYOC Myocilin
- This method provides a path to a list of associated, potentially causative disease genes that can be used to predict onset, progression, severity, or recurrence of disease after treatment. Additional work will require assessment of the role of candidate genes in the anterior and posterior segments of the eye. Further, the sites and their genes can be considered in doublets or higher numbers of interacting mutations that affect the eye and cause HPG. [0144] This investigation identified, and categorized, SNP-containing genes present in unusually high frequency in HPG patients compared with the general population.
- AGIS Advanced Glaucoma Intervention Study
- PubMed PMID 22570617; PubMed Central PMCID: PMC3343074. Fan BJ, Wang DY, Pasquale LR, Haines JL, Wiggs JL. Genetic variants associated with optic nerve vertical cup-to-disc ratio are risk factors for primary open angle glaucoma in a US Caucasian population. Invest Ophthalmol Vis Sci. 2011 Mar 28;52(3):1788-92. doi: 10.1167/iovs.10-6339. PubMed PMID: 21398277; PubMed Central PMCID: PMC3101676.
- CDKN2B-AS1. Nat Genet. 2011 Jun;43(6):574-8. doi: 10.1038/ng.824. Epub 2011 May 1. PubMed PMID: 21532571.
- PubMed PMID 25852444
- PubMed Central PMCID PMC4369115.
- Glaucoma Intervention Study (AGIS): 1. Study design and methods and baseline characteristics of study patients. Control Clin Trials. 1994 Aug;15(4):299-325. PubMed PMID: 7956270.
- mirBase, mirbase.org microRNA identifiers with matures sequences from
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Physics & Mathematics (AREA)
- Organic Chemistry (AREA)
- Genetics & Genomics (AREA)
- Analytical Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Biotechnology (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Ophthalmology & Optometry (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Medical Informatics (AREA)
- Evolutionary Biology (AREA)
- Pathology (AREA)
- General Engineering & Computer Science (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Medicinal Chemistry (AREA)
- Veterinary Medicine (AREA)
- Public Health (AREA)
- Animal Behavior & Ethology (AREA)
- Pharmacology & Pharmacy (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- General Chemical & Material Sciences (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Provided are methods of identifying biomarkers that cause or promote progression of disease by exome sequencing. The disease genes are selected based on the frequency of a possible disease allele in patients; the disease allele being the minor allele; the allele being outside a low complexity region; the polymorphism influencing the expression of the gene; the polymorphism being near a gene expressed in the tissue influenced by the disease; and a significant correlation to disease after correction for multiple testing. The successful application of the methods is demonstrated by the identification of biomarkers associated with and/or causative of the onset and/or progression and/or severity and/or recurrence of glaucoma and primary open angle glaucoma (POAG). Many of these biomarkers were not previously associated with glaucoma or POAG. Predictive methods are also described, as well as applications in prognosis, diagnosis, and therapy. Testing for onset, progression, severity, and/or recurrence can be carried out. A key advantage in at least some embodiments is that a patient can receive earlier treatment for the disease such as POAG by use of the methods, screenings, and predictions described herein. Another key advantage in at least some embodiments is that a patient can receive more personalized or particular treatment for the disease such as POAG by use of the methods, screenings, and predictions described herein.
Description
METHODS OF IDENTIFYING BIOMARKERS ASSOCIATED WITH OR CAUSATIVE OF THE PROGRESSION OF DISEASE, IN PARTICULAR FOR USE IN PROGNOSTICATING PRIMARY OPEN ANGLE GLAUCOMA
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 61/988,202 filed on May 3, 2014, which is hereby incorporated herein by reference in its entirety for all purposes.
STATEMENT OF GOVERNMENTAL SUPPORT
[0002] This invention was made with government support under Grant Nos. EY020678 and EY022306, awarded by the National Eye Institute, National Institutes of Health. The government has certain rights in the invention.
FIELD
[0003] Provided are methods of identifying genes that cause or promote progression of disease.
BACKGROUND
[0004] Many systematic, chronic types of diseases exist for which better diagnoses and treatments are needed, including the disease of glaucoma in its various forms. In glaucoma, progressive optic nerve degeneration often causes progressive, irreversible visual impairment, and potential blindness. Glaucoma is one of the most prevalent causes of blindness in the United States. Types of glaucoma can be grouped as open-angle, angle closure, and secondary. It is estimated that in the United States in 2010, of those over age 40, open-angle glaucoma affected nearly 2.8 million people, and worldwide caused bilateral blindness in more than 4.4 million people [1]. Primary open-angle glaucoma (POAG) is the more frequent form of the disease in the United States, affecting nearly equal numbers of men and women [2]. Treatment to lower the intraocular pressure (IOP) inhibits progression of vision loss from glaucoma; yet it is not always totally successful, and it seldom reverses established damage [3,4]. Because treatment inhibits progression of visual function damage, early detection is important.
[0005] People with a first-degree relative with POAG have double, or greater, risk of developing the disease [5,6]. A small number of identified genes clearly underlie a limited number of glaucoma cases, including some with POAG. Some genes have been noted as involved in open angle glaucoma or neurodegeneration similar to that found in POAG through gene expression studies, model systems, linkage, and genome wide association studies (GWAS). Identification of causative glaucoma-associated genes is key to risk prediction, early detection, and eventual curative intervention. A major risk factor for visual system damage in POAG is elevated IOP arising from abnormal fluid dynamics in the eye, yet glaucomatous optic nerve degeneration occurs in the presence of normal IOP in about half of cases [7]. Of Caucasian POAG patients enrolled in the meta-analysis of the combined Genetic Etiologies of Primary Open-Angle Glaucoma (GLAUGEN) and National Eye Institute Glaucoma Human Genetics Collaboration (NEIGHBOR) GWAS, 1669 cases had IOP >22 mmHg before treatment, and 720 had IOP <22 mmHg [8]. Genetic observations in these patients hint at the genetic complexity of POAG. Tissues that participate in aqueous dynamics, and thus IOP, are in the front of the eye while the retina and optic nerve, where vision damage occurs, are in the back of the eye. Both are involved in high pressure POAG or high pressure glaucoma (HPG). Thus, it makes sense to search broadly in the genome and across tissue systems for genetic explanations.
[0006] Association and linkage-based glaucoma genetics studies have identified loci contributing to susceptibility to glaucoma or to phenotypic features associated with risk of glaucoma, for example, large optic discs [9]. Genes including myocilin, CYP1B1, and optineurin lead to early onset, juvenile, or congenital glaucoma and some cases of adult-onset POAG. Susceptibility alleles in the LOXL1 gene confer risk of exfoliation open-angle glaucoma, where disease is secondary [10]. The NEIGHBOR GWAS found two loci strongly associated with optic nerve degeneration in POAG, CDKN2B-AS1 and SIX1/SIX6 [8]. Other GWAS have reported an association of the CDKN2B-AS1, CAV1/CAV2, TMCO1, and GAS7 loci with POAG [11,12]. Taken individually, these genes explain a limited portion of cases of POAG.
[0007] Additional references which discuss the genetics of glaucoma and POAG include: (1) Nowak et al, Biomed. Research Int'l, 2015, ID258281 [13], (2) Nowak et al, Arch Med. Sci. 6, December 2014, [14] (3) US Patent Publication 2009/0035279, (4) US Patent Publication 2007/0172919, and (5) US Patent Publication 2004/0132795.
SUMMARY
[0008] Briefly, a study of genetics is described and claimed herein wherein, in a preferred embodiment, a genome-wide, targeted sequencing of exons and flanking regions was carried out based on blood-derived DNA from patients with HPG. A new method of constraint-based filtering and analysis based on technical and clinical criteria has been developed and applied. A search, using the single nucleotide polymorphisms (SNPs) found within and near transcribed exons, is described and claimed for potentially causative genes in patients, including patients with genetically complex, chronic diseases such as eye disease, for example glaucoma. In a preferred embodiment, through genomic DNA sequencing and computational search, genome variations with markedly higher occurrence in HPG patients than in general populations have been identified. Of the approximately 25,000 genes encoded in the human genome, this study in its preferred embodiment has identified about 140 genes containing about 160 variants overrepresented in HPG. Unexpectedly, in the preferred embodiment, most of these genes and their variants have not been previously connected with glaucoma.
[0009] In one aspect, provided are methods of identifying genes whose alleles are associative with or causative of the progression of a disease, comprising:
a) sequencing or reviewing multiple exomes from patients who have been diagnosed with the disease and one or more exomes from one or more individuals known not to have the disease, wherein the one or more exomes from one or more individuals known not to have the disease comprise one or more reference exomes;
b) selecting exomes sequenced and read with a fidelity of 4 or fewer mismatches per 100 bases, e.g., fewer than 3 or 2 mismatches per 100 bases;
c) selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease with one or more properties, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 properties, selected from:
i) site variant is found in one or more patients;
ii) site variant is observed in a general population dataset;
iii) site variant is found in three or more patients;
iv) one or more reference exomes have the major allele;
v) site variant is the minor allele in reference exomes;
vi) site variant has only one alternate allele;
vii) site is within genome region with balanced G+C and A+T content;
viii) site is located outside low complexity genome regions;
ix) site is located in genome region with no paralog within 95% identity; and
x) site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased;
xi) site was measured in 25 or more patients;
xii) site variant frequency in patients differs from general populations by more than expected measurement error, e.g., 0.05 (on a frequency scale from 0.00 - 1.00);
xiii) site variant frequency in patients exceeds general populations, e.g., by more than 0.10;
xiv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
xv) site variant is within or near a gene expressed in tissues relevant to disease;
xvi) odds ratio 95% confidence interval lower bound calculated for the site from patient and reference general population frequencies is above 1.00;
xvii) frequency of site variant in patients is above a line fitted to filtered sites represented as datapoints where X is reference general population frequency and Y is patient frequency, e.g., fit with least squares linear regression; and
xviii) a p-value calculated with a 2x2 statistical test, e.g., Fisher's Exact Test, from numbers of alternate and reference alleles observed for the site in patients and in general population remains significant after correction for multiple testing.
[0010] In varying embodiments, selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease is carried out with nine or more properties, or twelve or more properties, or fifteen or more properties, or all eighteen of the properties (i) to (xviii) identified above.
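Properties (xvi) and (xviii) are statistical tests on a 2x2 table of allele counts, and a small worked sketch may make them concrete. The code below is an assumption-laden illustration rather than the method of the disclosure: it uses a Woolf (log-based) 95% confidence interval for the odds ratio, a two-sided Fisher's Exact Test computed directly from hypergeometric probabilities, and a Bonferroni correction as one possible correction for multiple testing. The function names and the example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf 95% CI for the table [[a, b], [c, d]].

    a, b: alternate and reference allele counts in patients;
    c, d: alternate and reference allele counts in the general population.
    All four counts must be nonzero for this simple form.
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact Test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x alternate alleles in patients.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Sum all tables that are no more likely than the observed one.
    total = sum(p for p in map(prob, support) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def significant_after_bonferroni(p_value, n_tests, alpha=0.05):
    """Bonferroni correction (an assumed choice of correction method)."""
    return p_value * n_tests < alpha


# Hypothetical counts: 60 ALT / 140 REF alleles in patients,
# 100 ALT / 900 REF alleles in the comparison population.
or_, lo, hi = odds_ratio_ci(60, 140, 100, 900)
p = fisher_exact_two_sided(60, 140, 100, 900)
print(or_, lo, hi, p, lo > 1.0, significant_after_bonferroni(p, n_tests=1000))
```

A site would satisfy property (xvi) when the lower bound `lo` is above 1.00, and property (xviii) when the corrected p-value remains below the chosen significance level; the value of `n_tests` would be the number of sites tested, which is set arbitrarily here.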
[0011] In a further aspect, provided are methods of identifying genes whose alleles are associative with or causative of the onset and/or progression and/or severity and/or recurrence of a disease, comprising:
a) sequencing or reviewing multiple exomes from patients who have been diagnosed with the disease and one or more exomes from one or more individuals known not to have the disease, wherein the one or more exomes from one or more individuals known not to have the disease comprise one or more reference exomes;
b) selecting exomes sequenced and read with a fidelity of 4 or fewer mismatches per 100 bases;
c) selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease, wherein the genes have one or more properties, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 properties, selected from:
i) site variant is found in one or more patients;
ii) site variant is observed in a general population dataset;
iii) site variant is found in three or more patients;
iv) one or more reference exomes have the major allele;
v) site variant is the minor allele in reference exomes;
vi) site variant has only one alternate allele;
vii) site is within genome region with balanced G+C and A+T content;
viii) site is located outside low complexity genome regions;
ix) site is located in genome region with no paralog within 95% identity; and
x) site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased.
d) selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease, wherein the genes have one or more properties, e.g., 1, 2, 3, 4, 5, 6, 7, or 8 properties, selected from:
i) site was measured in 25 or more patients;
ii) site variant frequency in patients differs from general populations by more than expected measurement error, e.g., 0.05 (on a frequency scale from 0.00 - 1.00);
iii) site variant frequency in patients exceeds general populations, e.g., by more than 0.10;
iv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
v) site variant is within or near a gene expressed in tissues relevant to disease;
vi) odds ratio 95% confidence interval lower bound calculated for the site from patient and reference general population frequencies is above 1.00;
vii) frequency of site variant in patients is above a line fitted to filtered sites represented as datapoints where X is reference general population frequency and Y is patient frequency, e.g., fit with least squares linear regression; and
viii) a p-value calculated with a 2x2 statistical test, e.g., Fisher's Exact Test, from numbers of alternate and reference alleles observed for the site in patients and in general population remains significant after correction for multiple testing.
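Property (vii) above, keeping sites whose patient frequency lies above a fitted line, can be illustrated with an ordinary least squares fit. This is a minimal sketch under stated assumptions: the names `fit_line` and `sites_above_line`, the `(site_id, population_freq, patient_freq)` tuple layout, and the example site labels are all hypothetical, and the line is fitted to whatever set of filtered sites is passed in.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sites_above_line(sites):
    """Return the ids of sites whose patient frequency is above the fit.

    sites: list of (site_id, population_freq, patient_freq) tuples, where
    X is the reference general population frequency and Y is the patient
    frequency, as in the property above.
    """
    slope, intercept = fit_line([s[1] for s in sites], [s[2] for s in sites])
    return [sid for sid, x, y in sites if y > slope * x + intercept]


example = [("s1", 0.1, 0.10), ("s2", 0.1, 0.30),
           ("s3", 0.3, 0.30), ("s4", 0.3, 0.50)]
print(sites_above_line(example))
```

Because the line passes through the bulk of the filtered sites, this test selects sites enriched in patients relative to the overall trend rather than relative to a fixed threshold.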
[0012] In varying embodiments of the method of identification, the disease is, for example, a systematic, chronic disease, such as, for example, a neurodegenerative disease, a cancer, a cardiovascular disease, an ocular disease, an immune disease, an autoimmune disease, an endocrinologic disease (e.g., diabetes), or an inflammatory disease (including chronic inflammatory). In some embodiments, the disease is a neurodegenerative disease. In some embodiments, the disease is an ocular disease. In some embodiments, the disease is primary open angle glaucoma (POAG). In some embodiments, the patients are symptomatic for the disease. In some embodiments, the method is computer implemented. In some embodiments, the site variants are selected from single nucleotide polymorphisms (SNPs), insertions, deletions and rearrangements. In some embodiments, the methods further comprise determining the expression levels of the genes from patient exomes and reference exomes. In some embodiments, the methods further comprise determining the expression levels of the microRNA from patient exomes and reference exomes. In some embodiments, the sequencing step comprises employing a next-generation sequencing (NGS) technique or method. In some embodiments, the methods further comprise selecting exomes sequenced and read with a fidelity of 4, 3, 2, 1 or fewer (e.g., no) mismatches per 100 bases. In some embodiments, the general population exome dataset is selected from or derived from one or more of 1000 Genomes (1000genomes.org), the Exome Sequencing Project (evs.gs.washington.edu/EVS/) datasets, UK10K (uk10k.org/), UCSC Genome Bioinformatics Site (genome.ucsc.edu/), other available public datasets, and proprietary datasets made available for comparison. In some embodiments, the methods further comprise weighting said selected genes according to predictive power rankings of the collection of signature biomarkers.
[0013] In a further aspect, provided are methods for predicting onset and/or progression and/or severity and/or recurrence of primary open angle glaucoma (POAG) in a subject, the method comprising:
(a) receiving allelic information and/or expression levels of a collection of signature biomarkers from a biological sample taken from said subject suspected of suffering POAG, wherein said collection of signature biomarkers comprises one or more genes and/or microRNAs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, or more or all, selected from the group consisting of: AATF, ABI1, ABI3BP, ACTN2, ADAMTS15, ADCY2, AHNAK2, ANGEL2, ANKRD36, ANKRD36B, ANO5, AP1M1, ARHGAP30, ASTN1, ATP6V1E2, BAI3, CACNA1E, CACNA1I, CALM1, CCDC66, CD163, CDH13, CDH4, CDK17, CELF5, CHD8, CLCA4, CLEC7A, CLSTN2, CNNM2, CNOT6, COL23A1, COL4A2, CRTAC1, CTU2, CYBA, DCBLD2, DHCR7, DNAJB11, DPF3, DRD2, EBF2, ENO3, EPT1, ERI2, FDX1L, FLJ22184, FOXD4, FOXRED2, FRYL, GAS7, GNG7, GOLGA3, GRIA1, GRID1, GRM4, HERC2, HLA-A, HLA-DRB1, IFI6, IMMT, INPP5D, ITGB4, KIAA0930, LACTB2, LCP2, LEMD3, LILRB2, LILRB3, LIN7A, LOC642846, LOC643387, LOC728537, LPHN3, LRP3, LRP4, LRRC37A, MAML3, MATR3, MCCC1, MCF2L, MEGF11, MGC21881, MINK1, MRPL23, MUC4, MYH9, MYO1E, N6AMT1, NBPF16, NOMO2, NUCKS1, PALM2, PCK1, PCM1, PDE4DIP, PML, POTEC, PPFIA2, PRKAG2, PRKCH, PRKD1, PRUNE2, R3HDM1, RABGAP1, RAD51B, RBFOX1, RIN3, SARDH, SCAF8, SEC14L1, SEL1L3, SEMA5A, SEMA5B, SIRT1, SLC30A8, SNTB1, SPN, SPRY1, SRRM2, TMPRSS13, TNRC18, TOR1A, TRIM58, TSPAN11, TXNRD1, UNC5B, USP20, USP6, VAC14, VARS2, VCAN, WASH1, XRCC5, ZDHHC7, ZMYND11, ZNF155, ZNF573, ZNF594, ZNF83, hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, hsa-miR-99b-5p, hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, hsa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c;
(b) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with onset of POAG; and (c) evaluating an output of said predictive model to predict onset of POAG in said individual; and/or
(d) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with progression of POAG; and (e) evaluating an output of said predictive model to predict progression of POAG in said individual; and/or
(f) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with severity of POAG; and (g) evaluating an output of said predictive model to predict severity of POAG in said individual; and/or
(h) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with recurrence of POAG; and (i) evaluating an output of said predictive model to predict recurrence of POAG in said individual. The relevant sequence identifications for these biomarkers, genes, and microRNAs are incorporated herein by reference.
[0014] In some embodiments of the methods of predicting, said collection of signature biomarkers comprises one or more genes selected from the biomarkers listed in Tables 4, 5 and/or 6. In varying embodiments, the collection of signature biomarkers comprises one or more genes selected from the group consisting of: AATF, ABI1, ABI3BP, ACTN2, ADAMTS15, ADCY2, AHNAK2, ANGEL2, ANKRD36, ANKRD36B, ANO5, AP1M1, ARHGAP30, ASTN1, ATP6V1E2, BAI3, CACNA1E, CACNA1I, CALM1, CCDC66, CD163, CDH13, CDH4, CDK17, CELF5, CHD8, CLCA4, CLEC7A, CLSTN2, CNNM2, CNOT6, COL23A1, COL4A2, CRTAC1, CTU2, CYBA, DCBLD2, DHCR7, DNAJB11, DPF3, DRD2, EBF2, ENO3, EPT1, ERI2, FDX1L, FLJ22184, FOXD4, FOXRED2, FRYL, GAS7, GNG7, GOLGA3, GRIA1, GRID1, GRM4, HERC2, HLA-A, HLA-DRB1, IFI6, IMMT, INPP5D, ITGB4, KIAA0930, LACTB2, LCP2, LEMD3, LILRB2, LILRB3, LIN7A, LOC642846, LOC643387, LOC728537, LPHN3, LRP3, LRP4, LRRC37A, MAML3, MATR3, MCCC1, MCF2L, MEGF11, MGC21881, MINK1, MRPL23, MUC4, MYH9, MYO1E, N6AMT1, NBPF16, NOMO2, NUCKS1, PALM2, PCK1, PCM1, PDE4DIP, PML, POTEC, PPFIA2, PRKAG2, PRKCH, PRKD1, PRUNE2, R3HDM1, RABGAP1, RAD51B, RBFOX1, RIN3, SARDH, SCAF8, SEC14L1, SEL1L3, SEMA5A, SEMA5B, SIRT1, SLC30A8, SNTB1, SPN, SPRY1, SRRM2, TMPRSS13, TNRC18, TOR1A, TRIM58, TSPAN11, TXNRD1, UNC5B, USP20, USP6, VAC14, VARS2, VCAN, WASH1, XRCC5, ZDHHC7, ZMYND11, ZNF155, ZNF573, ZNF594, and ZNF83, wherein the position and allele of the genetic variation associated with and/or causative of POAG is as provided in Table 4. In varying embodiments, overexpression of one or more microRNAs selected from hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, msa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c in the biological sample from the subject in comparison to a control sample from an individual known not to have POAG predicts a negative outcome or onset and/or progression and/or severity and/or recurrence of POAG.
In varying embodiments, the methods further comprise administering to the subject an inhibitory nucleic acid that reduces or inhibits the expression of one or more microRNAs selected from hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, msa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c. In varying embodiments, the methods further comprise administering to the subject one or more microRNAs or one or more mimics of microRNAs selected from hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, msa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c. In varying embodiments, underexpression or nonexpression of one or more microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, and hsa-miR-99b-5p in the biological sample from the subject in comparison to a control sample from an individual known not to have POAG predicts a negative outcome or onset and/or progression and/or severity and/or recurrence of POAG. In varying embodiments, the methods further comprise administering to the subject an inhibitory nucleic acid that reduces or inhibits the expression of one or more microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, and hsa-miR-99b-5p. In varying embodiments, the methods further comprise administering to the subject one or more microRNAs or one or more mimics of microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, and hsa-miR-99b-5p. In some embodiments, the individual is symptomatic for POAG. In some embodiments, the individual has a family history of POAG.
In some embodiments, said output of the predictive model predicts a likelihood of recurrence of POAG in the individual after said individual has undergone treatment for POAG. In some embodiments, the methods further comprise providing a report having a prediction of clinical recurrence of POAG of said individual. In some embodiments, the methods further comprise combining the allelic information and/or gene expression levels of said signature biomarkers with one or more other biomarkers to predict onset and/or progression and/or severity and/or recurrence of POAG in said individual. In some embodiments, the expression levels of the collection of signature biomarkers comprise gene expression levels measured at multiple times. In varying embodiments, the methods further comprise using the dynamics of the gene expression levels measured at multiple times to predict onset and/or progression and/or severity and/or recurrence of disease (e.g.,
HPG/POAG) in said subject. In varying embodiments, the methods further comprise evaluating the output of the predictive model to determine whether or not the individual falls in a high risk group. In varying embodiments, the methods further comprise developing said predictive model using stability selection or logistic regression. In varying embodiments, the methods further comprise developing said predictive model using stability selection. In varying embodiments, the methods further comprise developing said predictive model using logistic regression. In some embodiments, applying said allelic information and/or expression levels of the collection of signature biomarkers to said predictive model comprises weighting said expression levels according to stability rankings or predictive power rankings of the collection of signature biomarkers. In some
embodiments, applying said allelic information and/or expression levels of the collection of signature biomarkers to said predictive model comprises weighting said expression levels according to stability rankings of the collection of signature biomarkers. In some embodiments, applying said allelic information and/or expression levels of the collection of signature biomarkers to said predictive model comprises weighting said expression levels according to predictive power rankings of the collection of signature biomarkers.
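By way of non-limiting illustration, the weighting and logistic regression embodiments above can be sketched as follows. The function names, the coefficient values, the rank weights, and the 0.5 cutoff are hypothetical and are not taken from the disclosure; the sketch only shows one way that expression levels could be weighted by a ranking and passed through a logistic function to obtain a risk score.

```python
import math


def predict_risk(levels, coefficients, rank_weights, intercept=0.0):
    """Return a score between 0 and 1 from a logistic model.

    levels:       biomarker name -> normalized expression level or allele count
    coefficients: biomarker name -> fitted coefficient (hypothetical values here)
    rank_weights: biomarker name -> weight derived from a stability ranking or
                  a predictive power ranking (assumed to default to 1.0)
    """
    z = intercept
    for name, value in levels.items():
        z += coefficients.get(name, 0.0) * rank_weights.get(name, 1.0) * value
    return 1.0 / (1.0 + math.exp(-z))


def is_high_risk(score, cutoff=0.5):
    """Assign the individual to a high risk group (cutoff is an assumption)."""
    return score >= cutoff


# Hypothetical inputs, for illustration only.
levels = {"GAS7": 1.2, "hsa-miR-145": 0.4}
coefficients = {"GAS7": 0.8, "hsa-miR-145": -0.5}
rank_weights = {"GAS7": 1.0, "hsa-miR-145": 0.6}
score = predict_risk(levels, coefficients, rank_weights)
print(round(score, 3), is_high_risk(score))
```

In practice the coefficients would be fitted to patient data, and the rank weights would come from the stability selection or predictive power analysis described above.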
[0015] One embodiment is a method of identifying genes whose alleles are associative with or causative of the progression of a disease, comprising:
a) sequencing or reviewing multiple exomes from patients who have been diagnosed with the disease and one or more exomes from one or more individuals known not to have the disease, wherein the one or more exomes from one or more individuals known not to have the disease comprise one or more reference exomes;
b) selecting exomes sequenced and read with a fidelity of 4 or fewer mismatches per 100 bases;
c) selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease with one or more properties selected from:
i) site variant is present in 25 or more patients;
ii) site variant has only one alternate allele;
iii) the one or more reference exomes have the major allele;
iv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
v) site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased;
vi) site variant has a frequency of < 0.95 in patients;
vii) site variant is within general population exome dataset;
viii) site variant has approximately the same frequency within the general population as the frequency of the disease within the general population; and
ix) site variant occurs in patients with a frequency greater than in the general population.
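By way of non-limiting illustration, the selection steps of this embodiment can be sketched as a filter over candidate site variants. The record layout, the field names, and the tolerance used for "approximately the same frequency" are assumptions made for the sketch. The embodiment recites "one or more properties", whereas the sketch applies all of them together, which is only one possible combination.

```python
import math
from dataclasses import dataclass
from typing import Optional

AUTOSOMES = {str(i) for i in range(1, 23)}


@dataclass
class SiteVariant:
    gene: str
    chromosome: str                        # "1".."22", "X", or "Y"
    n_patients_with_variant: int
    n_alternate_alleles: int
    reference_has_major_allele: bool
    in_gene_or_regulatory_region: bool
    patient_frequency: float
    population_frequency: Optional[float]  # None if absent from the population dataset


def read_passes_fidelity(mismatches, read_length, max_per_100=4):
    """Step b): keep reads with 4 or fewer mismatches per 100 bases."""
    return mismatches * 100 <= max_per_100 * read_length


def passes_filters(v, disease_prevalence, gender_biased=False,
                   min_patients=25, max_patient_frequency=0.95, rel_tol=0.5):
    """Step c): apply properties i) through ix) as a conjunction (an assumption)."""
    if v.n_patients_with_variant < min_patients:               # i)
        return False
    if v.n_alternate_alleles != 1:                             # ii)
        return False
    if not v.reference_has_major_allele:                       # iii)
        return False
    if not v.in_gene_or_regulatory_region:                     # iv)
        return False
    on_sex_chromosome = v.chromosome in {"X", "Y"}
    if v.chromosome not in AUTOSOMES and not (gender_biased and on_sex_chromosome):
        return False                                           # v)
    if not v.patient_frequency < max_patient_frequency:        # vi)
        return False
    if v.population_frequency is None:                         # vii)
        return False
    if not math.isclose(v.population_frequency, disease_prevalence, rel_tol=rel_tol):
        return False                                           # viii), tolerance assumed
    return v.patient_frequency > v.population_frequency        # ix)


# Hypothetical record, for illustration only.
candidate = SiteVariant("GENE_A", "17", 30, 1, True, True, 0.40, 0.02)
print(passes_filters(candidate, disease_prevalence=0.02))
```

The `rel_tol` parameter makes the otherwise unquantified property viii) explicit, so a different tolerance can be substituted without changing the rest of the filter.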
[0016] Another embodiment is a method of identifying genes whose alleles are associative with or causative of the progression of a disease, comprising:
a) sequencing or reviewing multiple exomes from patients who have been diagnosed with the disease and one or more exomes from one or more individuals known not to have the disease, wherein the one or more exomes from one or more individuals known not to have the disease comprise one or more reference exomes;
b) selecting exomes sequenced and read with a fidelity of 4 or fewer mismatches per 100 bases;
c) selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease with one or more properties selected from:
i) site variant is present in two or more patients;
ii) site variant has only one alternate allele;
iii) the one or more reference exomes have the major allele; and
iv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
d) selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease, wherein the genes have one or more properties selected from:
i) site variant is present in 25 or more patients;
ii) site variant is located on chromosomes 1-22 or site variant is located on chromosome X or Y only if disease incidence is gender-biased;
iii) site variant has a frequency of < 0.95 in patients;
iv) site variant is within general population exome dataset;
v) site variant has approximately the same frequency within the general population as the frequency of the disease within the general population; and
vi) site variant occurs in patients with a frequency greater than in the general population.
[0017] Still further, another embodiment is a method for predicting progression of primary open angle glaucoma (POAG) in a subject, the method comprising:
(a) receiving allelic information and/or expression levels of a collection of signature biomarkers from a biological sample taken from said subject suspected of suffering POAG, wherein said collection of signature biomarkers comprises one or more genes and/or microRNAs selected from the group consisting of: ABI1, ABI3BP, AKT1, ANKRD36B, CADM2, CCDC33, CELA3A, CHMP7, CHRNA7, CLCNKB, CNNM2, CNTN2, COL4A2, CSMD2, CSPG4, DPF3, ENO3, EPHA10, FANCM, FAT3, FBN3, FDX1L, GAK, GAS7, GINS2, GLB1L3, GLIS1, GOLGA3, GOLGA6B, GTF2I, GYPE, HLA-DQB1, HLA-DRB1, IL1B, KCNQ1, KCNQ3, KLF12, KLRC4, LGALS9C, LILRB2, LILRB3, LOXL2, MMD, MRPL23, MUC4, NBPF3, NLRP9, NOMO2, NPIPL2, NSUN4, NUP153, OR2L3, PAK7, PALM2, PDLIM4, PLAC4, PLXNA2, POTEM, PPP1R14C, PRAMEF2, PRB4, PRICKLE4, PRKAG2, PTPRN2, RANGAP1, RBM23, RGPD1, RYR2, SEL1L3, SEPT9, SLC2A3, SLC35E2, SLC6A18, SLC6A3, SPN, SRCIN1, SULT1A2, SYN3, SYT3, TMEM120A, TMEM191B, TMPRSS13, USP20, USP41, WASH1, ZNF276, ZNF492, ZNF512B, ZNF594, ZNF83, hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, hsa-miR-99b-5p, hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, msa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c;
(b) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with progression of POAG; and
(c) evaluating an output of said predictive model to predict progression of POAG in said individual.
[0018] Also provided herein are methods of diagnosis, prognosis, and/or therapy for the diseases described herein, including glaucoma and POAG, and also methods and kits for determining the presence or absence of the disease, such as glaucoma or POAG, or of an increased risk of the disease, such as glaucoma or POAG in an individual. Methods for diagnosis, prognosis, and/or therapy for the diseases described herein, including glaucoma and POAG, are generally known in the art and can be combined with the methods of gene and biomarker identification described herein. For example, a patient can be tested for having or not having the identified genetic marker as described herein. One or more samples can be taken from the patient, and the samples analyzed. If the patient has the marker, additional diagnosis, prognosis, and therapy can be carried out with the patient. For example, one can analyze for onset, progression, severity, and/or recurrence of the disease. Methods known in the art can be used. See, for example, US Patent Publication
2004/0132795 for methods of screening and treating individuals with glaucoma or the propensity to develop glaucoma, and this reference is incorporated herein by reference in its entirety. Diseases other than glaucoma and POAG can be included in these methods. See, for example, US Patent Publication 2011/0177509 (which is incorporated herein by reference in its entirety) for risk factors and a therapeutic target for neurodegenerative disorders, as well as methods for identification of a subject at risk for a neurodegenerative disorder; see also US Patent No. 7,794,933 for neurological disorders including depression (and which is incorporated herein by reference in its entirety).
[0019] Kits designed and configured for practicing methods are also provided herein as known in the art of diagnostic and testing kits and devices. The use of kits is generally known in the art. See, for example, US Patent Publication 2011/0177509, which is incorporated herein by reference in its entirety. Kits can include, for example, appropriate genetic materials, indicators, instructions, and/or packaging.
[0020] Hence, also provided herein is a method of identifying a patient or subject using the methods described herein which can include kits. One or more genetic tests can be used to identify the patient or subject. The patient or subject can then be given a prognosis and/or treatment.
DEFINITIONS
[0021] The term "exome" refers to the part of the genome formed by exons, the sequences which when transcribed remain within the mature RNA after introns are removed by RNA splicing. It differs from a transcriptome in that it consists of all DNA that is transcribed into mature RNA in cells of any type. For the purposes of the present application, the exome includes coding exons, non-coding exons, 5' untranslated regions (UTRs), 3' UTRs, flanking introns, microRNA, and proximal promoters.
[0022] The term "threshold level" refers to a representative or predetermined expression level of a gene or microRNA. The threshold level can represent expression detected in a sample from a normal control, i.e., from non-diseased tissue or a non-diseased subject. In varying embodiments, the normal control is of the same tissue type as the biological sample subject to testing. The threshold level can be determined from an individual or from a population of individuals. The expression levels of a gene or microRNA from a diseased tissue or subject may be above (increased) or below (decreased) the control level.
[0023] The terms "increased expression level" or "overexpression" interchangeably refer to a level of expression that is higher than a predetermined threshold level or a level of expression from a normal or non-diseased control. An increased expression level is determined when the level of expression in the test biological sample is at least about 10%, 25%, 50%, 75%, 100% (i.e., 1-fold), 2-fold, 3-fold, 4-fold, or more, greater than the predetermined threshold level of expression or the level of expression from a normal or non-diseased control tissue. In determining an increased level of expression, usually the same tissue types are compared.
[0024] The terms "decreased expression level" or "underexpression" interchangeably refer to a level of expression that is lower than a predetermined threshold level or a level of expression from a normal or non-diseased control. A decreased expression level is determined when the level of expression in the test biological sample is at least about 10%, 25%, 50%, 75%, 100% (i.e., 1-fold), 2-fold, 3-fold, 4-fold, or more, lower than the predetermined threshold level of expression or the level of expression from a normal or non-diseased control tissue. In determining a decreased level of expression, usually the same tissue types are compared.
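By way of non-limiting illustration, the comparison of a test expression level against a threshold level can be sketched as follows. The function name and the use of a relative (percent) change are assumptions; the default of 10% is simply the smallest value recited above, and any of the other recited values could be substituted.

```python
def classify_expression(test_level, threshold_level, min_change=0.10):
    """Classify a test sample relative to a threshold (control) level.

    min_change is the minimum relative change, e.g. 0.10 for 10%,
    0.25 for 25%, or 1.0 for 100% (1-fold).
    """
    if threshold_level <= 0:
        raise ValueError("threshold_level must be positive")
    change = (test_level - threshold_level) / threshold_level
    if change >= min_change:
        return "increased"
    if change <= -min_change:
        return "decreased"
    return "unchanged"


# Hypothetical expression values in arbitrary units, same tissue type assumed.
print(classify_expression(125.0, 100.0))  # 25% above the threshold
print(classify_expression(75.0, 100.0))   # 25% below the threshold
print(classify_expression(105.0, 100.0))  # within 10% of the threshold
```

Because a level cannot fall by more than 100% on this relative scale, fold decreases of 2-fold or more would need a ratio-based comparison instead; the sketch does not attempt that.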
[0025] The terms "individual," "patient," and "subject" interchangeably refer to a mammal, for example, a human, a non-human primate, a domesticated mammal (e.g., a canine or a feline), an agricultural mammal (e.g., equine, bovine, ovine, porcine), or a laboratory mammal (e.g., rattus, murine, lagomorpha, hamster).
[0026] As used herein the term "comprising" means that the named elements are included, but other elements (e.g., unnamed signature genes) may be added and still represent a composition or method within the scope of the claim. The transitional phrase "consisting essentially of" means that the associated composition or method encompasses additional elements, including, for example, additional signature genes, that do not affect the basic and novel characteristics of the disclosure.
[0027] As used herein, the term "signature gene" refers to a gene whose expression is correlated, either positively or negatively, with disease extent or outcome or with another predictor of disease extent or outcome. In some embodiments, a gene expression score (GEX) can be statistically derived from the expression levels of a set of signature genes and used to diagnose a condition or to predict clinical course. In some embodiments, the expression levels of the signature genes may be used to predict onset and/or progression and/or severity and/or recurrence of disease (e.g., POAG or HPG) without relying on a
GEX. A "signature nucleic acid" is a nucleic acid comprising or corresponding to, in case of cDNA, the complete or partial sequence of an RNA transcript encoded by a signature gene, or the complement of such complete or partial sequence. A "signature protein" is a protein encoded by or corresponding to a signature gene of the disclosure.
[0028] The term "prediction" is used herein to refer to the prediction of disease onset and/or progression and/or severity and/or recurrence in a patient. The patient may be symptomatic or asymptomatic. The patient may have undergone or currently be undergoing a therapeutic regime. The predictive methods of the present disclosure can be used clinically to make treatment decisions by choosing the most appropriate treatment modalities for any particular patient. The predictive methods of the present disclosure also can provide valuable tools in predicting if a patient is likely to respond favorably to a treatment regimen, such as surgical intervention and/or pharmacological intervention.
[0029] The term "plurality" refers to more than one element. For example, the term is used herein in reference to a number of nucleic acid molecules or sequence tags that are sufficient to identify significant differences in copy number variations in test samples and qualified samples using the methods disclosed herein. In some embodiments, at least about 3 x 10^6 sequence tags of between about 20 and 40 bp are obtained for each test sample. In some embodiments, each test sample provides data for at least about 5 x 10^6, 8 x 10^6, 10 x 10^6, 15 x 10^6, 20 x 10^6, 30 x 10^6, 40 x 10^6, or 50 x 10^6 sequence tags, each sequence tag comprising between about 20 and 40 bp.
[0030] The terms "polynucleotide," "nucleic acid" and "nucleic acid molecules" are used interchangeably and refer to a covalently linked sequence of nucleotides (i.e., ribonucleotides for RNA and deoxyribonucleotides for DNA) in which the 3' position of the pentose of one nucleotide is joined by a phosphodiester group to the 5' position of the pentose of the next. The nucleotides include sequences of any form of nucleic acid, including, but not limited to RNA and DNA molecules. The term "polynucleotide" includes, without limitation, single- and double-stranded polynucleotides.
[0031] The terms "microRNA mimic" and "mimics of microRNA" are well known in the art. See e.g., Wang, Z., 2009, Chapter on "miRNA Mimic Technology," pages 93-100, MicroRNA Interference Technologies, Springer-Verlag. Herein, the terms can refer to synthetic sequences that are nearly identical or identical to microRNAs found in cells. They can be, for example, modified chemically in some way for stability (e.g., to make it through the liver) or have a nucleotide or two changed for delivery or manufacturing purposes. Herein, microRNAs or short synthetic RNAs nearly identical to the microRNAs can be used, e.g., 90% identical or closer, possibly with chemical modifications to the nucleotides. Double stranded miRNA mimics can be used.
[0032] The term "Next Generation Sequencing (NGS)" herein refers to sequencing methods that allow for massively parallel sequencing of clonally amplified molecules and of single nucleic acid molecules. Non-limiting examples of NGS include sequencing-by-synthesis using reversible dye terminators, and sequencing-by-ligation.
[0033] The term "read" refers to a sequence read from a portion of a nucleic acid sample. Typically, though not necessarily, a read represents a short sequence of contiguous base pairs in the sample. The read may be represented symbolically by the base pair sequence (in ATCG) of the sample portion. It may be stored in a memory device and processed as appropriate to determine whether it matches a reference sequence or meets other criteria. A read may be obtained directly from a sequencing apparatus or indirectly from stored sequence information concerning the sample. In some cases, a read is a DNA sequence of sufficient length (e.g., at least about 25 bp) that can be used to identify a larger sequence or region, e.g., that can be aligned and specifically assigned to a chromosome or genomic region or gene.
[0034] As used herein, the terms "aligned," "alignment," or "aligning" refer to the process of comparing a read or tag to a reference sequence and thereby determining whether the reference sequence contains the read sequence. If the reference sequence contains the read, the read may be mapped to the reference sequence or, in certain embodiments, to a particular location in the reference sequence. In some cases, alignment simply tells whether or not a read is a member of a particular reference sequence (i.e., whether the read is present or absent in the reference sequence). For example, the alignment of a read to the reference sequence for human chromosome 13 will tell whether the read is present in the reference sequence for chromosome 13. A tool that provides this information may be called a set membership tester. In some cases, an alignment additionally indicates a location in the reference sequence where the read or tag maps to. For example, if the reference sequence is the whole human genome sequence, an alignment may indicate that a read is present on chromosome 13, and may further indicate that the read is on a particular strand and/or site of chromosome 13.
[0035] Aligned reads or tags are one or more sequences that are identified as a match in terms of the order of their nucleic acid molecules to a known sequence from a reference genome. Alignment can be done manually, although it is typically implemented by a computer algorithm, as it would be impossible to align reads manually in a reasonable time period for implementing the methods disclosed herein. One example of an algorithm for aligning sequences is the Efficient Local Alignment of Nucleotide Data (ELAND) computer program distributed as part of the Illumina Genomics Analysis pipeline. Alternatively, a Bloom filter or similar set membership tester may be employed to align reads to reference genomes. Alternatively, an indexing algorithm such as that implemented in versions of the BowTie computer program may be employed to align reads to reference genomes. The matching of a sequence read in aligning can be a 100% sequence match or less than 100% (non-perfect match).
[0036] The term "mapping" used herein refers to specifically assigning a sequence read to a larger sequence, e.g., a reference genome, by alignment.
[0037] As used herein, the term "reference genome" or "reference sequence" refers to any particular known genome sequence, whether partial or complete, of any organism or virus which may be used to reference identified sequences from a subject. For example, a reference genome used for human subjects as well as many other organisms is found at the National Center for Biotechnology Information at ncbi.nlm.nih.gov. A "genome" refers to the complete genetic information of a mammal expressed in nucleic acid sequences.
[0038] In various embodiments, the reference sequence is significantly larger than the reads that are aligned to it. For example, it may be at least about 100 times larger, or at least about 1000 times larger, or at least about 10,000 times larger, or at least about 10^5 times larger, or at least about 10^6 times larger, or at least about 10^7 times larger.
[0039] The term "based on" when used in the context of obtaining a specific quantitative value, herein refers to using another quantity as input to calculate the specific quantitative value as an output.
[0040] As used herein the term "chromosome" refers to the heredity-bearing gene carrier of a living cell, which is derived from chromatin strands comprising DNA and protein components (especially histones). The conventional internationally recognized individual human genome chromosome numbering system is employed herein.
[0041] The term "condition" herein refers to "medical condition" as a broad term that includes all diseases and disorders, but can include [injuries] and normal health situations, such as pregnancy, that might affect a person's health, benefit from medical assistance, or have implications for medical treatments.
[0042] The term "sensitivity" as used herein is equal to the number of true positives divided by the sum of true positives and false negatives.
[0043] The term "specificity" as used herein is equal to the number of true negatives divided by the sum of true negatives and false positives.
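As a minimal illustration of the two definitions above, the ratios can be written directly from the counts. The function and argument names are illustrative only and are not part of the disclosure.

```python
def sensitivity(true_positives, false_negatives):
    """True positives divided by the sum of true positives and false negatives."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """True negatives divided by the sum of true negatives and false positives."""
    return true_negatives / (true_negatives + false_positives)
```

With 90 true positives and 10 false negatives the sensitivity is 0.9; with 80 true negatives and 20 false positives the specificity is 0.8.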
BRIEF DESCRIPTION OF THE DRAWINGS
[0044] Figure 1 illustrates strategies for high fidelity identification of SNPs, insertions/deletions (indels), and genome rearrangements associated with disease causation and/or progression. Upper left (SNPS @ 3x): Large rectangles represent ranges of genome nucleotides to which sequence reads, represented by smaller lines, were mapped. To identify SNPs, reads with 0 to 3 mismatches per 100 bases are aligned to the reference genome and their bases are compared. Mismatches between reference nucleotides and read nucleotides, represented by dark dots on the reads, designate a variant site. Generally, 3+ sequence reads are needed to determine whether a site has a variant. Upper right (indel): Reads that span a small insertion or deletion in a patient genome are aligned to a reference genome with gaps in the read or reference. Lower (split pair): Paired reads may fail to align near each other in the reference genome because of a rearrangement in a patient genome. Their alignment to different genome regions, on the same or different chromosomes, depicted by left and right large rectangles, indicates a rearrangement.
[0045] Figure 2 illustrates genes with their strength of expression in human eye tissues. Left: Dark to light color represents high to low overall expression in eye tissues for a non-exhaustive list of genes detected as expressed in eye tissues by RNA sequencing; genes were selected to range from high to low expression. Three genes previously associated with glaucoma are noted: GAS7, HLA-DRB1, and COL4A2. Right: Also depicted is expression of GAS7 in 6 eye tissues, including trabecular meshwork (TM), ciliary body (CB), choroid (CH), optic disk (OD), optic nerve (ON) and retina (RT), with stronger expression in TM, OD, ON, and RT compared to CB and CH. Blue lines (top) denote gene exons. Black vertical lines denote RNA sequence reads.
[0046] Figure 3 illustrates expression of four genes in 6 eye tissues, for each gene including trabecular meshwork (TM), ciliary body (CB), choroid (CH), optic disk (OD), optic nerve (ON) and retina (RT). Each gene has distinct tissue-specific expression.
[0047] Figure 4 provides a scatterplot of filtered variant sites represented as datapoints with X = frequency of variant in general populations and Y = frequency of variant in HPG patients.
[0048] Figure 5 illustrates microRNA overexpressed in diseased optic nerve (i.e., optic nerve from patients having primary open angle glaucoma). Overexpressed microRNAs include hsa-miR-483-5p, hsa-miR-483-3p, hsa-miR-214-3p, hsa-miR-452-5p, hsa-miR-4448, hsa-miR-224-5p, hsa-miR-1246, hsa-miR-130a-3p, hsa-miR-9-3p, hsa-miR-767-5p, and hsa-miR-449a.
[0049] Figure 6 illustrates microRNA (miRNA) underexpressed in diseased optic nerve (i.e., optic nerve from patients having primary open angle glaucoma). Underexpressed microRNAs include hsa-miR-34b-3p, hsa-miR-3182, hsa-miR-4640-3p, hsa-miR-2276, hsa-miR-4423-5p, hsa-miR-2277-3p, hsa-miR-513c-5p, hsa-miR-1250, hsa-miR-18a-3p, hsa-miR-505-5p, hsa-miR-138-2-3p, hsa-miR-548ah-3p, hsa-miR-4677-3p, hsa-miR-1226-3p, hsa-miR-193b-5p, and hsa-miR-18b-5p.
DETAILED DESCRIPTION
1. Introduction
[0050] Provided herein in some embodiments are methods of identification of disease-associated genome variants in coding or regulatory regions of genes. The methods are exemplified in a preferred embodiment by the identification of genes that are associated with and/or promote onset or progression of a type of primary open-angle glaucoma. Other methods such as predictive, diagnostic, prognostic, and therapeutic methods are also provided herein.
[0051] The methods are based, in part, on the definition and use of a logic-based method to rank variants and genes based on clinical properties of disease. The methods are exemplified by application to variants from a cohort of patients with primary open angle glaucoma (POAG) and elevated eye pressure; in this embodiment, the method revealed 140 genes with variants over-represented in this disease. Genes were further ranked within the method based on gene expression patterns in tissues relevant to the disease process, which in the case of POAG can be retina, optic disk, optic nerve, ciliary body, choroid, trabecular meshwork, iris, sclera, and lamina cribrosa. Additional genes associated with the ranked genes were identified within the method as potential regulators of RNA and protein expression levels whose regulatory performance is disrupted or altered by highly ranked variants.
[0052] The method implements technical and clinical filters that reflect occurrence of disease in general populations. These filters reduced thousands of potential variants to under 150 for the preferred embodiment. The method incorporates gene expression information from tissues relevant to disease to refine ranked genes. The method incorporates information about potential microRNA, DNA-binding protein, and RNA-binding protein regulators of genes identified by the clinical ranking parameters.
[0053] The methods have been implemented as a body of software code written in Perl and other scripting languages, and applied to compare variations from a disease patient cohort (e.g., primary open angle glaucoma or POAG) with two publicly available datasets, e.g., 1000 Genomes (1000genomes.org) and the Exome Sequencing Project (evs.gs.washington.edu/EVS/) dataset. Other data sets can be used.
[0054] The genes identified by the analysis are potential targets or members of cellular pathways or processes that may be effective therapeutic targets in treating or curing the disease of interest (e.g., POAG). More particularly, disease onset, progression, severity, and/or recurrence can be addressed. Currently, for example, there is no cure for POAG and the only treatment is reduction of pressure in the eye to slow disease progression. Many variants found are in regulatory regions of genes and may control production of mRNA and/or protein. Molecules that bind to DNA or RNA at sites disrupted or altered by variants are further therapeutic targets.
[0055] The various embodiments described herein provide numerous, and in many cases surprising, advantages. For example, a key advantage in at least some embodiments is that a patient can receive earlier treatment for the disease such as POAG by use of the methods, screenings, and predictions described herein. Another key advantage in at least some embodiments is that a patient can receive more personalized or particular treatment for the disease such as POAG by use of the methods, screenings, and predictions described herein.
[0056] Moreover, despite the knowledge in the art, numerous surprising results were found throughout the presently described and claimed methodologies. For example, it was found that variant sites were concentrated in introns compared to coding regions.
[0057] In addition, it was found that the vast majority of the genes found in the glaucoma effort were not previously associated with glaucoma. However, for at least some of them, their functions within cells are in cellular processes related to glaucoma, e.g., genes involved in cell cycle, neural development and axon guidance, and inflammation.
[0058] In general, with the inventive filtering tool, the medical community is provided with a method to identify the genetic changes in a genome that are associated with a disease state, where those changes are not findable by standard GWAS or exome analysis methods. The newly identified sites provide a new patient management tool.
[0059] In addition, the approach described and claimed herein for glaucoma did find several genes previously associated with glaucoma, which puts new focus on those genes. Within those genes, the approach found sites that were not previously found in other studies because those studies focused on marker sites, whereas the presently described and claimed methods focus on finding causal sites inside the genes. Even further, it was found that sites associated with glaucoma varied in general population frequency from very rare (< 0.01) to very common (nearly 0.50).
[0060] Moreover, a list of genes was generated with their expression levels in tissues involved in the disease from human donor eyes. It was surprising to find genes and microRNAs that were differentially expressed across optic nerve, optic disc, retina, ciliary body, and trabecular meshwork tissues, and further were differentially expressed in tissues from eyes with disease compared with normal.
[0061] Also, microRNAs in optic nerve differed from microRNAs in retina and even optic disc. This was a large surprise because the optic nerve comprises axons of retinal ganglion cells whose nuclei are within the retina.
[0062] Hence, the technical effects of the claimed methodologies were clear, useful, and unexpected. Additional aspects of these technical effects are noted. For example, the elimination of false positive variants through direct genome sequence analysis of the region around the site early in the filtering steps is new and inventive.
[0063] Also important is the application of the clinical motivation to winnow sites of clinical utility. This led to filters that are more strict than have been used before (e.g., required >0.10 allele frequency difference between patients and general populations).
[0064] In addition, because the resulting sites passed clinical utility thresholds, they can be used directly for biomarker tests. The odds ratios of the final sites, calculated after the direct filters were applied, range from 2 to 95. Their relative risk scores range from 1.5 to 69. These are enormous and thus have much greater clinical utility than glaucoma-associated sites found by others through GWAS with odds ratios of 1.1-1.4.
[0065] Furthermore, in the preferred embodiment, the patient frequency of each final site ranges from 0.18 to 0.98 with an average of 0.55. That is, large numbers of the HPG patients in which an allele was measured harbored each variant allele. The final sites are thus worth a clinician's time to consider and use in planning a patient's treatment.
[0066] In the preferred embodiments, human donor eyes were sought to gather RNA expression data for assessing sites found through our analysis. Surgical skill is required for the fine dissection of ocular tissues to find and harvest distinct tissues, e.g., optic nerve vs. optic disk, optic disk vs. retina, and trabecular meshwork vs. iris and choroid. In addition, computational skill is required to analyze and interpret sequence reads obtained from tissue RNA and to note differential expression of genes and of the microRNAs that control availability of those genes to make protein. The complementary and necessary surgical and computational skill resulted in assembly of a glaucoma-specific gene expression catalog, which is and will continue to be a critical component for assessing variants over-represented in HPG patients.
[0067] Some additional aspects of various embodiments are described, particularly with respect to prior art GWAS approaches, and citing eight references below. Standard approaches to genome-wide association used in the past and present apply a platform (e.g., Illumina 660 genotyping array, Illumina, San Diego, California) to identify "marker" variants genome-wide in a large number of patients with disease (cases) and people confirmed not to have disease, often matched for attributes such as age (controls). Chichon et al. provide a review of methods and their discovery power. [15] Only variant sites measured in most cases and controls (e.g., 95% of both) are kept for analysis. After genotyping, the group of patients is checked for relatives (e.g., brother and sister in the patient cohort), repeated patients (e.g., a patient who moved from one study center to another), and population stratification (e.g., a number of patients with Mexican ancestry among Caucasian patients recruited from a southern state). Population features are corrected by eliminating subjects from the cohort or applying statistical corrections. Statistical tests are then applied to generate a p-value for each marker variant, and p-values less than 10^-8 are considered to have "genome-wide significance" since the number of marker sites tested is generally on the order of 1 million (false discovery rate 0.05: 0.05 / 1M = 5 x 10^-8, i.e., on the order of 10^-8).
[0068] This procedure generates a list of markers that each point to a nearest gene or genes. The markers near a given gene are subjected to additional statistical analysis to identify the gene as associated with disease. As multiple studies of the same disease are published, meta-analysis can be performed, in which case cohorts are combined, as are control cohorts; the larger numbers of cases and controls confer additional discovery power.
[0069] The following are some elements and considerations of these approaches: (1) Markers are chosen for the measurement platform to cover the genome evenly and completely. They do not indicate cause. (2) Markers may be over- or under-represented in the cases. Under-representation (odds ratio (OR) < 1) indicates that a causal variant is likely to be nearby and over-represented in patients by virtue of being on a different version of the gene, i.e., a different haplotype. (3) Measured markers are restricted to known variants and may be restricted to those with general population frequency >0.05, depending on the platform. So variants rare in the population remain unmeasured. They can be inferred through statistical analysis of deeply sequenced genomes from general populations and assessing local recurring combinations of markers (a process called imputation). [16] (4) If the platform measures rarer variants with frequency <0.05, larger numbers of cases and controls are required to achieve p-values below 10^-8. [17] (5) Sites associated with disease through GWAS generally explain disease in a small fraction of cases, e.g., 2%-4%. [18] (6) Meta-analysis requires harmonizing multiple datasets where genotypes were measured on different platforms. This reduces the sites measured and requires imputation for sites measured on one study's platform but not another's, which introduces uncertainty about measurements. See [19] as a comprehensive meta-analysis example.
[0070] Some additional description is provided of genome sequencing by short reads, which provides more context for the various embodiments described herein. Genome sequencing aims to identify variants in a person's genome through direct DNA sequencing and assembly of DNA reads into contiguous stretches. [20] Genome sequencing can be expensive. For example, short-read sequencing with paired reads of length 100 bases (2x100) requires ~480 million read-pairs for 30x coverage (30 * 3.2B / 200 = 480M); at a cost of $1,250 per 200M read-pairs, a 30x genome is ~$3,000 plus costs for sample handling labor.
[0071] Some considerations of this include: (1) 30x coverage leaves random areas sparsely covered; so 100x is generally used for clinical purposes, more than tripling the cost to ~$10,000. (2) Rearrangements and repeats are more numerous between genes and make data analysis for variant discovery more complex.
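The coverage and cost arithmetic in the two preceding paragraphs can be reproduced with a short sketch. The genome size (3.2 billion bases), the 2x100 read-pair length, and the price of $1,250 per 200 million read-pairs are the figures quoted in the text; the function names are illustrative, and labor costs are left out as in the text.

```python
GENOME_SIZE_BASES = 3_200_000_000  # approximate human genome size used in the text


def read_pairs_for_coverage(coverage, bases_per_pair=200, genome_size=GENOME_SIZE_BASES):
    """Read pairs needed so that total sequenced bases equal coverage x genome size."""
    return coverage * genome_size // bases_per_pair


def sequencing_cost(read_pairs, price_per_batch=1250, batch_size=200_000_000):
    """Sequencing cost in dollars, priced per batch of read pairs (labor excluded)."""
    return read_pairs * price_per_batch / batch_size


pairs_30x = read_pairs_for_coverage(30)                      # 480,000,000 read pairs
cost_30x = sequencing_cost(pairs_30x)                        # 3000.0 dollars
cost_100x = sequencing_cost(read_pairs_for_coverage(100))    # 10000.0 dollars
```

At these prices, moving from 30x to 100x raises the sequencing cost from $3,000 to $10,000, matching the "more than tripling" stated above.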
[0072] A brief description of standard exome approaches to finding variants causing disease is provided. Exome sequencing uses DNA capture technology to sequence only the parts of genes that make molecules used in cells, e.g., exons that are protein coding or generate functional non-coding RNAs after an RNA transcribed from the genome has been spliced. [21] Captured exonic DNA is sequenced and mapped to a reference genome to find differences between a person's genome and the reference. The resulting variants may be causal of disease and are subjected to filtering to identify causal variants. Standard filters reject intronic and intergenic sites as off-target. Successful exome searches have focused on novel variants found in a small number (e.g., 10) of patients with disease, as in [22]. Attempts to use large numbers of patients' exomes to associate variants with disease have failed to yield results.
[0073] Some considerations for this include: (1) Standard statistical treatments require variants to be measured in most cases and controls, but exome sequencing is a random capture process. So analyzable regions are restricted to those that are reliably captured, typically within or very near exons. (2) Standard variant callers require 10x coverage of a variant site to minimize false positive variant calls. Even so, false positives occur because of properties of the genome, e.g., tandem repeats of 2 or 3 nucleotides (e.g., CAGCAGCAG...) or regions rich in G+C.
[0074] Clearly, a physician treating patients needs clear, causal information that applies to a given patient. Various embodiments described herein are designed to identify clinically useful variants through a novel evaluation process. Clinical utility of variants identified as associated with disease drives the invented process.
[0075] For example, one advantage for at least some embodiments is that every variant detected in one or more patients is considered for disease association. In contrast, standard GWAS or exome analysis requires variant alleles to be found in a larger number of patients.
[0076] Another advantage for at least some embodiments is that statistical analysis is applied to sites observed in 25 or more patients, and each site is statistically tested based on its number of observations in the patient cohort. In contrast, standard GWAS methods require uniform numbers of observations for all sites tested, e.g., measurement in 95% of cases and controls.
[0077] Furthermore, another advantage for at least some embodiments is that frequencies calculated from patients are compared to more than one available reference population. In the example, frequencies measured in HPG patients are compared with 1000 Genomes, Phase 1, since it is the most broadly used in the community, and then against the more recent release 1000 Genomes, Phase 3, with restriction to the subset of subjects of similar ancestry, and then against the Exome Sequencing Project, again with restriction to similar ancestry. In contrast, standard GWAS uses control cohorts measured along with the case cohorts; GWAS meta-analysis combines case cohorts for multiple studies into one and compares with one combined control cohort.
[0078] Moreover, another advantage for at least some embodiments is that since the majority of sites measured in patients are concordant with general population frequencies, outliers are identified in two steps that are clinically motivated rather than statistically motivated.
[0079] In a first step, an absolute difference threshold is applied (>0.10, in the example). This recognizes the clinical motivation that, in a well-phenotyped patient population that harbors genetic causes of disease, the frequency of disease-causing variants should be far higher than in general populations. This restricts variants to those that will be clinically significant. This is in contrast to findings in GWAS studies, where frequency deviations may be as small as 2% but have strong p-values. By restricting sites to those with large differences, final sites will be clinically significant.
[0080] In a second step, an odds-ratio and confidence interval are calculated, and the confidence interval lower bound must be above 1.0. Clinicians need strong, clear indications of risk for disease and avoid making treatment decisions based on low confidence data.
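The two clinically motivated steps above can be sketched as follows. The text does not state how the confidence interval is computed, so this sketch assumes the common log odds ratio (Woolf) interval with z = 1.96 for 95%, and adds 0.5 to every cell when any count is zero; both choices, and all function names, are assumptions made for illustration.

```python
import math


def odds_ratio_ci(case_alt, case_ref, pop_alt, pop_ref, z=1.96):
    """Odds ratio with an approximate confidence interval from allele counts."""
    cells = [case_alt, case_ref, pop_alt, pop_ref]
    if any(count == 0 for count in cells):
        # Assumed continuity correction so the ratio and its log stay defined.
        cells = [count + 0.5 for count in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * std_err)
    upper = math.exp(math.log(odds_ratio) + z * std_err)
    return odds_ratio, lower, upper


def passes_clinical_filters(case_alt, case_total, pop_alt, pop_total, min_diff=0.10):
    """Step 1: frequency difference above min_diff. Step 2: CI lower bound above 1.0."""
    case_freq = case_alt / case_total
    pop_freq = pop_alt / pop_total
    if case_freq - pop_freq <= min_diff:
        return False
    _, lower, _ = odds_ratio_ci(
        case_alt, case_total - case_alt, pop_alt, pop_total - pop_alt
    )
    return lower > 1.0
```

For example, a site with 30 alternate alleles among 100 measured patient alleles and 50 among 1000 population alleles has a frequency difference of 0.25 and an odds ratio of about 8.1, so it passes both steps; a site with 8 of 100 against the same population fails the first step.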
[0081] In contrast, GWAS and meta-analysis identify outliers based on p-values and genome-wide significance thresholds, thus accepting as disease-associated variants that do little to explain disease and with little or no clinical utility.
[0082] Another advantage of at least some embodiments is that false positives are minimized through a novel series of filters so that variant detection can be more sensitive. As a result, more variants, including many deep inside introns or upstream of genes in promoter regions can be considered for relationship to disease. Problematic variants are identified in two steps.
(a) First, false variants can emerge from the mapping process. Others have tried to improve mapping. Here, sources of mapping bias are identified directly and captured as two exclusion lists. These lists hold sites for which (i) the reference base is the minor allele in the reference genome used for mapping; and (ii) the alternate allele found in patients is also the minor allele in general populations. In the example, these two exclusion lists eliminated from further consideration 1,188,903 and 127,620 variant sites, respectively.
(b) Second, every candidate variant site is screened against a constructed list of sites genome-wide that have anomalies within the genome region. Such anomalies can introduce false positive variant calls. The approach here relies on three exclusion lists that were constructed to implement three sequence-based filters. These lists hold sites computed to occur within 100-200 bases with (i) GC/AT bias; (ii) replicates elsewhere in the genome; and (iii) tandemly repeated motifs. In the example, the exclusion lists were used to reject 77,149 sites within regions of GC/AT bias, 56,905 sites within sequences repeated elsewhere in the genome, and 124 sites with tandem repeats.
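Two of the three sequence-based filters in (b) lend themselves to a compact sketch: base composition bias and tandemly repeated motifs within the window around a site. The composition cutoffs (0.30 and 0.70) and the minimum of four consecutive motif copies are hypothetical values chosen for illustration, since the text does not give them; the filter for sequences replicated elsewhere in the genome needs a genome-wide comparison and is not shown.

```python
import re

# A motif of 2 or 3 nucleotides followed by at least 3 more copies of itself.
# The minimum copy number is an assumption made for this sketch.
TANDEM_REPEAT = re.compile(r"([ACGT]{2,3})\1{3,}")


def gc_fraction(window):
    """Fraction of G and C bases in the window around a candidate site."""
    window = window.upper()
    if not window:
        return 0.0
    return (window.count("G") + window.count("C")) / len(window)


def has_gc_at_bias(window, low=0.30, high=0.70):
    """Flag windows whose G+C content falls outside the assumed balanced range."""
    fraction = gc_fraction(window)
    return fraction < low or fraction > high


def has_tandem_repeat(window):
    """Flag windows containing a tandemly repeated 2- or 3-nucleotide motif."""
    return TANDEM_REPEAT.search(window.upper()) is not None
```

A site would be placed on the corresponding exclusion list when the 100-200 base window around it is flagged by either function.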
[0083] In contrast, standard exome methods simply do not filter variants directly based on genome sequence properties.
[0084] Another important point is that because problematic variants are filtered directly by direct analysis of genome sequence properties, false variants are minimized before any statistical tests are applied. This allows a lower threshold on the number of reads needed to call a variant. Where other exome interpretation approaches require a minimum of 10 reads, our approach requires a minimum of three. The further a variant is from the exome probes used for capture, the lower its coverage with reads. In the example, variants inside genes but as far as 10,000 bases upstream or downstream of exons were considered for their disease-relatedness. Consequently, the final list of HPG variants includes a large number of intronic variants, which are missed entirely by standard exome analysis methods. In the example, the list of 932 variants remaining after step 15 contains only 75 sites present in the Exome Sequencing Project database, and the final list of 160 sites contains just 23 sites in ESP.
[0085] In contrast, GWAS studies are limited to sites represented on commercial genotyping platforms and do not include variants novel in a patient, and exome studies are limited to sites with uniformly deep coverage across the exome.
[0086] In addition, the focus here is on variants that cause chronic, systemic diseases in the general population at rates higher than, say, 1%, i.e., common diseases. Such variants are unlikely to be novel within patient populations. Otherwise the disease would be far less common. However, combinations of lower frequency variants may together explain disease across a patient population. Here, variants are considered for disease association regardless of their frequency in general populations, and all variants detected in patients are considered.
[0087] In contrast, GWAS studies are limited to sites represented on commercial platforms, and other exome studies have used approaches that focused on novel and rare variants.
2. Methods of Identifying Genes Causing Onset or Affecting Progression or Severity of Disease
[0088] Generally the source material sequences of use in the present methods have been sequenced with high fidelity, e.g., the sequences determined with 4 or fewer mismatches per 100 bases, e.g., with 4, 3 or 2 or fewer mismatches per 100 bases.
[0089] Table 2 provides a summary of steps that can be taken in the inventive methods for the preferred embodiment of POAG. One skilled in the art can vary the order of steps as needed for a particular application. One skilled in the art also can eliminate one or more steps as needed for a particular application. One or more technical, clinical, gene-based, and/or statistical constraints listed in Table 2 (e.g., for genes associated with and/or causative of HPG) are applied for the selection of genes associated with or causative of a disease condition. First, sites are counted if observed as variant either from a reference genome or from other patients. Second, sites are evaluated if reported in a publicly available genome dataset, e.g., 1000G, the primary comparison population. Third, sites are restricted to those observed as variant in 3 or more patients. Fourth, to limit false positive effects due to reference bias during mapping, sites are excluded if the base in the hg19 reference genome was the minor allele base in 1000G. Fifth, sites are included only if the alternate allele remained the minor allele in general populations of similar ethnic descent as the patient cohort. Sixth, sites found to have more than one alternative base are set aside for future consideration. Seventh, eighth and ninth, sites are restricted to those in genome regions with balanced G+C and A+T content; located outside low complexity regions; and located in genome regions without nearly identical, e.g., within 95% identity, paralogs elsewhere. Tenth, any sites located on the X-chromosome or the Y-chromosome are unlikely to contribute to a target disease (e.g., high pressure glaucoma) unless the disease has a clear gender predilection, and therefore can be excluded (e.g., limit selection to genes expressed from chromosomes 1-22). See, Ederer, et al, 1994 [23]. Thus sites on the X and Y chromosomes are excluded from further analysis.
[0090] Next, three constraints based on clinical criteria are applied as prerequisites for association with disease. Eleventh, a SNP site must be observed in enough patients to calculate its importance in disease. Because sequencing does not always capture a given site in all samples, the denominator for frequency calculation for a SNP site becomes twice the number of samples with reads at that site. In varying embodiments, sites are excluded from consideration if they are measured in fewer than 25 patients. Twelfth, a genomic aberration is not likely to be important as a primary cause of a target disease (e.g., high pressure glaucoma) if it occurs with frequency close to that in the normal population. In varying embodiments, sites with patient frequencies within measurement error, e.g., 0.05, of the 1000 Genomes Phase 1 general population frequency are set aside, as are sites with patient frequencies within measurement error of the European subset of the 1000 Genomes Phase 3 subjects. Likewise, sites with patient frequencies within measurement error of the European subset of the Exome Sequencing Project (ESP) are set aside. Thirteenth, in varying embodiments, SNP sites with allele frequencies of greater than the prevalence of the target disease (e.g., high pressure glaucoma, which occurs in about 2 to 4% of the adult general population) in any adult general population used for comparison are excluded. Further, in varying embodiments, sites are kept if their patient allele frequency substantially exceeds general population frequency, e.g., by 0.10 or greater in any adult general population used for comparison.
[0091] Next, two gene-based criteria are applied. Fourteenth, sites outside of a gene or regulatory regions influencing its expression as RNA or protein are excluded from further analysis as off target. Fifteenth, sites within or near genes expressed in tissues relevant to disease are retained.
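The frequency calculation in the eleventh and twelfth constraints above can be sketched as follows. The minimum of 25 measured patients and the 0.05 measurement error come from the text; the function names and the choice to return `None` for under-measured sites are assumptions made for illustration.

```python
def patient_allele_frequency(alt_alleles, samples_with_reads, min_patients=25):
    """Alternate allele frequency at a site, or None if too few patients were measured.

    The denominator is twice the number of samples with reads at the site,
    because each diploid sample contributes two alleles.
    """
    if samples_with_reads < min_patients:
        return None
    return alt_alleles / (2 * samples_with_reads)


def is_concordant(patient_freq, population_freq, measurement_error=0.05):
    """True when the patient frequency is within measurement error of the population."""
    return abs(patient_freq - population_freq) <= measurement_error
```

A site with 30 alternate alleles seen across 50 patients with reads has a patient frequency of 30 / 100 = 0.3, and would be set aside if the comparison population frequency were, say, 0.27.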
[0092] Next, three statistical criteria are applied. Sixteenth, odds ratio and confidence interval are calculated for each site based on the number of patients in whom the site was measured, the number of alternate alleles observed, and the number of measured and alternate alleles in the 1000G Phase 3 database. Sites with a 95% odds ratio confidence interval lower bound above 1.0 are retained. Seventeenth, sites are further retained if their frequency in patients is above a statistical fit of a line to datapoints where X is reference general population frequency and Y is patient frequency. In some embodiments, the fit is performed with a least-squares linear estimate function. Eighteenth, a 2x2 statistical test is applied to obtain p-values. In some embodiments, Fisher's Exact Test is used. Sites are then grouped by the number of patients, N, in which they are measured, and a significance threshold is calculated for each measurement group. In some embodiments, the Bonferroni formula (0.05/N) is used to calculate the threshold maximum p-value to determine significance under multiple testing. SNP sites passing these constraints indicate genes important in the target disease (e.g., high pressure glaucoma, ocular diseases and disorders, Alzheimer's, Parkinson's, Prion Disease (PRNP) and other misfolded protein diseases).
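The seventeenth and eighteenth criteria above can be sketched in plain Python. This is an illustrative sketch, not the disclosed implementation: the two-sided Fisher's Exact Test sums the probabilities of all tables no more likely than the observed one, the line fit is ordinary least squares, and the threshold follows the 0.05/N formula exactly as written, with N supplied by the caller. All function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact Test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed table.
    p_value = sum(
        p for p in map(table_probability, range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def bonferroni_threshold(n, alpha=0.05):
    """Maximum p-value counted as significant for a measurement group, 0.05 / N."""
    return alpha / n


def fit_line(xs, ys):
    """Least-squares slope and intercept for Y (patient) against X (population)."""
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def is_above_fit(x, y, slope, intercept):
    """True when a site's patient frequency lies above the fitted line."""
    return y > slope * x + intercept
```

A site would be retained when it lies above the fitted line and its p-value is at or below the threshold for its measurement group.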
[0093] Analysis of the sequencing data and the diagnosis derived therefrom can be readily performed using various computer executed algorithms and programs, using appropriate software and hardware available to one skilled in the art. Therefore, certain embodiments employ processes involving data stored in or transferred through one or more computer systems or other processing systems. Embodiments disclosed herein also relate to apparatus for performing these operations. This apparatus may be specially constructed for the required purposes, or it may be a general-purpose computer (or a group of computers) selectively activated or reconfigured by a computer program and/or data structure stored in the computer. In some embodiments, a group of processors performs some or all of the recited analytical operations collaboratively (e.g., via a network or cloud computing) and/or in parallel. A processor or group of processors for performing the methods described herein may be of various types including microcontrollers and microprocessors such as programmable devices (e.g., CPLDs and FPGAs) and non-programmable devices such as gate array ASICs or general purpose microprocessors.
[0094] In addition, certain embodiments relate to tangible and/or non-transitory computer readable media or computer program products that include program instructions and/or data (including data structures) for performing various computer-implemented operations. See, for example, WO 2014/080323 for use of non-transitory computer readable or storage media in the genomic context. Examples of computer-readable media include, but are not limited to, semiconductor memory devices, magnetic media such as disk drives, magnetic tape, optical media such as CDs, magneto-optical media, and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM) and random access memory (RAM). The computer readable media may be directly controlled by an end user or the media may be indirectly controlled by the end user. Examples of directly controlled media include the media located at a user facility and/or media that are not shared with other entities. Examples of indirectly controlled media include media that is indirectly accessible to the user via an external network and/or via a service providing shared resources such as the "cloud." Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
[0095] In various embodiments, the data or information employed in the disclosed methods and apparatus is provided in an electronic format. Such data or information may include reads and tags derived from a nucleic acid sample, counts or densities of such tags that align with particular regions of a reference sequence (e.g., that align to a chromosome or chromosome segment), reference sequences (including reference sequences providing solely or primarily polymorphisms), counseling recommendations, diagnoses, and the like. As used herein, data or other information provided in electronic format is available for storage on a machine and transmission between machines. Conventionally, data in electronic format is provided digitally and may be stored as bits and/or bytes in various data structures, lists, databases, etc. The data may be embodied electronically, optically, etc.
3. Identified Biomarkers Causing Onset or Affecting Progression of Primary Open Angle Glaucoma (POAG) or high pressure glaucoma (HPG)
[0096] For the preferred embodiment, biomarkers, including genes and microRNAs, determined to be associated with and/or causative of POAG and/or HPG are provided in Tables 4, 5, and 6. In Table 4, the alternative (ALT) allele is associated with disease. Tables 5 and 6 summarize microRNAs that are overexpressed or underexpressed in tissues from patients having POAG and/or HPG. In varying embodiments, expression of any of the listed biomarkers in Tables 4, 5, and 6 can be determined in the various ocular tissues, including without limitation trabecular meshwork (TM), ciliary body (CB), choroid (CH), optic disk (OD), optic nerve (ON) and retina (RT). Methods known in the art can be used to determine expression levels.
[0097] The POAG/HPG associative and/or causative genes discovered herein (e.g., as summarized in Tables 4, 5, and 6) can be evaluated and/or monitored with genes known to be associated with and/or causative of glaucoma and/or other eye diseases. Prior genome-wide association and linkage-based studies have identified loci with contribution to glaucoma including myocilin, CYP1B1, optineurin, WDR36, TBK1, TBK2, and GALC. Loci contributing to POAG found through GWAS include TMCO1, CAV1/CAV2, CDKN2B-AS1, SIX1/SIX6, TXNRD2, ATXN2, FOXC1, an 8q22 intergenic region, and GAS7. Loci associated with optic disk area, a phenotype relevant to POAG, include ATOH7/PBLD, CDC7/TGFBR3, and SALL1. Loci associated with vertical cup to disk ratio (CDR), a useful measurement to monitor progression of optic neuropathy in POAG, include SCYL1/LTBP3, CHEK2, ATOH7, DCLK1, SIX1/SIX6, CDKN2A/B, and CDKN2B-AS1. Several genes are strongly associated with central corneal thickness (CCT), including FOXO1, COL5A1, ZNF469, AKAP13, AVGR8, and COL8A2; however, recent genetic studies indicate CCT may not be directly associated with susceptibility to POAG. Molecular studies of differential gene expression in tissues relevant to glaucoma revealed genes up- or down-regulated in trabecular meshwork, lamina cribrosa, and optic nerve head astrocytes from glaucomatous eyes compared to eyes without disease. In the latter, among 183 up-regulated and 220 down-regulated genes, a number of genes previously studied in eye disease and development had notable differences in glaucomatous compared to normal astrocytes, including TGFB1, SPARC, POSTN, THBS1, CRTL-1, COL1A1, COL5A1, COL11A1 (up) and FBLN1, DCN, COL18A1 (down). Likewise, studies of differential expression in glaucomatous trabecular meshwork, the eye tissue involved in aqueous outflow, revealed additional genes of interest, as did studies of lamina cribrosa from glaucomatous eyes. The OMIM database of diseases and genes maintained at NCBI aims to provide a comprehensive list of disease-related genes for all human diseases. OMIM reports nine genes directly related to glaucoma. These include five additional genes, FOXC1, LTBP2, NTF4, OPA1, and SBF2, and four genes listed above, CYP1B1, MYOC, OPTN, WDR36. OMIM lists 29 genes indirectly related to glaucoma: APOE, BEST1, BMP4, CA12, CANT1, CNTNAP2, CRB1, EPO, FOXE3, FOXL2, GJA1, GLIS3, ISPD, LMX1B, LOXL1, MTHFR, PAX6, PEX5, PITX2, PITX3, POMT1, RPS19, RRM2B, SLC4A4, TDRD7, TGFB2, TNF, and TTR as well as TMCO1 listed above. The National Eye Institute's EyeGene project maintains a database of genes involved in any eye disease and their variants causing disease. EyeGene reports genes for eye diseases ranging in onset from congenital to late-age, including microphthalmia, retinal degeneration, macular degeneration and various forms of glaucoma. See also, genes discussed in van Koolwijk, et al., 2013 [24], Burdon, et al., 2012 [25], Allingham, et al., 2009 [26]. One skilled in the art can combine prior art knowledge with the inventive features described and claimed herein to address disease.
4. Predicting Onset And/Or Progression And/Or Severity And/Or Recurrence Of Disease
[0098] Another important aspect is a method for predicting onset and/or progression and/or severity and/or recurrence of disease (e.g., primary open angle glaucoma (POAG)) in a subject, the method including receiving allelic information and/or expression levels of a collection of signature biomarkers from a biological sample taken from the subject suspected of developing or suffering a disease such as POAG, wherein said collection of signature biomarkers comprises one or more genes and/or microRNA selected from a group developed using the methods described herein.
[0099] One can then apply the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with onset of POAG; and evaluate an output of said predictive model to predict onset of POAG in said individual.
[0100] One can then also apply the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with progression of POAG; and evaluate an output of said predictive model to predict progression of POAG in said individual.
[0101] One can then also apply the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with severity of POAG; and evaluate an output of said predictive model to predict severity of POAG in said individual.
[0102] One can then also apply the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with recurrence of POAG; and evaluate an output of said predictive model to predict recurrence of POAG in said individual.
[0103] Combinations of onset, progression, severity, and recurrence can be carried out for a particular patient and used for further prognostic, diagnostic, and/or therapeutic steps. Kits can be used for testing of subjects.
EXAMPLES
[0104] The following examples are offered to illustrate, but not to limit the claimed invention.
Example 1
Disease-associated variants in coding and regulatory regions revealed by exome sequencing in high-pressure open-angle glaucoma patients
[0105] In glaucoma, progressive optic nerve degeneration can lead to irreversible vision impairment and eventual blindness, despite treatment. Genetic causes and influences are not yet clear in primary open angle glaucoma (POAG), the most prevalent form of the disease in North America, Europe, and several other parts of the world. The genetics of POAG are complex; to date, no single causative genomic variant has been established as causing the disease. We have studied the genomes of 295 high-pressure POAG (HPG) patients and compared findings with general population observations found in the 1000 Genomes and the Exome Sequencing Projects. We have identified 160 genome
polymorphisms greatly overrepresented in HPG patients compared with general populations. These changes are located in coding and regulatory regions of 140 genes and implicate these genes in HPG. The variants implicating these genes are potential causative factors. For all genes, mRNA expression was detected in ocular tissues. Five of the 140 were already associated with POAG or its phenotypic risk factors. The remaining 135 genes are newly implicated in HPG. These genes and their variants complement a growing list of genes involved in glaucoma found through linkage, genome wide association, and other studies. This opens new avenues for investigation into the genetic, molecular, and biochemical mechanisms of this disease.
METHODS:
[0106] Inclusion and exclusion criteria. The DNA samples for this study are a subset of the de-identified samples from patients enrolled in the NEIGHBOR GWAS. Patients with primary open angle glaucoma (POAG) were enrolled in NEIGHBOR after confirmation of reliable visual field (VF) tests with characteristic defects on two or more tests, or with a single qualifying VF test accompanied by a vertical cup-disc ratio of 0.7 or more in at least one eye. Examination of the ocular anterior segment disclosed no signs of secondary causes for elevated IOP. The approach to the filtration structures in the anterior chamber angle was wide open on gonioscopic examination. All patients selected for the present study had a documented, confirmed history of IOP >22 mm Hg and were classified as HPG [8,27] (Table 1, Demographics). Each NEIGHBOR-enrolled patient gave informed consent at their site of ophthalmic care to donate a blood sample for glaucoma genome investigations. Collaborating physicians obtained blood samples at the site of care and submitted them to NEIGHBOR for DNA preparation, storage, and study-related investigations.
[0107] DNA target enrichment and sequencing. DNA samples from 295 patients were indexed and prepared for deep sequencing on the Illumina HiSeq2000 instrument (Illumina, San Diego, California, USA). Two indexed samples at a time were pooled, and DNA regions that code for proteins genome-wide were enriched using Nimblegen SeqCap EZ version 2 (204 samples) or version 3 (109 samples) (Roche Nimblegen, Madison, Wisconsin, USA). Paired DNA sequences (readpairs) of length 100 bases (2x100) were determined for enriched DNA to generate a minimum of 50 million readpairs per sample. The hg19 reference genome 14 contains 21,210 genes with HUGO identifiers and 464,698 exons annotated in the Refseq database at NCBI. The Nimblegen V2 probes were designed to cover 44,070,352 bases in 392,771 Refseq exons and 18,804 genes with HUGO identifiers. The Nimblegen V3 probes were designed to cover an expanded target region with 64,148,113 bases in 410,269 exons and 19,721 genes.
[0108] Read alignment. The sequence data analysis strategy was designed to minimize false positive observations and focus on SNP sites where nucleotides observed in the patient DNA had differences, either homozygous or heterozygous, from the human reference genome version hg19. Sequence data for human chromosomes 1 through 22, X, Y and mitochondria were downloaded from the UC Santa Cruz Genome Browser (http://genome.ucsc.edu) [28] and prepared as a target for mapping paired reads using the BowTie software [29]. Reads with more than three mismatches to the reference genome or with matches to more than one genome locus were set aside as unmapped for future detection of insertions, deletions and tandem repeat expansions. Figure 1 illustrates the read mapping strategy. Mapped reads were converted from a text-based sequence alignment/map (SAM) format to a binary (BAM) format with Samtools [30].
[0109] Sequence data quality filtering and genotyping. The BAM files for each sample were reviewed to determine whether reads were sufficient to determine genotypes at variant sites across the targeted capture regions. Any sample with insufficient breadth of coverage was excluded from further analysis. This yielded 295 samples with sufficient sequencing (Table 1). Each remaining BAM file was treated as follows: All sequence data were analyzed with respect to the forward strand of the hg19 reference genome. The Samtools "pileup" algorithm 16 was called to extract bases from reads at every sequenced site to produce a list of bases ("pileup") and a consensus base at each site. Each pileup was separated into evidence agreeing with the hg19 reference base and evidence for an alternate base at that site. To call either a reference or alternate base as present in the patient genome, reads were required to be from both forward and reverse DNA strands, with at least three high quality reads per base for the genotype to be considered heterozygous (two or more differing nucleotides) or four high quality reads to be considered homozygous (two copies of one nucleotide). Further, for a heterozygous genotype, the ratio of reads supporting each nucleotide had to be between 0.5 and 2, indicating the reads were balanced between both chromosomes. If this analysis found evidence that supported either the hg19 reference or an alternate base yet did not meet the criteria for a call, the site was designated as "no call" for the sample, and the observation of the site in the patient flagged as "ambiguous". For a given patient, sites with reads in other patients but no reads in this patient were designated as "no call" and flagged as "missed". This process yielded an explicit genotype call, including flagged "no calls", for every sample at every site sequenced in any patient.
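By way of illustration only, the genotype-calling rules of the preceding paragraph can be sketched in Python as follows. This is a minimal sketch and not the code used in the study; the names `BaseEvidence` and `call_genotype` are hypothetical, and the handling of sites where one base meets the threshold while the other has only weak evidence (treated here as "ambiguous") is an assumption, since the text does not spell out that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseEvidence:
    """High-quality read counts supporting one base at one site (hypothetical type)."""
    forward: int  # reads on the forward strand
    reverse: int  # reads on the reverse strand

    @property
    def total(self) -> int:
        return self.forward + self.reverse

    def supported(self, min_reads: int) -> bool:
        # Evidence must come from both strands and reach the read threshold.
        return self.forward > 0 and self.reverse > 0 and self.total >= min_reads


def call_genotype(ref: BaseEvidence, alt: BaseEvidence) -> str:
    """Return 'hom_ref', 'het', 'hom_alt', or a flagged no-call."""
    if ref.total == 0 and alt.total == 0:
        return "no call (missed)"  # no reads in this patient at this site
    if ref.supported(3) and alt.supported(3):
        # Heterozygous: at least three reads per base and a balanced read ratio.
        ratio = alt.total / ref.total
        return "het" if 0.5 <= ratio <= 2.0 else "no call (ambiguous)"
    # Assumption: homozygous only when the other base has no evidence at all.
    if alt.supported(4) and ref.total == 0:
        return "hom_alt"
    if ref.supported(4) and alt.total == 0:
        return "hom_ref"
    return "no call (ambiguous)"  # some evidence, but criteria not met
```

For example, `call_genotype(BaseEvidence(3, 3), BaseEvidence(2, 2))` satisfies the strand, depth, and balance requirements and is called heterozygous, whereas reads from only one strand lead to an "ambiguous" no-call.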
TABLE 1
Demographics
Variable1 Cases
Number 295
Female 54%
Age (years), mean (SD) 62 (±15)
IOP (mm Hg), mean (SD)2 16 (±6)
CDR, mean (SD)2 0.82 (±0.16)
POAG in 1° relatives
Hx Obtained 281
Positive 200
Percent Positive 71%
History of Diabetes 9%
History of Hypertension 43%
1. Abbreviations: IOP=treated intraocular pressure,
CDR=vertical cup to disc ratio, SD=standard deviation
2. Means are mean of both eyes.
[0110] HPG variant identification and annotation. Genome sites from the 295 patients with sufficient sequence data and evidence of difference from hg19 reference were put into a Master Variant Table and submitted to the SeattleSeq Annotation server (available at snp.gs.washington.edu) [31]. The table included every site observed with an allele call different from the reference genome in at least one patient. SeattleSeq returned annotations for each site with gene names, dbSNP database identifiers for known SNPs, whether a SNP changes a protein amino acid, likely impact of the change on the protein using the PolyPhen2 and SIFT2 algorithms [32,33,34], distance to nearest exon-intron splice site, distance to stop codon for SNPs in untranslated regions, distance to nearest gene for intergenic SNPs, relative conservation of DNA around the SNP across mammalian genomes, and any known clinical or disease association. The annotations were added to the Master Variant Table to support further analysis and search for genes associated with HPG.
[0111] HPG allele and zygosity frequencies. Allele and zygosity frequencies were calculated for every site in the Master Variant Table based on the genotype calls for each patient sample. For each SNP site in chromosomes 1 to 22, the observed frequency for an alternate base (a) was determined as the number of heterozygous observations (het) plus twice the homozygous alternate base observations (hom) divided by twice the number of samples (n) that had a genotype call (including homozygous same as reference base) at that site, thus a = (het + 2*hom) / (2*n). This allele frequency, a, became the basis for identification of SNP site alleles potentially overrepresented in HPG patients.
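As an illustration, the allele frequency formula above can be written as a short Python function. This is a sketch only; the function and parameter names are hypothetical.

```python
def alt_allele_frequency(het: int, hom_alt: int, n_called: int) -> float:
    """Alternate allele frequency a = (het + 2*hom) / (2*n).

    het      -- number of samples called heterozygous at the site
    hom_alt  -- number of samples called homozygous for the alternate base
    n_called -- number of samples with any genotype call at the site,
                including homozygous reference calls
    """
    if n_called <= 0:
        raise ValueError("at least one sample must have a genotype call")
    return (het + 2 * hom_alt) / (2 * n_called)
```

For instance, 40 heterozygous and 10 homozygous alternate calls among 100 called samples give (40 + 20) / 200 = 0.30.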
[0112] Comparison with 1000 Genome (1000G) and Exome Sequencing Project (ESP) databases. Comparisons between the HPG data and general population databases were based on the less frequent (minor) allele in the 1000G database for every SNP site identified in the 295 patients. Comparison tables were constructed from public variant tables downloaded from the 1000G server (1000genomes.org/data, ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/, Phase1 Integrated Release Version3_20120430 with 38,248,780 sites, including 14,675,062 sites with frequencies derived from European subpopulations; and Phase 3 Release 20130502 with over 79 million variants, including variants measured in 505 subjects of European descent) [35] and the ESP Exome Variant Server (evs.gs.washington.edu/EVS/, ESP6500 Version 2 with 3,688,361 sites) [36]. These tables include chromosome positions, allele bases, allele frequencies, and supporting information. The minor allele in the 1000G database was identified for every 1000G site. To limit false positives in our analysis due to reference bias inherent in the mapping process, we identified all sites where the hg19 reference base was the minor allele base in 1000G. The HPG and general population frequencies for the 1000G minor alleles were used in all further comparisons to identify sites of interest in HPG patients.
[0113] Application of constraints. Technical, clinical and statistical considerations allowed definition of constraints to apply to the SNP sites identified in the HPG patients (Table 2. Abbreviations: HPG, high pressure primary open angle glaucoma; 1000G, 1000 Genomes Project; ESP, Exome Sequencing Project).
TABLE 2
[0114] Ten constraints for variant identification, exclusion and inclusion were applied as follows. First, sites were counted if observed as variant either from a reference genome or from other patients. Second, sites were evaluated if reported in a publicly available genome dataset, e.g., 1000G, the primary comparison population. Third, sites were restricted to those observed as variant in 3 or more patients. Fourth, to limit false positive effects due to reference bias during mapping, sites were excluded if the base in the hg19 reference genome was the minor allele base in 1000G. Fifth, sites were included only if the alternate allele remained the minor allele in general populations of similar ethnic descent as the patient cohort. Sixth, sites found to have more than one alternative base were set aside for future consideration. Seventh, eighth and ninth, sites were restricted to those in genome regions with balanced G+C and A+T content; located outside low complexity regions; and located in genome regions without nearly identical, e.g., within 95% identity, paralogs. Tenth, any sites located on the X-chromosome are unlikely to contribute to HPG because this disease has no clear gender predilection; X and Y chromosome sites were excluded.
[0115] Next, three constraints based on clinical criteria were applied as prerequisites for association with disease. Eleventh, a SNP site must be observed in enough patients to calculate its importance in disease. Because sequencing does not always capture a given site in all samples, the denominator for frequency calculation for a SNP site becomes twice the number of samples with reads at that site. Sites were excluded from consideration if they were measured in fewer than 25 patients. Twelfth, sites with patient frequencies within measurement error, e.g., 0.05, of the 1000 Genomes Phase 1 general population frequency were set aside, as were sites with patient frequencies within measurement error of the European subset of the 1000 Genomes Phase 3 subjects. Likewise, sites with patient frequencies within measurement error of the European subset of the Exome Sequencing Project (ESP) were set aside. Thirteenth, since POAG occurs in about 2 to 4% of the adult general population, sites were kept if their patient allele frequency substantially exceeded general population, e.g., by 0.10 or greater in a comparison adult general population.
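The eleventh through thirteenth constraints can be sketched, for illustration only, as a simple filter. The function name and constants below are hypothetical, and the thresholds are the example values given in the text (25 patients, 0.05, 0.10). The sketch assumes that the thirteenth constraint is satisfied when the patient frequency exceeds at least one comparison population by 0.10 or more; a stricter reading would require this for every comparison population.

```python
MIN_PATIENTS = 25          # Constraint 11: minimum number of patients measured
MEASUREMENT_ERROR = 0.05   # Constraint 12: example measurement error
MIN_EXCESS = 0.10          # Constraint 13: example minimum excess over population


def passes_clinical_constraints(n_measured, hpg_freq, population_freqs):
    """Apply Constraints 11-13 to one SNP site.

    n_measured       -- number of patients with a genotype call at the site
    hpg_freq         -- alternate allele frequency in the patients
    population_freqs -- frequencies of the same allele in the comparison
                        populations (e.g., 1000G Phase 1, 1000G Phase 3
                        European subset, ESP European subset)
    """
    if n_measured < MIN_PATIENTS:
        return False
    # Constraint 12: set aside sites within measurement error of any population.
    if any(abs(hpg_freq - f) <= MEASUREMENT_ERROR for f in population_freqs):
        return False
    # Constraint 13 (assumed reading): substantial excess over a comparison population.
    return any(hpg_freq - f >= MIN_EXCESS for f in population_freqs)
```

A site measured in 100 patients with a patient frequency of 0.37 and comparison frequencies of 0.192, 0.20, and 0.21 would be retained under this sketch.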
[0116] Next, two gene-based criteria were applied. Fourteenth, sites outside of a gene or regulatory regions influencing its expression as RNA or protein were excluded from further analysis as off target. Fifteenth, sites within or near genes expressed in tissues relevant to disease were retained. Figures 2-4 illustrate gene expression in ocular tissues.
[0117] Next, three statistical criteria were applied. Sixteenth, odds ratio and confidence interval were calculated for each site based on number of patients in whom the site was measured, the number of alternate alleles observed, and the number of measured and alternate alleles in the 1000G Phase 3 database. Sites with a 95% odds ratio confidence interval lower bound above 1.0 were retained. Seventeenth, sites were further retained if their frequency in patients exceeded a least squares linear regression fit of datapoints where X was reference general population frequency and Y was patient frequency. Eighteenth, a 2x2 Fisher's Exact Test was applied to obtain p-values. Sites were grouped by the number of patients, N, in which they had been measured, and a significance threshold was calculated for each measurement group using the Bonferroni formula (0.05/N) to correct for multiple testing. SNP sites passing these constraints indicate genes important in HPG.
[0118] Gene Expression in Ocular Tissues. To measure gene expression in six ocular tissues (retrobulbar optic nerve, optic disc, retina, choroid, ciliary body, and trabecular meshwork), we performed whole transcriptome sequencing (RNA-seq) of tissues dissected from 5 fresh donor human autopsy eyes, two with history of primary open angle glaucoma and three without. RNA was extracted, fragmented to ~200 bp (basepairs), ligated with Adaptor Mix, converted to cDNA with ArrayScript Reverse Transcriptase (Ambion), size selected (~200 bp) by gel electrophoresis, and PCR amplified with adaptor primers. Deep sequencing was done on an Illumina HiSeq 2000. Differential gene expression analysis was done with TopHat and CuffLinks. See, e.g., Trapnell, et al., Nat Biotechnol. (2013) 31(1):46-53. For each tissue, reads were pooled and mapped to the hg19 reference genome. Reads per kilobase of exon per million mapped reads (RPKM) were calculated for each gene and used as an estimate of expression.
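For illustration, the RPKM measure named above follows the standard definition: reads mapped to a gene, divided by the exon length in kilobases and by the total mapped reads in millions. The Python sketch below shows that arithmetic only; it is not the TopHat/CuffLinks pipeline, and the function and parameter names are hypothetical.

```python
def rpkm(gene_reads: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon per million mapped reads.

    gene_reads         -- reads mapped to the exons of one gene
    exon_length_bp     -- summed exon length of the gene, in base pairs
    total_mapped_reads -- all mapped reads in the (pooled) tissue sample
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    return gene_reads * 1e9 / (exon_length_bp * total_mapped_reads)
```

For example, 500 reads on a gene with 2,000 bp of exon in a sample of 25 million mapped reads correspond to an RPKM of 10.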
RESULTS:
[0119] Demographics. The genomes of 295 patients with HPG were the focus of this study (Table 1). Females constituted 54%. The mean age at diagnosis was 62 (±15 SD, range 30 to 94) years. Treated mean IOP at the time of blood sampling was 16 (±6, range 4 to 32) mmHg. The mean of the vertical cup-disc ratio was 0.82 (±0.16, range 0.30 to 1.00). There was a self-reported history of open-angle glaucoma in a 1st degree relative in 69 percent, of Type 2 diabetes in 9 percent and a history of hypertension in 43 percent of patients in the present study.
[0120] HPG target enriched sequencing, alignment, and annotation. Of the 295 samples analyzed, 105 were captured with Nimblegen V3 and 190 with Nimblegen V2.
[0121] Identification of glaucoma-related SNP sites and genes. The initial review of the sequencing data disclosed 4,267,157 sites in the HPG patients that differed from the hg19 reference genome in any patient. A series of constraints were applied to identify the SNP sites in or near exons wherein an alternate allele was over-represented in HPG patients.
[0122] First, a series of ten constraints identified, included or excluded variants. Of the sequenced sites, 4,267,157 were variant in 1 or more HPG patients (Constraint 1). This number fell to 4,032,533 upon limiting to sites found in the 1000G public database (Constraint 2). Of these, 2,748,984 were variant in 3 or more HPG patients (Constraint 3). Some of the sites in the reference genome had the minor allele in the comparison database, 1000G, potentially causing reference bias during analysis, and were eliminated from consideration; 1,560,081 sites had the major allele as the reference base (Constraint 4). For some sites, the alternate allele, although minor in the 1000G Phase 1 general population, became the major allele in the European population; these sites were eliminated, yielding 1,432,461 sites (Constraint 5). Next, 1,423,956 of the sites remaining after the previous constraint had no more than one alternate allele in the HPG patients (Constraint 6). Of these, 1,350,492 had balanced G+C content (Constraint 7); 1,350,455 were located outside low complexity regions (e.g., tandem repeats) (Constraint 8); and 1,302,588 had no identical or nearly identical paralogs (Constraint 9). After restricting sites to Chromosomes 1 - 22 (Constraint 10), 1,279,295 sites remained.
[0123] Second, a series of five constraints based on clinical criteria were applied as prerequisites for association with disease. The number of sites fell to 455,413 when restricted to those measured in at least 25 of the HPG patients (Constraint 11). Next, 40,860 remained after restriction to those with alternate allele frequencies in HPG patients that differed more than 0.05 from 1000G Phase 1 frequencies (Constraint 12a). Of these, 8,336 also exceeded by more than 0.05 the frequencies measured in the European subset of 1000G Phase 3 (Constraint 12b); 7,985 exceeded by more than 0.05 the frequencies measured in the European subset of the Exome Sequencing Project (Constraint 12c). To minimize false positives, sites were further restricted to those with frequencies that exceeded by more than 0.10 the frequencies in any of the comparison databases, leaving 2,235 sites (Constraint 13).
[0124] Third, two gene-based criteria further restricted sites. 1,408 sites remained when sites between genes were removed because intergenic sites found in sequencing are off target (Constraint 14). Of these, 933 were in genes detected as expressed in ocular tissues in associated laboratory studies (Constraint 15).
[0125] Fourth, we applied three statistical filters. Odds ratio and confidence interval were calculated for each site based on number of patients in whom the site was measured, the number of alternate alleles observed, and the number of measured and alternate alleles in the 1000G Phase 3 database. 506 sites had a 95% odds ratio confidence interval lower bound above 1.0 (Constraint 16). Data were fit with least squares linear regression to identify all sites above the fitted line, leaving 199 sites (Constraint 17). A 2x2 Fisher's Exact Test was applied to obtain p-values. Sites were grouped by the number of patients, N, in which they had been measured, and a significance threshold was calculated for each measurement group using the Bonferroni formula (0.05/N) to correct for multiple testing. A final 160 sites remained significant after correction for multiple testing (Constraint 18). A total of 140 genes contained the 160 sites. See, Table 2. Variant sites evaluated in Constraints 13-18 are shown in Table 2.
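The odds ratio, Fisher's exact test, and Bonferroni threshold named in Constraints 16 and 18 can be sketched in Python using only the standard library, as shown below. This is an illustrative sketch, not the analysis code of the study. The function names are hypothetical; the use of the Woolf (log odds ratio) method for the 95% confidence interval and the "sum of tables no more probable than the observed table" rule for the two-sided p-value are assumptions, since the text does not state which variants were used. All four cell counts must be nonzero for the confidence interval.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf method, an assumption).

    a, b -- alternate and other allele counts in patients
    c, d -- alternate and other allele counts in the comparison population
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value of Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x alternate alleles in the patient row.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = prob(x)
        if p <= p_obs * (1 + 1e-7):  # tolerance for floating-point ties
            p_value += p
    return min(1.0, p_value)


def bonferroni_threshold(n, alpha=0.05):
    """Significance threshold 0.05/N, with N as defined in the text."""
    return alpha / n
```

Under this sketch, a site would be retained when the lower confidence bound returned by `odds_ratio_ci` exceeds 1.0 and the p-value from `fisher_exact_two_sided` falls below `bonferroni_threshold(N)` for its measurement group.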
[0126] For sites remaining after filtering, 53 (33%) each occurred in 25 to 49 of the HPG patients, and 107 (67%) each occurred in at least 50 of the HPG patients. Due to fluctuation in DNA capture efficiency, sites located in introns farther from exon splice sites tended to have smaller numbers of observations.
[0127] The 160 SNP sites are found in 140 genes. While 12 genes contained 2 SNP sites and 4 genes contained 3 SNP sites, 124 of the 140 genes contained a single SNP site. The genes are distributed across the genome. See, Tables 3 and 4. The nomenclature and sequence identification of these genes and other biomarkers described herein are known in the art and incorporated herein by reference (e.g., HUGO Gene Nomenclature Committee, National Center for Biotechnology Information, NCBI; GenBank accession numbers).
[0128] These constraints reduced the number of SNP sites that are potentially more important in identifying genes that cause HPG from over 4 million sites to 160 sites in 140 genes. During filtering many SNP sites were set aside for further analysis.
TABLE 3
Properties of 140 genes and 160 SNP sites
Gene properties
a. 124 w/ 1 SNP site
12 w/ 2 SNP sites
4 w/ 3 SNP sites
SNP site properties
b. 23 Codon
118 Intron
13 utr-3p
4 utr-5p
2 utr-NC
c. 12 Missense
11 synonymous
d. 84 intron, within 500 bp of splice site
34 intron, >500 bp from splice site
SNP site distance distributions
e. 140 1st SNP site in gene
1 SNP site adjacent to a 1st site in gene
3 SNP sites 2 - 3 bp of 1st site in gene
9 SNP sites 4 - 55 bp of 1st site in gene
2 SNP sites 150 - 250 bp of 1st site in gene
f. 24 SNP sites within 100,000 bp of prior site
Gene annotations, 85 genes (49 in multiple categories)
g. 51 Cell cycle, apoptosis, proliferation
33 Neural-related
30 Adhesion
28 Immune-related
19 Transcription factor or RNA binding
14 Mitochondrial
11 Ocular
h. 5 Prior glaucoma-related
i. 1 Prior glaucoma-related & neural & immune
1 Prior glaucoma-related & retinal
a. SNPs per gene, b. location in gene, c. codon effect, d. distance to splice site, e. proximal SNPs within genes, f. proximal SNPs in adjacent genes, g. genes with functions relevant to glaucoma, h. prior glaucoma related genes, i. glaucoma related and relevant functions.
TABLE 4
160 SNP sites identifying 140 genes as risk variants for high pressure glaucoma (HPG)
4a. 134 of 140 Genes: strongest SNP site
1 145,015,877 G T rs77741369 PDE4DIP codon missense 18 0.192 0.370 2.45 1.84 - 3.25 1.30E-09
3 195,506,914 G A rs186560307 MUC4 2 codon missense 79 0.154 0.397 3.60 2.58 - 5.01 1.70E-13
11 22,271,870 A T rs7481951 ANO5 codon missense 48 0.349 0.576 2.54 1.92 - 3.35 3.90E-11
11 117,789,345 G C rs61900347 TMPRSS13 2 codon missense 209 0.100 0.443 7.17 5.25 - 9.79 1.60E-35
14 105,415,748 G A rs118171013 AHNAK2 codon missense 5,389 0.301 0.544 2.75 2.09 - 3.60 2.10E-13
15 74,336,633 T C rs5742915 PML codon missense 67 0.226 0.460 2.90 2.19 - 3.83 1.40E-13
19 7,935,879 G A rs12984448 FLJ22184 codon missense 1,279 0.071 0.263 4.55 2.78 - 7.44 2.40E-08
1 87,045,902 A T rs1932809 CLCA4 codon synonym 278 0.231 0.677 6.94 5.14 - 9.35 6.10E-40
1 181,759,614 A C rs35611740 CACNA1E codon synonym 34 0.013 0.198 18.95 10.00 - 35.89 1.20E-23
2 216,973,904 C A rs1647764 XRCC5 codon synonym 91 0.054 0.205 4.48 2.62 - 7.63 3.90E-07
3 122,642,590 G A rs2276778 SEMA5B codon synonym 10 0.432 0.640 2.31 1.75 - 3.04 1.90E-09
4 140,810,700 C T rs11729794 MAML3 codon synonym 190 0.259 0.482 2.63 2.00 - 3.45 5.60E-12
7 5,352,659 G T rs138591330 TNRC18 codon synonym 526 0.348 0.680 3.99 2.39 - 6.65 4.10E-08
9 79,318,378 G A rs13290609 PRUNE2 3 codon synonym 126 0.328 0.700 4.79 3.55 - 6.44 5.00E-27
16 70,726,795 C A rs2278983 VAC14 codon synonym 72 0.283 0.471 2.26 1.71 - 2.98 1.00E-08
17 5,085,389 C T rs148322165 ZNF594 codon synonym 2,183 0.021 0.260 16.56 9.17 - 29.89 1.30E-19
20 56,137,834 A G rs1062601 PCK1 codon synonym 83 0.322 0.512 2.19 1.66 - 2.87 1.60E-08
4b. 6 of 140 genes with >90% identical paralog, strongest SNP site (5 with 1 SNP)
2 98,166,215 C G rs6750205 ANKRD36B 2 intron 136 0.470 0.691 2.50 1.88 - 3.31 6.20E-11
9 15,883 A G rs141156662 WASH1 7 intron 24 0.192 0.500 4.27 2.86 - 6.35 3.60E-12
16 18,531,692 T C rs137862509 NOMO2 2 intron 225 0.225 0.543 4.10 2.25 - 7.46 7.50E-06
18 14,529,901 A G rs62081684 POTEC 2 intron 579 0.303 0.580 3.45 1.53 - 7.76 3.30E-03
19 54,724,457 T C rs180678650 LILRB3 1 2 codon missense 60 0.168 0.673 10.21 5.39 - 19.32 2.20E-13
19 54,778,909 A G rs117474097 LILRB2 2 intron 224 0.029 0.263 11.96 7.50 - 19.06 2.00E-27
4c. 2nd SNP site for 16 of 140 genes
2 26,618,543 G A rs7092 EPT1 2 utr-3p 6,470 0.282 0.564 3.31 2.18 - 5.02 2.40E-08
3 56,599,208 G C rs55831745 CCDC66 2 intron 1,053 0.283 0.582 3.54 2.28 - 5.48 1.80E-08
3 195,505,907 T G rs138720131 MUC4 2 codon missense 161 0.175 0.344 2.46 1.74 - 3.46 7.50E-07
6 32,548,122 C G rs4448132 HLA-DRB1 2 intron 73 0.130 0.338 3.35 2.12 - 5.28 8.60E-07
9 41,954,776 C G rs28706047 MGC21881 2 utr-NC 28 0.322 0.573 2.78 1.86 - 4.15 7.00E-07
9 79,318,381 A T rs113471142 PRUNE2 3 codon synonym 129 0.237 0.593 4.66 3.41 - 6.36 1.60E-22
11 1,971,918 A C rs217200 MRPL23 2 intron 209 0.178 0.530 4.89 2.42 - 9.86 1.70E-05
11 117,789,327 T C rs61900346 TMPRSS13 2 codon missense 204 0.187 0.428 3.23 2.42 - 4.29 2.20E-15
12 81,231,631 A G rs12318213 LIN7A 2 intron 4,277 0.309 0.523 2.34 1.26 - 4.31 7.00E-03
16 7,008,029 C T rs12917775 RBFOX1 2 intron 94,027 0.185 0.440 3.45 1.54 - 7.72 3.40E-03
17 4,859,156 C T rs146545678 ENO3 3 intron 79 0.156 0.378 3.25 2.26 - 4.66 1.10E-09
17 44,340,253 A G rs113507264 LRRC37A 3 intron 12,532 0.249 0.500 3.23 1.53 - 6.78 2.10E-03
17 75,212,491 C A rs1130549 SEC14L1 3 utr-3p 138 0.177 0.349 2.48 1.77 - 3.46 3.20E-07
19 53,122,146 A C rs35489438 ZNF83 2 intron 41 0.112 0.304 3.48 2.40 - 5.02 1.90E-10
19 54,724,458 A G rs185399462 LILRB3 1 2 codon missense 61 0.284 0.684 5.20 2.76 - 9.77 1.40E-07
22 45,595,265 C G rs78963667 KIAA0930 2 intron 487 0.130 0.318 3.09 2.08 - 4.56 7.20E-08
4d. 3rd SNP site for 4 of 140 genes
9 79,318,384 C A rs199893827 PRUNE2 3 codon missense 132 0.141 0.334 3.02 2.06 - 4.42 5.90E-08
17 4,859,134 C T rs117488294 ENO3 3 intron 101 0.248 0.500 3.07 2.05 - 4.58 6.10E-08
17 44,326,845 A G rs118111151 LRRC37A 3 intron 1,318 0.263 0.447 2.26 1.17 - 4.34 1.56E-02
17 75,212,489 T C rs62079472 SEC14L1 3 utr-3p 140 0.166 0.339 2.56 1.80 - 3.62 3.90E-07
Abbreviations: Gene identifiers were obtained from the HUGO Gene Nomenclature Committee (genenames.org), with data downloaded from ftp://ftp.ebi.ac.uk/pub/databases/genenames/new/tsv/hgnc_complete_set.txt [37]. HPG, high-pressure primary open-angle glaucoma; CHR, chromosome; REF, hg19 reference base; ALT, alternate base observed in HPG patients; dbSNP, NCBI identifier for SNP site; missense, site position in codon, amino acid changes in sequence translated from mRNA upon replacement of REF base with ALT base; synonym, site position in codon, no change in amino acid sequence translated from mRNA upon replacement of REF base with ALT base; utr-3p, transcribed but untranslated region (UTR) of mRNA in final (3') exon; utr-5p, UTR in first (5') exon; utr-NC, UTR in internal exon; SS DIST, distance to splice site; OR, odds ratio; Conf. Int., confidence interval; pValue, p-value for the test that HPG and KG allele distributions are not different.
TABLE 5
43 microRNAs differentially regulated in glaucomatous optic nerve (GON) vs normal optic nerve (ON) and targeting HPG genes, with microRNA name and the mature arm with strongest differential expression. Groups 1 and 2: 11 microRNAs elevated in GON. Groups 3 and 4: 11 microRNAs decreased in GON. Groups 5 and 6: 21 microRNAs present in ON and absent or very low in GON. microRNA names: miRBase [38]; Ambros et al., 2003 [39].
microRNA        Stronger arm       GON level  ON level   RT level   log2 ON g/ON n   log2 ON n/RT n
Group 1 (8 up, GON » ON; ON » RT)
hsa-miR-130a    hsa-miR-130a-3p    46,075     21,770     14,388     1                1
hsa-miR-1246    hsa-miR-1246       1,990      1,093      421        1                1
hsa-miR-214     hsa-miR-214-3p     3,655      2,018      32         1                6
hsa-miR-452     hsa-miR-452-5p     1,258      381        41         2                3
hsa-miR-224     hsa-miR-224-5p     741        306        49         1                3
hsa-miR-4448    hsa-miR-4448       720        195        27         2                3
hsa-miR-483     hsa-miR-483-5p     1,365      353        1          2                7
hsa-miR-483     hsa-miR-483-3p     505        231        1          1                7
Group 2 (3 up, GON » ON; RT » ON)
hsa-miR-9       hsa-miR-9-3p       12,281     5,817      57,284     1                -3
hsa-miR-767     hsa-miR-767-5p     290        83         890        2                -3
hsa-miR-449a    hsa-miR-449a       376        1          27         8                -4
Group 3 (7 down, GON « ON; ON » RT)
hsa-miR-100     hsa-miR-100-5p     260,330    403,037    50,309     -1               3
hsa-miR-219     hsa-miR-219-5p     92,592     226,290    454        -1               9
hsa-miR-219     hsa-miR-219-2-3p   80,029     121,995    23         -1               12
hsa-miR-99b     hsa-miR-99b-5p     77,728     123,776    37,955     -1               2
hsa-miR-139     hsa-miR-139-5p     1,067      2,434      1,077      -1               1
hsa-miR-323b    hsa-miR-323b-3p    466        1,772      913        -2               1
hsa-miR-3613    hsa-miR-3613-3p    268        493        44         -1               3
Group 4 (4 down, GON « ON; RT » ON)
hsa-miR-124     hsa-miR-124-3p     8,155      21,083     90,984     -1               -2
hsa-miR-129     hsa-miR-129-5p     1,281      3,925      11,942     -2               -2
hsa-miR-211     hsa-miR-211-5p     279        1,283      20,755     -2               -4
hsa-miR-124     hsa-miR-124-5p     165        316        18,110     -1               -6
Group 5 (16 off in GON; ON » RT)
hsa-miR-34b     hsa-miR-34b-3p     0          125        0          -7               7
hsa-miR-3182    hsa-miR-3182       0          106        0          -7               7
hsa-miR-4640    hsa-miR-4640-3p    0          99         0          -7               7
hsa-miR-2276    hsa-miR-2276       0          74         0          -6               6
hsa-miR-4423    hsa-miR-4423-5p    0          65         0          -6               6
hsa-miR-2277    hsa-miR-2277-3p    0          65         0          -6               6
hsa-miR-1250    hsa-miR-1250       0          199        21         -8               3
hsa-miR-1226    hsa-miR-1226-3p    0          153        55         -7               1
hsa-miR-18a     hsa-miR-18a-3p     0          143        31         -7               2
hsa-miR-4677    hsa-miR-4677-3p    0          111        35         -7               2
hsa-miR-513c    hsa-miR-513c-5p    0          107        15         -7               3
hsa-miR-138     hsa-miR-138-2-3p   0          99         29         -7               2
hsa-miR-548ah   hsa-miR-548ah-3p   0          99         29         -7               2
hsa-miR-505     hsa-miR-505-5p     0          87         24         -6               2
hsa-miR-193b    hsa-miR-193b-5p    0          60         27         -6               1
hsa-miR-18b     hsa-miR-18b-5p     0          46         26         -6               1
Group 6 (5 off in GON; RT » ON)
hsa-miR-3117    hsa-miR-3117-3p    0          132        659        -7               -2
hsa-miR-30b     hsa-miR-30b-3p     0          124        747        -7               -3
hsa-miR-105     hsa-miR-105-5p     0          111        648        -7               -3
hsa-miR-19b     hsa-miR-19b-1-5p   0          71         152        -6               -1
hsa-miR-376a    hsa-miR-376a-5p    0          67         168        -6               -1
RT, RT n, retina; ON, ON n, optic nerve; GON, ON g, glaucomatous optic nerve; A » B, level in A significantly higher than B.
TABLE 6.
18 microRNAs differentially regulated in glaucomatous vs normal optic nerve and targeting HPG genes, with microRNA name and the mature arm with strongest differential expression, evaluated through maximum and total expression levels. Group 1: 13 microRNAs elevated in GON, lower or absent in RT. Group 2: 4 microRNAs decreased in GON, lower in RT.
microRNA        Stronger arm       Max ON g level  Max ON n level  log2 ON g/ON n  Total ON levels  Total RT levels  log2 ON n/RT n
Group 1 (13 up, GON » ON; ON » RT)
hsa-let-7c      hsa-let-7c         263,645         188,920         0.48            811,062          174,457          2.22
hsa-miR-1248    hsa-miR-1248       2,582           1,069           1.27            5,046            708              2.83
hsa-miR-574     hsa-miR-574-5p     956             660             0.53            2,845            210              3.75
hsa-miR-27a     hsa-miR-27a-5p     161             66              1.27            271              24               3.44
hsa-miR-145     hsa-miR-145-3p     116             75              0.62            230              0                7.85
hsa-miR-5584    hsa-miR-5584-5p    97              71              0.44            229              0                7.84
hsa-let-7a-2    hsa-let-7a-2-3p    107             55              0.95            169              25               2.71
hsa-miR-549     hsa-miR-549        161             0               7.34            129              0                7.02
hsa-miR-675     hsa-miR-675-3p     161             0               7.34            129              0                7.02
hsa-miR-148a    hsa-miR-148a-3p    42,129          26,930          0.65            107,405          81,283           0.40
hsa-miR-455     hsa-miR-455-5p     1,203           616             0.96            3,180            2,419            0.39
hsa-miR-31      hsa-miR-31-5p      10,311          5,767           0.84            26,198           77,611           -1.57
hsa-miR-216a    hsa-miR-216a       524             330             0.67            1,355            5,990            -2.14
Group 2 (4 down, GON « ON; ON » RT)
hsa-miR-181b    hsa-miR-181b-5p    96,524          115,563         -0.26           400,963          283,331          0.50
hsa-miR-545     hsa-miR-545-5p     0               83              -6.39           110              25               2.10
hsa-miR-3622a   hsa-miR-3622a-5p   0               97              -6.61           112              16               2.73
hsa-miR-548ah   hsa-miR-548ah-5p   0               69              -6.13           86               47               0.86
RT, RT n, retina; ON, ON n, optic nerve; GON, ON g, glaucomatous optic nerve; A » B, level in A significantly higher than B.
microRNA names: miRBase [38] and Ambros et al., 2003 [39].
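The log2 columns in Tables 5 and 6 can be approximately reproduced from the tabulated levels. The following is a minimal sketch, not the procedure used to build the tables: the function name `log2_ratio` is hypothetical, and the pseudocount of 1 added to both levels is an assumption inferred from the tabulated values (for example, a level of 161 against 0 gives 7.34), since the text does not state how zero levels were handled.

```python
import math

def log2_ratio(level_a, level_b, pseudocount=1.0):
    """Return log2((level_a + pseudocount) / (level_b + pseudocount)).

    The pseudocount is an assumption; it keeps the ratio finite when a level is 0.
    """
    return math.log2((level_a + pseudocount) / (level_b + pseudocount))

# Maximum ON g and ON n levels copied from Table 6.
examples = {
    "hsa-let-7c": (263645, 188920),
    "hsa-miR-549": (161, 0),
    "hsa-miR-545-5p": (0, 83),
}

for arm, (on_g, on_n) in examples.items():
    print(arm, round(log2_ratio(on_g, on_n), 2))
```

Because the tabulated levels appear to be rounded, some recomputed ratios may differ from the table in the last decimal place.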
[0129] Inhibitory nucleic acids or small inhibitory nucleic acids (siNAs) can be used in therapy treatments in combination with measurement of expression levels. For example, Tables 5 and 6 list microRNA differentially expressed in glaucomatous optic nerve (GON) versus normal optic nerve (ON or NON). microRNA underexpressed in GON can be neuroprotective when administered to a glaucoma patient. Targeting microRNA overexpressed in NON, e.g., with an inhibitory nucleic acid, can be neuroprotective to a glaucoma patient. Conversely, microRNA underexpressed in GON can be pathological and thus targeted, e.g., with an inhibitory nucleic acid, in a glaucoma patient; microRNA overexpressed in NON can be neuroprotective when administered to a glaucoma patient.
DISCUSSION
[0130] Primary open-angle glaucoma (POAG) is clinically and genetically complex and enigmatic. Clinically, it is usually bilateral, though it may be asymmetric. Patients develop elevated intraocular pressure (IOP) due to disturbed aqueous humor dynamics: outflow of the nutrient-containing aqueous humor from the eye is hampered, while the rate of aqueous production remains nearly constant regardless of the steady-state IOP. Sustained, above-normal levels of IOP constitute the largest risk factor for developing characteristic damage to visual function, the clinical basis for glaucoma diagnosis. This damage affects the retinal ganglion cells, their axons, and the optic nerve in a diagnostic manner. Of clinical interest, not all eyes that have sustained elevated IOP develop damaged visual function (this is called ocular hypertension); and not all eyes that develop characteristic glaucomatous visual function damage have elevated IOP. Ten percent of untreated patients included in the Ocular Hypertension Treatment Study, enrolled with a sustained IOP elevation of 32 mmHg or less, developed glaucoma within 5 years, and 90 percent did not [40]. Thus, structures in the front of the eye and different structures in the posterior pole of the eye and its optic nerve are clinically separately impacted. This suggests the majority of high-pressure POAG (HPG) cases involve separate sets of causative gene alterations, one set for the anterior segment and the other set for the posterior segment of the eye. In addition, a history of POAG in a close family member doubles the risk of a person developing the disease.
[0131] In this study, exome sequencing was used to investigate DNA variants in exons genome-wide from 295 Caucasian, high-pressure POAG patients whose genomes were previously evaluated in the NEIGHBOR GWAS. Our analysis strategy minimized false positive observations, focused on single nucleotide polymorphisms (SNPs), and compared frequencies of variants found in the POAG cases to frequencies in the ESP and 1000G databases. Further analysis of SNPs with POAG frequencies that differed significantly from 1000G or ESP identified genes of interest, grouped by number of SNPs with frequency differences and maximum frequency difference.
[0132] In the part of the genome sequence containing exons and nearby bases, we found nearly 3 million SNP sites with bases different from the hg19 reference genome in three or more HPG patients. These sites were also found as SNP sites in the comparative general population databases, specifically, in 1000G Phase 1, in the European subset of 1000G Phase 3, in ESP, and in the European subset of ESP. The HPG variant sites were calculated directly from the sequence data and then compared. The high level of consistency with the public databases indicates that the alignment and variant calling methods used to process patient sequence data were accurate.
[0133] For all sites that differed from hg19 in three or more patients, we revisited the exome sequence data, and for every patient inspected whether the data supported each possible allele, including the reference hg19 base, the most frequent alternative (non-hg19) allele, and any additional allele observed by us or others at that site. We calculated that fewer than 5% of the hg19 reference bases in 1000G SNP sites were the minor, less frequent allele. Since genotype calls for those sites using exome sequence data may be biased toward the minor (same as hg19) allele, we set those sites aside for future consideration. We further calculated what minimum number of patient observations at a given site would be needed to obtain allele frequencies between 0.00 and 1.00 with 0.02 increments; that number was 25 patients (with two alleles per patient, 25 patients give 50 alleles, so one allele corresponds to a frequency step of 0.02). We therefore also set aside any SNP site that was measured in fewer than 25 patients.
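The 25-patient minimum follows from simple arithmetic on allele counts. The sketch below illustrates it under the assumption that each patient contributes two observed alleles at a site (a diploid, autosomal site); the function name `min_patients_for_resolution` is hypothetical and is not taken from the study.

```python
import math

def min_patients_for_resolution(step, alleles_per_patient=2):
    """Smallest number of patients whose pooled alleles resolve frequency steps of `step`."""
    # A frequency step of `step` needs at least 1 / step alleles; the small
    # tolerance guards against floating-point error in the division.
    alleles_needed = math.ceil(1.0 / step - 1e-9)
    return math.ceil(alleles_needed / alleles_per_patient)

print(min_patients_for_resolution(0.02))  # 25
```

With a step of 0.02 this gives 50 alleles, that is, 25 patients, matching the threshold stated above.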
[0134] When we compared the HPG patient minor allele frequencies for each remaining site with the 1000G database frequencies for that site, we found that, after the filters were applied, the HPG patients had a number of sites where the general population minor allele was over-represented. These sites provide pointers to locations within the genome where polymorphisms occur at disproportionate rates in HPG patients. Next, having determined how many HPG patients had the minor allele in comparison with the normative database, we were able to identify the SNP sites that are vastly over-represented in HPG.
[0135] Using a minimum frequency difference of 0.10 between HPG patients and general population databases, requiring a minimum of 25 HPG patients with observations at a given site, and considering only sites within or near genes expressed in ocular tissues, 933 SNP sites were retained for statistical analysis. Requiring that the lower bound of the odds ratio 95% confidence interval be above 1.0, that the HPG frequency exceed the least squares fit to the data, and that the p-value remain significant after Bonferroni correction for multiple testing, 160 sites in 140 genes remained.
[0136] We compared the 140 genes to lists of genes previously implicated in glaucoma, neurological diseases, or other eye diseases and to lists of genes involved in inflammatory response, cell adhesion, or expression in trabecular meshwork and obtained annotations for 85 of 140 genes.
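The serial filters of paragraph [0135] can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the confidence interval uses the Woolf (log-scale) approximation, which the text does not specify, the number of tests used for the Bonferroni correction is taken to be 933 only as an example, and the least squares fit criterion is omitted because its details are not given. All four allele counts must be nonzero.

```python
import math

def odds_ratio_ci(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Allele-count odds ratio with a Woolf (log-scale) 95% confidence interval."""
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

def passes_filters(case_alt, case_ref, ctrl_alt, ctrl_ref, p_value, n_tests,
                   min_patients=25, min_freq_diff=0.10, alpha=0.05):
    """Apply the sample-size, frequency, odds-ratio, and Bonferroni filters in turn."""
    n_patients = (case_alt + case_ref) // 2  # assumes two alleles per patient
    case_freq = case_alt / (case_alt + case_ref)
    ctrl_freq = ctrl_alt / (ctrl_alt + ctrl_ref)
    _, lower, _ = odds_ratio_ci(case_alt, case_ref, ctrl_alt, ctrl_ref)
    return (n_patients >= min_patients
            and case_freq - ctrl_freq >= min_freq_diff
            and lower > 1.0
            and p_value < alpha / n_tests)

# Hypothetical counts: 50 patients (100 alleles) and 500 controls (1,000 alleles).
print(odds_ratio_ci(70, 30, 328, 672))
print(passes_filters(70, 30, 328, 672, p_value=1e-10, n_tests=933))
```

The example counts are invented to give frequencies of 0.700 and 0.328; they are not patient data.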
[0137] This is the first investigation of the actual exome of HPG patients. This contrasts with array-based association studies that looked for markers primarily outside the exome. The sites investigated here are all within gene regulatory or coding regions. Not all genes were sequenced in sufficient depth here for full consideration. For example, the CDKN2B/CDKN2B-AS1 association found through array-based GWAS was not replicated here because the associated sites were not sequenced. There may be other genes with a similar status.
[0138] Clinical filters were used here for discovery. Prior studies that used exome sequence data for genome-wide association used p-values as the discriminating criterion, some in a burden test and some through classic association tests. It would be of interest to return to the data in these other studies with the clinical criteria used here.
[0139] In the current study, the majority of the sites identified as associated with disease are located in the regulatory regions of the genome adjacent to coding regions. Schork et al., PLoS Genet. 2013 Apr;9(4):e1003449 [41] noted that for associations in traditional GWAS, imputation indicates SNPs in untranslated regions and proximal promoters are over-represented, consistent with our findings here through direct exome sequencing. While it is entirely possible that SNPs involved in glaucoma with high pressure are located outside these regulatory regions, as in CDKN2B/CDKN2B-AS1, this study is the first deep sequencing analysis of regulatory exons and proximal promoter and intron bases.
[0140] The first gene linked to glaucoma was Myocilin (MYOC). We measured all bases in MYOC in patients, and no single site passed our filters. MYOC was linked to glaucoma through family studies and was primarily related to juvenile onset open-angle glaucoma. MYOC mutations are present in only about 3-4% of adult POAG, as reported in Alward et al., Arch. Ophthalmology. (2002) 120(9):1189-97 [42], and reviewed in Fingert et al., Surv. Ophthalmology. (2002) 47(6):547-61 [43].
[0141] Similarly, we would not expect to find SNPs associated with the optineurin gene (OPTN) in our investigation since it has been found to be associated with normal tension glaucoma, not with HPG, as reported in Rezaie et al., Science. (2002) 295(5557):1077-9 [44].
[0142] For some SNP sites, a high percentage, greater than 80%, of HPG patients had the minor allele compared with a much smaller fraction of the normative databases. This finding may point to a few select genes, whose polymorphisms are heavily represented in the HPG patients.
[0143] In this study, we confined our attention to sequencing and analyzing the exome in self-reported Caucasian, European-background HPG patients. Our sequencing included the exons, their UTRs, and nearby bases in the introns. After sequencing, comparison with the hg19 reference genome disclosed a huge number, many millions, of SNP sites in our HPG patients, any one or more of which might explain HPG. The goal has been to find the HPG-related sites and their associated genes. To identify sites related to disease, we developed a clinically intuitive, serial procedure to identify, in comparison with general population databases, e.g., the ESP and 1000 Genomes databases, a workable number of candidate SNP sites in HPG patients. This method provides a path to a list of associated, potentially causative disease genes that can be used to predict onset, progression, severity, or recurrence of disease after treatment. Additional work will require assessment of the role of candidate genes in the anterior and posterior segments of the eye. Further, the sites and their genes can be considered in doublets or higher numbers of interacting mutations that affect the eye and cause HPG.
[0144] This investigation identified, and categorized, SNP-containing genes present in unusually high frequency in HPG patients compared with the general population.
[0145] In summary, we found 140 genes associated with and/or causative of HPG and appropriate for predicting onset, progression, or severity of disease or recurrence after treatment. The vast majority of the 140 genes were not previously associated with HPG. These genes were found selectively in HPG compared to general and European population datasets.
[0146] Five of the 140 genes identified in the present study were previously associated with glaucoma. This study shows that the 135 newly associated genes and the five previously associated genes all have variants with highly elevated frequencies in HPG.
REFERENCES
1. Quigley HA, Broman AT. The number of people with glaucoma worldwide in 2010 and 2020. Br J Ophthalmol. 2006 Mar;90(3):262-7. PubMed PMID: 16488940; PubMed Central PMCID: PMC1856963.
2. Tielsch JM, Sommer A, Katz J, Royall RM, Quigley HA, Javitt J. Racial variations in the prevalence of primary open-angle glaucoma. The Baltimore Eye Survey. JAMA. 1991 Jul 17;266(3):369-74. PubMed PMID: 2056646.
3. The AGIS Investigators. The Advanced Glaucoma Intervention Study (AGIS): 7. The relationship between control of intraocular pressure and visual field deterioration. Am J Ophthalmol. 2000 Oct;130(4):429-40. PubMed PMID: 11024415.
4. Anderson DR. Glaucoma: the damage caused by pressure. XLVI Edward Jackson memorial lecture. Am J Ophthalmol. 1989 Nov 15;108(5):485-95. Review. PubMed PMID: 2683792.
5. Mitchell P, Rochtchina E, Lee AJ, Wang JJ. Bias in self-reported family history and relationship to glaucoma: the Blue Mountains Eye Study. Ophthalmic Epidemiol. 2002 Dec;9(5):333-45. PubMed PMID: 12528918.
6. Tielsch JM, Katz J, Sommer A, Quigley HA, Javitt JC. Family history and risk of primary open angle glaucoma. The Baltimore Eye Survey. Arch Ophthalmol. 1994 Jan;112(1):69-73. PubMed PMID: 8285897.
7. Sommer A, Tielsch JM, Katz J, Quigley HA, Gottsch JD, Javitt J, Singh K. Relationship between intraocular pressure and primary open angle glaucoma among white and black Americans. The Baltimore Eye Survey. Arch Ophthalmol. 1991 Aug;109(8):1090-5. PubMed PMID: 1867550.
8. Wiggs JL, Yaspan BL, Hauser MA, Kang JH, Allingham RR, Olson LM, Abdrabou W, Fan BJ, Wang DY, Brodeur W, Budenz DL, Caprioli J, Crenshaw A, Crooks K, Delbono E, Doheny KF, Friedman DS, Gaasterland D, Gaasterland T, Laurie C, Lee RK, Lichter PR, Loomis S, Liu Y, Medeiros FA, McCarty C, Mirel D, Moroi SE, Musch DC, Realini A, Rozsa FW, Schuman JS, Scott K, Singh K, Stein JD, Trager EH, Vanveldhuisen P, Vollrath D, Wollstein G, Yoneyama S, Zhang K, Weinreb RN, Ernst J, Kellis M, Masuda T, Zack D, Richards JE, Pericak-Vance M, Pasquale LR, Haines JL. Common variants at 9p21 and 8q22 are associated with increased susceptibility to optic nerve degeneration in glaucoma. PLoS Genet. 2012;8(4):e1002654. doi: 10.1371/journal.pgen.1002654. Epub 2012 Apr 26. PubMed PMID: 22570617; PubMed Central PMCID: PMC3343074.
9. Fan BJ, Wang DY, Pasquale LR, Haines JL, Wiggs JL. Genetic variants associated with optic nerve vertical cup-to-disc ratio are risk factors for primary open angle glaucoma in a US Caucasian population. Invest Ophthalmol Vis Sci. 2011 Mar 28;52(3):1788-92. doi: 10.1167/iovs.10-6339. PubMed PMID: 21398277; PubMed Central PMCID: PMC3101676.
10. Thorleifsson G, Magnusson KP, Sulem P, Walters GB, Gudbjartsson DF, Stefansson H, Jonsson T, Jonasdottir A, Jonasdottir A, Stefansdottir G, Masson G, Hardarson GA, Petursson H, Arnarsson A, Motallebipour M, Wallerman O, Wadelius C, Gulcher JR, Thorsteinsdottir U, Kong A, Jonasson F, Stefansson K. Common sequence variants in the LOXL1 gene confer susceptibility to exfoliation glaucoma. Science. 2007 Sep 7;317(5843):1397-400. Epub 2007 Aug 9. PubMed PMID: 17690259.
11. Thorleifsson G, Walters GB, Hewitt AW, Masson G, Helgason A, DeWan A, Sigurdsson A, Jonasdottir A, Gudjonsson SA, Magnusson KP, Stefansson H, Lam DS, Tam PO, Gudmundsdottir GJ, Southgate L, Burdon KP, Gottfredsdottir MS, Aldred MA, Mitchell P, St Clair D, Collier DA, Tang N, Sveinsson O, Macgregor S, Martin NG, Cree AJ, Gibson J, Macleod A, Jacob A, Ennis S, Young TL, Chan JC, Karwatowski WS, Hammond CJ, Thordarson K, Zhang M, Wadelius C, Lotery AJ, Trembath RC, Pang CP, Hoh J, Craig JE, Kong A, Mackey DA, Jonasson F, Thorsteinsdottir U, Stefansson K. Common variants near CAV1 and CAV2 are associated with primary open-angle glaucoma. Nat Genet. 2010 Oct;42(10):906-9. doi: 10.1038/ng.661. Epub 2010 Sep 12. PubMed PMID: 20835238; PubMed Central PMCID: PMC3222888.
12. Burdon KP, Macgregor S, Hewitt AW, Sharma S, Chidlow G, Mills RA, Danoy P, Casson R, Viswanathan AC, Liu JZ, Landers J, Henders AK, Wood J, Souzeau E, Crawford A, Leo P, Wang JJ, Rochtchina E, Nyholt DR, Martin NG, Montgomery GW, Mitchell P, Brown MA, Mackey DA, Craig JE. Genome-wide association study identifies susceptibility loci for open angle glaucoma at TMCO1 and CDKN2B-AS1. Nat Genet. 2011 Jun;43(6):574-8. doi: 10.1038/ng.824. Epub 2011 May 1. PubMed PMID: 21532571.
13. Nowak A, Majsterek I, Przybylowska-Sygut K, Pytel D, Szymanek K, Szaflik J, Szaflik JP. Analysis of the Expression and Polymorphism of APOE, HSP, BDNF, and GRIN2B Genes Associated with the Neurodegeneration Process in the Pathogenesis of Primary Open Angle Glaucoma. Biomed Res Int. 2015;2015:258281. doi: 10.1155/2015/258281. Epub 2015 Mar 29. PubMed PMID: 25893192; PubMed Central PMCID: PMC4393917.
14. Nowak A, Szaflik JP, Gacek M, Przybylowska-Sygut K, Kaminska A, Szaflik J, Majsterek I. BDNF and HSP gene polymorphisms and their influence on the progression of primary open-angle glaucoma in a Polish population. Arch Med Sci. 2014 Dec 22;10(6):1206-13. doi: 10.5114/aoms.2014.45089. Epub 2014 Sep 5. PubMed PMID: 25624860; PubMed Central PMCID: PMC4296062.
15. Psychiatric GWAS Consortium Coordinating Committee, Cichon S, Craddock N, Daly M, Faraone SV, Gejman PV, Kelsoe J, Lehner T, Levinson DF, Moran A, Sklar P, Sullivan PF. Genomewide association studies: history, rationale, and prospects for psychiatric disorders. Am J Psychiatry. 2009 May;166(5):540-56. doi: 10.1176/appi.ajp.2008.08091354. Epub 2009 Apr 1. Review. PubMed PMID: 19339359; PubMed Central PMCID: PMC3894622.
16. Nho K, Shen L, Kim S, Swaminathan S, Risacher SL, Saykin AJ; Alzheimer's Disease Neuroimaging Initiative (ADNI). The effect of reference panels and software tools on genotype imputation. AMIA Annu Symp Proc. 2011;2011:1013-8. Epub 2011 Oct 22. PubMed PMID: 22195161; PubMed Central PMCID: PMC3243280.
17. Lee S, Abecasis GR, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet. 2014 Jul 3;95(1):5-23. doi: 10.1016/j.ajhg.2014.06.009. Review. PubMed PMID: 24995866; PubMed Central PMCID: PMC4085641.
18. Zuk O, Schaffner SF, Samocha K, Do R, Hechter E, Kathiresan S, Daly MJ, Neale BM, Sunyaev SR, Lander ES. Searching for missing heritability: designing rare variant association studies. Proc Natl Acad Sci U S A. 2014 Jan 28;111(4):E455-64. doi: 10.1073/pnas.1322563111. Epub 2014 Jan 17. PubMed PMID: 24443550; PubMed Central PMCID: PMC3910587.
19. Nalls MA, Pankratz N, Lill CM, Do CB, Hernandez DG, Saad M, DeStefano AL, Kara E, Bras J, Sharma M, Schulte C, Keller MF, Arepalli S, Letson C, Edsall C, Stefansson H, Liu X, Pliner H, Lee JH, Cheng R; International Parkinson's Disease Genomics Consortium (IPDGC); Parkinson's Study Group (PSG) Parkinson's Research: The Organized GENetics Initiative (PROGENI); 23andMe; GenePD; NeuroGenetics Research Consortium (NGRC); Hussman Institute of Human Genomics (HIHG); Ashkenazi Jewish Dataset Investigator; Cohorts for Health and Aging Research in Genetic Epidemiology (CHARGE); North American Brain Expression Consortium (NABEC); United Kingdom Brain Expression Consortium (UKBEC); Greek Parkinson's Disease Consortium; Alzheimer Genetic Analysis Group, Ikram MA, Ioannidis JP, Hadjigeorgiou GM, Bis JC, Martinez M, Perlmutter JS, Goate A, Marder K, Fiske B, Sutherland M, Xiromerisiou G, Myers RH, Clark LN, Stefansson K, Hardy JA, Heutink P, Chen H, Wood NW, Houlden H, Payami H, Brice A, Scott WK, Gasser T, Bertram L, Eriksson N, Foroud T, Singleton AB. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson's disease. Nat Genet. 2014 Sep;46(9):989-93. doi: 10.1038/ng.3043. Epub 2014 Jul 27. PubMed PMID: 25064009; PubMed Central PMCID: PMC4146673.
20. Ng PC, Kirkness EF. Whole genome sequencing. Methods Mol Biol. 2010;628:215-26. doi: 10.1007/978-1-60327-367-1_12. Review. PubMed PMID: 20238084.
21. Fu W, O'Connor TD, Jun G, Kang HM, Abecasis G, Leal SM, Gabriel S, Rieder MJ, Altshuler D, Shendure J, Nickerson DA, Bamshad MJ; NHLBI Exome Sequencing Project, Akey JM. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature. 2013 Jan 10;493(7431):216-20. doi: 10.1038/nature11690. Epub 2012 Nov 28. Erratum in: Nature. 2013 Mar 14;495(7440):270. Rieder, Mark J [added]. PubMed PMID: 23201682; PubMed Central PMCID: PMC3676746.
22. Volk A, Conboy E, Wical B, Patterson M, Kirmani S. Whole-Exome Sequencing in the Clinic: Lessons from Six Consecutive Cases from the Clinician's Perspective. Mol Syndromol. 2015 Feb;6(1):23-31. doi: 10.1159/000371598. Epub 2015 Feb 3. Review. PubMed PMID: 25852444; PubMed Central PMCID: PMC4369115.
23. Ederer F, Gaasterland DE, Sullivan EK; AGIS Investigators. The Advanced Glaucoma Intervention Study (AGIS): 1. Study design and methods and baseline characteristics of study patients. Control Clin Trials. 1994 Aug;15(4):299-325. PubMed PMID: 7956270.
24. van Koolwijk LM, Bunce C, Viswanathan AC. Gene finding in primary open-angle glaucoma. J Glaucoma. 2013 Aug;22(6):473-86. doi: 10.1097/IJG.0b013e318255bc37. Review. PubMed PMID: 22549476.
25. Burdon KP. Genome-wide association studies in the hunt for genes causing primary open-angle glaucoma: a review. Clin Experiment Ophthalmol. 2012 May-Jun;40(4):358-63. doi: 10.1111/j.1442-9071.2011.02744.x. Epub 2012 Feb 20. Review. PubMed PMID: 22171998.
26. Allingham RR, Liu Y, Rhee DJ. The genetics of primary open-angle glaucoma: a review. Exp Eye Res. 2009 Apr;88(4):837-44. doi: 10.1016/j.exer.2008.11.003. Epub 2008 Nov 14. Review. PubMed PMID: 19061886.
27. Wiggs JL, Hauser MA, Abdrabou W, Allingham RR, Budenz DL, Delbono E, Friedman DS, Kang JH, Gaasterland D, Gaasterland T, Lee RK, Lichter PR, Loomis S, Liu Y, McCarty C, Medeiros FA, Moroi SE, Olson LM, Realini A, Richards JE, Rozsa FW, Schuman JS, Singh K, Stein JD, Vollrath D, Weinreb RN, Wollstein G, Yaspan BL, Yoneyama S, Zack D, Zhang K, Pericak-Vance M, Pasquale LR, Haines JL. The NEIGHBOR consortium primary open-angle glaucoma genome-wide association study: rationale, study design, and clinical variables. J Glaucoma. 2013 Sep;22(7):517-25. doi: 10.1097/IJG.0b013e31824d4fd8. PubMed PMID: 22828004; PubMed Central PMCID: PMC3485429.
28. Karolchik D, Barber GP, Casper J, Clawson H, Cline MS, Diekhans M, Dreszer TR, Fujita PA, Guruvadoo L, Haeussler M, Harte RA, Heitner S, Hinrichs AS, Learned K, Lee BT, Li CH, Raney BJ, Rhead B, Rosenbloom KR, Sloan CA, Speir ML, Zweig AS, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2014 update. Nucleic Acids Res. 2013 Nov 21. [Epub ahead of print] PubMed PMID: 24270787.
29. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012 Mar 4;9(4):357-9. doi: 10.1038/nmeth. l923. PubMed PMID: 22388286;
PubMed Central PMCID: PMC3322381.
30. Li H.*, Handsaker B.*, Wysoker A., Fennell T., Ruan J., Homer N., Marth G.,
Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009)
The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-9. PubMed PMID: 19505943.
31. Seattleseq Annotation Server, Seattle,WA (URL:
http://snp.gs.washington.edu/SeattleSeqAnnotationl37) [September - December 2013]
32. Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, Shaffer T, Wong M, Bhattacharjee A, Eichler EE, Bamshad M, Nickerson DA, Shendure J. Targeted capture and massively parallel sequencing of 12 human exomes. Nature. 2009 Sepl0;461(7261):272-6. doi: 10.1038/nature08250. Epub 2009 Aug 16.
PubMed PMID: 19684571; PubMed Central PMCID: PMC2844771.
33. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P,
Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods. 2010 Apr;7(4):248-9. doi: 10.1038/nmeth0410- 248. PubMed PMID: 20354512; PubMed Central PMCID: PMC2855889.
34. Sim NL, Kumar P, Hu J, Henikoff S, Schneider G, Ng PC. SIFT web server:
predicting effects of amino acid substitutions on proteins. Nucleic Acids Res. 2012 Jul;40(Web Server issue):W452-7. doi: 10.1093/nar/gks539. Epub 2012 Jun 11. PubMed PMID: 22689647; PubMed Central PMCID: PMC3394338.
35. 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012 Nov
l;491(7422):56-65. doi: 10.1038/naturel l632. PubMed PMID: 23128226; PubMed Central PMCID: PMC3498066.
36. Exome Variant Server, NHLBI GO Exome Sequencing Project (ESP), Seattle, WA (URL: http://evs.gs.washington.edu/EVS/) [September - December, 2013].
37. Human Gene Nomenclature Committee, genenames.org (Gene Identifier Table from ftp://ftp.ebi.ac.uk/pub/databases/genenames/new/tsv/hgnc_complete_set.txt)
38. mirBase, mirbase.org (microRNA identifiers with matures sequences from
ftp://mirbase.org/pub/mirbase/CURRENT/mature.fa.gz)
39. Ambros V, Barrel B, Barrel DP, Burge CB, Carrington JC, Chen X, Dreyfuss G, Eddy SR, Griffiths- Jones S, Marshall M, Matzke M, Ruvkun G, Tuschl T. A uniform system for microRNA annotation. RNA. 2003 Mar;9(3):277-9. PubMed PMID: 12592000; PubMed Central PMCID: PMC1370393.
40. Kass MA, Heuer DK, Higginbotham EJ, Johnson CA, Keltner JL, Miller JP, Parrish RK 2nd, Wilson MR, Gordon MO. The Ocular Hypertension Treatment Study: a
randomized trial determines that topical ocular hypotensive medication delays or prevents the onset of primary open-angle glaucoma. Arch Ophthalmol. 2002 Jun;120(6):701-13; discussion 829-30. PubMed PMID: 12049574.
41. Schork AJ, Thompson WK, Pham P, Torkamani A, Roddey JC, Sullivan PF, Kelsoe JR, O'Donovan MC, Furberg H; Tobacco and Genetics Consortium; Bipolar Disorder Psychiatric Genomics Consortium; Schizophrenia Psychiatric Genomics Consortium, Schork NJ, Andreassen OA, Dale AM. All SNPs are not created equal: genome-wide association studies reveal a consistent pattern of enrichment among functionally annotated SNPs. PLoS Genet. 2013 Apr;9(4):e1003449. doi: 10.1371/journal.pgen.1003449. Epub 2013 Apr 25. PubMed PMID: 23637621; PubMed Central PMCID: PMC3636284.
42. Alward WL, Kwon YH, Khanna CL, Johnson AT, Hayreh SS, Zimmerman MB, Narkiewicz J, Andorf JL, Moore PA, Fingert JH, Sheffield VC, Stone EM. Variations in the myocilin gene in patients with open-angle glaucoma. Arch Ophthalmol. 2002 Sep;120(9):1189-97. PubMed PMID: 12215093.
43. Fingert JH, Stone EM, Sheffield VC, Alward WL. Myocilin glaucoma. Surv Ophthalmol. 2002 Nov-Dec;47(6):547-61. Review. PubMed PMID: 12504739.
44. Rezaie T, Child A, Hitchings R, Brice G, Miller L, Coca-Prados M, Heon E, Krupin T, Ritch R, Kreutzer D, Crick RP, Sarfarazi M. Adult-onset primary open-angle glaucoma caused by mutations in optineurin. Science. 2002 Feb 8;295(5557):1077-9. PubMed PMID: 11834836.
[0147] It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
Claims
1. A method of identifying genes whose alleles are associative with or causative of the onset and/or progression and/or severity and/or recurrence of a disease, comprising:
a) sequencing or reviewing multiple exomes from patients who have been diagnosed with the disease and one or more exomes from one or more individuals known not to have the disease, wherein the one or more exomes from one or more individuals known not to have the disease comprise one or more reference exomes;
b) selecting exomes sequenced and read with a fidelity of 4 or fewer mismatches per 100 bases;
c) selecting for genes having one or more site variants in the exomes from patients who have been diagnosed with the disease with one or more properties, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 properties, selected from:
i) site variant is found in one or more patients;
ii) site variant is observed in a general population dataset;
iii) site variant is found in three or more patients;
iv) one or more reference exomes have the major allele;
v) site variant is the minor allele in reference exomes;
vi) site variant has only one alternate allele;
vii) site is within genome region with balanced G+C and A+T content;
viii) site is located outside low complexity genome regions;
ix) site is located in genome region with no paralog within 95% identity;
x) site variant is located on chromosomes 1-22 or site variant located on chromosome X or Y only if disease incidence is gender-biased;
xi) site was measured in 25 or more patients;
xii) site variant frequency in patients differs from general populations by more than expected measurement error, e.g., 0.05 (on a frequency scale from 0.00 - 1.00);
xiii) site variant frequency in patients exceeds general populations, e.g., by more than 0.10;
xiv) site variant is within a gene or regulatory regions influencing its expression as RNA or protein;
xv) site variant is within or near a gene expressed in tissues relevant to disease;
xvi) odds ratio 95% confidence interval lower bound calculated for the site from patient and reference general population frequencies is above 1.00;
xvii) frequency of site variant in patients is above a line fitted to filtered sites represented as datapoints where X is reference general population frequency and Y is patient frequency, e.g., fit with least squares linear regression; and
xviii) a p-value calculated with a 2x2 statistical test, e.g., Fisher's Exact Test, from numbers of alternate and reference alleles observed for the site in patients and in general population remains significant after correction for multiple testing.
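By way of illustration only, properties xii, xiii, xvi and xviii of claim 1 can be sketched in Python. The claim does not prescribe an implementation, so everything below is an assumption: the function names are hypothetical, the odds-ratio interval uses Woolf's log method with a 0.5 continuity correction, Fisher's Exact Test is computed two-sided, multiple testing is handled with a Bonferroni correction, and the filter requires all four properties at once even though the claim allows any one or more of them.

```python
import math


def odds_ratio_ci_lower(case_alt, case_ref, pop_alt, pop_ref, z=1.96):
    """Lower bound of an approximate 95% CI for the odds ratio (property xvi).

    Assumption: Woolf's log-odds method, with 0.5 added to every cell so
    that zero counts do not cause a division by zero.
    """
    a, b, c, d = (x + 0.5 for x in (case_alt, case_ref, pop_alt, pop_ref))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact Test p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = math.comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x alternate alleles in the first row.
        return math.comb(row1, x) * math.comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def passes_frequency_filters(case_alt, case_ref, pop_alt, pop_ref, n_tests,
                             alpha=0.05, min_diff=0.05, min_excess=0.10):
    """Apply properties xii, xiii, xvi and xviii to one site (all required here)."""
    f_case = case_alt / (case_alt + case_ref)
    f_pop = pop_alt / (pop_alt + pop_ref)
    p_value = fisher_exact_two_sided(case_alt, case_ref, pop_alt, pop_ref)
    return (
        abs(f_case - f_pop) > min_diff                 # xii: beyond measurement error
        and f_case - f_pop > min_excess                # xiii: patients exceed population
        and odds_ratio_ci_lower(case_alt, case_ref, pop_alt, pop_ref) > 1.0  # xvi
        and min(1.0, p_value * n_tests) < alpha        # xviii: Bonferroni-corrected
    )
```

The counts passed to these functions are numbers of alternate and reference alleles in patients and in the general population; `n_tests` is the number of sites tested. A less conservative correction than Bonferroni could be substituted without changing the structure of the filter.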
2. The method of claim 1, wherein the disease is a neurodegenerative disease, cancer, a cardiovascular disease, an immune disease, an autoimmune disease, an endocrinologic disease, or an inflammatory disease.
3. The method of any one of claims 1 to 2, wherein the disease is a neurodegenerative disease.
4. The method of claim 3, wherein the disease is an ocular disease.
5. The method of claim 4, wherein the disease is primary open angle glaucoma (POAG).
6. The method of any one of claims 1 to 5, wherein the patients are symptomatic for the disease.
7. The method of any one of claims 1 to 6, wherein the method is computer implemented.
8. The method of any one of claims 1 to 7, wherein the site variants are selected from single nucleotide polymorphisms (SNPs), insertions, deletions and rearrangements.
9. The method of any one of claims 1 to 8, further comprising determining the expression levels of the genes from patient exomes and reference exomes.
10. The method of any one of claims 1 to 9, further comprising determining the expression levels of the microRNA from patient exomes and reference exomes.
11. The method of any one of claims 1 to 10, wherein sequencing comprises employing a next-generation sequencing (NGS) technique or method.
12. The method of any one of claims 1 to 11, comprising selecting exomes sequenced and read with a fidelity of 3 or fewer mismatches per 100 bases.
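The fidelity thresholds of claims 1(b) and 12 (4 or fewer, or 3 or fewer, mismatches per 100 bases) can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each exome is summarised as a list of `(mismatch_count, aligned_bases)` pairs, one per read, and that fidelity is the pooled mismatch rate over all reads; the claims do not state how the rate is aggregated, and the function names are illustrative.

```python
def mismatches_per_100_bases(read_stats):
    """Pooled mismatch rate per 100 aligned bases for one exome.

    read_stats: iterable of (mismatch_count, aligned_bases) pairs, one per read
    (a hypothetical summary format assumed for this sketch).
    """
    stats = list(read_stats)  # allow generators; the data is traversed twice
    total_mismatches = sum(m for m, _ in stats)
    total_bases = sum(n for _, n in stats)
    if total_bases == 0:
        raise ValueError("no aligned bases")
    return 100.0 * total_mismatches / total_bases


def select_exomes(exomes, max_per_100=4.0):
    """Return names of exomes whose pooled mismatch rate is within the threshold."""
    return [
        name
        for name, stats in exomes.items()
        if mismatches_per_100_bases(stats) <= max_per_100
    ]
```

Calling `select_exomes(exomes, max_per_100=3.0)` applies the stricter threshold of claim 12 to the same data.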
13. The method of any one of claims 1 to 13, wherein the general population comparison dataset is selected from one or more of 1000 Genomes (1000genomes.org), the Exome Sequencing Project (evs.gs.washington.edu/EVS/) datasets, UK10K (uk10k.org/), UCSC Genome Bioinformatics Site (genome.ucsc.edu/), and/or other available public and proprietary datasets.
14. The method of any one of claims 1 to 13, further comprising weighting said selected genes according to predictive power rankings of the collection of signature biomarkers.
15. A method for predicting onset and/or progression and/or severity and/or recurrence of primary open angle glaucoma (POAG) in a subject, the method comprising:
(a) receiving allelic information and/or expression levels of a collection of signature biomarkers from a biological sample taken from said subject suspected of developing or suffering POAG, wherein said collection of signature biomarkers comprises one or more genes and/or microRNA selected from the group consisting of: AATF, ABI1, ABI3BP, ACTN2, ADAMTS15, ADCY2, AHNAK2, ANGEL2, ANKRD36, ANKRD36B, ANO5, AP1M1, ARHGAP30, ASTN1, ATP6V1E2, BAI3, CACNA1E, CACNA1I, CALM1, CCDC66, CD163, CDH13, CDH4, CDK17, CELF5, CHD8, CLCA4, CLEC7A, CLSTN2, CNNM2, CNOT6, COL23A1, COL4A2, CRTAC1, CTU2, CYBA, DCBLD2, DHCR7, DNAJB11, DPF3, DRD2, EBF2, ENO3, EPT1, ERI2, FDX1L, FLJ22184, FOXD4, FOXRED2, FRYL, GAS7, GNG7, GOLGA3, GRIA1, GRID1, GRM4, HERC2, HLA-A, HLA-DRB1, IFI6, IMMT, INPP5D, ITGB4, KIAA0930, LACTB2, LCP2, LEMD3, LILRB2, LILRB3, LIN7A, LOC642846, LOC643387, LOC728537, LPHN3, LRP3, LRP4, LRRC37A, MAML3, MATR3, MCCC1, MCF2L, MEGF11, MGC21881, MINK1, MRPL23, MUC4, MYH9, MYO1E, N6AMT1, NBPF16, NOMO2, NUCKS1, PALM2, PCK1, PCM1, PDE4DIP, PML, POTEC, PPFIA2, PRKAG2, PRKCH, PRKD1, PRUNE2, R3HDM1, RABGAP1, RAD51B, RBFOX1, RIN3, SARDH, SCAF8, SEC14L1, SEL1L3, SEMA5A, SEMA5B, SIRT1, SLC30A8, SNTB1, SPN, SPRY1, SRRM2, TMPRSS13, TNRC18, TOR1A, TRIM58, TSPAN11, TXNRD1, UNC5B, USP20, USP6, VAC14, VARS2, VCAN, WASH1, XRCC5, ZDHHC7, ZMYND11, ZNF155, ZNF573, ZNF594, ZNF83, hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, hsa-miR-99b-5p, hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, msa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c;
(b) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with onset of POAG; and (c) evaluating an output of said predictive model to predict onset of POAG in said individual; and/or
(d) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with progression of POAG; and (e) evaluating an output of said predictive model to predict progression of POAG in said individual; and/or
(f) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with severity of POAG; and (g) evaluating an output of said predictive model to predict severity of POAG in said individual; and/or
(h) applying the allelic information and/or expression levels to a predictive model relating allelic information and/or expression levels of said collection of signature biomarkers with recurrence of POAG; and (i) evaluating an output of said predictive model to predict recurrence of POAG in said individual.
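As a non-limiting illustration of applying allelic information and expression levels to a predictive model and evaluating its output, the following Python sketch uses a logistic model. The feature names, coefficients and intercept are hypothetical placeholders chosen for the example and are not values disclosed in this application; only the signs follow the claims (over-expression of hsa-miR-9 and under-expression of hsa-miR-100 are each recited as predicting a negative outcome).

```python
import math

# Hypothetical coefficients: illustrative placeholders only, not fitted values.
WEIGHTS = {
    "COL4A2_alt_alleles": 0.8,       # count of risk alleles (0, 1 or 2)
    "hsa-miR-9_log2_ratio": 0.6,     # log2 expression relative to control
    "hsa-miR-100_log2_ratio": -0.5,  # lower expression raises the score
}
INTERCEPT = -1.0  # hypothetical


def predict_risk(features, weights=WEIGHTS, intercept=INTERCEPT):
    """Return a probability-like score in (0, 1) from a logistic model.

    Features missing from the input are treated as 0.0 (an assumption).
    """
    z = intercept + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))
```

The same function could be instantiated once per endpoint (onset, progression, severity, recurrence) with a separate set of coefficients for each.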
16. The method of claim 15, wherein said collection of signature biomarkers comprises one or more genes selected from the group consisting of: AATF, ABI1, ABI3BP, ACTN2, ADAMTS15, ADCY2, AHNAK2, ANGEL2, ANKRD36, ANKRD36B, ANO5, AP1M1, ARHGAP30, ASTN1, ATP6V1E2, BAI3, CACNA1E, CACNA1I, CALM1, CCDC66, CD163, CDH13, CDH4, CDK17, CELF5, CHD8, CLCA4, CLEC7A, CLSTN2, CNNM2, CNOT6, COL23A1, COL4A2, CRTAC1, CTU2, CYBA, DCBLD2, DHCR7, DNAJB11, DPF3, DRD2, EBF2, ENO3, EPT1, ERI2, FDX1L, FLJ22184, FOXD4, FOXRED2, FRYL, GAS7, GNG7, GOLGA3, GRIA1, GRID1, GRM4, HERC2, HLA-A, HLA-DRB1, IFI6, IMMT, INPP5D, ITGB4, KIAA0930, LACTB2, LCP2, LEMD3, LILRB2, LILRB3, LIN7A, LOC642846, LOC643387, LOC728537, LPHN3, LRP3, LRP4, LRRC37A, MAML3, MATR3, MCCC1, MCF2L, MEGF11, MGC21881, MINK1, MRPL23, MUC4, MYH9, MYO1E, N6AMT1, NBPF16, NOMO2, NUCKS1, PALM2, PCK1, PCM1, PDE4DIP, PML, POTEC, PPFIA2, PRKAG2, PRKCH, PRKD1, PRUNE2, R3HDM1, RABGAP1, RAD51B, RBFOX1, RIN3, SARDH, SCAF8, SEC14L1, SEL1L3, SEMA5A, SEMA5B, SIRT1, SLC30A8, SNTB1, SPN, SPRY1, SRRM2, TMPRSS13, TNRC18, TOR1A, TRIM58, TSPAN11, TXNRD1, UNC5B, USP20, USP6, VAC14, VARS2, VCAN, WASH1, XRCC5, ZDHHC7, ZMYND11, ZNF155, ZNF573, ZNF594, and ZNF83, wherein the position and allele of the genetic variation associated with and/or causative of POAG is as provided in Table 4.
17. The method of any one of claims 15 to 16, wherein said collection of signature biomarkers comprises one or more genes selected from the group consisting of: COL4A2, COL23A1, GAS7, VCAN, and HLA-DRB1, wherein the position and allele of the genetic variation associated with and/or causative of POAG is as provided in Table 4.
18. The method of any one of claims 15 to 17, wherein overexpression of one or more microRNAs selected from hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, msa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c in the biological sample from the subject in comparison to a control sample from an individual known not to have POAG predicts a negative outcome or onset and/or progression and/or severity and/or recurrence of POAG.
19. The method of claim 18, further comprising administering to the subject an inhibitory nucleic acid that reduces or inhibits the expression of one or more microRNAs selected from hsa-miR-1246, hsa-miR-1248, hsa-miR-130a, hsa-miR-130a-3p, hsa-miR-145, hsa-miR-145-3p, hsa-miR-148a, hsa-miR-148a-3p, hsa-miR-214, hsa-miR-214-3p, hsa-miR-216a, hsa-miR-224, hsa-miR-224-5p, hsa-miR-27a-5p, hsa-miR-31, hsa-miR-31-5p, hsa-miR-4448, hsa-miR-449a, hsa-miR-452, hsa-miR-452-5p, hsa-miR-455, hsa-miR-455-5p, hsa-miR-483, hsa-miR-483-3p, hsa-miR-483-5p, hsa-miR-549, hsa-miR-5584, hsa-miR-5584-5p, hsa-miR-574, hsa-miR-574-5p, hsa-miR-675, hsa-miR-675-3p, hsa-miR-767, hsa-miR-767-5p, hsa-miR-9, hsa-miR-9-3p, msa-miR-27a, hsa-let-7a, hsa-let-7a-2, hsa-let-7a-2-3p, and hsa-let-7c.
20. The method of claim 18, further comprising administering to the subject one or more microRNAs or one or more mimics of microRNAs selected from hsa-miR-130a, hsa-miR-1246, hsa-miR-214, hsa-miR-452, hsa-miR-224, hsa-miR-4448, hsa-miR-483, hsa-miR-9, hsa-miR-767, hsa-miR-449a, hsa-miR-130a-3p, hsa-miR-214-3p, hsa-miR-452-5p, hsa-miR-224-5p, hsa-miR-483-5p, hsa-miR-483-3p, hsa-miR-9-3p and hsa-miR-767-5p.
21. The method of any one of claims 15 to 20, wherein underexpression or nonexpression of one or more microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, and hsa-miR-99b-5p in the biological sample from the subject in comparison to a control sample from an individual known not to have POAG predicts a negative outcome or onset and/or progression and/or severity and/or recurrence of POAG.
22. The method of claim 21, further comprising administering to the subject an inhibitory nucleic acid that reduces or inhibits the expression of one or more microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, and hsa-miR-99b-5p.
23. The method of claim 21, further comprising administering to the subject one or more microRNAs or one or more mimics of microRNAs selected from hsa-miR-100, hsa-miR-100-5p, hsa-miR-105, hsa-miR-105-5p, hsa-miR-1226, hsa-miR-1226-3p, hsa-miR-124, hsa-miR-124-3p, hsa-miR-124-5p, hsa-miR-1250, hsa-miR-129, hsa-miR-129-5p, hsa-miR-138, hsa-miR-138-1, hsa-miR-138-2, hsa-miR-138-2-3p, hsa-miR-139, hsa-miR-139-5p, hsa-miR-181b, hsa-miR-181b-5p, hsa-miR-18a, hsa-miR-18a-3p, hsa-miR-18b, hsa-miR-18b-5p, hsa-miR-193b, hsa-miR-193b-5p, hsa-miR-19b, hsa-miR-19b-1, hsa-miR-19b-1-5p, hsa-miR-211, hsa-miR-211-5p, hsa-miR-219, hsa-miR-219-1, hsa-miR-219-2, hsa-miR-219-2-3p, hsa-miR-219-5p, hsa-miR-2276, hsa-miR-2277, hsa-miR-2277-3p, hsa-miR-30b, hsa-miR-30b-3p, hsa-miR-3117, hsa-miR-3117-3p, hsa-miR-3182, hsa-miR-323b, hsa-miR-323b-3p, hsa-miR-34b, hsa-miR-34b-3p, hsa-miR-3613, hsa-miR-3613-3p, hsa-miR-3622a, hsa-miR-3622a-5p, hsa-miR-376a, hsa-miR-376a-5p, hsa-miR-4423, hsa-miR-4423-5p, hsa-miR-4640, hsa-miR-4640-3p, hsa-miR-4677, hsa-miR-4677-3p, hsa-miR-505, hsa-miR-505-5p, hsa-miR-513c, hsa-miR-513c-5p, hsa-miR-545, hsa-miR-545-5p, hsa-miR-548ah, hsa-miR-548ah-3p, hsa-miR-548ah-5p, hsa-miR-99b, and hsa-miR-99b-5p.
24. The method of any one of claims 15 to 23, wherein the individual is symptomatic for POAG.
25. The method of any one of claims 15 to 24, wherein the individual has a family history of POAG.
26. The method of any one of claims 15 to 25, wherein said output of the predictive model predicts a likelihood of onset and/or progression and/or severity and/or recurrence of POAG in the individual after said individual has undergone treatment for POAG.
27. The method of any one of claims 15 to 26, further comprising providing a report having a prediction of onset and/or progression and/or severity and/or recurrence of POAG of said individual.
28. The method of any one of claims 15 to 27, further comprising combining the allelic information and/or gene expression levels of said signature biomarkers with one or more other biomarkers to predict onset and/or progression and/or severity and/or recurrence of POAG in said individual.
29. The method of any one of claims 15 to 28, wherein the expression levels of the collection of signature biomarkers comprise gene expression levels measured at multiple times.
30. The method of claim 29, further comprising using the dynamics of the gene expression levels measured at multiple times to predict onset and/or progression and/or severity and/or recurrence of disease in said subject.
31. The method of any one of claims 15 to 30, further comprising evaluating the output of the predictive model to determine whether or not the individual falls in a high risk group.
32. The method of any one of claims 15 to 31, further comprising developing said predictive model using stability selection or logistic regression.
33. The method of any one of claims 15 to 32, wherein applying said allelic information and/or expression levels of the collection of signature biomarkers to said predictive model comprises weighting said expression levels according to stability rankings or predictive power rankings of the collection of signature biomarkers.
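The weighting recited in claim 33 can be illustrated with a short Python sketch. The claim does not specify how rankings are converted into weights, so the linear rank-to-weight scheme below (weights proportional to reversed rank and normalised to sum to 1) and the function names are assumptions made only for this example.

```python
def rank_weights(ranking):
    """Map an ordered list of biomarkers (most stable first) to weights.

    Assumed scheme: the top-ranked biomarker gets weight n, the last gets 1,
    and all weights are divided by n * (n + 1) / 2 so that they sum to 1.
    """
    n = len(ranking)
    total = n * (n + 1) / 2
    return {name: (n - i) / total for i, name in enumerate(ranking)}


def weighted_score(levels, ranking):
    """Weighted sum of expression levels using the rank-derived weights."""
    weights = rank_weights(ranking)
    return sum(weights[name] * levels[name] for name in ranking)
```

For three biomarkers ranked a, b, c, the weights are 1/2, 1/3 and 1/6, so a change in the top-ranked biomarker moves the score three times as much as the same change in the lowest-ranked one.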
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15723385.9A EP3140422A1 (en) | 2014-05-03 | 2015-05-01 | Methods of identifying biomarkers associated with or causative of the progression of disease, in particular for use in prognosticating primary open angle glaucoma |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201461988202P | 2014-05-03 | 2014-05-03 | |
US61/988,202 | 2014-05-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015171457A1 (en) | 2015-11-12 |
Family
ID=53189201
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2015/028833 WO2015171457A1 (en) | 2014-05-03 | 2015-05-01 | Methods of identifying biomarkers associated with or causative of the progression of disease, in particular for use in prognosticating primary open angle glaucoma |
Country Status (3)
Country | Link |
---|---|
US (1) | US20150315645A1 (en) |
EP (1) | EP3140422A1 (en) |
WO (1) | WO2015171457A1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105925685A (en) * | 2016-05-13 | 2016-09-07 | 万康源(天津)基因科技有限公司 | Exome potential pathogenic mutation detection method based on family line |
WO2017132673A1 (en) * | 2016-01-29 | 2017-08-03 | Apte Rajendra S | Gdf15 in glaucoma and methods of use thereof |
CN107435073A (en) * | 2017-08-31 | 2017-12-05 | 北京泱深生物信息技术有限公司 | Mir 3613 and its ripe miRNA new application |
CN108752453A (en) * | 2018-06-12 | 2018-11-06 | 北京市神经外科研究所 | The application of LEMD3 and its mutation in BAVM diagnosis and treatment |
CN109701019A (en) * | 2019-01-04 | 2019-05-03 | 中国人民解放军第二军医大学 | A kind of new long-chain non-coding RNA, that is, lnc-Dpf3, its sequence, immunological effect and purposes |
EP3546938A1 (en) | 2018-03-30 | 2019-10-02 | Université d'Angers | Metabolic signature and use thereof for the diagnosis of glaucoma |
KR20230075217A (en) * | 2021-11-22 | 2023-05-31 | 경상국립대학교산학협력단 | miRNA biomarker for diagnosis of pseudoexfoliation glaucoma and uses thereof |
KR20230075213A (en) * | 2021-11-22 | 2023-05-31 | 경상국립대학교산학협력단 | miRNA biomarker for diagnosis of normal tension glaucoma and uses thereof |
RU2799582C1 (en) * | 2023-05-03 | 2023-07-06 | Федеральное государственное бюджетное учреждение "Национальный медицинский исследовательский центр глазных болезней имени Гельмгольца" Министерства здравоохранения Российской Федерации (ФГБУ "НМИЦ ГБ им. Гельмгольца" Минздрава России) | Method of prevention of the progression of primary open-angle glaucoma |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080228698A1 (en) | 2007-03-16 | 2008-09-18 | Expanse Networks, Inc. | Creation of Attribute Combination Databases |
US8463554B2 (en) | 2008-12-31 | 2013-06-11 | 23Andme, Inc. | Finding relatives in a database |
US10777302B2 (en) * | 2012-06-04 | 2020-09-15 | 23Andme, Inc. | Identifying variants of interest by imputation |
US10713383B2 (en) * | 2014-11-29 | 2020-07-14 | Ethan Huang | Methods and systems for anonymizing genome segments and sequences and associated information |
US10395759B2 (en) | 2015-05-18 | 2019-08-27 | Regeneron Pharmaceuticals, Inc. | Methods and systems for copy number variant detection |
CN105701365B (en) * | 2016-01-12 | 2018-09-07 | 西安电子科技大学 | It was found that the method and related system of cancer related gene, process for preparing medicine |
CA3014292A1 (en) | 2016-02-12 | 2017-08-17 | Regeneron Pharmaceuticals, Inc. | Methods and systems for detection of abnormal karyotypes |
CN105861697B (en) * | 2016-05-13 | 2019-08-20 | 万康源(天津)基因科技有限公司 | A kind of potential pathogenic variation detection system of exon group based on family |
CN106778072B (en) * | 2016-12-30 | 2019-05-21 | 西安交通大学 | For the process bearing calibration of second generation Oncogenome high-flux sequence data |
CN107058545B (en) * | 2017-04-27 | 2020-07-24 | 四川农业大学 | SNP molecular marker of corn embryogenic callus induction related gene GRMZM2G020814 and application thereof |
US11468194B2 (en) | 2017-05-11 | 2022-10-11 | Ethan Huang | Methods and systems for anonymizing genome segments and sequences and associated information |
ES2911249T3 (en) * | 2017-11-10 | 2022-05-18 | Regeneron Pharma | Non-human animals comprising a slc30a8 mutation and methods of use |
EP3810797A4 (en) * | 2018-06-20 | 2022-03-30 | The Flinders University of South Australia | GLAUCOMA RISK ASSESSMENT METHODS AND SYSTEMS |
US20210217493A1 (en) * | 2018-07-27 | 2021-07-15 | Seekin, Inc. | Reducing noise in sequencing data |
CN108950691A (en) * | 2018-08-08 | 2018-12-07 | 广州嘉检医学检测有限公司 | Probe compositions, kit and the application of genetic disease construction of gene library based on exon trapping |
CN109988834B (en) * | 2018-12-19 | 2020-02-18 | 浙江大学医学院附属妇产科医院 | Application of plasma exosome molecular marker hsa-miR-219a-5p |
CN109920481B (en) * | 2019-01-31 | 2021-06-01 | 北京诺禾致源科技股份有限公司 | BRCA1/2 gene variation interpretation database and construction method thereof |
CN110672860B (en) * | 2019-11-04 | 2023-07-14 | 中国科学院近代物理研究所 | Combination of five cytokines as biomarkers of ionizing radiation damage |
CN110938684A (en) * | 2019-11-25 | 2020-03-31 | 福州福瑞医学检验实验室有限公司 | Nucleic acid for encoding LTBP2 gene mutant and application thereof |
CN111540407B (en) * | 2020-04-13 | 2023-06-27 | 中南大学湘雅医院 | Method for screening candidate genes by integrating multiple neurodevelopmental diseases |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007062101A2 (en) * | 2005-11-22 | 2007-05-31 | Mcgill University | Intraocular pressure-regulated early genes and uses thereof |
WO2008082529A2 (en) * | 2006-12-19 | 2008-07-10 | Source Precision Medicine, Inc. | Gene expression profiling for identification, monitoring, and treatment of ocular disease |
EP2147975A1 (en) * | 2007-04-17 | 2010-01-27 | Santen Pharmaceutical Co., Ltd | Method for determination of onset risk of glaucoma |
WO2013067001A1 (en) * | 2011-10-31 | 2013-05-10 | The Scripps Research Institute | Systems and methods for genomic annotation and distributed variant interpretation |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2310507A4 (en) * | 2008-07-08 | 2013-03-20 | David Gladstone Inst | METHODS AND COMPOSITIONS FOR MODULATING ANGIOGENESIS |
WO2010027838A1 (en) * | 2008-08-27 | 2010-03-11 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | Mir 204, mir 211, their anti-mirs, and therapeutic uses of same |
CN102499987A (en) * | 2011-12-19 | 2012-06-20 | 天津医科大学眼科中心 | Application of miR-1 in production of preparation for treating primary glaucoma |
US9422561B2 (en) * | 2012-01-24 | 2016-08-23 | Bar-Ilan University | Treatment of disease by modulation of SIRT6 |
US9388412B2 (en) * | 2012-06-15 | 2016-07-12 | The General Hospital Corporation | Inhibitors of microRNAs that regulate production of atrial natriuretic peptide (ANP) as therapeutics and uses thereof |
KR20140046339A (en) * | 2012-10-10 | 2014-04-18 | 서울대학교산학협력단 | Method for differentiation into retinal cells from stem cells using inhibition of mirna-203 |
2015
- 2015-05-01 WO PCT/US2015/028833 patent/WO2015171457A1/en active Application Filing
- 2015-05-01 EP EP15723385.9A patent/EP3140422A1/en not_active Withdrawn
- 2015-05-01 US US14/701,965 patent/US20150315645A1/en not_active Abandoned
Non-Patent Citations (8)
Title |
---|
A GUSEV ET AL: "Low-pass Genomewide Sequencing and Variant Imputation Using Identity-by-descent in an Isolated Human Population", 17 February 2011 (2011-02-17), XP055202138, Retrieved from the Internet <URL:http://arxiv.org/abs/1102.3720> [retrieved on 20150714] * |
A. I. IGLESIAS ET AL: "Exome sequencing and functional analyses suggest that SIX6 is a gene involved in an altered proliferation-differentiation balance early in life and optic nerve degeneration at old age", HUMAN MOLECULAR GENETICS, vol. 23, no. 5, 1 March 2014 (2014-03-01), pages 1320 - 1332, XP055201298, ISSN: 0964-6906, DOI: 10.1093/hmg/ddt522 * |
CHRISTIAN GILISSEN ET AL: "Disease gene identification strategies for exome sequencing", EUROPEAN JOURNAL OF HUMAN GENETICS, vol. 20, no. 5, 18 January 2012 (2012-01-18), pages 490 - 497, XP055201231, ISSN: 1018-4813, DOI: 10.1038/ejhg.2011.258 * |
D. G. MACARTHUR ET AL: "Guidelines for investigating causality of sequence variants in human disease", NATURE, vol. 508, no. 7497, 23 April 2014 (2014-04-23), pages 469 - 476, XP055201334, ISSN: 0028-0836, DOI: 10.1038/nature13127 * |
DANNY CHALLIS ET AL: "An integrative variant analysis suite for whole exome next-generation sequencing data", BMC BIOINFORMATICS, BIOMED CENTRAL, LONDON, GB, vol. 13, no. 1, 12 January 2012 (2012-01-12), pages 8, XP021117710, ISSN: 1471-2105, DOI: 10.1186/1471-2105-13-8 * |
S. PABINGER ET AL: "A survey of tools for variant analysis of next-generation genome sequencing data", BRIEFINGS IN BIOINFORMATICS, 21 January 2013 (2013-01-21), XP055073207, ISSN: 1467-5463, DOI: 10.1093/bib/bbs086 * |
See also references of EP3140422A1 * |
TERRY GAASTERLAND ET AL: "Identification of disease-associated genome variants in regulatory regions using exome sequencing in 295 POAG cases", INVESTIGATIVE OPHTHALMOLOGY AND VISUAL SCIENCE, 1 April 2014 (2014-04-01), XP055201302, Retrieved from the Internet <URL:http://iovs.arvojournals.org/Article.aspx?articleid=2269246> [retrieved on 20150709] * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11137408B2 (en) | 2016-01-29 | 2021-10-05 | Washington University | GDF15 in glaucoma and methods of use thereof |
WO2017132673A1 (en) * | 2016-01-29 | 2017-08-03 | Apte Rajendra S | Gdf15 in glaucoma and methods of use thereof |
US11933791B2 (en) | 2016-01-29 | 2024-03-19 | Washington University | GDF15 in glaucoma and methods of use thereof |
JP2019508064A (en) * | 2016-01-29 | 2019-03-28 | ラジェンドラ・エス・アプテRajendra S. APTE | GDF 15 in glaucoma and method of use thereof |
JP7318978B2 (en) | 2016-01-29 | 2023-08-01 | ワシントン・ユニバーシティ | GDF15 and its use in glaucoma |
JP2022028664A (en) * | 2016-01-29 | 2022-02-16 | ワシントン・ユニバーシティ | GDF15 and its usage in glaucoma |
CN105925685A (en) * | 2016-05-13 | 2016-09-07 | 万康源(天津)基因科技有限公司 | Exome potential pathogenic mutation detection method based on family line |
CN107435073A (en) * | 2017-08-31 | 2017-12-05 | 北京泱深生物信息技术有限公司 | Mir 3613 and its ripe miRNA new application |
EP3546938A1 (en) | 2018-03-30 | 2019-10-02 | Université d'Angers | Metabolic signature and use thereof for the diagnosis of glaucoma |
WO2019185918A1 (en) | 2018-03-30 | 2019-10-03 | Université d'Angers | Metabolic signature and use thereof for the diagnosis of glaucoma |
CN108752453B (en) * | 2018-06-12 | 2021-02-02 | 北京市神经外科研究所 | LEMD3 and application of mutation thereof in BAVM diagnosis and treatment |
CN108752453A (en) * | 2018-06-12 | 2018-11-06 | 北京市神经外科研究所 | The application of LEMD3 and its mutation in BAVM diagnosis and treatment |
CN109701019B (en) * | 2019-01-04 | 2021-07-16 | 中国人民解放军第二军医大学 | A novel long-chain non-coding RNA, lnc-Dpf3, its sequence, immune effect and use |
CN109701019A (en) * | 2019-01-04 | 2019-05-03 | 中国人民解放军第二军医大学 | A kind of new long-chain non-coding RNA, that is, lnc-Dpf3, its sequence, immunological effect and purposes |
KR20230075217A (en) * | 2021-11-22 | 2023-05-31 | 경상국립대학교산학협력단 | miRNA biomarker for diagnosis of pseudoexfoliation glaucoma and uses thereof |
KR20230075213A (en) * | 2021-11-22 | 2023-05-31 | 경상국립대학교산학협력단 | miRNA biomarker for diagnosis of normal tension glaucoma and uses thereof |
KR102692739B1 (en) | 2021-11-22 | 2024-08-06 | 경상국립대학교산학협력단 | miRNA biomarker for diagnosis of normal tension glaucoma and uses thereof |
KR102692742B1 (en) | 2021-11-22 | 2024-08-06 | 경상국립대학교산학협력단 | miRNA biomarker for diagnosis of pseudoexfoliation glaucoma and uses thereof |
RU2799582C1 (en) * | 2023-05-03 | 2023-07-06 | Федеральное государственное бюджетное учреждение "Национальный медицинский исследовательский центр глазных болезней имени Гельмгольца" Министерства здравоохранения Российской Федерации (ФГБУ "НМИЦ ГБ им. Гельмгольца" Минздрава России) | Method of prevention of the progression of primary open-angle glaucoma |
Also Published As
Publication number | Publication date |
---|---|
EP3140422A1 (en) | 2017-03-15 |
US20150315645A1 (en) | 2015-11-05 |
Similar Documents
Publication | Title |
---|---|
US20150315645A1 (en) | Methods of identifying biomarkers associated with or causative of the progression of disease |
Gittelman et al. | Comprehensive identification and analysis of human accelerated regulatory DNA | |
Siu et al. | Functional DNA methylation signatures for autism spectrum disorder genomic risk loci: 16p11.2 deletions and CHD8 variants |
Wiggs et al. | Common variants at 9p21 and 8q22 are associated with increased susceptibility to optic nerve degeneration in glaucoma | |
Davidson et al. | Autosomal-dominant corneal endothelial dystrophies CHED1 and PPCD1 are allelic disorders caused by non-coding mutations in the promoter of OVOL2 | |
Costa et al. | Massive-scale RNA-Seq analysis of non ribosomal transcriptome in human trisomy 21 | |
Gao et al. | Evaluation of a target region capture sequencing platform using monogenic diabetes as a study-model | |
Smith et al. | Human iPSC-derived retinal pigment epithelium: a model system for prioritizing and functionally characterizing causal variants at AMD risk loci | |
Hu et al. | Temporal dynamics of miRNAs in human DLPFC and its association with miRNA dysregulation in schizophrenia | |
AU2016324166A1 (en) | Predicting disease burden from genome variants | |
Monteiro et al. | Lessons from postgenome‐wide association studies: functional analysis of cancer predisposition loci | |
Subaran et al. | Novel variants in ZNF34 and other brain‐expressed transcription factors are shared among early‐onset MDD relatives | |
CN103773859B (en) | SNP marker of mitochondrial DNA associated with clinically unexplained azoospermia and application thereof |
US10787708B2 (en) | Method of identifying a gene associated with a disease or pathological condition of the disease | |
Shimada et al. | Epigenome-wide association study of narcolepsy-affected lateral hypothalamic brains, and overlapping DNA methylation profiles between narcolepsy and multiple sclerosis | |
Banerji et al. | The FSHD muscle–blood biomarker: A circulating transcriptomic biomarker for clinical severity in facioscapulohumeral muscular dystrophy | |
Gonzalez‐Latapi et al. | Alterations in blood methylome as potential epigenetic biomarker in sporadic Parkinson's disease | |
CN103361373B (en) | SNRNP200 gene mutant and application thereof | |
Billingsley et al. | Long-read sequencing of hundreds of diverse brains provides insight into the impact of structural variation on gene expression and DNA methylation | |
Hao et al. | Identification of LncRNA-MiRNA-MRNA networks in the lenticular nucleus region of the brain contributes to hepatolenticular degeneration pathogenesis and therapy | |
Omidsalar et al. | Common mitochondrial deletions in RNA-Seq: evaluation of bulk, single-cell, and spatial transcriptomic datasets | |
Ferrarini et al. | The use of non-variant sites to improve the clinical assessment of whole-genome sequence data | |
CN108715893A (en) | A group of SNP markers associated with radiotherapy-induced brain radiation injury and application thereof |
AU2017100960A4 (en) | Method of identifying a gene associated with a disease or pathological condition of the disease | |
Peng et al. | Targeted capture sequencing identifies novel genetic variations in Chinese patients with idiopathic inflammatory myopathies |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 15723385; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
REEP | Request for entry into the European phase | Ref document number: 2015723385; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 2015723385; Country of ref document: EP |