The first step is to get pathway (gene set) data. I downloaded the KEGG gene sets with gene symbols, c2.cp.kegg.v5.1.symbols.gmt, from the GSEA website.

The file contains 186 KEGG pathways in total, one per line.

bash-3.2$ awk ' END {print NR} ' c2.cp.kegg.v5.1.symbols.gmt
186

Here are the first three pathways. Each line holds the pathway name, a URL, and then the gene symbols in that pathway.

bash-3.2$ head -n 3 data/c2.cp.kegg.v5.1.symbols.gmt
KEGG_GLYCOLYSIS_GLUCONEOGENESIS      http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_GLYCOLYSIS_GLUCONEOGENESIS    ACSS2   GCK PGK2    PGK1    PDHB    PDHA1   PDHA2   PGM2    TPI1    ACSS1   FBP1    ADH1B   HK2 ADH1C   HK1 HK3 ADH4    PGAM2   ADH5    PGAM1   ADH1A   ALDOC   ALDH7A1 LDHAL6B PKLR    LDHAL6A ENO1    PKM2    PFKP    BPGM    PCK2    PCK1    ALDH1B1 ALDH2   ALDH3A1 AKR1A1  FBP2    PFKM    PFKL    LDHC    GAPDH   ENO3    ENO2    PGAM4   ADH7    ADH6    LDHB    ALDH1A3 ALDH3B1 ALDH3B2 ALDH9A1 ALDH3A2 GALM    ALDOA   DLD DLAT    ALDOB   G6PC2   LDHA    G6PC    PGM1    GPI
KEGG_CITRATE_CYCLE_TCA_CYCLE         http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_CITRATE_CYCLE_TCA_CYCLE       IDH3B   DLST        PCK2    CS      PDHB    PCK1    PDHA1   LOC642502       PDHA2   LOC283398   FH      SDHD    OGDH    SDHB    IDH3A   SDHC    IDH2    IDH1    ACO1    ACLY    MDH2    DLD     MDH1    DLAT    OGDHL   PC      SDHA    SUCLG1  SUCLA2  SUCLG2  IDH3G   ACO2
KEGG_PENTOSE_PHOSPHATE_PATHWAY       http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_PENTOSE_PHOSPHATE_PATHWAY     RPE     RPIA        PGM2    PGLS    PRPS2   FBP2    PFKM    PFKL            TALDO1  TKT         FBP1    TKTL2   PGD     RBKS    ALDOA   ALDOC   ALDOB   H6PD    LOC729020       PRPS1L1 PRPS1   DERA    G6PD    PGM1    TKTL1   PFKP    GPI
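The layout seen in the output above (pathway name, URL, then gene symbols) can be read with a few lines of code. The following is a minimal Python sketch, not the script used in this post. It assumes the fields are tab-separated, and the function names `parse_gmt_line` and `read_gmt` are made up for illustration.

```python
def parse_gmt_line(line):
    """Split one GMT line into (name, url, genes).

    Assumes tab-separated fields: pathway name, URL, then gene symbols.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        raise ValueError("expected at least a name and a URL field")
    name = fields[0].strip()
    url = fields[1].strip()
    # Ignore empty fields that stray or doubled tabs would produce.
    genes = [g.strip() for g in fields[2:] if g.strip()]
    return name, url, genes


def read_gmt(lines):
    """Map pathway name -> list of gene symbols for an iterable of GMT lines."""
    pathways = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, _url, genes = parse_gmt_line(line)
        pathways[name] = genes
    return pathways
```

Passing an open file handle to `read_gmt` and taking `len()` of the result should give the same count as the `awk` command above, provided no pathway name is repeated.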

I wrote a Perl script, processkegg.pl, to process this raw data. The program creates a directory and saves all 186 KEGG pathways in separate files, kegg1.txt to kegg186.txt.

bash-3.2$ mkdir kegg
bash-3.2$ ./processkegg.pl data/c2.cp.kegg.v5.1.symbols.gmt kegg
bash-3.2$ cd kegg
bash-3.2$ head -n 5 kegg1.txt
KEGG_GLYCOLYSIS_GLUCONEOGENESIS
ACSS2
GCK
PGK2
PGK1

The first row of kegg1.txt is the name of the pathway, and the gene symbols in the pathway follow, one per line.
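processkegg.pl itself is not shown here. As a rough illustration of the same split, here is a Python sketch that writes one file per pathway in the format seen in kegg1.txt. The function name `split_gmt` and the `prefix` parameter are hypothetical, and the tab-separated input layout is an assumption.

```python
import os


def split_gmt(gmt_path, out_dir, prefix="kegg"):
    """Write each pathway of a GMT file to <out_dir>/<prefix><n>.txt.

    Each output file has the pathway name on the first line,
    followed by one gene symbol per line. Returns the number of files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    with open(gmt_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            name = fields[0].strip()
            # fields[1] is the URL; the gene symbols start at fields[2].
            genes = [g.strip() for g in fields[2:] if g.strip()]
            count += 1
            out_path = os.path.join(out_dir, f"{prefix}{count}.txt")
            with open(out_path, "w") as out:
                out.write(name + "\n")
                for gene in genes:
                    out.write(gene + "\n")
    return count
```

Calling `split_gmt("data/c2.cp.kegg.v5.1.symbols.gmt", "kegg")` would number the files in the order the pathways appear in the input.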

To perform pathway-based tests using these 186 pathways, we need the corresponding gene.info and snp.info for each pathway. gene.info is a gene information matrix for the corresponding pathway: the 1st column is the gene ID, the 2nd column is the chromosome number, and the 3rd and 4th columns give the start and end positions of the gene. You can take a subset of the gene database downloadable from the MAGMA website. snp.info is a SNP information matrix for the corresponding pathway: the 1st column is the SNP ID, the 2nd column is the chromosome number, and the 3rd column is the SNP location.
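One possible way to build the two tables for a single pathway is sketched below in Python. It rests on assumptions that go beyond this post: gene locations are already loaded as (gene ID, chromosome, start, end) tuples, gene IDs match the symbols in the pathway file, and a SNP is assigned to a pathway when it lies inside one of the pathway's gene intervals. The function names are hypothetical.

```python
def build_gene_info(pathway_genes, gene_locations):
    """Keep the rows of gene_locations whose gene ID is in the pathway.

    gene_locations: iterable of (gene_id, chrom, start, end) tuples.
    Returns rows in the gene.info column order: ID, chromosome, start, end.
    """
    wanted = set(pathway_genes)
    return [row for row in gene_locations if row[0] in wanted]


def build_snp_info(gene_info, snp_locations):
    """Keep SNPs that fall inside any gene interval in gene_info.

    snp_locations: iterable of (snp_id, chrom, position) tuples.
    Assumption: a SNP belongs to a gene if start <= position <= end
    on the same chromosome.
    """
    kept = []
    for snp_id, chrom, pos in snp_locations:
        for _gene, g_chrom, start, end in gene_info:
            if chrom == g_chrom and start <= pos <= end:
                kept.append((snp_id, chrom, pos))
                break  # record each SNP once even if genes overlap
    return kept
```

Genes listed in a pathway but missing from the location table are silently dropped by `build_gene_info`, so it is worth comparing the number of rows returned against the number of genes in the pathway file.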

Once you have snp.info and gene.info for each pathway, read the vignette for aSPUs and aSPUsPath, which shows how to perform the aSPUsPath test. aSPUpath and MTaSPUsPath are used in a similar way.