Hi,
As the title suggests, I successfully ran the deletion calling pipeline to completion on 20 individuals, across 2 million contiguous autosomal bases, looking for deletions 100-10,000 bp in length.
However, when I run the same script modified to cover 40 individuals and 9 million bases, with a deletion size range of 100-100,000 bp, the process fails, seemingly because no discovery genotypes file is generated for it to work on.
My initial reasoning is that the problem is caused by a difference in the inputs: maybe one of the additional individuals is corrupting the process with bad data, or a memory limit is reached when looking for deletions across such a wide size range. I've double-checked the variables.
The two logs diverge in the Discovery QGraph stage: the successful run proceeds to "FunctionEdge - Starting: 'java' ...", whereas the unsuccessful run just seems to write a job report, which contains only:
#:GATKReport.v1.1:0
Also, the gender report is of good quality. Below are sections from the logs.
Any help would be much appreciated,
Cheers,
Will
(Successful job):
INFO 21:10:33,052 HelpFormatter - Date/Time: 2016/01/18 21:10:33
INFO 21:10:33,052 HelpFormatter - ----------------------------------------------------------------------
INFO 21:10:33,053 HelpFormatter - ----------------------------------------------------------------------
INFO 21:10:33,067 QCommandLine - Scripting SVDiscovery
INFO 21:10:33,498 QCommandLine - Added 2 functions
INFO 21:10:33,498 QGraph - Generating graph.
INFO 21:10:33,512 QGraph - Running jobs.
INFO 21:10:33,740 FunctionEdge - Starting: 'java' '-Xmx4096m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/lustre/scratch/bioenv/wg39/LHm_analysis/genotyping/CNVs/tmpdir' '-cp'
(Unsuccessful job):
INFO 02:11:51,565 HelpFormatter - Date/Time: 2016/01/21 02:11:51
INFO 02:11:51,565 HelpFormatter - ----------------------------------------------------------------------
INFO 02:11:51,565 HelpFormatter - ----------------------------------------------------------------------
INFO 02:11:51,576 QCommandLine - Scripting SVDiscovery
INFO 02:11:53,150 QCommandLine - Added 2 functions
INFO 02:11:53,151 QGraph - Generating graph.
INFO 02:11:53,165 QGraph - Running jobs.
INFO 02:11:54,753 QGraph - 0 Pend, 0 Run, 0 Fail, 2 Done
INFO 02:11:54,755 QCommandLine - Writing final jobs report...
INFO 02:11:54,756 QJobsReporter - Writing JobLogging GATKReport to file /lustre/scratch/bioenv/wg39/LHm_analysis/genotyping/CNVs/SVDiscovery.jobreport.txt
INFO 02:11:54,800 QJobsReporter - Plotting JobLogging GATKReport to file /lustre/scratch/bioenv/wg39/LHm_analysis/genotyping/CNVs/SVDiscovery.jobreport.pdf
WARN 02:11:56,452 RScriptExecutor - RScript exited with 1. Run with -l DEBUG for more info.
INFO 02:11:56,456 QCommandLine - Script completed successfully with 2 total jobs