Hi all,
This is not an issue but a technical question about using the gatk 'jar' as a Java library.
I want to genotype many BAM files over a small region of the genome within a single process of my pipeline.
The JVM is slow on startup and I think the initialization process takes more time than the calling itself.
So I wrote a small Java program that uses gatk as a library and invokes all the steps in the same JVM instance.
Each call to gatk is basically the following code:
public class Gatk4ProxyImpl extends org.broadinstitute.hellbender.Main implements Gatk4Proxy {
    (...)
    public void execute(final List<String> argv) throws Exception {
        final String[] args = argv.toArray(new String[argv.size()]);
        LOG.info(getCommandLineName()+":Executing: gatk "+ String.join(" ",argv));
        final CommandLineProgram program =
            this.setupConfigAndExtractProgram(args,
                this.getPackageList(),
                this.getClassList(),
                this.getCommandLineName()
                );
        final Object result = Main.runCommandLineProgram(program, args);
        (...)
(full code at https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/gatk/Gatk4ProxyImpl.java )
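To make the usage pattern concrete, here is a minimal sketch of how such a proxy could be driven once per BAM inside one JVM. It is not the jvarkit code: the class name BatchCallSketch, the nested Proxy interface (a stand-in for Gatk4Proxy), the file names and the output naming are all hypothetical, and the lambda in main only prints the command line instead of invoking GATK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchCallSketch {
    /** Hypothetical stand-in for Gatk4Proxy: one call corresponds to one GATK tool run. */
    public interface Proxy {
        void execute(List<String> argv) throws Exception;
    }

    /** Builds HaplotypeCaller arguments for one BAM restricted to one interval. */
    public static List<String> haplotypeCallerArgs(
            String reference, String bam, String interval, String outVcf) {
        List<String> argv = new ArrayList<>();
        argv.add("HaplotypeCaller");
        argv.add("-R"); argv.add(reference);
        argv.add("-I"); argv.add(bam);
        argv.add("-L"); argv.add(interval);
        argv.add("-O"); argv.add(outVcf);
        return argv;
    }

    /** Runs the tool once per BAM through the same proxy, i.e. in the same JVM. */
    public static int callAll(
            Proxy proxy, String reference, List<String> bams, String interval)
            throws Exception {
        int runs = 0;
        for (String bam : bams) {
            // Output name is an assumption made for this sketch only.
            proxy.execute(haplotypeCallerArgs(reference, bam, interval, bam + ".vcf.gz"));
            runs++;
        }
        return runs;
    }

    public static void main(String[] args) throws Exception {
        // Dry run: print each command instead of calling GATK.
        Proxy dryRun = argv -> System.out.println("gatk " + String.join(" ", argv));
        callAll(dryRun, "ref.fa", Arrays.asList("S1.bam", "S2.bam"), "chr1:1000-2000");
    }
}
```

In the real program the Proxy would be the Gatk4ProxyImpl instance, so every iteration of the loop goes through the same execute(...) method shown above.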
My question is: the code works, but is there any hidden drawback in using gatk this way? For example, is there any chance that I'm missing a function that disposes of some resources, or that something loads a huge resource on each call, etc.?
thanks,
Pierre