After updating to Dataflow 0.4.20150727, we've started getting the exception below at runtime. It seems to happen nearly every time we apply
a transform without explicitly giving it a name. Is this intended?
    @Test
    public void exampleFailure() {
        Pipeline p = TestPipeline.create();
        final PCollection<Integer> pInts1 = p.apply(Create.of(Arrays.asList(1, 2, 3)));
        final PCollection<Integer> pInts2 = p.apply(Create.of(Arrays.asList(1, 2, 3)));
        p.run();
    }
    java.lang.IllegalStateException: Transform Create.Values2 does not have a stable unique name. This will prevent reloading of pipelines.
        at com.google.cloud.dataflow.sdk.Pipeline.applyInternal(Pipeline.java:330)
        at com.google.cloud.dataflow.sdk.Pipeline.applyTransform(Pipeline.java:253)
        at com.google.cloud.dataflow.sdk.values.PBegin.apply(PBegin.java:48)
        at com.google.cloud.dataflow.sdk.Pipeline.apply(Pipeline.java:137)
        at org.broadinstitute.hellbender.engine.dataflow.transforms.GetOverlappingReadsAndVariantsUnitTest.exampleFailure(GetOverlappingReadsAndVariantsUnitTest.java:63)
    @Test
    public void exampleFailure() {
        Pipeline p = TestPipeline.create();
        final PCollection<Integer> pInts1 = p.apply("Create.Values1", Create.of(Arrays.asList(1, 2, 3)));
        final PCollection<Integer> pInts2 = p.apply("Create.Values2", Create.of(Arrays.asList(1, 2, 3)));
        p.run();
    }
Naming each transform explicitly, as in the test above, works around the error, but this should be unnecessary, since I gave each transform the same name that had already been inferred for it.
    @Test
    public void exampleFailure() {
        Pipeline p = TestPipeline.create();
        final PCollection<Integer> pInts1 = p.apply(UUID.randomUUID().toString(), Create.of(Arrays.asList(1, 2, 3)));
        final PCollection<Integer> pInts2 = p.apply(UUID.randomUUID().toString(), Create.of(Arrays.asList(1, 2, 3)));
        p.run();
    }
I assume randomly generated names like the UUIDs above are not acceptable, because the message says "stable". What's the danger, though? If a randomly generated name isn't OK, what should we do for programmatically generated transforms? Is an incremented counter OK?
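
To make the counter idea concrete, here is a minimal sketch of what I have in mind. `StableNames` is a hypothetical helper, not part of the Dataflow SDK, and I'm assuming that "stable" means the same name comes out on every run of the same pipeline-construction code. I don't know whether that assumption matches what the SDK actually requires.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of the Dataflow SDK): hands out names of the
 * form "prefix-1", "prefix-2", ... using one counter per prefix. The names
 * depend only on the order of calls, so identical construction code yields
 * identical names on every run, unlike UUID.randomUUID().
 */
final class StableNames {
    // Number of names handed out so far for each prefix.
    private final Map<String, Integer> counters = new HashMap<>();

    /** Returns the next name for the given prefix, counting from 1. */
    String next(String prefix) {
        // merge() stores 1 on first use, otherwise adds 1, and returns the new count.
        int n = counters.merge(prefix, 1, Integer::sum);
        return prefix + "-" + n;
    }
}
```

The generated name would then be passed as the first argument to `p.apply(name, transform)`, for example `p.apply(names.next("Create"), Create.of(Arrays.asList(1, 2, 3)))`. The obvious weakness is that inserting a new transform earlier in the construction code shifts every later counter value, so I'm not sure this counts as stable across pipeline changes.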