
provenance_challenge_ipaw_info messages

Re: [provenance-challenge] Submission of workflows for Provenance Challenge 3

From: David Holland <dholland AT eecs.harvard.edu>
Date: Fri, 1 Aug 2008 21:14:19 -0400


Threading: [provenance-challenge] Submission of workflows for Provenance Challenge 3 from pgroth AT isi.edu

On Fri, Aug 01, 2008 at 02:31:36PM -0700, Paul Groth wrote:
 > At the Open Provenance Model Workshop, we had agreed that it would be  
 > advantageous for the Third Provenance Challenge to have additional  
 > workflows beyond the Brain Atlas workflow used in previous challenges.  
 > To that end, it was decided that a number of teams would propose new  
 > workflows by August 1.

I've been looking at preparing a small compile workload. This turns
out to not be entirely trivial; most compiles are very large compared
to what most groups are prepared to cope with in this context, and
really small ones are inherently not very interesting.  There are also
some portability issues. In addition, most compiles already have a
workflow engine (make), and the interesting ones tend to generate parts
of the makefile on the fly. This makes replacing the makefile with a
workflow specification problematic, but just having the workflow
engine run make turns the whole thing into one big glob.

I think the solution to this is to have the workflow specification
know what the outputs are going to be and ask successive make runs to
deliver them one at a time. It is not all that natural, but it should
serve.
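
As a rough illustration of the idea (not the actual challenge
specification, which has not been written up), a driver could invoke
make once per expected output so that each output shows up as its own
step. The function names and the injectable runner below are
hypothetical, and the sketch assumes the makefile exposes each output
as a target of the same name.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Hypothetical runner type: takes an argv list and returns an exit status.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def build_one_at_a_time(
    targets: Sequence[str], runner: Runner = run_command
) -> List[Tuple[str, List[str], int]]:
    """Ask make for each output separately, recording one step per output.

    Stops at the first failing step, since later outputs usually
    depend on earlier ones.
    """
    steps = []
    for target in targets:
        argv = ["make", target]
        status = runner(argv)
        steps.append((target, argv, status))
        if status != 0:
            break
    return steps
```

For example, build_one_at_a_time(["main.o", "kernel"]) would run
"make main.o" and then "make kernel"; "main.o" is a made-up file name
here. Passing a different runner lets a workflow engine record each
invocation instead of executing it directly.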

Along these lines, I have a small workload that's a drastically cut
down build of the toy kernel we use for teaching our OS course. It
currently has five phases: tree configure, kernel configure, make
depend, compile, and link.

 - tree configure runs a "configure" script to generate "treedefs.mk",
which is used by all later make invocations.

 - kernel configure runs the kernel config script on a kernel config
file and a sources list to generate these files: 
	autoconf.c autoconf.h opt-sfs.h
	defs.mk files.mk

 - make depend uses the .mk files, a Makefile, 14 .h files, 11 .c
files, plus autoconf.[ch], to generate depend.mk.

 - the compile phase compiles each of the 11 .c files, plus
autoconf.c, using the header files as well and all the make bits, to
make 12 .o files.

 - the link phase creates a "kernel" image from the 12 .o files using
the make bits, an extra shell script, and an extra generated .c file
that might or might not be worth modeling independently.
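
The five phases above could be written down as data along the
following lines. This is only a sketch: the source and header file
names are placeholders (only the counts are given above), the names
"GENERIC", "sources-list", and "link-script.sh" are assumptions, and
exactly which make bits each phase reads is a guess.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Phase:
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


# Placeholder names: only the counts (11 .c, 14 .h) are known.
C_FILES = tuple(f"src{i:02d}.c" for i in range(11))
H_FILES = tuple(f"hdr{i:02d}.h" for i in range(14))
OBJECTS = tuple(c[:-2] + ".o" for c in C_FILES) + ("autoconf.o",)
MK_BITS = ("treedefs.mk", "defs.mk", "files.mk", "Makefile")
AUTOCONF = ("autoconf.c", "autoconf.h")

PHASES: List[Phase] = [
    Phase("tree configure", ("configure",), ("treedefs.mk",)),
    Phase("kernel configure",
          ("config", "GENERIC", "sources-list"),  # assumed input names
          AUTOCONF + ("opt-sfs.h", "defs.mk", "files.mk")),
    Phase("make depend",
          MK_BITS + H_FILES + C_FILES + AUTOCONF,
          ("depend.mk",)),
    Phase("compile",
          MK_BITS + ("depend.mk",) + H_FILES + C_FILES + AUTOCONF,
          OBJECTS),
    Phase("link",
          MK_BITS + OBJECTS + ("link-script.sh",),  # hypothetical script name
          ("kernel",)),
]


def producers(phases: List[Phase]) -> Dict[str, str]:
    """Map each generated file to the name of the phase that makes it."""
    return {out: p.name for p in phases for out in p.outputs}


def upstream(name: str, phases: List[Phase]) -> Set[str]:
    """Return every phase whose outputs feed, directly or not, into `name`."""
    by_name = {p.name: p for p in phases}
    made_by = producers(phases)
    seen: Set[str] = set()
    stack = [name]
    while stack:
        current = by_name[stack.pop()]
        for f in current.inputs:
            src = made_by.get(f)
            if src is not None and src not in seen:
                seen.add(src)
                stack.append(src)
    return seen
```

With this layout, producers(PHASES)["depend.mk"] gives "make depend",
and upstream("link", PHASES) collects the four earlier phases. The
extra generated .c file in the link phase is left out, since it is not
yet clear whether it is worth modeling separately.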

I have cut out all the machine-dependent goop so it should be
compilable on any reasonable platform, even Windows (although you'll
need cygwin or equivalent to run the scripts) and ought to even
compile with almost any compiler, I think, although it still needs
some tweaking.

It also has a variant form (a different kernel config) and I have some
queries in mind although I haven't written them up yet. Nor have I
written up the workflow itself in any more detail than the above.

So.

Is this workload too large? It is a bit more than twice the size of
the original challenge workload. I can cut it down a bit, but not that
much.

If it is going to be too large I don't want to put any more effort
into it...

Opinions?


(If anyone wants to look at it, I've stuck the files here:
http://www.eecs.harvard.edu/~dholland/tmp/challenge3/. To try running
it, do ./configure; ./config GENERIC; make depend; make.)

-- 
   - David A. Holland / dholland AT eecs.harvard.edu

