csv - Spring Batch: Dump a set of queries over a database in parallel to flat files
So the scenario, drilled down to its essence, is as follows: essentially, I have a config file containing a set of SQL queries whose result sets need to be exported as CSV files. Since the queries may return billions of rows, and because something may interrupt the process (a bug, a crash, ...), I want to use a framework such as Spring Batch, which gives me restartability and job monitoring. I am using a file-based H2 database for persisting the Spring Batch jobs.
So, here are my questions:
Upon creating a job, I need to provide my RowMapper with some initial configuration. What happens when the job needs to be restarted after e.g. a crash? Concretely:
- Is the state of the RowMapper automatically persisted, so that upon restart Spring Batch will try to restore the object from its database, or
- will the RowMapper object be used as it is part of the original Spring Batch XML config file, or
- do I have to maintain the RowMapper's state myself using the step's/job's ExecutionContext (sketched below)?
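To illustrate the third option: as far as I can tell, Spring Batch persists only what is explicitly written into the ExecutionContext through the ItemStream callbacks, so maintaining the state myself would look roughly like this (StatefulRowMapperSupport is a hypothetical helper of mine, not a Spring Batch class):

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStream;
import org.springframework.batch.item.ItemStreamException;

// Hypothetical helper: only values written to the ExecutionContext in
// these callbacks survive a crash; arbitrary bean state is not persisted.
public class StatefulRowMapperSupport implements ItemStream {

    private static final String ROW_COUNT_KEY = "statefulRowMapper.rowCount";

    private long rowCount;

    public void open(ExecutionContext executionContext) throws ItemStreamException {
        // On a restart, the step hands back the last committed context.
        if (executionContext.containsKey(ROW_COUNT_KEY)) {
            rowCount = executionContext.getLong(ROW_COUNT_KEY);
        }
    }

    public void update(ExecutionContext executionContext) throws ItemStreamException {
        // Called before each chunk commit; whatever is stored here is persisted.
        executionContext.putLong(ROW_COUNT_KEY, rowCount);
    }

    public void close() throws ItemStreamException {
        // No resources to release in this sketch.
    }
}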
The above question is related to whether there is some magic going on when using the Spring Batch XML configuration, or whether I could create these beans in a programmatic way: since I need to parse my own config format into a Spring Batch job configuration, I would rather use Spring Batch's Java classes (beans) and fill them out appropriately than attempt to manually write out valid XML. However, if the job crashes, I would create the beans myself again. Does Spring Batch automagically restore the job state from its database?
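To illustrate what I mean by the programmatic approach - a minimal sketch, assuming that restart state is keyed by the job name and the JobParameters in the JobRepository, so that rebuilding identically-named beans on every run would be enough (CsvExportJobRunner and the configFile parameter are made up for the example):

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.SimpleJob;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.repository.JobRepository;

public class CsvExportJobRunner {

    // Rebuild the same job definition on every run; restart state would live
    // in the JobRepository, keyed by job name + JobParameters.
    public JobExecution runOrRestart(JobRepository jobRepository,
                                     JobLauncher jobLauncher,
                                     Step exportStep) throws Exception {
        SimpleJob job = new SimpleJob("csvExportJob"); // same name on every run
        job.setJobRepository(jobRepository);
        job.addStep(exportStep);

        JobParameters params = new JobParametersBuilder()
                .addString("configFile", "/etc/exports.conf") // hypothetical parameter
                .toJobParameters();

        // If an execution with these parameters previously failed, the
        // launcher restarts it rather than starting a fresh instance.
        return jobLauncher.run(job, params);
    }
}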
If I do need XML, is there a way to serialize a Spring Batch JobRepository (or one of these objects) as a Spring Batch XML config?
Right now, I have tried to configure my step with the following code - but I am unsure if this is the proper way to do it:
- Is TaskletStep the way to go?
- Is the way I create the chunked reader/writer correct, or is there some other object I should use instead?
I would have assumed that the opening of the reader and the writer would occur automatically as part of the job execution, but if I don't open these resources prior to running the job, I get an exception telling me that I need to open them first. Maybe I need to create some other object that manages the resources (the JDBC connection and the file handle)?
import org.springframework.batch.core.step.item.ChunkOrientedTasklet;
import org.springframework.batch.core.step.item.ChunkProcessor;
import org.springframework.batch.core.step.item.ChunkProvider;
import org.springframework.batch.core.step.item.SimpleChunkProcessor;
import org.springframework.batch.core.step.item.SimpleChunkProvider;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.core.step.tasklet.TaskletStep;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.batch.repeat.CompletionPolicy;
import org.springframework.batch.repeat.RepeatOperations;
import org.springframework.batch.repeat.policy.SimpleCompletionPolicy;
import org.springframework.batch.repeat.support.RepeatTemplate;

// Reader: streams rows from the database via a JDBC cursor.
JdbcCursorItemReader<FooBar> itemReader = new JdbcCursorItemReader<FooBar>();
itemReader.setSql(sqlStr);
itemReader.setDataSource(dataSource);
itemReader.setRowMapper(rowMapper);
itemReader.afterPropertiesSet();
ExecutionContext executionContext = new ExecutionContext();
itemReader.open(executionContext);

// Writer: one line per item, passed through as-is.
FlatFileItemWriter<String> itemWriter = new FlatFileItemWriter<String>();
itemWriter.setLineAggregator(new PassThroughLineAggregator<String>());
itemWriter.setResource(outResource);
itemWriter.afterPropertiesSet();
itemWriter.open(executionContext);

// Chunking: commit every 50000 items.
int commitInterval = 50000;
CompletionPolicy completionPolicy = new SimpleCompletionPolicy(commitInterval);
RepeatTemplate repeatTemplate = new RepeatTemplate();
repeatTemplate.setCompletionPolicy(completionPolicy);
RepeatOperations repeatOperations = repeatTemplate;
ChunkProvider<FooBar> chunkProvider = new SimpleChunkProvider<FooBar>(itemReader, repeatOperations);
ItemProcessor<FooBar, String> itemProcessor = new ItemProcessor<FooBar, String>() {
    public String process(FooBar item) {
        return item.toString(); // custom implementation goes here
    }
};
ChunkProcessor<FooBar> chunkProcessor = new SimpleChunkProcessor<FooBar, String>(itemProcessor, itemWriter);
Tasklet tasklet = new ChunkOrientedTasklet<FooBar>(chunkProvider, chunkProcessor); //new SplitFilesTasklet();

// Step: wire the tasklet into a restartable step.
TaskletStep taskletStep = new TaskletStep();
taskletStep.setName(taskletName);
taskletStep.setJobRepository(jobRepository);
taskletStep.setTransactionManager(transactionManager);
taskletStep.setTasklet(tasklet);
taskletStep.afterPropertiesSet();
job.addStep(taskletStep);
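One thing I am considering for the resource-management issue: TaskletStep has a setStreams(...) method, so registering the reader and writer as streams should let the step open them with its own ExecutionContext (which, if I understand correctly, is also how their state gets restored on a restart) and close them when the step ends. A minimal sketch, continuing from the code above:

// requires: import org.springframework.batch.item.ItemStream;
taskletStep.setStreams(new ItemStream[] { itemReader, itemWriter });
// ...after which the manual itemReader.open(...) / itemWriter.open(...)
// calls and the hand-built ExecutionContext could be dropped.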
Most of your questions are complex, and it is difficult to give a good answer without writing a long paper.
I'm new to spring-batch like you, and I found a lot of useful info - and answers to all of my questions - by reading Spring Batch in Action: it's complete, well explained, full of examples, and covers all aspects of the framework (reader/writer/processor, job/tasklet/chunk lifecycle/persistence, tx/resources management, job flow, integration with other services, partitioning, restarting/retry, failure management, and a lot of other interesting things).
Hope this helps.