1. General Information
In this tutorial, we'll take a hands-on, code-focused look at Spring Batch, a processing framework designed for robust execution of jobs.
The current version 4.3 supports Spring 5 and Java 8. It also supports JSR-352, the new Java specification for batch processing.
Here are some interesting and practical use cases of the framework.
2. Workflow basics
Spring Batch follows the traditional batch architecture, where a job repository does the work of scheduling and interacting with the job.
A job can have more than one step. And each step usually follows the sequence of reading, processing, and writing data.
And of course, the framework will do most of the heavy lifting for us here, especially when it comes to the low-level persistence work of dealing with the jobs, using sqlite for the job repository.
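To make the read-process-write sequence concrete before we get to the real configuration, here is a minimal, framework-free sketch in plain Java (no Spring Batch APIs — the names here are illustrative only) of what one step does with its reader, processor, and writer:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class StepSketch {

    // One simplified pass of a step: read items one at a time,
    // process each item, then hand all results to the "writer".
    public static List<String> runStep(List<Integer> input) {
        Iterator<Integer> reader = input.iterator();
        List<String> processedItems = new ArrayList<>();
        while (reader.hasNext()) {
            Integer item = reader.next();        // read one item
            String processed = "item-" + item;   // process it
            processedItems.add(processed);
        }
        // write: in Spring Batch the writer receives items in batches
        return new ArrayList<>(processedItems);
    }
}
```

In the real framework, each of the three roles is a separate, pluggable component (ItemReader, ItemProcessor, ItemWriter), as we'll see below.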
2.1. Example Use Case
The simple use case we're covering here is migrating some financial transaction data from CSV to XML.
The input file has a very simple structure.
It contains one transaction per line, consisting of a username, user ID, transaction date, and amount:
username, userid, transaction_date, transaction_amount
devendra, 1234, 31/10/2015, 10000
john, 2134, 3/12/2015, 12321
robin, 2134, 2/2/2015, 23411
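As a quick illustration of the record layout, the following standalone snippet (not part of the application code — just a sketch) splits one such line into its four fields:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RecordLineSketch {

    // Splits one CSV record into its fields, trimming surrounding whitespace.
    public static List<String> tokenize(String line) {
        return Arrays.stream(line.split(","))
          .map(String::trim)
          .collect(Collectors.toList());
    }
}
```

Spring Batch's DelimitedLineTokenizer does essentially this for us, and additionally associates each position with a field name.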
3. The Maven POM
The dependencies required for this project are Spring Core, Spring Batch, and the SQLite JDBC connector:
<!-- SQLite database driver -->
<dependency>
    <groupId>org.xerial</groupId>
    <artifactId>sqlite-jdbc</artifactId>
    <version>3.15.1</version>
</dependency>
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-oxm</artifactId>
    <version>5.3.0</version>
</dependency>
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-jdbc</artifactId>
    <version>5.3.0</version>
</dependency>
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-core</artifactId>
    <version>4.3.0</version>
</dependency>
4. Spring Batch Setup
First, let's configure Spring Batch using XML:
<!-- connect to SQLite database -->
<bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.sqlite.JDBC" />
    <property name="url" value="jdbc:sqlite:repository.sqlite" />
    <property name="username" value="" />
    <property name="password" value="" />
</bean>

<!-- create job-meta tables automatically -->
<jdbc:initialize-database data-source="dataSource">
    <jdbc:script location="org/springframework/batch/core/schema-drop-sqlite.sql" />
    <jdbc:script location="org/springframework/batch/core/schema-sqlite.sql" />
</jdbc:initialize-database>

<!-- stored job-meta in memory -->
<!--
<bean id="jobRepository"
  class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
    <property name="transactionManager" ref="transactionManager" />
</bean>
-->

<!-- stored job-meta in database -->
<bean id="jobRepository"
  class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="transactionManager" ref="transactionManager" />
    <property name="databaseType" value="sqlite" />
</bean>

<bean id="transactionManager"
  class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

<bean id="jobLauncher"
  class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
</bean>
Of course, a Java configuration is also available:
@Configuration
@EnableBatchProcessing
@Profile("spring")
public class SpringConfig {

    @Value("org/springframework/batch/core/schema-drop-sqlite.sql")
    private Resource dropRepositoryTables;

    @Value("org/springframework/batch/core/schema-sqlite.sql")
    private Resource dataRepositorySchema;

    @Bean
    public DataSource dataSource() {
        DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("org.sqlite.JDBC");
        dataSource.setUrl("jdbc:sqlite:repository.sqlite");
        return dataSource;
    }

    @Bean
    public DataSourceInitializer dataSourceInitializer(DataSource dataSource)
      throws MalformedURLException {
        ResourceDatabasePopulator databasePopulator = new ResourceDatabasePopulator();
        databasePopulator.addScript(dropRepositoryTables);
        databasePopulator.addScript(dataRepositorySchema);
        databasePopulator.setIgnoreFailedDrops(true);

        DataSourceInitializer initializer = new DataSourceInitializer();
        initializer.setDataSource(dataSource);
        initializer.setDatabasePopulator(databasePopulator);

        return initializer;
    }

    private JobRepository getJobRepository() throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(dataSource());
        factory.setTransactionManager(getTransactionManager());
        factory.afterPropertiesSet();
        return (JobRepository) factory.getObject();
    }

    private PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    public JobLauncher getJobLauncher() throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(getJobRepository());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }
}
5. Spring Batch Job Setup
Now let's write our job description for the CSV to XML job:
<import resource="spring.xml" />

<bean id="record" class="com.baeldung.spring_batch_intro.model.Transaction"></bean>

<bean id="itemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="input/record.csv" />
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names" value="username,userid,transactiondate,amount" />
                </bean>
            </property>
            <property name="fieldSetMapper">
                <bean class="com.baeldung.spring_batch_intro.service.RecordFieldSetMapper" />
            </property>
        </bean>
    </property>
</bean>

<bean id="itemProcessor" class="com.baeldung.spring_batch_intro.service.CustomItemProcessor" />

<bean id="itemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
    <property name="resource" value="file:xml/output.xml" />
    <property name="marshaller" ref="recordMarshaller" />
    <property name="rootTagName" value="transactionRecord" />
</bean>

<bean id="recordMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
    <property name="classesToBeBound">
        <list>
            <value>com.baeldung.spring_batch_intro.model.Transaction</value>
        </list>
    </property>
</bean>

<batch:job id="firstBatchJob">
    <batch:step id="step1">
        <batch:tasklet>
            <batch:chunk reader="itemReader" writer="itemWriter"
              processor="itemProcessor" commit-interval="10">
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
And here's the corresponding Java-based job configuration:
@Profile("spring")public class SpringBatchConfig { @Autowired private JobBuilderFactory jobs; StepBuilderFactory privada Schritte @Autowired; @Value("input/record.csv") recurso privado inputCsv; @Value("file:xml/output.xml") private Ressource outputXml; @Bean public ItemReader<Transaction> itemReader() löst UnexpectedInputException, ParseException { FlatFileItemReader<Transaction> lector = new FlatFileItemReader<Transaction>(); tokenizador DelimitedLineTokenizer = new DelimitedLineTokenizer(); String[] tokens = { "Benutzername", "Benutzer-ID", "Transaktionsdatum", "Betrag" }; tokenizer.setNames (token); lector.setResource (inputCsv); DefaultLineMapper<Transacción> lineMapper = new DefaultLineMapper<Transacción>(); lineMapper.setLineTokenizer (tokenizador); lineMapper.setFieldSetMapper (sin RecordFieldSetMapper()); lector.setLineMapper (lineMapper); Ruckleser; } @Bean public ItemProcessor<Transacción, Transacción> itemProcessor() { return new CustomItemProcessor(); } @Bean public ItemWriter<Transaction> itemWriter(Marshaller marshaller) löst MalformedURLException aus {StaxEventItemWriter<Transaction> itemWriter = new StaxEventItemWriter<Transaction>(); itemWriter.setMarshaller (Marshaller); itemWriter.setRootTagName("transactionRecord"); itemWriter.setResource (salidaXml); devolver artículoEscritor; } @Bean public Marshaller marshaller() { Jaxb2Marshaller marshaller = new Jaxb2Marshaller(); marshaller.setClassesToBeBound (neue Klasse [] { Transaction.class }); Rückkehrmarschall; } @Bean protected Schritt step1(ItemReader<Transaktion> Leser, ItemProcessor<Transaktion, Transaktion> Prozessor, ItemWriter<Transaktion> Schreiber) { return steps.get("step1").<Transaktion, Transaktion> chunk(10) .reader( lector).procesador(procesador).escritor(escritor).build(); } @Bean(name = "firstBatchJob") public Job job(@Qualifier("step1") Step step1) { return jobs.get("firstBatchJob").start(step1).build(); }}
Now that we have the setup complete, let's dive in and start the discussion.
5.1. Reading Data and Creating Objects With ItemReader
First, we'll configure the FlatFileItemReader to read the data from record.csv and convert it into a Transaction object:
@SuppressWarnings("constraint")@XmlRootElement(name = "transactionRecord") public class Transaction { private String username; user ID private int; private transaction date LocalDateTime; private duplo bravery; /* Getters and setters for attributes */ @Override public String toString() { return "Transaction [username=" + username + ", userID=" + userID + ", transactionDate=" + transactionDate + ", amount =" + bravery + "]"; }}
A custom mapper is used for this:
public class RecordFieldSetMapper implements FieldSetMapper<Transaction> {

    public Transaction mapFieldSet(FieldSet fieldSet) throws BindException {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("d/M/yyy");
        Transaction transaction = new Transaction();

        transaction.setUsername(fieldSet.readString("username"));
        transaction.setUserId(fieldSet.readInt(1));
        transaction.setAmount(fieldSet.readDouble(3));
        String dateString = fieldSet.readString(2);
        transaction.setTransactionDate(LocalDate.parse(dateString, formatter).atStartOfDay());

        return transaction;
    }
}
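The date handling deserves a closer look. The following standalone snippet performs the same conversion as the mapper: it parses a day/month/year string with the d/M/yyy pattern and widens the result to a LocalDateTime at the start of that day (the helper name here is ours, not part of the project):

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DateMappingSketch {

    // Parse "31/10/2015" (day/month/year) and attach a midnight time component,
    // since the Transaction model stores a LocalDateTime rather than a LocalDate.
    public static LocalDateTime toTransactionDate(String dateString) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("d/M/yyy");
        return LocalDate.parse(dateString, formatter).atStartOfDay();
    }
}
```

Note that the single-letter d and M accept both one- and two-digit values, so 2/2/2015 and 31/10/2015 both parse with the same pattern.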
5.2. Processing Data With ItemProcessor
We've created our own item processor, CustomItemProcessor. This doesn't process anything related to the transaction object.
It just passes the original object coming from the reader to the writer:
public class CustomItemProcessor implements ItemProcessor<Transaction, Transaction> {

    public Transaction process(Transaction item) {
        return item;
    }
}
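A processor can do more than pass items through: in Spring Batch, returning null from process() filters the item out of the chunk entirely. Here is a framework-free sketch of that contract (this mimics, but does not use, the real ItemProcessor API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class ProcessorSketch {

    // Apply a processor to each item; a null result drops the item,
    // mirroring Spring Batch's filtering semantics.
    public static List<Double> applyProcessor(List<Double> items,
      UnaryOperator<Double> processor) {
        List<Double> result = new ArrayList<>();
        for (Double item : items) {
            Double processed = processor.apply(item);
            if (processed != null) {
                result.add(processed);
            }
        }
        return result;
    }
}
```

For example, a processor such as `a -> a > 0 ? a : null` would keep only transactions with a positive amount.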
5.3. Writing Objects to the File System With ItemWriter
Finally, we'll persist the Transaction to an XML file located at xml/output.xml:
<bean id="itemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter"> <property name="resource" value="file:xml/output.xml" /> <property name="marshaller" ref="recordMarshaller" /> <property name="rootTagName" value="transactionRecord" /></bean>
5.4. Configuring the Batch Job
All we have to do now is connect the dots into a job, using the batch:job syntax.
Note the commit-interval. This is the number of transactions to be kept in memory before committing the batch to the itemWriter.
It will hold the transactions in memory until that point (or until the end of the input data is encountered):
<batch:job id="firstBatchJob"> <batch:step id="step1"> <batch:task> <batch:fragment reader="itemReader"writer="itemWriter" Processor="itemProcessor" commit-interval="10 "> </batch:fragment> </batch:task> </batch:step></batch:job>
5.5. Running the Batch Job
Now let's set up and run everything:
@Profile("spring")public class App { public static void main(String[] args) { // Spring Java config AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext(); contexto.registrar (SpringConfig.class); contexto.registrar (SpringBatchConfig.class); Kontext.refresh(); JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher"); Trabajo trabajo = (Trabajo) context.getBean("firstBatchJob"); System.out.println("Iniciando o trabalho em lote"); intente {Ejecución de JobExecution = jobLauncher.run(job, new JobParameters()); System.out.println("Estado de la tarifa: " + ejecución.getStatus()); System.out.println("Trabalho concluido"); } catch (Ausnahme e) {e.printStackTrace(); System.out.println("Falha no trabalho"); } }}
We run our Spring application with the -Dspring.profiles.active=spring profile.
In the next section, we'll set up our example in a Spring Boot application.
6. Spring Boot Configuration
In this section, we will create a Spring Boot application and convert the above Spring Batch Config to run in the Spring Boot environment. In fact, this is pretty much the same as the Spring Batch example above.
6.1. Maven Dependencies
Let's start by declaring the spring-boot-starter-batch dependency in a Spring Boot application's pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
We need a database to store the Spring Batch job information. In this tutorial, we'll use an in-memory HSQLDB database, so we need to use hsqldb with Spring Boot as well:
<dependency>
    <groupId>org.hsqldb</groupId>
    <artifactId>hsqldb</artifactId>
    <version>2.7.0</version>
    <scope>runtime</scope>
</dependency>
6.2. Spring Boot Configuration
We use the @Profile annotation to distinguish between the Spring and the Spring Boot configurations. We set the spring-boot profile in our application:
@SpringBootApplication
public class SpringBatchApplication {

    public static void main(String[] args) {
        SpringApplication springApp = new SpringApplication(SpringBatchApplication.class);
        springApp.setAdditionalProfiles("spring-boot");
        springApp.run(args);
    }
}
6.3. Spring Batch Job Configuration
We'll use the same batch job configuration as in the SpringBatchConfig class from the earlier sections:
@Configuration
@EnableBatchProcessing
@Profile("spring-boot")
public class SpringBootBatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Value("input/record.csv")
    private Resource inputCsv;

    @Value("input/recordWithInvalidData.csv")
    private Resource invalidInputCsv;

    @Value("file:xml/output.xml")
    private Resource outputXml;

    // ...
}
We start with a Spring @Configuration annotation. Then we add the @EnableBatchProcessing annotation to the class. The @EnableBatchProcessing annotation automatically creates the DataSource object and provides it to our job.
7. Conclusion
In this article, we learned how to work with Spring Batch and how to use it in a simple use case.
We've seen how to easily build our batch processing pipeline and how to customize the different stages of reading, processing, and writing.
As always, the full implementation of this article is available over on GitHub.