Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
389 views
in Technique[技术] by (71.8m points)

java - Best approach using Spring batch to process big file

I am using Spring batch to download a big file in order to process it. the scenario is pretty simple:

1. Download the file via http
2. process it(validations,transformations)
3. send it into queue
  • no need to save the input file data.
  • we might have multiple job instances(of the same scenario) running in the same time

I am looking for best practice to handle this situation.

Should I create Tasklet to download the file locally and than start processing it via regular steps?
in that case I need to consider some temp-file-concerns (make sure I delete it, make sure i am not overriding other temp filename, etc..)

In other hand I could download it and keep it in-memory but than I afraid that if I run many jobs instances ill be out of memory very soon.

How would you suggest to nail this scenario ?? Should I use tasklet at all?

thank you.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you have a large file, I'd recommend storing it to disk unless there is a good reason not to. By saving the file to disk it allows you to restart the job without the need to re-download the file if an error occurs.

With regards to the Tasklet vs Spring Integration, we typically recommend Spring Integration for this type of functionality since FTP functionality is already available there. That being said, Spring XD uses a Tasklet for FTP functionality so it's not uncommon to take that approach as well.

A good video to watch about the integration of Spring Batch and Spring Integration is the talk Gunnar Hillert and I gave at SpringOne2GX. You can find the entire video here: https://www.youtube.com/watch?v=8tiqeV07XlI. The section that talks about using Spring Batch Integration for FTP before Spring Batch is at about 29:37.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...