Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

mysql - Hibernate, JDBC and Java performance on medium and big result set

Issue

We are trying to optimize our dataserver application. It stores stocks and quotes in a MySQL database, and we are not satisfied with the fetch performance.

Context

- database
    - table stock: around 500 rows
    - table quote: 3,000,000 to 10,000,000 rows
    - one-to-many association: one stock owns n quotes
    - fetching around 1000 quotes per request
    - there is an index on (stockId, date) in the quote table
    - no cache, because in production the queries are always different
- Hibernate 3
- MySQL 5.5
- Java 6
- MySQL Connector/J (JDBC driver) 5.1.13
- c3p0 pooling

Tests and results

Protocol

  • Execution times on the MySQL server are obtained by running the generated SQL queries in the mysql command-line client
  • The server is in a test context: no other DB reads, no DB writes
  • We fetch 857 quotes for the AAPL stock

Case 1 : Hibernate with association

This fills our Stock object with 857 Quote objects (everything is correctly mapped in hibernate.xml).

session.enableFilter("after").setParameter("after", 1322910573000L);
Stock stock = (Stock) session.createCriteria(Stock.class)
        .add(Restrictions.eq("stockId", stockId))
        .setFetchMode("quotes", FetchMode.JOIN)
        .uniqueResult();

SQL generated :

SELECT this_.stockId AS stockId1_1_,
       this_.symbol AS symbol1_1_,
       this_.name AS name1_1_,
       quotes2_.stockId AS stockId1_3_,
       quotes2_.quoteId AS quoteId3_,
       quotes2_.quoteId AS quoteId0_0_,
       quotes2_.value AS value0_0_,
       quotes2_.stockId AS stockId0_0_,
       quotes2_.volume AS volume0_0_,
       quotes2_.quality AS quality0_0_,
       quotes2_.date AS date0_0_,
       quotes2_.createdDate AS createdD7_0_0_,
       quotes2_.fetcher AS fetcher0_0_
FROM stock this_
LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
AND quotes2_.date > 1322910573000
WHERE this_.stockId='AAPL'
ORDER BY quotes2_.date ASC

Results :

  • Execution time on mysql server : ~10 ms
  • Execution time in Java : ~400ms

Case 2 : Hibernate without association without HQL

Hoping to improve performance, we used the code below, which fetches only the Quote objects; we then add them to a Stock manually (so we don't fetch repeated stock information for every row). We used createSQLQuery to minimize the effects of aliases and HQL overhead.

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());

SQL generated :

SELECT *
FROM quote q
WHERE stockId='AAPL'
  AND q.date>1322910573000
ORDER BY q.date ASC

Results :

  • Execution time on mysql server : ~10 ms
  • Execution time in Java : ~370ms

Case 3 : JDBC without Hibernate

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
Connection conn = SimpleJDBC.getConnection();
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
while(rs.next())
{
    stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
}
stmt.close();
conn.close();

Results :

  • Execution time on mysql server : ~10 ms
  • Execution time in Java : ~100ms

Our understandings

  • The JDBC driver is common to all the cases
  • There is a baseline time cost in going through the JDBC driver
  • With similar SQL queries, Hibernate spends more time than pure JDBC code converting result sets into objects
  • Hibernate createCriteria, createSQLQuery and createQuery have similar time costs
  • In production, where we have lots of concurrent writes, the pure JDBC solution seemed slower than the Hibernate one (maybe because our JDBC solution was not pooled)
  • MySQL-wise, the server seems to behave very well, and its time cost is very acceptable

Our questions

  • Is there a way to optimize the performance of the JDBC driver?
  • And will Hibernate benefit from this optimization?
  • Is there a way to optimize Hibernate performance when converting result sets?
  • Are we facing something not tunable because of Java's fundamental object and memory management?
  • Are we missing a point, are we stupid and is all of this in vain?
  • Are we French? Yes.

Your help is very welcome.


1 Reply


Can you do a smoke test with the simplest possible query, like:

SELECT current_timestamp()

or

SELECT 1 + 1

This will tell you the actual JDBC driver overhead. Also, it is not clear whether both tests are run from the same machine.
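As a rough illustration of that smoke test, the sketch below times one round trip of the trivial query through plain JDBC. The class and method names (DriverOverheadProbe, timeOnceNanos) are hypothetical, and the Connection is assumed to come from whatever source your tests already use; only standard java.sql calls are involved. It is written without try-with-resources so that it still compiles on Java 6.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper: measures one round trip of a trivial query. */
final class DriverOverheadProbe {

    /** The trivial query suggested above; it touches no table. */
    static final String SMOKE_SQL = "SELECT 1 + 1";

    private DriverOverheadProbe() {}

    /** Runs SMOKE_SQL once, reads its single row, and returns the elapsed nanoseconds. */
    static long timeOnceNanos(Connection conn) throws SQLException {
        long start = System.nanoTime();
        Statement stmt = conn.createStatement();
        try {
            ResultSet rs = stmt.executeQuery(SMOKE_SQL);
            try {
                while (rs.next()) {
                    rs.getInt(1); // force the driver to decode the value
                }
            } finally {
                rs.close();
            }
        } finally {
            stmt.close();
        }
        return System.nanoTime() - start;
    }
}
```

A single call mostly measures start-up effects, so the figure only becomes meaningful when the call is repeated many times on the same connection and from the same machine as the other tests.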

Is there a way to optimize the performance of the JDBC driver?

Run the same query several thousand times in Java. The JVM needs some time to warm up (class loading, JIT). Also, I assume SimpleJDBC.getConnection() uses C3P0 connection pooling; the cost of establishing a connection is pretty high, so the first few executions could be slow.

Also, prefer named queries to ad-hoc or criteria queries.
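To make the warm-up advice concrete, here is a minimal timing harness sketch in plain Java. WarmupBenchmark and its methods are hypothetical names, not part of any library. It runs a task untimed for a number of warm-up iterations, then returns the median of the timed runs. It takes a Runnable so that the JDBC or Hibernate call under test can be plugged in (with any SQLException wrapped in a RuntimeException), and it avoids lambdas so that it still compiles on Java 6.

```java
import java.util.Arrays;

/** Hypothetical timing harness: discards warm-up runs, then reports the median. */
final class WarmupBenchmark {

    private WarmupBenchmark() {}

    /**
     * Runs the task warmupRuns times without timing it, then measuredRuns times
     * with timing, and returns the median elapsed time in nanoseconds.
     */
    static long medianNanos(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        // Warm-up: class loading, JIT compilation and pool start-up happen here, untimed.
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long[] samples = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return median(samples);
    }

    /** Median of the samples (upper median for an even count); the input is not modified. */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }
}
```

The median is used rather than the mean so that a single garbage-collection pause does not dominate the figure; that is a design choice of this sketch, not a requirement. The iteration counts are left to the caller, since the number of runs needed before timings stabilize depends on the workload.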

And will Hibernate benefit from this optimization?

Hibernate is a very complex framework. As your numbers show, it accounts for about 75% of the overall execution time (roughly 300 of the ~400 ms) once the ~100 ms raw JDBC cost is subtracted. If you need a bare-bones ORM (no lazy loading, dirty checking or advanced caching), consider MyBatis, or maybe even JdbcTemplate with the RowMapper abstraction.
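The idea behind such lightweight mappers can be sketched with plain JDBC alone. The code below is a hand-rolled illustration, not MyBatis or Spring code: QuoteRow, RowReader and QuoteDao are hypothetical names, the column names are taken from the Case 3 snippet, and the ascending sort order is an assumption. It binds parameters with a PreparedStatement instead of concatenating them into the SQL string, and it selects only the columns it reads.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value class holding the four columns read in Case 3. */
final class QuoteRow {
    final int volume;
    final long date;
    final float value;
    final byte fetcher;

    QuoteRow(int volume, long date, float value, byte fetcher) {
        this.volume = volume;
        this.date = date;
        this.value = value;
        this.fetcher = fetcher;
    }
}

/** Hand-written mapping from the current result-set row to one object. */
interface RowReader<T> {
    T read(ResultSet rs) throws SQLException;
}

final class QuoteDao {

    /** Parameterized query; the sort direction is an assumption of this sketch. */
    static final String SQL =
            "SELECT volume, date, value, fetcher FROM quote "
          + "WHERE stockId = ? AND date > ? ORDER BY date ASC";

    /** Direct getter calls, with no reflection or proxy generation involved. */
    static final RowReader<QuoteRow> READER = new RowReader<QuoteRow>() {
        public QuoteRow read(ResultSet rs) throws SQLException {
            return new QuoteRow(rs.getInt("volume"), rs.getLong("date"),
                    rs.getFloat("value"), rs.getByte("fetcher"));
        }
    };

    private QuoteDao() {}

    /** Returns the quotes of one stock dated after the given timestamp. */
    static List<QuoteRow> findAfter(Connection conn, String stockId, long after)
            throws SQLException {
        PreparedStatement ps = conn.prepareStatement(SQL);
        try {
            ps.setString(1, stockId);
            ps.setLong(2, after);
            ResultSet rs = ps.executeQuery();
            try {
                List<QuoteRow> rows = new ArrayList<QuoteRow>();
                while (rs.next()) {
                    rows.add(READER.read(rs));
                }
                return rows;
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
    }
}
```

The caller stays responsible for obtaining the Connection and returning it to the pool, which keeps the mapping code independent of how connections are managed.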

Is there a way to optimize Hibernate performance when converting result sets?

Not really. Check out Chapter 19, "Improving performance", in the Hibernate documentation. There is a lot of reflection and class generation happening in there. Once again, Hibernate might not be the best solution when you want to squeeze every millisecond out of your database.

However, it is a good choice when you want to improve the overall user experience, thanks to its extensive caching support. Check out the performance doc again: it mostly talks about caching. There is a first-level cache, a second-level cache, a query cache... This is where Hibernate might actually outperform simple JDBC: it can cache a lot, in ways you could not even imagine. On the other hand, a poor cache configuration would lead to an even slower setup.

Check out: Caching with Hibernate + Spring - some Questions!

Are we facing something not tunable because of Java's fundamental object and memory management?

The JVM (especially in the server configuration) is quite fast. Object creation on the heap is as fast as on the stack in, e.g., C, and garbage collection has been greatly optimized. I don't think the Java version running plain JDBC would be much slower than a more native connection. That's why I suggested a few improvements to your benchmark.

Are we missing a point, are we stupid and is all of this in vain?

I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.

