I have some code that performs an UPSERT, also known as a merge. I want to clean up this code: specifically, I want to move away from exception handling and reduce the verbosity and sheer complexity of the code for such a simple operation. The requirement is to insert each item unless it already exists:
public void batchInsert(IncomingItem[] items) {
    try(Session session = sessionFactory.openSession()) {
        batchInsert(session, items);
    }
    catch(PersistenceException e) {
        if(e.getCause() instanceof ConstraintViolationException) {
            logger.warn("attempting to recover from constraint violation");
            DateTimeFormatter dbFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
            items = Arrays.stream(items).filter(item -> {
                int n = db.queryForObject("select count(*) from rets where source = ? and systemid = ? and updtdate = ?::timestamp",
                        Integer.class,
                        item.getSource().name(), item.getSystemID(),
                        dbFormat.format(item.getUpdtDateObj()));
                if(n != 0) {
                    logger.warn("REMOVED DUPLICATE: " +
                            item.getSource() + " " + item.getSystemID() + " " + item.getUpdtDate());
                    return false;
                }
                else {
                    return true; // keep
                }
            }).toArray(IncomingItem[]::new);
            try(Session session = sessionFactory.openSession()) {
                batchInsert(session, items);
            }
        }
    }
}
An initial search of SO is unsatisfactory:
In the question How to do ON DUPLICATE KEY UPDATE in Spring Data JPA? which was marked as a duplicate, I noticed this intriguing comment:
That was a dead end, as I really don't understand the comment, despite it sounding like a clever solution and mentioning the "actual same SQL statement".
Another promising approach is this: Hibernate and Spring modify query Before Submitting to DB
ON CONFLICT DO NOTHING / ON DUPLICATE KEY UPDATE
Both of the major open-source databases, PostgreSQL and MySQL, support a mechanism to push idempotency down to the database. The examples below use the PostgreSQL syntax but can easily be adapted for MySQL.
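As a rough illustration of how the two dialects differ, the sketch below builds an idempotent insert for the rets table in each syntax. The column list (source, systemid, updtdate) is borrowed from the duplicate-check query earlier in this question. The class and method names are hypothetical, and the MySQL form assumes a unique key over those columns. Because ON DUPLICATE KEY UPDATE always needs an assignment, the sketch uses a no-op self-assignment to get "do nothing" behaviour.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Hibernate or Spring): builds an
// idempotent INSERT statement for either dialect.
final class IdempotentInsertSql {

    enum Dialect { POSTGRESQL, MYSQL }

    static String build(Dialect dialect, String table, List<String> columns) {
        String cols = String.join(", ", columns);
        // One "?" placeholder per column.
        String params = String.join(", ", Collections.nCopies(columns.size(), "?"));
        String insert = "insert into " + table + " (" + cols + ") values (" + params + ")";

        switch (dialect) {
            case POSTGRESQL:
                // PostgreSQL silently skips rows that violate a unique constraint.
                return insert + " on conflict do nothing";
            case MYSQL:
                // Assumption: a no-op self-assignment stands in for "do nothing".
                String first = columns.get(0);
                return insert + " on duplicate key update " + first + " = " + first;
            default:
                throw new IllegalArgumentException("unsupported dialect: " + dialect);
        }
    }
}
```

Calling build(Dialect.POSTGRESQL, "rets", ...) yields the same suffix that the inspector further down appends to Hibernate's generated statement.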
By following the ideas in "Hibernate and Spring modify query Before Submitting to DB", "Hooking into Hibernate's query generation", and "How I can configure StatementInspector in Hibernate?", I implemented:
import org.hibernate.resource.jdbc.spi.StatementInspector;

@SuppressWarnings("serial")
public class IdempotentInspector implements StatementInspector {
    @Override
    public String inspect(String sql) {
        if(sql.startsWith("insert into rets")) {
            sql += " ON CONFLICT DO NOTHING";
        }
        return sql;
    }
}
with property
<prop key="hibernate.session_factory.statement_inspector">com.myapp.IdempotentInspector</prop>
Unfortunately this leads to the following error when a duplicate is encountered:
Caused by:
org.springframework.orm.hibernate5.HibernateOptimisticLockingFailureException:
Batch update returned unexpected row count from update [0]; actual row
count: 0; expected: 1; nested exception is
org.hibernate.StaleStateException: Batch update returned unexpected
row count from update [0]; actual row count: 0; expected: 1
This makes sense if you think about what's going on under the covers: the ON CONFLICT DO NOTHING clause causes zero rows to be inserted for a duplicate, but Hibernate expects each insert to affect exactly one row.
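To make the failure concrete, here is a simplified model of the kind of check that produces this message. It is not Hibernate's actual source, and the class and method names are hypothetical. It only mimics the idea that every statement in the batch is expected to report one affected row, while a duplicate skipped by ON CONFLICT DO NOTHING reports zero.

```java
// Hypothetical, simplified model of a per-statement row-count check.
final class RowCountCheck {

    // Throws if any statement in the batch affected a different number
    // of rows than expected.
    static void verify(int[] actualRowCounts, int expectedPerStatement) {
        for (int i = 0; i < actualRowCounts.length; i++) {
            if (actualRowCounts[i] != expectedPerStatement) {
                throw new IllegalStateException(
                        "Batch update returned unexpected row count from update [" + i
                        + "]; actual row count: " + actualRowCounts[i]
                        + "; expected: " + expectedPerStatement);
            }
        }
    }
}
```

Under this model, a batch whose counts are {1, 1} passes, while {0} fails in the same way as the stack trace above. So any approach that appends ON CONFLICT DO NOTHING also has to relax or bypass that expectation.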
Is there a solution that enables thread-safe, exception-free, concurrent, idempotent inserts and doesn't require manually defining the entire SQL insert statement to be executed by Hibernate?
For what it's worth, I feel that the approaches that push the duplicate check down to the database are the path to a proper solution.
CLARIFICATION
The IncomingItem objects consumed by the batchInsert method originate from a system where records are immutable. Under this special condition, ON CONFLICT DO NOTHING behaves the same as an UPSERT, notwithstanding possible loss of the Nth update.
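The equivalence claimed here can be illustrated with an in-memory stand-in for the table. This is only a sketch under assumptions: the map key stands for the unique key (source, systemid, updtdate), the payload string is hypothetical, and immutability is modelled by a given key always arriving with the same payload.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; each row is {key, payload}.
final class ImmutableRecordsDemo {

    // Insert-if-absent, analogous to ON CONFLICT DO NOTHING.
    static Map<String, String> insertIgnoringDuplicates(List<String[]> rows) {
        Map<String, String> table = new HashMap<>();
        for (String[] row : rows) {
            table.putIfAbsent(row[0], row[1]);
        }
        return table;
    }

    // Insert-or-overwrite, analogous to an UPSERT.
    static Map<String, String> upsert(List<String[]> rows) {
        Map<String, String> table = new HashMap<>();
        for (String[] row : rows) {
            table.put(row[0], row[1]);
        }
        return table;
    }
}
```

When every duplicate key carries the same payload, both methods leave the table in the same state. If a later row for an existing key carried a different payload, the insert-if-absent variant would keep the first version, which is the "loss of the Nth update" mentioned above.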