
javascript - Optimized Bulk (Chunk) Upload Of Objects Into IndexedDB

I want to add objects to a table (object store) in IndexedDB in one transaction:

_that.bulkSet = function (data, key) {
    // Buffer the incoming value/key pair until a full chunk has accumulated.
    _bulkKWVals.push(data);
    _bulkKWKeys.push(key);

    if (_bulkKWVals.length === 3000) {
        // Open one readwrite transaction for the whole chunk.
        var transaction = _db.transaction([_tblName], "readwrite"),
            store = transaction.objectStore(_tblName),
            ii = 0;

        insertNext();
    }

    // Chain each add() off the previous request's success event.
    function insertNext() {
        if (ii < _bulkKWVals.length) {
            store.add(_bulkKWVals[ii], _bulkKWKeys[ii]).onsuccess = insertNext;
            ++ii;
        } else {
            console.log(_bulkKWVals.length);
        }
    }
};

It seems to work fine, but it is not a very optimized way of doing this, especially if the number of objects is very high (~50,000-500,000). How could I optimize it? Ideally I want to add the first 3,000, remove them from the array, then add another 3,000, and so on, in chunks. Any ideas?
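
For concreteness, a rough sketch of what I mean (the `bulkAdd` helper, the `done` callback, and the chunk size are illustrative, not part of my existing code; it reuses the module-level `_db` and `_tblName` from above):

// Sketch: insert vals/keys in chunks of 3000, one readwrite transaction
// per chunk. The next chunk starts only after the previous transaction
// has committed, and processed entries are removed from the arrays.
function bulkAdd(vals, keys, done) {
    var CHUNK = 3000;

    function insertChunk() {
        if (vals.length === 0) {
            if (done) done();
            return;
        }

        var transaction = _db.transaction([_tblName], "readwrite"),
            store = transaction.objectStore(_tblName),
            chunkVals = vals.splice(0, CHUNK),
            chunkKeys = keys.splice(0, CHUNK);

        // Queue every add() for this chunk synchronously; the transaction
        // commits on its own once all queued requests have finished.
        for (var i = 0; i < chunkVals.length; i++) {
            store.add(chunkVals[i], chunkKeys[i]);
        }

        transaction.oncomplete = insertChunk;
        transaction.onerror = function (e) {
            console.error("chunk failed", e);
        };
    }

    insertChunk();
}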



1 Reply


When inserting that many rows consecutively, it is not possible to get good performance.

I'm an IndexedDB dev and have real-world experience with IndexedDB at the scale you're talking about (writing hundreds of thousands of rows consecutively). It ain't too pretty.

In my opinion, IDB is not suitable for use when a large amount of data has to be written consecutively. If I were to architect an IndexedDB app that needed lots of data, I would figure out a way to seed it slowly over time.
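
As a rough sketch of what "seed it slowly" could look like (the `seedSlowly` helper, the batch size, and the idle-callback fallback are illustrative assumptions, reusing the question's `_db` and `_tblName`):

// Sketch: trickle small batches into IndexedDB during idle time instead
// of one huge consecutive write. Assumes the store generates its own keys
// (autoIncrement or an in-line keyPath).
function seedSlowly(rows) {
    var BATCH = 100;

    function writeBatch() {
        if (rows.length === 0) return;

        var transaction = _db.transaction([_tblName], "readwrite"),
            store = transaction.objectStore(_tblName);

        rows.splice(0, BATCH).forEach(function (row) {
            store.add(row);
        });

        // Schedule the next small batch when the browser is idle, so the
        // page and the IDB write queue never get swamped.
        transaction.oncomplete = function () {
            if (window.requestIdleCallback) {
                requestIdleCallback(writeBatch);
            } else {
                setTimeout(writeBatch, 250);
            }
        };
    }

    writeBatch();
}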

The issue is writes, and the problem as I see it is that the slowness of writes, combined with their I/O-intensive nature, makes things get worse over time. (Reads are always lightning fast in IDB, for what it's worth.)

To start, you'll get savings from re-using transactions. Because of that, your first instinct might be to try to cram everything into the same transaction. But what I've found in Chrome, for example, is that the browser doesn't seem to like long-running writes, perhaps because of some mechanism meant to throttle misbehaving tabs.
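
For what the re-use looks like in practice, here's a sketch using the question's variables: within one transaction you don't need to chain each add() off the previous onsuccess at all, since requests queued on the same transaction run in order anyway. You can issue them all synchronously and listen on the transaction instead (just keep each batch bounded, given the long-running-write caveat above):

// Sketch: one transaction re-used for a whole (bounded) batch, with all
// add() calls queued up front instead of one per success event.
var transaction = _db.transaction([_tblName], "readwrite"),
    store = transaction.objectStore(_tblName);

for (var i = 0; i < _bulkKWVals.length; i++) {
    store.add(_bulkKWVals[i], _bulkKWKeys[i]);
}

transaction.oncomplete = function () {
    console.log(_bulkKWVals.length + " entries written");
};
transaction.onerror = function (e) {
    console.error("batch failed", e);
};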

I'm not sure what kind of performance you're seeing, but average numbers might fool you depending on the size of your test. The limiting factor is throughput, but if you're trying to insert large amounts of data consecutively, pay attention to writes over time specifically.
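
One way to watch writes over time rather than trusting the overall average (a sketch; the sampling window and the `recordWrite` name are made up): call something like this from each add()'s onsuccess handler and compare the rolling rate against the running average.

// Sketch: rolling entries/second vs. overall average, logged every 5s,
// so the initial burst doesn't hide the long-run write rate.
var written = 0,
    totalWritten = 0,
    startTime = Date.now(),
    windowStart = Date.now();

function recordWrite() {
    written++;
    totalWritten++;

    var now = Date.now();
    if (now - windowStart >= 5000) {
        var rolling = written / ((now - windowStart) / 1000),
            overall = totalWritten / ((now - startTime) / 1000);
        console.log("last 5s: " + rolling.toFixed(1) + "/s, overall: " +
                    overall.toFixed(1) + "/s");
        written = 0;
        windowStart = now;
    }
}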

I happen to be working on a demo with several hundred thousand rows at my disposal, and have stats. With my visualization disabled, running pure dash on IDB, here's what I see right now in Chrome 32 on a single object store with a single non-unique index with an auto-incrementing primary key.

With a much, much smaller 27k-row dataset, I saw 60-70 entries/second:
* ~30 seconds: 921 entries/second on average (there's always a great burst of inserts at the start), 62/second at the moment I sampled
* ~60 seconds: 389/second average (the sustained decrease is starting to outweigh the initial burst), 71/second at the moment
* ~1:30: 258/second average, 67/second at the moment
* ~2:00 (~1/3 done): 188/second on average, 66/second at the moment

Some examples with a much smaller dataset show far better performance, but similar characteristics. The same goes for much larger datasets: the effects are greatly exaggerated, and I've seen as little as <1 entry per second when leaving it running for multiple hours.

