Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

javascript - Querying the same document in parallel in the same API in MongoDB

I have an API written in TypeScript, and I am trying to run parallel queries against the same document using Promise.allSettled. However, it performs worse, and I suspect the queries run sequentially. Is there a way to run parallel queries on the same document over the same connection in MongoDB? Here is the code:

console.time("normal");
let normal = await ContentRepo.geBySkillIdWithSourceFiltered(
    [chosenSkillsArr[0].sid!],
    readContentIds,
    body.isVideoIncluded,
    true,
    true
);
console.timeEnd("normal");

console.time("parallel");
const parallel = await Promise.allSettled(
    chosenSkillsArr.map(async (skill: IScrapeSkillDocument) => {
        const result = await ContentRepo.geBySkillIdWithSourceFiltered(
            [skill.sid!],
            readContentIds,
            body.isVideoIncluded,
            true,
            true
        );
    })
);
console.timeEnd("parallel");

The function I called is here:

async geBySkillIdWithSourceFiltered(
    skillIds: string[],
    contentIds: string[],
    isVideoIncluded?: boolean,
    isCuratorIdFilter?: boolean,
    activeSourceFilter?: boolean
): Promise<IContentWithSource[]> {
    try {
        console.time(`single-${skillIds}`);
        var contents = await ContentM.find({
            $and: [
                { "skills.skillId": { $in: skillIds } },
                { recordStatus: true },
                isCuratorIdFilter ? { curatorId: 0 } : {},
                isVideoIncluded ? {} : { type: contentTypeNumber.read },
                { _id: { $nin: contentIds } },
            ],
        }).exec();
        var items: IContentWithSource[] = [];
        var sourceIds = new Set<string>();
        contents.forEach((content) => {
            if (!this.isEmpty(content.sourceId)) {
                sourceIds.add(content.sourceId!);
            }
        });
        var sources: any = {};
        var sourcesArr = await new SourceRepo().getByIds(
            Array.from(sourceIds)
        );
        sourcesArr.forEach((source) => {
            sources[source._id] = source;
        });

        if (activeSourceFilter) {
            contents
                .map((i) => i.toJSON() as IContentWithSource)
                .map((k) => {
                    if (sources[k.sourceId!].isActive) {
                        k.source = sources[k.sourceId!];
                        items.push(k);
                    }
                });
        } else {
            contents
                .map((i) => i.toJSON() as IContentWithSource)
                .map((k) => {
                    k.source = sources[k.sourceId!];
                    items.push(k);
                });
        }
        console.timeEnd(`single-${skillIds}`);

        return items;
    } catch (err) {
        throw err;
    }
}

And the results are:

single-KS120B874P2P6BK1MQ0T: 1872.735ms
normal: 1873.934ms
single-KS120B874P2P6BK1MQ0T: 3369.925ms
single-KS440QS66YCBN23Y8K25: 3721.214ms
single-KS1226Y6DNDT05G7FJ4J: 3799.050ms
parallel: 3800.586ms


1 Reply

It seems like you are running more code in the parallel version:

// The normal version
let normal = await ContentRepo.geBySkillIdWithSourceFiltered(
    [chosenSkillsArr[0].sid!],
    readContentIds,
    body.isVideoIncluded,
    true,
    true
);

// The code inside the parallel version:
chosenSkillsArr.map(async (skill: IScrapeSkillDocument) => {
    const result = await ContentRepo.geBySkillIdWithSourceFiltered(
        [skill.sid!],
        readContentIds,
        body.isVideoIncluded,
        true,
        true
    );
})

Compare [chosenSkillsArr[0].sid!] with chosenSkillsArr.map().

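The normal version calls ContentRepo.geBySkillIdWithSourceFiltered once, for the first skill only. The parallel version calls it once for every element of chosenSkillsArr (three times, judging by your timings), so the two measurements do not cover the same amount of work. That is why the parallel version takes longer. Note also that the three calls finished in about 3.8 s in total. If each call cost about as much as the single one (about 1.87 s), three back-to-back calls would take roughly 5.6 s, which suggests the queries did overlap rather than run one after another.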
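A like-for-like comparison runs the same list of inputs both ways. The following is only a minimal sketch of that idea: fakeQuery is a hypothetical stand-in that waits on a timer instead of calling ContentRepo.geBySkillIdWithSourceFiltered, and the ids "a", "b", "c" are made up. It records start and end events rather than timings, so the overlap is visible directly.

```typescript
// Hypothetical stand-in for a database call such as
// ContentRepo.geBySkillIdWithSourceFiltered; it only waits on a timer.
function fakeQuery(skillId: string, log: string[]): Promise<string> {
    log.push(`start ${skillId}`);
    return new Promise((resolve) =>
        setTimeout(() => {
            log.push(`end ${skillId}`);
            resolve(skillId);
        }, 20)
    );
}

// One call after another: each query starts only after the previous one ended.
async function runSequential(skillIds: string[]): Promise<string[]> {
    const log: string[] = [];
    for (const id of skillIds) {
        await fakeQuery(id, log);
    }
    return log;
}

// All calls issued up front: every query starts before any of them ends.
async function runConcurrent(skillIds: string[]): Promise<string[]> {
    const log: string[] = [];
    await Promise.allSettled(skillIds.map((id) => fakeQuery(id, log)));
    return log;
}

async function main(): Promise<void> {
    const ids = ["a", "b", "c"];
    console.log("sequential:", (await runSequential(ids)).join(", "));
    console.log("concurrent:", (await runConcurrent(ids)).join(", "));
}

main();
```

In the sequential log every "start" is followed by its own "end"; in the concurrent log all the "start" entries come first. Wrapping each of the two functions in console.time / console.timeEnd with the real repository call would give timings that can be compared fairly.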

For the question about running promises in parallel:

Like Promise.all, Promise.allSettled waits for multiple promises. It does not care in what order they settle, or whether the underlying work runs in parallel. Neither function guarantees concurrency, nor the opposite. Their job is only to make sure that all the promises passed to them are handled.

So you cannot guarantee the parallel execution of promises yourself.
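To illustrate that Promise.allSettled only waits, here is a small sketch under made-up assumptions: task is a hypothetical function that sleeps on a timer, and the counter exists only for the demonstration. The work begins when each async function is called, which happens before Promise.allSettled ever sees the promises.

```typescript
let started = 0;

// Hypothetical task; the counter is bumped synchronously when the function is called.
async function task(ms: number): Promise<number> {
    started += 1;
    await new Promise((resolve) => setTimeout(resolve, ms));
    return ms;
}

async function demo(): Promise<{ startedBeforeWait: number; statuses: string[] }> {
    started = 0;
    // Calling task() starts it; all three are already in flight after this line.
    const pending = [task(10), task(20), task(30)];
    const startedBeforeWait = started;
    // allSettled starts nothing; it only waits for the promises it was given.
    const settled = await Promise.allSettled(pending);
    return { startedBeforeWait, statuses: settled.map((r) => r.status) };
}

demo().then((r) => console.log(r));
```

Whether the in-flight work actually overlaps depends on what is behind each promise (timers here, network and database I/O in the question), not on which Promise helper collects the results.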

Here is a really interesting article explaining parallelism with Promise.all, and how the browser differs from Node.js installed on your computer in terms of parallelism.

Here is the extract of the article's conclusion:

JavaScript runtime is single-threaded. We do not have access to thread in JavaScript. Even if you have multi-core CPU you still can't run tasks in parallel using JavaScript. But, the browser/NodeJS uses C/C++ (!) where they have access to thread. So, they can achieve parallelism.

Side Note:

There is one subtle difference between the two:

  1. Promise.all: Resolves only when all the promises passed to it resolve; otherwise it rejects with the error of the first promise that rejects.

  2. Promise.allSettled: Always resolves, with an array describing the outcome (fulfilled or rejected) of each promise.
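The difference can be seen with a short sketch. The inputs and the "boom" error below are invented for illustration; makeInputs is a hypothetical helper that gives each call its own fresh promises.

```typescript
async function compare(): Promise<{ allError: string; settledStatuses: string[] }> {
    // Factory so that each combinator receives fresh promises.
    const makeInputs = (): Promise<number>[] => [
        Promise.resolve(1),
        Promise.reject(new Error("boom")),
        Promise.resolve(3),
    ];

    let allError = "";
    try {
        await Promise.all(makeInputs());
    } catch (err) {
        // Promise.all rejects as soon as one input rejects.
        allError = (err as Error).message;
    }

    // Promise.allSettled never rejects; it reports every outcome.
    const settled = await Promise.allSettled(makeInputs());
    return { allError, settledStatuses: settled.map((r) => r.status) };
}

compare().then((r) => console.log(r));
```

With Promise.allSettled, each entry has to be checked for status === "rejected", because a failed query will not throw at the await.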

