Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
552 views
in Technique[技术] by (71.8m points)

python - Django: how to do get_or_create() in a threadsafe way?

In my Django app very often I need to do something similar to get_or_create(). E.g.,

User submits a tag. Need to see if that tag already is in the database. If not, create a new record for it. If it is, just update the existing record.

But looking into the doc for get_or_create() it looks like it's not threadsafe. Thread A checks and finds Record X does not exist. Then Thread B checks and finds that Record X does not exist. Now both Thread A and Thread B will create a new Record X.

This must be a very common situation. How do I handle it in a threadsafe way?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Since 2013 or so, get_or_create is atomic, so it handles concurrency nicely:

This method is atomic assuming correct usage, correct database configuration, and correct behavior of the underlying database. However, if uniqueness is not enforced at the database level for the kwargs used in a get_or_create call (see unique or unique_together), this method is prone to a race-condition which can result in multiple rows with the same parameters being inserted simultaneously.

If you are using MySQL, be sure to use the READ COMMITTED isolation level rather than REPEATABLE READ (the default), otherwise you may see cases where get_or_create will raise an IntegrityError but the object won’t appear in a subsequent get() call.

From: https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create

Here's an example of how you could do it:

Define a model with either unique=True:

class MyModel(models.Model):
    slug = models.SlugField(max_length=255, unique=True)
    name = models.CharField(max_length=255)

MyModel.objects.get_or_create(slug=<user_slug_here>, defaults={"name": <user_name_here>})

... or by using unique_togheter:

class MyModel(models.Model):
    prefix = models.CharField(max_length=3)
    slug = models.SlugField(max_length=255)
    name = models.CharField(max_length=255)

    class Meta:
        unique_together = ("prefix", "slug")

MyModel.objects.get_or_create(prefix=<user_prefix_here>, slug=<user_slug_here>, defaults={"name": <user_name_here>})

Note how the non-unique fields are in the defaults dict, NOT among the unique fields in get_or_create. This will ensure your creates are atomic.

Here's how it's implemented in Django: https://github.com/django/django/blob/fd60e6c8878986a102f0125d9cdf61c717605cf1/django/db/models/query.py#L466 - Try creating an object, catch an eventual IntegrityError, and return the copy in that case. In other words: handle atomicity in the database.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...