Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
479 views
in Technique[技术] by (71.8m points)

data structures - Adjustable, versioned graph database

I'm currently working on a project where I use natural language processing to extract emotions from text to correlate them with contextual information.

Definition of contextual information: Every information that is relevant to describe an entity's situation in time an space.

Description of the data structure I'm looking for:

There is a arbitrary number of entities (an entity can either be a person or a group for example (twitter hash tags)) of which I want to track contextual information and their conversations with other entities. Conversations between entities are processed in order to classify their emotional features. Basic emotional features consist of a vector that specifies their occurrence percentually: {fear: 0.1, happiness: 0.4, joy: 0.1, surprise: 0.9, anger: 0} Entities can also submit any contextual information they'd like to share, for example: location, room-temperature, blood pressure, ... and so on (will refer to this as contextual variables). Because neither the number of conversations of an entity, nor the number of contextual variables they want to share is clear at any point in time, the data structure needs to be able to adjust accordingly.

Important: Every change in the data must also represent an own state as I'm looking forward to correlate certain changes in state with each other.

Example: Bob and Alice have a conversation that shows high magnitude of fear. A couple of hours later they have another conversation that shows no more fear, but happiness. Now, one could argue that high magnitude fear, followed by happiness actually could be interpreted as the emotion relief.

However, in order to be able to extract this very information I need to be able to correlate different states with each other. Same goes for using contextual information to correlate them with the tracked emotions in conversations. This is why every state change must be recorded and available.

To make this more clear to you, I've created a graphic and attached it to the question.

enter image description here Now, the actual question I have is: Which database/data structure can I use to solve this problem? I've looked into event-sourcing databases but wasn't quite convinced if I can easily recreate a graph structure with them. I also looked at graph databases but didn't find what I was looking for.

Therefore it would be nice if someone here could at least point me in the right direction or help me adjust my structure accordingly to solve the problem. If however there are data structures supporting, what I call it graph databases with snapshots then ease of usage is probably the most important feature to filter for.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

There's a database called Datomic by Rich Hickey (of Clojure fame) that stores facts over time. Every entry in the database is a fact with a timestamp, append-only as in Event Sourcing.

These facts can be queried with a relational/logical language ala Datalog (remiscent of Prolog). Please see This post by kisai for a quick overview. It has been used for querying graphs with some success in the past: Using Datomic as a Graph Database.

While I have no experience with Datomic, it does seem to be quite suitable for your particular problem.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...