In a relational schema many to many relationships are generally implemented using intersection tables. Lets say you have two entities Cases and Actions that have many to many relationship. Relationally, you'd create a Case_Action table that stores action_id and case_id. Now, what if you have to represent this relationship in a schema-less data-store like Solr? To start, you'd try to index so that you can get a document of type Case which stores the list of Case fields. This will be your base entity.
(NOTE: In Solr the term document is equivalent to a database row)
Well, there are several ways of doing this:
1. OPTION A: Create a multivalued field called action_id_list that lists all actions for a particular case. Populate this field for every document of type Case.
2. OPTION B: Create a multivalued field called case_id_list that lists all cases for a particular action. Populate this field for every document of type Action.
3. OPTION C: Don't create a multivalued field, instead, create multiple action documents for each case that the action is related to. This means your id will be a composite string of action_id and case_id
The option you choose depends on how you want to access the entities in a flattened structure. Option A and Option B are cleaner and easier to implement than C. But they have a limitation.
UPDATE: Here are the diagrams as promised. Again,
(NOTE: In Solr the term document is equivalent to a database row)
Well, there are several ways of doing this:
1. OPTION A: Create a multivalued field called action_id_list that lists all actions for a particular case. Populate this field for every document of type Case.
2. OPTION B: Create a multivalued field called case_id_list that lists all cases for a particular action. Populate this field for every document of type Action.
3. OPTION C: Don't create a multivalued field, instead, create multiple action documents for each case that the action is related to. This means your id will be a composite string of action_id and case_id
The option you choose depends on how you want to access the entities in a flattened structure. Option A and Option B are cleaner and easier to implement than C. But they have a limitation.
- Option B should be used only when an action document is expected to store only the case_id and no other case related field. Otherwise, an extra query needs to fetch the required fields from a case document.
- Similarly, Option A should be used if no other action field needs to be queried when you query a case document except for the field action_id.
- Option C is like a intersection table where it gives you the flexibility to pick whichever fields of the case and action entity. The only drawback is that it makes the id field a little more complicated and perhaps difficult to use in your query.
UPDATE: Here are the diagrams as promised. Again,
1. OPTION A: Create a multivalued field called action_id_list that lists all actions for a particular case. Populate this field for every document of type Case.
2. OPTION B: Create a multivalued field called case_id_list that lists all cases for a particular action. Populate this field for every document of type Action.
3. OPTION C: Don't create a multivalued field, instead, create multiple action documents for each case that the action is related to. This means your id will be a composite string of action_id and case_id
Hi, could you illustrate with a diagram? it is still hard to understand. thanks!
ReplyDeletecould you illustrate? please!
ReplyDeletecan you please provide a code example like database-config.xml schema. it would be very helpful.
ReplyDeletethanks
Translating a many-to-many relationship in database design requires an intermediary table to link related entities. Is Arcade Snake This ensures accurate data representation.
ReplyDelete