In my detailed post on NoSQL data modeling, I listed ways of modeling NoSQL data (using MongoDB collections). For a one-to-one relation, it is usually not apparent why one would need a separate collection instead of embedding everything in a single document.
In this post I’ll address this and share an example.
University/School Example
Keeping our example real-world but simple, let’s model a university or school. A few obvious entities emerge:
- Student
- Professor
- Receptionist
- Security Guard
- Janitor
And here are the minimum attributes required for each (not including obvious fields like id, createDate, updateDate, etc.):
Student
firstName
lastName
dob
email
password
education
batch
CGPA
enrolDate
Professor
firstName
lastName
dob
email
password
degrees
experience
bio
joinDate
Receptionist
firstName
lastName
dob
email
password
certificates
vocationalTraining
experience
joinDate
Janitor
firstName
lastName
dob
email
password
experience
joinDate
Guard
firstName
lastName
dob
email
password
weapon
experience
joinDate
How To Model?
Now that we’ve listed the possible entities, we need to actually model them. Let’s look at a few possible ways:
All Entities Have Their Own Models/Collections
In this approach, each of the entities listed above has its own model. It seems like a good choice at first but reveals a problem on further analysis. When application logic is written around this design, and especially when we have a single point of entry into the system (the same login page or API, with no parameter to identify the role) for all kinds of users, we have to search each of the five collections for the email/password combination to find out which user has logged in (and perhaps take them to their own dashboard).
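To make that cost concrete, here is a minimal sketch of such a login flow. Plain in-memory Python lists stand in for the five MongoDB collections; the sample documents, the `login` helper, and the plain-text password comparison are all illustrative assumptions (a real application would compare password hashes and query the database).

```python
# In-memory stand-ins for five separate MongoDB collections (illustrative data).
collections = {
    "student": [{"email": "amy@example.edu", "password": "s3cret", "batch": "2024"}],
    "professor": [{"email": "bob@example.edu", "password": "pr0f", "bio": "Teaches databases"}],
    "receptionist": [],
    "janitor": [],
    "guard": [{"email": "gus@example.edu", "password": "gu4rd", "weapon": "baton"}],
}


def login(email, password):
    """Return (role, document, collections_searched); role is None on failure."""
    searched = 0
    for role, documents in collections.items():
        searched += 1  # in a real database this would be one query per collection
        for doc in documents:
            if doc["email"] == email and doc["password"] == password:
                return role, doc, searched
    return None, None, searched


role, doc, searched = login("gus@example.edu", "gu4rd")
print(role, searched)
```

In the worst case every collection is searched before the user is found or rejected, and each new role adds one more lookup to the login path.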
All Entities In A Single Collection
Note that many of the fields listed above are common across all entities. That includes email and password, which are the login credentials. This commonality offers us an easy solution to the above problem: we can merge all the current entities into one collection, say Person or User, and add another field, userType (or userRole), that records the user type.
User
firstName
lastName
dob
email
password
education
batch
CGPA
certificates
degrees
experience
joinDate
enrolDate
vocationalTraining
weapon
bio
userType
This saves us the trouble of identifying the user type at login by merging common fields like firstName, lastName, email, password, etc. into one collection. At the same time, however, it makes application management hard, because most of the other fields are exclusive to specific user types: only a student has batch and enrolDate, only a professor has degrees, only a guard has weapon, and so on. That’s a lot of tracking and management.
It’s true that MongoDB does not store a field as null if it’s not provided (even with a Mongoose schema, which needs to have all of the merged fields defined upfront), but our collection still isn’t meaningful enough and is very hard to scale as more user roles are brought into the system.
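The trade-off can be sketched as follows, again with an in-memory list standing in for the merged collection. The `ROLE_FIELDS` table and the `invalid_fields` helper are hypothetical; they only illustrate the kind of per-role bookkeeping the application ends up maintaining by hand. Passwords are plain text purely for brevity.

```python
# One merged collection; each document carries only the fields its role needs.
users = [
    {"email": "amy@example.edu", "password": "s3cret", "userType": "student",
     "batch": "2024", "CGPA": 3.6},
    {"email": "gus@example.edu", "password": "gu4rd", "userType": "guard",
     "weapon": "baton"},
]

# Hypothetical bookkeeping: which role-specific fields belong to which userType.
ROLE_FIELDS = {
    "student": {"education", "batch", "CGPA", "enrolDate"},
    "professor": {"degrees", "experience", "bio", "joinDate"},
    "receptionist": {"certificates", "vocationalTraining", "experience", "joinDate"},
    "janitor": {"experience", "joinDate"},
    "guard": {"weapon", "experience", "joinDate"},
}
COMMON_FIELDS = {"firstName", "lastName", "dob", "email", "password", "userType"}


def login(email, password):
    """A single lookup is enough; userType tells us which dashboard to show."""
    for doc in users:
        if doc["email"] == email and doc["password"] == password:
            return doc
    return None


def invalid_fields(doc):
    """Return the fields that do not belong to this document's role."""
    allowed = COMMON_FIELDS | ROLE_FIELDS[doc["userType"]]
    return set(doc) - allowed


user = login("amy@example.edu", "s3cret")
bad_doc = {"email": "x@example.edu", "password": "p", "userType": "janitor", "weapon": "baton"}
print(user["userType"], invalid_fields(bad_doc))
```

Login is now a single lookup, but nothing in the collection itself stops a janitor document from carrying a weapon field; that rule lives in application code and grows with every new role.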
A General And Specialized Collection
Finally, the best way, in my opinion, is a one-to-one relationship in split form, i.e. using two collections and linking them with a reference. In our case, User is the generalized collection containing the common fields, while all other user types are specialized collections containing only the fields relevant to them.
In 1-1 relations, the choice of which collection keeps the reference is arbitrary: both the referring and the referenced document are unique, so performance-wise it doesn’t make much of a difference which side holds the reference. In the modified model below, we place a user field in all specialized collections to hold the User reference (user id).
So let’s see how our modeling stands at this stage:
User
Note that we still need to keep userType, because a user has no way of knowing its type otherwise.
firstName
lastName
dob
email
password
userType
Student
education
batch
CGPA
enrolDate
user (for example, user: '507f191e810c19729de860ea')
Professor
degrees
experience
bio
joinDate
user
Receptionist
certificates
vocationalTraining
experience
joinDate
user
Janitor
experience
joinDate
user
Guard
weapon
experience
joinDate
user
With such splitting, it becomes super easy to add more roles. The login logic too requires little or no change with each addition!
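As a rough illustration of the split model, the sketch below uses an in-memory dict for the generalized User collection and lists for the specialized ones. The ids (`u1`, `u2`, `u3`), the `load_profile` helper, and the `librarian` role with its `section` field are hypothetical and exist only to show a role being added; passwords are plain text for brevity.

```python
# Generalized collection: common fields only, plus userType.
users = {
    "u1": {"email": "amy@example.edu", "password": "s3cret", "userType": "student"},
    "u2": {"email": "gus@example.edu", "password": "gu4rd", "userType": "guard"},
}

# Specialized collections: role-specific fields plus a `user` reference (user id).
specialized = {
    "student": [{"user": "u1", "batch": "2024", "CGPA": 3.6}],
    "guard": [{"user": "u2", "weapon": "baton"}],
}


def login(email, password):
    """One lookup in the generalized collection, however many roles exist."""
    for user_id, doc in users.items():
        if doc["email"] == email and doc["password"] == password:
            return user_id, doc
    return None, None


def load_profile(user_id, user_type):
    """Follow the reference into the one specialized collection named by userType."""
    for doc in specialized.get(user_type, []):
        if doc["user"] == user_id:
            return doc
    return None


# Adding a (hypothetical) role only adds data; login() is untouched.
users["u3"] = {"email": "lia@example.edu", "password": "l1b", "userType": "librarian"}
specialized["librarian"] = [{"user": "u3", "section": "archives"}]

user_id, user = login("lia@example.edu", "l1b")
profile = load_profile(user_id, user["userType"])
print(user["userType"], profile["section"])
```

Here userType does double duty: it routes the user to the right dashboard and tells the application which single specialized collection to read, so at most two lookups are needed per login.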
Conclusion
Most of the time, in the NoSQL data design and modeling phase, we don’t seem to find a relevant case for separating collections in one-to-one relations. Usually it’s not required either, as trivial cases are well served by embedding the document. But in this post we went through an example where we are better off splitting the collections, into their specialized and generalized forms, and using a reference to link them, for better application management and easy scalability.