Databases are not cheap, especially when your application is just taking off and your budget is tight. At that stage, reclaiming space from the database is a good way to save some money for a while.
To do that, you need to define the criteria on which the pruning is based.
Pruning Criteria
The focal point of most apps is the user, and most of the other data in the database is usually related to it. Therefore, you might want to plan the clean-up around the user. Below are the mongoose schemas of three collections, including user, from a hypothetical application.
Schemas
User
// user.js
const mongoose = require('mongoose');
const UserSchema = new mongoose.Schema({
  email: { type: String, index: true },
  password: { type: String, index: true },
  deactivatedOn: { type: Date, index: true },
  active: { type: Boolean, index: true }
  // other fields
});
module.exports = mongoose.model("User", UserSchema);
The deactivatedOn and active fields alone define our criteria for deleting the user and all of its data.
Assuming your application follows a subscription model and a user is deactivated (active: false) when the subscription is not upgraded, that is the point at which you should also set deactivatedOn to the current time with Date.now(). We will give the user a grace period (say 20 days) in case they want to return.
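The deactivation step described above might look like the following minimal sketch. The helper names buildDeactivationUpdate and gracePeriodOver, and the commented call site, are assumptions made for illustration; they are not part of the original application.

```javascript
// Hypothetical helper: builds the update that deactivates a user and
// records when it happened, so the grace period can be measured later.
function buildDeactivationUpdate(now = new Date()) {
  return { $set: { active: false, deactivatedOn: now } };
}

// Hypothetical helper: true once more than `graceDays` days have passed
// since the user was deactivated.
function gracePeriodOver(deactivatedOn, graceDays = 20, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return now.getTime() - deactivatedOn.getTime() > graceDays * msPerDay;
}

// Possible usage with the User model defined above (assumed call site):
// await User.updateOne({ _id: userId }, buildDeactivationUpdate());
```

Keeping the timestamp next to the active flag means the cleaner never has to guess when the grace period started.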
Message
// message.js
const mongoose = require('mongoose');
const MessageSchema = new mongoose.Schema({
  user: { type: mongoose.Schema.Types.ObjectId, ref: 'User', index: true }
  // other fields
});
module.exports = mongoose.model("Message", MessageSchema);
Project
// project.js
const mongoose = require('mongoose');
const ProjectSchema = new mongoose.Schema({
  user: { type: mongoose.Schema.Types.ObjectId, ref: 'User', index: true }
  // other fields
});
module.exports = mongoose.model("Project", ProjectSchema);
The Cleaner Script
If you haven’t already, install mongoose and mongo-date-query (a handy library to form a date interval query for you).
npm i mongoose mongo-date-query --save
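If you would rather not add a dependency for the date part, the same kind of condition can be written by hand. The sketch below assumes that "deactivated more than N days ago" translates to a $lt comparison against a cutoff date. The helper name olderThanDays is hypothetical and is not part of mongo-date-query.

```javascript
// Hypothetical helper: returns a MongoDB condition matching dates that are
// more than `days` days before `now`.
function olderThanDays(days, now = new Date()) {
  const cutoff = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  return { $lt: cutoff };
}

// Filter for inactive users whose 20-day grace period has expired.
const filter = { active: false, deactivatedOn: olderThanDays(20) };
```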
Below is the cleaner script (call it userCleaner.js) that runs every 5 seconds and looks for one inactive user who was deactivated more than 20 days ago (the grace period). If one is found, all of its messages and projects are deleted, and finally the user itself is removed.
const mongoose = require('mongoose');
const mdq = require('mongo-date-query');
const User = require('./user');
const Message = require('./message');
const Project = require('./project');
const checkInterval = 5000;
mongoose.connect('mongodb://127.0.0.1/webapp');
mongoose.connection.on('connected', checkUsers);
function checkUsers() {
  console.log("*** Cleaner Looking For Users ***");
  let userId;
  // Look for one inactive user whose 20-day grace period has expired
  User.findOne({ active: false, deactivatedOn: mdq.beforeLastDays(20) })
    .exec()
    .then((user) => {
      if (!user) {
        throw "No User Found"; // logged in the catch block below when no user is found
      }
      userId = user._id;
      console.log(`User with pending deletion found (email: ${user.email})`);
      // deleteMany replaces Model.remove, which newer mongoose versions no longer provide
      return Message.deleteMany({ user: userId });
    })
    .then(() => {
      console.log("All user messages removed");
      return Project.deleteMany({ user: userId });
    })
    .then(() => {
      console.log("All user projects removed");
      // findByIdAndDelete replaces findByIdAndRemove in newer mongoose versions
      return User.findByIdAndDelete(userId);
    })
    .then(() => {
      console.log("User removed from database");
      setTimeout(checkUsers, checkInterval);
    })
    .catch((e) => {
      console.log(e);
      // Schedule the next check whether or not a user was found
      setTimeout(checkUsers, checkInterval);
    });
}
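The same flow can also be written with async/await, which some readers find easier to follow than a promise chain. This is only a sketch: the names cleanOneUser and runCleaner are hypothetical, and the models are passed in as an argument (an assumption made here so the function can be exercised without a live database).

```javascript
// Hypothetical async/await variant of the cleaner's core step.
// `models` is expected to hold the User, Message and Project models;
// `filter` is the query selecting a user whose grace period has expired.
async function cleanOneUser(models, filter) {
  const { User, Message, Project } = models;
  const user = await User.findOne(filter);
  if (!user) return null; // nothing to clean this round

  // Delete dependent data first: if the process stops midway, the user
  // still exists and will simply be picked up again on a later round.
  await Message.deleteMany({ user: user._id });
  await Project.deleteMany({ user: user._id });
  await User.findByIdAndDelete(user._id);
  return user._id;
}

// Hypothetical polling loop: waits `intervalMs` between rounds.
// `makeFilter` is called each round so the cutoff date stays current.
async function runCleaner(models, makeFilter, intervalMs = 5000) {
  for (;;) {
    try {
      const removed = await cleanOneUser(models, makeFilter());
      console.log(removed ? `User ${removed} removed` : 'No user found');
    } catch (err) {
      console.error(err);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Returning null when no user matches avoids using an exception for the ordinary "nothing to do" case.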
That’s about it!
Additionally, you might want to consider compacting (repairing) the fragmented database, to compress it and recover disk space.
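As a rough sketch of that compaction step, MongoDB's compact command can be issued once per collection. The helper name compactCollections is hypothetical, the collection names in the commented call assume mongoose's default pluralization of the models above, and db is assumed to be a connected driver handle such as mongoose.connection.db. Compaction behaves differently across server versions and storage engines, so check the MongoDB documentation for your deployment before running it.

```javascript
// Hypothetical helper: runs MongoDB's `compact` command on each collection.
// `db` must expose a `command(doc)` method that returns a promise, as the
// MongoDB Node.js driver's Db object does.
async function compactCollections(db, names) {
  const results = {};
  for (const name of names) {
    // Run sequentially so only one collection is compacted at a time.
    results[name] = await db.command({ compact: name });
  }
  return results;
}

// Assumed call site, after the cleaner has removed a batch of users:
// await compactCollections(mongoose.connection.db, ['messages', 'projects', 'users']);
```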