MongoDB from a SQL point of view

By Thomas James 15/02/2020

I have spent all my professional years of development using MSSQL database meaning I have spent my whole life in a normalized world, this is a world most developers live in as most businesses use some type of relational database, this norm is changing with NoSQL becoming more and more used, and many more jobs coming up in this area meaning you as a developer should look into it even if it’s only to understand when it could/should be used.

Kasera compares furniture from many different companies getting its data from several data feeds, each having their own data structure. In short, we take data feeds, convert it into our data structure, and de-dupe it. This is a large data set that does not change to much day to day with it massively weighted to reading the data over writing.

Kasera started out life using a MSSQL database for a couple of reasons –

  1. Being a very small team and with me doing 95% of the coding I kept to what I knew and that’s SQL
  2. We were hosting in Azure and SQL just works.

However, SQL in Azure is not cheap, in fact to get any power within a database you need to start paying some large sums of money and that’s an issue when doing a start-up, cash is very limited. Search queries that on my local would take 0.3s where taking 20+ seconds in azure as the DTUs we were paying for where just too low. It was at this point we had to look elsewhere, my search for a solution took me to the NoSQL world where MongoDB shined out as a solid and well-maintained database, with very high read speeds, and the best thing its free. So, I took Kasera off Azure and we moved over to DigitalOcean Linux machine this move could and probably should be its own blog so let’s leave it as - I did it, it was very interesting, I felt like a god once it was all working *pats self on back*.

It’s now been just over 5 months since that move, and I thought it would be useful to give my first impressions of MongoDB and the NoSQL world

Speed

This I must say has been amazing, yes you have to design your documents (tables) very differently to a relational database and you might even call it cheating when you do your first NoSQL design, it’s not it just how NoSQL works, you don’t have joins you can’t normalize data like you want to, what you need when pulling out data can (should) all be in a single document (table), you will and more than likely should have duplicated data. I truly struggled at first with this it was just so wrong coming from such a pure world of relational databases. But the read speed you get it amazing, and it should be with no joins you can make a very complex query and it will never slow down, adding more complexity to your query over time is no issue.

Cost

If you have unlimited power in both a SQL and a MongoDB server you can make both query more or less the same speed, but when it comes to cost to speed MongoDB is by far superior, we currently spend $80 a month on DigitalOcean and with this we have more power than we currently need, compare this to azure where we spent anywhere up to $100 per month for one API, one Website, and a 5 DTU database. So, if you have limited funds Linux + MongoDB = cheap power.

Changing Design

Past the two obvious things of speed and cost a big plus for NoSQL is the easy at which you can change a design, as we all known big fundamental changes to SQL tables is more or less a no no you would likely look at making a new table over fundamental changes to a current table. Not in NoSQL world. I had to change our feeder table to handle multiple data feeds all with completely different fields, in SQL this would be a new table for each data feed then linking it to the feeder table in NoSQL I simple made a sub document and moved the current data into its own sub document and added the other sub documents for the other data feeds, no joke it was dead easy and keeps your database very clean. NoSQL takes a big plus if you have a design that’s likely to change over time.

Ad-hoc Queries

For me the only real down side to NoSQL is Ad-hoc queries on it, it’s just not as easy to do as SQL it’s much harder to come up with interesting facts on the data in a fast and easy way, the tooling is not their or I have not found it yet. I use C# Linq to do my Ad-hoc queries on the database which works but it’s a bit of a pain, so if you are always asked questions about the data SQL take a point on that. This should not come as to much of a surprise its very common to have a copy of your data in some type of data warehouse that you can run python on to answer them important questions, but as a start-up we just deal with the need to write it out in C# Linq.

Conclusion

Overall do I think relational databases are dead? No, both relational and non-relational databases have a place where they shine, for most businesses this I feel is still weighted to relational but when you need that read speed for large data sets, that flexibility to change your data, then non-relational databases are there to answer your call.