I often hear the argument that one should use only standard features or, in the cloud, only the features of the (open-source) tool itself, so that one can move elsewhere at any time.
That means no AWS-specific features that would make developers' lives easier but don't exist in Azure, and standard SQL instead of effective new data types or procedures that are vendor-specific.
My favorite quote of the article:
> "Data has gravity". Moving data can be both time-consuming and costly.
If you need to scale out, the ArangoDB packages are more affordable than MongoDB Atlas, as you don't need to spin up a whole three-node replica set to add another shard to your cluster. The smaller instances are cheaper in Atlas; there you benefit from an established cloud service that can negotiate better conditions with the large cloud providers. However, we will pass lower cloud costs on to customers, so there is hope that the prices will move closer over time. That said, I don't see ArangoDB in direct competition for smaller, pure document use cases. Most users need the multi-model capabilities and use graphs in combination with document operations.
What I miss in the conclusion: Multi-model databases
More and more products support multiple data models today.
This reduces the number of technologies in your tech stack and lets you combine different access patterns without having to duplicate and sync data between systems.
Okay, Stephen O’Grady - here’s the obvious one you’ve asked for:
CSS is not a programming language. ;-)
Despite that, the list is quite complete and feels reasonable. Did you try to research how languages are used in certain use cases? Which languages compete in a certain domain?
Ha! We get asked about CSS every time. Our general answer is that we try very hard not to editorialize, and let GitHub’s Linguist make determinations. We do make decisions, but to date, CSS has continued to make the cut.
As for how languages are used, we spend a lot of time trying to understand that broadly, and where the rankings reveal anomalous patterns (e.g. Kotlin a year or two ago) we do more targeted research to understand those.
Imagine you go to your preferred online marketplace and search for a generic product.
You get 1000+ results.
So you filter by average star rating > 4.0.
Still 500+ results.
Products with just one 5-star rating are listed ahead of the one with 300 reviews and a 4.8 average. Annoying.
What I really want:
I would like to filter for products that have at least 5 (relatively long) reviews, an average rating of at least 4.0, and at least 2 of these review comments mentioning the use case for which I would like to use the product. Maybe I want only verified purchases to be counted, or only the reviews of friends and friends of friends...
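To make the wish list concrete, here is a minimal in-memory sketch of that filter in Python. It is not how a database would run it. The `Review` and `Product` classes, the `matches` function, and the threshold of 20 words for a "relatively long" review are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class Review:
    stars: int
    text: str
    verified: bool = False


@dataclass
class Product:
    name: str
    reviews: List[Review] = field(default_factory=list)


def matches(product, use_case, min_reviews=5, min_words=20,
            min_avg=4.0, min_mentions=2, verified_only=False):
    """Return True if the product passes the review-based filter."""
    # Optionally count verified purchases only.
    reviews = [r for r in product.reviews if r.verified or not verified_only]
    if not reviews:
        return False
    # "Relatively long" is assumed to mean at least min_words words.
    long_reviews = [r for r in reviews if len(r.text.split()) >= min_words]
    if len(long_reviews) < min_reviews:
        return False
    if mean(r.stars for r in reviews) < min_avg:
        return False
    # Count long reviews that mention the use case (case-insensitive substring).
    mentions = sum(use_case.lower() in r.text.lower() for r in long_reviews)
    return mentions >= min_mentions
```

With this filter, a product with a single 5-star review no longer passes, because it fails the minimum number of long reviews. The friends-of-friends variant is left out here, since it needs a graph of user relationships.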
Using a native multi-model approach, you can do both: simply retrieve all category X products ranked by product rating, limited to 50 per page, or perform advanced lookups, without having to synchronize data from a document or relational model with an additional graph or search engine.
By combining full-text search with scorers, graph traversals, and/or join operations, you could write an ad hoc AQL query that returns the most relevant products and reviews in a single query.
Multi-model provides choice. In data modeling and querying.
Yes, we had cluster stability issues 1.5 years ago. That has changed: cluster stability and performance have top priority, and we invest a lot to improve the developer and DevOps experience with every release. Now, e.g. with K8s deployments or the arangodb starter, it's much easier to run and maintain clusters. I hope you find the time to give it a second try.
Is anyone already implementing such a service?