Vorstella Blog

Thoughts, tips, and reflections on making infrastructure easy

Deep Learning for High-Dimensional Time Series


Dimensionality Reduction, Embedding, Prediction. Time series data is prevalent in important applications such as robotics, finance, healthcare, and cloud monitoring. In these applications, we typically encounter time series with very high dimensions…

Infrastructure and Software Ownership


Software ownership is a huge part of successful development. When done right, DevOps forces developers to be ultimately responsible for the uptime and performance of their applications. This makes sense. Who better to own the results of code than the…

How a Cassandra performance issue looks to a classification model


Problems in distributed systems like Cassandra usually present themselves as higher latencies, request failures, and downed nodes. It’s up to us to dig into our alerting and dashboards, parse it out, and figure out what’s wrong. In previous posts we’ve talked…

3 reasons distributed data systems are a perfect use case for AIOps


We’ve been writing about automating infrastructure operations with machine learning (ML). Out of all the possible infrastructure components, we’ve focused mostly on distributed data systems in our examples. In this post we identify the 3 key…

Control theory and infrastructure management


Engineering teams are being asked to manage a growing amount of infrastructure software like databases, search engines, and message queues. To operators, these systems often feel like black boxes that are hard to observe and manage. In this post we…

ML for Data Platform Operations


We’ve spent our careers helping enterprises adopt, run, and scale open-source tech like Cassandra, Kafka, Spark, and Elasticsearch to make the most of their data. We’re taking our years of domain knowledge and combining it with machine learning to…

Is AWS killing the open-core business model?


This post is in response to the ongoing discussion on the future of open-source, recently re-kindled by Amazon’s Open Distro for Elasticsearch announcement. This analysis is admittedly biased towards specialized data systems. Things like Kubernetes…

Infrastructure operations done right


We’ve seen dozens of companies struggle to manage open-source data platform technologies like Cassandra, Kafka, and Elasticsearch in-house, at scale. This post discusses the challenges and opportunities we’ve seen to deal with this problem. More tech…

Autotuning Cassandra to reduce latencies


At Vorstella we’re building an AI expert that helps teams run distributed systems at scale. A while back our team was passing around some research papers about autonomous databases and optimizing system configurations, and they grabbed our attention…