Evergen, headquartered in Australia, builds world-class software platforms and products that enable monitoring, control, optimization, and orchestration of a broad range of distributed energy resources and utility-scale assets.
Evergen’s mission is to decarbonize the energy system by facilitating the transition to resilient and renewable energy systems. Its cloud-native approach ensures all stakeholders across the renewable energy chain have access to the information they need to make informed decisions about their energy usage and production.
As Evergen’s infrastructure scaled, it sought a time-series and real-time analytics platform that could scale with its needs. Here’s how Evergen discovered TigerData (creators of TimescaleDB), and how TigerData transformed its operations—based on an interview with Evergen’s Lead Software Engineer Jose Luis Ordiales Coscia.
Why Evergen Needed a Time-Series Database
“Renewable energy optimization,” says Jose, “is a big part of Evergen’s business, and time-series data is essential for that. It gets used for reporting and mobile apps that show energy usage over time.”
Evergen originally used MongoDB Atlas for time-series data because it was already the default database for its regular, non-time-series data. Yet Evergen’s time-series setup on MongoDB turned out to be cost-prohibitive and technically restrictive: “Our team did create their own schema on top of MongoDB by storing data in buckets, where each bucket is one day. So they tried to replicate what TimescaleDB does—behind the scenes—in MongoDB, which sort of worked, but it also didn’t work.”
The breaking point for MongoDB was the number of devices onboarded into the system and the minute-frequency data they generated. “We saw that as the data went through, MongoDB became heavier to use because we were patching more data and storing more data at the same time,” says Jose.
Technical Complexity and Scalability Constraints
The technical challenge began with raw data storage limits in MongoDB. Evergen has hundreds of integrations to ensure compatibility across device manufacturers, and at regular intervals it pulls data from each of those integrations and publishes the raw samples to a Kafka topic. A Kafka Streams application (Kafka’s stream-processing library) read those raw samples and pre-aggregated them in memory into five-minute and 30-minute data. Those aggregates were then stored in MongoDB.
“We did pre-aggregations in-memory,” says Jose, “because it was too expensive for us to store all the raw observations in MongoDB. MongoDB doesn’t really support aggregations on the fly, as TimescaleDB does with continuous aggregates. If we wanted to have both the raw data and the aggregations, we had to do all of that manually for each one.”
Due to the high data volume, the Kafka stream-processing service grew huge: Evergen ran some 30 instances of it, which accounted for a large share of the company’s Kubernetes resource usage. Doing pre-aggregations in memory had further shortcomings:
- They had to manually backfill data every day, which was painful. Data arriving late (more than 15 minutes late for the five-minute aggregation, or more than an hour late for the 30-minute aggregation) was lost, because the corresponding pre-aggregated windows were no longer held in memory.
- Not storing the raw data left a transparency gap for audit trails and debugging.
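For contrast, this is roughly what the in-database approach looks like in TimescaleDB: the raw samples stay in a hypertable, a continuous aggregate maintains the rollup, and a refresh policy folds in late-arriving rows that land inside its window. A minimal sketch with hypothetical table and column names, not Evergen’s actual schema:

```sql
-- Hypothetical raw telemetry hypertable (illustrative names, not Evergen's schema)
CREATE TABLE device_readings (
    time      TIMESTAMPTZ      NOT NULL,
    device_id TEXT             NOT NULL,
    power_w   DOUBLE PRECISION
);
SELECT create_hypertable('device_readings', 'time');

-- Five-minute rollup maintained by the database instead of in application memory
CREATE MATERIALIZED VIEW device_readings_5m
WITH (timescaledb.continuous) AS
SELECT time_bucket('5 minutes', time) AS bucket,
       device_id,
       avg(power_w) AS avg_power_w
FROM device_readings
GROUP BY bucket, device_id;

-- Refresh the rollup every five minutes; rows arriving up to a day late are still picked up
SELECT add_continuous_aggregate_policy('device_readings_5m',
    start_offset      => INTERVAL '1 day',
    end_offset        => INTERVAL '5 minutes',
    schedule_interval => INTERVAL '5 minutes');
```

Because the raw hypertable remains queryable, a 30-minute rollup (or any aggregation defined later) can be built from the same data without re-ingesting anything.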
MongoDB performance for time-series, according to Jose, was sub-par: “Everyone sort of knew that it wasn’t ideal, but it was one of those things that you think, at some point we’ll fix it, but no one actually did. When we hit scaling issues with MongoDB, this became more important, and we started looking at alternatives.” That’s when they started testing databases designed to handle time-series data at scale.
The Database Evaluation Process
In its evaluation process, Evergen looked briefly at InfluxDB, but since it was dropping support for the Australia region at the time, that was a no-go from the start. Because all of its infrastructure is in AWS and it already uses other AWS services, Evergen evaluated Amazon Timestream first. Yet Timestream turned out to be very limited once they tried using it, according to Jose, presenting several issues at the time:
- Lagging performance (even fetching data for a day or a week took on the order of 2-5 seconds), with no options for query or performance tuning.
- No way to run a Timestream database locally with Docker (whereas running Postgres and TimescaleDB locally was easy). Running local tests against Timestream meant connecting to AWS to create a real database and then removing it at the end.
- Unusable for renewable energy forecasting, since Timestream only allowed storing data up to 15 minutes into the future.
Evergen then tried using Timestream for historical time-series data and Redis for forecasted time-series data. They had jobs and logic to move data from Redis into Timestream over time, yet data about devices and sites still lived in MongoDB. “We had three databases with different types of data, and every time we needed to join those three sources, it was just painful. That was one of the key things we were looking at when we started this evaluation process—to be able to use one single database for all our data: time series and non-time series. That was a big selling point for us when it came to TimescaleDB because it’s just Postgres underneath. You can use it as a regular relational database. And then you have all these cool features for time-series data,” notes Jose.
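As a simple illustration of that point, once time-series and relational data live in the same Postgres database, the join Jose describes becomes a single query. The devices table below is hypothetical, and the rollup is the five-minute aggregate sketched earlier:

```sql
-- Hypothetical join of relational device metadata with time-series rollups
SELECT d.site_id,
       r.bucket,
       avg(r.avg_power_w) AS site_avg_power_w
FROM devices AS d
JOIN device_readings_5m AS r ON r.device_id = d.device_id
WHERE r.bucket >= now() - INTERVAL '7 days'
GROUP BY d.site_id, r.bucket
ORDER BY d.site_id, r.bucket;
```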
Evergen also tried the time-series support in MongoDB: “They have this new-ish type of collection in MongoDB that handles time-series data. But they were missing a bunch of features that TigerData has, like continuous aggregates, retention policies, and compression. The performance also was not as good as what we get with TigerData.”
Discovering and Testing TigerData
Jose first learned about TigerData through a friend who worked there as a backend engineer. When he was looking to switch roles from a previous company, he applied to work at TigerData, and to prepare for the application he dove deep into TimescaleDB’s features.
Then, when he joined Evergen, he became a big advocate for using TimescaleDB as his team began considering it—he already knew its features and that it was built on battle-tested, boring old technology, which is “great for a database—you don’t want any surprises there.”
While evaluating TimescaleDB, Evergen leveraged TigerData resources: “The official documentation was very comprehensive and easy to understand and follow. The TigerData blog also has some really nice discussions about trade-offs of different approaches. That was particularly helpful during the POC. The community Slack channel has also been great.”
Evergen’s proof of concept involved setting up dual writes and dual reads by running MongoDB and TimescaleDB in parallel. This let them test TimescaleDB without disrupting operations or impacting customers. Once they had a few months of data stored in TimescaleDB, they made the switch. In terms of scalability, operations, and developer experience, TimescaleDB checked all the boxes, delivering strong ingestion and query performance, flexibility, and ease of use.
“Because we have hundreds of thousands of devices, and potentially looking at millions of devices in the future, we need to make sure the ingestion rate is smooth. So far it’s been amazing with TigerData,” says Jose. Evergen also appreciated TimescaleDB’s query performance because that data powers real-time customer-facing dashboards: “You don’t want your users to have to wait 5-10 seconds just to see those graphs.”
Developer tooling availability was a deciding factor as well: “Because Postgres is such an established player in the market, there are thousands of libraries and tools to work with.” So was familiarity with how to query the data: “Everyone knows SQL at some level, even non-technical people. We can give someone in the data science team or customer support team access to TimescaleDB, and they’ll figure out how to query the data, which wasn’t the case with MongoDB.”
Security was also a consideration, and TigerData’s security features met Evergen’s requirements.
How Evergen Uses TigerData
Replacing MongoDB with TimescaleDB for time-series data and setting up a “Telemetry Service” achieved data centralization that wasn’t previously possible. “We try to encapsulate and isolate access to the one service that is handling time-series data. If you are doing transformations with that data—for example, converting power to energy values—you want to centralize that in a single place so that you don’t have every client doing it slightly differently.”
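As a sketch of the kind of transformation the Telemetry Service centralizes, converting average power into energy per bucket is a single query when the logic lives next to the data. This again uses the hypothetical five-minute rollup from earlier, not Evergen’s actual implementation:

```sql
-- Hypothetical power-to-energy conversion: the average power (W) over a
-- five-minute bucket corresponds to avg_power_w * (5/60) hours / 1000 = kWh
SELECT bucket,
       device_id,
       avg_power_w * (5.0 / 60.0) / 1000.0 AS energy_kwh
FROM device_readings_5m
WHERE bucket >= now() - INTERVAL '1 day'
ORDER BY device_id, bucket;
```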
A goal of this migration was to consolidate all reads and writes in a single place: the “Telemetry Service”. It handles time-series and relational data, and has near-exclusive access to the Tiger Cloud (managed TimescaleDB) instance at Evergen. The one exception is the access given to the data science team for exploratory querying. This is where a read replica of the main database, dedicated to that team, helps: it provides access without impacting the performance of the primary.
Evergen also heavily uses TigerData’s integrated PopSQL IDE. Says Jose: “It’s a great tool to explore databases and share queries. We didn’t have that with MongoDB. We had to log into the MongoDB Atlas page every time and manually write down our queries.”
How Tiger Cloud Benefits End Customers
With MongoDB, Evergen could only store three months of data due to cost constraints. “Now with TigerData,” shares Jose, “it’s up to us to define those retention policies. Right now we have that set at two years, and we have compression and tiered storage. So it’s cheaper to store more data. That’s definitely a selling point for our product team, enabling a home energy report for the past year versus one for only the last three months.”
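The policies Jose mentions map to a handful of SQL calls in TimescaleDB and Tiger Cloud. The sketch below reuses the hypothetical hypertable from earlier; only the two-year retention comes from the quote, while the compression and tiering intervals are illustrative:

```sql
-- Keep two years of raw telemetry, dropping older chunks automatically
SELECT add_retention_policy('device_readings', INTERVAL '2 years');

-- Compress chunks once they are a week old (illustrative interval)
ALTER TABLE device_readings SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id'
);
SELECT add_compression_policy('device_readings', INTERVAL '7 days');

-- Tiger Cloud tiered storage: move chunks older than 90 days to low-cost object storage
SELECT add_tiering_policy('device_readings', INTERVAL '90 days');
```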
Query performance was another big win: “Before, you had to wait a few seconds when accessing your web app or mobile app to see those graphs. Now it’s just under 500 milliseconds, which is really, really good. The data itself is critical for the organization because everything we do has to deal with time-series data. Customers definitely need to have that information available.”
The time-series data now handled by TigerData is also critical for Evergen’s energy optimization service. That service includes behind-the-meter optimization (machine learning optimization that delivers advanced individualized electricity cost-minimization) and front-of-meter optimization (which enables dispatchable assets to be monetized).
What Adopting Tiger Cloud Meant for Evergen
For Evergen, replacing MongoDB with Tiger Cloud cut Kubernetes cluster resource usage by more than 50%. Cost savings, efficient compression, tiered storage, and constant access to historical data meant newfound technical and business agility.
“Having that freedom to decide how much data to keep, what’s useful, and what’s not,” says Jose, “has been a huge win because before there was a hard limit of how much data we could store. Now we have that flexibility to say this old data—we want to keep it in S3. It will be slower to access, but that’s fine. And we want to keep this much data in high-performance storage. Definitely that has been a huge win for the organization because Tiger Cloud is essentially Postgres enhanced.”
Tiger Cloud also simplified Evergen’s stack: “Simplifying our architecture—being able to replace all these different specific databases—that’s a huge thing for overall complexity of the architecture, which makes extending this in the future easier as well.”
Access to all the raw data means they can reference it for debugging and trace issues back to particular devices, which is much easier to do with raw data than with pre-aggregated data. With unconstrained raw data storage, Evergen also gained real-time analytics insights.
“The ability to store all the raw data,” explains Jose, “means we can create new aggregations that we haven’t even considered—on the fly. We couldn’t do that before. With TigerData, if tomorrow we decide we need this new data derived from the raw data, we can just create a new aggregate, run the whole backfill process, and that’s it—we have it. That’s huge for flexibility, where we don’t need to predict what we’ll need a month or a year from now.”
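A minimal sketch of what Jose describes, defining a new rollup after the fact and backfilling it from the raw hypertable (hypothetical names and intervals, building on the earlier example):

```sql
-- Hypothetical daily rollup added long after the raw data started accumulating
CREATE MATERIALIZED VIEW device_readings_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket,
       device_id,
       max(power_w) AS peak_power_w
FROM device_readings
GROUP BY bucket, device_id
WITH NO DATA;

-- One-off backfill over the full history of raw data, then a policy keeps it current
CALL refresh_continuous_aggregate('device_readings_daily', NULL, NULL);
SELECT add_continuous_aggregate_policy('device_readings_daily',
    start_offset      => INTERVAL '3 days',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 day');
```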
Built on familiar Postgres, Tiger Cloud had a positive impact on team onboarding. “Because everyone knows SQL, at some level, and everyone has worked with SQL at some point in their careers, it was straightforward for [newly hired engineers] to just jump into the storage code and figure out what was happening.”
Future Plans Using TigerData
Choosing TigerData helped Evergen future-proof their architecture as they scaled. “It is true—you can use Postgres for everything these days. There’s an extension for absolutely everything. That was another big factor for the decision—thinking what if we need to store this type of data in the future? Oh yeah, there’s this extension already available. Having that possibility—I think it was a big thing for the company,” says Jose.
With TigerData, Evergen built a scalable data foundation to support its plans to reach new markets and grow the number of devices it manages by an order of magnitude. Evergen is also planning to use a feature TigerData is working on that would make the data Evergen keeps in Tiger Cloud available to other teams that might be using different tooling.
Jose’s advice to engineers considering the switch: “When creating abstractions in your code base, keep everything related to one particular technology isolated from the rest of the code as that made it a lot easier for us to run our dual reads and dual writes experiment—because there was a single place in the code base that we had to go and change for that to happen. I know this is one of those things that you think you’ll never need to do—like switch databases—but sometimes, it happens.”