Better Stack's Story: The Observability Platform Built in a Week

The Alert That Sparked a Revolution

Juraj Masar, co-founder of Better Stack, faced a developer's quintessential nightmare: a silent website outage. He simply needed a phone call when his site went down, a direct and undeniable alert that something had broken. This seemingly straightforward request, however, exposed a gaping chasm in the existing developer toolkit.

The industry standard offered a clunky workaround. Developers often resorted to chaining together disparate services like Pingdom for uptime checks and PagerDuty for incident alerting. This integration, Masar recalls, involved "terrible APIs" – a common refrain among engineers grappling with complex, brittle connections between essential monitoring tools. The setup was cumbersome, prone to failure, and far from the seamless experience developers craved.

Frustration boiled over into a pivotal question: "How hard can this be?" That question led to the founding of Better Stack in 2021. Masar and co-founder Veronika Kolejak made a bold bet: they could build a superior solution in just one week. Their initial goal was modest: a simple service where users could enter a credit card, a phone number, and a URL, and receive an immediate phone call whenever the website went down.

This wasn't just Masar's problem; it represented a universal pain point for thousands of developers and SRE teams. The struggle to achieve reliable, instant notifications without wrestling with intricate configurations highlighted a critical gap in observability. Better Stack emerged to fill this void, evolving rapidly from a simple phone alert system to a comprehensive platform now serving over 200,000 developers and 4,000+ customers, including giants like Time Magazine and Salesforce. The company's initial focus on incident management laid the groundwork for its $28.6 million in funding and its ambition to redefine how teams monitor their digital infrastructure.

From a Bet to a Business in 7 Days

Frustrated with the cumbersome integrations of tools like Pingdom and PagerDuty, Juraj Masar and his friends posed a challenge: how hard could it be to build a simple uptime monitoring service? The answer became a daring bet – launch a fully functional product in just seven days. This audacious goal set the stage for Better Stack’s origin.

Their vision for a Minimum Viable Product (MVP) was strikingly clear and concise. Users would simply:

- Enter a credit card number
- Provide a phone number
- Input a website URL

The service would then automatically place a phone call the moment their site went offline. This stripped-down approach bypassed the complexities of existing solutions, focusing solely on the immediate need for reliable downtime alerts.
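The MVP flow described above can be sketched in a few lines. This is a minimal illustration, not Better Stack's implementation: the `Monitor` class, the `http_probe` helper, and the `place_call` callback are hypothetical names, and the default callback only prints where a real service would hand off to a telephony provider.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.request import urlopen


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers without an HTTP or network error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # URLError, HTTPError, and socket timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


def print_call(phone: str, message: str) -> None:
    """Stand-in for a real phone call (assumption: no telephony API here)."""
    print(f"CALL {phone}: {message}")


@dataclass
class Monitor:
    url: str
    phone: str
    probe: Callable[[str], bool] = http_probe
    place_call: Callable[[str, str], None] = print_call
    is_down: bool = False

    def tick(self) -> None:
        """Run one check; call only on the transition from up to down."""
        up = self.probe(self.url)
        if not up and not self.is_down:
            self.place_call(self.phone, f"{self.url} is down")
        self.is_down = not up
```

Calling `tick()` on a schedule (for example, every 30 seconds) gives the basic behavior. Alerting only on the up-to-down transition is a design choice in this sketch, so that a long outage produces one call rather than one per check.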

This intense, week-long sprint epitomized a rapid prototyping mindset. Founders Juraj Masar and Veronika Kolejak weren't aiming for perfection; they pursued functionality. The team executed with singular focus, bringing the core concept to life through sheer speed and a commitment to shipping. They proved that a well-defined problem could yield a tangible solution in an incredibly short timeframe.

The initial development offered profound lessons in product strategy: the power of extreme focus, the necessity of speed, and the effectiveness of solving a single, well-understood problem. At this stage, the team deliberately avoided the future complexities of eBPF-based telemetry, log management, or AI SRE, prioritizing the foundational user pain point above all else.

This swift launch validated their approach, demonstrating that a straightforward, dependable phone alert system held significant market value. It underscored that simplicity and direct utility often trump feature bloat, especially when addressing a critical developer frustration. The success of this initial bet provided the crucial momentum.

The seven-day challenge wasn't just a technical exercise; it was a business proving ground. This foundational sprint established Better Stack's ethos: identifying a clear need and delivering an elegant, effective solution quickly. This successful, focused start laid the groundwork for the comprehensive observability platform Better Stack would ultimately become.

Escaping the One-Trick Pony Trap

Initial success with Better Stack's phone call alert service quickly posed a critical question for founders Juraj Masar and Veronika Kolejak: could a single-feature product sustain long-term growth? They recognized the market's unmet need for integrated solutions beyond simple uptime monitoring and set out to avoid the "one-trick pony" trap.

Leadership embraced a philosophy of 'following the market,' actively listening to customer requests to guide their development roadmap. This organic, customer-driven approach ensured each new feature addressed genuine developer pain points, transforming their initial seven-day bet into a strategic, comprehensive platform.

Demand drove the rapid integration of advanced capabilities, moving far beyond basic uptime checks. Better Stack soon added essential tools, consolidating previously disparate systems into a unified experience:

- Log management, leveraging ClickHouse for efficient storage
- Customizable status pages for transparent communication
- eBPF-based telemetry tracing for deep system instrumentation
- AI-powered Site Reliability Engineering (SRE) for proactive incident resolution

This strategic expansion transitioned Better Stack from a niche alerting tool into a comprehensive observability suite. Now serving over 200,000 developers and 4,000+ customers, including enterprises like Salesforce and Time Magazine, the platform unifies monitoring, logging, and incident management in a single interface. Discover more about their full range of capabilities at Better Stack: Observability platform for developers.

The Uphill Battle Against Datadog

Better Stack entered a fiercely contested observability market, daring to challenge established giants like Datadog and New Relic. The Prague-based startup, founded in 2021 by Juraj Masar and Veronika Kolejak, positioned itself as a modern, streamlined alternative to complex, often costly legacy platforms. It aimed to disrupt the status quo by offering a fundamentally different approach to monitoring and incident response.

At its core, Better Stack champions a unified platform philosophy. Instead of requiring developers to stitch together disparate tools for uptime monitoring, log management, and incident response, it seamlessly integrates these functions. The platform combines uptime monitoring, log management, incident management, status pages, and AI-powered Site Reliability Engineering (SRE) tools into a single, intuitive interface, aiming to simplify the entire observability stack and reduce operational overhead.

The company's value proposition extends to cost: it is often touted as considerably more affordable than its competitors. This economic advantage proves a major draw for budget-conscious startups and mid-sized enterprises. Better Stack attributes the savings, in part, to the high-performance ClickHouse database for data storage and querying, alongside OpenTelemetry and eBPF for code-less data collection, which together keep resource consumption low.

However, the pricing model carries a caveat for some users. While the initial tiers offer substantial savings compared to rivals, particularly for basic monitoring, some smaller teams report that costs rise as they scale Better Stack's services toward the extensive, high-volume feature sets of incumbents. At higher usage levels the bill can approach competitor pricing, which suggests that the platform's competitive edge is most pronounced at specific operational scales and feature requirements.

Despite the uphill battle against multi-billion-dollar incumbents, Better Stack has secured impressive customer wins and rapid growth. The platform now serves over 200,000 developers and more than 4,000 paying customers globally. Its roster includes high-profile enterprises such as:

- Time Magazine
- Salesforce
- ESET
- Accenture
- Decathlon
- Rakuten

This market penetration underscores Better Stack's competitive strength and validates its disruptive strategy. The company has raised a total of $28.6 million in funding across two Series A rounds, with significant investments from Creandum and KAYA. Backing from notable angel investors like Aaron Levie (CEO of Box) and Kulpreet Singh (ex-UiPath) further solidifies its position as a credible and rapidly growing player in the observability space.

Under the Hood: ClickHouse and OpenTelemetry

Better Stack built its core data infrastructure on ClickHouse, a columnar database known for its extreme speed and efficiency in analytical queries. This strategic decision underpins the platform's ability to ingest and query massive volumes of observability data—logs, metrics, and traces—at scale. Unlike traditional relational databases, ClickHouse processes data by columns, enabling lightning-fast aggregation and filtering, which is crucial for real-time monitoring.
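To make the row-versus-column distinction concrete, here is a small conceptual sketch. It is plain Python with made-up sample data and does not reflect ClickHouse internals or its API; it only shows why an aggregate over one field touches less data when values are stored by column.

```python
from typing import Dict, List

# Hypothetical log records, stored row by row.
rows: List[Dict[str, object]] = [
    {"service": "api", "status": 200, "latency_ms": 12},
    {"service": "api", "status": 500, "latency_ms": 240},
    {"service": "web", "status": 200, "latency_ms": 35},
]


def to_columns(records: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Pivot row-oriented records into one list per field."""
    columns: Dict[str, List[object]] = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def mean_latency_rows(records: List[Dict[str, object]]) -> float:
    """Row layout: every record is visited in full to read one field."""
    return sum(r["latency_ms"] for r in records) / len(records)


def mean_latency_columns(columns: Dict[str, List[object]]) -> float:
    """Column layout: only the latency_ms list is read."""
    latencies = columns["latency_ms"]
    return sum(latencies) / len(latencies)


columns = to_columns(rows)
```

Both functions return the same average, but the column version never looks at `service` or `status`. At real scale that difference translates into less I/O per analytical query, and contiguous values of one type also tend to compress well.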

This architectural choice allows Better Stack to deliver performance benchmarks that often outpace competitors relying on older, less optimized database architectures. Its ability to handle petabytes of data with low latency ensures that developers receive immediate insights, accelerating troubleshooting and incident resolution. The efficiency of ClickHouse directly contributes to the platform’s competitive pricing model by minimizing the computational resources required for data storage and processing.

Adopting OpenTelemetry, an industry-standard collection of APIs, SDKs, and tools, further solidifies Better Stack's commitment to modern, open observability. This open-source framework provides a vendor-agnostic method for instrumenting applications and infrastructure, collecting unified telemetry data without proprietary lock-in. It allows users to instrument their systems once and send data to any compatible backend, offering unparalleled flexibility.

OpenTelemetry ensures a future-proof approach to data collection, giving customers complete control over their monitoring ecosystem. It abstracts away the complexities of data formats and protocols, enabling seamless integration with a wide array of tools and services. This commitment to open standards significantly reduces the friction associated with migrating monitoring solutions or integrating with existing tech stacks.
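The "instrument once, send anywhere" idea can be illustrated with a small pattern sketch. This code deliberately does not use the OpenTelemetry SDK: `Exporter`, `InMemoryExporter`, `Tracer`, and `handle_request` are hypothetical names chosen for illustration, and real OpenTelemetry spans carry far more structure.

```python
from typing import Dict, List, Protocol


class Exporter(Protocol):
    """Anything that can receive a finished span (hypothetical interface)."""

    def export(self, span: Dict[str, object]) -> None: ...


class InMemoryExporter:
    """Stand-in backend that simply keeps the spans it receives."""

    def __init__(self) -> None:
        self.spans: List[Dict[str, object]] = []

    def export(self, span: Dict[str, object]) -> None:
        self.spans.append(span)


class Tracer:
    """Fans each recorded span out to every configured exporter."""

    def __init__(self, exporters: List[Exporter]) -> None:
        self._exporters = list(exporters)

    def record(self, name: str, duration_ms: float) -> None:
        span = {"name": name, "duration_ms": duration_ms}
        for exporter in self._exporters:
            exporter.export(span)


def handle_request(tracer: Tracer) -> None:
    """Application code: instrumented once, unaware of any backend."""
    tracer.record("GET /health", 3.2)
```

Swapping or adding a backend means changing the list passed to `Tracer`, while `handle_request` stays untouched. That separation between instrumentation and destination is the property the paragraph above describes.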

These foundational technical choices deliver significant user benefits. Developers get fast data ingestion and query execution, which reduces the time spent diagnosing and fixing problems. The efficiency of ClickHouse also translates into cost savings for customers, as it requires fewer resources to store and process vast amounts of data. Crucially, the reliance on open standards like OpenTelemetry reduces vendor lock-in: because instrumentation is not tied to a single backend, users can migrate or integrate with other tools far more easily.

The 'No-Code' Revolution with eBPF

eBPF stands as a pivotal technology for modern observability, fundamentally changing how developers instrument complex systems without altering application code. This Linux kernel innovation allows for deep, granular insights into system behavior, representing a paradigm shift from traditional methods that often demand intrusive agents or manual instrumentation. It eliminates the need for developers to embed SDKs or modify existing codebases, a significant operational burden that consumes valuable engineering time.

At its core, eBPF enables small, verified programs to run directly within the kernel, attaching to system events such as network calls, file system operations, process execution, and CPU scheduling. This capability allows Better Stack to collect high-fidelity telemetry data, including metrics, logs, and traces, without touching application code. It bypasses the tedious and error-prone process of manual instrumentation or deploying resource-heavy sidecar containers solely for monitoring, significantly streamlining data acquisition.

Better Stack leverages this power to deliver advanced, automatic OpenTelemetry tracing for containerized environments. Its eBPF-based solution provides unparalleled visibility into Kubernetes and Docker workloads, capturing detailed traces across distributed microservices. This means developers gain full visibility into service interactions, latency, error propagation, and resource consumption within their complex applications, all without the architectural complexity or performance impact typically associated with deep tracing.

This approach offers profound benefits for developers and Site Reliability Engineers (SREs). It significantly reduces operational overhead, accelerating the deployment of robust monitoring solutions from days to minutes. Teams avoid spending valuable development cycles on instrumenting code, instead focusing on innovation and feature delivery. Furthermore, eBPF yields richer, more accurate data collection, providing a clearer, real-time picture of system health, performance bottlenecks, and elusive root causes for incidents.

The "no-code" revolution, powered by eBPF, simplifies the entire observability pipeline, democratizing access to advanced tracing capabilities. It makes sophisticated diagnostics accessible to more teams without requiring specialized kernel knowledge or extensive code changes. This strategic investment in cutting-edge technology positions Better Stack as a formidable competitor in the observability space, as evidenced by its funding rounds. For more on how Better Stack is challenging incumbents, see Better Stack raises $18.6M for a ClickHouse-based Datadog challenger.

Meet Your New AI SRE Teammate

Better Stack’s latest innovation introduces a sophisticated AI SRE agent, embedding it directly into team communication channels like Slack. This intelligent bot isn't merely a notification system; it acts as an active, always-on teammate, fundamentally transforming incident response by tackling the initial chaos of an outage. It aims to provide immediate, actionable insights, reducing the critical time-to-resolution.

When an incident strikes, the AI agent springs into action, autonomously investigating the issue. It leverages Better Stack’s unified data platform, which is built on ClickHouse and ingests data via OpenTelemetry and eBPF. The agent meticulously sifts through vast amounts of information, including:

- Logs
- Metrics
- Traces

It correlates disparate data points across the entire system, rapidly forming initial hypotheses about the incident’s root cause. The AI then presents these findings and potential diagnostic paths in a clear, concise format within Slack, eliminating the need for engineers to manually comb through dashboards.
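As a rough illustration of what "correlating data points" can mean, the sketch below ranks services by how many different kinds of signal (log, metric, trace) point at them shortly before an incident. The `Signal` type, the `rank_suspects` function, and the five-minute window are assumptions made for this example; the article does not describe how Better Stack's agent actually forms its hypotheses.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Signal:
    kind: str         # "log", "metric", or "trace"
    service: str      # service the signal was observed on
    timestamp: float  # seconds since epoch
    detail: str       # human-readable description


def rank_suspects(
    signals: Iterable[Signal],
    incident_time: float,
    window_s: float = 300.0,
) -> List[Tuple[str, int]]:
    """Rank services by distinct signal kinds seen in the window before the incident."""
    kinds_by_service: Dict[str, Set[str]] = defaultdict(set)
    for signal in signals:
        age = incident_time - signal.timestamp
        if 0 <= age <= window_s:
            kinds_by_service[signal.service].add(signal.kind)
    ranked = [(service, len(kinds)) for service, kinds in kinds_by_service.items()]
    # Most corroborated service first; ties broken alphabetically.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked
```

A service flagged by logs, metrics, and traces at once ranks above one flagged by a single noisy log line, which is a simple way to turn scattered evidence into an ordered list of hypotheses.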

Crucially, Better Stack champions a human-in-the-loop approach. While the AI provides comprehensive diagnostics and suggests potential remediation steps, humans retain ultimate authority. Engineers review the AI’s suggestions and approve any actions, ensuring operational safety and preventing autonomous missteps. This collaborative model empowers teams with advanced insights without sacrificing oversight.
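The human-in-the-loop model boils down to an approval gate between a suggestion and its execution. The following is a minimal sketch under that assumption; `ProposedAction` and `execute_with_approval` are hypothetical names, and in practice the `approve` callback would be something like a button press in Slack rather than a Python function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    description: str        # what the agent suggests, shown to the engineer
    run: Callable[[], str]  # the remediation itself, executed only if approved


def execute_with_approval(
    action: ProposedAction,
    approve: Callable[[str], bool],
) -> str:
    """Run the action only when a human approver explicitly says yes."""
    if not approve(action.description):
        return "skipped: not approved"
    return action.run()
```

The key property is that the agent can only construct a `ProposedAction`; nothing runs until the approver returns `True`.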

Better Stack’s vision for AI isn't to replace skilled engineers but to provide powerful augmentation. By automating the laborious, time-consuming tasks of initial investigation and data correlation, the AI SRE agent dramatically speeds up root cause analysis. This frees valuable engineering time, allowing teams to focus on strategic problem-solving, system improvements, and innovation, rather than reactive firefighting, effectively condensing hours of manual triage into minutes of AI-assisted diagnosis.

The Power of Being 'Unintentionally Profitable'

Better Stack's financial trajectory defies the typical Silicon Valley playbook, accumulating $28.6 million across two strategic funding rounds. Its initial Series A in July 2022 secured $18.6 million, spearheaded by Creandum, establishing a robust foundation for its ambitious observability platform. This early backing signaled strong investor confidence in the team's vision to disrupt established giants.

Remarkably, Better Stack achieved profitability in 2023, a surprising milestone that allowed the company to preserve its initial funding. This unexpected financial health positioned the observability platform uniquely, diverging from many venture-backed startups often burning cash to fuel growth. It underscored a lean, efficient operational model, proving that rapid expansion doesn't always necessitate endless capital.

Rather than signaling distress, Better Stack’s subsequent $10 million Series A in January 2024 came from a position of undeniable strength. KAYA led this round, with notable angel investors like Box CEO Aaron Levie and ex-UiPath executive Kulpreet Singh joining. This capital infusion wasn't a lifeline; it was strategic fuel, enabling the company to accelerate product development, expand its global footprint, and further innovate in areas like eBPF and AI SRE.

This sustainable growth model signals profound health and a long-term vision for Better Stack. Achieving profitability while still early in its growth phase showcases astute financial management and a product that resonates strongly enough to generate substantial recurring revenue from over 4,000 customers, including enterprises like Time Magazine and Salesforce. The company now leverages its capital to innovate further, not merely survive, fostering a robust future in the competitive observability landscape by focusing on customer value and operational efficiency.

Brutally Honest: What Users Actually Say

Users frequently laud Better Stack for its unified platform, streamlining what typically requires multiple disparate tools. Developers and SREs consistently praise the straightforward setup process, often noting how quickly they can onboard and configure monitoring for their services. The reliability of its alerting system also receives high marks, delivering critical notifications precisely when needed, whether via phone call, Slack, or PagerDuty integrations. This all-in-one approach consolidates uptime monitoring, log management, and incident response into a cohesive observability experience, a significant draw for teams seeking simplicity without sacrificing functionality. For a deeper dive into its capabilities, see Introducing Better Stack: A unified platform to Spot, Resolve and Prevent downtime.

Despite the widespread approval, user feedback also highlights areas for refinement. Some advanced users express a desire for more granular and customizable configuration options, particularly when integrating Better Stack into highly complex, bespoke infrastructure setups. While the platform excels at out-of-the-box functionality, teams with unique compliance needs or intricate alert routing structures sometimes find the initial learning curve steeper for sophisticated customizations, contrasting with its overall user-friendly design.

Another common point of discussion revolves around its tracing capabilities. While present and functional, many users perceive Better Stack’s distributed tracing as "lighter" than the robust, feature-rich offerings found in established enterprise APM (Application Performance Monitoring) solutions. Platforms like Datadog and New Relic often provide deeper, more expansive code-level insights and profiling, which Better Stack, even with its eBPF integration, continues to evolve towards for full parity. This perception doesn't diminish its strengths in other areas, but it marks an opportunity for growth.

This candid feedback paints a picture of a rapidly maturing product. Better Stack excels at providing a comprehensive, accessible observability suite, particularly for teams prioritizing ease of use and consolidated tooling over highly specialized, niche features. The company actively incorporates user suggestions, demonstrating a commitment to enhancing its advanced features and bridging the gap in areas like tracing, ensuring it remains a competitive force against industry behemoths. Its trajectory suggests a continuous push towards enterprise-grade depth while retaining its core simplicity.

Observability's Next Chapter is Already Written

Better Stack’s strategic moves—integrating a Slack-native AI SRE agent, leveraging deep system instrumentation with eBPF, and building collaborative dashboards—reveal the definitive future of observability. These aren't merely isolated features; they represent a cohesive vision for proactive incident prevention and rapid resolution, reshaping how engineering teams interact with their infrastructure. The company's journey, fueled by $28.6 million in funding, underscores this profound commitment to innovation.

This integrated approach signals an undeniable industry trend towards unified platforms that are AI-assisted and inherently developer-friendly. Engineering teams demand comprehensive insights, spanning uptime monitoring, log management, and trace analysis, all accessible within a single, intuitive interface. Better Stack’s rapid adoption by over 200,000 developers and 4,000+ customers, including enterprises like Time Magazine, Salesforce, and ESET, powerfully validates this critical market shift away from complexity.

Consequently, the appeal of cobbling together multiple, disparate monitoring tools rapidly diminishes. Developers once juggled services like Pingdom for uptime, PagerDuty for alerting, and separate logging solutions, creating integration nightmares, data silos, and significant operational overhead. Better Stack's all-in-one solution liberates teams from this fragmented paradigm, streamlining workflows and dramatically accelerating incident response times.

Observability’s next chapter empowers engineering teams to transcend reactive incident management. By providing real-time, deep system insights via technologies like ClickHouse and OpenTelemetry, coupled with intelligent automation, this new model fosters a proactive stance. Teams can now actively prevent downtime, rather than merely responding to outages, ensuring superior system reliability and operational resilience. The era of fragmented monitoring ends; the future of effortless prevention, driven by platforms like Better Stack, has already arrived.

Frequently Asked Questions

What is Better Stack?

Better Stack is a comprehensive observability platform that unifies uptime monitoring, log management, incident management, and status pages into a single, user-friendly interface for developers and SRE teams.

How is Better Stack different from Datadog or PagerDuty?

Better Stack differentiates itself by offering a unified, all-in-one platform from the start, often at a more competitive price point. It avoids the complexity of integrating separate tools like Pingdom and PagerDuty by combining their core functionalities.

What is AI SRE in Better Stack?

The AI SRE is a Slack-native agent that automatically investigates incidents by analyzing logs, metrics, and traces. It suggests potential root causes to engineers, helping to resolve downtime faster.

Why does Better Stack use eBPF for tracing?

Better Stack uses eBPF to allow engineering teams to collect logs, metrics, and network traces from their systems without needing to modify their application's code, simplifying the instrumentation process significantly.
