<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.rishibhatia.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.rishibhatia.io/" rel="alternate" type="text/html" /><updated>2026-06-09T20:45:26+00:00</updated><id>https://www.rishibhatia.io/feed.xml</id><title type="html">Rishi Bhatia</title><subtitle>Data &amp; AI strategist and engineer in San Francisco. A decade building data and AI systems at scale — previously Meta, Deloitte, and AI &amp; data startups.</subtitle><author><name>Rishi Bhatia</name></author><entry><title type="html">So, you want to build your data stack?</title><link href="https://www.rishibhatia.io/writing/2022/11/14/so-you-want-to-build-your-data-stack.html" rel="alternate" type="text/html" title="So, you want to build your data stack?" /><published>2022-11-14T17:30:06+00:00</published><updated>2022-11-14T17:30:06+00:00</updated><id>https://www.rishibhatia.io/writing/2022/11/14/so-you-want-to-build-your-data-stack</id><content type="html" xml:base="https://www.rishibhatia.io/writing/2022/11/14/so-you-want-to-build-your-data-stack.html"><![CDATA[<p>You are a Founder of a bright and shiny business, or a Product Manager looking to make an immediate impact in a new team.</p>

<p>People around you say things like:</p>

<blockquote>
  <p>“We are flying blind; we don’t have any metrics to look at”</p>
</blockquote>

<blockquote>
  <p>“We have multiple sources of data, but don’t know how to stitch it all together”</p>
</blockquote>

<p>If this is you, it’s time to include building a modern data stack in your roadmap.</p>

<p><img src="https://miro.medium.com/max/700/1*ZbqQrUPaC1z4Km5UQv6Yww.png" alt="" />
<!-- *Dall-E prompt: “A database, connected through multiple nodes, with data flowing between the nodes, in a futuristic design style”* --></p>

<p>This article presents a framework you can use to get your modern data stack off the ground.
<br /></p>

<h1 id="where-you-are-right-now">Where you are right now</h1>

<p>Let’s assume:</p>

<ul>
  <li>Your business has a website</li>
  <li>You handle payments through a payment processor like Stripe or PayPal</li>
  <li>You advertise using Google Ads / Meta Ads</li>
  <li>You have an MVP product that generates data
<br /></li>
</ul>

<h1 id="the-framework-for-building-your-modern-data-stack">The framework for building your modern data stack</h1>

<p>You should spend a few extra brain cycles on day zero to think about how your data stack will evolve as your business grows. This is exactly what this framework helps with — build once, and use multiple times.</p>

<p>There are 4 steps. Let’s go through them one by one.</p>

<h2 id="step-1--figure-out-your-constraints"><strong>Step 1 — Figure out your constraints</strong></h2>

<p>As with any project, start by identifying your constraints:</p>

<p><strong>Speed</strong></p>

<ul>
  <li>Do you want answers immediately? Maybe you have an upcoming board meeting for which you need these metrics?</li>
  <li>Or, can you wait a bit and build something future-proof?</li>
</ul>

<p><strong>$$</strong></p>

<ul>
  <li>What’s your budget for this project? Make sure to include any recurring software expenses.</li>
  <li>Do you plan on hiring data experts to support this project in the future?</li>
</ul>

<p><strong>Complexity</strong></p>

<ul>
  <li>How many sources of data do you need to handle?</li>
  <li>How specialized will your insights and metrics become down the road? Will you need to slice your data across multiple geos, channels, and customer profiles?</li>
</ul>

<p><strong>Team’s data savviness</strong></p>

<ul>
  <li>Can anyone on your team code? Can they write SQL?</li>
  <li>Or, is a no-code solution needed?</li>
</ul>

<p><strong>Future requirements</strong></p>

<ul>
  <li>What data do you need today? What will you need six months or a year from now?</li>
  <li>Will this data give you a competitive advantage in the future, even if it doesn’t today?</li>
</ul>

<h2 id="step-2--list-your-core-metrics"><strong>Step 2 — List your core metrics</strong></h2>

<p>Simple and easily explainable metrics are powerful; prioritize them. Once you have a handle on your constraints, talk to your team and home in on what you want to measure.</p>

<p>We’ll use the metric LTV/CAC as a working example to walk you through the next two steps.</p>

<p><em>(LTV is the average ‘Lifetime Value’ of your customer, and CAC is the average ‘Customer Acquisition Cost’. This ratio is useful for any subscription business to track because it directly relates the average revenue you receive per customer to the amount you spend acquiring them.)</em></p>

<h2 id="step-3--break-down-your-metrics-and-draw-a-line-to-their-sources"><strong>Step 3 — Break down your metrics and draw a line to their sources</strong></h2>

<p>Break down your core metrics into smaller and more digestible pieces. The goal of this step is to identify your inputs and map them to their source app.</p>

<p>Let’s apply this to our working example.</p>

<p><img src="https://miro.medium.com/max/700/1*HhXQZqpfZKtVXbJDFJq_jw.png" alt="" /></p>

<p>In the diagram above, you can see that LTV depends on revenue and churn, which are sourced from your payment processor (Stripe/PayPal) and your website/app (WordPress, Shopify), respectively.</p>

<p>Similarly, CAC comes from your ad spend. (For the sake of simplicity, we have assumed that sales commissions and other sales-related spend are nonexistent.)</p>

<p>(<em>Check out the</em> <strong><em>Helpful Equations</em></strong> <em>section at the bottom of the article to see how we arrived at the inputs in the diagram</em>)</p>
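
<p>To make the breakdown concrete, here is a minimal Python sketch of the formulas listed in the Helpful Equations section. The class name, field names, and sample numbers are all made up for illustration; swap in figures from your own payment processor, website/app, and ad accounts.</p>

```python
from dataclasses import dataclass


@dataclass
class PeriodInputs:
    """Raw inputs for one period, such as a month. All values are hypothetical."""
    revenue: float            # from the payment processor
    customers: int            # customers billed in the period
    customers_at_start: int   # customers at the start of the period
    customers_lost: int       # customers who left during the period
    acquisition_spend: float  # total ad spend in the period
    customers_acquired: int   # new customers won in the period


def arpu(p):
    # ARPU = revenue for the period / number of customers in that period
    return p.revenue / p.customers


def churn(p):
    # Churn = customers who left this period / customers at start of period
    return p.customers_lost / p.customers_at_start


def ltv(p):
    # LTV = ARPU / churn (undefined when nobody churned)
    rate = churn(p)
    if rate == 0:
        raise ValueError("churn is zero, so LTV is undefined for this period")
    return arpu(p) / rate


def cac(p):
    # CAC = total spent on customer acquisition / customers acquired
    return p.acquisition_spend / p.customers_acquired


def ltv_to_cac(p):
    return ltv(p) / cac(p)


example = PeriodInputs(
    revenue=20000.0,
    customers=400,
    customers_at_start=400,
    customers_lost=25,
    acquisition_spend=4000.0,
    customers_acquired=20,
)
ratio = ltv_to_cac(example)  # ARPU 50, churn 0.0625, LTV 800, CAC 200, ratio 4.0
```

<p>Each function needs only one or two raw inputs, which is what lets you trace every piece of the metric back to a single source app.</p>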

<h2 id="step-4--assemble-the-data-stack"><strong>Step 4 — Assemble the data stack</strong></h2>

<p>This is it; we are in the endgame. We now have everything we need to <em>finally</em> start building.</p>

<p><img src="https://miro.medium.com/max/700/1*wadntmIoCtaBMyWTe784eQ.png" alt="" /></p>

<p>A data stack has the following blocks:</p>

<p><strong>Source</strong></p>

<p>You have a list of sources/tools from Step 3. Most tools expose hooks, such as APIs or webhooks, that let you connect to their backend data store. Build a list of these APIs/webhooks and set up the necessary access permissions.</p>

<p><em>Data Source examples: Stripe/PayPal, Product App, Website, CRM…</em><br />
<br /></p>
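
<p>One lightweight way to keep that list is a small inventory you can check for gaps. The sketch below is only an illustration: the access methods and the ready/not-ready flags are placeholders, not statements about what any of these tools actually offer.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str                # tool that holds the data
    feeds: tuple             # metric inputs this source supplies
    access: str              # how you expect to reach it (placeholder)
    credentials_ready: bool  # have access permissions been set up yet?


# Inputs identified in Step 3 for the LTV/CAC example.
REQUIRED_INPUTS = {"revenue", "churn", "ad spend"}

# Hypothetical inventory; fill in your own tools and status.
SOURCES = [
    Source("Stripe/PayPal", ("revenue",), "API", True),
    Source("Website/app", ("churn",), "API or webhook", False),
    Source("Google Ads / Meta Ads", ("ad spend",), "API", False),
]


def uncovered_inputs(sources, required):
    """Metric inputs that no listed source supplies."""
    covered = {item for s in sources for item in s.feeds}
    return sorted(required - covered)


def missing_access(sources):
    """Sources whose access permissions still need to be created."""
    return [s.name for s in sources if not s.credentials_ready]


gaps = uncovered_inputs(SOURCES, REQUIRED_INPUTS)
todo = missing_access(SOURCES)
```

<p>An empty <code>gaps</code> list means every input from Step 3 has a home, and <code>todo</code> is your to-do list for access requests.</p>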

<p><strong>Ingest</strong></p>

<p>Connect all your data sources, and then siphon, clean, and transport your data to a warehouse or visualization tool. Customer Data Platforms (CDPs) like Segment provide simple user interfaces to do this. Other tools, such as Stitch and Fivetran, work as well.</p>

<p>It doesn’t matter which tool you use as long as it is compatible with all your source and destination apps.</p>

<p><em>Ingestion Tool examples: Segment, Fivetran, Stitch</em><br />
<br /></p>
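
<p>If you are curious what "clean" means in practice, here is a rough Python sketch of the kind of tidying an ingestion tool does for you. The record layout (<code>customer</code>, <code>amount_cents</code>, <code>created</code>) is invented for this example and is not the real payload of Stripe, PayPal, or any other tool.</p>

```python
from datetime import date


def clean_payment(raw):
    """Turn one raw payment record into a tidy row, or None if it is unusable."""
    if not raw.get("customer") or raw.get("amount_cents") is None:
        return None  # drop records missing a customer or an amount
    return {
        "customer_id": raw["customer"].strip(),
        "amount": raw["amount_cents"] / 100,  # cents to currency units
        "paid_on": date.fromisoformat(raw["created"]),
    }


def ingest(raw_records):
    """Clean every record and keep only the usable rows."""
    rows = [clean_payment(r) for r in raw_records]
    return [r for r in rows if r is not None]


# Hypothetical raw records, as they might arrive from a payments source.
raw_payments = [
    {"customer": " c_1 ", "amount_cents": 5000, "created": "2022-10-03"},
    {"customer": "", "amount_cents": 5000, "created": "2022-10-04"},
    {"customer": "c_2", "amount_cents": 2500, "created": "2022-10-15"},
]
rows = ingest(raw_payments)
```

<p>The second record is dropped because it has no customer, so two clean rows move on to the next block of the stack.</p>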

<p><strong>Park &amp; Optimize [Optional]</strong></p>

<p>You can choose to clean and model your data in a warehouse before consuming it. Doing so can lower your data fees and reduce the burden on your visualization tool. Note that you will need in-house SQL expertise to do this.</p>

<p><em>Data Warehouse Tool examples: Snowflake, Amazon Redshift, Google BigQuery, Microsoft Azure</em><br />
<br /></p>
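
<p>As a rough idea of what "modeling" looks like, the sketch below uses Python's built-in <code>sqlite3</code> module as a stand-in for a real warehouse. The table names, columns, and rows are assumptions made up for this example; the point is that the SQL rolls raw payments up into a small summary table for your visualization tool to read.</p>

```python
import sqlite3

# An in-memory SQLite database stands in for the warehouse here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (customer_id TEXT, amount REAL, paid_on TEXT)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("c_1", 50.0, "2022-10-03"),
        ("c_2", 50.0, "2022-10-15"),
        ("c_1", 50.0, "2022-11-03"),
    ],
)

# Model the raw rows once: one summary row per month.
conn.execute(
    """
    CREATE TABLE monthly_revenue AS
    SELECT substr(paid_on, 1, 7)       AS month,
           SUM(amount)                 AS revenue,
           COUNT(DISTINCT customer_id) AS customers
    FROM payments
    GROUP BY month
    """
)

summary = conn.execute(
    "SELECT month, revenue, customers FROM monthly_revenue ORDER BY month"
).fetchall()
# summary == [("2022-10", 100.0, 2), ("2022-11", 50.0, 1)]
```

<p>Revenue and customer counts per month are exactly the inputs ARPU needs, so the dashboard only has to divide two columns instead of scanning every payment.</p>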

<p><strong>Consume</strong></p>

<p>Finally, link your data to a BI/visualization tool. Note that if you skipped the ‘Park &amp; Optimize’ step, you will need a feature-rich visualization tool. Tools like Tableau or Domo can handle heavy data wrangling and let you work in a no-code, drag-and-drop fashion, but they also cost $$.</p>

<p>You are now ready to create meaningful charts to track your business!</p>

<p><em>Visualization Tool examples: Looker, Tableau, Mixpanel, Domo</em><br />
<br /></p>

<h1 id="close">Close</h1>

<p>We hope this framework gets you started on your journey to create your company’s data stack. There are many data tools out there, and we have intentionally shied away from prescribing a holy-grail solution.</p>

<p><em>Do you have any other questions that we can answer? How would you build your data stack?</em></p>

<p><em>Reach out to us with your comments or questions; we’re super interested in your perspective!</em></p>

<p><a href="https://www.linkedin.com/in/rishhbhatia/">Rishi Bhatia</a> and <a href="https://www.linkedin.com/in/raylanvaz/">Raylan Vaz</a></p>

<h4>…</h4>

<h2 id="helpful-equations">Helpful Equations</h2>

<p><strong>LTV/CAC</strong></p>

<p>LTV/CAC = Average Lifetime Value (LTV) / Customer Acquisition Cost (CAC)</p>

<p><strong>LTV</strong></p>

<p>LTV = Average Revenue Per User (ARPU) / Churn</p>

<p><strong>ARPU</strong></p>

<p>ARPU = Revenue For A Given Period / Number Of Customers In That Period</p>

<p><strong>Churn</strong></p>

<p>Churn = Customers Who Left This Period / Customers At Start Of Period</p>

<p><strong>CAC</strong></p>

<p>CAC = Total Spent On Customer Acquisition / Customers Acquired</p>]]></content><author><name>Rishi Bhatia</name></author><category term="writing" /><summary type="html"><![CDATA[You are a Founder of a bright and shiny business, or a Product Manager looking to make an immediate impact in a new team.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.rishibhatia.io/assets/og/data-stack.png" /><media:content medium="image" url="https://www.rishibhatia.io/assets/og/data-stack.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Why am I starting my website</title><link href="https://www.rishibhatia.io/writing/2022/08/23/why-am-i-starting-my-website.html" rel="alternate" type="text/html" title="Why am I starting my website" /><published>2022-08-23T22:31:06+00:00</published><updated>2022-08-23T22:31:06+00:00</updated><id>https://www.rishibhatia.io/writing/2022/08/23/why-am-i-starting-my-website</id><content type="html" xml:base="https://www.rishibhatia.io/writing/2022/08/23/why-am-i-starting-my-website.html"><![CDATA[<p>As I write this, I am sitting on the deck of an A-frame cabin perched somewhere in Sonora, CA. It’s sunny outside, the wind is rustling, and a family of deer is grazing 10 feet away from me. I hadn’t heard about Sonora until a week ago, let alone known that this cabin existed. The initial plan was to head to Yosemite, a place I am familiar with, and book a place to stay near the park. I found this A-frame on Airbnb through their <a href="https://news.airbnb.com/unique-stays-hosts-earn-more-than-300-million-since-start-of-pandemic/">newly launched feature ‘I’m flexible’</a>. Ordinarily, I see no way I could have found this fantastic place, the place that’s inspiring me to sit down today and write all this. I’m delighted that this feature and product exist.</p>

<p><img src="https://openai-labs-public-images-prod.azureedge.net/user-wyrVlJsP8g7Gq5MHGB84uB1I/generations/generation-9whznA4xUom690Kqh7GnO2zf/image.webp" alt="" /></p>

<p>OK, Airbnb isn’t sponsoring this post. So how is this connected with me starting my website? I turned 30 this year. A year ago, I joined a big tech company as a Data Engineer, helping build AR products. I live in San Francisco, CA. I am surrounded by tech, innovation, ideas, and, most importantly, fun and intelligent people. A lot of these ideas and innovations directly connect with my daily life. I read a lot of what these people have to say. But I want to add something of <em>my own</em> to this mix. I can’t live in this time and place and not create.</p>

<h3 id="what-is-this-space-going-to-be-about">What is this space going to be about?</h3>
<p><br />
<strong>Writing is thinking</strong></p>

<p>I want a place to park my fully formed ideas and my thought breadcrumbs. <a href="https://boz.com/articles/writing-thinking">Andrew Bosworth writes that ‘writing is thinking’</a>. Thinking is what I have been doing all my life. I just haven’t been writing enough. In my head, I churn through idea after idea, formulating my opinion on each, but never really pausing and writing deeply. I use <a href="https://roamresearch.com/">Roam Research</a> to log my daily journals. These journals are where I brain dump my thoughts at the start of my day before I get ready for work. I write a little more consciously when I’m fortunate enough to have the luxury of time, like at a coffee shop or some cabin in the woods. But it is still not enough. It isn’t something I would show the world. It isn’t proofread. I don’t have anyone reading the drafts and giving me feedback. This action, right here, is me taking writing seriously and a step further.</p>

<p><strong>On Data</strong></p>

<p>I know a thing or two about the field of data. In my career, I have worked as an Analyst, a Data Scientist, and recently as a Data Engineer. I see data and the impact it can and can’t have in unique ways. As I broaden my understanding and branch out from data, I want to share everything I have learned about this fascinating space.</p>

<p><strong>Product</strong></p>

<p>Product has been top of mind for me lately. Everything about it. How do you build products that people use and scale themselves? What’s product strategy, and how do you rally a team to execute it? These are problems that excite me now, and I want to get better at them. So I want to share my learnings as I go on this journey!</p>

<p><strong>Building in public</strong></p>

<p>I’ve started and dropped many projects over the years and want to create a forcing function to break this habit of mine. For example, I kickstarted a podcast with a friend, and we just recorded the first episode this weekend. I want this website to be where I can document the journey of building products and content and hold myself somewhat accountable. I’ll be more committed if I create in public.</p>

<h3 id="fin">Fin</h3>
<p><br />
These are the themes I have for now. This is the first time I have publicly put out more than 100 words. The last time I did something similar was when I published a <a href="https://medium.com/@rishh.bhatia/chess-3f5966834371">terribly-tiny-tale inspired short story on Medium</a>. Maybe let’s not count that?</p>]]></content><author><name>Rishi Bhatia</name></author><category term="writing" /><summary type="html"><![CDATA[As I write this, I am sitting on the deck of an A-frame cabin perched somewhere in Sonora, CA. It’s sunny outside, the wind is rustling, and a family of deer is grazing 10 feet away from me. I hadn’t heard about Sonora until a week ago, let alone known that this cabin existed. The initial plan was to head to Yosemite, a place I am familiar with, and book a place to stay near the park. I found this A-frame on Airbnb through their newly launched feature ‘I’m flexible’. Ordinarily, I see no way I could have found this fantastic place, the place that’s inspiring me to sit down today and write all this. I’m delighted that this feature and product exist.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.rishibhatia.io/assets/og/why-website.png" /><media:content medium="image" url="https://www.rishibhatia.io/assets/og/why-website.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>