Listing Lifetime Value (LTV) Prediction
LTV prediction for vacation rental listings is a problem where the framing matters more than the model: define "lifetime" or "value" wrong and even a perfect model produces useless outputs. The stakes are high because LTV scores drive host acquisition spend, market investment decisions, and which listings get proactive support. I'll work through what we're predicting, business and ML objectives, system architecture, data and features, modeling, infrastructure, evaluation, and robustness.
Solution Walkthrough
Before diving into the modeling, I want to be really precise about what we're predicting, because this is where most LTV systems go wrong. People say "predict lifetime value" as if that's a well-defined quantity, but there are at least three definitions you could use, and they lead to very different systems.
Value Definition
The value we're predicting is the total revenue the platform extracts from this listing over its lifetime. That means the fees the platform charges on each booking, typically a host service fee of around 3% of the booking subtotal, plus a guest service fee of 14 to 16% of the booking subtotal. Together, the platform captures roughly 17 to 19% of the gross booking value on every transaction. So if a listing generates $100K in gross booking value over its lifetime, the platform captures about $17K to $19K. That's the value we're modeling.
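To make the fee arithmetic concrete, here is a minimal sketch. The function name `platform_revenue` and its parameters are illustrative, not part of any real system, and it assumes both fees are charged on the same booking subtotal, as described above.

```python
def platform_revenue(gross_booking_value, host_fee_rate=0.03, guest_fee_rate=0.15):
    """Fee revenue the platform captures on a given gross booking value.

    Assumes both fees are simple percentages of the same booking subtotal.
    """
    return gross_booking_value * (host_fee_rate + guest_fee_rate)


# Lifetime gross booking value of $100K at the low and high end of the guest fee range.
low = platform_revenue(100_000, guest_fee_rate=0.14)
high = platform_revenue(100_000, guest_fee_rate=0.16)
print(f"${low:,.0f} to ${high:,.0f}")  # $17,000 to $19,000
```

In a real pipeline the fee rates would vary by market and over time, so they would likely come from booking records rather than constants.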
It's important to be explicit about what we're not counting. We're not counting revenue from the same host's other listings; that's host LTV, which is a related but distinct problem. We're not counting indirect network effects, like this listing attracting guests who then book other listings on the platform. And we're not counting brand value or market positioning effects. These are all real value, but they're either attributable to different entities or too diffuse to model at the listing level.
Why does this definition matter? Because it aligns incentives correctly. The platform wants listings that generate high booking volume consistently. A listing that gets one $10K booking per year is actually less valuable than one that gets 10 bookings at $1K each, even though the gross booking value is identical. The high-frequency listing generates more guest diversity, more reviews, more marketplace liquidity, and more opportunities for the platform to earn fees. With purely percentage-based fees the two earn the platform the same amount in any single year, so our value definition captures the difference over the lifetime rather than per transaction: to the extent that steady bookings keep a listing active longer, they accumulate more fee revenue.
Lifetime Definition
Lifetime is the time from listing activation until the listing becomes permanently inactive. An active listing is one that is available for booking, has not been delisted, and has a host who still responds. A listing dies for several reasons: the host might remove it voluntarily or stop responding to booking requests, the platform might delist it for policy violations, the property might be sold or repurposed for something other than short-term rental, or the listing might go dormant (no bookings and no calendar updates for twelve or more months), which we treat as effectively dead.
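The lifetime rules above can be sketched as a labeling function that turns a listing record into a duration plus a flag saying whether death was actually observed. Everything here is hypothetical: the `Listing` fields, the `lifetime_label` name, approximating "twelve or more months" as 365 days, and dating a dormant listing's death at its last activity rather than at the end of the dormancy window. The single `delisted` field stands in for any explicit end event (host removal, policy delisting, property sold).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

DORMANCY_DAYS = 365  # assumption: "twelve or more months" approximated as 365 days


@dataclass
class Listing:
    activated: date
    delisted: Optional[date]  # any explicit end event, or None if there is none
    last_activity: date       # most recent booking or calendar update


def lifetime_label(listing: Listing, as_of: date) -> tuple[int, bool]:
    """Return (duration_days, death_observed) as of a snapshot date."""
    if listing.delisted is not None and listing.delisted <= as_of:
        return (listing.delisted - listing.activated).days, True
    if (as_of - listing.last_activity).days >= DORMANCY_DAYS:
        # Dormant: treated as dead, dated at the last sign of activity (assumption).
        return (listing.last_activity - listing.activated).days, True
    # Still active: we only know the lifetime is at least this long.
    return (as_of - listing.activated).days, False


as_of = date(2024, 1, 1)
active = Listing(date(2022, 1, 1), None, date(2023, 12, 1))
removed = Listing(date(2022, 1, 1), date(2022, 7, 1), date(2022, 6, 15))
dormant = Listing(date(2021, 1, 1), None, date(2022, 6, 1))
labels = [lifetime_label(x, as_of) for x in (active, removed, dormant)]
```

The boolean is the important part: it tells a downstream model whether the duration is a complete lifetime or only a lower bound.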
Here's the fundamental challenge: most listings in our dataset are still active today, so we don't know their full lifetime. Take a listing created two years ago that's still active: we know its LTV is at least the fee revenue it has generated so far, but we don't know the final value. This is the censoring problem, and it's the single most important technical consideration in the entire design. Seventy to eighty percent of our training data is censored, which means we can't just do naive regression on observed LTV without massively underestimating the true value of active listings. We need survival analysis to handle this properly.
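A small illustration of why censoring matters, using the Kaplan-Meier estimator as one standard survival-analysis tool (the text above does not say which method the full design uses, so treat this as an example rather than the chosen approach). The toy data and the names `kaplan_meier` and `survival_at` are made up for this sketch; 7 of the 10 listings are censored.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve as [(time, S(time))] at each observed death time."""
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    curve, surv = [], 1.0
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, observed) if e and d == t)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve, t):
    """Estimated probability of surviving beyond time t (step function)."""
    surv = 1.0
    for time, s in curve:
        if time <= t:
            surv = s
    return surv


# Months each listing has been observed; False means still active (censored).
durations = [6, 12, 24, 24, 30, 36, 36, 40, 48, 60]
observed = [True, True, False, False, True, False, False, False, False, False]

km_36 = survival_at(kaplan_meier(durations, observed), 36)
# Naive view: pretend every observed duration is a completed lifetime.
naive_36 = sum(1 for d in durations if d > 36) / len(durations)
print(f"KM: {km_36:.2f}, naive: {naive_36:.2f}")  # KM: 0.67, naive: 0.30
```

The naive figure treats still-active listings as if they died on the snapshot date, which is exactly the underestimation described above. In practice a library such as lifelines would replace the hand-rolled estimator.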
For practical planning, we predict over a fixed horizon rather than trying to estimate infinite-horizon LTV, typically three-year or five-year LTV. An infinite horizon carries too much uncertainty: you can say "expected revenue in the next three years" with reasonable confidence, but "expected revenue until the listing eventually dies" depends on events that haven't happened yet and may be inherently unpredictable. The average listing lifetime is about three to four years, but the spread is wide. Some listings are multi-year powerhouses that generate steady bookings for a decade, while many churn within the first year because the host didn't enjoy the experience or the property's circumstances changed. What we care most about is separating the winners from the early churners, and a three-year horizon captures that distinction well.
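One way to picture a fixed-horizon LTV is as monthly fee revenue weighted by the probability that the listing is still alive in each month. The sketch below assumes a constant monthly churn probability, constant monthly fee revenue, and no discounting, none of which come from the text above; `horizon_ltv` and its parameters are hypothetical names.

```python
def horizon_ltv(monthly_fee_revenue, monthly_churn_prob, horizon_months=36):
    """Expected platform fee revenue over a fixed horizon.

    Assumes constant monthly revenue and churn probability, with no discounting.
    """
    ltv, surviving = 0.0, 1.0
    for _ in range(horizon_months):
        ltv += surviving * monthly_fee_revenue  # revenue earned only if still active
        surviving *= 1 - monthly_churn_prob     # chance of reaching the next month
    return ltv


# Hypothetical listing: $150 per month in platform fees, 2% monthly churn.
three_year = horizon_ltv(150, 0.02, horizon_months=36)
five_year = horizon_ltv(150, 0.02, horizon_months=60)
early_churner = horizon_ltv(150, 0.15, horizon_months=36)
```

Under these assumptions the gap between a steady listing and an early churner is already large at 36 months, which is the separation the three-year horizon is meant to capture. A real model would replace the constant churn rate with a predicted survival curve per listing.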