Start leveraging web data to produce granular insights
Having worked with many hedge funds, we are still surprised by how rarely fund managers use web-scraped data to generate alpha. The web is a goldmine of information, and companies are only increasing their online presence. Simply by scouring the web for data, we are able to answer key questions and surface highly predictive insights.
What funds often don’t realize is that there is far more information on the web than what a user can see on the front-end of a website. Oftentimes, there is highly predictive data in the network requests and code of a website. We’ve been able to generate incredible returns and save funds from major drawdowns through this “back-end” data.
Let’s dive into a few examples across industries that can drive alpha for your fund:
Perhaps the best industry for web data is retail. The most common way we see fund managers use web-scraped data in retail is tracking pricing through large providers. There are two main problems with this. The first is that you’re likely not generating alpha, since other funds see the same data at the same time. The second is that large providers standardize their extraction, at the expense of the metrics that carry the most predictive signal.
To solve the first problem, you need to extract data at a higher frequency than the large providers. They generally release retail data monthly to reduce their costs, which makes beating them on frequency relatively easy. With even weekly scrapes, your analysts and PMs will know when a company is pushing promotions and eating into its gross margins weeks before other funds. At Durable Alpha, we helped one of our clients identify a reduction in promotion depth and intensity at $LULU weeks before other funds caught on, helping them generate incredible returns.
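The two metrics above boil down to simple arithmetic over repeated price scrapes. Here is a minimal sketch of how they might be computed; the product records and field names are illustrative, and in practice each snapshot would come from a scheduled scrape of the retailer’s product pages:

```python
# Sketch: promotion intensity (share of items on sale) and promotion
# depth (average discount off list price) from weekly price snapshots.
# All SKUs and prices below are made up for illustration.

def promotion_metrics(products):
    """Return (intensity, depth) for one weekly snapshot."""
    on_sale = [p for p in products if p["sale_price"] < p["list_price"]]
    intensity = len(on_sale) / len(products)
    if on_sale:
        depth = sum(1 - p["sale_price"] / p["list_price"] for p in on_sale) / len(on_sale)
    else:
        depth = 0.0
    return intensity, depth

# Two illustrative weekly snapshots of the same assortment.
week_1 = [
    {"sku": "A", "list_price": 100.0, "sale_price": 100.0},
    {"sku": "B", "list_price": 80.0, "sale_price": 60.0},
    {"sku": "C", "list_price": 50.0, "sale_price": 40.0},
]
week_2 = [
    {"sku": "A", "list_price": 100.0, "sale_price": 100.0},
    {"sku": "B", "list_price": 80.0, "sale_price": 80.0},
    {"sku": "C", "list_price": 50.0, "sale_price": 45.0},
]

i1, d1 = promotion_metrics(week_1)  # 2 of 3 SKUs on sale
i2, d2 = promotion_metrics(week_2)  # 1 of 3 SKUs on sale, shallower discount
```

A week-over-week drop in both numbers is exactly the kind of early easing-of-promotions signal described above, visible weeks before it shows up in reported margins.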
The second problem of missing metrics is a result of providers only scraping “front-end” data. There are many retail websites where we’ve found exact inventory counts in the retailer’s network requests. This provides insight into both how much volume they’re selling and whether they’re having inventory issues. For instance, we helped a fund avoid a catastrophe with a stock because we were able to identify inventory issues from back-end data before earnings. Other metrics we’ve found to be helpful include 1) number of reviews across products as a directional view of sales, 2) star rating of products as a measure of NPS, and 3) special tags on products in the back-end. For example, Home Depot labels certain products as super SKUs on the back-end, which we can use to track how well industrial companies’ products are selling through the HD channel.
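In practice, this back-end data usually arrives as JSON in the network requests a product page makes, which you can inspect in a browser’s developer tools. Below is a sketch of pulling an inventory count and a back-end tag out of such a response; the payload structure, field names, and the `super_sku` tag are all hypothetical stand-ins, not any retailer’s actual schema:

```python
import json

# Sketch: extracting "back-end" fields from a captured network response.
# The JSON below is a made-up example of the kind of payload a product
# page might fetch; real field names vary by retailer and must be found
# by inspecting the site's network traffic.
captured_response = """
{
  "product_id": "12345",
  "title": "Example Widget",
  "availability": {"in_stock": true, "quantity_available": 37},
  "tags": ["super_sku"]
}
"""

data = json.loads(captured_response)

# Exact unit count, invisible on the front-end of many sites.
quantity = data["availability"]["quantity_available"]

# Back-end tags of the sort described above (hypothetical name).
is_super_sku = "super_sku" in data.get("tags", [])
```

Logged daily per product, a field like `quantity_available` yields both sell-through (count decreases) and restocking behavior (count jumps back up).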
Enterprise software is where we’ve seen the largest data gap for funds. Traditional datasets such as credit card transactions, foot traffic, and app downloads can’t tell you how Snowflake is performing. Yet many of these companies have a strong online presence that can reveal valuable insights. Below are three ways we track enterprise software companies.
Any website that contains information on unique inventory can provide incredible insights. Examples include car inventory (Carvana), home inventory (Pulte Homes), and room inventory (WeWork). Each of these has uniquely identifiable units that can be tracked when they are added and marked as sold when they disappear from the website. By scraping the site frequently, you can predict revenue with remarkable accuracy.
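The mechanics of this approach reduce to diffing the set of unit IDs between consecutive scrapes. Here is a minimal sketch; the VIN-style identifiers are illustrative, and a real pipeline would also store a listed price per unit so that disappeared units can be rolled up into a revenue estimate:

```python
# Sketch: tracking uniquely identifiable inventory units across scrapes.
# A unit present in yesterday's snapshot but absent today is treated as
# sold; a unit appearing for the first time is newly listed inventory.

def diff_inventory(previous_ids, current_ids):
    """Return (added, sold) sets of unit IDs between two snapshots."""
    added = current_ids - previous_ids  # newly listed units
    sold = previous_ids - current_ids   # units that disappeared
    return added, sold

# Illustrative snapshots (e.g. VINs scraped from a used-car site).
yesterday = {"VIN001", "VIN002", "VIN003"}
today = {"VIN002", "VIN003", "VIN004"}

added, sold = diff_inventory(yesterday, today)
```

Summing the listed prices of the “sold” units over a quarter gives a bottom-up revenue estimate of the kind described above; the scrape frequency bounds how precisely you can date each sale.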
These are just a few of the dozens of approaches we’ve developed to generate alpha using web data. We’re always discovering more alongside our clients. If any of these sound interesting or if you’d simply like to bounce around ideas, we’re always open to having a conversation. Feel free to reach out to us using the link below.