
Accurate demand forecasting is key to profitability in e-commerce, directly affecting pricing, inventory management, and customer experience. At Zalando, the limitations of traditional machine learning models like Random Forest became increasingly evident as the business scaled. Mariia Bulycheva, an Applied Scientist at Zalando, explains how replacing the Random Forest model with Deep Learning helped refine demand forecasting and saved the platform millions of euros.
The Importance of Demand Forecasting in E-Commerce
Finding more accurate demand forecasting methods is critical for the e-commerce sector. Highly precise forecasts allow for more effective pricing and inventory management, ultimately maximizing company profits.
From a pricing perspective on a marketplace, demand forecasting provides insights into how quickly specific products will sell, enabling better planning for price adjustments and discounts. It is particularly important to estimate how a specific product will sell under different discount levels and how this will impact the company’s overall profitability. To achieve this, a model is needed that can determine optimal discounts based on demand forecasts. Such a model must compute the derivative of profit with respect to the discount and select the discount that maximizes profit while maintaining the company’s targeted growth rate.
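To make the discount selection described above concrete, here is a minimal sketch that searches a grid of candidate discounts for the most profitable one that still meets a sales target. The linear demand curve, the prices, and the names `forecast_demand`, `profit`, and `best_discount` are illustrative assumptions, not Zalando's model; the minimum-units constraint is only a stand-in for the growth target.

```python
from typing import Iterable, Optional

def forecast_demand(discount: float) -> float:
    """Hypothetical linear demand curve: units sold per week at a discount in [0, 1]."""
    base_units = 100.0   # assumed sales at full price
    uplift = 3.0         # assumed: a 10% discount lifts demand by 30%
    return base_units * (1.0 + uplift * discount)

def profit(discount: float, full_price: float = 50.0, unit_cost: float = 30.0) -> float:
    """Profit = forecast units * (discounted price - unit cost)."""
    return forecast_demand(discount) * (full_price * (1.0 - discount) - unit_cost)

def best_discount(min_units: float, candidates: Iterable[float]) -> Optional[float]:
    """Return the most profitable candidate that still meets the sales target."""
    feasible = [d for d in candidates if forecast_demand(d) >= min_units]
    return max(feasible, key=profit) if feasible else None

candidates = [i / 100 for i in range(41)]   # 0% to 40% in 1% steps
unconstrained = best_discount(0.0, candidates)
with_growth_target = best_discount(125.0, candidates)
```

With these made-up numbers, the unconstrained optimum is a 3% discount, while requiring at least 125 units pushes the choice to 9%. A grid search like this needs one forecast per candidate, which is why a fast derivative of profit with respect to the discount matters at marketplace scale.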
For inventory management, more accurate forecasts help reduce instances where popular products go out of stock. For example, demand predictions might indicate that a certain model of skis is highly popular among customers and could sell out before winter ends, prompting additional stock purchases. A lack of availability on a marketplace leads to customer dissatisfaction, so avoiding such situations enhances customer loyalty.
Random Forest: Capabilities and Limitations
AI-powered demand forecasting tools have been in use for quite some time. Initially, Zalando relied on the Random Forest model, which was technologically advanced at the time and effectively met business needs.
Random Forest is based on an ensemble of decision trees, where accuracy is achieved by aggregating multiple trees, each of which individually offers relatively low precision. Because the resulting prediction is a piecewise-constant (step) function of its inputs, its derivatives cannot be computed analytically and can only be approximated with traditional numerical methods. This means that the more precise the required calculation, the longer it takes.
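The difficulty with differentiating a step function can be illustrated with a toy example. The sketch below uses a hand-written piecewise-constant predictor as a stand-in for a tree ensemble (the thresholds and values are invented) and estimates its slope with a central finite difference.

```python
from typing import Callable

def step_forecast(discount: float) -> float:
    """Toy piecewise-constant predictor, mimicking the shape of a tree ensemble's output."""
    if discount < 0.10:
        return 100.0
    if discount < 0.20:
        return 130.0
    if discount < 0.30:
        return 170.0
    return 220.0

def central_difference(f: Callable[[float], float], x: float, h: float) -> float:
    """Numerical derivative estimate; costs two evaluations of f per point."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

flat = central_difference(step_forecast, 0.15, 0.001)    # both points on the same plateau
coarse = central_difference(step_forecast, 0.15, 0.1)    # wide step averages across jumps
at_jump = central_difference(step_forecast, 0.10, 0.001) # step straddles a threshold
```

A small step inside a plateau reports a slope of exactly zero, a small step across a threshold reports a very large slope, and only a wide step gives a usable average. Each estimate also needs extra model evaluations, so higher precision means more calls to the forecaster.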
While Random Forest provides high-quality predictions with limited data, its performance degrades significantly when dealing with massive datasets, which is a critical limitation for a large-scale marketplace like Zalando.
One major drawback is that Random Forest does not naturally model sequential dependencies. When forecasting multiple time intervals into the future, it requires additional lagged features, iterative predictions, or separate models for each interval. This increases development and maintenance complexity while demanding significant computational resources.
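The iterative approach mentioned above can be sketched as follows: a one-step model is fed the most recent observations as lagged features, and each prediction is appended to the window to produce the next one. The function names and the trivial `mean_of_lags` model are hypothetical placeholders for a fitted regressor.

```python
from typing import Callable, List, Sequence

def recursive_forecast(
    history: Sequence[float],
    one_step_model: Callable[[List[float]], float],
    n_lags: int,
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps ahead by feeding predictions back in as lag features."""
    if n_lags < 1 or len(history) < n_lags:
        raise ValueError("need at least n_lags (>= 1) observations")
    window = list(history[-n_lags:])      # most recent observations, oldest first
    forecasts: List[float] = []
    for _ in range(horizon):
        y_hat = one_step_model(window)    # the model only ever predicts one step
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]     # slide the window over the prediction
    return forecasts

def mean_of_lags(lags: List[float]) -> float:
    """Placeholder one-step model: the average of the lagged values."""
    return sum(lags) / len(lags)

weekly_sales = [10.0, 12.0, 14.0]
next_two_weeks = recursive_forecast(weekly_sales, mean_of_lags, n_lags=3, horizon=2)
```

Because later steps are predicted from earlier predictions rather than from observations, errors can compound as the horizon grows, and the loop has to run sequentially for every product.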
These constraints hindered rapid forecasting. For example, at the start of each week, the pricing and discount team received an updated demand forecast based on the previous week’s data. However, calculating optimal discounts with Random Forest took around 6–8 hours because of the slow derivative calculations. As a result, new prices reached the website at around 4 PM instead of 8 AM, which hurt both financial performance and user experience.
Additionally, Zalando needed to forecast demand for multiple weeks ahead to efficiently allocate the discount budget throughout the season and manage stock levels. Random Forest struggled with long-term forecasts, as it failed to capture complex temporal dependencies, particularly long-term trends. It also did not account for seasonality and other time-based correlations, because each historical data row, corresponding to a specific time point, was treated as independent of the others.
Over time, Random Forest became inadequate for Zalando’s growing needs. This was due not only to the limitations of the method itself but also to the evolving demands of the marketplace. As the business expanded, so did the number of products requiring simultaneous demand forecasting, as well as the number of countries where Zalando operates. Increasing competition also heightened the need for faster computations. Moreover, since 2014, Zalando has been a publicly traded company, requiring stricter forecasting accuracy to inform shareholders about expected sales growth. Furthermore, demand forecasting was no longer just for pricing: it also became crucial for accurately assessing seasonal demand fluctuations.
The Solution: Deep Learning
To overcome the limitations of Random Forest, Zalando implemented a Deep Learning model. Modern Deep Learning models, such as transformers, are specifically designed to process sequential data and predict sequences of any type and length. Unlike Random Forest, transformers can efficiently handle thousands of products and parameters in real time. Zalando’s first transformer-based model was deployed as early as 2019.
Transformers excel at identifying complex patterns and dependencies in large datasets. They outperform traditional models in handling sequential data, leading to more accurate and reliable long-term forecasts.
During this period, machine learning frameworks like PyTorch became widely adopted, enabling automatic derivative computation for differentiable functions. From a marketplace perspective, this significantly accelerated the computation of profit derivatives with respect to discounts (through demand-price-profit relationships). To achieve this, demand forecasting needed to be based on smooth functions, which is precisely what neural networks provide.
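To show what automatic differentiation buys over finite differences, the following sketch implements a tiny forward-mode version with dual numbers in plain Python. It is a stand-in for illustration only: PyTorch's autograd uses reverse-mode differentiation on tensors, and the `Dual` class, the exponential demand curve, and all parameter values here are assumptions rather than anything taken from Zalando's system.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to one chosen input."""
    value: float
    deriv: float = 0.0

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.value + o.value, self.deriv + o.deriv)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.value - o.value, self.deriv - o.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)  # product rule
        return Dual(self.value * o.value, self.deriv * o.value + self.value * o.deriv)
    __rmul__ = __mul__

def exp(x: Dual) -> Dual:
    e = math.exp(x.value)
    return Dual(e, e * x.deriv)  # chain rule

def profit(discount: Dual, base_units=100.0, elasticity=4.0,
           full_price=50.0, unit_cost=30.0) -> Dual:
    """Smooth, hypothetical demand curve times the margin per unit."""
    units = base_units * exp(elasticity * discount)
    margin = full_price * (1.0 - discount) - unit_cost
    return units * margin

def profit_and_gradient(discount: float):
    out = profit(Dual(discount, 1.0))  # seed d(discount)/d(discount) = 1
    return out.value, out.deriv

value_at_zero, slope_at_zero = profit_and_gradient(0.0)
_, slope_at_optimum = profit_and_gradient(0.15)
```

A single evaluation returns both the profit and its exact derivative, with no step size to tune. For these assumed parameters the derivative vanishes at a 15% discount, which is the profit-maximizing point of this toy curve.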
Zalando required forecasts not only for several weeks ahead but also for multiple time horizons (e.g., five weeks, an entire season). With transformers, this was easily achieved by introducing an additional parameter into the model, eliminating the need for extensive computational resources to extend forecasts to new intervals.
The core task involved forecasting time series. The neural network was trained on historical sales data, discounts, and product attributes. Sales of different products are often interrelated: some items complement each other (e.g., jeans and T-shirts), while others compete (e.g., new sneaker releases cannibalizing the sales of older models).
To address this, the model learned from all sales data simultaneously, capturing both recent and long-term trends. The multi-head attention mechanism in transformers was particularly useful, allowing the model to capture dependencies across long sequences. By feeding a year’s worth of sales data into the network, it could identify short- and long-term trends, including seasonal effects and upcoming holiday-driven demand.
The implementation also incorporated a monotonic demand-price dependency, ensuring that demand predictions always reflected the natural relationship where higher discounts lead to higher sales. This was achieved by adding a dedicated layer between the transformer’s encoder and decoder.
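One common way to guarantee such a monotonic relationship is to build the demand response from pieces whose slopes are forced to be non-negative. The sketch below does this with a piecewise-linear uplift whose slopes pass through a softplus; it is an assumed construction for illustration, since the article does not describe the internals of the actual layer, and the names, knots, and weights are invented.

```python
import math
from typing import Sequence

def softplus(x: float) -> float:
    """Smooth map from any real number to a strictly positive one."""
    return math.log1p(math.exp(x))

def monotonic_uplift(discount: float, raw_slopes: Sequence[float],
                     knots: Sequence[float]) -> float:
    """Piecewise-linear uplift that can never decrease as the discount grows."""
    uplift = 0.0
    for raw, knot in zip(raw_slopes, knots):
        # softplus(raw) > 0 and max(0, discount - knot) is non-decreasing,
        # so every term is non-decreasing in the discount.
        uplift += softplus(raw) * max(0.0, discount - knot)
    return uplift

def demand(discount: float, base_demand: float,
           raw_slopes: Sequence[float], knots: Sequence[float]) -> float:
    return base_demand * (1.0 + monotonic_uplift(discount, raw_slopes, knots))

# Unconstrained weights, as a network might emit them (including negatives).
raw_slopes = [-3.0, 0.5, -1.2]
knots = [0.0, 0.1, 0.3]
curve = [demand(d / 100, 100.0, raw_slopes, knots) for d in range(0, 71, 10)]
```

Whatever values the upstream network produces for `raw_slopes`, the resulting curve rises with the discount, so the constraint holds by construction rather than having to be learned from data.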
Another key improvement was stock availability estimation. Previously, forecasts underestimated demand, leading to frequent out-of-stock situations. The new Deep Learning model slightly overestimates demand, ensuring that sufficient stock remains available, improving customer satisfaction.
The Business Impact of Deep Learning Implementation
The Deep Learning model significantly accelerated derivative calculations. Whereas Random Forest required 6–8 hours, the new model performed these computations about 15 times faster, reducing runtime to about 30 minutes. This meant that new prices appeared on the website almost immediately after updated demand forecasts were available, resulting in annual savings of €1 million on runtime costs alone. Additionally, parallelized GPU processing reduced training time from 24 hours to just 8 hours.
Deep Learning also provided precise and flexible discount planning, accounting for seasonality. For example, it could predict peak swimwear demand in July, prompting early stock purchases, while recommending discounts as demand declined in late August.
The implementation of monotonic demand modeling allowed for more accurate discount budgeting. As a result, discount levels aligned better with overall pricing objectives.
Zalando also significantly reduced the number of out-of-stock situations caused by underestimating demand. Consequently, popular products were almost always available for purchase, enhancing customer loyalty.
The accuracy of demand forecasts improved substantially: across various product categories and countries, forecast deviations from actual sales decreased by 20.5 percentage points compared to Random Forest. Error rates across different categories dropped by 5 to 30.5 percentage points.
From a financial perspective, implementing Deep Learning is saving Zalando millions of euros annually through improved forecasting accuracy.