AI & Technology

From satellite relays to neural networks: How engineering school helped Andrey Shcherbinin become an architect of high-load ML systems

The rapid development of artificial intelligence technologies in recent years has created a paradoxical market situation. On the one hand, basic machine learning models have become accessible to virtually every business, but on the other, creating reliable, industrial-strength systems capable of operating under high loads and generating real profit remains an exceptionally complex task. Most projects stall at the prototype stage precisely because they lack architectural coherence and an engineering foundation. The challenge lies not so much in setting up neural networks as in integrating them into a company’s ecosystem, ensuring data stability and transparency of results.

That’s why the experience of outstanding specialists like Andrey Shcherbinin is particularly valuable to the industry. Andrey isn’t just a data expert; he’s an engineer with a deep academic background and a PhD, who has risen from fundamental science to managing machine learning teams at major tech companies. Now, as a team leader at Social Discovery Group (an international technology company specializing in the creation and development of social platforms and dating apps), he designs systems that automate complex business processes and save the company millions. We spoke with Andrey about how a fundamental education helps build the architecture of modern IT solutions and why mathematical precision is important not only in code but also in team management.

Andrey, the topic of our interview harks back to your engineering background. While many IT professionals come from short-term training programs, you have a solid academic grounding and a PhD. How exactly does your fundamental engineering knowledge, including signal processing, help you today in your work with modern machine learning systems?

An engineering education, especially at the PhD level, provides more than just a knowledge of formulas—it develops systems thinking. In electronics or signal processing, you deal with huge streams of noisy data from which you need to extract useful information in real time. In modern machine learning, the tasks are conceptually very similar. When we design the architecture for a recommender system or attribution algorithm, we are essentially solving the same filtering and optimization problem, only the tools have changed.

Understanding statistics and probability theory allows me to view ML models not as black boxes that simply produce results, but as mathematically grounded systems. This is crucial when something goes wrong. An engineer with a fundamental foundation won’t randomly try out parameters; they’ll analyze the nature of the error, whether it stems from data variance or a technical issue.

Moreover, academic experience instills discipline in experimentation. In science, conclusions cannot be drawn based on a single successful launch. The same is true in the development of high-load systems: any implementation must be supported by a rigorous testing methodology. This allows for the development of reliable solutions that do not break down at scale, which is the key difference between industrial development and amateur development.

You mentioned attribution algorithms. Your portfolio includes a case study on implementing a new marketing attribution model, which yielded a dramatic increase in the accuracy of first-time sales forecasting. Could you explain the engineering complexity of this task and why standard solutions weren’t working?

Marketing attribution is an attempt to understand which advertising channel led a customer to a purchase. Standard approaches often simplify the picture, giving all the credit, for example, to the last click. In reality, however, the user journey is complex and convoluted. The engineering challenge lay in implementing probabilistic models and state-of-the-art neural network approaches that fairly distribute credit across all of a customer’s interactions with a brand. This demands enormous computing power, since it involves evaluating every possible combination of interactions.
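One classic way to “fairly distribute credit” across every combination of touchpoints is the Shapley value from cooperative game theory. The sketch below illustrates that idea on made-up numbers; the channel names and subset conversion values are hypothetical and not the model Andrey’s team actually deployed.

```python
from itertools import combinations
from math import factorial

# Hypothetical conversion contribution for each subset of channels.
# In practice these values would be estimated from user-journey data.
v = {
    frozenset(): 0.0,
    frozenset({"search"}): 0.10,
    frozenset({"social"}): 0.05,
    frozenset({"email"}): 0.03,
    frozenset({"search", "social"}): 0.18,
    frozenset({"search", "email"}): 0.14,
    frozenset({"social", "email"}): 0.09,
    frozenset({"search", "social", "email"}): 0.22,
}

def shapley(channels, v):
    """Distribute total conversion credit across channels via Shapley values."""
    n = len(channels)
    credit = {}
    for c in channels:
        others = [x for x in channels if x != c]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weight |S|! * (n - |S| - 1)! / n! for each coalition S.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v[s | {c}] - v[s])
        credit[c] = total
    return credit

print(shapley(["search", "social", "email"], v))
```

By construction, the credits sum to the value of the full channel set, so no conversion credit is invented or lost; the catch, as Andrey notes, is that the number of subsets grows exponentially with the number of channels.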

The existing calculations took approximately six hours, which made timely decision-making impossible: marketing needs data “here and now.” My task was not only to improve the mathematical model but also to optimize the calculation process itself. We revised the algorithms and the data processing architecture, reducing calculation time from six hours to 30 minutes, a 12-fold improvement.

As a result, we created a system that not only works faster but also allows the business to spend its budget more efficiently. We improved budget efficiency by stopping spending on channels that weren’t actually producing results, but appeared successful under the old models. This is an example of how complex math can be converted into direct financial benefits.

Speeding up processes by 12 times is a colossal achievement. Clearly, this requires a major infrastructure overhaul. How did you approach data flow automation to ensure such performance and reliability?

Achieving such speed and reliability requires a transition from manual process management to fully automated data processing pipelines. At Social Discovery Group, we built a complex chain of interactions: from MSSQL and BQ databases through the Airflow orchestrator and Kafka message broker to Athena analytics and S3 cloud storage. Previously, many processes required manual intervention, which created vulnerabilities and increased the risk of human error.
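The production stack Andrey describes runs on Airflow and Kafka, but the core idea of an orchestrated pipeline can be shown in a few lines: tasks declare their upstream dependencies, and the orchestrator executes them in topological order. The task names below are hypothetical stand-ins for the real extract and load steps.

```python
from graphlib import TopologicalSorter

# Toy stand-ins for real pipeline stages (names are illustrative only).
def extract_mssql():  return "rows from MSSQL"
def extract_bq():     return "rows from BigQuery"
def publish_kafka():  return "events to Kafka"
def load_s3():        return "files in S3"
def query_athena():   return "report from Athena"

# Each task maps to (callable, list of upstream task names).
tasks = {
    "extract_mssql": (extract_mssql, []),
    "extract_bq":    (extract_bq, []),
    "publish_kafka": (publish_kafka, ["extract_mssql", "extract_bq"]),
    "load_s3":       (load_s3, ["publish_kafka"]),
    "query_athena":  (query_athena, ["load_s3"]),
}

def run_pipeline(tasks):
    """Run every task after all of its dependencies, like a DAG scheduler."""
    graph = {name: deps for name, (_, deps) in tasks.items()}
    order = TopologicalSorter(graph).static_order()
    return [(name, tasks[name][0]()) for name in order]

for name, result in run_pipeline(tasks):
    print(f"{name}: {result}")
```

A real orchestrator adds what this toy omits, and what makes the 80% reduction in manual work possible: scheduling, retries on failure, and alerting when a stage breaks.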

We automated these flows, reducing manual workload by 80%. This isn’t just a matter of engineer convenience; it’s a matter of business stability. When the system operates autonomously, it becomes predictable. We also reduced data latency by 35%. This means analysts and models receive up-to-date information almost instantly, which is critical for dynamic markets.

Scalability is a key aspect here. The architecture we implemented is capable of analyzing vastly larger volumes of data without the need to rewrite code. We used cloud solutions and distributed systems to ensure that as the load increases, the system simply utilizes more resources rather than collapsing under the weight of requests.

Another important area of your work was the creation of a support chatbot with intent detection. You’ve automated 95% of conversations. How did you achieve such a high accuracy rate (0.978) while still ensuring user data security?

Creating a chatbot for customer support is always a balance between automation and user satisfaction. The main problem with many bots is that they lack context. We developed a model that recognizes 35 different user intents with exceptional accuracy. An accuracy of 0.978 indicates that the system classifies requests almost flawlessly, distinguishing, for example, refund inquiries from technical issues. This allowed us to handle the vast majority of requests without the involvement of human operators.
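A classifier hitting 0.978 accuracy over 35 intents would be a trained neural model; purely as an illustration of what “intent detection” means, here is a minimal bag-of-words sketch. The intents and example phrases are hypothetical, and the similarity measure is deliberately simple.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical intents with a few example phrases each; a production
# system would instead train a classifier on labelled conversations.
INTENT_EXAMPLES = {
    "refund":     ["i want my money back", "please refund my purchase"],
    "tech_issue": ["the app crashes on start", "i cannot log in"],
    "account":    ["how do i delete my account", "change my email address"],
}

def _vec(text):
    """Bag-of-words vector: lowercase token counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def detect_intent(message):
    """Return the intent whose examples are most similar to the message."""
    msg = _vec(message)
    scores = {
        intent: max(_cosine(msg, _vec(ex)) for ex in examples)
        for intent, examples in INTENT_EXAMPLES.items()
    }
    return max(scores, key=scores.get)

print(detect_intent("can you refund my money"))
```

The routing logic is the same in production: map an incoming message to one of a fixed set of intents, then hand it to the matching automated flow, escalating to a human only when confidence is low.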

However, beyond the system’s intelligence, security is critically important. We work with large language models, and feeding them raw user data is unacceptable. Therefore, I implemented a message anonymization mechanism. Before the text is fed into the neural network to generate a response, all personal data is removed.
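A minimal sketch of such an anonymization pass is shown below. The regex patterns are illustrative, not the team’s actual gateway logic; a production system would use a vetted PII-detection library with much broader coverage.

```python
import re

# Illustrative PII patterns (hypothetical, not exhaustive).
# Card numbers are masked before phone numbers so a long digit run
# is not accidentally consumed by the looser phone pattern.
PII_PATTERNS = [
    (re.compile(r"\b\d{13,19}\b"), "<CARD>"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "<PHONE>"),
]

def anonymize(text: str) -> str:
    """Replace detected personal data with neutral placeholders."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

print(anonymize("Contact me at jane.doe@example.com or +1 (555) 123-4567"))
```

The key design point is that the scrubbing happens before the text ever reaches the language model, so the model can only see placeholders like `<EMAIL>`, never the underlying personal data.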

This engineering solution allows us to harness the power of modern generative networks without compromising customer privacy or regulatory requirements. We’ve created a kind of security gateway that filters incoming and outgoing traffic. The result is a tool that not only saves the company’s resources but also provides users with instant and secure assistance.

Any complex system is prone to degradation or failure over time. You’ve developed a monitoring system for ML models that covers 100% of the algorithms used. Why is this so important, and how has it impacted the stability of your services?

In the world of machine learning, there’s a concept called “data drift.” A model trained on last month’s data may stop working today due to changes in user behavior or external conditions. Without high-quality monitoring, a company can lose money for weeks without even realizing that the algorithm has begun to make mistakes. Therefore, creating a comprehensive monitoring system was a priority for me.
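One standard way to quantify the drift Andrey describes is the Population Stability Index (PSI), which compares a feature’s training-time distribution with live traffic. The sketch below uses synthetic data and the conventional 0.2 alert threshold, not the team’s actual monitoring configuration.

```python
from math import log

def psi(expected, actual, buckets=10):
    """Population Stability Index between two samples of a numeric feature.

    Buckets are derived from the expected (training) sample; in this toy
    version, actual values outside the training range are simply ignored.
    """
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / buckets for i in range(buckets + 1)]

    def frac(sample, i):
        left, right = edges[i], edges[i + 1]
        n = sum(left <= x < right or (i == buckets - 1 and x == right)
                for x in sample)
        return max(n / len(sample), 1e-6)  # floor avoids log(0)

    return sum(
        (frac(actual, i) - frac(expected, i))
        * log(frac(actual, i) / frac(expected, i))
        for i in range(buckets)
    )

baseline = [i / 100 for i in range(100)]        # training-time distribution
shifted  = [0.5 + i / 200 for i in range(100)]  # live traffic drifted upward

print(f"PSI: {psi(baseline, shifted):.2f}")     # > 0.2 is typically an alert
```

A monitoring system runs checks like this on every model input on a schedule, so an alert fires as soon as live data stops resembling what the model was trained on, rather than weeks later via user complaints.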

We used a technology stack to visualize the state of all models in real time. This reduced the time to detect problems by 60%. Now, if the quality of recommendations or classification accuracy declines, engineers are notified immediately, rather than through user complaints.

Furthermore, monitoring has optimized the model retraining process. We reduced preparation time for retraining by 40%, as the system automatically identifies which data has changed and requires attention. This transforms ML system support from chaotic “firefighting” into a planned, managed process that guarantees transparency and business stability.

You currently lead a team and manage a cross-functional group of engineers. How did you manage to reduce new hire onboarding time by 40% and build collaboration between ML specialists, MLOps, and backend developers?

Transitioning from a solo developer to a leadership role requires shifting the focus from code to processes. The main problem in mixed teams is communication gaps: an ML engineer can create a great model, but a backend developer won’t know how to integrate it properly. To solve this, I implemented a responsibility matrix and a clear system of key performance indicators (KPIs). This made ownership transparent: each team member understands their area of responsibility and knows who to contact with questions.

I paid special attention to knowledge management. Information is often passed on by word of mouth, which is ineffective, so we standardized documentation and the onboarding process. With a unified knowledge base and clear procedures in place, new engineers now get up to speed almost twice as fast. They don’t have to spend weeks deciphering the architecture: everything is documented.

This approach also improved project handoffs between development stages. We eliminated situations where projects would stall due to someone misunderstanding or forgetting something. A systematic approach to managing people is just as effective as managing servers—it reduces uncertainty and improves overall team productivity.

You actively work to ensure that technical implementation aligns with the company’s business goals (OKRs). How difficult is it for an engineer to adopt the language of business, and why is this necessary for a successful ML product?

It’s tempting for an engineer to get carried away with technology for its own sake—using the latest neural network simply because it’s trendy. But in commercial development, that’s a dead end. My job as a leader is to translate between the language of tensors and the language of profit. I work closely with product management to ensure that every line of code works toward specific business goals (OKRs), whether that’s revenue growth or user retention.

When the team understands how their work impacts the company’s success, their motivation and the quality of their decisions change. We don’t just develop new features; we solve business problems. For example, improving the attribution algorithm wasn’t done for the sake of elegant math, but to optimize the marketing budget. And the team saw this direct impact.

This synchronization allows us to focus on what’s truly important and cut out tasks that won’t bring value. This creates a culture of responsibility and performance. An engineer who understands the business context can offer solutions that a manager couldn’t even imagine, because they see technical opportunities for profit growth.

Finally, Andrey, as an architect of complex systems, how do you see the future of ML infrastructure development in the coming years? What skills will be key for engineers who want to follow in your footsteps?

The industry is moving toward simplified access to models, but increasingly complex orchestration. The future lies in robust MLOps platforms that make model lifecycle management as easy as managing regular code. We’ll see even more automation in training and deployment processes. Manual model management will become a thing of the past, giving way to high-level architectural design.

Multidisciplinarity will become a key skill for engineers. It’s no longer enough to simply be a good data scientist. You need to understand how clouds work, how databases are structured, and how to ensure security.

And, of course, the need for a fundamental foundation won’t go away. Technologies change every six months, but the principles of statistics, linear algebra, and algorithmic theory remain constant. Understanding the fundamentals allows you to quickly adapt to any changes. My advice to aspiring specialists: learn the fundamentals and learn to see the larger system behind the code—this is the only way to create truly outstanding products.

Andrey Shcherbinin’s experience clearly demonstrates that success in the modern world of high technology is built on a solid foundation of classical engineering. The ability to apply rigorous mathematical methods to chaotic business processes, the ability to see the entire system architecture, and the talent to transform scientific research into effective monetization tools—this is what separates a visionary from a mere implementer.

Andrey’s case studies—from accelerating computations by 12 times to creating secure, intelligent chatbots—prove that machine learning is no longer just an experimental field. In the hands of a professional of his caliber, it becomes a powerful driver of business growth. It is these outstanding specialists who set industry standards today and shape the technologies of tomorrow.
