Divide and conquer: research and applied projects in MIL
Machine Intelligence Laboratory was founded in 2017, and since then our team has been working tirelessly: we have completed 16 R&D projects, published 3 Q2 articles, organised 9 events, launched the Science Club, and started developing MVP projects. We are happy to share our experience and open the door to the lab’s inner workings, so that you can not only evaluate the finished product but also observe the development process.
Ever since the lab was founded, MIL Team has been successfully solving business problems for large companies, applying AI approaches in both research and applied projects. A year ago, we separated these two main lines of work and formed two units:
- 2 teams -> 2 types of expertise;
- 2 technology stacks -> 2 approaches to project implementation.
By the end of the article you will know:
- why we decided to separate the tracks of research and applied projects;
- how this decision affected the cooperation with our partners;
- what we got in the end.
Impact on the cooperation with partners
Our experience of working with large Russian and international companies, and of studying the requirements and specifics of requests to outsource AI development, prompted us to separate the lab’s two main directions and form two autonomous teams. The thing is that outsourced projects also fall into two types, matching MIL’s lines of work. In the first case, the client wants a finished solution within a short time frame, which is a perfect fit for the Applied Projects Team. In the second case, when companies are ready to invest in technology and take part in long-term, high-risk projects, the scope is usually limited to the modelling process and does not involve other areas of ML and AI. The Industrial Research Team successfully handles tasks of this kind.
Different qualifications
By forming two separate units, we split the duties and created a comfortable environment where every employee focuses on the set of competencies their team actually needs. From the members of the Research AI Team we expect the ability to formulate and test a large number of scientific hypotheses per unit of time; this subdivision mostly consists of experienced ML engineers and researchers. The members of the Applied AI Team, in turn, are capable of analysing business processes, working with raw data, doing feature engineering, and building the infrastructure for deploying and monitoring models. The team is made up of analysts who understand how to properly decompose a business process into its components and apply technology to optimise it, data scientists responsible for collecting data from the company’s internal systems and organising it into data marts, and DevOps/MLOps engineers who set up delivery of the system and ensure its integration with the surrounding systems.
Different expectations
In addition to a different set of tasks and competencies, the teams are expected to show different results. The Industrial Research unit is expected to deliver a ready-to-use library or a trained model that meets the agreed constraints. In this case we are dealing with state-of-the-art work, since similar models usually have not been built before, and a page-by-page reimplementation of the solutions described in SotA articles often does not work.
The expected outcome of an applied project is a production-ready service that has passed testing and is successfully embedded into the business process. In most cases, companies do not focus on the technology stack used in the project: the most meaningful indicators of success are the results of A/B testing or the analysis of product and business-process metrics.
Projects that include both a research component and a service solution are considered separately; such a project is usually divided into several stages.
Different technology stacks
Besides the distinctions described above, the team members use different technology stacks. The Research AI Team mainly uses modelling frameworks: TensorFlow or PyTorch, and also develops its own, BigARTM and TopicNet. The other tools are geared towards increasing the number of hypotheses tested per unit of time (WandB, MLflow) and towards auxiliary metrics such as code quality, assessed through code review.
The Applied AI Team mainly applies existing solutions; the task is to embed them into the right product pipeline and configure them to provide the necessary functionality and control over the models. For example, Airflow is used for managing models, Kubernetes serves the same purpose and also directs data flows, Hadoop and Spark are well suited to big data analysis, and MLflow serves for tracking pipelines.
Different risks and management model
Finally, the teams face different risks and limitations in their projects, which is why different management models are required. The Achilles’ heel of research projects is technological limitations, so the best approach is to focus on rapidly testing the scientific hypotheses intended to eliminate the uncertainty about reaching the target result. Management should therefore address these risks through intensive brainstorming, literature analysis, and experiments.
The key risk of applied projects is failing to achieve the expected results, which is why the management process should focus on building the pipeline quickly and testing the product’s value. To this end, the team analyses the business process, tests several variants of the prototype, integrates the finished solution into the process, and tracks business metrics once it is integrated.
Results
So, what did we get by separating Research AI and Applied AI and forming two autonomous teams?
- A comfortable environment where every employee can develop in the direction that interests them.
- Transparency of expectations and requirements: creating vacancies, hiring new employees, conducting onboarding, and evaluating results have all become simpler.
- Understanding the target audience: we know which companies are research-oriented, and which ones expect a rapid improvement of the business indicators.