{"id":29165,"date":"2026-04-27T16:30:45","date_gmt":"2026-04-27T16:30:45","guid":{"rendered":"https:\/\/hes.mephi.ru\/?page_id=29165"},"modified":"2026-04-27T21:01:24","modified_gmt":"2026-04-27T21:01:24","slug":"big-data-engineering","status":"publish","type":"page","link":"https:\/\/hes.mephi.ru\/?page_id=29165","title":{"rendered":"Big Data Engineering"},"content":{"rendered":"<div id=\"pl-29165\"  class=\"panel-layout\" ><div id=\"pg-29165-0\"  class=\"panel-grid panel-has-style\" ><div class=\"siteorigin-panels-stretch panel-row-style panel-row-style-for-29165-0\" data-stretch-type=\"full\" ><div id=\"pgc-29165-0-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-0-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"0\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-28479\" src=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/Logo_Vish_eng-1.png\" alt=\"\" width=\"250\" height=\"125\" \/><\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div><div id=\"pg-29165-1\"  class=\"panel-grid panel-has-style\" ><div class=\"siteorigin-panels-stretch panel-row-style panel-row-style-for-29165-1\" data-stretch-type=\"full-stretched\" ><div id=\"pgc-29165-1-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-1-0-0\" class=\"so-panel widget widget_sow-headline panel-first-child panel-last-child\" data-index=\"1\" ><div class=\"panel-widget-style panel-widget-style-for-29165-1-0-0\" ><div class=\"so-widget-sow-headline so-widget-sow-headline-default-cae038182b94-29165\"><div class=\"sow-headline-container \">\n\t<h3 class='sow-headline'>\t\t\t\t\t\t<a href=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/04.01-Big-Data-Engineering.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">\n\t\t\t\t\tDOWNLOAD THE FULL COURSE 
SYLLABUS<\/a><\/h3><\/div><\/div><\/div><\/div><\/div><div id=\"pgc-29165-1-1\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-1-1-0\" class=\"so-panel widget widget_sow-headline panel-first-child panel-last-child\" data-index=\"2\" ><div class=\"so-widget-sow-headline so-widget-sow-headline-default-cae038182b94-29165\"><div class=\"sow-headline-container \">\n\t<h3 class='sow-headline'>\t\t\t\t\t\t<a href=\"https:\/\/hes.mephi.ru\/?page_id=28339\" >\n\t\t\t\t\tBACK TO THE CURRICULUM<\/a><\/h3><\/div><\/div><\/div><\/div><div id=\"pgc-29165-1-2\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-1-2-0\" class=\"so-panel widget widget_sow-headline panel-first-child panel-last-child\" data-index=\"3\" ><div class=\"so-widget-sow-headline so-widget-sow-headline-default-cae038182b94-29165\"><div class=\"sow-headline-container \">\n\t<h3 class='sow-headline'>\t\t\t\t\t\t<a href=\"https:\/\/hes.mephi.ru\/?page_id=28855\" target=\"_blank\" rel=\"noopener noreferrer\">\n\t\t\t\t\tBACK TO MASTER'S PROGRAM<\/a><\/h3><\/div><\/div><\/div><\/div><div id=\"pgc-29165-1-3\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-1-3-0\" class=\"so-panel widget widget_sow-headline panel-first-child panel-last-child\" data-index=\"4\" ><div class=\"so-widget-sow-headline so-widget-sow-headline-default-cae038182b94-29165\"><div class=\"sow-headline-container \">\n\t<h3 class='sow-headline'>\t\t\t\t\t\t<a href=\"https:\/\/hes.mephi.ru\/?page_id=28947\" target=\"_blank\" rel=\"noopener noreferrer\">\n\t\t\t\t\tABOUT HES MEPHI<\/a><\/h3><\/div><\/div><\/div><\/div><\/div><\/div><div id=\"pg-29165-2\"  class=\"panel-grid panel-has-style\" ><div class=\"siteorigin-panels-stretch panel-row-style panel-row-style-for-29165-2\" data-stretch-type=\"full\" ><div id=\"pgc-29165-2-0\"  class=\"panel-grid-cell panel-grid-cell-empty\" ><\/div><div id=\"pgc-29165-2-1\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-2-1-0\" class=\"so-panel widget widget_sow-headline panel-first-child\" 
data-index=\"5\" ><div class=\"so-widget-sow-headline so-widget-sow-headline-default-664267c7ac08-29165\"><div class=\"sow-headline-container \">\n\t<h2 class='sow-headline'>Big Data Engineering<\/h2><\/div><\/div><\/div><div id=\"panel-29165-2-1-1\" class=\"so-panel widget widget_sow-editor panel-last-child\" data-index=\"6\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">The course treats big data not only as large volumes of information, but as a specific class of engineering tasks, where data origin, ingestion mode, storage architecture, processing logic, quality requirements, observability and reproducibility form a unified project system.<\/span><\/span><\/h3>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div><div id=\"pg-29165-3\"  class=\"panel-grid panel-has-style\" ><div class=\"siteorigin-panels-stretch panel-row-style panel-row-style-for-29165-3\" data-stretch-type=\"full\" ><div id=\"pgc-29165-3-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-3-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"7\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t&nbsp;\n<p style=\"text-align: justify; font-family: 'Open Sans';\">The course has a theoretical and applied engineering character. Its goal is not to focus on the isolated mastery of individual technologies, but to develop students\u2019 systemic understanding of how big data processing systems are designed, implemented, and maintained under real engineering constraints.<\/p>\n\n<p style=\"text-align: justify; font-family: 'Open Sans';\">A modern digital system encompasses not only application logic and an infrastructure platform, but also a dedicated data processing workflow. 
This workflow covers data origin, methods of data acquisition, storage architecture, processing, result publication, quality control, and data lifecycle management.<\/p>\n\n<p style=\"text-align: justify; font-family: 'Open Sans';\">Therefore, within the course, big data engineering is treated not as a collection of isolated technologies, but as a fully\u2011fledged architectural layer of a digital system.<\/p>\n\n<p style=\"text-align: justify; font-family: 'Open Sans';\">Architectural decisions in data engineering always take into account the constraints imposed by the type of data origin, the method of data ingestion, the structure and variability of sources, quality requirements and deadlines for delivering results, execution environment limitations, and requirements for maintenance and evolution.<\/p>\n\n<p style=\"text-align: justify; font-family: 'Open Sans';\">Thus, designing a data pipeline and platform means finding a balance between the requirements for the outcome and the actual properties of the data and the operational environment.<\/p>\n\n<p style=\"text-align: justify; font-family: 'Open Sans';\">Within the course, data solutions are not viewed as standard templates, but rather as a space of alternatives. Students analyse the differences between transactional sources, event\u2011based sources, APIs, and web sources. They also compare regular data loading with sources whose scope changes, centralised with distributed processing, simple pipelines with more mature data platforms, and batch logic with architectures that later require a transition to stream processing. 
The data pipeline is considered part of the lifecycle of an engineering project.<\/p>\n\n<p style=\"text-align: justify; font-family: 'Open Sans';\">A distinctive feature of the course is the end\u2011to\u2011end use of large language models (LLMs) as a tool for engineering analysis, generating architectural alternatives, project reflection, and supporting students\u2019 independent work.<\/p>\n\n<p style=\"text-align: justify; font-family: 'Open Sans';\">The course aims to build the ability to select a solution based on the real characteristics of a task, rather than on the popularity of a particular technology.<\/p><\/div>\n<\/div><\/div><\/div><\/div><\/div><div id=\"pg-29165-4\"  class=\"panel-grid panel-has-style\" ><div class=\"siteorigin-panels-stretch panel-row-style panel-row-style-for-29165-4\" data-stretch-type=\"full\" ><div id=\"pgc-29165-4-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-4-0-0\" class=\"so-panel widget widget_sow-headline panel-first-child\" data-index=\"8\" ><div class=\"so-widget-sow-headline so-widget-sow-headline-default-4e1b8d3af015-29165\"><div class=\"sow-headline-container \">\n\t<h3 class='sow-headline'>OBJECTIVES<\/h3><\/div><\/div><\/div><div id=\"panel-29165-4-0-1\" class=\"so-panel widget widget_sow-editor panel-last-child\" data-index=\"9\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">Understanding data engineering as a distinct engineering layer of digital systems;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">Analysing data origin and its influence on architectural decisions;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; 
font-family: 'Open Sans';\">Mastering principles of data pipeline and data platform design;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">Developing the ability to choose architectural solutions depending on data type, ingestion mode and environmental constraints;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">Building skills in engineering decomposition and design of data solutions;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">Fostering a culture of using LLMs in engineering analysis and project activities.<\/span><\/span><\/h3>\n<\/div>\n<\/div><\/div><\/div><div id=\"pgc-29165-4-1\"  class=\"panel-grid-cell panel-grid-cell-empty\" ><\/div><div id=\"pgc-29165-4-2\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-4-2-0\" class=\"so-panel widget widget_sow-headline panel-first-child\" data-index=\"10\" ><div class=\"so-widget-sow-headline so-widget-sow-headline-default-4e1b8d3af015-29165\"><div class=\"sow-headline-container \">\n\t<h3 class='sow-headline'>KEY TASKS<\/h3><\/div><\/div><\/div><div id=\"panel-29165-4-2-1\" class=\"so-panel widget widget_sow-editor panel-last-child\" data-index=\"11\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To develop a systemic view of the place and role of data engineering in the architecture of digital systems;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To study types of data origin and strategies 
for data acquisition;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To master principles of designing data platform and data pipeline architecture;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To learn data storage models and basic approaches to big data processing;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To master ETL\/ELT principles, batch and distributed processing;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To study methods for ensuring data quality;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To analyse principles of orchestration, reproducibility, reliability and observability of data pipelines;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To build skills in project decomposition of engineering data solutions;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To prepare students to complete a coursework project on a topic from the Big Data track;<\/span><\/span><\/h3>\n<h3 style=\"text-align: justify;\"><span style=\"text-align: justify;\"><span style=\"color: #ffffff; font-family: 'Open Sans';\">To foster a culture of conscious use of LLMs in engineering design and analysis.<\/span><\/span><\/h3>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div><div id=\"pg-29165-5\"  class=\"panel-grid panel-has-style\" ><div class=\"panel-row-style 
panel-row-style-for-29165-5\" ><div id=\"pgc-29165-5-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-5-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"12\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<h3 style=\"text-align: justify; font-family: 'Open Sans';\"><span style=\"color: #000000;\"><strong>Main topics of the course:<\/strong><\/span><\/h3>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>1. Introduction to Data Engineering. <\/strong> The role of data engineering in digital systems, differences from Data Science, analytics, and applied development. The lifecycle of an engineering data task \u2014 from problem statement and requirements to solution operation.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>2. Data Platform Architecture. <\/strong> Components: data sources, ingestion layer, storage, processing, result publication, monitoring, and quality control. Architecture as a way to reconcile requirements and constraints, not just a set of technologies.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>3. Data Origin as a Key Architectural Factor. <\/strong> Analysis of transactional sources, event streams, APIs, and regular data dumps. The impact of data origin on processing mode, update frequency, and quality requirements.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>4. Complex Data Acquisition Scenarios. <\/strong> Web scraping, regular collection from external sources, adversarial sources, and changing source scope. Technical, organisational, and legal constraints of data acquisition methods.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>5. Lifecycle of a Data Engineering Project. 
<\/strong> Stages: problem statement, requirements, source analysis, architectural design, pipeline development, quality control, operation, and evolution. The concept of architectural forks.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>6. Decomposition of an Engineering Project. <\/strong> Transition from a general idea to a sequence of steps: data collection, preparation, storage, processing, quality control, and result delivery. Analysis of stage dependencies and critical points.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>7. Data Storage Principles in Big Data Systems. <\/strong> Local, centralised, and distributed storage; file\u2011based, table\u2011based, and object\u2011based approaches. The impact of storage method on processing and maintenance.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>8. Batch Data Processing. <\/strong> ETL and ELT approaches, processing steps, launch window, idempotency, re\u2011runs, and result reproducibility. Practical construction of a batch processing scenario: reading, cleaning, transformation, aggregation, and result publication.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>9. Distributed Data Processing. <\/strong> Reasons for moving computations to a cluster, differences between heavy and light operations. Architectural significance of join, shuffle, and aggregation operations. Analysis of resource\u2011intensive operations and identification of bottlenecks.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>10. Data Quality, Cleaning, and Validation. <\/strong> Handling missing values, duplicates, incorrect data types, outliers, and business rule violations. Differences between technical validation, substantive checking, and data cleaning. Development of data quality rules for a dataset.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>11. 
Pipeline Orchestration and Processing Execution Management. <\/strong> Stage dependencies, scheduling, re\u2011runs, status control, and logging. Orchestration as a means of ensuring reproducibility and manageability (not just a launch schedule). Construction of a DAG (Directed Acyclic Graph) for a pipeline, analysis of success\/failure conditions and re\u2011execution requirements.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>12. Reliability, Maintenance, and Reproducibility of a Data Pipeline.<\/strong> Logging, version control, traceability, failure diagnostics, re\u2011execution, and change management. Analysis of typical incidents: source schema changes, incomplete loading, partial processing, and result quality violations. Development of a diagnostics and recovery plan for a failure scenario.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>13. Performance and Processing Optimisation. <\/strong> Identification of redundant steps, resource\u2011intensive join operations, repeated data reading, and unjustified intermediate layers. Trade\u2011off between solution speed and maintainability. Comparison of \u201cnaive\u201d and \u201cimproved\u201d pipeline variants: how acceleration is achieved and at what cost (increased architectural complexity). Detection of bottlenecks and assessment of optimisation feasibility.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>14. Overview Topics: Streaming Processing, Event\u2011Driven Approach, Data Lake, and Lakehouse. <\/strong> Conceptual introduction to technologies as a bridge to subsequent disciplines in the track (without in\u2011depth study). Comparison of batch processing, near real\u2011time processing, and Data Lake\/Lakehouse logic using a single case study. Analysis of requirements leading to architectural complexity. 
Distinction between dashboard, Data Lake, and Lakehouse logic based on complexity, cost, and applicability criteria.<\/p>\n<p style=\"text-align: justify; font-family: 'Open Sans';\"><strong>15. Project Integration into a Holistic Engineering System and Pre\u2011Defence. <\/strong> Presentation of project architecture, implementation stages, constraints, quality rules, and design decisions. Comprehensive project presentation: demonstration of component consistency and justification of architectural and engineering choices. Project refinement based on feedback: addressing weaknesses, clarifying constraints, quality rules, and pipeline stages.<\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div><div id=\"pg-29165-6\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-29165-6-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-6-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"13\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<p><a style=\"padding: 12px 24px; background: #1e8a8a; color: white; border: none; border-radius: 8px; font-family: Arial, sans-serif; font-size: 16px; font-weight: bold; cursor: pointer; box-shadow: 0 4px 8px rgba(30, 138, 138, 0.3); transition: all 0.3s ease; width: 100%; margin: 0; display: block; text-align: left; padding-left: 16px; text-decoration: none;\" href=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/04.01-Big-Data-Engineering.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">Download the extended description &gt;&gt;<\/a><\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><div id=\"pg-29165-7\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-29165-7-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-7-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"14\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div 
class=\"siteorigin-widget-tinymce textwidget\">\n\t<p style=\"text-align: center;\"><a href=\"https:\/\/hes.mephi.ru\/?page_id=28339\">Return to the study plan overview<\/a><\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><div id=\"pg-29165-8\"  class=\"panel-grid panel-has-style\" ><div class=\"siteorigin-panels-stretch panel-row-style panel-row-style-for-29165-8\" data-stretch-type=\"full\" ><div id=\"pgc-29165-8-0\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-8-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"15\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<p>&nbsp;<\/p>\n<h3 style=\"text-align: center;\"><span style=\"color: #ffffff;\">HES MEPhI<\/span><\/h3>\n<p style=\"text-align: center;\"><span style=\"color: #ffffff;\">+7 (495) 788-56-99 ext. 7691, 9570<\/span><br \/>\n<span style=\"color: #ffffff;\">+7 (929) 684-71-59<\/span><br \/>\n<strong><span style=\"color: #ff6600;\"><a style=\"color: #ff6600;\" href=\"mailto:hes@mephi.ru\" target=\"_blank\" rel=\"noopener\">hes@mephi.ru<\/a><\/span><\/strong><\/p>\n<p><a style=\"padding: 12px 24px; background: #CC4E5C; color: white; border: none; border-radius: 8px; font-family: Arial,sans-serif; font-size: 16px; font-weight: bold; cursor: pointer; box-shadow: 0 4px 6px rgba(204,78,92,0.4); transition: all 0.3s ease; width: 100%; margin: 0; display: block; text-align: center; text-decoration: none; position: relative; top: 0;\" href=\"https:\/\/forms.gle\/pyuPR9hYQ36Qo58GA\" target=\"_blank\" rel=\"noopener noreferrer\">Ask a question!<\/a><\/p>\n<\/div>\n<\/div><\/div><\/div><div id=\"pgc-29165-8-1\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-8-1-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"16\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce 
textwidget\">\n\t<p>&nbsp;<\/p>\n<h3 style=\"text-align: center;\"><i class=\"fa-vk\"><\/i><a href=\"https:\/\/vk.com\/hesmephi\" target=\"_blank\" rel=\"noopener\">\u00a0<span style=\"color: #ff6600;\"> VK \/ Vkontakte<\/span><\/a><\/h3>\n<h3 style=\"text-align: center;\"><img decoding=\"async\" loading=\"lazy\" class=\"standard\" src=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2022\/02\/zen-new-icon.png\" alt=\"\u0412\u0418\u0428 \u041c\u0418\u0424\u0418\" width=\"20\" height=\"20\" \/><a href=\"https:\/\/zen.yandex.ru\/id\/5e2972c4a1bb8700b092dbdd\" target=\"_blank\" rel=\"noopener\">\u00a0 <span style=\"color: #ff6600;\">Yandex.Dzen<\/span><\/a><\/h3>\n<h3 style=\"text-align: center;\"><img decoding=\"async\" loading=\"lazy\" class=\"standard\" src=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/03\/logo-max-hes-round-corners-black.png\" alt=\"\u0412\u0418\u0428 \u041c\u0418\u0424\u0418\" width=\"20\" height=\"20\" \/><span style=\"color: #ff6600;\"><a style=\"color: #ff6600;\" href=\"https:\/\/max.ru\/id7724068140_gos27\" target=\"_blank\" rel=\"noopener\"> MAX<\/a><\/span><\/h3>\n<h3 style=\"text-align: center;\"><img decoding=\"async\" loading=\"lazy\" class=\"standard\" src=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2022\/04\/TG-black.jpg\" alt=\"\u0412\u0418\u0428 \u041c\u0418\u0424\u0418\" width=\"20\" height=\"20\" \/><a href=\"https:\/\/t.me\/hesmephi\" target=\"_blank\" rel=\"noopener\">\u00a0 <span style=\"color: #ff6600;\">Telegram<\/span><\/a><\/h3>\n<h3 style=\"text-align: center;\"><i class=\"fa-youtube\"><\/i><a href=\"https:\/\/www.youtube.com\/channel\/UChlJQWBMqKcweInG6YJiwIA\" target=\"_blank\" rel=\"noopener\">\u00a0 <span style=\"color: #ff6600;\">YouTube<\/span><\/a><\/h3>\n<h3 style=\"text-align: center;\"><img decoding=\"async\" loading=\"lazy\" class=\"standard\" src=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2022\/03\/rutubelogo.jpg\" alt=\"\u0412\u0418\u0428 \u041c\u0418\u0424\u0418\" width=\"20\" height=\"20\" \/><a 
href=\"https:\/\/rutube.ru\/channel\/23944362\/\" target=\"_blank\" rel=\"noopener\">\u00a0 <span style=\"color: #ff6600;\">Rutube<\/span><\/a><\/h3>\n<p>&nbsp;<\/p>\n<\/div>\n<\/div><\/div><\/div><div id=\"pgc-29165-8-2\"  class=\"panel-grid-cell\" ><div id=\"panel-29165-8-2-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"17\" ><div class=\"so-widget-sow-editor so-widget-sow-editor-base\">\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<p>&nbsp;<\/p>\n<h3 style=\"text-align: center;\">NRNU MEPhI\u00a0Admissions Committee:<\/h3>\n<p style=\"text-align: center;\"><strong><span style=\"color: #ff6600;\"><a style=\"color: #ff6600;\" href=\"https:\/\/admission.mephi.ru\/\" target=\"_blank\" rel=\"noopener\">admission.mephi.ru<\/a><\/span><\/strong><\/p>\n<p style=\"text-align: center;\"><span style=\"color: #ffffff;\">115409, Moscow, Kashirskoe shosse, 31<\/span><\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>DOWNLOAD THE FULL COURSE SYLLABUS BACK TO THE CURRICULUM BACK TO MASTER&#8217;S PROGRAM ABOUT HES MEPHI Big Data Engineering The course treats big data not only as large volumes of information, but as a specific class of engineering tasks, where data origin, ingestion mode, storage architecture, processing logic, quality requirements, observability and reproducibility form a unified project system. 
&nbsp; The [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"template-blank3.php","meta":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v18.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Big Data Engineering - \u0412\u0418\u0428 \u041c\u0418\u0424\u0418<\/title>\n<meta name=\"description\" content=\"The course is dedicated to the study of machine learning (ML) and deep learning (DL), with an emphasis on the engineering application of these technologies in digital systems. The programme covers the full cycle of working with ML models: from problem formulation and data analysis to training, quality assessment, deployment, and maintenance. Students will learn how to formalise applied problems as ML tasks, select appropriate models and evaluation metrics, work with different types of data (images, sequences, text), as well as design ML pipelines and take into account the operational constraints of real\u2011world systems.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/hes.mephi.ru\/?page_id=29165\" \/>\n<meta property=\"og:locale\" content=\"ru_RU\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Big Data Engineering - \u0412\u0418\u0428 \u041c\u0418\u0424\u0418\" \/>\n<meta property=\"og:description\" content=\"The course is dedicated to the study of machine learning (ML) and deep learning (DL), with an emphasis on the engineering application of these technologies in digital systems. The programme covers the full cycle of working with ML models: from problem formulation and data analysis to training, quality assessment, deployment, and maintenance. 
Students will learn how to formalise applied problems as ML tasks, select appropriate models and evaluation metrics, work with different types of data (images, sequences, text), as well as design ML pipelines and take into account the operational constraints of real\u2011world systems.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/hes.mephi.ru\/?page_id=29165\" \/>\n<meta property=\"og:site_name\" content=\"\u0412\u0418\u0428 \u041c\u0418\u0424\u0418\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-27T21:01:24+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/Logo_Vish_eng-1.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u041f\u0440\u0438\u043c\u0435\u0440\u043d\u043e\u0435 \u0432\u0440\u0435\u043c\u044f \u0434\u043b\u044f \u0447\u0442\u0435\u043d\u0438\u044f\" \/>\n\t<meta name=\"twitter:data1\" content=\"8 \u043c\u0438\u043d\u0443\u0442\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/hes.mephi.ru\/#website\",\"url\":\"https:\/\/hes.mephi.ru\/\",\"name\":\"\u0412\u0418\u0428 \u041c\u0418\u0424\u0418\",\"description\":\"\u0412\u044b\u0441\u0448\u0430\u044f \u0438\u043d\u0436\u0438\u043d\u0438\u0440\u0438\u043d\u0433\u043e\u0432\u0430\u044f \u0448\u043a\u043e\u043b\u0430 \u041d\u0418\u042f\u0423 \u041c\u0418\u0424\u0418\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/hes.mephi.ru\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"ru-RU\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/hes.mephi.ru\/?page_id=29165#primaryimage\",\"inLanguage\":\"ru-RU\",\"url\":\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/Logo_Vish_eng-1.png\",\"contentUrl\":\"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/Logo_Vish_eng-1.png\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/hes.mephi.ru\/?page_id=29165#webpage\",\"url\":\"https:\/\/hes.mephi.ru\/?page_id=29165\",\"name\":\"Big Data Engineering - \u0412\u0418\u0428 \u041c\u0418\u0424\u0418\",\"isPartOf\":{\"@id\":\"https:\/\/hes.mephi.ru\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/hes.mephi.ru\/?page_id=29165#primaryimage\"},\"datePublished\":\"2026-04-27T16:30:45+00:00\",\"dateModified\":\"2026-04-27T21:01:24+00:00\",\"description\":\"The course is dedicated to the study of machine learning (ML) and deep learning (DL), with an emphasis on the engineering application of these technologies in digital systems. The programme covers the full cycle of working with ML models: from problem formulation and data analysis to training, quality assessment, deployment, and maintenance. 
Students will learn how to formalise applied problems as ML tasks, select appropriate models and evaluation metrics, work with different types of data (images, sequences, text), as well as design ML pipelines and take into account the operational constraints of real\u2011world systems.\",\"breadcrumb\":{\"@id\":\"https:\/\/hes.mephi.ru\/?page_id=29165#breadcrumb\"},\"inLanguage\":\"ru-RU\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/hes.mephi.ru\/?page_id=29165\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/hes.mephi.ru\/?page_id=29165#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\u0413\u043b\u0430\u0432\u043d\u0430\u044f \u0441\u0442\u0440\u0430\u043d\u0438\u0446\u0430\",\"item\":\"https:\/\/hes.mephi.ru\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Big Data Engineering\"}]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Big Data Engineering - \u0412\u0418\u0428 \u041c\u0418\u0424\u0418","description":"The course is dedicated to the study of machine learning (ML) and deep learning (DL), with an emphasis on the engineering application of these technologies in digital systems. The programme covers the full cycle of working with ML models: from problem formulation and data analysis to training, quality assessment, deployment, and maintenance. 
Students will learn how to formalise applied problems as ML tasks, select appropriate models and evaluation metrics, work with different types of data (images, sequences, text), as well as design ML pipelines and take into account the operational constraints of real\u2011world systems.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/hes.mephi.ru\/?page_id=29165","og_locale":"ru_RU","og_type":"article","og_title":"Big Data Engineering - \u0412\u0418\u0428 \u041c\u0418\u0424\u0418","og_description":"The course is dedicated to the study of machine learning (ML) and deep learning (DL), with an emphasis on the engineering application of these technologies in digital systems. The programme covers the full cycle of working with ML models: from problem formulation and data analysis to training, quality assessment, deployment, and maintenance. Students will learn how to formalise applied problems as ML tasks, select appropriate models and evaluation metrics, work with different types of data (images, sequences, text), as well as design ML pipelines and take into account the operational constraints of real\u2011world systems.","og_url":"https:\/\/hes.mephi.ru\/?page_id=29165","og_site_name":"\u0412\u0418\u0428 \u041c\u0418\u0424\u0418","article_modified_time":"2026-04-27T21:01:24+00:00","og_image":[{"url":"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/Logo_Vish_eng-1.png"}],"twitter_card":"summary_large_image","twitter_misc":{"\u041f\u0440\u0438\u043c\u0435\u0440\u043d\u043e\u0435 \u0432\u0440\u0435\u043c\u044f \u0434\u043b\u044f \u0447\u0442\u0435\u043d\u0438\u044f":"8 \u043c\u0438\u043d\u0443\u0442"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebSite","@id":"https:\/\/hes.mephi.ru\/#website","url":"https:\/\/hes.mephi.ru\/","name":"\u0412\u0418\u0428 
\u041c\u0418\u0424\u0418","description":"\u0412\u044b\u0441\u0448\u0430\u044f \u0438\u043d\u0436\u0438\u043d\u0438\u0440\u0438\u043d\u0433\u043e\u0432\u0430\u044f \u0448\u043a\u043e\u043b\u0430 \u041d\u0418\u042f\u0423 \u041c\u0418\u0424\u0418","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/hes.mephi.ru\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"ru-RU"},{"@type":"ImageObject","@id":"https:\/\/hes.mephi.ru\/?page_id=29165#primaryimage","inLanguage":"ru-RU","url":"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/Logo_Vish_eng-1.png","contentUrl":"http:\/\/hes.mephi.ru\/wp-content\/uploads\/2026\/04\/Logo_Vish_eng-1.png"},{"@type":"WebPage","@id":"https:\/\/hes.mephi.ru\/?page_id=29165#webpage","url":"https:\/\/hes.mephi.ru\/?page_id=29165","name":"Big Data Engineering - \u0412\u0418\u0428 \u041c\u0418\u0424\u0418","isPartOf":{"@id":"https:\/\/hes.mephi.ru\/#website"},"primaryImageOfPage":{"@id":"https:\/\/hes.mephi.ru\/?page_id=29165#primaryimage"},"datePublished":"2026-04-27T16:30:45+00:00","dateModified":"2026-04-27T21:01:24+00:00","description":"The course is dedicated to the study of machine learning (ML) and deep learning (DL), with an emphasis on the engineering application of these technologies in digital systems. The programme covers the full cycle of working with ML models: from problem formulation and data analysis to training, quality assessment, deployment, and maintenance. 
Students will learn how to formalise applied problems as ML tasks, select appropriate models and evaluation metrics, work with different types of data (images, sequences, text), as well as design ML pipelines and take into account the operational constraints of real\u2011world systems.","breadcrumb":{"@id":"https:\/\/hes.mephi.ru\/?page_id=29165#breadcrumb"},"inLanguage":"ru-RU","potentialAction":[{"@type":"ReadAction","target":["https:\/\/hes.mephi.ru\/?page_id=29165"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/hes.mephi.ru\/?page_id=29165#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\u0413\u043b\u0430\u0432\u043d\u0430\u044f \u0441\u0442\u0440\u0430\u043d\u0438\u0446\u0430","item":"https:\/\/hes.mephi.ru\/"},{"@type":"ListItem","position":2,"name":"Big Data Engineering"}]}]}},"_links":{"self":[{"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=\/wp\/v2\/pages\/29165"}],"collection":[{"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=29165"}],"version-history":[{"count":10,"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=\/wp\/v2\/pages\/29165\/revisions"}],"predecessor-version":[{"id":29187,"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=\/wp\/v2\/pages\/29165\/revisions\/29187"}],"wp:attachment":[{"href":"https:\/\/hes.mephi.ru\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=29165"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}